klibertp's comments | Hacker News

If you claim it's the fastest, how does it compare to one-more-re-nightmare?

- https://github.com/telekons/one-more-re-nightmare

- https://applied-langua.ge/posts/omrn-compiler.html

OMRN is a regex compiler that leverages Common Lisp's compiler to produce optimized assembly to match the given regex. It's incredibly fast. It does omit some features to achieve that, though.


As a potential user (not the author), what jumps out at me about the two is:

OMRN: No lookaround, eager compilation, can output first match

RE#: No submatches, lazy compilation, must accumulate all matches

Both lookaround and submatch extraction are hard problems, but for practical purposes, the lack of lazy compilation feels like the most consequential, as it essentially disqualifies the engine from potentially adversarial REs (or I guess not, given the state limit, but then it's questionable whether it actually counts as a full RE engine in such an application).


> No community I know.

Everybody in sales in every software company in the world would be part of that community, I think. Some of the devs, too. Software was always marketed (and discussed with normal people) as something that could automate error-prone tasks, thereby eliminating the inevitable mistakes humans make when performing those tasks. Would Excel be the cornerstone of so many businesses if it sometimes gave the wrong value as a sum of a column?

That marketing (and the fact that, indeed, Excel can sum anything users throw at it without making mistakes) worked; now we have 3 generations of users who believe that once a computer "gets it" (i.e., the correct software is installed and properly configured), it will perform a task given to it correctly forever. The fact that it's almost true (true in the absence of bugs, and assuming no changes to the setup, no updates, no hardware degradation, no cosmic rays flipping important bits, etc.) doesn't help - that parenthetical is hard to understand and often omitted when a developer talks to a non-developer.

We've always had software that wasn't as reliable as Excel - speech recognition and OCR come to mind. But in those cases, the errors are plainly visible - they cannot be "confidently wrong". Now we have LLMs that can be confidently wrong, and a vast number of users trained to think that software is either always right or, when it's wrong, it's immediately noticeable.

I don't think developers should bear the whole responsibility here - I think marketing had a much larger role in shaping users' minds. However, devs not clearly communicating the risks of bugs to users (for fear of scaring potential customers or out of laziness) over decades makes us partly responsible as well.


> Software was always marketed (and discussed with normal people) as something that could automate error-prone tasks, thereby eliminating the inevitable mistakes humans make when performing those tasks.

That's far from a community touting that computers don’t make mistakes.

> Would Excel be the cornerstone of so many businesses if it sometimes gave the wrong value as a sum of a column?

You mean like if it was running on a Pentium with the FDIV bug? :)

I agree there's a perception computer output is generally reliable, and that leaves users at the mercy of snake oil parrots that are generally unreliable and are sold without a warning. But I don't agree the cause is that touting.


Also, disable the formatting if stdout is not a terminal. That way, your colors and cursor movements won't be visible when piping to another program, and your tool will be usable in apps that don't understand CSI and the characters that follow it. Provide a command-line switch with more than two states: e.g., `ls` (and probably other GNU tools) has `--color=always|auto|never`, which covers most use cases.
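
A minimal sketch of that logic in Python (the flag handling here is made up for illustration, not taken from any particular tool):

    import sys

    def use_color(mode: str = "auto") -> bool:
        # "always" and "never" override detection; "auto" checks
        # whether stdout is attached to a terminal.
        if mode == "always":
            return True
        if mode == "never":
            return False
        return sys.stdout.isatty()

    if use_color():
        print("\x1b[31merror:\x1b[0m something went wrong")
    else:
        print("error: something went wrong")

When the output is piped (`mytool | grep error`), `isatty()` returns False and the escape codes are suppressed.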

Also not mentioned in the article: there are a few syntaxes available for specifying things in control sequences, like

    \x1b[38;2;{r};{g};{b}m
for specifying colors. There's a nice list here: https://gist.github.com/ConnerWill/d4b6c776b509add763e17f9f1... You can also cram as many control codes as you want into a control sequence, though it probably isn't useful in a modern context in 99.9% of cases.
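
For example, combining that RGB form with other SGR parameters in a single sequence (a quick illustrative Python snippet):

    # 38;2;r;g;b selects a 24-bit foreground color; 1 (bold) and
    # 4 (underline) can share the same sequence. 0 resets everything.
    orange = "\x1b[1;4;38;2;255;128;0m"
    reset = "\x1b[0m"
    print(f"{orange}bold underlined orange{reset}")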

Lose. Evacuate the government. Then mount a guerrilla campaign, and wait for an opportunity. It'll come, most likely sooner rather than later.

Why is that unthinkable? I can understand people in the US being unable to process such a scenario, but here in Europe, there's not a single nation that wasn't off the map for some time.

I know why Ukrainians don't want that, but the demographic costs of tens to hundreds of thousands of "military age men" dying are so huge that any plausible alternative should be considered, even if it's very unpleasant.


> Why is that unthinkable?

Because it’s unthinkably stupid.

> I know why Ukrainians don't want that, but the demographic costs of tens to hundreds of thousands of "military age men" dying are so huge that any plausible alternative should be considered, even if it's very unpleasant.

And you imagine they won’t die in your guerrilla war? Or the next invasion after an emboldened Russia regroups?


You're suggesting a decades-long guerrilla movement under occupation will be better for the Ukrainian people than conscription during an existential defensive war?

In terms of the number of lives lost? Yes. Guerrilla resistance is a way of trading important advantages (like control of the territory or political legitimacy) for time and human lives. Guerrillas in a favorable environment tend to suffer far fewer casualties per fighter per unit of time than troops in trench warfare along a frontline.

It's a desperate measure, but so is snatching people from the street to bus them off to trenches.

Personally, I think people can live through almost any hell (and can make a comeback later) - unless they die, in which case they can't do anything anymore. Decades of hard times, in this view, are preferable to tens of thousands of excess deaths per year over a decade.

I understand why people are reluctant to consider this - I'm just trying to show that there are alternatives to the current situation; not strictly better, but at least presenting different trade-offs. In a situation of "existential defensive war," we should discuss all plausible options, even the most controversial ones.


Not necessarily: if Ukraine surrenders, then Russia will disarm them. Then when they revolt, Russia will be able to bomb them with impunity, because the resistance will not have the air defenses and manufacturing that the Ukrainian military now has.

Not to mention that Russia will almost certainly genocide or at least severely oppress the Ukrainians if they win.


EDIT: important to note that abandoning the trenches and the frontline does not mean surrendering, and I never said they should surrender! I suggested evacuating the govt and continuing the resistance with other means - I don't believe the actual surrender would do any good.

You're right - the risks are, of course, very significant. And we've been through that here in Poland, historically, like 3 times already. We've had quite a few failed uprisings, and we've had anti-communist guerrillas here for a while after WW2 - they were quickly (it still took 3-5 years, though!) dismantled, and most of them were killed. So the risks are real, and it is a "desperate measure".

On the other hand, it worked quite a few times: Cuba, Vietnam, and Afghanistan all proved that it's possible to win (or at least not lose) using guerrilla tactics. In the case of Ukraine, I think the circumstances would favor the resistance: Russia is already not doing well economically; the "severe oppression" of the Ukrainians (which I agree would follow) would cement support for the resistance, and it would cost Russia a lot; and Russia has had air superiority since day one, yet it hasn't really helped them much (it would be much more of a threat if Russia had US-level intelligence capabilities - but they demonstrably don't).

Yes, as long as it's possible, the conventional war should continue. At some point, though, the costs (all kinds of them) of continuing to fight in the field become so high that it's better to stop and switch to other ways of defending.

I'm not saying that moment is now - and it's not for me to dictate when it happens - I'm just trying to say that there are other ways of dealing with the aggressor that may (in favorable circumstances) lead to lower casualties without forgoing the hope of eventually winning. Which I wish Ukraine with all my heart, BTW.


The countries that got invaded by the US fought guerrilla wars because that was the only thing they could do. It wasn't some deliberate strategy to rope the US in.

And the only reason it worked out for them is that the US wasn't determined to create new states and had very low domestic support to begin with. That's not the case with Russia where this war is clearly a big deal to them.


Honestly, as much as I love Smalltalk, I see this post as an illustration of all that's wrong in that community. Instead of documenting the original solution well, making a cheat sheet for it, or gathering examples (which you otherwise need to "hunt" for, according to the author) in one place, we're given another solution on top of the original inscrutable one. That new solution is still undocumented; in the post, there's no explanation of how it was written and how it works; instead, we get another handful of examples.

GToolkit has Lepiter, and Smalltalks in general have comments on classes that can be used to write long-form docs. Yet, the majority of Smalltalkers apparently believe that all that is not needed, and that searching for examples in the codebase gives you all the information you need. It may be true - not in all cases, but some of them - but the efficiency of that process is incredibly low. And that's even if the examples are there - mostly in test code. It's not unusual to find large areas of code that are entirely untested, though, in which case something that could be done in 15 minutes if you were provided with a solid README can take days to figure out.

Yeah, I know: the IDE, the Smalltalk environment, IS very valuable and it DOES give you superpowers. That doesn't fully offset the notorious lack of documentation, though, and in effect, you're still less productive than you could be with Python. GToolkit's Lepiter is a step in the right direction, but even in GT, class comments and method docstrings are sparse. A GToolkit-like environment coupled with Emacs Lisp's culture of self-documenting docstrings would be ideal, but somehow it just won't happen.


Interesting. This has not been my experience. I have not regularly worked in Smalltalk for 12+ years, other than the occasional download to show some young programmer how cool the "petri dish of objects" was/is.

For the last many years, I've been working in C, Swift, Kotlin, Python, and Elixir. In no case have I felt that there was a step difference between the communities. I'm doing a lot of Elixir right now, which is supposedly the one where "the docs are great". And yet, I often can't find the answer I need to figure something out. Sure, there are some inscrutable parts of Smalltalk code bases, but my point is, journeying abroad in these other lands (I do quite a bit of Python too) hasn't made me feel like "oh wow, here's the documentation I could never find." Your mileage obviously varies, I respect that. Just wasn't my experience.


For Latin alphabet-based languages, it's pretty similar to how names from those languages are transliterated to Japanese or Korean. You get "Clare" in English and (what, to me, sounds like) "Kurea" in Japanese; equivalent (I'm told!) but not the same. It would be wrong to try to assess the IQ of Japanese (who don't know English) by asking about properties of the original word that are not shared by the Japanese equivalent. On the other hand, English speakers won't ever experience haiku fully, since the script plays a big role in the composition (according to what I'm told... I don't know Japanese, but anime intake exposed me to opinions like this; and even if I'm dead wrong with details, it sounds like a plausible analogy, at least...)

The examples are fine for an early-stage PoC project like this one. `minutes` with an evaluation trace and `[Fold]<-` are illustrative, and if you work them out with pen and paper, you can get a good grasp of the main ideas of the language. That you have to search for them on a page that looks like a slightly-formatted README instead of having a nice scrollable list of syntax-highlighted snippets at the top is because this IS a slightly-formatted README - and that's also completely fine at this stage. What's important is that there are a few interesting concepts here and that it was published. Even if this one fizzles, as 99.999% of languages do, that doesn't matter if some other language down the line gets inspired by those concepts.

From the Conclusion:

> Does this mean that it is futile or meaningless to attempt to compose Elvish sentences? Well, no. [...] it is indeed possible to produce written Elvish that so far as anyone now can tell conforms grammatically and idiomatically to the exemplars and statements that Tolkien provided to a very high degree (for example, by relying only upon attested elements and derivational mechanisms, attested grammatical devices, and attested syntactic patterns that can reasonably be thought to belong to the same conceptual phase) — though I very much doubt that anyone will ever be able to do so quickly enough to use Elvish as a spoken language, for any but the most trivial sorts of declarative sentences.

I hate to be the one to bring them up, but I haven't seen anyone else refer to them: how good are LLMs at following the patterns of invented languages, in either direction (i.e., inventing a translation from English to Sindarin and then, separately, translating the invented Sindarin back to English)? It's a usage where "hallucinations" are basically required, but also, the consistency of the hallucinations has to be high.


> I do not smoke myself, but it made me realize how little I know regarding THC and CBD

Long-term use causes the psychedelic part of THC's effects to diminish over time. At some point, only a mild depressant effect remains - somewhat similar to chamomile. It does have some effect on intelligence and short-term memory, but if the alternative is to be too stressed to think at all, it might be better to just smoke.

Obviously, if possible, psychotherapy or a prescription from a psychiatrist (or better yet, a change of environment) would be better (in the latter case, it depends on the prescribed drug, of course), but THC is not that bad an alternative where it's legal.


Make is a very good choice for storing common maintenance commands for a project. We use it at work for this. It started when we migrated to Docker more than a decade ago - before docker-compose was a thing, building and running a set of containers required quite a bit of shell scripting, and we decided to use Make for that. Make is ubiquitous, cross-platform, the targets are essentially snippets of shell with some additional features/syntax added on top, there's a dependency system (you can naturally express things like "if you want to run X, you need to build Z and Y first, then X, then you can run it"), it allows for easy parameterization (`make <target> ARG=val`), plus it's actually a Turing-complete language with first-class lambdas and capacity for self-modifying code[1]. And when some rule becomes too complex, it's trivial to dump it into `scripts/something.sh` and have Make call it. Rewriting the script in another language also works, and Make still provides dependencies between targets.
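
A small sketch of what that looks like in practice (target and image names invented for illustration; recipe lines must start with a tab):

    # `make run ENV=prod` runs the app container, rebuilding the
    # image first if needed; `make test` reuses the same build step.
    ENV ?= dev
    IMAGE = myapp:$(ENV)

    .PHONY: build run test

    build:
    	docker build -t $(IMAGE) .

    run: build
    	docker run --rm -e APP_ENV=$(ENV) $(IMAGE)

    test: build
    	docker run --rm $(IMAGE) pytest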

TL;DR: Make is a very nice tool for gathering the "auxiliary" scripts needed for a project in a language-agnostic manner. It's better than setup.py and package.json precisely because it provides a single interface for projects of both kinds.

[1] Which is worth knowing so you can avoid both features like the plague.

