> This is really the only way to get composability with regexes. If you restrict ...

beeforpork · on May 28, 2020

> ... and can be compiled into finite state machines. ...

I am not a big fan of this argument without a 'but'. It is usually overlooked that the deterministic finite state machine can become exponentially large (=totally unusable), because each deterministic state stands for a set of non-deterministic states. This caused me major trouble in a project where this effect was not noticed during the design phase.

I find this important to stress because the finite state machine can usually be constructed without problems (e.g. by the flex tool), so this property of regexps is undoubtedly useful.

But sometimes, it may explode on you.
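A minimal sketch of that blow-up, assuming the textbook example language (a|b)*a(a|b)^(n-1), i.e. "the n-th symbol from the end is an a". The helper names `nfa_nth_from_end` and `count_dfa_states` are made up for illustration and are not taken from flex or any other tool. The NFA has n+1 states, while the subset construction reaches 2^n deterministic states.

```python
def nfa_nth_from_end(n):
    """NFA for (a|b)*a(a|b)^(n-1): states 0..n, start 0, accepting n."""
    # State 0 loops on every symbol and guesses which 'a' is n-th from the end.
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta


def count_dfa_states(delta, start=0, alphabet="ab"):
    """Run the subset construction and count the reachable DFA states."""
    start_set = frozenset({start})
    seen = {start_set}
    todo = [start_set]
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # A DFA state is the set of NFA states reachable on this symbol.
            target = frozenset(
                q for state in current for q in delta.get((state, symbol), ())
            )
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return len(seen)


for n in (1, 2, 4, 8, 12):
    dfa_size = count_dfa_states(nfa_nth_from_end(n))
    print(f"n={n}: {n + 1} NFA states -> {dfa_size} DFA states")
```

For this particular family the DFA size doubles with every increment of n; many practical regexes stay small, which is why the problem is easy to miss at design time.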

a1369209993 · on May 28, 2020

That's deterministic finite state machines (DFA); the nondeterministic (NFA) representation is linear in regex length (unless you use "a{99}", but that's nonstandard for a reason). You can also do concat/intersect/etc. directly on the (actually-regular) regex representation, although it is a bit ugly.
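A rough way to see the linear bound, assuming a Thompson-style construction in which fragments are joined by epsilon edges without merging states. The AST classes and `thompson_states` below are hypothetical names for this sketch. Every operator adds at most two states, whereas a counted repetition copies its operand.

```python
from dataclasses import dataclass


@dataclass
class Lit:
    char: str


@dataclass
class Cat:
    left: object
    right: object


@dataclass
class Alt:
    left: object
    right: object


@dataclass
class Star:
    inner: object


@dataclass
class Repeat:  # the nonstandard r{count} operator
    inner: object
    count: int


def thompson_states(node):
    """Number of NFA states a Thompson-style construction would create."""
    if isinstance(node, Lit):
        return 2  # one start state, one accepting state
    if isinstance(node, Cat):
        return thompson_states(node.left) + thompson_states(node.right)
    if isinstance(node, Alt):
        return thompson_states(node.left) + thompson_states(node.right) + 2
    if isinstance(node, Star):
        return thompson_states(node.inner) + 2
    if isinstance(node, Repeat):
        # r{n} is expanded into n concatenated copies of r.
        return node.count * thompson_states(node.inner)
    raise TypeError(f"unknown node: {node!r}")


# (a|b)*a: 4 AST operators/literals besides the final 'a', 10 states in total.
print(thompson_states(Cat(Star(Alt(Lit("a"), Lit("b"))), Lit("a"))))  # 10
# a{99}: five characters of regex, 198 states.
print(thompson_states(Repeat(Lit("a"), 99)))  # 198
```

The repetition count is written in decimal, so the NFA size grows exponentially in the number of digits even though it is linear in the expanded regex.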

Drup · on May 28, 2020

Unfortunately, this is not the case as soon as you add grouping! To properly express regular expressions with grouping, finite state automata are not sufficient and you need the theory of transducers, which does not admit the same properties (in particular with regard to determinisation).

MaxBarraclough · on May 28, 2020

What do you mean by 'grouping'?

a1369209993 · on May 28, 2020

Capture groups, I think.
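Assuming "grouping" does mean capture groups, a small illustration of why plain acceptance is not enough. It uses Python's `re` module, which is a backtracking engine rather than a transducer, so this only shows the symptom. The two patterns accept exactly the same language (a*), yet they report different group contents for the same input.

```python
import re

# Same language, different output: the groups depend on how the match is
# split, which an accept/reject automaton does not record.
greedy = re.fullmatch(r"(a*)(a*)", "aaa")
lazy = re.fullmatch(r"(a*?)(a*)", "aaa")

print(greedy.groups())  # ('aaa', '')
print(lazy.groups())    # ('', 'aaa')
```

A machine that has to produce those groups maps input strings to outputs, which is the transducer setting mentioned above.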

thaumasiotes · on May 28, 2020

> Regular languages are closed not only under concatenation and Kleene star, but also:

> - union

Union doesn't belong in the bottom list. Disjunction is one of the regular operations, just like concatenation and Kleene star.

JadeNB · on May 28, 2020

> > Regular languages are closed not only under concatenation and Kleene star, but also:

> > - union

> Union doesn't belong in the bottom list. Disjunction is one of the regular operations, just like concatenation and Kleene star.

That seems to be what your parent said. Did you perhaps read "Regular languages are closed not only under …" as "Regular languages are not closed under"?

thaumasiotes · on May 28, 2020

The parent comment says "closed not only under concatenation and Kleene star, but also under these other four operations". That divides the closure properties into two sets, obvious and nonobvious. The obvious ones are obvious because they are regular operations -- concatenation and Kleene star are operators defined by the language of regular expressions. Intersection isn't.

I'm pointing out that set union belongs in the obvious group and not the nonobvious group. Like concatenation and Kleene star, it is one of the operators used to define what a regular expression is.

To show that regular languages are closed under intersection / complementation / string reversal, you need to do a proof. The proof of closure under set union is just that it's part of the definition of regular expressions.
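For reference, a sketch of the standard constructions behind two of those proofs, on complete DFAs stored as plain dicts. The representation and the function names are assumptions made for this example. Complement flips the accepting states; intersection runs both machines in lockstep (the product construction). Reversal is omitted because it goes through an NFA.

```python
from itertools import product


def accepts(dfa, word):
    """Run a complete DFA on a word."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accepting"]


def complement(dfa):
    """Flip accepting and non-accepting states (needs a total delta)."""
    return {**dfa, "accepting": dfa["states"] - dfa["accepting"]}


def intersection(d1, d2):
    """Product construction: track a pair of states, one per machine."""
    alphabet = d1["alphabet"]
    states = set(product(d1["states"], d2["states"]))
    delta = {
        ((p, q), s): (d1["delta"][(p, s)], d2["delta"][(q, s)])
        for (p, q) in states
        for s in alphabet
    }
    return {
        "states": states,
        "alphabet": alphabet,
        "delta": delta,
        "start": (d1["start"], d2["start"]),
        "accepting": set(product(d1["accepting"], d2["accepting"])),
    }


# Even number of a's.
even_a = {
    "states": {0, 1}, "alphabet": "ab", "start": 0, "accepting": {0},
    "delta": {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1},
}
# Last symbol is b.
ends_b = {
    "states": {0, 1}, "alphabet": "ab", "start": 0, "accepting": {1},
    "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1},
}

both = intersection(even_a, ends_b)
print(accepts(both, "aab"))                # True
print(accepts(both, "ab"))                 # False
print(accepts(complement(even_a), "ab"))   # True
```

Union, by contrast, needs no construction at the regex level: `r|s` is already a regular expression.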