
Another important dimension when evaluating these regexes is performance. The Gruber v2 regex shows exponential (?) behavior on certain pathological inputs (at least in Python's re module).

There are some examples of these pathological inputs at https://github.com/tornadoweb/tornado/blob/master/tornado/te...
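
To give a feel for what "exponential" means here, this is a minimal sketch using a stand-in pattern with nested quantifiers. It is not the Gruber v2 regex, and the pattern, the function name `time_failed_match`, and the input sizes are all my own assumptions for illustration. It times a failing match against Python's re module as the input grows:

```python
import re
import time

# Stand-in pattern with nested quantifiers (an assumption for illustration;
# this is NOT the Gruber v2 regex).
NESTED = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return the seconds spent failing to match 'a' * n + 'b'."""
    # The trailing 'b' guarantees failure, so a backtracking engine ends up
    # trying every way of splitting the run of a's between the two '+' loops.
    subject = "a" * n + "b"
    start = time.perf_counter()
    result = NESTED.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the pattern can never match this subject
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character makes the failing match much slower.
    for n in (14, 16, 18, 20, 22):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

On CPython the time should grow very quickly with each extra character, which is the kind of growth I mean above. I have not measured the Gruber regex this way, so treat this only as an analogy.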



In Node.js too. I found this out the hard way. I ended up modifying the regex so that it matches less accurately, but at least it stopped DoSing my service:

https://github.com/PiPeep/NotVeryCleverBot/blob/coffee-rewri...

Note the commented-out lines in the here-regex.
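
For anyone who wants the general idea without reading the CoffeeScript, here is a rough sketch of the same trade-off in Python. This is not the pattern from my repo; `SIMPLE_URL` and `find_urls` are hypothetical names, and the pattern is deliberately crude. It uses a single character class with no nested groups or alternations, so there is only one way to split any match:

```python
import re

# Hypothetical simplified pattern: a scheme followed by one character class.
# There are no nested quantifiers or overlapping alternatives, so the engine
# has no ambiguous choices to revisit on hostile input.
SIMPLE_URL = re.compile(r'https?://[^\s<>"]+')


def find_urls(text):
    """Return every substring of text that looks like an http(s) URL."""
    return SIMPLE_URL.findall(text)


if __name__ == "__main__":
    sample = "See http://example.com/a(b) and https://example.org/x. Done"
    # The second hit keeps its trailing '.', which is the accuracy being lost.
    print(find_urls(sample))

    # A long run of parentheses is handled in a single pass.
    hostile = "http://" + "(" * 100_000
    print(len(find_urls(hostile)))
```

The cost is visible in the sample: trailing punctuation gets swallowed into the URL, and bare domains without a scheme are not matched at all.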


From experience, Python's re module sometimes behaves oddly. There is a better third-party regex module: https://pypi.python.org/pypi/regex


Does it use an NFA?

http://swtch.com/~rsc/regexp/regexp1.html

I ask because the issue with the URL regex mentioned above is backtracking.
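
To make the distinction concrete, here is a small sketch of the state-set simulation the linked article describes, applied to a hand-built NFA for the toy pattern (a+)+ matched against the whole input. The state numbering, the tables, and the name `nfa_match` are my own assumptions, and this is not how any particular library is implemented:

```python
# Hand-built NFA for the toy pattern (a+)+ anchored to the whole input.
# State 0: expects an 'a'.
# State 1: just consumed an 'a' (may loop inside the group or leave it).
# State 2: end of the group (may start the group again or accept).
# State 3: accepting state.
CHAR_MOVES = {(0, "a"): {1}}
EPSILON_MOVES = {1: {0, 2}, 2: {0, 3}}
START, ACCEPT = 0, 3


def epsilon_closure(states):
    """Return states plus everything reachable through epsilon moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in EPSILON_MOVES.get(state, ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def nfa_match(text):
    """Simulate all NFA states in lockstep: one pass over text, no backtracking."""
    current = epsilon_closure({START})
    for ch in text:
        moved = set()
        for state in current:
            moved |= CHAR_MOVES.get((state, ch), set())
        current = epsilon_closure(moved)
        if not current:  # no live states left, so the match has failed
            return False
    return ACCEPT in current


if __name__ == "__main__":
    print(nfa_match("aaa"))
    print(nfa_match("a" * 5000 + "b"))
```

Because the set of live states never holds more than four entries, the work per input character is bounded, and the long failing input is rejected in one pass instead of exploring every way of splitting the a's.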



