Hacker News
DuckDuckGo Regex Search (duckduckgo.com)
150 points by lelf on July 8, 2014 | 33 comments


Every comment here at time of writing is from people thinking that this is a way of searching the web using regular expressions. It is not.

It is a way of taking a regular expression and an input and then applying that regular expression to that input. In this particular example it takes the regular expression:

/(?x: (\w+) \s (\w+) )/

And applies it to:

"hacker news"

And then spits out the result:

"hacker | news"

Representing the two captured results.
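
The same evaluation can be sketched in Python. This is only an illustration, not how DuckDuckGo implements it: the `re.VERBOSE` flag stands in for the inline `(?x: ...)` group, and the variable names are made up for the example.

```python
import re

# re.VERBOSE plays the role of the x modifier: unescaped whitespace in the
# pattern is ignored, so only \s matches the real space in the input.
pattern = re.compile(r" (\w+) \s (\w+) ", re.VERBOSE)

match = pattern.search("hacker news")
captures = match.groups() if match else ()

# Join the two captured groups the way the answer box displays them.
result = " | ".join(captures)
print(result)  # hacker | news
```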


Thank you. Can you explain what "?x:" does? Searching for regular expression syntax (another thing this doesn't do) is quite tricky.


http://regex101.com/ tells me:

>"x modifier: extended. Spaces and text after a # in the pattern are ignored" //

The other explainers I had to hand failed and/or called it an error.


It depends on the flavor of regex being used. Try switching regex101 to JS or PY mode, and you'll see that the x modifier isn't supported.


It's another way of specifying a regular expression modifier, normally given at the end of a regex.

In this case, 'x' means "Extend your pattern's legibility by permitting whitespace and comments"
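
A small sketch of what that buys you, using Python's `re.VERBOSE` flag as the equivalent of the x modifier (Python takes the flag as an argument rather than at the end of the regex). The compact and extended patterns below are assumed to be equivalent, and the check at the end confirms they capture the same groups.

```python
import re

# Compact form: every character in the pattern is significant.
compact = re.compile(r"(\w+)\s(\w+)")

# Extended form: whitespace and "# ..." comments in the pattern are ignored.
extended = re.compile(
    r"""
    (\w+)   # first word, capture group 1
    \s      # a single whitespace character
    (\w+)   # second word, capture group 2
    """,
    re.VERBOSE,
)

text = "hacker news"
same = compact.search(text).groups() == extended.search(text).groups()
print(same)  # True
```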


I agree. What the internet needs is semantics, not regular expressions. If there is a need for regular expressions, it's only because semantics is not working; for example, the search engine does not have all the synonyms, so I might write "do|does|done". We should be focusing more on textual query expansion (if the query hides some background knowledge), auto-tagging of web pages (if the web pages assume some background knowledge) and clever disambiguation of queries (if we don't know which background knowledge to tap into).


So.... it executes a regex?


Yes. It's just a normal search term, except that in the "Answers" section, which DuckDuckGo sometimes provides when it thinks it knows the answer to what you're asking, it supplies the result of executing the regex.


What is the use for this?


Testing regular expressions


Awwww. I hoped for a moment it was literally searching the web with a regex. Unfortunately no, although it may be a handy regex cheat sheet.

Wake me when someone does this properly. It's only been done on a small scale or with very limited precomputed expressions, to my knowledge. Not many people would need it, but for those people it'd be insanely useful. But it's also insanely computationally hard, which means it'd be a really interesting technical achievement! There are no general-purpose reverse indexes that I know of that accelerate that as easily as keywords, but there are some data structures that might help a bit, although I can't think of practical ways to deploy them over arbitrary regexps specified at runtime! Plus some sanity heuristics and limits, of course, as regexps can undergo combinatorial explosion and some fun unexpected worst-case performance.
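
One data structure of that kind is a trigram index, sketched below as a toy. It is not necessarily what the commenter had in mind, and `TrigramIndex` and `required_literal` are hypothetical names for this example. A real system would derive the required trigrams from the regex itself; here the caller supplies a literal that every match must contain, the index narrows the candidate documents, and the full regex runs only on those.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Toy inverted index from trigram to the ids of documents containing it."""

    def __init__(self):
        self.docs = []
        self.postings = defaultdict(set)

    def add(self, text):
        doc_id = len(self.docs)
        self.docs.append(text)
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, pattern, required_literal):
        # Assumption: required_literal is a substring every match must
        # contain. If it is shorter than 3 characters it yields no trigrams,
        # and we fall back to scanning every document.
        grams = trigrams(required_literal)
        if grams:
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        else:
            candidates = set(range(len(self.docs)))
        # The index only filters; the regex still confirms each candidate.
        regex = re.compile(pattern)
        return [self.docs[i] for i in sorted(candidates)
                if regex.search(self.docs[i])]


index = TrigramIndex()
for doc in ["hacker news", "hacker nights", "duck duck go"]:
    index.add(doc)

hits = index.search(r"hacker\s+ne\w*s", "hacker")
print(hits)  # ['hacker news']
```

The filter helps only when the regex pins down some literal text. A pattern such as `\w+` gives the index nothing to work with, which is the "arbitrary regexps specified at runtime" problem.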


Maybe I am missing potential applications, but I can't understand why I would want to search the entire web with a regex pattern. To find all strings that could be SSNs?



Google does not implement regex search support because they said somewhere that the data storage needed for the index would be huge. And since regex search is used only by a small subset of users, it is not worth the effort.


This does not do what you think it does. It takes a regular expression and an input and applies that regular expression to that input. It doesn't search the web using the regex.


IIRC "security" plays an important role here too, because with regexps you can much more easily find sensitive data or, to be precise, the context (neighborhood) of sensitive data.


I would guess that this small subset of users is a big subset of the users who use DDG.


More likely, regex search is the kind of power that is only allowed in the paid Google Search API. And Google knows the value of its search data.


Actually, they could support it only for some computer-related domains, such as stackoverflow.


Does anyone provide Boolean web searches with nested terms (e.g., "a AND (b OR (c AND d))")? I don't need it all the time, but it would be very handy on occasion.



As all applications turn into web applications, all one-liners you should be able to knock out in the shell become search engine extensions.

I'm sure there's something positive in here if you look hard enough.


Doesn't appear to work for me. Is this supposed to be doing a regex search or just solving a regex? It doesn't seem to solve them generally.



Although, admittedly, this [1] doesn't do what I'd expect it to.

[1] https://duckduckgo.com/?q=regex+%2F%28%5Cw%2B%29+%5Cs+%28%5C...


Oh wait, I have to look at the text in the small grey bar above, and not at the list of search results. Still pretty non-obvious.


So, why again do we need a cloud service to test a regular expression?


I guess it's too much work to start a REPL.


/a{2,2}b/

Should return aaab, right? But I'm getting some different stuff.


It should be aab.
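
A quick check agrees with "aab", assuming the input string was "aaab" (the thread does not say what input was used). The quantifier `{2,2}` means exactly two, so the match has to start at the second "a".

```python
import re

# a{2,2} is the same as a{2}: exactly two "a" characters, then "b".
match = re.search(r"a{2,2}b", "aaab")
matched = match.group(0) if match else None
print(matched)  # aab
```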


s/Search/Eval/


/shameless plug/

I coded a regex (domain) search engine recently, and it looks like this:

http://namegrep.com/#hacker%28news%29%3F%7C.combinator

/end shameless plug/

What DuckDuckGo is doing seems just like basic regex evaluation.


[deleted]


This does not do what you think it does. Take a second to look at the link.



