I'm working on a text indexing/retrieval program, like locate (http://www.openbs... | Hacker News


silentbicycle on Feb 3, 2011 | on: What can you do in 2k LOC of C?

I'm working on a text indexing/retrieval program, like locate (http://www.openbsd.org/cgi-bin/man.cgi?query=locate) but for content and not just filenames, and with an index <=2% the size of the indexed data. It's very nearly together (integrating individually working parts now), and is currently ~1,500 lines (according to sloccount). Adding support for indexing Unicode text, more configuration, composite search queries (A and B near C and not D), etc. will no doubt make the source expand a bit, but it's still pretty small.

If you're interested in trying it out once it's ready, contact info is in my profile. I'm shooting for within a week or two for a beta vulgaris. (Requires Unix. ANSI C, strung together with sh and/or awk to avoid dependencies.)
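For anyone wondering what a composite query like "A and B and not D" involves, here is a minimal sketch. It assumes each search term maps to a sorted, duplicate-free array of document IDs (a posting list). That layout and the function names `postings_and` and `postings_and_not` are hypothetical, chosen only for illustration; this is not necessarily how the program described above stores or evaluates anything, and it does not cover "near", which would also need word positions.

```c
#include <stddef.h>

/* Hypothetical helper: intersect two sorted, duplicate-free posting lists
 * (arrays of document IDs). Writes the common IDs to out, which must have
 * room for at least min(na, nb) entries, and returns how many were written. */
size_t postings_and(const unsigned *a, size_t na,
                    const unsigned *b, size_t nb, unsigned *out)
{
    size_t i = 0, j = 0, n = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                /* a[i] cannot appear in b; skip it */
        } else if (a[i] > b[j]) {
            j++;                /* b[j] cannot appear in a; skip it */
        } else {
            out[n++] = a[i];    /* present in both lists */
            i++;
            j++;
        }
    }
    return n;
}

/* Hypothetical helper: keep the IDs of a that do not occur in b
 * ("a and not b"). out must have room for at least na entries. */
size_t postings_and_not(const unsigned *a, size_t na,
                        const unsigned *b, size_t nb, unsigned *out)
{
    size_t i = 0, j = 0, n = 0;
    while (i < na) {
        while (j < nb && b[j] < a[i])
            j++;                /* advance b up to the first ID >= a[i] */
        if (j == nb || b[j] != a[i])
            out[n++] = a[i];    /* a[i] is not in b, so it survives */
        i++;
    }
    return n;
}
```

Evaluating "A and B and not D" under these assumptions means calling `postings_and` on the lists for A and B, then passing that result and the list for D to `postings_and_not`. Both passes are linear in the list lengths because the inputs are sorted.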

ajays on Feb 5, 2011

Does your size calculation for the index include the dictionary too? I'd be interested in seeing this indexing/retrieval program of yours. I hope you release it soon!

silentbicycle on Feb 5, 2011

Yes.
