
I'm trying to picture a non-synthetic program where normalizing a URL is a significant part of the inner loop and I'm drawing a blank. Can someone help me out?


Here's a similar case: Ranking of poker hands. That can involve many of the same steps, like sorting a small list of inputs (5 or 7 cards) into a normalized configuration (three-of-a-kind is represented as AAABC where B > C) and taking a hash to memoize the result. And you sure might want that in a tight inner loop if you're writing a poker simulator or AI where you want to crunch a few billion iterations.
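A minimal sketch of that normalize-then-memoize idea (names and the rank encoding are illustrative, not from any real evaluator): sort the ranks by (multiplicity, rank) descending so a three-of-a-kind always comes out as AAABC with B > C, and let a cache remember configurations already ranked.

```python
from collections import Counter
from functools import lru_cache

@lru_cache(maxsize=None)
def normalize(ranks):
    """Canonical ordering of a hand's ranks: most-repeated first,
    ties broken by rank, so e.g. three-of-a-kind is always AAABC
    with B > C. lru_cache memoizes repeated configurations."""
    counts = Counter(ranks)
    return tuple(sorted(ranks, key=lambda r: (counts[r], r), reverse=True))
```

In a real simulator you'd memoize the final hand rank rather than the sorted tuple, but the caching shape is the same.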


Actually, for a poker hand evaluator the last thing you want to do is sort the hands. There's a small enough number of unique card combinations that you can arrange it so that you look up in pre-computed tables more often than not.

One clever trick is assigning a prime to each card value. By multiplying the values together you get a unique number representing each hand without needing to sort.

There's a great description of an algorithm at http://www.suffecool.net/poker/evaluator.html.
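A quick sketch of the prime trick (a toy version, not the linked evaluator's actual tables): give each rank a prime, and because multiplication is commutative, the product of a hand's primes is the same for every ordering of the cards, so it can key a precomputed lookup table with no sort.

```python
# One prime per card rank (assignment is arbitrary, just distinct primes).
RANK_PRIMES = {
    '2': 2, '3': 3, '4': 5, '5': 7, '6': 11, '7': 13, '8': 17,
    '9': 19, 'T': 23, 'J': 29, 'Q': 31, 'K': 37, 'A': 41,
}

def hand_key(ranks):
    """Order-independent key for a hand's rank multiset: unique by
    the fundamental theorem of arithmetic, since each rank maps to
    a distinct prime."""
    key = 1
    for r in ranks:
        key *= RANK_PRIMES[r]
    return key
```

Uniqueness follows from unique prime factorization: two hands get the same product only if they contain exactly the same multiset of ranks.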


They want to use a normalized URL as the hash key for a cache system.

Depending on exactly what type of hashing they are doing, the normalization could be a significant fraction of the lookup time.

If you are going to get all micro-optimized, you should probably combine the normalization and hash steps, as there is likely some string parsing / building that could be eliminated.
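As a rough sketch of what fusing the two steps might look like (this is an assumed toy, not anyone's production code): instead of building a normalized string and then hashing it, apply cheap normalizations character-by-character while feeding an FNV-1a hash, so no intermediate string is allocated.

```python
def normhash(url: str) -> int:
    """FNV-1a 64-bit hash of a URL, lowercasing the scheme and host
    on the fly (those parts are case-insensitive) so equivalent URLs
    hash identically without building a normalized copy first."""
    h = 0xcbf29ce484222325                    # FNV-1a 64-bit offset basis
    end_of_host = url.find('/', url.find('://') + 3)
    if end_of_host == -1:
        end_of_host = len(url)
    for i, ch in enumerate(url):
        if i < end_of_host:
            ch = ch.lower()                   # normalize scheme + host only
        h ^= ord(ch)
        h = (h * 0x100000001b3) & 0xFFFFFFFFFFFFFFFF  # FNV 64-bit prime
    return h
```

A fuller version would also strip default ports, percent-decode unreserved characters, and so on; the point is just that each normalization can happen inside the single hashing pass.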

Also: it's a pretty fun exercise.



