
I think much of this issue can be attributed to two underrated things:

1. Cache line misses.
2. The so-called definition of Big Data. (If the data fits easily into memory, then it's not big, period!)

Many times, I have seen simple awk / grep commands outperform Hadoop jobs. I personally feel it's a lot better to spin up larger instances, run your jobs, and shut them down than to bear the operational overhead of managing a Hadoop cluster.
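
To make the awk point concrete, here is a rough sketch of the kind of single-pass, single-machine job I mean, roughly what awk '{c[$1]++}' does. The function name count_by_field and the whitespace-separated input format are my own assumptions for illustration, not taken from any particular job.

```python
from collections import Counter


def count_by_field(lines, field=0, sep=None):
    """Count occurrences of one field in a single streaming pass.

    Hypothetical helper: `field` is a zero-based column index and
    `sep=None` means split on any run of whitespace, as awk does by default.
    """
    counts = Counter()
    for line in lines:
        parts = line.split(sep)
        # Skip blank or short lines that do not have the requested field.
        if len(parts) > field:
            counts[parts[field]] += 1
    return counts
```

Pass it any iterable of lines, such as an open file or sys.stdin, and it reads one line at a time. Memory use grows with the number of distinct keys, not with the size of the input, which is why this style of job often fits comfortably on one large instance.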


