
What I learned from this is that the issue was partly caused by improper use of Go channels and by the open-source product BoltDB.


IMO looking at the root causes here isn't that helpful. Software is complicated and there will always be some unknown bottleneck or bug lurking to knock you over on a bad day. The important lessons here are about:

* How their system architecture made them particularly vulnerable to this kind of issue

* Their actions to diagnose and attempt to mitigate the issue

* The whole later part about effectively cold-starting their entire infrastructure, all while millions of users were banging on their metaphorical door to start using the service again.


That and going all-in on HashiCorp.



