
My point may seem facetious, but wouldn't ensuring (through proper catches and guards) that a 500 doesn't happen at all inside your own systems be the best resiliency? And when you do experience one, it's a red alert and you immediately harden that component against whatever failed? I guess that's just the school of thought I've operated under.


Those approaches don't need to be mutually exclusive. However, the main assumption of the post is that these events will occur regardless of how hard you try to prevent them:

> ...there will always be some number off in the tail lost to an accidental bug, a bad deploy, an external service that fails to respond, or a database failure or timeout.

It's just advocating the same approach that's the norm for distributed systems: operate under the assumption that some components of the system will fail. This is assumed to be the case in an area such as modern database design, but is rarely considered when it comes to behavior of a generic web service.
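To make "assume some components will fail" concrete, here is a minimal sketch of a caller that treats a failed call as an expected outcome instead of an anomaly. The name `call_with_retries` and its parameters are hypothetical, not from the post, and it assumes the wrapped operation is safe to repeat (idempotent), which is probably the harder part in practice.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on any exception with exponential backoff.

    Hypothetical helper for illustration. Assumes operation() is idempotent,
    so running it more than once cannot corrupt state.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Something like `call_with_retries(lambda: client.charge(order_id))` would be the usage, where `client.charge` is a made-up stand-in for any call to an external service. This doesn't replace catching bugs up front; it just covers the failures that get through anyway.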

That said, IMO there's tons of cases where if your data is inconsistent for a few records you honestly just don't need to care, and so the idea of devoting attention and engineering resources to something that's both rare and low impact just doesn't make sense. There is a disclaimer in the post that this is only advised for critical services, but I think that's being ignored in the discussion.


> there's tons of cases where if your data is inconsistent for a few records you honestly just don't need to care

Enterprise applications can be data-sparse, but every incorrect record field has disproportionately high downstream impact.


The problem comes once you leave and that 500 red alert goes off. You want to give the person after you enough time to figure out what is up and fix it before your boss's boss's boss rolls in with a head of steam.

It is gonna fail eventually so make it happen gracefully.



