Not disagreeing with your point, and I'm sure you already know this, I just want...

atombender · on July 13, 2021

This misses the problem explained in the article, which is that there are scenarios where events are "acked" but things still go wrong because of bugs.

For example, say you rolled out code on the receiver side that did the wrong thing with each message. Now there's no way to replay the old webhook events in order to reinstate the right behaviour; there's no way to ask the producer to send them again.

The only way around this is to store a record of every received message on the receiver side, too, which the article author thinks is an unnecessary burden compared to polling.

Personally, I think push is an antipattern in situations where data needs to be kept in sync. The state about where the consumer is in the stream should be kept at the consumer side precisely so it can go back and forth.
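To make the consumer-side cursor idea concrete, here is a minimal sketch. It assumes the producer exposes an ordered event log that can be read from any offset; `EventLog`, `Consumer`, and `rewind` are hypothetical names for illustration, not any real provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Hypothetical producer-side log, standing in for a polling endpoint."""
    events: list = field(default_factory=list)

    def append(self, payload):
        self.events.append(payload)

    def read(self, after, limit=100):
        # Return (offset, payload) pairs whose offset is greater than `after`.
        start = after + 1
        return list(enumerate(self.events))[start:start + limit]


@dataclass
class Consumer:
    cursor: int = -1  # offset of the last processed event, owned by the consumer
    handled: list = field(default_factory=list)

    def poll(self, log, handler):
        for offset, payload in log.read(self.cursor):
            self.handled.append(handler(payload))
            self.cursor = offset

    def rewind(self, to_offset):
        # Replaying after a bug fix is just moving the cursor back.
        self.cursor = to_offset


log = EventLog()
for amount in (10, 20, 30):
    log.append({"amount": amount})

consumer = Consumer()
consumer.poll(log, lambda e: e["amount"] * 0)  # buggy handler: everything becomes 0
consumer.handled.clear()
consumer.rewind(-1)                            # go back to the start of the stream
consumer.poll(log, lambda e: e["amount"])      # fixed handler reprocesses the events
```

Because the cursor lives with the consumer, recovering from the bad deploy needs nothing from the producer beyond the read endpoint it already offers.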

curryst · on July 14, 2021

If you want to be 100% sure that you get all the webhooks, the sender could implement an incrementing "webhook ID". If the receiver knows the last webhook ID was 53 and the sender sends one for 55, you can tell one has been dropped. There are some other concerns around that, like what to do if 54 was sent but the two arrived out of order, or if they arrive almost simultaneously. Nothing that isn't solvable afaict though.

Of course, then you need a way for the receiver to retrigger or view the webhook if one gets missed, which starts to look like you have to have a polling endpoint anyway.
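A rough sketch of the receiver-side bookkeeping for that incrementing ID, including the out-of-order case. `GapDetector` and its methods are made-up names, and it assumes IDs increase by exactly 1 per webhook.

```python
class GapDetector:
    """Tracks sender-assigned sequence IDs and reports which ones are missing."""

    def __init__(self, last_seen=0):
        self.contiguous = last_seen  # every ID <= this has been received
        self.ahead = set()           # IDs received beyond a gap

    def receive(self, webhook_id):
        if webhook_id <= self.contiguous or webhook_id in self.ahead:
            return "duplicate"
        self.ahead.add(webhook_id)
        # Advance past any IDs that are now contiguous (handles reordering).
        while self.contiguous + 1 in self.ahead:
            self.contiguous += 1
            self.ahead.remove(self.contiguous)
        return "gap" if self.ahead else "ok"

    def missing(self):
        # IDs between the contiguous prefix and the highest ID seen so far.
        if not self.ahead:
            return []
        return [i for i in range(self.contiguous + 1, max(self.ahead))
                if i not in self.ahead]


detector = GapDetector(last_seen=53)
first = detector.receive(55)   # 54 has not arrived yet
gap = detector.missing()       # IDs to wait for, or to re-request
second = detector.receive(54)  # late arrival closes the gap
```

In practice the receiver would probably wait a short grace period before treating a missing ID as dropped, since a gap can just mean two deliveries crossed on the wire.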

BasieP · on July 13, 2021

We have a system that pushes loads of messages (as in thousands a minute), and one consumer insists on having the messages pushed to their HTTP backend. Their system is down every once in a while for quite some time. We're using an async queueing solution, but you can't keep those messages forever. We sometimes have millions of messages for them in their queues, which take up space. If all of our consumers had those problems we would have to buy loads of storage. So we simply drop messages older than x, and have an endpoint that they can call to retrieve the 'latest state of things'. This way, when they come back from a failure, they simply get the latest state and then continue with updates from our end. It's far from perfect, but it works really well.

I know the goal for most systems is just to be 'up to date', not to get the entire history. So in most cases you don't need to stash all the messages; you just need to be able to retrieve the latest state of stuff.
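Roughly, the "drop old messages, keep latest state" approach looks like the sketch below. This is an illustration with invented names (`Outbox`, `snapshot`), not our actual system, and time is passed in explicitly so the behaviour is easy to follow.

```python
from collections import deque


class Outbox:
    """Per-consumer queue that drops stale updates but always keeps latest state."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.pending = deque()  # (timestamp, key, value), oldest first
        self.latest = {}        # key -> most recent value, never expires

    def publish(self, key, value, now):
        self.latest[key] = value
        self.pending.append((now, key, value))
        self.expire(now)

    def expire(self, now):
        # Drop queued updates older than max_age; the snapshot is unaffected.
        while self.pending and now - self.pending[0][0] > self.max_age:
            self.pending.popleft()

    def snapshot(self):
        # What the hypothetical "latest state" endpoint would return.
        return dict(self.latest)


outbox = Outbox(max_age_seconds=60)
outbox.publish("order-1", "created", now=0)
outbox.publish("order-1", "shipped", now=30)
outbox.publish("order-2", "created", now=100)  # the first two updates have expired
```

A consumer that was down the whole time loses the intermediate "created" update for order-1, but `snapshot()` still tells it that order-1 is shipped and order-2 exists, which is all it needs to catch up.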

ThrowawayR2 · on July 13, 2021

> "Of course, the issue with this approach is most webhook providers... don't do that "

Embedded systems don't do that for webhooks because they can't (very little RAM or non-volatile storage) but customers clamor for webhooks anyway because it's what their web developers know how to use. So inevitably they're going to lose data but they're only getting what they asked for.