
If your goal is the highest availability possible, and you only get paged on outages, then your goal should be to never get paged.

You should be building resilient architectures, not being on firewatch duty.



This is a classic developer vs business incentives misalignment.

Developers don't want to ever be paged because they don't want to be bothered, but the business might be perfectly happy to pay you to be on firewatch duty.

Consider a "low traffic" alert: how can you tell the difference between a slow period at 3am on a holiday and a true outage? You can't without someone getting up and testing whether the site is still up. (Maybe you can automate that check, but there are always edge cases you can't automate.)

OP seemed to suggest it's better to disable the alarm than to just suffer the false alarm every now and then. I doubt very much that the people paying you for the on-call service would agree though.


> Developers don't want to ever be paged because they don't want to be bothered

This is a very reductive statement.

Developers have experienced their best colleagues burning out and leaving jobs because of on-call being completely overwhelming.

Developers want to behave intelligently.

Developers want the system to work.

Developers don’t want to burn their lifespan for false alarms that are being sent because someone didn’t spend 30 seconds thinking about whether a human being needs to be woken up in the middle of the night for whatever widget they’re slapping together.


> The goal for oncall should be to NEVER get called.

Is that not also reductive then? Or maybe my statement pretty accurately captures that sentiment without 4 sentences of explanation.

But no, instead of engaging with the meat of my argument you just reductively attack one sentence.

I get it, I'm oncall right now for my job. I don't like it when alarms go off. I also understand that if I were to tune the alarms so I "NEVER get called" I'd be out of a job soon enough because the business would go under.


Okay, dialing up good-faith engagement.

How would your interpretation change if the article said this instead?

> The goal for oncall should be to continuously tune the system toward having no outages and no false alarms.

FWIW, I did only attack one sentence. This was not exactly intended to be dismissive. It was my reaction to, in my eyes, the weakest part of your argument.


This is classic misalignment of business needs vs. perceived management wants.

Are they paying you to answer false alarms, or are they paying you to make sure the site is available and performant to keep customers happy? Nobody with half a brain wants to answer a bunch of false alarms. Are there people that will happily get paid to ACK yet another noisy alarm just to collect a paycheck? Certainly; but these are button pushers, not problem solvers.

Your low traffic alert scenario simply requires synthetic requests. This is how you test anything that has low usage but requires high reliability.
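A rough sketch of the synthetic-request idea, assuming the low-traffic signal comes from existing monitoring and that a plain HTTP GET is a good enough stand-in for real user traffic. The names `probe` and `should_page`, the three-attempt default, and the health URL in the usage note are all made up for illustration; none of them come from the thread.

```python
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 5.0) -> bool:
    """Send one synthetic request; True if the site answered with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def should_page(low_traffic: bool, check: Callable[[], bool], attempts: int = 3) -> bool:
    """Page a human only if traffic is low AND every synthetic probe fails.

    `check` is any zero-argument callable returning True when the site
    responded; it is injected so the decision logic can be tested offline.
    """
    if not low_traffic:
        return False
    # any() stops at the first successful probe, so a healthy site costs one request.
    return not any(check() for _ in range(attempts))
```

With something like `should_page(low_traffic=True, check=lambda: probe("https://example.com/health"))` (hypothetical URL), a quiet holiday night stays quiet as long as the probe succeeds, and the page only fires when the site is both idle and unreachable. It does not cover the edge cases mentioned upthread, such as a site that answers probes but fails for real users.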



