Johanna Larsson · April 18, 2026 · monitoring, reliability, how-it-works

How multi-probe voting eliminates false alerts

Most uptime alerts are false positives. A single probe in a single location gets a timeout, and suddenly your on-call engineer is checking their phone at 3 AM for nothing. Here's how multi-location verification fixes that.

If you've ever been woken up at 3 AM by a monitoring alert, only to find your site was perfectly fine, you know the feeling. You check the dashboard, everything is green, and you go back to bed wondering why you're paying for a service that can't tell the difference between an outage and a network hiccup.

This happens a lot. The industry number that gets thrown around is that 85% of uptime alerts are false positives. I don't know if that's exactly right, but having been on the receiving end of monitoring alerts for years, it feels right.

The root cause is almost always the same: a single probe in a single location runs a check, gets a timeout or a connection blip, and fires an alert. One bad data point, one page. That's how most monitoring tools work, and it's why most teams eventually start ignoring their alerts.

Why single-probe checks fail you

Think about what happens when your monitoring service checks your site from a single server in Virginia. That check has to travel through the public internet, hit your server, and come back. There are a lot of things that can go wrong along the way that have nothing to do with your service being down.

The probe's ISP has a routing issue. A submarine cable is having a bad day. There's packet loss at a peering point. Your CDN is slow to respond from that specific region. Any of these will cause a timeout, and a timeout means an alert.

The monitoring service doesn't know why the check failed. It just knows it did. So it alerts you.

Checking from multiple locations

The fix is straightforward: don't rely on a single check from a single location. Run the same check from multiple locations at the same time, and only alert when a majority of them agree that something is wrong.

Here's what that looks like in practice:

Nuremberg   → timeout    ✗
Helsinki    → 200 OK     ✓
Virginia    → 200 OK     ✓
Singapore   → 200 OK     ✓

Result: 1 of 4 failed → no alert

One probe timed out. Three others confirmed the service is up. No alert. No 3 AM wake-up. The timeout was a network issue between the Nuremberg probe and your server, not an actual outage.
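The voting rule itself is tiny. Here's a minimal sketch in Python; the `ProbeResult` shape and names are illustrative assumptions for this post, not Larm's actual API:

```python
from dataclasses import dataclass

@dataclass
class ProbeResult:
    location: str
    ok: bool  # did this probe get a healthy response?

def majority_down(results: list[ProbeResult]) -> bool:
    """Alert only when a strict majority of probes report failure."""
    failures = sum(1 for r in results if not r.ok)
    return failures > len(results) / 2

results = [
    ProbeResult("Nuremberg", ok=False),  # timeout
    ProbeResult("Helsinki", ok=True),
    ProbeResult("Virginia", ok=True),
    ProbeResult("Singapore", ok=True),
]

print(majority_down(results))  # 1 of 4 failed -> False, no alert
```

Flip two more probes to failing and the same function returns `True`: three of four agree, and that's worth waking someone up for.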

Compare that to a single-probe system: one timeout, one alert, one angry engineer checking their phone for nothing.

This is how Larm works. Every check runs from multiple global probes, and we only create an incident when a majority of probes report failure. It sounds simple, and it is, but it makes a huge difference in practice.

Adding a confirmation window

Voting alone isn't quite enough. Network issues can affect an entire region temporarily. If half the internet has trouble reaching your server for 30 seconds because of a BGP route change, you probably don't want an alert for that either.

So on top of voting, Larm adds a confirmation window. When a majority of probes report failure, we don't immediately fire an alert. We wait for the next check cycle and verify that the issue persists. Only then does the monitor's state transition.

The monitor moves between states: pending (waiting for first data), up, down, and stale (no recent data from probes). Each transition from up to down requires sustained agreement across probes over multiple check cycles. A momentary blip gets absorbed. A real outage gets caught and confirmed before anyone gets paged.
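A rough sketch of that state machine, with the confirmation window expressed as a count of consecutive failing cycles. The state names mirror the post; the transition logic and `confirm_cycles` parameter are illustrative assumptions, not Larm's implementation:

```python
UP, DOWN, PENDING = "up", "down", "pending"

class Monitor:
    def __init__(self, confirm_cycles: int = 2):
        self.state = PENDING
        self.confirm_cycles = confirm_cycles  # consecutive failing cycles needed
        self._failing = 0

    def record_cycle(self, majority_failed: bool) -> str:
        """Feed one check cycle's voting result; return the resulting state."""
        if majority_failed:
            self._failing += 1
            if self._failing >= self.confirm_cycles:
                self.state = DOWN  # sustained agreement: confirmed outage
        else:
            self._failing = 0      # a single bad cycle gets absorbed
            self.state = UP
        return self.state

m = Monitor(confirm_cycles=2)
m.record_cycle(True)          # first failing cycle: no alert yet
print(m.record_cycle(True))   # second consecutive failure -> "down"
```

A 30-second BGP blip produces one failing cycle followed by a healthy one, so the failure counter resets and no one gets paged.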

The tradeoff

There is one, and it's worth being honest about. Multi-probe voting with confirmation windows means Larm is slightly slower to alert than a single-probe system. If your service genuinely goes down, there's a delay while we confirm the outage from multiple locations.

In practice, this delay is a few seconds to a couple of minutes depending on your check interval. For most teams, that's a worthwhile tradeoff. You lose a tiny bit of detection speed and gain dramatically fewer false alerts. Your team starts trusting the alerts they receive, which means they actually respond to them.

What you get when an alert does fire

When Larm does page you, you know something is genuinely wrong. And you get the data you need to start debugging immediately. Every check captures a full request waterfall from every probe location: DNS resolution, TCP connection, TLS handshake, time to first byte, and content transfer.

So instead of just "your site is down", you can see that DNS is resolving fine, the TCP connection is timing out, and it's happening from all locations. That's probably a server issue, not a network issue. Or you can see that TLS handshake is slow from Asia but fine from Europe, which tells you something different entirely.
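As a sketch of how that diagnosis falls out of the data: given cumulative timestamps for each phase, per-phase durations are just successive differences. The field names and numbers here are made up for the example:

```python
def waterfall(timestamps: dict[str, float]) -> dict[str, float]:
    """Compute per-phase durations (ms) from cumulative phase timestamps."""
    phases = ["dns", "tcp", "tls", "ttfb", "transfer"]
    durations, prev = {}, 0.0
    for phase in phases:
        durations[phase] = timestamps[phase] - prev
        prev = timestamps[phase]
    return durations

# Cumulative milliseconds since the check started, from one probe.
probe = {"dns": 24.0, "tcp": 24.5, "tls": 25.0, "ttfb": 5000.0, "transfer": 5001.0}
print(waterfall(probe))
```

Here DNS, TCP, and TLS complete in under 25 ms and nearly five seconds are spent waiting for the first byte. If every probe shows the same shape, the network is fine and the server is the problem.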

This is the core of how Larm approaches monitoring. We think the job of a monitoring tool is to reduce stress, not create it. If you're tired of alert fatigue, give it a try. The free plan includes 15 monitors with multi-probe voting from all locations.

Start monitoring in minutes.

Free plan. 15 monitors. Multi-probe voting. No credit card.

Sign Up Free