Published: June 8, 2026

False positive uptime alerts: practical tuning for indie SaaS

False positives train you to ignore on-call: a blip during deploy, a cold start, or a 200 from a login page while the API is fine. The fix is not “buy Datadog” — it is health URL design and understanding how StillOnline marks checks DOWN.

This guide complements "Uptime probes, redirects, and antibot false greens" with interval choices, debounce behavior, and when to split checks on Pro.

Quick answer

StillOnline marks a check DOWN after two consecutive failed probes. On Free, the five-minute interval means up to roughly 10 minutes from the start of an outage to the alert. Reduce noise by pointing the check at a stable GET /api/health (not a homepage behind a WAF), returning 200 in under two seconds, and exempting the health path from aggressive bot rules. Free cannot change the interval (300 s only); Pro allows 60–300 s and up to 10 URL checks per project. Fix redirect chains with curl -L before blaming the monitor.
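The timing above can be worked out with a few lines of arithmetic. This is a rough sketch, not StillOnline's scheduler: the function name `detection_window` is hypothetical, and it assumes probes fire exactly on the interval and fail instantly while the service is down (no timeouts, no retries).

```python
from typing import Tuple


def detection_window(interval_s: int, fail_threshold: int = 2) -> Tuple[int, int]:
    """Return (best_case_s, worst_case_s) from outage start to DOWN.

    Assumption: probes run exactly every interval_s seconds and a probe
    fails immediately whenever the service is down.
    """
    if interval_s <= 0 or fail_threshold < 1:
        raise ValueError("interval_s must be > 0 and fail_threshold >= 1")
    # Best case: the outage starts just before a probe, so the first failure
    # is observed at once and the remaining failures follow one interval apart.
    best = (fail_threshold - 1) * interval_s
    # Worst case: the outage starts just after a passing probe, so a full
    # interval passes before the first failure is even observed.
    worst = fail_threshold * interval_s
    return best, worst


for plan, interval in [("Free (300 s)", 300), ("Pro minimum (60 s)", 60)]:
    best, worst = detection_window(interval)
    print(f"{plan}: DOWN between {best} s and {worst} s after the outage starts")
```

Under those assumptions, a shorter interval shrinks both ends of the window, which is also why a short blip is more likely to span two probes.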

Knobs you actually have

Knob	Free	Pro / Ultimate
Probe interval	300 s (5 min) fixed	60–300 s per check
Fail threshold → DOWN	2 consecutive fails	Same
Owner alert repeat while DOWN	Email throttled 15 min; Telegram per transition	Same
Number of URL checks	1 per project	10 (Pro) / 25 (Ultimate)

Debounce is intentional: requiring two consecutive failures keeps a single lost packet from paging you.
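To see what the two-failure threshold does to a noisy probe history, here is a minimal simulation. It is an illustration only: the function name `debounce` is hypothetical, and it assumes a single passing probe returns the check to UP, which the product may handle differently.

```python
from typing import Iterable, List


def debounce(results: Iterable[bool], fail_threshold: int = 2) -> List[str]:
    """Map probe results (True = pass) to the check state after each probe."""
    states: List[str] = []
    consecutive_fails = 0
    for ok in results:
        # A pass resets the streak; a fail extends it.
        consecutive_fails = 0 if ok else consecutive_fails + 1
        # Assumption: DOWN only once the streak reaches the threshold,
        # and one pass is enough to go back to UP.
        states.append("DOWN" if consecutive_fails >= fail_threshold else "UP")
    return states


# pass, blip, pass, fail, fail, pass
history = [True, False, True, False, False, True]
print(debounce(history))
```

The isolated failure in the second position never reaches DOWN; only the back-to-back pair does.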

Tuning workflow

1 — Verify the URL like a probe

curl -sS -o /dev/null -w "%{http_code} time:%{time_total}s final:%{url_effective}\n" -L --max-redirs 5 "https://api.yourproduct.com/health"

Expect 200, time < 2s, stable final: URL.
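If you run that curl command from a script or cron job, the three expectations can be checked mechanically. The sketch below parses one output line in the exact `-w` format used above; the function name `check_probe_line` and the two-second default are illustrative choices, not part of StillOnline.

```python
import re
from typing import List

# Matches the -w format above: "<code> time:<seconds>s final:<url>"
LINE_RE = re.compile(r"^(\d{3}) time:(\d+(?:\.\d+)?)s final:(\S+)$")


def check_probe_line(line: str, expected_url: str, max_seconds: float = 2.0) -> List[str]:
    """Return a list of problems found in one curl output line (empty = OK)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return [f"unrecognized curl output: {line!r}"]
    status = int(match.group(1))
    seconds = float(match.group(2))
    final_url = match.group(3)

    problems: List[str] = []
    if status != 200:
        problems.append(f"status {status}, expected 200")
    if seconds >= max_seconds:
        problems.append(f"took {seconds:.2f}s, expected under {max_seconds:.0f}s")
    if final_url != expected_url:
        # A different final URL means a redirect chain ended somewhere else.
        problems.append(f"redirected to {final_url}")
    return problems


url = "https://api.yourproduct.com/health"
print(check_probe_line(f"200 time:0.412345s final:{url}", url))
print(check_probe_line("200 time:2.500000s final:https://yourproduct.com/login", url))
```

An empty list means the URL behaves the way a probe needs; anything else is worth fixing before you touch intervals.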

2 — Separate liveness from heavy `/ready`

Cold starts (serverless, Edge Functions) may exceed the probe timeout once. Use a lightweight /health that skips the DB on the cold path; see "Health Endpoint Design for SaaS: /health vs /ready".
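One way to keep liveness cheap is to route /health and /ready separately, so only /ready touches dependencies. The following is a standard-library sketch under stated assumptions: `check_db`, `respond`, `HealthHandler`, `serve`, and port 8080 are all hypothetical names and values, and a real `check_db` should be a time-bounded ping against your own database.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Tuple


def check_db() -> bool:
    """Placeholder dependency check; replace with a real, time-bounded DB ping."""
    return True


def respond(path: str, db_check: Callable[[], bool] = check_db) -> Tuple[int, bytes]:
    """Decide status and body for a GET request path."""
    route = path.split("?", 1)[0]  # ignore any query string
    if route == "/health":
        # Liveness: no DB, no downstream calls, so cold starts stay fast.
        return 200, b"ok"
    if route == "/ready":
        # Readiness: allowed to be heavier and to fail with 503.
        return (200, b"ready") if db_check() else (503, b"db unavailable")
    return 404, b"not found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = respond(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the sketch locally; call serve() and curl http://127.0.0.1:8080/health."""
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```

Pointing the uptime check at /health keeps it green through a slow DB, while /ready remains available for load balancers or a second check.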

3 — Antibot and redirects

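A homepage that returns 200 with challenge HTML is a false green, and so is a redirect to a login page: the probe sees success while the product may be down. The full guide is "Uptime probes, redirects, and antibot false greens". PROBE_LIMITED (yellow) means antibot protection blocked the probe; it does not open an incident.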
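A quick local way to tell a real health response from challenge or login HTML is to assert on the body, not just the status code. This sketch assumes your health endpoint returns JSON shaped like `{"status": "ok"}`; that shape and the function name `is_real_health_response` are hypothetical, and this is a check you run yourself, not a StillOnline feature.

```python
import json


def is_real_health_response(status: int, content_type: str, body: str) -> bool:
    """True only if the response looks like the health JSON, not an HTML page."""
    if status != 200:
        return False
    # Challenge and login pages are normally served as text/html.
    if "application/json" not in content_type.lower():
        return False
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    # Assumed body shape: {"status": "ok"}
    return isinstance(payload, dict) and payload.get("status") == "ok"


print(is_real_health_response(200, "application/json", '{"status": "ok"}'))
print(is_real_health_response(200, "text/html; charset=utf-8", "<html>Checking your browser</html>"))
```

A 200 that fails this check is exactly the false green described above: the status code alone would have looked healthy.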

4 — Split checks on Pro

Check A: marketing site (optional)
Check B: api.yourproduct.com/health (authoritative)

Free must combine signals in one URL or accept tradeoffs.

5 — Deploy windows

A brief 503 during deploy may trigger DOWN. Probe a liveness-only /health if you want green during rolling deploys, or pause alerts manually. There is no snooze button in v1, so plan deploys.

When not to tune the monitor

Situation	Fix infrastructure, not threshold
Real 500s after deploy	Roll back
DB pool exhausted	Fix pool or return 503 on `/ready` — DB pool health
Cert expires tomorrow	SSL monitoring

FAQ

Can I set StillOnline to alert on one failure only?

No. Production uses two consecutive failures before DOWN to reduce noise.

Why did I get DOWN for 30 seconds during deploy?

Two failed probes in a row crossed the threshold. Use a lighter /health, or deploy when you can accept brief red. Recovery becomes visible only at the next probe, so the interval bounds how quickly a check can turn green again.

Does shortening interval to 60 s on Pro increase false positives?

It can. Faster detection means more sensitivity to blips. Start at 300 s until the health URL is stable, and read "Sub-minute uptime intervals: honest comparison" before you pay for 30 s intervals elsewhere.

StillOnline shows green but users cannot log in?

The probe is likely hitting the wrong URL. See the auth flow limits and the antibot guide.

Further reading

Uptime probes, redirects, and antibot false greens

Health Endpoint Design for SaaS: /health vs /ready

Multi-region health check strategy for indie SaaS

Telegram alerts when your service goes down

Sub-minute uptime intervals: honest comparison