How to monitor Server-Sent Events streams without killing connections

Your status page shows green while live notifications stopped ten minutes ago. That is an SSE false green: a naive HTTP probe got a 200 on the initial handshake, or a half-open stream looks alive while no events flow. Monitor Server-Sent Events health honestly with a companion GET sidecar endpoint, not by hammering the long-lived /events URL.

Quick answer

Do not point StillOnline at your SSE /events stream — long-lived connections and buffering break naive GET monitors. Add GET /health/sse that returns 200 when the last event was sent within N seconds, or 503 when the stream pipeline is stale. StillOnline checks that sidecar URL with HTTP GET and status codes only. Free: one URL, five-minute interval.

For general health URL design, see health endpoint design for SaaS. For WebSocket-specific upgrade issues, see WebSocket uptime health checks.

1. Map why naive HTTP probes break SSE consumers

Three failure modes: probing /events directly (opens/closes streams), probing a static page while /events is dead (false green), and half-open connections where TCP stays up but the server stopped writing events.

| Probe mistake | What goes wrong | Better approach |
| --- | --- | --- |
| GET /events every 5 min | New connection each probe; may skew metrics | Separate /health/sse sidecar |
| GET / only | Marketing page up; stream pipeline stale | Freshness heartbeat JSON |
| 200 on first byte only | Handshake OK; no events for hours | 503 when last_event_age > threshold |

Do: treat SSE like a subsystem with its own health signal mapped to HTTP 503. Do not: register the long-lived stream URL in StillOnline Free.

Triage: alert → curl GET /health/sse → if 503, SSE worker logs → if 200 but users report silence, suspect CDN buffering.
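
The triage line above can be read as a small decision function. The sketch below is one possible encoding of it. The name `nextTriageStep`, its input shape, and the step labels are assumptions made for illustration, and the `null` branch (request failed entirely) is an added case that the triage line does not spell out.

```javascript
// Hypothetical helper: maps a GET /health/sse result to the next triage step.
// `status` is the HTTP status code, or null when the request itself failed.
function nextTriageStep({ status, usersReportSilence }) {
  if (status === null) return "check-process-and-edge"; // assumption: refused or timed out
  if (status === 503) return "read-sse-worker-logs";
  if (status === 200 && usersReportSilence) return "suspect-cdn-buffering";
  if (status === 200) return "no-action";
  return "inspect-unexpected-status";
}
```

Keeping the mapping in one place makes it easy to paste the same logic into a runbook or an on-call script.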

2. Compare companion /health/sse vs synthetic stream readers

A companion sidecar exposes GET /health/sse with JSON like {"status":"ok","last_event_seconds_ago":12}. A synthetic stream reader opens a real SSE connection from a worker. StillOnline fits the sidecar: scheduled HTTP GET, status code only on Free.

| Approach | StillOnline | Synthetic SSE reader |
| --- | --- | --- |
| Protocol | HTTP GET on /health/sse | Long-lived text/event-stream client |
| Proves | App believes events are fresh | End-to-end event delivery |
| Cost/complexity | Low: one route | Worker + scheduler |

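const express = require("express");
const app = express();

const STALE_AFTER_SEC = 120; // freshness threshold in seconds
let lastEventAt = Date.now(); // in the SSE handler: lastEventAt = Date.now() on each write

// Sidecar health route: 200 while events are fresh, 503 once they go stale.
app.get("/health/sse", (req, res) => {
  const ageSec = (Date.now() - lastEventAt) / 1000;
  if (ageSec > STALE_AFTER_SEC) return res.status(503).json({ status: "stale", ageSec });
  res.json({ status: "ok", ageSec });
});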

Do: update lastEventAt whenever your SSE handler sends any event. Do not: return 200 when lastEventAt is stale; map staleness to 503 so StillOnline Free sees the failure.
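
To make the "update on every write" rule concrete, here is a minimal sketch of a freshness tracker that the SSE handler and the health route could share. The names `createFreshnessTracker`, `formatSseEvent`, `sendEvent`, and `healthStatus` are hypothetical, and the injectable clock is an assumption added so the logic can be exercised without waiting for real time to pass.

```javascript
// Hypothetical freshness tracker shared by the SSE handler and GET /health/sse.
// `now` is injectable so the logic can be checked with a fake clock.
function createFreshnessTracker(now = Date.now) {
  let lastEventAt = now();
  return {
    markSent() { lastEventAt = now(); },
    ageSec() { return (now() - lastEventAt) / 1000; },
  };
}

// Formats one Server-Sent Events frame: event name, one data line, blank line.
function formatSseEvent(eventName, payload) {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Writes the frame to an open response and records the send time.
function sendEvent(res, tracker, eventName, payload) {
  res.write(formatSseEvent(eventName, payload));
  tracker.markSent();
}

// Same 200/503 mapping as the /health/sse route: stale means 503.
function healthStatus(tracker, thresholdSec = 120) {
  const ageSec = tracker.ageSec();
  return ageSec > thresholdSec
    ? { code: 503, body: { status: "stale", ageSec } }
    : { code: 200, body: { status: "ok", ageSec } };
}
```

Routing every write through one function such as `sendEvent` means the timestamp cannot drift out of sync with what was actually sent.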

3. Fix proxy buffering and CDN pitfalls

Reverse proxies often buffer responses until complete — that breaks SSE delivery. nginx needs proxy_buffering off and often proxy_cache off on the SSE location. Send X-Accel-Buffering: no from the app.

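location /events {
  proxy_pass http://app_upstream;
  proxy_http_version 1.1;          # upstream keep-alive needs HTTP/1.1
  proxy_set_header Connection '';  # clear the default "Connection: close" sent upstream
  proxy_buffering off;             # forward each event as soon as the app writes it
  proxy_cache off;                 # never cache the stream
  chunked_transfer_encoding off;
}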

Do: send periodic comment heartbeats (: keepalive) from the server if your product allows idle periods. Do not: rely on default 60s proxy_read_timeout without aligning server heartbeat interval.
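
A heartbeat can be sketched in a few lines. The example below assumes a Node-style response object with a `write` method. The 15-second default interval and the names `SSE_HEADERS`, `writeHeartbeat`, and `startHeartbeat` are illustrative choices, not values from any particular setup.

```javascript
// Response headers commonly sent for an SSE stream. X-Accel-Buffering is the
// header mentioned above; the other two are standard for text/event-stream.
const SSE_HEADERS = {
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  "X-Accel-Buffering": "no",
};

// An SSE line starting with ":" is a comment and is ignored by EventSource clients.
function writeHeartbeat(res) {
  res.write(": keepalive\n\n");
}

// Starts a heartbeat on an open SSE response and returns a stop function.
// Call stop() when the client disconnects, for example on the "close" event.
function startHeartbeat(res, intervalMs = 15000) {
  const timer = setInterval(() => writeHeartbeat(res), intervalMs);
  return () => clearInterval(timer);
}
```

Whatever interval you pick, it needs to stay below the proxy read timeout so an idle but healthy stream is not cut off.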

4. Run detect, alert, and status page update workflow

When /health/sse flips to 503, run a fixed sequence so customers get honest comms fast.

  1. Confirm sidecar — curl GET /health/sse from outside.
  2. Check StillOnline — wait for debounced DOWN (two failed probes on Free).
  3. Triage layer — app process vs proxy vs upstream publisher.
  4. Post status page update within 15 minutes — "Live updates delayed."
  5. Resolve — re-test /health/sse 200 and spot-check browser EventSource.
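
Step 2 relies on debouncing: one failed probe is not yet an incident. The sketch below illustrates the general idea only; it is not StillOnline's implementation. The name `createDebouncer` is hypothetical, and flipping back to UP on the first successful probe is an assumption.

```javascript
// Illustrative debounce: report DOWN only after `failuresNeeded` consecutive
// failed probes. Assumption: a single successful probe flips back to UP.
function createDebouncer(failuresNeeded = 2) {
  let consecutiveFailures = 0;
  let state = "UP";
  return function record(probeOk) {
    consecutiveFailures = probeOk ? 0 : consecutiveFailures + 1;
    if (probeOk) state = "UP";
    else if (consecutiveFailures >= failuresNeeded) state = "DOWN";
    return state;
  };
}
```

The same pattern is useful inside your own tooling if you page on the sidecar from more than one place.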

Do: label the status component "Notifications" or "Live feed." Do not: mark resolved while last_event age is still above threshold.

5. Wire StillOnline operator runbook for SSE

Register https://api.yourproduct.com/health/sse — not /events. StillOnline sends GET, expects 200 when healthy, treats 503 as DOWN on Free.

  1. Sign in at stillonline.tech/app.
  2. Add check with full /health/sse URL, GET, expect 200.
  3. Exclude from auth — monitors send no cookies or JWT.
  4. Enable alerts. Free: one channel; Pro ($9/mo) adds more (see pricing).
  5. Document stall vs outage in internal runbook.
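
Step 3 is where many setups trip: a global auth middleware answers the monitor with 401 and the check never goes green. One way to carve out the health route in an Express-style app is sketched below. `PUBLIC_PATHS` and `skipAuthFor` are hypothetical names, and the wrapped auth middleware stands in for whatever your app already uses.

```javascript
// Paths that must answer without cookies or a JWT. Hypothetical list.
const PUBLIC_PATHS = new Set(["/health/sse"]);

// Wraps an existing auth middleware so GET requests to public paths skip it.
function skipAuthFor(publicPaths, authMiddleware) {
  return function (req, res, next) {
    if (req.method === "GET" && publicPaths.has(req.path)) return next();
    return authMiddleware(req, res, next);
  };
}
```

Usage would look like `app.use(skipAuthFor(PUBLIC_PATHS, requireAuth))`, where `requireAuth` is your existing middleware.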

Do: rehearse stall in staging by pausing the event publisher while keeping HTTP up. Do not: set freshness threshold shorter than your quiet-period heartbeats.
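
The timing rules in this guide (heartbeat below the proxy read timeout, freshness threshold above the heartbeat interval) can be checked at startup. The sketch below assumes heartbeats count toward freshness; `validateSseTimings` and its parameter names are hypothetical, and the 60-second default mirrors the default proxy_read_timeout mentioned earlier.

```javascript
// Checks the timing relationships described in this guide. Inputs are in seconds.
// Returns a list of problems; an empty list means the settings are consistent.
function validateSseTimings({ heartbeatSec, freshnessThresholdSec, proxyReadTimeoutSec = 60 }) {
  const problems = [];
  if (heartbeatSec >= proxyReadTimeoutSec) {
    problems.push("heartbeat interval should be shorter than proxy_read_timeout");
  }
  if (freshnessThresholdSec <= heartbeatSec) {
    problems.push("freshness threshold should be longer than the heartbeat interval");
  }
  return problems;
}
```

For example, a 15-second heartbeat with a 120-second threshold passes, while a 90-second heartbeat with a 60-second threshold fails both checks.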

What's next

You stopped probing /events directly, added /health/sse freshness, fixed proxy buffering, and registered StillOnline on the sidecar. Add an optional synthetic SSE reader when live updates are revenue-critical.

Open the StillOnline dashboard, paste /health/sse, and enable the alert channel you read during incidents.

FAQ

How do I tell a stream stall from a hard outage with StillOnline?

Stall: sidecar returns 503 with high ageSec while the process still accepts HTTP. Hard outage: connection refused or 502/503 at the edge. Start with curl /health/sse from outside.

Does StillOnline open SSE streams?

No. StillOnline sends HTTP GET to the URL you configure. Point it at /health/sse, not text/event-stream /events.

What CDN and proxy settings break SSE monitoring?

Response buffering, aggressive idle timeouts, and caching on the stream path. Disable buffering and caching for the stream, and extend read timeouts above the heartbeat interval.

Do I need a synthetic stream reader beyond StillOnline?

Not for MVP. The freshness sidecar plus StillOnline covers most indie SaaS. Add a worker that opens EventSource when notifications are business-critical.

Should I monitor /events or /health/sse in StillOnline?

/health/sse on Free (one URL). Never the long-lived stream URL.

How does SSE monitoring relate to WebSocket monitoring in StillOnline?

Same sidecar pattern: map subsystem health to GET 200/503. WebSockets need upgrade-specific checks — see our WebSocket uptime article.