Monitoring background workers and queues when the web tier is green

Split stacks (Next.js on Vercel plus BullMQ on Railway, or a Django web tier with Celery workers) tend to fail one half at a time: users can still load the dashboard, but exports and emails never finish.

StillOnline only reaches public HTTPS URLs. Internal workers without ingress need a proxy health signal or heartbeat URL your web tier exposes. This guide maps what to monitor and how to label Background jobs on your status page.

Quick answer

Register a URL that fails when workers stall: usually GET /health with a queue heartbeat included, or a dedicated GET /workers/health that returns 503 when the last successful job is stale. StillOnline probes every five minutes on Free (one URL per project); two failed probes mark the check DOWN. Add a Background processing component on your StillOnline status page for incidents that do not take down the API host. Native queue depth metrics are not built into StillOnline, so encode what you can in the HTTP JSON response. See cron heartbeat for scheduled jobs.

Web vs worker: what each layer proves

Layer          | Typical host           | StillOnline without extra work
Web / API      | api.product.com/health | Yes, primary check
Worker process | No public port         | No, needs proxy
Queue broker   | Redis internal         | Only via app health handler
Cron           | Scheduled shell        | Heartbeat URL pattern

Redis queue patterns are internal; external monitors see only what your app exposes.

Pattern 1 — Heartbeat in /health

{
  "status": "ok",
  "worker_last_ok": "2026-06-08T11:55:00Z",
  "queue_lag_seconds": 12
}

The handler returns 503 when now - worker_last_ok exceeds 600 seconds (10 minutes). Because everything lives behind one URL, this pattern fits the Free plan.
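The staleness rule above can be sketched as a small pure function. This is a minimal illustration, not StillOnline code: the name build_health, the constant name, and the "stale" status value are assumptions made for the example; only the 600-second threshold and the three JSON fields come from the pattern above.

```python
from datetime import datetime, timezone

# 10 minutes, the threshold used in Pattern 1.
MAX_HEARTBEAT_AGE_SECONDS = 600


def build_health(now, worker_last_ok, queue_lag_seconds):
    """Return (http_status, body) for a combined web + worker health check.

    build_health is a hypothetical helper; wire it into whatever
    framework serves your /health route.
    """
    age = (now - worker_last_ok).total_seconds()
    stale = age > MAX_HEARTBEAT_AGE_SECONDS
    body = {
        # "stale" is an assumed label; the pattern above only shows "ok".
        "status": "stale" if stale else "ok",
        "worker_last_ok": worker_last_ok.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "queue_lag_seconds": queue_lag_seconds,
    }
    return (503 if stale else 200), body


last_ok = datetime(2026, 6, 8, 11, 55, 0, tzinfo=timezone.utc)
# Five minutes after the last success: still healthy.
fresh = build_health(datetime(2026, 6, 8, 12, 0, 0, tzinfo=timezone.utc), last_ok, 12)
# Fifteen minutes after the last success: report 503.
stale = build_health(datetime(2026, 6, 8, 12, 10, 0, tzinfo=timezone.utc), last_ok, 12)
```

Keeping the decision in a pure function makes the threshold easy to unit test without standing up a web server, and the external monitor only ever needs to look at the status code.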

Pattern 2 — Dedicated worker health on Pro

Pro allows 10 checks per project:

  • Check 1: api.product.com/health (web)
  • Check 2: api.product.com/workers/health (jobs)

On the status page, keep separate API and Background jobs components so each can degrade independently.
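A rough sketch of serving both routes from one web process, using only the Python standard library. The names respond, HealthHandler, make_server, and STATE are hypothetical, and the in-memory STATE dict stands in for whatever shared store (database, Redis) your workers actually update.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

STALE_AFTER_SECONDS = 600  # assumed threshold, matching the 10-minute rule

# Hypothetical stand-in for a timestamp your workers write to a shared store.
STATE = {"worker_last_ok": time.time()}


def respond(path, now, worker_last_ok):
    """Map a request path to (http_status, body). Times are epoch seconds."""
    if path == "/health":
        # Web check: proves only that the web process answers.
        return 200, {"status": "ok"}
    if path == "/workers/health":
        # Job check: fails when the last successful job is too old.
        age = now - worker_last_ok
        if age > STALE_AFTER_SECONDS:
            return 503, {"status": "stale", "age_seconds": int(age)}
        return 200, {"status": "ok", "age_seconds": int(age)}
    return 404, {"error": "not found"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path, time.time(), STATE["worker_last_ok"])
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port=8000):
    """Build the server without starting it, so callers control the lifecycle."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

To try it locally, call make_server().serve_forever() and request both paths; each route then maps to its own check and its own status page component.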

Pattern 3 — Platform worker with public URL

Some PaaS workers expose HTTP (Fly machines, Railway public networking). If worker.project.fly.dev/health is truly customer-critical, register it directly; see the Fly monitoring guide.

Status page communication

When web returns 200 but you pause job processing for maintenance:

  • Mark Background jobs Degraded or Under maintenance
  • Keep API Operational if reads still work

Incidents on StillOnline are posted manually; there is no automatic queue integration.

FAQ

Can StillOnline connect to my Redis queue directly?

No. Expose queue/worker state through your HTTPS health route.

Worker on private network only — options?

Have the worker write its last success timestamp to your DB or Redis, and let the public /health route read that timestamp. See the heartbeat guide.
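One way to picture the private-worker option is a tiny shared heartbeat table. The sketch below uses an in-memory SQLite database purely as a stand-in for your real DB or Redis; the table name, the function names, and the "exports" job name are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone


def init_store(conn):
    """Create the (hypothetical) heartbeat table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS heartbeat "
        "(name TEXT PRIMARY KEY, last_ok TEXT NOT NULL)"
    )
    conn.commit()


def record_success(conn, name, now):
    """Called by the worker after a job finishes successfully."""
    conn.execute(
        "INSERT OR REPLACE INTO heartbeat (name, last_ok) VALUES (?, ?)",
        (name, now.isoformat()),
    )
    conn.commit()


def seconds_since_success(conn, name, now):
    """Called by the web tier's /health route; None means no success recorded."""
    row = conn.execute(
        "SELECT last_ok FROM heartbeat WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    return (now - datetime.fromisoformat(row[0])).total_seconds()


conn = sqlite3.connect(":memory:")
init_store(conn)
t0 = datetime(2026, 6, 8, 11, 55, 0, tzinfo=timezone.utc)
record_success(conn, "exports", t0)  # worker side
age = seconds_since_success(  # web side, five minutes later
    conn, "exports", datetime(2026, 6, 8, 12, 0, 0, tzinfo=timezone.utc)
)
```

The worker never needs ingress: it only writes to a store it can already reach, and the web tier turns the age of that timestamp into a 200 or 503.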

Should worker failures mark the whole status page red?

Only if customers cannot use the product. Otherwise, degrade only the Background jobs component.