Monitoring background workers and queues when the web tier is green
Split stacks, such as Next.js on Vercel plus BullMQ on Railway, or a Django web tier with Celery workers, can fail one half at a time. Users still load the dashboard, but exports and emails never finish.
StillOnline only reaches public HTTPS URLs. Internal workers without ingress need a proxy health signal or heartbeat URL your web tier exposes. This guide maps what to monitor and how to label Background jobs on your status page.
Quick answer
Register a URL that fails when workers stall. This is usually GET /health with a queue heartbeat included, or a dedicated GET /workers/health that returns 503 when the last successful job is stale. StillOnline probes every five minutes on Free (one URL per project), and two failed probes mark the check DOWN. Add a Background processing component on your StillOnline status page for incidents that do not take down the API host. Native queue depth metrics are not built into StillOnline, so encode what you can in the HTTP JSON response. See cron heartbeat for scheduled jobs.
Web vs worker: what each layer proves
| Layer | Typical host | StillOnline without extra work |
|---|---|---|
| Web / API | api.product.com/health | Yes — primary check |
| Worker process | No public port | No — needs proxy |
| Queue broker | Redis internal | Only via app health handler |
| Cron | Scheduled shell | Heartbeat URL pattern |
Redis queue patterns are internal; external monitors see only what your app exposes.
Pattern 1 — Heartbeat in /health
{
"status": "ok",
"worker_last_ok": "2026-06-08T11:55:00Z",
"queue_lag_seconds": 12
}
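The handler returns 503 when now - worker_last_ok exceeds 600 seconds (10 minutes). Because everything is reported through a single URL, this pattern fits the one-URL limit on Free.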
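A minimal sketch of that staleness rule in Python, assuming the worker's last-success time is already available as a timezone-aware UTC datetime. The function name `health_response` and the "stale" status value are illustrative assumptions, not part of StillOnline; only the 600-second threshold and the JSON field names come from the example above.

```python
import json
from datetime import datetime, timedelta, timezone

# Threshold from the text: 600 seconds (10 minutes).
STALE_AFTER = timedelta(seconds=600)


def health_response(worker_last_ok, queue_lag_seconds, now=None):
    """Return (http_status, json_body) for GET /health.

    worker_last_ok must be a timezone-aware UTC datetime.
    The "stale" status string is an assumption for illustration.
    """
    now = now or datetime.now(timezone.utc)
    stale = (now - worker_last_ok) > STALE_AFTER
    body = {
        "status": "stale" if stale else "ok",
        "worker_last_ok": worker_last_ok.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "queue_lag_seconds": queue_lag_seconds,
    }
    # A non-2xx status is what an external HTTP monitor can act on.
    return (503 if stale else 200), json.dumps(body)
```

Wire the returned status and body into whatever framework serves /health. Passing `now` explicitly keeps the rule easy to unit test.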
Pattern 2 — Dedicated worker health on Pro
Pro allows 10 checks per project:
- Check 1: api.product.com/health (web)
- Check 2: api.product.com/workers/health (jobs)
On the status page, keep API and Background jobs as separate components so each can degrade independently.
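One way to keep the two checks independent is to have each path look only at its own signal. The sketch below is illustrative: `route`, `web_deps_ok`, and the 10-minute threshold borrowed from Pattern 1 are assumptions, not a StillOnline API.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold, reusing the 10-minute rule from Pattern 1.
STALE_AFTER = timedelta(minutes=10)


def route(path, web_deps_ok, worker_last_ok, now=None):
    """Return the HTTP status code for a health path.

    web_deps_ok: hypothetical flag for whatever the web tier itself needs.
    worker_last_ok: timezone-aware UTC datetime of the last successful job.
    """
    now = now or datetime.now(timezone.utc)
    if path == "/health":
        # The web check ignores worker state on purpose.
        return 200 if web_deps_ok else 503
    if path == "/workers/health":
        # The worker check ignores web dependencies on purpose.
        fresh = (now - worker_last_ok) <= STALE_AFTER
        return 200 if fresh else 503
    return 404
```

With this split, a stalled worker turns only the second check red, which matches a status page where Background jobs degrades while API stays Operational.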
Pattern 3 — Platform worker with public URL
Some PaaS workers expose HTTP (Fly machines, Railway public networking). If worker.project.fly.dev/health is truly customer-critical, register it directly; see Fly monitoring.
Status page communication
When web returns 200 but you pause job processing for maintenance:
- Mark Background jobs Degraded or Under maintenance
- Keep API Operational if reads still work
Incidents like this are posted manually on StillOnline; there is no automatic queue integration.
Related guides
- Cron job heartbeat monitoring
- Webhook ingestion monitoring
- Health endpoint design
- API-only SaaS checks
FAQ
Can StillOnline connect to my Redis queue directly?
No. Expose queue/worker state through your HTTPS health route.
Worker on private network only — options?
Have the worker write a last-success timestamp to your database or Redis, and have the public /health route read that timestamp. See the heartbeat guide.
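A rough sketch of the worker side of that answer, using Python's built-in sqlite3 as a stand-in for your real database or Redis. The table name `worker_heartbeat` and the function names are hypothetical; adapt them to your own schema.

```python
import sqlite3
from datetime import datetime, timezone


def init_store(conn):
    """Create the heartbeat table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS worker_heartbeat ("
        "name TEXT PRIMARY KEY, last_ok TEXT NOT NULL)"
    )


def record_success(conn, name, now=None):
    """Call from the worker after each job finishes successfully."""
    now = now or datetime.now(timezone.utc)
    conn.execute(
        "INSERT OR REPLACE INTO worker_heartbeat (name, last_ok) VALUES (?, ?)",
        (name, now.isoformat()),
    )
    conn.commit()


def last_success(conn, name):
    """Call from the public /health handler; returns None if never recorded."""
    row = conn.execute(
        "SELECT last_ok FROM worker_heartbeat WHERE name = ?", (name,)
    ).fetchone()
    return datetime.fromisoformat(row[0]) if row else None
```

The health route can then compare `last_success(...)` with the current time and return 503 when the gap is too large, or when no success has ever been recorded.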
Should worker failures mark the whole status page red?
Only if customers cannot use the product. Otherwise, degrade only the Background jobs component.