FastAPI health check endpoint monitoring for SaaS
Your FastAPI app returns 200 on every route until the database pool dies at 3 a.m. The uptime monitor still shows green because it never hit a real probe URL. FastAPI health check endpoint monitoring closes that gap: two small routes, correct status codes, and an external check that sees what users see. You will add /health and /ready, test them with curl from outside your network, and register the URL in StillOnline for alerts and a public status page.
Quick answer
Use GET /health for liveness (process alive, no database) and GET /ready for readiness (database or Redis ping with a 3–5 second timeout). Return HTTP 200 when healthy and 503 when a required dependency fails. Keep probe paths public, keep healthy responses under two seconds, and point StillOnline at the URL whose failure means “customers cannot use the product.” Free plan: one project, one URL, five-minute interval, status code only (see pricing).
FastAPI is a Python framework for HTTP APIs. Uvicorn is the server process on your port. An external uptime monitor sends GET requests from the public internet on a schedule — like a browser without a login. That is different from an in-process library check: the monitor proves DNS, TLS, and your reverse proxy work end to end.
For stack-agnostic basics, read our health check URL quickstart. This guide adds copy-paste FastAPI routes, explains liveness vs readiness probe semantics, and shows how to monitor a FastAPI-based API SaaS with StillOnline without running Kubernetes.
1. Define what your FastAPI health route exposes (and hides)
A probe response is small JSON — a doorplate, not a warehouse inventory.
| Field | Include? | Why |
|---|---|---|
| status (ok / down) | Yes | Clear signal for logs and on-call |
| timestamp (UTC ISO) | Yes | Proves the response is fresh |
| version or git SHA | Yes on /health | Helps after deploys |
| checks per dependency | Yes on /ready | Shows what failed |
| Passwords, API keys, internal hosts | No | Probe URLs are often unauthenticated |
| Stack traces | No | Leaks details to scanners |
Do: use a Pydantic response model. Do not: reuse admin serializers on a public path.
Minimal /health handler:
from datetime import datetime, timezone

from fastapi import APIRouter
from pydantic import BaseModel

router = APIRouter()


class HealthResponse(BaseModel):
    status: str
    timestamp: str
    version: str


@router.get("/health", response_model=HealthResponse)
async def health() -> HealthResponse:
    return HealthResponse(
        status="ok",
        timestamp=datetime.now(timezone.utc).isoformat(),
        version="1.0.0",
    )
Mount with app.include_router(router). Name paths consistently (/health and /ready, or /healthz if your platform expects it) and never change them silently after monitors go live.
Do: keep /health async without database I/O. Do not: put SELECT 1 on liveness — that adds noise when the DB blips.
2. Split /health and /ready for workers and database pools
Liveness asks: “Is the process accepting HTTP?” Readiness asks: “Can this instance serve real traffic?” On Kubernetes, a failed liveness probe restarts the container; a failed readiness probe removes the pod from Service endpoints, so it stops receiving traffic. On Railway or Fly.io without K8s, the split still matters: Uvicorn can stay alive while every API call times out because the pool is exhausted.
Probe workflow: StillOnline GET → load balancer → Uvicorn → /ready → parallel DB/Redis checks → 200 or 503 → status page + alert.
Readiness with async DB check (5s timeout):
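import asyncio

from fastapi import APIRouter, Response, status
from sqlalchemy import text

# Assumes `engine` is your app's SQLAlchemy AsyncEngine and `router` is the
# APIRouter from the /health example; both are defined elsewhere.
CHECK_TIMEOUT = 5.0


async def _ping_database() -> None:
    # The context manager opens the connection and returns it to the pool on exit.
    async with engine.connect() as conn:
        await conn.execute(text("SELECT 1"))


async def check_database() -> bool:
    try:
        # One timeout covers both acquiring the connection and running the query.
        await asyncio.wait_for(_ping_database(), timeout=CHECK_TIMEOUT)
        return True
    except Exception:
        return False


@router.get("/ready")
async def ready(response: Response):
    if not await check_database():
        response.status_code = status.HTTP_503_SERVICE_UNAVAILABLE
        return {"status": "down", "checks": {"database": "fail"}}
    return {"status": "ok", "checks": {"database": "ok"}}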
Do: run checks with asyncio.gather so total time stays near one timeout. Do not: call sync DB drivers inside async def without a thread pool; that blocks the event loop and stalls every request on that Uvicorn worker.
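The asyncio.gather advice can be sketched without any real dependencies. In this minimal sketch, `run_check`, `run_all_checks`, and the two `fake_*` coroutines are hypothetical names standing in for your own pings, and the timeout is shortened so the demo finishes quickly. Treat it as an illustration under those assumptions, not a drop-in implementation.

```python
import asyncio
import time

CHECK_TIMEOUT = 0.5  # seconds; shortened for the demo (assumption)


async def run_check(name, check, timeout=CHECK_TIMEOUT):
    """Run one dependency check; any exception or timeout counts as a failure."""
    try:
        await asyncio.wait_for(check(), timeout=timeout)
        return name, True
    except Exception:
        return name, False


async def run_all_checks(checks, timeout=CHECK_TIMEOUT):
    """Run every check concurrently and return {name: passed}."""
    pairs = await asyncio.gather(
        *(run_check(name, check, timeout) for name, check in checks.items())
    )
    return dict(pairs)


# Hypothetical stand-ins for real database and Redis pings.
async def fake_database_ping():
    await asyncio.sleep(0.05)  # a fast, healthy dependency


async def fake_redis_ping():
    await asyncio.sleep(10)  # a hung dependency that will hit the timeout
```

Because the checks run concurrently, the hung dependency bounds the total time at roughly one CHECK_TIMEOUT rather than the sum of all timeouts. Calling `asyncio.run(run_all_checks({"database": fake_database_ping, "redis": fake_redis_ping}))` yields a dict that a /ready handler could turn into its `checks` field.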
Most API SaaS teams point StillOnline at /ready. Document the choice in README. See uptime checks for API-only SaaS for URL patterns.
3. Set return codes, timeouts, and auth on probe URLs
StillOnline Free checks the HTTP status code only, not JSON fields. Return 200 when healthy and 503 when a required dependency fails. Codes 200–399 count as success for Kubernetes HTTP probes; 503 is the honest readiness failure signal.
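The status code rule above can be kept in one small pure function, so the handler never returns 200 with a failing check in the body. This is a sketch: `build_ready_response` is a hypothetical helper name, and the payload shape simply mirrors the /ready example in this guide.

```python
from typing import Dict, Tuple


def build_ready_response(checks: Dict[str, bool]) -> Tuple[int, dict]:
    """Map dependency results to (HTTP status code, JSON body).

    Any failed required check turns the whole response into a 503.
    """
    healthy = all(checks.values())
    body = {
        "status": "ok" if healthy else "down",
        "checks": {
            name: "ok" if passed else "fail" for name, passed in checks.items()
        },
    }
    return (200 if healthy else 503), body
```

In a FastAPI handler you would assign the first element to `response.status_code` and return the second. Since a status-code-only monitor ignores the body, the code has to carry the signal.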
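Aim for a sub-second /health and a healthy /ready under about three seconds. Per-check timeouts of 3–5 seconds bound the failure case, but external monitors often time out near two seconds. Where that applies, treat two seconds as the hard ceiling and shorten the per-check timeouts to fit, so a hung dependency produces a clean 503 rather than a probe timeout.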
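Cache-Control on /health (a variant of the handler above, not a second route):
@router.get("/health")
async def health(response: Response):
    # Tell proxies and CDNs not to serve a stale "ok".
    response.headers["Cache-Control"] = "no-cache, no-store"
    return {"status": "ok"}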
Do: exclude /health and /ready from JWT middleware. Do not: require OAuth on probes — monitors send no Authorization header and you will get false 401 alerts.
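One way to keep probes outside an auth layer is a small ASGI middleware that skips the check for probe paths. The sketch below assumes your auth runs as middleware: `RequireBearerExceptProbes` and `PROBE_PATHS` are hypothetical names, and the bearer check only looks for the header's presence, so real token validation would still be needed.

```python
PROBE_PATHS = frozenset({"/health", "/ready"})


class RequireBearerExceptProbes:
    """ASGI middleware: demand a bearer token on every path except probes."""

    def __init__(self, app, public_paths=PROBE_PATHS):
        self.app = app
        self.public_paths = public_paths

    async def __call__(self, scope, receive, send):
        # Non-HTTP traffic and probe paths pass through without an auth check.
        if scope["type"] != "http" or scope["path"] in self.public_paths:
            await self.app(scope, receive, send)
            return

        headers = dict(scope.get("headers") or [])
        auth = headers.get(b"authorization", b"")
        if not auth.lower().startswith(b"bearer "):
            await send({
                "type": "http.response.start",
                "status": 401,
                "headers": [(b"content-type", b"application/json")],
            })
            await send({
                "type": "http.response.body",
                "body": b'{"detail": "unauthorized"}',
            })
            return

        # Placeholder: verify the token's signature and expiry here.
        await self.app(scope, receive, send)
```

With FastAPI this would typically be registered with `app.add_middleware(RequireBearerExceptProbes)`. Matching on exact paths keeps the public surface small; a prefix match would expose more than the two probes.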
Verify from outside your network:
curl -sS -o /dev/null -w "%{http_code}\n" https://api.example.com/health
curl -sS https://api.example.com/ready
Fix 301 redirects and TLS before registering the URL. Register the final https:// URL rather than an address that redirects: a monitor that follows a redirect may end up checking a different host or certificate than the one your API clients use. Compare patterns in our health endpoint design guide and the Laravel SaaS health route if your stack is mixed.
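The curl checks above can also be scripted with the Python standard library, for example in a deploy smoke test. This is a sketch, not how StillOnline itself probes: `probe` is a hypothetical helper, the two-second timeout is taken from this guide's ceiling, and redirects are deliberately not followed so a 301 shows up as a 301.

```python
import urllib.error
import urllib.request
from typing import Optional


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so 3xx responses surface as-is."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def probe(url: str, timeout: float = 2.0) -> Optional[int]:
    """Send one unauthenticated GET and return the HTTP status code.

    Returns None when no HTTP response arrives (DNS, TLS, connection
    or timeout failure).
    """
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="GET")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses (including unfollowed 3xx) arrive as HTTPError.
        return exc.code
    except OSError:
        # URLError, socket timeouts and TLS errors are all OSError subclasses.
        return None
```

Run it from a machine outside your network, for instance `probe("https://api.example.com/ready")`, and treat anything other than 200 as a reason to fix the route before registering it.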
4. Wire StillOnline checks and your alert channel
StillOnline runs scheduled GET probes against your full HTTPS URL, updates a public status page, and sends owner alerts. It does not scan your repo — you paste the exact URL.
- Sign in at stillonline.tech/app and create a project (one on Free).
- Add a check with the full URL, e.g. https://api.yourproduct.com/ready. Method GET, expect 200. Free interval is about five minutes.
- Wait 2–3 probe cycles before judging flakiness.
- Enable owner alerts in settings. Free includes one channel (email, Telegram, or Slack). Pro ($9/mo) and Ultimate ($29/mo) add more checks and channels; see pricing.
- Share the status page link in support docs.
Do: monitor the public edge URL customers use. Do not: point at localhost or internal hostnames.
StillOnline complements in-cluster probes: you catch DNS, TLS, and load-balancer failures pods cannot see. On Free, probes run about every five minutes and debounce two consecutive failures before alerting — plan on-call around that cadence, not per-second metrics.
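To plan on-call around that cadence, it helps to work out how long an outage can run before the first alert. The arithmetic below is a sketch under stated assumptions: probes fire at exact intervals, the alert goes out immediately on the Nth consecutive failed probe, and probe timeouts and delivery delays are ignored. `alert_delay_window` is a hypothetical helper, not a StillOnline API.

```python
def alert_delay_window(interval_minutes, failures_required):
    """Return (earliest, latest) minutes from outage start to the alert.

    Earliest: the outage begins just before a probe, so the first failure
    is immediate and the remaining failures take one interval each.
    Latest: the outage begins just after a successful probe, so the first
    failure is a full interval away.
    """
    earliest = interval_minutes * (failures_required - 1)
    latest = interval_minutes * failures_required
    return earliest, latest


# Free plan figures from this guide: about five minutes, two failures.
print(alert_delay_window(5, 2))  # → (5, 10)
```

Under those assumptions, a Free-plan alert arrives roughly five to ten minutes after the public URL starts failing.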
5. Pre-production checklist
- Document canonical URLs in README and runbook.
- Confirm 503 when DB is stopped in staging; 200 when restored.
- Measure latency from a home network, not only inside the datacenter.
- Strip secrets from JSON and access logs on probe paths.
- Record which URL StillOnline monitors and update on route renames.
- Skip PyPI health libraries until plain routes feel limiting.
Do: add health routes to every deploy checklist. Do not: ship a probe route that answers only HEAD; StillOnline probes with GET.
What's next
You have FastAPI routes an external monitor can trust and a StillOnline check that pages you when the public URL fails. Add a second check for your marketing site if it lives on another host, upgrade when you need more URLs or channels, and link the status page from your footer.
Open the StillOnline dashboard, paste your /ready URL, and enable the alert channel you will actually read at 3 a.m.
Related guides
- Health check URL quickstart
- Health endpoint design: /health vs /ready
- Uptime checks for API-only SaaS
- Laravel SaaS health check route
FAQ
Should the health endpoint StillOnline monitors hit the database?
Not on /health. Put database or Redis pings on /ready if you want StillOnline to show down when dependencies fail. Brief DB blips can cause alert noise — tune timeouts accordingly.
Should I monitor /health or /ready in StillOnline?
Monitor the URL whose failure means “customers cannot use the product.” Most API SaaS teams use /ready. Use /health alone only when dependency checks would cause unacceptable false positives.
503 or 200 when the database is down?
Return 503 on /ready when a required dependency is unreachable. Keep /health at 200 if the process is alive. StillOnline Free reads status codes only — pick the URL that matches the alert story you want.
How fast must /health respond for StillOnline?
Sub-second is ideal. Stay under about two seconds for external monitors including TLS and geographic latency.
Do I need a PyPI package for FastAPI health monitoring?
No, not for an MVP. Plain routes plus Pydantic are enough. Consider a library such as fastapi-health once you run many conditional checks, for example behind Kubernetes probes.