Framework Guide
FastAPI API Monitoring
Set up health check endpoints and uptime monitoring for your FastAPI application. Covers async health checks, Pydantic response models, dependency injection, database checks, and production monitoring with UptimeSignal.
Why Monitor Your FastAPI App?
FastAPI applications run on ASGI servers like Uvicorn and can fail in ways that are invisible without external monitoring. The process might be running, but the event loop could be blocked, database connections exhausted, or a deployment could have introduced a breaking change.
- Event loop blocking -- Sync code in async endpoints blocks the entire Uvicorn worker, making every request hang
- Worker exhaustion -- All Uvicorn workers occupied by slow requests means new connections queue indefinitely
- Dependency failures -- Database, Redis, or third-party API outages cascade through your application
- Memory leaks -- Long-running ASGI processes can accumulate objects that the garbage collector never frees
- Silent deployment failures -- A bad deploy might start the process but fail on the first real request
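The event-loop pitfall is easy to demonstrate without running a server. Below is a minimal, stdlib-only sketch; the handler names are hypothetical stand-ins for FastAPI endpoints:

```python
import asyncio
import time

async def blocking_handler() -> str:
    # BAD: time.sleep() blocks the event loop, so concurrent
    # "requests" end up running one after another.
    time.sleep(0.2)
    return "done"

async def yielding_handler() -> str:
    # GOOD: asyncio.sleep() suspends this coroutine and lets the
    # loop interleave other work (health checks, other requests).
    await asyncio.sleep(0.2)
    return "done"

async def main() -> tuple[float, float]:
    # Time five concurrent calls to each handler.
    start = time.monotonic()
    await asyncio.gather(*(blocking_handler() for _ in range(5)))
    blocked = time.monotonic() - start  # roughly 5 x 0.2s: serialized

    start = time.monotonic()
    await asyncio.gather(*(yielding_handler() for _ in range(5)))
    yielded = time.monotonic() - start  # roughly 0.2s: overlapped
    return blocked, yielded

if __name__ == "__main__":
    blocked, yielded = asyncio.run(main())
    print(f"blocking: {blocked:.2f}s, yielding: {yielded:.2f}s")
```

In a real FastAPI app the fix is to `await` an async driver, or to declare the endpoint with plain `def` so FastAPI runs it in a threadpool.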
Basic Health Check Endpoint
The simplest FastAPI health check returns a JSON response with a status field. Use a Pydantic model for the response so it appears in the auto-generated OpenAPI docs.
```python
from datetime import datetime, timezone

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class HealthResponse(BaseModel):
    status: str
    timestamp: str
    version: str

@app.get("/health", response_model=HealthResponse)
async def health() -> HealthResponse:
    return HealthResponse(
        status="healthy",
        timestamp=datetime.now(timezone.utc).isoformat(),
        version="1.0.0",
    )
```
Readiness Check with Database
A readiness endpoint verifies that your app can actually serve requests by checking critical dependencies. This is the endpoint you should point UptimeSignal at.
```python
import time

from fastapi import FastAPI, Response
from pydantic import BaseModel

app = FastAPI()

# `database` and `redis_client` are assumed to be async clients
# initialized elsewhere (e.g. at application startup).

class CheckResult(BaseModel):
    status: str
    response_time_ms: float | None = None
    message: str | None = None

class ReadyResponse(BaseModel):
    status: str
    checks: dict[str, CheckResult]

@app.get("/ready", response_model=ReadyResponse)
async def ready(response: Response) -> ReadyResponse:
    checks: dict[str, CheckResult] = {}
    healthy = True

    # Check database
    try:
        start = time.monotonic()
        await database.execute("SELECT 1")
        elapsed = (time.monotonic() - start) * 1000
        checks["database"] = CheckResult(
            status="ok", response_time_ms=round(elapsed, 2)
        )
    except Exception as e:
        checks["database"] = CheckResult(status="error", message=str(e))
        healthy = False

    # Check Redis
    try:
        start = time.monotonic()
        await redis_client.ping()
        elapsed = (time.monotonic() - start) * 1000
        checks["redis"] = CheckResult(
            status="ok", response_time_ms=round(elapsed, 2)
        )
    except Exception as e:
        checks["redis"] = CheckResult(status="error", message=str(e))
        healthy = False

    if not healthy:
        response.status_code = 503
    return ReadyResponse(
        status="healthy" if healthy else "unhealthy",
        checks=checks,
    )
```
SQLAlchemy Async Health Check
If you use SQLAlchemy with async sessions (the recommended approach for FastAPI), here is how to check database connectivity in your health endpoint.
```python
import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
async_session = async_sessionmaker(engine, expire_on_commit=False)

async def check_database() -> dict[str, str]:
    """Check database connectivity with a 5-second timeout."""
    try:
        async with async_session() as session:
            await asyncio.wait_for(
                session.execute(text("SELECT 1")),
                timeout=5.0,
            )
        return {"status": "ok"}
    except asyncio.TimeoutError:
        return {"status": "error", "message": "database timeout"}
    except Exception as e:
        return {"status": "error", "message": str(e)}
```
Tortoise ORM Health Check
For projects using Tortoise ORM, the health check pattern is similar but uses Tortoise's connection API.
```python
from tortoise import Tortoise

async def check_database() -> dict[str, str]:
    """Check Tortoise ORM database connectivity."""
    try:
        conn = Tortoise.get_connection("default")
        await conn.execute_query("SELECT 1")
        return {"status": "ok"}
    except Exception as e:
        return {"status": "error", "message": str(e)}
```
Dependency Injection for Health Checks
FastAPI's dependency injection system is the recommended way to structure health checks. Create reusable dependencies that check each service, then compose them in your endpoint.
```python
from fastapi import Depends, FastAPI, Response
from pydantic import BaseModel
from sqlalchemy import text

app = FastAPI()

# `async_session` and `redis_client` are assumed to come from the
# setup shown in the earlier sections.

class ServiceStatus(BaseModel):
    database: str
    redis: str
    all_healthy: bool

async def get_db_status() -> str:
    try:
        async with async_session() as session:
            await session.execute(text("SELECT 1"))
        return "ok"
    except Exception:
        return "error"

async def get_redis_status() -> str:
    try:
        await redis_client.ping()
        return "ok"
    except Exception:
        return "error"

async def get_service_status(
    db_status: str = Depends(get_db_status),
    redis_status: str = Depends(get_redis_status),
) -> ServiceStatus:
    return ServiceStatus(
        database=db_status,
        redis=redis_status,
        all_healthy=(db_status == "ok" and redis_status == "ok"),
    )

@app.get("/health")
async def health(
    response: Response,
    status: ServiceStatus = Depends(get_service_status),
) -> dict:
    if not status.all_healthy:
        response.status_code = 503
    return {
        "status": "healthy" if status.all_healthy else "unhealthy",
        "checks": {
            "database": status.database,
            "redis": status.redis,
        },
    }
```
Middleware for Monitoring Metrics
Add middleware to track request latency and error rates. This data complements external uptime monitoring by giving you internal visibility. Note that the counters below are per-process: with multiple Uvicorn workers, each worker reports only its own traffic.
```python
import time

from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

app = FastAPI()

# Module-level counters: add_middleware() instantiates the class itself,
# so state stored on a separately constructed instance would never be
# updated by real traffic.
metrics = {"request_count": 0, "error_count": 0, "total_latency": 0.0}

class MetricsMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        start = time.monotonic()
        response = await call_next(request)
        latency = time.monotonic() - start
        metrics["request_count"] += 1
        metrics["total_latency"] += latency
        if response.status_code >= 500:
            metrics["error_count"] += 1
        response.headers["X-Response-Time"] = f"{latency:.4f}"
        return response

app.add_middleware(MetricsMiddleware)

@app.get("/metrics")
async def get_metrics() -> dict:
    avg_latency = (
        metrics["total_latency"] / metrics["request_count"]
        if metrics["request_count"]
        else 0.0
    )
    return {
        "requests_total": metrics["request_count"],
        "errors_total": metrics["error_count"],
        "avg_latency_seconds": round(avg_latency, 4),
    }
```
Monitor with UptimeSignal
Use UptimeSignal to monitor your FastAPI health endpoint externally. This catches issues that internal monitoring misses -- server crashes, DNS failures, network problems, and certificate expiry.
- Add the `/health` endpoint to your FastAPI app using any of the examples above
- Deploy your app and verify the endpoint returns 200 at `https://your-api.com/health`
- Sign up at app.uptimesignal.io and create a monitor pointing to your health URL
- Configure alerts -- UptimeSignal emails you when the endpoint returns non-200 or times out
The free tier includes 25 monitors with 5-minute check intervals. Pro ($15/mo) supports 1-minute intervals and unlimited monitors -- ideal for FastAPI microservice architectures with many endpoints to watch.
Best Practices
- Always use async -- FastAPI runs on an async event loop. Sync health checks block the worker and delay other requests
- Add timeouts to dependency checks -- Use `asyncio.wait_for()` with a 5-second timeout on database and cache pings
- Return proper status codes -- 200 for healthy, 503 for unhealthy. UptimeSignal uses the status code to determine up/down
- Use Pydantic response models -- They document your health endpoint in the OpenAPI schema automatically
- Keep the endpoint unauthenticated -- Monitoring services need access without managing API keys
- Include version info -- Helps debug whether a deployment caused an outage
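The timeout advice can be sketched with nothing but the standard library; `slow_ping` below is a hypothetical stand-in for a database or cache call that has hung:

```python
import asyncio

async def slow_ping() -> str:
    # Hypothetical stand-in for a hung database or cache ping.
    await asyncio.sleep(10)
    return "pong"

async def check_with_timeout() -> dict[str, str]:
    # Bound the dependency check so a hung connection fails fast
    # instead of stalling the health endpoint itself.
    try:
        await asyncio.wait_for(slow_ping(), timeout=0.1)
        return {"status": "ok"}
    except asyncio.TimeoutError:
        return {"status": "error", "message": "timeout"}

result = asyncio.run(check_with_timeout())
print(result)  # {'status': 'error', 'message': 'timeout'}
```

Wrapping every external ping this way keeps the health endpoint's own response time bounded, which matters because the monitoring service applies its own request timeout.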