API Monitoring Guide
Get alerted when OpenAI has issues before your AI-powered features break. Learn which endpoints to monitor and how to set up proactive alerting.
If your application relies on GPT-4, ChatGPT, or DALL-E, an OpenAI outage can completely break your product. OpenAI has had significant outages, especially after major product launches. This guide covers how to monitor effectively.
OpenAI maintains a status page at status.openai.com. But relying on it alone has problems: updates are posted manually and often lag the actual incident, so your users may be affected before the page reflects anything.
External monitoring catches issues in 1-2 minutes. That means you can alert customers, show a graceful fallback, or switch to a backup before complaints roll in.
Not all endpoints are equal for health checking. Here's the recommended approach:
GET https://api.openai.com/v1/models
Lists available models. Fast, lightweight, and free to call (no token charges). Best for health checks.
POST https://api.openai.com/v1/chat/completions
Chat/GPT-4 endpoint. Don't use for monitoring — costs tokens per request and can be slow.
POST https://api.openai.com/v1/embeddings
Embeddings API. Costs tokens. Only monitor if you specifically rely on embeddings.
Pro tip: a single request to /v1/models verifies that your API key is valid, OpenAI's servers are reachable, and the API is functioning. That's everything a health check needs.
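A minimal health check against /v1/models might look like this in Python, using only the standard library (error handling is simplified and the timeout value is illustrative):

```python
import urllib.error
import urllib.request

OPENAI_MODELS_URL = "https://api.openai.com/v1/models"

def build_health_check(api_key: str) -> urllib.request.Request:
    # GET /v1/models is lightweight and free: no tokens are consumed.
    return urllib.request.Request(
        OPENAI_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

def is_healthy(api_key: str, timeout: float = 10.0) -> bool:
    """Return True if the API answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(build_health_check(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        # Unreachable, auth failure, or server error all count as unhealthy.
        return False
```

A monitoring job would call `is_healthy()` on a schedule and alert when it returns False.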
In the OpenAI dashboard, go to API Keys and create a new key. You can reuse an existing key, but a dedicated monitoring key can be rotated or revoked without touching your application.
Add a new HTTP monitor with these settings: URL https://api.openai.com/v1/models, method GET, a request header of Authorization: Bearer <your API key>, and an alert condition of any non-200 response or timeout.
Set up your preferred alert channels, such as email, Slack, SMS, or webhooks.
For AI features, 1-minute checks (Pro) are recommended. OpenAI issues can develop quickly during high-demand periods.
OpenAI's API has strict rate limits. It's important to distinguish between hitting your limits and an actual outage:
If your monitoring returns 401, your API key is invalid or expired. This isn't an OpenAI outage — check that your key is correctly configured.
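To make these distinctions concrete, a monitoring check might classify responses along these lines (a minimal sketch; the category names are illustrative):

```python
def classify_response(status_code: int) -> str:
    """Map an HTTP status from the OpenAI API to a monitoring verdict."""
    if status_code == 200:
        return "healthy"
    if status_code == 401:
        return "config_error"   # invalid or expired API key, not an outage
    if status_code == 429:
        return "rate_limited"   # your limits were hit, not an outage
    if status_code in (500, 502, 503):
        return "outage"         # OpenAI-side failure: page someone
    return "unknown"
```

Only the "outage" verdict should trigger an incident; "config_error" and "rate_limited" point back at your own setup.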
When your monitoring alerts you to an OpenAI issue, here's how to respond:
Check status.openai.com and @OpenAI on Twitter. Remember: your monitoring may catch issues before they're publicly acknowledged.
If you've built fallbacks, now's the time. Options include: cached responses, simpler model fallback (GPT-3.5 if GPT-4 is down), or a "temporarily unavailable" message.
Update your status page. "AI features may be slower than usual due to upstream provider issues" is better than silence.
For critical applications, consider having a backup: Anthropic's Claude, Google's Gemini, or open-source models. Multi-provider architectures are more resilient.
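A fallback chain like the one described above can be sketched as follows; the provider callables are placeholders, not real client code, and the final message mirrors the "temporarily unavailable" option:

```python
from typing import Callable, List, Tuple

def complete_with_fallback(
    prompt: str,
    providers: List[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, callable) provider in order; return the first success.

    If every provider fails, return a static unavailability message so the
    product degrades gracefully instead of erroring out.
    """
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            continue  # this provider is down; try the next one
    return "fallback", "AI features are temporarily unavailable."
```

In practice the list might be GPT-4, then GPT-3.5, then a cached response, with each callable wrapping the corresponding client.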
503: OpenAI is overloaded. Common after major releases or during peak hours. Usually temporary; retries often work.
429: You've exceeded your rate limits (requests per minute or tokens per minute). Implement exponential backoff.
500: Something broke on OpenAI's side. These are genuine outages. Report to OpenAI if persistent.
Timeout: OpenAI took too long to respond. Common during high demand. For complex prompts this may be normal; increase timeout thresholds.
Model unavailable: Sometimes GPT-4 is unavailable while GPT-3.5 works fine, or vice versa. The /v1/models endpoint will show which models are currently available.
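The exponential backoff recommended for 429s (and useful for transient 503s) can be sketched as a delay schedule; the base, cap, and retry count below are illustrative defaults, not OpenAI-prescribed values:

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 1.0,
                   cap: float = 60.0, jitter: bool = False) -> list:
    """Delays (in seconds) to wait before each retry: base * 2^attempt, capped.

    With jitter=True each delay is randomized between 0 and the computed
    value, which spreads out retries from many clients hitting limits at once.
    """
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays
```

A retry loop would sleep for each delay in turn, giving up (and alerting) once the schedule is exhausted.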
GET https://api.openai.com/v1/models. It's lightweight, returns quickly, and is free to call (no token charges). Avoid monitoring /v1/chat/completions for health checks as it incurs costs per request and can be slow.
Add OpenAI to your monitors and get alerted on issues before your users notice. Free tier includes 25 monitors.
Start monitoring free → No credit card required. Commercial use allowed.