API Monitoring Guide
Get alerted when OpenAI has issues before your AI-powered features break. Learn which endpoints to monitor and how to set up proactive alerting.
If your application relies on GPT-4, ChatGPT, or DALL-E, an OpenAI outage can completely break your product. OpenAI has had significant outages, especially after major product launches. This guide covers how to monitor effectively.
OpenAI maintains a status page at status.openai.com. But relying on it alone has problems: updates are posted manually and often lag the actual incident, so your users may be affected before the page reflects anything.
External monitoring catches issues in 1-2 minutes. That means you can alert customers, show a graceful fallback, or switch to a backup before complaints roll in.
Not all endpoints are equal for health checking. Here's the recommended approach:
GET https://api.openai.com/v1/models
Lists available models. Fast, lightweight, and free to call (no token charges). Best for health checks.
POST https://api.openai.com/v1/chat/completions
Chat/GPT-4 endpoint. Don't use for monitoring — costs tokens per request and can be slow.
POST https://api.openai.com/v1/embeddings
Embeddings API. Costs tokens. Only monitor if you specifically rely on embeddings.
Pro tip: a single request to /v1/models verifies that your API key is valid, OpenAI's servers are reachable, and the API is functioning. That's everything a health check needs.
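A minimal health check against /v1/models might look like this in Python, using only the standard library (error handling is simplified and the timeout value is illustrative):

```python
import urllib.error
import urllib.request

OPENAI_MODELS_URL = "https://api.openai.com/v1/models"

def build_health_check(api_key: str) -> urllib.request.Request:
    # GET /v1/models is lightweight and free: no tokens are consumed.
    return urllib.request.Request(
        OPENAI_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )

def is_healthy(api_key: str, timeout: float = 10.0) -> bool:
    """Return True if the API answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(build_health_check(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError):
        # Unreachable, auth failure, or server error all count as unhealthy.
        return False
```

A monitoring job would call `is_healthy()` on a schedule and alert when it returns False.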
In the OpenAI dashboard, go to API Keys and create a new key. You can reuse an existing key, but a dedicated monitoring key can be rotated or revoked without touching your application.
Add a new HTTP monitor with these settings: URL https://api.openai.com/v1/models, method GET, a request header of Authorization: Bearer <your API key>, and an alert condition of any non-200 response or timeout.
Set up your preferred alert channels, such as email, Slack, SMS, or webhooks.
For AI features, 1-minute checks (Pro) are recommended. OpenAI issues can develop quickly during high-demand periods.
OpenAI's API has strict rate limits. It's important to distinguish between hitting your limits and an actual outage:
If your monitoring returns 401, your API key is invalid or expired. This isn't an OpenAI outage — check that your key is correctly configured.
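To make these distinctions concrete, a monitoring check might classify responses along these lines (a minimal sketch; the category names are illustrative):

```python
def classify_response(status_code: int) -> str:
    """Map an HTTP status from the OpenAI API to a monitoring verdict."""
    if status_code == 200:
        return "healthy"
    if status_code == 401:
        return "config_error"   # invalid or expired API key, not an outage
    if status_code == 429:
        return "rate_limited"   # your limits were hit, not an outage
    if status_code in (500, 502, 503):
        return "outage"         # OpenAI-side failure: page someone
    return "unknown"
```

Only the "outage" verdict should trigger an incident; "config_error" and "rate_limited" point back at your own setup.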
When your monitoring alerts you to an OpenAI issue, here's how to respond:
Check status.openai.com and @OpenAI on Twitter. Remember: your monitoring may catch issues before they're publicly acknowledged.
If you've built fallbacks, now's the time. Options include: cached responses, simpler model fallback (GPT-3.5 if GPT-4 is down), or a "temporarily unavailable" message.
Update your status page. "AI features may be slower than usual due to upstream provider issues" is better than silence.
For critical applications, consider having a backup: Anthropic's Claude, Google's Gemini, or open-source models. Multi-provider architectures are more resilient.
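A fallback chain like the one described above can be sketched as follows; the provider callables are placeholders, not real client code, and the final message mirrors the "temporarily unavailable" option:

```python
from typing import Callable, List, Tuple

def complete_with_fallback(
    prompt: str,
    providers: List[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, callable) provider in order; return the first success.

    If every provider fails, return a static unavailability message so the
    product degrades gracefully instead of erroring out.
    """
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception:
            continue  # this provider is down; try the next one
    return "fallback", "AI features are temporarily unavailable."
```

In practice the list might be GPT-4, then GPT-3.5, then a cached response, with each callable wrapping the corresponding client.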
503: OpenAI is overloaded. Common after major releases or during peak hours. Usually temporary; retries often work.
429: You've exceeded your rate limits (requests per minute or tokens per minute). Implement exponential backoff.
500: Something broke on OpenAI's side. These are genuine outages. Report to OpenAI if persistent.
Timeout: OpenAI took too long to respond. Common during high demand. For complex prompts this may be normal; increase timeout thresholds.
Model unavailable: Sometimes GPT-4 is unavailable while GPT-3.5 works fine, or vice versa. The /v1/models endpoint will show which models are currently available.
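The exponential backoff recommended for 429s (and useful for transient 503s) can be sketched as a delay schedule; the base, cap, and retry count below are illustrative defaults, not OpenAI-prescribed values:

```python
import random

def backoff_delays(max_retries: int = 5, base: float = 1.0,
                   cap: float = 60.0, jitter: bool = False) -> list:
    """Delays (in seconds) to wait before each retry: base * 2^attempt, capped.

    With jitter=True each delay is randomized between 0 and the computed
    value, which spreads out retries from many clients hitting limits at once.
    """
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        if jitter:
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays
```

A retry loop would sleep for each delay in turn, giving up (and alerting) once the schedule is exhausted.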
GET https://api.openai.com/v1/models. It's lightweight, returns quickly, and is free to call (no token charges). Avoid monitoring /v1/chat/completions for health checks as it incurs costs per request and can be slow.
Add OpenAI to your monitors and get alerted on issues before your users notice. Free tier includes 25 monitors.
Start monitoring free → No credit card required. Commercial use allowed.