Handling Anthropic's Silent 'Overloaded' Errors
A step-by-step guide to detecting Anthropic’s streamed `overloaded_error` and building a resilient, automatic fallback to another model using Portkey.
As a developer, one of the most frustrating issues to debug is a silent failure. Your application appears to be working, your status codes are green, but your users are getting a broken experience. This is exactly what can happen with Anthropic’s API during peak load.
This cookbook provides a production-ready recipe to handle these silent errors gracefully, ensuring your application remains resilient and your users stay happy.
The Challenge: Anthropic’s “Successful” Failure
When Anthropic’s models are overloaded, they can send an `overloaded_error` message. For streaming responses, this error arrives in the first chunk of the stream, but the HTTP response code is still `200 OK`.
Because the status code is `200`, standard fallback logic (like Portkey’s `on_status_codes`) won’t trigger. Your application might interpret this as a successful but empty response, leading to a poor user experience.
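For illustration, here is roughly what such a stream can look like on the wire, based on Anthropic’s documented error event shape (the exact payload may vary):

```text
HTTP/1.1 200 OK
content-type: text/event-stream

event: error
data: {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}
```

Note the `"type":"overloaded_error"` string inside the event data; that is the signal we’ll detect.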
The Strategy: Content-Aware Fallbacks
We’ll use Portkey’s Output Guardrails to inspect the content of the response stream. This allows us to create a rule that looks for the specific `overloaded_error` message and triggers a fallback, even when the status code is `200`.
Here’s the plan:
- Create an Output Guardrail to detect the `"type":"overloaded_error"` string in the response.
- Build a Fallback Config that uses this guardrail to trigger a fallback to a different model (like GPT-4o).
- Implement and Verify the solution in your code.
The Recipe: A Step-by-Step Guide
Step 1: Create the Output Guardrail
First, we’ll create a simple guardrail on Portkey that scans the response for the specific error string.
- Navigate to the Guardrails page in your Portkey dashboard.
- Click Create Guardrail and configure it with the following settings:
  - Guardrail Type: `Contains`
  - Check: `ANY`
  - Words or Phrases: `"type":"overloaded_error"`

*Configuring the 'Contains' Output Guardrail in Portkey*
Give your guardrail a memorable name (like `anthropic-overload-detector`) and save it. Note down its ID (e.g., `grl_...`), as you’ll need it in the next step.
Step 2: Create the Fallback Config
Now, let’s define the fallback behavior. We’ll create a Portkey Config that links our new guardrail to a fallback strategy.
- Navigate to the Configs page and click Create Config.
- Paste in a JSON configuration like the one below.
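The following is a sketch of such a config; the guardrail ID, virtual keys, and model names are placeholders, so substitute your own values:

```json
{
  "output_guardrails": ["grl_your_guardrail_id"],
  "strategy": {
    "mode": "fallback",
    "on_status_codes": [246, 446]
  },
  "targets": [
    {
      "virtual_key": "your-anthropic-virtual-key",
      "override_params": { "model": "claude-sonnet-4-20250514" }
    },
    {
      "virtual_key": "your-openai-virtual-key",
      "override_params": { "model": "gpt-4o" }
    }
  ]
}
```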
What this config does:

- `output_guardrails`: Attaches our detector guardrail to this config.
- `strategy`: Defines a `fallback` mode.
- `on_status_codes: [246, 446]`: This is the key part. When our Output Guardrail detects the error string in a stream, Portkey internally assigns the request status `446` or `246` based on your guardrail settings. This rule tells Portkey to trigger the fallback when that happens.
- `targets`: Defines the sequence of models to try. Portkey will first attempt the primary Anthropic model and fall back to the OpenAI model if the guardrail is triggered.
Save the config and note its ID (e.g., `cfg_...`).
Step 3: Implement in Your Code
Finally, let’s use this config in our application. Simply pass the Config ID when making your Portkey request.
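Here’s a minimal sketch using Portkey’s Python SDK; the API key and Config ID below are placeholders:

```python
from portkey_ai import Portkey

# The Config ID ties this request to the guardrail and fallback behavior.
portkey = Portkey(
    api_key="YOUR_PORTKEY_API_KEY",  # placeholder
    config="cfg_your_config_id",     # the Config ID from Step 2
)

# Stream a chat completion. If Anthropic returns an overloaded_error,
# Portkey retries the request against the fallback target transparently.
stream = portkey.chat.completions.create(
    messages=[{"role": "user", "content": "Write a haiku about resilience."}],
    model="claude-sonnet-4-20250514",  # may be overridden by the config's override_params
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta
    if delta and delta.content:
        print(delta.content, end="", flush=True)
```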
With this setup, if Anthropic returns an `overloaded_error`, the guardrail will catch it, and Portkey will automatically retry the request with your fallback model (OpenAI’s GPT-4o), ensuring your application remains operational.
Verifying the Fallback
You can confirm the fallback is working by checking your Portkey logs.
- Go to the Logs page in your Portkey dashboard.
- You should see two requests for a single incoming call:
  - The first request, to Anthropic, will have a status of `output_guardrail_triggered` (Status Code `446`).
  - The second request, to your fallback provider (OpenAI), will have a status of `success`.
This provides a clear audit trail of the automatic failover, giving you full visibility into your application’s resilience.
Summary
By combining Portkey’s Output Guardrails with a Fallback Config, you can build a robust system that intelligently handles content-level errors that are normally invisible to traditional, status-code-based monitoring. This ensures your AI features remain reliable, even when upstream providers are experiencing issues.