A step-by-step guide to detecting Anthropic’s streamed overloaded_error
and building a resilient, automatic fallback to another model using Portkey.
When Anthropic’s API is overloaded, it returns an overloaded_error message. For streaming responses, this error arrives in the first chunk of the stream, but the HTTP status code is still 200 OK.
Because the status code is 200, standard fallback logic (like Portkey’s on_status_codes) won’t trigger. Your application might interpret this as a successful but empty response, leading to a poor user experience.
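To make the failure mode concrete, here is a minimal Python sketch of how a client could spot the error inside the first streamed chunk even though the HTTP status is 200. The function name first_chunk_is_overloaded and both sample payloads are illustrative assumptions, modeled on the server-sent-event shape Anthropic documents for stream errors. It is not code from Portkey or Anthropic.

```python
import json


def first_chunk_is_overloaded(chunk: str) -> bool:
    """Return True if an SSE chunk carries an overloaded_error payload."""
    for line in chunk.splitlines():
        # Only "data:" lines carry the JSON payload in server-sent events.
        if not line.startswith("data:"):
            continue
        try:
            payload = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # Ignore lines that are not valid JSON.
        if not isinstance(payload, dict):
            continue
        error = payload.get("error")
        if (
            payload.get("type") == "error"
            and isinstance(error, dict)
            and error.get("type") == "overloaded_error"
        ):
            return True
    return False


# Assumed sample chunks, for illustration only.
sample_error = (
    "event: error\n"
    'data: {"type": "error", "error": '
    '{"type": "overloaded_error", "message": "Overloaded"}}\n\n'
)
sample_ok = (
    "event: message_start\n"
    'data: {"type": "message_start", "message": {"id": "msg_example"}}\n\n'
)

print(first_chunk_is_overloaded(sample_error))  # True
print(first_chunk_is_overloaded(sample_ok))     # False
```

A check like this has to live in every client that talks to the model, which is why the rest of this guide moves the detection into the gateway instead.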
The solution is a Portkey Output Guardrail that detects the overloaded_error message and triggers a fallback, even when the status code is 200.
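Here’s the plan: create an Output Guardrail that checks for the "type":"overloaded_error" string in the response, then use that guardrail to trigger a fallback.

For the guardrail, use the Contains check, set to ANY, with "type":"overloaded_error" as the string to match.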
Configuring the 'Contains' Output Guardrail in Portkey
Give the guardrail a name (e.g., anthropic-overload-detector) and save it. Note down its ID (e.g., grl_...), as you’ll need it in the next step.
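The matching rule behind this guardrail can be sketched in a few lines of Python. This is only an illustration of what a Contains check with ANY means (a match if at least one listed string occurs in the text). The helper name contains_any is hypothetical and does not reflect Portkey's internal implementation.

```python
def contains_any(text: str, words: list[str]) -> bool:
    """Return True if at least one of the given strings occurs in text."""
    return any(word in text for word in words)


# The single string this guide's guardrail looks for.
NEEDLES = ['"type":"overloaded_error"']

# Assumed example response bodies, for illustration only.
overloaded_body = '{"type":"error","error":{"type":"overloaded_error"}}'
normal_body = '{"type":"message_start","message":{"id":"msg_example"}}'

print(contains_any(overloaded_body, NEEDLES))  # True
print(contains_any(normal_body, NEEDLES))      # False
```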
output_guardrails: Attaches our detector guardrail to this config.
strategy: Defines a fallback mode.
on_status_codes: [246,446]: This is the key part. When our Output Guardrail detects the error string in a stream, Portkey internally assigns the response status 446 or 246, depending on your guardrail settings. This rule tells Portkey to trigger the fallback when that happens.
targets: Defines the sequence of models to try. Portkey first attempts the primary Anthropic model and falls back to the OpenAI model if the guardrail is triggered.

Save the config and note down its ID (e.g., cfg_...).
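As a rough illustration of how the four fields described above fit together, the sketch below builds such a config as a Python dictionary and prints it as JSON. The key names output_guardrails, strategy, on_status_codes, and targets come from this guide. Everything else is an assumption: the nesting, the provider and override_params fields, and the angle-bracket placeholders. Credentials are omitted, and grl_... stands in for your real guardrail ID. Check the layout against the Portkey config schema before using it.

```python
import json

# Sketch only: field layout is assumed, not copied from Portkey's docs.
config = {
    "strategy": {
        "mode": "fallback",
        # Statuses Portkey assigns when an output guardrail is triggered.
        "on_status_codes": [246, 446],
    },
    # Placeholder: replace with the guardrail ID you noted earlier.
    "output_guardrails": ["grl_..."],
    "targets": [
        # Primary target: the Anthropic model (name left as a placeholder).
        {
            "provider": "anthropic",
            "override_params": {"model": "<your-anthropic-model>"},
        },
        # Fallback target: OpenAI's GPT-4o.
        {
            "provider": "openai",
            "override_params": {"model": "gpt-4o"},
        },
    ],
}

print(json.dumps(config, indent=2))
```

The order of targets matters: the first entry is tried first, and later entries are only used when a listed status code comes back.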
Now, if Anthropic streams back an overloaded_error, the guardrail will catch it, and Portkey will automatically retry the request with your fallback model (OpenAI’s GPT-4o), ensuring your application remains operational.
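The control flow described here can be simulated locally without any network calls. The sketch below is a simplified model of fallback on status codes, not Portkey's implementation. The function run_with_fallback and the two stub targets are hypothetical names invented for the example.

```python
from typing import Callable

# A target is modeled as a function returning (status_code, body).
Target = Callable[[], tuple[int, str]]


def run_with_fallback(
    targets: list[Target], on_status_codes: list[int]
) -> tuple[int, str]:
    """Try targets in order; move on when the status is in on_status_codes."""
    if not targets:
        raise ValueError("at least one target is required")
    status, body = 0, ""
    for call in targets:
        status, body = call()
        if status not in on_status_codes:
            return status, body
    # Every target triggered the fallback rule: return the last result.
    return status, body


def overloaded_anthropic() -> tuple[int, str]:
    # Stub: guardrail matched the error string, so the status is 446.
    return 446, '{"type":"error","error":{"type":"overloaded_error"}}'


def healthy_openai() -> tuple[int, str]:
    # Stub: the fallback model answers normally.
    return 200, "Hello from the fallback model"


result = run_with_fallback([overloaded_anthropic, healthy_openai], [246, 446])
print(result)  # (200, 'Hello from the fallback model')
```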
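When the fallback fires, the Anthropic attempt is reported as output_guardrail_triggered (Status Code 446) and the fallback attempt is reported as success.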