The Load Balancing feature efficiently distributes network traffic across multiple LLMs.
To enable Load Balancing, modify the config object to include a strategy with loadbalance mode.
Here’s a quick example to load balance 75-25 between an OpenAI and an Azure OpenAI account
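A minimal sketch of such a config, written as a Python dict and serialized to JSON. The strategy, mode, and weight names come from the text above; the targets and virtual_key key names and both virtual key values are assumptions used for illustration, so check them against your own setup.

```python
import json

# Sketch of a 75-25 load balancing config.
# "targets" and "virtual_key" are assumed key names; the key values are placeholders.
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "openai-virtual-key", "weight": 0.75},
        {"virtual_key": "azure-virtual-key", "weight": 0.25},
    ],
}

# Serialize the config so it can be stored or sent along with a request.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Because weights are relative shares, 0.75 and 0.25 could equally be written as 3 and 1.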
Provide a list of virtual keys (or provider + api_key pairs), and assign a weight value to each target. The weights represent the relative share of requests that should be routed to each target.

- The default weight value is 1.
- The minimum weight value is 0.
- If weight is not set for a target, the default weight value (i.e. 1) is applied.
- Set "weight":0 for a specific target to stop routing traffic to it without removing it from your Config.
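The weight rules above can be illustrated with a short Python sketch. This is not the gateway's actual implementation; the function names, the virtual_key field, and the key values are hypothetical, and weighted random selection is only one plausible way to turn relative weights into a request split.

```python
import random

DEFAULT_WEIGHT = 1  # applied when a target has no "weight" key


def normalized_shares(targets):
    """Return each target's share of traffic; the shares sum to 1."""
    weights = [t.get("weight", DEFAULT_WEIGHT) for t in targets]
    if any(w < 0 for w in weights):
        raise ValueError("weights must be >= 0")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one target needs a positive weight")
    return [w / total for w in weights]


def pick_target(targets, rng=random):
    """Pick one target at random, in proportion to its weight."""
    shares = normalized_shares(targets)
    return rng.choices(targets, weights=shares, k=1)[0]


# Hypothetical targets: explicit weight, missing weight, and a paused target.
targets = [
    {"virtual_key": "openai-virtual-key", "weight": 3},
    {"virtual_key": "azure-virtual-key"},                # no weight: defaults to 1
    {"virtual_key": "paused-virtual-key", "weight": 0},  # receives no traffic
]

shares = normalized_shares(targets)
chosen = pick_target(targets)
```

Here the first target gets three quarters of the requests, the second gets one quarter through the default weight, and the third stays in the config but is never selected.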