vLLM
Integrate vLLM-hosted custom models with Portkey and take them to production
Portkey provides a robust and secure platform to observe, govern, and manage your locally or privately hosted custom models using vLLM.
Here’s a list of all model architectures supported by vLLM.
Integrating Custom Models with Portkey SDK
Expose your vLLM Server
Expose your vLLM server using a tunneling service like ngrok, or any other method you prefer. You can skip this step if you’re self-hosting the Portkey Gateway.
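For example, a minimal sketch (the model name is a placeholder; vLLM serves on port 8000 by default):

```sh
# Start an OpenAI-compatible vLLM server (model name is a placeholder)
vllm serve mistralai/Mistral-7B-Instruct-v0.2
# Older vLLM versions use: python -m vllm.entrypoints.openai.api_server --model <model>

# In another terminal, expose port 8000 (vLLM's default) with ngrok
ngrok http 8000
```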
Install the Portkey SDK
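For the Node.js SDK, installation is a single command:

```sh
npm install portkey-ai
```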
Initialize Portkey with vLLM custom URL
- Pass your publicly-exposed vLLM server URL to Portkey with `customHost` (by default, vLLM serves at `http://localhost:8000/v1`)
- Set the target `provider` as `openai`, since the server follows the OpenAI API schema

More on `custom_host` here.
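Here’s a minimal sketch using the Node.js SDK; the API key, tunnel URL, and Authorization value are placeholders you should replace with your own:

```ts
import Portkey from 'portkey-ai';

const portkey = new Portkey({
  apiKey: 'PORTKEY_API_KEY',                          // your Portkey API key
  provider: 'openai',                                 // vLLM exposes an OpenAI-compatible API
  customHost: 'https://<your-tunnel>.ngrok.app/v1',   // your exposed vLLM server URL (placeholder)
  Authorization: 'Bearer <vllm-api-key>',             // optional: only if your vLLM server enforces an API key
});
```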
Invoke Chat Completions
Use the Portkey SDK to invoke chat completions from your model, just as you would with any other provider:
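A minimal sketch, continuing from the client above; the model name is a placeholder and must match the model your vLLM server is serving:

```ts
// Request a chat completion from the vLLM-hosted model via Portkey
const chatCompletion = await portkey.chat.completions.create({
  model: 'mistralai/Mistral-7B-Instruct-v0.2', // placeholder: use your served model's name
  messages: [{ role: 'user', content: 'Say this is a test' }],
});

console.log(chatCompletion.choices[0].message.content);
```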
Next Steps
Explore the complete list of features supported in the SDK:
SDK
You’ll find more information in the relevant sections: