Portkey provides a robust and secure platform to observe, govern, and manage your locally or privately hosted custom models using vLLM.

Here’s a list of all model architectures supported on vLLM.

Integrating Custom Models with Portkey SDK

Step 1: Expose your vLLM Server

Expose your vLLM server using a tunneling service like ngrok, or any other method you prefer. You can skip this step if you’re self-hosting the Gateway.

ngrok http 8000 --host-header="localhost:8000"
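
The ngrok command above assumes vLLM is listening on its default port 8000. If your server isn’t running yet, here is a minimal sketch of starting an OpenAI-compatible vLLM server with the vllm CLI; the model name is only an illustrative placeholder:

vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000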
Step 2: Install the Portkey SDK

npm install --save portkey-ai
Step 3: Initialize Portkey with the vLLM Custom URL

  1. Pass your publicly exposed vLLM server URL to Portkey with customHost (by default, vLLM serves at http://localhost:8000/v1)
  2. Set the target provider to openai, since the server follows the OpenAI API schema.
import Portkey from 'portkey-ai'

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    provider: "openai",
    customHost: "https://7cc4-3-235-157-146.ngrok-free.app", // Your vLLM ngrok URL
    Authorization: "AUTH_KEY", // If you need to pass auth
})

More on custom_host here.
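
If you call the hosted Gateway over REST instead of the SDK, the same setup can be expressed with headers. This is a hedged sketch: the x-portkey-* header names are assumptions here and should be verified against Portkey’s API reference, and the URL is the same placeholder ngrok address:

curl https://api.portkey.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-portkey-api-key: PORTKEY_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-custom-host: https://7cc4-3-235-157-146.ngrok-free.app" \
  -d '{"messages": [{"role": "user", "content": "Say this is a test"}]}'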

Step 4: Invoke Chat Completions

Use the Portkey SDK to invoke chat completions from your model, just as you would with any other provider:

const chatCompletion = await portkey.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'your-model-name' // the model name your vLLM server is serving
});

console.log(chatCompletion.choices);
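
Streaming works the same way as with any OpenAI-compatible provider. A hedged sketch, assuming your vLLM deployment has streaming enabled and reusing the model-name placeholder from above:

const stream = await portkey.chat.completions.create({
    messages: [{ role: 'user', content: 'Say this is a test' }],
    model: 'your-model-name', // placeholder: the model served by your vLLM instance
    stream: true
});

for await (const chunk of stream) {
    // each chunk follows the OpenAI streaming shape
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
}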

Using Virtual Keys

Virtual Keys serve as Portkey’s unified authentication system for all LLM interactions, simplifying the use of multiple providers and Portkey features within your application. For self-hosted LLMs, you can configure custom authentication requirements including authorization keys, bearer tokens, or any other headers needed to access your model:

  1. Navigate to Virtual Keys in your Portkey dashboard
  2. Click “Add Key” and enable the “Local/Privately hosted provider” toggle
  3. Configure your deployment:
    • Select the matching provider API specification (typically OpenAI)
    • Enter your model’s base URL in the Custom Host field
    • Add required authentication headers and their values
  4. Click “Create” to generate your virtual key

You can now use this virtual key in your requests:

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    virtualKey: "YOUR_SELF_HOSTED_LLM_VIRTUAL_KEY"
})

async function main() {
  const response = await portkey.chat.completions.create({
    messages: [{ role: "user", content: "Bob the builder.." }],
    model: "your-self-hosted-model-name",
  });

  console.log(response.choices[0].message.content);
}

main();

For more information about managing self-hosted LLMs with Portkey, see Bring Your Own LLM.

Next Steps

Explore the complete list of features supported in the SDK.

You’ll find more information in the relevant sections:

  1. Add metadata to your requests
  2. Add gateway configs to your requests
  3. Tracing requests
  4. Set up a fallback from OpenAI to your local LLM (see the sketch below)
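
As a sketch of the fallback mentioned in item 4, you can attach a gateway config when initializing the client so requests try OpenAI first and fall back to your self-hosted model. The two virtual key names below are placeholders for keys you create in the dashboard:

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    config: {
        strategy: { mode: "fallback" },
        targets: [
            { virtual_key: "openai-virtual-key" },           // tried first
            { virtual_key: "self-hosted-llm-virtual-key" }   // used if the first target fails
        ]
    }
})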