> ## Documentation Index
> Fetch the complete documentation index at: https://docs.portkey.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Triton

> Integrate Trtiton-hosted custom models with Portkey and take them to production

Portkey provides a robust and secure platform to observe, govern, and manage your **locally** or **privately** hosted custom models using Triton.

<Info>
  Here's the official [Triton Inference Server documentation](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/getting_started/quickstart.html) for more details.
</Info>

## Integrating Custom Models with Portkey SDK

<Steps>
  <Step title="Expose your Triton Server">
    Expose your Triton server by using a tunneling service like [ngrok](https://ngrok.com/) or any other way you prefer. You can skip this step if you’re self-hosting the Gateway.

    ```sh theme={"system"}
    ngrok http 11434 --host-header="localhost:8080"
    ```
  </Step>

  <Step title="Install the Portkey SDK">
    <Tabs>
      <Tab title="NodeJS">
        ```sh theme={"system"}
        npm install --save portkey-ai
        ```
      </Tab>

      <Tab title="Python">
        ```sh theme={"system"}
        pip install portkey-ai
        ```
      </Tab>
    </Tabs>
  </Step>

  <Step title="Initialize Portkey with Triton custom URL">
    1. Pass your publicly-exposed Triton server URL to Portkey with `customHost`
    2. Set target `provider` as `triton`.

    <Tabs>
      <Tab title="NodeJS SDK">
        ```js theme={"system"}
        import Portkey from 'portkey-ai'

        const portkey = new Portkey({
            apiKey: "PORTKEY_API_KEY",
            provider: "triton",
            customHost: "http://localhost:8000/v2/models/mymodel" // Your Triton Hosted URL
            Authorization: "AUTH_KEY", // If you need to pass auth
        })
        ```
      </Tab>

      <Tab title="Python SDK">
        ```python theme={"system"}
        from portkey_ai import Portkey

        portkey = Portkey(
            api_key="PORTKEY_API_KEY",
            provider="triton",
            custom_host="http://localhost:8000/v2/models/mymodel" # Your Triton Hosted URL
            Authorization="AUTH_KEY", # If you need to pass auth
        )
        ```
      </Tab>
    </Tabs>

    More on `custom_host` [here](/product/ai-gateway/universal-api#integrating-local-or-private-models).
  </Step>

  <Step title="Invoke Chat Completions">
    Use the Portkey SDK to invoke chat completions (generate) from your model, just as you would with any other provider:

    <Tabs>
      <Tab title="NodeJS SDK">
        ```js theme={"system"}
        const chatCompletion = await portkey.chat.completions.create({
            messages: [{ role: 'user', content: 'Say this is a test' }]
        });

        console.log(chatCompletion.choices);
        ```
      </Tab>

      <Tab title="Python SDK">
        ```python theme={"system"}
        completion = portkey.chat.completions.create(
            messages= [{ "role": 'user', "content": 'Say this is a test' }]
        )

        print(completion)
        ```
      </Tab>
    </Tabs>
  </Step>
</Steps>

## Next Steps

Explore the complete list of features supported in the SDK:

<Card title="SDK" href="/api-reference/portkey-sdk-client" />

***

You'll find more information in the relevant sections:

1. [Add metadata to your requests](/product/observability/metadata)
2. [Add gateway configs to your requests](/product/ai-gateway/universal-api#ollama-in-configs)
3. [Tracing requests](/product/observability/traces)
4. [Setup a fallback from triton to your local LLM](/product/ai-gateway/fallbacks)
