Portkey provides a robust, secure gateway for integrating Large Language Models (LLMs) into your applications, including DeepInfra's hosted models. With Portkey, you can take advantage of features like fast AI gateway access, observability, prompt management, and more, while securely managing your DeepInfra API key through the Model Catalog.

Quick Start

Get DeepInfra working in 3 steps:
from portkey_ai import Portkey

# 1. Install: pip install portkey-ai
# 2. Add @deepinfra provider in model catalog
# 3. Use it:

portkey = Portkey(api_key="PORTKEY_API_KEY")

response = portkey.chat.completions.create(
    model="@deepinfra/nvidia/Nemotron-4-340B-Instruct",
    messages=[{"role": "user", "content": "Say this is a test"}]
)

print(response.choices[0].message.content)
Tip: You can also set provider="@deepinfra" in Portkey() and use just model="nvidia/Nemotron-4-340B-Instruct" in the request.

Add Provider in Model Catalog

  1. Go to Model Catalog → Add Provider
  2. Select DeepInfra
  3. Choose existing credentials, or create new ones by entering your DeepInfra API key
  4. Name your provider (e.g., deepinfra-prod)

Complete Setup Guide →

See all setup options, code examples, and detailed instructions

Supported Endpoints

Endpoint             Supported
/chat/completions    ✅
/completions         ✅
/embeddings          ✅

Tool Calling

DeepInfra supports tool calling (function calling) for compatible models. Use the standard OpenAI tools format:
from portkey_ai import Portkey

portkey = Portkey(api_key="PORTKEY_API_KEY")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"]
            }
        }
    }
]

response = portkey.chat.completions.create(
    model="@deepinfra/meta-llama/Meta-Llama-3.1-70B-Instruct",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=tools,
    tool_choice="auto"
)

print(response.choices[0].message.tool_calls)
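Once the model returns tool_calls, your code is responsible for executing the function and sending the result back as a "tool" role message. A minimal sketch of that dispatch step — the tool-call payload below is simulated to match the OpenAI-compatible shape, not a live response:

```python
import json

# Simulated entry shaped like response.choices[0].message.tool_calls[0]
tool_call = {
    "id": "call_0",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": '{"location": "London"}',
    },
}

def get_weather(location: str) -> str:
    # Stub implementation; a real tool would query a weather API
    return f"15°C and cloudy in {location}"

# Dispatch: parse the JSON-encoded arguments and call the matching function
available = {"get_weather": get_weather}
args = json.loads(tool_call["function"]["arguments"])
result = available[tool_call["function"]["name"]](**args)

# Append this to `messages` and call the model again to get a final answer
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": result,
}
print(tool_message["content"])
```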

Supported Models

DeepInfra hosts a wide range of open-source models for text generation. View the complete list:

DeepInfra Models

Browse all available models on DeepInfra
Popular models include:
  • nvidia/Nemotron-4-340B-Instruct
  • meta-llama/Meta-Llama-3.1-405B-Instruct
  • Qwen/Qwen2.5-72B-Instruct

Next Steps

For complete SDK documentation:

SDK Reference

Complete Portkey SDK documentation
Last modified on March 11, 2026