Learn how to integrate Azure AI Foundry with Portkey to access a wide range of AI models with enhanced observability and reliability features.
Azure AI Foundry provides a unified platform for enterprise AI operations, model building, and application development. With Portkey, you can seamlessly integrate with various models available on Azure AI Foundry and take advantage of features like observability, prompt management, fallbacks, and more.
from portkey_ai import Portkey

# 1. Install: pip install portkey-ai
# 2. Add @azure-foundry provider in Model Catalog
# 3. Use it:

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@azure-foundry"
)

response = portkey.chat.completions.create(
    model="DeepSeek-V3-0324",  # Your deployed model name
    messages=[{"role": "user", "content": "Tell me about cloud computing"}]
)

print(response.choices[0].message.content)
To integrate Azure AI Foundry with Portkey, you’ll create a provider in the Model Catalog. This securely stores your Azure AI Foundry credentials, allowing you to use a simple identifier in your code instead of handling sensitive authentication details directly.
OpenAI models on Azure: If you’re specifically looking to use OpenAI models on Azure, use Azure OpenAI instead, which is optimized for OpenAI models.
Integrate Azure AI Foundry with Portkey to centrally manage your AI models and deployments. This guide walks you through setting up the provider using API key authentication.
For managed Azure deployments:
Required parameters:
Azure Managed ClientID: Your managed client ID
Azure Foundry URL: The base endpoint URL for your deployment, formatted according to your deployment type:
For AI Services: https://your-resource-name.services.ai.azure.com/models
For Managed: https://your-model-name.region.inference.ml.azure.com/score
For Serverless: https://your-model-name.region.models.ai.azure.com
Azure API Version: The API version to use (e.g., “2024-05-01-preview”). Required if your deployment URL includes an api-version query parameter.
Examples:
If your URL is https://mycompany-ai.westus2.services.ai.azure.com/models?api-version=2024-05-01-preview, the API version is 2024-05-01-preview
Azure Deployment Name: (Optional) Required only when a single resource contains multiple deployments.
To use this authentication method, your Azure application needs the Cognitive Services User role.
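The API version rule above can be checked mechanically before you fill in the provider form. The following is a minimal sketch using only Python's standard library; the helper name `split_foundry_url` is hypothetical and not part of Portkey's SDK.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs


def split_foundry_url(deployment_url):
    """Return (base_url, api_version) for an Azure AI Foundry deployment URL.

    api_version is None when the URL has no api-version query parameter.
    """
    parts = urlsplit(deployment_url)
    query = parse_qs(parts.query)
    api_version = query.get("api-version", [None])[0]
    # Rebuild the URL without the query string or fragment.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base_url, api_version


base_url, api_version = split_foundry_url(
    "https://mycompany-ai.westus2.services.ai.azure.com/models?api-version=2024-05-01-preview"
)
print(base_url)     # https://mycompany-ai.westus2.services.ai.azure.com/models
print(api_version)  # 2024-05-01-preview
```

The first value would go in the Azure Foundry URL field and the second in the Azure API Version field.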
Enterprise-level authentication with Azure Entra ID:
Required parameters:
Azure Entra ClientID: Your Azure Entra client ID
Azure Entra Secret: Your client secret
Azure Entra Tenant ID: Your tenant ID
Azure Foundry URL: The base endpoint URL for your deployment, formatted according to your deployment type:
For AI Services: https://your-resource-name.services.ai.azure.com/models
For Managed: https://your-model-name.region.inference.ml.azure.com/score
For Serverless: https://your-model-name.region.models.ai.azure.com
Azure API Version: The API version to use (e.g., “2024-05-01-preview”). Required if your deployment URL includes an api-version query parameter.
Examples:
If your URL is https://mycompany-ai.westus2.services.ai.azure.com/models?api-version=2024-05-01-preview, the API version is 2024-05-01-preview
Azure Deployment Name: (Optional) Required only when a single resource contains multiple deployments. Common in Managed deployments.
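The three URL formats listed above differ only in their host suffix, so a quick sanity check can tell you which deployment type a URL corresponds to. This is an illustrative sketch only: `guess_deployment_type` is a hypothetical helper, and the suffix table is taken directly from the formats in this guide rather than from any Azure or Portkey API.

```python
from urllib.parse import urlsplit

# Host suffixes copied from the URL formats listed in this guide.
SUFFIX_TO_TYPE = {
    ".services.ai.azure.com": "AI Services",
    ".inference.ml.azure.com": "Managed",
    ".models.ai.azure.com": "Serverless",
}


def guess_deployment_type(foundry_url):
    """Return the deployment type implied by the URL's host, or None if unknown."""
    host = urlsplit(foundry_url).hostname or ""
    for suffix, deployment_type in SUFFIX_TO_TYPE.items():
        if host.endswith(suffix):
            return deployment_type
    return None


print(guess_deployment_type("https://your-resource-name.services.ai.azure.com/models"))
print(guess_deployment_type("https://your-model-name.region.inference.ml.azure.com/score"))
print(guess_deployment_type("https://your-model-name.region.models.ai.azure.com"))
```

A `None` result suggests the URL does not match any of the formats above and is worth double-checking before saving the provider.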
Enter the following details for your Azure deployment:
Model Slug: Use your Azure Model Deployment name exactly as it appears in Azure AI Foundry
Short Description: Optional description for team reference
Model Type: Select “Custom model”
Base Model: Choose the model that matches your deployment’s API structure (e.g., select gpt-4 for GPT-4 deployments)
This is just for reference. If you can’t find the particular model, choose a similar one.
Custom Pricing: Enable to track costs with your negotiated rates
Once configured, this model will be available alongside others in your provider, allowing you to manage multiple Azure deployments through a single set of credentials.
Azure AI Foundry supports Anthropic models (Claude) through a slightly different configuration process. Follow these steps to integrate Anthropic models with Portkey.
Model Slug: Enter your deployment name from the Azure Foundry console
Base Model: Search for and select your Anthropic model (e.g., claude-opus-4-5-20251101, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, claude-opus-4-1-20250805)
Once configured, you can call your Anthropic model using the Model Slug you saved:
NodeJS

import Portkey from 'portkey-ai';

const client = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  provider: '@AZURE_FOUNDRY_ANTHROPIC_PROVIDER'
});

const response = await client.chat.completions.create({
  messages: [{ role: "user", content: "Hello, Claude!" }],
  model: "your-azure-deployment-name", // Use the Model Slug you configured
});

console.log(response.choices[0].message.content);

Python

from portkey_ai import Portkey

client = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@AZURE_FOUNDRY_ANTHROPIC_PROVIDER"
)

response = client.chat.completions.create(
    model="your-azure-deployment-name",  # Use the Model Slug you configured
    messages=[
        {"role": "user", "content": "Hello, Claude!"}
    ]
)

print(response.choices[0].message.content)
Using the /messages Route with Azure Foundry Anthropic Models
Access Anthropic models on Azure AI Foundry through Anthropic’s native /messages endpoint using Portkey’s SDK or Anthropic’s SDK.
The /messages route provides access to Anthropic-native features like extended thinking, prompt caching, and native streaming formats when using Claude models on Azure AI Foundry.
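As a rough illustration of what a /messages call might look like on the wire, the sketch below builds (but does not send) an HTTP request with Python's standard library. Several details are assumptions not stated in this guide: the gateway URL `https://api.portkey.ai/v1/messages`, the `x-portkey-api-key` and `x-portkey-provider` header names, and the `max_tokens` value. The helper `build_messages_request` is hypothetical. Verify these against the Portkey API reference before relying on them.

```python
import json
from urllib.request import Request


def build_messages_request(portkey_api_key, provider_slug, model_slug, user_text):
    """Build an Anthropic-style /messages request routed through Portkey."""
    payload = {
        "model": model_slug,   # the Model Slug you configured
        "max_tokens": 256,     # Anthropic's Messages API requires max_tokens
        "messages": [{"role": "user", "content": user_text}],
    }
    return Request(
        "https://api.portkey.ai/v1/messages",  # assumed gateway URL
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-portkey-api-key": portkey_api_key,  # assumed header name
            "x-portkey-provider": provider_slug,   # assumed header name
        },
        method="POST",
    )


request = build_messages_request(
    "PORTKEY_API_KEY",
    "@AZURE_FOUNDRY_ANTHROPIC_PROVIDER",
    "your-azure-deployment-name",
    "Hello, Claude!",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload before any credentials leave your machine.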
Gateway Transformation (Mode 1) is the default behaviour for Responses API requests and requires no extra configuration. Portkey accepts the request at /v1/responses, adapts the payload to Chat Completions, calls Azure AI Foundry’s /chat/completions endpoint, and translates the response back to the Responses API format before returning it to you.
Because the call is made over Chat Completions on Azure’s side, every model on your Azure AI Foundry provider works, even ones that don’t natively expose a Responses endpoint.
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@AZURE_FOUNDRY_PROVIDER"
)

response = portkey.responses.create(
    model="DeepSeek-V3-0324",  # Any model on your Azure AI Foundry provider
    input="Explain the difference between AKS and ACI in one paragraph."
)

print(response.output_text)
Features that rely on Azure storing response state — previous_response_id, store, retrieve/delete on /v1/responses/:id, and built-in tools like web_search — are not available in this mode. Use multi-turn input arrays to carry conversation history. See Open Responses → Native-Only Features.
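Carrying conversation history yourself can be as simple as keeping a list of role/content messages and passing the whole list as `input` on each request. A minimal sketch follows; `append_turn` is a hypothetical helper, the assistant text is a placeholder for whatever the model actually returned, and the request itself is shown only as a comment so the snippet runs without network access.

```python
def append_turn(history, role, text):
    """Return a new history list with one more message appended."""
    return history + [{"role": role, "content": text}]


history = []
history = append_turn(history, "user", "What is AKS?")
# After each request, store the model's reply (response.output_text) as an assistant turn.
history = append_turn(history, "assistant", "AKS is Azure's managed Kubernetes service.")
history = append_turn(history, "user", "How does it differ from ACI?")

# Pass the full list as `input` on the next request, for example:
# response = portkey.responses.create(model="DeepSeek-V3-0324", input=history)
print(len(history))  # 3
```

Because the full history is resent each time, no server-side state such as previous_response_id is needed.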
When your Azure AI Foundry deployment exposes a native /responses endpoint (for example, Azure AI Agents), set the x-portkey-provider-responses-proxy header to true. Portkey will skip the Chat Completions transformation and forward the request body exactly as-is to Azure’s native Responses endpoint.
This is required to use Azure-specific parameters such as agent_reference, which targets a deployed Azure AI Agent.
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@AZURE_FOUNDRY_PROVIDER",
    # Forwards the request directly to Azure's native /responses endpoint
    provider_responses_proxy=True,
)

response = portkey.responses.create(
    model="gpt-4.1",  # Your Azure deployment name
    input="Summarise the latest Q3 sales numbers for the EMEA region.",
    agent_reference="asst_abc123",  # Azure AI Agent reference
)

print(response.output_text)
Direct mode forwards the payload verbatim. Your Azure deployment must accept the Responses API schema — if it doesn’t, requests will fail. Use Gateway Transformation (Mode 1) for everything else.
Use Gateway Transformation for general-purpose Responses API calls against any Azure-deployed model.
Use Direct (Proxy) when calling Azure AI Agents or any Azure deployment that exposes the native Responses endpoint and you need Azure-specific parameters (agent_reference, server-side response storage, etc.).
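The choice between the two modes can be expressed as a small decision helper. This is a sketch under stated assumptions: the parameter set below is drawn from the parameters this guide names as native-only (agent_reference, previous_response_id, store), and `choose_responses_mode` and `client_options` are hypothetical helpers, not part of Portkey's SDK.

```python
# Parameters this guide ties to Azure's native Responses endpoint.
NATIVE_ONLY_PARAMS = {"agent_reference", "previous_response_id", "store"}


def choose_responses_mode(request_params):
    """Return 'direct' if any native-only parameter is present, else 'gateway'."""
    if NATIVE_ONLY_PARAMS & set(request_params):
        return "direct"
    return "gateway"


def client_options(request_params):
    """Build keyword arguments for Portkey(...) based on the chosen mode."""
    options = {"api_key": "PORTKEY_API_KEY", "provider": "@AZURE_FOUNDRY_PROVIDER"}
    if choose_responses_mode(request_params) == "direct":
        options["provider_responses_proxy"] = True
    return options


print(choose_responses_mode({"model": "DeepSeek-V3-0324", "input": "Hi"}))  # gateway
print(choose_responses_mode(
    {"model": "gpt-4.1", "input": "Hi", "agent_reference": "asst_abc123"}
))  # direct
```

The resulting dictionary could then be unpacked into the client constructor, as in `Portkey(**client_options(params))`.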
For a deep dive into the Responses API itself — streaming, tool calling, reasoning, vision, structured output — see the Open Responses guide.
Once you’ve created your provider, you can start making requests to Azure AI Foundry models through Portkey.
NodeJS

Install the Portkey SDK with npm

npm install portkey-ai

import Portkey from 'portkey-ai';

const client = new Portkey({
  apiKey: 'PORTKEY_API_KEY',
  provider: '@AZURE_FOUNDRY_PROVIDER'
});

async function main() {
  const response = await client.chat.completions.create({
    messages: [{ role: "user", content: "Tell me about cloud computing" }],
    model: "DeepSeek-V3-0324", // Replace with your deployed model name
  });

  console.log(response.choices[0].message.content);
}

main();

Python

Install the Portkey SDK with pip

pip install portkey-ai

from portkey_ai import Portkey

client = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@AZURE_FOUNDRY_PROVIDER"
)

response = client.chat.completions.create(
    model="DeepSeek-V3-0324",  # Replace with your deployed model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me about cloud computing"}
    ]
)

print(response.choices[0].message.content)
Get consistent, parseable responses in specific formats:
import json

response = portkey.chat.completions.create(
    model="cohere-command-a",  # Use a model that supports response formats
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "List the top 3 cloud providers with their main services"}
    ],
    response_format={"type": "json_object"},
    temperature=0
)

print(json.loads(response.choices[0].message.content))
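Even with a JSON response format requested, it can be worth parsing the model output defensively rather than calling `json.loads` directly. A minimal sketch follows; `parse_json_content` is a hypothetical helper, and the sample string stands in for `response.choices[0].message.content`.

```python
import json


def parse_json_content(content):
    """Parse model output as a JSON object, raising ValueError with context on failure."""
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("Expected a JSON object at the top level")
    return parsed


# Stand-in for response.choices[0].message.content
sample = '{"providers": ["AWS", "Azure", "Google Cloud"]}'
data = parse_json_content(sample)
print(data["providers"])
```

Raising a single exception type keeps the calling code simple: one `except ValueError` covers both malformed JSON and an unexpected top-level shape.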
You can manage all prompts to Azure AI Foundry in the Prompt Library. Once you’ve created and tested a prompt in the library, use the portkey.prompts.completions.create interface to use the prompt in your application.
prompt_completion = portkey.prompts.completions.create(
    prompt_id="Your Prompt ID",
    variables={
        # The variables specified in the prompt
    }
)
Azure AI Foundry supports reranking through Cohere models deployed on the platform. Use the Portkey unified /rerank endpoint with the cohere. model prefix:
from portkey_ai import Portkey

client = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="@azure-foundry"
)

result = client.post(
    "/v1/rerank",
    body={
        "model": "cohere.Cohere-rerank-v4.0-pro",
        "query": "What is deep learning?",
        "documents": [
            "Deep learning is a subset of machine learning",
            "The weather is sunny today",
            "Neural networks have multiple layers"
        ]
    }
)

print(result)
The cohere. prefix in the model name is automatically stripped before forwarding to the provider.
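To make the prefix behaviour concrete, the sketch below mimics the stripping step locally. It only illustrates the effect described above and is not Portkey's actual implementation; `strip_model_prefix` is a hypothetical helper.

```python
def strip_model_prefix(model, prefix="cohere."):
    """Return the model name without the routing prefix, if the prefix is present."""
    if model.startswith(prefix):
        return model[len(prefix):]
    return model


print(strip_model_prefix("cohere.Cohere-rerank-v4.0-pro"))  # Cohere-rerank-v4.0-pro
print(strip_model_prefix("DeepSeek-V3-0324"))               # DeepSeek-V3-0324
```

In other words, the name that reaches the provider is the part after the prefix, which should correspond to the rerank model you deployed.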