Integrations are designed for organization admins and managers who need to manage AI provider access across teams. If you’re looking to use AI models, see the Model Catalog documentation.
Integrations are the secure foundation for AI provider management in Portkey. Think of them as your organization’s credential vault - a centralized place where you store API keys, configure access controls, and set usage policies that cascade throughout your entire AI infrastructure. When you create an Integration, you’re not just storing credentials - you’re establishing a governance layer that controls:
  • Who can access these AI services (through workspace provisioning)
  • What models they can use (through model provisioning)
  • How much they can spend (through budget limits)
  • How fast they can consume resources (through rate limits)

Why Integrations Matter

In enterprise AI deployments, raw API keys scattered across teams create security risks and make cost control impossible. Integrations solve this by:
  1. Centralizing Credentials: Store API keys once, use everywhere through secure references (see the sketch after this list)
  2. Enabling Governance: Apply organization-wide policies that automatically enforce compliance
  3. Simplifying Management: Update credentials, limits, or access in one place
  4. Maintaining Security: Never expose raw API keys to end users or applications
  5. Granular Observability: Get complete end-to-end observability and track 40+ crucial metrics for every single LLM call
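
To make points 1 and 4 concrete, here is a minimal sketch of what application code looks like once credentials live in an Integration, using Portkey's Python SDK. The API key, provider slug, and model ID are placeholders, and the slug-referencing style assumes the `virtual_key` parameter:

```python
# pip install portkey-ai
from portkey_ai import Portkey

# The application holds only a Portkey API key and a provider slug.
# The underlying provider credentials never leave the Integration.
client = Portkey(
    api_key="PORTKEY_API_KEY",   # placeholder
    virtual_key="bedrock-prod",  # assumed slug of an Integration-backed provider
)

response = client.chat.completions.create(
    model="anthropic.claude-3-sonnet-20240229-v1:0",  # illustrative Bedrock model ID
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Rotating the AWS keys inside the Integration requires no change to code like this.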

Creating an Integration

Let’s walk through creating an Integration for AWS Bedrock as an example:
Step 1: Navigate to Integrations

From your admin panel, go to Integrations and click Create New Integration.

Step 2: Select Your AI Provider

Choose from 200+ supported providers. Each provider may have different credential requirements.
Step 3: Configure Integration Details

  • Name: A descriptive name for this integration (e.g., “Bedrock Production”)
  • Slug: A unique identifier used in API calls (e.g., “bedrock-prod”)
  • Description: Optional context about this integration’s purpose
  • Endpoint Type: Choose between Public or Private endpoints
Step 4: Enter Provider Credentials

Each provider requires different credentials.

For OpenAI:
  • API Key
  • Optional: Organization ID, Project ID

For AWS Bedrock:
  • AWS Access Key ID
  • AWS Secret Access Key
  • AWS Region

Connect Bedrock with Amazon Assumed Role

See the guide on integrating Bedrock using Amazon Assumed Role authentication.
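
If you want to verify the IAM role before wiring it into Portkey, the mechanism behind assumed-role auth is standard AWS STS. A hedged sketch using boto3 (the role ARN and external ID are placeholders; Portkey's exact configuration fields may differ):

```python
# pip install boto3
import boto3

# Confirm the role is assumable and its trust policy is correct.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/PortkeyBedrockRole",  # placeholder ARN
    RoleSessionName="portkey-bedrock-check",
    ExternalId="your-external-id",  # placeholder; only if the trust policy requires one
)["Credentials"]
print("Temporary credentials expire at:", creds["Expiration"])
```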
The setup is similar for other providers, including:
  • Azure OpenAI
  • Google Vertex AI
  • Anthropic
  • Gemini and more…

Configuring Your Integration Access & Limits

After creating your Integration, you’ll need to configure three key aspects that work together to control access and usage:

1. Workspace Provisioning

Workspace provisioning determines which teams and projects can access this Integration. This is crucial for maintaining security boundaries and ensuring teams only access approved AI resources.

How It Works

When you provision an Integration to a workspace:
  1. That workspace can create AI Providers using this Integration’s credentials
  2. All usage is tracked at the workspace level for accountability
  3. Budget and rate limits can be applied per workspace
  4. Access can be revoked instantly if needed (see the sketch below)
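
Because revocation in point 4 takes effect immediately, client code should treat authorization failures as a possible policy change rather than a malformed key. A minimal sketch, assuming the Python SDK surfaces HTTP errors as exceptions (the exact exception class varies by SDK version):

```python
from portkey_ai import Portkey

client = Portkey(api_key="PORTKEY_API_KEY", virtual_key="bedrock-prod")  # placeholders

try:
    client.chat.completions.create(
        model="anthropic.claude-3-sonnet-20240229-v1:0",  # illustrative model ID
        messages=[{"role": "user", "content": "ping"}],
    )
except Exception as err:  # assumption: HTTP errors raise here
    # A 401/403 may mean the Integration was de-provisioned from this
    # workspace, not that the Portkey API key itself is invalid.
    print("Request rejected; check workspace provisioning:", err)
```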

Setting Up Workspace Provisioning

  1. In your Integration settings, navigate to Workspace Provisioning
  2. Select which workspaces should have access:
    • All Workspaces: Grants access to every workspace in your organization
    • Specific Workspaces: Choose individual workspaces that need access
  3. For each workspace, click the Edit Budget & Rate Limits icon to configure:
    • Custom budget limits (see Budget Limits section below)
    • Custom rate limits (see Rate Limits section below)
    • Specific model access

Best Practices

  • Principle of Least Privilege: Only provision to workspaces that genuinely need access
  • Environment Separation: Create separate Integrations for dev/staging/production
  • Regular Audits: Review workspace provisioning quarterly to remove unnecessary access

2. Model Provisioning

Model provisioning gives you fine-grained control over which AI models are accessible through an Integration. This is essential for:
  • Controlling costs by restricting access to expensive models
  • Ensuring compliance by limiting models to approved ones
  • Maintaining consistency by standardizing model usage across teams

Setting Up Model Provisioning

  1. In your Integration settings, navigate to Model Provisioning
  2. Select the configuration options:
    • Allow All Models: Provides access to all models offered by the provider
    • Allow Specific Models: Create an allowlist of approved models (illustrated below)
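
The allowlist is enforced at request time: provisioned models resolve normally, while others are rejected. A hedged sketch (the model names are illustrative, and the exact rejection error depends on the SDK):

```python
from portkey_ai import Portkey

client = Portkey(api_key="PORTKEY_API_KEY", virtual_key="openai-prod")  # placeholders

# Suppose only gpt-4o-mini is on the allowlist for this Integration.
for model in ["gpt-4o-mini", "gpt-4o"]:
    try:
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
        )
        print(model, "-> allowed")
    except Exception:  # assumption: blocked models raise an HTTP error
        print(model, "-> blocked by model provisioning")
```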

Advanced Model Management

Custom Models

The Model Catalog isn’t limited to standard provider models. You can add:
  • Fine-tuned models: Your custom OpenAI or Anthropic fine-tunes
  • Self-hosted models: Models running on your infrastructure
  • Private models: Internal models not publicly available
Each custom model gets the same governance controls as standard models.

See also: Custom Models, for adding and managing your fine-tuned, self-hosted, or private models.

Overriding Model Details (Custom Pricing)

Override default model pricing for:
  • Negotiated rates: If you have enterprise agreements with providers
  • Internal chargebacks: Set custom rates for internal cost allocation
  • Free tier models: Mark certain models as free for specific teams
Custom pricing ensures your cost tracking accurately reflects your actual spend.
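
As an illustration of the arithmetic behind custom pricing, here is how a negotiated per-token rate turns into tracked spend. The rates are hypothetical; Portkey applies the configured rates automatically once custom pricing is set:

```python
# Hypothetical negotiated rates, in USD per 1M tokens.
INPUT_RATE_PER_M = 2.00   # prompt tokens
OUTPUT_RATE_PER_M = 6.00  # completion tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Tracked cost of one request under the custom rates above."""
    return (prompt_tokens * INPUT_RATE_PER_M
            + completion_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# A request with 1,200 prompt tokens and 300 completion tokens:
print(f"${request_cost(1200, 300):.4f}")  # $0.0042
```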

See also: Custom Pricing, for configuring custom pricing for models with special rates.

3. Budget & Rate Limits

Budget and rate limits are configured within Workspace Provisioning and provide financial and usage guardrails for your AI operations.

Budget Limits

Budget Limits on Integrations provide a simple way to manage your spending on AI providers (and LLMs) - giving you confidence and control over your application’s costs. They act as financial guardrails, preventing unexpected AI costs across your organization. These limits cascade down to all AI Providers created from this Integration.

Setting Budget Limits

  1. In your Integration settings, navigate to Workspace Provisioning
  2. Select which workspaces should have access:
    • All Workspaces: Grants access to every workspace in your organization
    • Specific Workspaces: Choose individual workspaces that need access
  3. Click on the Edit Budget & Rate Limits icon to edit the budget limits for the selected workspace
  4. Set your desired budget limits
  5. Optionally, select the Apply to every workspace where this integration is enabled checkbox to apply the same budget limits to all workspaces where this integration is enabled

Cost-Based Limits

Set a budget limit in USD that, once reached, will automatically expire the key to prevent further usage and overspending.

Token-Based Limits

Set a maximum number of tokens that can be consumed, allowing you to control usage independent of cost fluctuations.
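
To see how the two limit types differ in practice, here is an illustrative local mirror of the accounting Portkey performs server-side. The real counters live on Portkey, a given budget is either cost-based or token-based (both are shown only for comparison), and the numbers are placeholders:

```python
# Budget configured on the Integration (placeholders).
COST_BUDGET_USD = 500.00
TOKEN_BUDGET = 1_000_000

spent_usd, tokens_used = 0.0, 0

def record(request_cost_usd: float, request_tokens: int) -> None:
    """Mirror Portkey's budget accounting locally (illustrative only)."""
    global spent_usd, tokens_used
    spent_usd += request_cost_usd
    tokens_used += request_tokens
    # Once the configured limit is reached, Portkey expires the key and
    # rejects further requests until the budget resets.
    if spent_usd >= COST_BUDGET_USD or tokens_used >= TOKEN_BUDGET:
        print("Budget exhausted; expect rejections until reset")
```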

Key Considerations for Budget Limits

  • Budget limits can be set as either cost-based (USD) or token-based
  • The minimum cost limit you can set is $1
  • The minimum token limit you can set is 100 tokens
  • Budget limits apply until exhausted or reset
  • Budget limits are applied only to requests made after the limit is set; they do not apply retroactively
  • Once set, budget limits cannot be edited by any organization member
  • Budget limits apply to all AI Providers created on Portkey from the given integration

Alert Thresholds

You can set alert thresholds to receive notifications before your budget limit is reached:
  • For cost-based budgets, set thresholds in USD
  • For token-based budgets, set thresholds in tokens
  • Receive email notifications when usage reaches the threshold
  • Continue using the key until the full budget limit is reached

Periodic Reset Options

You can configure budget limits to automatically reset at regular intervals:
Reset Period Options:
  • No Periodic Reset: The budget limit applies until exhausted with no automatic renewal
  • Reset Weekly: Budget limits automatically reset every week
  • Reset Monthly: Budget limits automatically reset every month
Reset Timing:
  • Weekly resets occur at the beginning of each week (Sunday at 12 AM UTC)
  • Monthly resets occur on the 1st day of each month at 12 AM UTC, regardless of when the budget limit was originally set (the sketch below computes both reset instants)
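
If you need to anticipate the next reset programmatically, the timing rules above translate into a few lines of UTC datetime arithmetic (Python's `weekday()` numbers Monday as 0, so Sunday is 6):

```python
from datetime import datetime, timedelta, timezone

def next_weekly_reset(now: datetime) -> datetime:
    """Next Sunday at 12 AM UTC."""
    days_ahead = (6 - now.weekday()) % 7  # Sunday is weekday 6
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return candidate if candidate > now else candidate + timedelta(days=7)

def next_monthly_reset(now: datetime) -> datetime:
    """1st of the next month at 12 AM UTC."""
    year, month = (now.year + 1, 1) if now.month == 12 else (now.year, now.month + 1)
    return datetime(year, month, 1, tzinfo=timezone.utc)

now = datetime.now(timezone.utc)
print(next_weekly_reset(now), next_monthly_reset(now))
```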

Rate Limits

Rate limits control the velocity of API usage, protecting against runaway processes and ensuring fair resource distribution across teams.

Setting Rate Limits

  1. In your Integration settings, navigate to Workspace Provisioning
  2. Select which workspaces should have access:
    • All Workspaces: Grants access to every workspace in your organization
    • Specific Workspaces: Choose individual workspaces that need access
  3. Click on the Edit Budget & Rate Limits icon to edit the rate limits for the selected workspace
  4. Set your desired rate limits
  5. Optionally, select the Apply to every workspace where this integration is enabled checkbox to apply the same rate limits to all workspaces where this integration is enabled

Configuration Options

Limit Types:
  • Request-based: Limit number of API calls (e.g., 1000 requests/minute)
  • Token-based: Limit token consumption rate (e.g., 1M tokens/hour)
Time Windows: You can choose from three different time intervals for your rate limits:
  • Per Minute: Limits reset every minute, ideal for fine-grained control
  • Per Hour: Limits reset hourly, providing balanced usage control
  • Per Day: Limits reset daily, suitable for broader usage patterns

Key Considerations for Rate Limits

  • Rate limits can be set as either request-based or token-based
  • Time intervals can be configured as per minute, per hour, or per day
  • Setting the limit to 0 disables the virtual key
  • Rate limits apply immediately after being set
  • Once set, rate limits cannot be edited by any organization member
  • Rate limits work for all providers available on Portkey and apply to all organization members who use the virtual key
  • After a rate limit is reached, requests will be rejected until the time period resets

Use Cases for Rate Limits

  • Cost Control: Prevent unexpected usage spikes that could lead to high costs
  • Performance Management: Ensure your application maintains consistent performance
  • Fairness: Distribute API access fairly across teams or users
  • Security: Mitigate potential abuse or DoS attacks
  • Provider Compliance: Stay within the rate limits imposed by underlying AI providers

Exceeding Rate Limits

When a rate limit is reached:
  • Subsequent requests are rejected with a specific error code (see the backoff sketch after this list)
  • Error messages clearly indicate that the rate limit has been exceeded
  • The limit automatically resets after the specified time period has elapsed
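
On the client side, the usual response to these rejections is to back off and retry once the window resets. A minimal sketch, assuming rejections surface as exceptions (the exact exception type and status code depend on the SDK version):

```python
import time
from portkey_ai import Portkey

client = Portkey(api_key="PORTKEY_API_KEY", virtual_key="bedrock-prod")  # placeholders

def call_with_backoff(messages, retries: int = 5):
    delay = 2.0
    for attempt in range(retries):
        try:
            return client.chat.completions.create(
                model="anthropic.claude-3-sonnet-20240229-v1:0",  # illustrative model
                messages=messages,
            )
        except Exception:  # assumption: rate-limit rejections raise here
            if attempt == retries - 1:
                raise
            time.sleep(delay)  # wait out part of the rate-limit window
            delay *= 2         # exponential backoff toward the reset period
    return None
```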

Monitoring and Analytics

Tracking Spending and Usage

You can track your spending, usage, and 40+ crucial metrics for any specific AI integration by navigating to the Analytics tab and filtering by the desired key and timeframe.