Components and Sizing Recommendations
| Component | Options | Sizing Recommendations |
|---|---|---|
| AI Gateway | Deploy in your GKE cluster using Helm charts. | Use GKE worker nodes, each providing at least 2 vCPUs and 4 GiB of memory. For high availability, deploy them across multiple availability zones. |
| Logs Store (optional) | Google Cloud Storage or S3-compatible storage | Each log document is ~10 KB (uncompressed). |
| Cache (Prompts, Configs & Providers) | Built-in Redis, Google Memorystore Redis, or Valkey | Deploy within the same VPC as the Portkey Gateway. |
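For reference, here is a minimal sketch of a node pool that satisfies this guidance. This is our illustration, not part of the Portkey chart: the pool name, region, and machine type are placeholders to adjust.

```bash
# e2-standard-2 provides 2 vCPUs and 8 GiB of memory, clearing the 2 vCPU / 4 GiB minimum.
# A regional node pool places nodes in multiple zones for high availability.
gcloud container node-pools create gateway-pool \
  --cluster <GKE_CLUSTER_NAME> \
  --region us-east1 \
  --machine-type e2-standard-2 \
  --num-nodes 1 # one node per zone of the region
```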
Prerequisites
Ensure that the following tools and resources are installed and available:
Create a Portkey Account
- Go to the Portkey website.
- Sign up for a Portkey account.
- Once logged in, locate and save your Organisation ID for future reference. You can find it in the browser URL: `https://app.portkey.ai/organisation/<organisation_id>/`
- Contact the Portkey AI team and provide your Organisation ID and the email address used during signup.
- The Portkey team will share the following information with you:
- Docker credentials for the Gateway images (username and password).
- License: Client Auth Key.
Set Up Project Environment

```bash
CLUSTER_NAME=<GKE_CLUSTER_NAME> # Name of the GKE cluster where the gateway will be deployed.
NAMESPACE=<NAMESPACE> # Namespace where the gateway should be deployed (for example, portkeyai).
KSA=<KSA> # Name of the Kubernetes Service Account to associate with the Gateway pod (for example, gateway-sa).

mkdir portkey-gateway
cd portkey-gateway
touch values.yaml
```
Image Credentials Configuration

Update the values.yaml file with the following:

```yaml
imageCredentials:
  - name: portkey-enterprise-registry-credentials
    create: true
    registry: https://index.docker.io/v1/
    username: <PROVIDED BY PORTKEY>
    password: <PROVIDED BY PORTKEY>

gatewayImage:
  repository: "docker.io/portkeyai/gateway_enterprise"
  pullPolicy: Always
  tag: "latest"

dataserviceImage:
  repository: "docker.io/portkeyai/data-service"
  pullPolicy: Always
  tag: "latest"

redisImage:
  repository: "docker.io/redis"
  pullPolicy: IfNotPresent
  tag: "7.2-alpine"

environment:
  create: true
  secret: true
  data:
    ANALYTICS_STORE: control_plane
    SERVICE_NAME: <SERVICE_NAME> # Specify a name for the service.
    PORTKEY_CLIENT_AUTH: <PROVIDED BY PORTKEY>
    ORGANISATIONS_TO_SYNC: <ORGANISATION_ID> # Obtained after signing up for a Portkey account.
```
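Optionally, you can sanity-check the credentials Portkey shared before deploying. This is our suggestion, not a required step:

```bash
# Log in with the provided Docker credentials and confirm the Gateway image can be pulled.
docker login https://index.docker.io/v1/ --username <PROVIDED BY PORTKEY>
docker pull docker.io/portkeyai/gateway_enterprise:latest
```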
Based on your choice of components and their configuration, update values.yaml as described in the sections below.
MCP Gateway (Optional)
By default, only the AI Gateway is enabled in the deployment. To enable the MCP Gateway, add the following configuration to values.yaml:
```yaml
environment:
  data:
    SERVER_MODE: "all" # Set to "mcp" or "all" (see Server Modes below).
    MCP_PORT: "8788"
    MCP_GATEWAY_BASE_URL: "<MCP LoadBalancer URL or hostname pointing to the MCP Service>"
```

Note: MCP_GATEWAY_BASE_URL does not need to be provided during the initial deployment. Once the MCP Load Balancer has been created after the first deployment and hostname mapping is configured, you can set this value and redeploy.
Server Modes
- `""` (empty or not provided): Deploys only the AI Gateway. This is the default configuration.
- `"mcp"`: Deploys only the MCP Gateway.
- `"all"`: Deploys both the AI Gateway and the MCP Gateway.
Cache Store
The Portkey Gateway deployment includes a Redis instance pre-installed by default. You can either use this built-in Redis or connect to an external cache like Google Memorystore Redis or Valkey.
Built-in Redis
No additional permissions or network configurations are required.
To use the built-in Redis, add the following configuration to the values.yaml file:

```yaml
environment:
  data:
    CACHE_STORE: redis
    REDIS_URL: "redis://redis:6379"
    REDIS_TLS_ENABLED: "false"
```
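Once deployed, you can confirm the built-in Redis is reachable with a quick check from a throwaway pod. This is a sketch, assuming the chart exposes Redis as the `redis` service used in `REDIS_URL` above:

```bash
# Run redis-cli PING against the in-cluster Redis; expect PONG.
kubectl run redis-check --rm -it --restart=Never -n ${NAMESPACE} \
  --image=redis:7.2-alpine -- redis-cli -h redis -p 6379 PING
```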
Google Memorystore
To enable the gateway to work with a Memorystore cache, ensure that the GKE cluster has network access to the Memorystore instance on the required port.
To use Google Memorystore Redis or Valkey, add the following configuration to the values.yaml file:

```yaml
environment:
  data:
    CACHE_STORE: memory-store
    REDIS_URL: "redis://<GCP_MEMORY_STORE_IP>:<Port>"
    REDIS_TLS_ENABLED: "false" # "true"/"false"
    REDIS_MODE: cluster # Add this parameter only if cluster mode is enabled on Memorystore.
```
To use IAM-based authentication with Google Memorystore Redis or Valkey, complete the steps in Setting up IAM Permission below and add the following configuration to the values.yaml file:

```yaml
environment:
  data:
    CACHE_STORE: memory-store
    GCP_REDIS_AUTH_MODE: workload
    REDIS_URL: "redis://<MEMORY_STORE_IP>:<Port>"
    REDIS_TLS_ENABLED: "false" # "true"/"false"
    REDIS_MODE: cluster # Add this parameter only if cluster mode is enabled on Memorystore.
    REDIS_PASSWORD: <MEMORY_STORE_AUTH_STRING>
```
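If your Memorystore instance has AUTH enabled, the auth string for `REDIS_PASSWORD` can be retrieved with gcloud. Replace the instance ID and region with your own:

```bash
# Print the Memorystore AUTH string; use it as <MEMORY_STORE_AUTH_STRING>.
gcloud redis instances get-auth-string <INSTANCE_ID> --region <REGION>
```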
TLS (Optional)
If TLS is enabled on your GCP Memorystore Redis instance, you must provide the self-signed certificate to the Gateway to enable SSL/TLS connections.
- Download the certificate file `server-ca.pem` from your GCP Memorystore Redis cluster (a sketch for retrieving it with gcloud follows these steps).
- Create a Kubernetes secret to store the Memorystore certificate:

```bash
kubectl create secret generic memorystore-tls-certs --from-file=server-ca.pem -n $NAMESPACE
```

- Add the following configuration to `values.yaml`:
```yaml
environment:
  data:
    REDIS_TLS_CERTS: /etc/ssl/certs/server-ca.pem
    REDIS_TLS_ENABLED: "true"

volumes:
  - name: memorystore-tls-certs
    secret:
      secretName: memorystore-tls-certs

volumeMounts:
  - name: memorystore-tls-certs
    mountPath: /etc/ssl/certs/server-ca.pem
    subPath: server-ca.pem
```
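As referenced in the first step, here is one way to retrieve `server-ca.pem` with gcloud. This is a sketch; the exact field path may differ for Memorystore cluster instances:

```bash
# Export the instance's server CA certificate to server-ca.pem.
gcloud redis instances describe <INSTANCE_ID> --region <REGION> \
  --format='value(serverCaCerts[0].cert)' > server-ca.pem
```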
Log Store
Google Cloud Storage
- Create a GCS bucket for storing LLM access logs.
- Set up access to the log store. The Gateway supports the following methods for connecting to the GCS bucket for log storage:
  - Workload Identity Federation
  - HMAC

Depending on the chosen GCS access method, update values.yaml with the corresponding configuration below.

To enable Workload Identity Federation (IAM-based authentication from the Portkey Gateway to the GCS bucket), complete the steps in Setting up IAM Permission below and update values.yaml with the following details:
```yaml
serviceAccount:
  create: true
  automount: true
  # Name of the Kubernetes Service Account. It must match the name used when binding the GSA and KSA during the Workload Identity permission setup.
  name: <KSA>
  annotations:
    # Replace <GSA> and <PROJECT_ID_A> with the Google Service Account name and its project ID, respectively.
    iam.gke.io/gcp-service-account: <GSA>@<PROJECT_ID_A>.iam.gserviceaccount.com

environment:
  data:
    LOG_STORE: gcs_assume
    GCP_AUTH_MODE: workload
    LOG_STORE_REGION: <GCS_BUCKET_REGION> # GCP region where the GCS log bucket resides (e.g., us-east1).
    LOG_STORE_GENERATIONS_BUCKET: <GCS_BUCKET_NAME> # Name of the GCS log bucket.
```
To enable HMAC-based access, update values.yaml with the following details:

```yaml
serviceAccount:
  create: true
  automount: true
  name: <KSA>

environment:
  data:
    LOG_STORE: gcs
    LOG_STORE_REGION: <GCS_BUCKET_REGION> # GCP region where the GCS log bucket resides (e.g., us-east1).
    LOG_STORE_GENERATIONS_BUCKET: <GCS_BUCKET_NAME> # Name of the GCS log bucket.
    LOG_STORE_ACCESS_KEY: <HMAC_ACCESS_KEY> # HMAC access key of the service account.
    LOG_STORE_SECRET_KEY: <HMAC_SECRET_KEY> # HMAC secret key of the service account.
```
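HMAC keys are issued per service account. If you have not created them yet, here is one way to do so, as a sketch reusing the GSA from the Setting up IAM Permission section:

```bash
# Create an HMAC key pair for the GSA; the command prints the access key and secret.
gcloud storage hmac create <GSA>@<PROJECT_ID_A>.iam.gserviceaccount.com
```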
- (Optional) Configure the log path format using `LOG_STORE_FILE_PATH_FORMAT`. See Log Object Path Format for details.
Data Service (Optional)
The Data Service is a component of the Portkey deployment responsible for batch processing, fine-tuning, and log exports.
To enable the Data Service, add the following configuration to the values.yaml file:

```yaml
dataservice:
  name: "dataservice"
  enabled: true
  env:
    DEBUG_ENABLED: false
    SERVICE_NAME: "portkeyenterprise-dataservice"
  serviceAccount:
    create: true
    name: <KSA>
```
Network Configuration
Set Up External Access
To make the Gateway service accessible externally, you can set up either of the following:
- GCP Application Load Balancer with a Kubernetes Ingress
- GCP Network Load Balancer with a Kubernetes Service
Prerequisites
GCP Load Balancer Ingress
To create an Application Load Balancer Ingress, update the values.yaml file with the following configuration:

```yaml
service:
  type: ClusterIP
  port: 8787

ingress:
  enabled: true
  # hostname: "<AI Gateway Hostname>"
  # hostBased: false
  # mcpHostname: "<MCP Gateway Hostname>"
  annotations:
    kubernetes.io/ingress.class: gce
    ingress.gcp.kubernetes.io/healthcheck-path: /v1/health
```
Note: If SERVER_MODE is set to all (i.e., both the AI Gateway and the MCP Gateway are enabled), you must enable host-based routing by setting hostBased to true and provide the hostnames on which the AI Gateway and MCP Gateway will be accessible.

The GCP Load Balancer Controller provides additional annotations (such as TLS and custom health checks) for managing the Ingress Load Balancer. For a comprehensive list of available annotations, refer to the GCP Ingress Load Balancer documentation.
GCP Load Balancer Service
To create a Load Balancer, update values.yaml with the following configuration:

```yaml
service:
  type: LoadBalancer
  port: 8787
  annotations:
    cloud.google.com/l4-rbs: "enabled" # Use this annotation to create an external Load Balancer.
    # networking.gke.io/load-balancer-type: "Internal" # Use this annotation to create an internal Load Balancer.
  spec.loadBalancerSourceRanges: "0.0.0.0/0"
```
The GCP Load Balancer Controller provides additional annotations (such as TLS and custom health checks) for managing the Service Load Balancer. For a comprehensive list of available annotations, refer to the GCP Service Load Balancer documentation.
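Once the Service is provisioned, the external address can be read back with kubectl. This is a sketch: the actual Service name depends on the chart release, so check `kubectl get svc -n ${NAMESPACE}` first ("portkey-ai-gateway" below is a placeholder name).

```bash
# Print the load balancer address assigned to the gateway Service.
kubectl get svc portkey-ai-gateway -n ${NAMESPACE} \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```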
Ensure Outbound Network Access
By default, Kubernetes allows full outbound access, but if your cluster has NetworkPolicies that restrict egress, configure them to allow outbound traffic.
Example NetworkPolicy for Outbound Access:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-all-egress
  namespace: portkeyai
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
```
This allows the gateway to access LLMs hosted both within your VPC and externally, and it enables the sync service to connect to the Portkey Control Plane.
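Save the manifest (for example as `allow-all-egress.yaml`, a placeholder name) and apply it:

```bash
kubectl apply -f allow-all-egress.yaml
```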
Deploying Portkey Gateway
```bash
# Add the Portkey AI Gateway helm repository
helm repo add portkey-ai https://portkey-ai.github.io/helm
helm repo update

# Install the chart
helm upgrade --install portkey-ai portkey-ai/gateway -f ./values.yaml -n ${NAMESPACE} --create-namespace
```
Verify the deployment
To confirm that the deployment was successful, follow these steps:
- Verify that all pods are running correctly.

```bash
kubectl get pods -n ${NAMESPACE}
# You should see all pods with a STATUS of Running.
```
Note: If pods are in a Pending, CrashLoopBackOff, or other error state, inspect the pod logs and events to diagnose potential issues.
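Standard kubectl diagnostics apply here, for example:

```bash
kubectl describe pod <POD_NAME> -n ${NAMESPACE} # Events: scheduling, image pulls, failed probes.
kubectl logs <POD_NAME> -n ${NAMESPACE}         # Container logs.
```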
- Test the Gateway by sending a cURL request.
  - Port-forward the Gateway pod:

```bash
kubectl port-forward <POD_NAME> -n ${NAMESPACE} 9000:8787 # Replace <POD_NAME> with your Gateway pod's actual name.
```
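One way to capture the pod name programmatically, as a sketch: the `app=gateway` label selector below is an assumption, so confirm the labels your release sets with `kubectl get pods --show-labels -n ${NAMESPACE}`.

```bash
# Grab the first gateway pod's name; adjust the label selector to your deployment.
POD_NAME=$(kubectl get pods -n ${NAMESPACE} -l app=gateway \
  -o jsonpath='{.items[0].metadata.name}')
```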
  - Once port forwarding is active, open a new terminal window or tab and send a test request:

```bash
# Specify LLM provider and Portkey API keys
OPENAI_API_KEY=<OPENAI_API_KEY> # Replace <OPENAI_API_KEY> with an actual API key.
PORTKEY_API_KEY=<PORTKEY_API_KEY> # Replace <PORTKEY_API_KEY> with a Portkey API key, which can be created on the Portkey website (https://app.portkey.ai/api-keys).

# Configure and send the curl request
curl 'http://localhost:9000/v1/chat/completions' \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user","content": "What is a fractal?"}]
  }'
```
- Test the gateway service integration with the Load Balancer.

```bash
# Replace <LOAD_BALANCER_IP> and <LB_LISTENER_PORT_NUMBER> with the IP address (or DNS name) and listener port of the created load balancer, respectively.
curl 'http://<LOAD_BALANCER_IP>:<LB_LISTENER_PORT_NUMBER>/v1/chat/completions' \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "x-portkey-provider: openai" \
  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user","content": "What is a fractal?"}]
  }'
```
Integrating Gateway with Control Plane
Portkey supports the following methods for integrating the Control Plane with the Data Plane/Gateway:
IP Whitelisting
Allows the Control Plane to access the Data Plane over the internet by restricting inbound traffic to the Control Plane's specific IP addresses. This method requires the Data Plane to have a publicly accessible endpoint.

To whitelist, add an inbound rule to the VPC firewall allowing connections from the Portkey Control Plane's IPs (54.81.226.149, 34.200.113.35, 44.221.117.129) on the Load Balancer listener port.
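A sketch of such a rule with gcloud; the network name and port are placeholders to adjust:

```bash
# Allow the Portkey Control Plane IPs to reach the load balancer listener port.
gcloud compute firewall-rules create allow-portkey-control-plane \
  --network <VPC_NAME> \
  --direction INGRESS \
  --action ALLOW \
  --rules tcp:<LB_LISTENER_PORT_NUMBER> \
  --source-ranges 54.81.226.149,34.200.113.35,44.221.117.129
```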
To integrate the Control Plane with the Data Plane, contact the Portkey team and provide the Public Endpoint of the Data Plane.
Verifying Gateway Integration with the Control Plane
- Send a test request to the Gateway using curl.
- Go to the Portkey website -> Logs.
- Verify that the test request appears in the logs and that you can view its full details by selecting the log entry.
Uninstalling Portkey Gateway
```bash
helm uninstall portkey-ai -n ${NAMESPACE}
```
Setting up IAM Permission
To enable the Portkey Gateway to access the GCS bucket for log storage and, optionally, Vertex AI for model invocation, specific permissions are required.
Follow the steps below to configure permissions based on your chosen access method.
- Create a Google Service Account (GSA).

```bash
PROJECT_ID_A=<SERVICE_ACCOUNT_PROJECT_ID> # ID of the project in which the service account is to be created.
GSA=<GSA_NAME> # Name of the Google Service Account to be created.
bucket_name=<GCS_BUCKET_NAME> # Name of the GCS bucket that will store logs. The bucket must already exist.

gcloud iam service-accounts create ${GSA} \
  --display-name="Portkey Gateway Service Account"
```
- Create an IAM policy binding to bind the GSA to the Gateway's KSA.

```bash
gcloud iam service-accounts \
  add-iam-policy-binding ${GSA}@${PROJECT_ID_A}.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID_A}.svc.id.goog[${NAMESPACE}/${KSA}]"
```
- Grant the required permissions to the GSA for accessing the GCS bucket. You can either assign the `roles/storage.objectAdmin` role to the GSA, or create a custom role with only the necessary permissions, such as `storage.objects.create` and `storage.objects.get`, and attach that role instead (a sketch of the custom-role option follows).
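A sketch of the custom-role option; the role ID and title below are illustrative, not from the Portkey docs:

```bash
# Create a least-privilege role; note its full name for the binding step.
gcloud iam roles create portkeyGatewayLogWriter \
  --project ${PROJECT_ID_A} \
  --title "Portkey Gateway Log Writer" \
  --permissions storage.objects.create,storage.objects.get
```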
GCS Bucket

Same Project Access

```bash
gcloud projects add-iam-policy-binding ${PROJECT_ID_A} \
  --member="serviceAccount:${GSA}@${PROJECT_ID_A}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
```

Cross Project Access

```bash
# Replace <PROJECT_B_ACCOUNT_ID> with the ID of the project
# in which the GCS bucket is created.
PROJECT_ID_B=<PROJECT_B_ACCOUNT_ID>

gcloud projects add-iam-policy-binding ${PROJECT_ID_B} \
  --member="serviceAccount:${GSA}@${PROJECT_ID_A}.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
```
- (Optional) To allow the Gateway to access Vertex AI models, grant the `roles/aiplatform.user` role to the GSA.

Vertex AI

Same Project Access

```bash
gcloud projects add-iam-policy-binding ${PROJECT_ID_A} \
  --member="serviceAccount:${GSA}@${PROJECT_ID_A}.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```

Cross Project Access

```bash
# Replace <PROJECT_B_ACCOUNT_ID> with the ID of the project
# in which Vertex AI is to be called.
PROJECT_ID_B=<PROJECT_B_ACCOUNT_ID>

gcloud projects add-iam-policy-binding ${PROJECT_ID_B} \
  --member="serviceAccount:${GSA}@${PROJECT_ID_A}.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
```