The Gateway uses a local cache store (Redis or compatible) for two distinct purposes:
  1. Control Plane entity cache: stores configuration objects (API keys, virtual keys, configs, prompts, guardrails, integrations) fetched from the Control Plane
  2. LLM response cache: stores LLM request/response pairs for reuse across identical requests
Hybrid vs air-gapped: In a hybrid deployment, the Control Plane is hosted by Portkey. In an air-gapped deployment, the Control Plane runs entirely within your own infrastructure.
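The two purposes can be pictured as two key namespaces within the same Redis-compatible store. The prefixes below are purely illustrative (Portkey's actual key schema is internal):

```python
# Illustrative only: these prefixes are NOT Portkey's actual key schema.
CONTROL_PLANE_PREFIX = "cp:"   # Control Plane entities (API keys, configs, prompts, ...)
RESPONSE_PREFIX = "llm:"       # cached LLM request/response pairs

def control_plane_key(object_type: str, object_id: str) -> str:
    """Key for a Control Plane entity, e.g. cp:config:cfg_123."""
    return f"{CONTROL_PLANE_PREFIX}{object_type}:{object_id}"

def response_key(request_hash: str) -> str:
    """Key for a cached LLM response, derived from a hash of the request."""
    return f"{RESPONSE_PREFIX}{request_hash}"
```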

TTL: Control Plane Entities

All configuration objects are cached with a 7-day TTL (604,800 seconds). The TTL resets each time an item is re-fetched and re-written to cache.
| Object type | TTL |
| --- | --- |
| API keys | 7 days, or until the key’s expires_at date (whichever comes first) |
| Virtual keys | 7 days |
| Configs | 7 days |
| Prompt templates | 7 days |
| Prompt partials | 7 days |
| Guardrails | 7 days |
| Integrations | 7 days |
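For API keys, the effective TTL is the lesser of the 7-day default and the time remaining until the key's expires_at date. A minimal sketch of that computation (function name and signature are illustrative):

```python
import time

SEVEN_DAYS = 604_800  # 7 days in seconds

def api_key_ttl(expires_at=None, now=None):
    """Effective cache TTL for an API key: 7 days, or until the key's
    expires_at timestamp, whichever comes first (illustrative sketch)."""
    now = time.time() if now is None else now
    if expires_at is None:
        return SEVEN_DAYS
    return max(0, min(SEVEN_DAYS, int(expires_at - now)))
```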
Cache entries are lazy-loaded: an object is only written to cache the first time it is requested. Objects that have never been requested are not present in cache.
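The lazy-loading pattern can be sketched as a get-or-fetch wrapper, with an in-memory stand-in for the Redis-compatible store (all names here are illustrative, not Portkey internals):

```python
import time

SEVEN_DAYS = 604_800  # seconds

class InMemoryStore:
    """Stand-in for the Redis-compatible cache store (illustrative only)."""
    def __init__(self):
        self._data = {}  # key -> (value, expiry_timestamp)
    def get(self, key):
        item = self._data.get(key)
        if item is None or item[1] < time.time():
            return None
        return item[0]
    def set(self, key, value, ex):
        self._data[key] = (value, time.time() + ex)
    def delete(self, key):
        self._data.pop(key, None)

def get_entity(store, key, fetch):
    """Lazy load: the object is only written to cache on first request,
    and each re-fetch resets the 7-day TTL."""
    value = store.get(key)
    if value is None:                       # never requested, expired, or evicted
        value = fetch(key)                  # fetch latest from the Control Plane
        store.set(key, value, ex=SEVEN_DAYS)
    return value
```

A second request for the same key is served from cache without touching the Control Plane.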

TTL: LLM Response Cache

LLM response caching is opt-in and must be explicitly enabled per request or via a Portkey Config. TTL only applies when caching is active. The Cache (Simple & Semantic) doc covers how to enable caching, set TTL via max_age, configure org-level default TTL, and use force refresh.
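As a sketch, a per-request Portkey Config enabling simple caching with a one-hour TTL might look like the following. Field names follow the Cache (Simple & Semantic) doc; verify against the current schema before relying on them:

```python
import json

# Sketch of a Portkey Config fragment enabling response caching.
# Verify field names against the Cache (Simple & Semantic) doc.
config = {
    "cache": {
        "mode": "simple",   # or "semantic"
        "max_age": 3600,    # response-cache TTL in seconds
    }
}

# The config can be attached per request, e.g. via the x-portkey-config header.
headers = {"x-portkey-config": json.dumps(config)}
```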

Sync: Control Plane → Gateway

Every minute, the Gateway sends a sync request to the Control Plane carrying a stable syncIdentifier (a UUID generated once per Gateway instance and persisted in cache). The Control Plane uses this identifier to return only the objects that have changed since the last successful sync for that Gateway instance. The response contains the identifiers of changed objects grouped by type: virtual keys, API keys, configs, prompts, prompt partials, guardrails, and integrations. For each object in the delta, the Gateway deletes its cache entry. The updated data is not pushed into the cache at this point. On the next incoming request that needs that object, the Gateway fetches the latest version from the Control Plane and re-populates the cache with a fresh 7-day TTL.
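The two sides of the delta sync — the persisted syncIdentifier and delete-only invalidation — can be sketched as follows (key names and the DictStore stand-in are illustrative):

```python
import uuid

SYNC_ID_KEY = "gateway:sync_identifier"  # illustrative key name

class DictStore(dict):
    """Minimal stand-in for the Redis-compatible cache store."""
    def set(self, key, value, **kw): self[key] = value
    def delete(self, key): self.pop(key, None)

def get_sync_identifier(store):
    """UUID generated once per Gateway instance and persisted in cache,
    so the Control Plane can compute a per-instance delta."""
    sync_id = store.get(SYNC_ID_KEY)
    if sync_id is None:
        sync_id = str(uuid.uuid4())
        store.set(SYNC_ID_KEY, sync_id)
    return sync_id

def apply_delta(store, delta):
    """Delete cache entries for changed objects; the updated data is NOT
    written here. `delta` maps object type -> changed identifiers,
    e.g. {"configs": ["cfg_1"], "api_keys": ["key_9"]}."""
    for object_type, ids in delta.items():
        for object_id in ids:
            store.delete(f"{object_type}:{object_id}")
    # The next request for a deleted object re-fetches it from the
    # Control Plane and re-caches it with a fresh 7-day TTL.
```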

Resync: Gateway → Control Plane

Separately, a resync process also runs every minute. Its direction is the opposite of sync: it pushes data from the Gateway back to the Control Plane. The only data pushed back is usage counters (token usage and cost usage). Rather than writing to the Control Plane on every request, the Gateway accumulates these counters locally in cache as requests are processed. The resync worker reads the accumulated values and flushes them to the Control Plane in batches. After a successful flush, the local counter keys are deleted from cache. Usage counters are tracked for:
  • API keys
  • Virtual keys
  • Integration workspaces
  • Usage limit policies
No other cached data (configs, prompts, guardrails, or LLM responses) is ever pushed back to the Control Plane.
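The accumulate-then-flush pattern can be sketched as below, with a plain dict standing in for the local counter keys and `push` standing in for the batched write to the Control Plane (both are illustrative):

```python
from collections import defaultdict

def record_usage(counters, entity_key, tokens, cost):
    """Accumulate usage locally per request instead of writing to the
    Control Plane on every request (illustrative sketch)."""
    counters[entity_key]["tokens"] += tokens
    counters[entity_key]["cost"] += cost

def flush_usage(counters, push):
    """Resync worker: flush the accumulated counters in one batch, then
    delete the local counter keys only after a successful flush."""
    batch = dict(counters)
    push(batch)          # e.g. a batched write to the Control Plane
    counters.clear()     # local counters deleted after success
```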

Cache Invalidation and Refresh

Invalidation and refresh are two sides of the same lifecycle: an entry is first invalidated (removed from cache), and on the next request for that object, it is refreshed (re-fetched and re-cached).

Control Plane Entities

| Trigger | What happens |
| --- | --- |
| Delta sync (every minute) | The Gateway deletes cache entries for any object the Control Plane reports as changed. The next request for that object fetches the latest version and re-caches it with a fresh 7-day TTL. |
| TTL expiry (7 days) | The entry is removed automatically. The next request triggers a fresh fetch from the Control Plane. |
| Memory eviction | The entry is evicted by the cache store. The next request triggers a fresh fetch, same as TTL expiry. |

LLM Response Cache

| Trigger | What happens |
| --- | --- |
| x-portkey-cache-force-refresh: true header | The cached response for that request is deleted and replaced with a fresh LLM response. |
| TTL expiry | The entry is removed. The next matching request results in a cache miss and a live LLM call. |
| Memory eviction | Same behaviour as TTL expiry. |
See Cache (Simple & Semantic) for full details on force refresh and TTL configuration.
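Force-refresh semantics can be sketched as delete-before-lookup (the key scheme and store stand-in are illustrative):

```python
class DictStore(dict):
    """Minimal stand-in for the Redis-compatible cache store."""
    def set(self, key, value, **kw): self[key] = value
    def delete(self, key): self.pop(key, None)

def get_response(store, request_hash, call_llm, force_refresh=False, max_age=3600):
    """Sketch: with force refresh, the cached entry is deleted and
    replaced with a fresh LLM response instead of being served."""
    key = f"llm:{request_hash}"       # illustrative key scheme
    if force_refresh:
        store.delete(key)             # drop the cached response
    cached = store.get(key)
    if cached is not None:
        return cached                 # cache hit
    response = call_llm()             # miss (or forced refresh): live LLM call
    store.set(key, response, ex=max_age)
    return response
```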

Data-Bound: Memory Capacity Scenarios

The cache store is an in-memory system. When it reaches its configured memory limit, it evicts entries according to the eviction policy set on the cache instance:
  • LRU-based policies evict the least recently used entries first. Recently accessed config objects and LLM responses are retained; idle ones are removed.
  • Random eviction policies remove entries without regard to recency, which may evict active objects.
  • noeviction causes all new write operations to fail once the limit is reached, which prevents new entries from being cached at all.
In each case, an evicted entry behaves the same as an expired one: the next request for that object triggers a fresh fetch from the Control Plane (for config objects) or a live LLM call (for response cache entries).
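The LRU case can be illustrated with a toy cache that evicts by recency once a capacity limit is hit (entry count stands in for memory; the class is illustrative, not how Redis implements it):

```python
from collections import OrderedDict

class LRUStore:
    """Toy cache illustrating LRU-style eviction: at the capacity limit,
    the least recently used entry is removed, behaving like an early
    TTL expiry (the next request for it is a miss)."""
    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()
    def get(self, key):
        if key not in self._data:
            return None                     # miss: caller re-fetches
        self._data.move_to_end(key)         # mark as recently used
        return self._data[key]
    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
```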
Last modified on March 2, 2026