/chat/completions
or /completions
routes.cache
params to your config object.
x-portkey-debug: "false"
header is included in the request/chat/completions
endpoint, Portkey requires at least two message objects in the messages
array. The first message object, typically used for the system
message, is not considered when determining semantic similarity for caching purposes.
For example:
user
message (“Who is the president of the US?”) is used for finding semantic matches in the cache. The system
message (“You are a helpful assistant”) is ignored.
This means that even if you change the system
message while keeping the user
message semantically similar, Portkey will still return a semantic cache hit.
This allows you to modify the behavior or context of the assistant without affecting the cache hits for similar user queries.
max_age
is specified in the request, the organization-level default value is used.max_age
value in a request is greater than the organization-level default, the organization-level value takes precedence.max_age
in a request is less than the organization-level default, the lower request value is honored.cacheForceRefresh
as true
without passing the relevant cache config will not have any effect)x-portkey-cache-namespace
header in your API requests, followed by any custom string value. Portkey will then use this namespace string as the sole basis for partitioning the cache, disregarding all other headers, including metadata.
For example, if you send the following header:
user-123
, ignoring any other headers or metadata associated with the request.
user-123
, ignoring any other headers or metadata.
Cache Disabled
when you are not using the cache, and any of Cache Miss
, Cache Refreshed
, Cache Hit
, Cache Semantic Hit
based on the cache hit status. Read more here.
override_params
then cache on that target will not work until that particular combination of params is also stored with the cache. If there are no override_params
for that target, then cache will be active on that target even if it hasn’t been triggered even once.