If multiple users of your GenAI app trigger the same or similar queries, fetching a fresh LLM response from the model each time can be slow and expensive.
To get started, create a `portkey` instance with the `apiKey` and `virtualKey` parameters. You can find the values for both of them in your Portkey Dashboard. Visit the reference to obtain the Portkey API key and learn how to create Virtual Keys.
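Here is a minimal sketch of that setup with the Node.js SDK; the placeholder key values are assumptions, so replace them with the ones from your dashboard:

```typescript
import Portkey from "portkey-ai";

// Placeholder values — copy the real ones from your Portkey Dashboard.
const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",         // authenticates your requests with Portkey
  virtualKey: "PROVIDER_VIRTUAL_KEY" // points at the LLM provider you configured
});
```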
The `mode` key specifies the caching strategy you want for your app.
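As a sketch, a caching config could look like the object below (the `cache.mode` shape follows Portkey's Gateway Config format; `"semantic"` serves cached responses for similar queries, while `"simple"` only matches exact repeats):

```typescript
// Gateway Config fragment that turns on caching for requests routed through it.
const cacheConfig = {
  cache: {
    mode: "semantic" // or "simple" for exact-match caching
  }
};
```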
The Portkey client exposes a `config` parameter that can accept these configurations as an argument. To learn about more ways to work with configs, refer to the 101 on Gateway Configs.
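One way to attach it, sketched under the same assumptions as above, is to pass the config object (or a saved config ID from your dashboard) when constructing the client:

```typescript
const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",
  virtualKey: "PROVIDER_VIRTUAL_KEY",
  config: cacheConfig // every request made with this client now goes through the cache
});
```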
Try changing the `messages` array and see if you notice any difference in the time it takes to receive a response or in the quality of the response itself.
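A simple way to observe the effect is to time the same request twice; the model name below is an assumption, so use whichever model your virtual key points to:

```typescript
async function ask(question: string) {
  const start = Date.now();
  const response = await portkey.chat.completions.create({
    model: "gpt-4o", // assumed model name
    messages: [{ role: "user", content: question }]
  });
  console.log(response.choices[0].message.content);
  console.log(`Took ${Date.now() - start} ms`);
}

// The second identical call should return noticeably faster once it is served from the cache.
await ask("What is response caching?");
await ask("What is response caching?");
```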
Can you refresh the cache on demand? Yes, you can!
Can you control how long the cache remains active? Absolutely!
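A rough sketch of both controls follows; the `max_age` field and the `cacheForceRefresh` option are assumptions based on Portkey's cache settings, so confirm the exact names in the caching docs:

```typescript
// Cached responses expire after max_age seconds (assumed field name).
const configWithTtl = {
  cache: {
    mode: "semantic",
    max_age: 3600 // keep entries for one hour
  }
};

// Bypass existing cache entries and repopulate them for requests made with this client.
const refreshingClient = new Portkey({
  apiKey: "PORTKEY_API_KEY",
  virtualKey: "PROVIDER_VIRTUAL_KEY",
  config: configWithTtl,
  cacheForceRefresh: true // assumed option; maps to Portkey's cache-force-refresh header
});
```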
Explore the caching docs to learn about all the features available for controlling how LLM responses are cached.
See the full code