Portkey Blog

latency

OpenAI’s Prompt Caching: A Deep Dive

This update is welcome news for developers who have been grappling with the challenges of managing API costs and response times. OpenAI's Prompt Caching introduces a mechanism to reuse recently seen input tokens, potentially cutting costs by up to 50% and dramatically reducing latency for repetitive tasks. In this post…
Kavya MD Oct 20, 2024
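Because prompt caching reuses recently seen input tokens by matching on the prefix of a request, the practical takeaway is to keep stable content (system prompt, few-shot examples) at the front and put variable content last. A minimal sketch of that structure, assuming the prefix-matching behavior the post describes; `build_messages` is a hypothetical helper, not part of any SDK:

```python
# Hedged sketch: keep a long, stable prefix identical across requests
# so the provider can reuse cached computation for those tokens.
STATIC_SYSTEM_PROMPT = (
    "You are a support assistant for Acme Corp. "  # hypothetical prompt
    "Answer concisely and cite the relevant help-center article."
)

def build_messages(user_query: str) -> list[dict]:
    """Stable prefix first, variable user query last.

    Only the trailing user message changes between requests, so the
    shared prefix stays byte-identical and is eligible for caching.
    """
    return [
        {"role": "system", "content": STATIC_SYSTEM_PROMPT},
        {"role": "user", "content": user_query},
    ]

# Two different requests share an identical prefix up to the user turn.
a = build_messages("How do I reset my password?")
b = build_messages("Where can I download my invoice?")
assert a[0] == b[0]          # shared, cacheable prefix
assert a[1] != b[1]          # only the tail varies
```

Putting the variable part first instead (e.g. a per-user greeting before the system prompt) would make every request's prefix unique and forfeit the cache hit.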
GPT-4 is Getting Faster 🐇

Over the past few months, we've been keenly observing latencies for both GPT-3.5 and GPT-4, and the emerging patterns have been intriguing. The standout observation? GPT-4 is catching up in speed, closing the latency gap with GPT-3.5. Our findings reveal a consistent decline in GPT-4 latency. While your…
Vrushank Vyas Oct 16, 2023
Portkey Blog © 2025. Powered by Ghost