# AI Engineering Hours

> Discussion notes from the weekly AI engineering meetup

Teams from Springworks and Haptik shared hard-won insights from running LLMs in production: Gemini outperforms gpt-4o for Hinglish translation, and shifting to managed Gateways cuts latency in half. Plus practical tips on caching and RAG optimization at scale.

* SDE-2, Springworks
* DevOps Engineer, Jio Haptik
* Gen AI, NetApp
* Gen AI, NetApp

**On Production Patterns**

* Haptik & Springworks map Portkey virtual keys to their model deployments, making it simple for engineers to prototype & build AI features
* Monitor Portkey analytics to understand deployment behavior and pre-scale resources to avoid rate limits
* For secure testing, use short-lived virtual keys instead of sharing long-term access

**Some Learnings**

* Infrastructure insight: Each additional middleware layer (auth, rate limiting) compounds latency at scale - consider using Gateway features directly instead of custom layers
* Plan for caching early: Auxiliary services inevitably add latency at scale - implement caching in your initial development cycle
* In RAG pipelines, Vector DB operations become bottlenecks before LLM calls - optimize these first
* For Hinglish audio translations, especially with noise, Gemini proves more reliable than gpt-4o
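
The learnings on caching early and on finding the RAG bottleneck can be illustrated with a small, standard-library-only sketch. Nothing here is Portkey's API or either team's implementation: `TTLCache`, `StageTimer`, `search_vectors`, and `call_llm` are hypothetical names chosen for illustration, and the two pipeline functions are stand-ins for a real Vector DB query and a real LLM call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Any, Callable, Dict, Hashable, Iterator, Tuple


class TTLCache:
    """Hypothetical in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling compute() only on a miss."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:  # entry exists and is still fresh
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = compute()
        self._store[key] = (now + self._ttl, value)  # store expiry time with value
        return value


class StageTimer:
    """Hypothetical helper that accumulates wall-clock time per pipeline stage."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock
        self.totals: Dict[str, float] = defaultdict(float)

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = self._clock()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raised an exception.
            self.totals[name] += self._clock() - start

    def slowest(self) -> str:
        """Name of the stage with the largest accumulated time."""
        return max(self.totals, key=lambda name: self.totals[name])


def search_vectors(query: str) -> list:
    """Stand-in for a Vector DB lookup (assumption, not a real client call)."""
    return ["doc-1", "doc-2"]


def call_llm(query: str, docs: list) -> str:
    """Stand-in for an LLM request (assumption, not a real client call)."""
    return f"answer to {query!r} using {len(docs)} docs"


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)
    timer = StageTimer()
    query = "reset password"

    with timer.stage("vector_search"):
        # Repeated queries within the TTL skip the Vector DB entirely.
        docs = cache.get_or_compute(query, lambda: search_vectors(query))
    with timer.stage("llm_call"):
        answer = call_llm(query, docs)

    print(answer)
    print("slowest stage:", timer.slowest())
```

Timing each stage separately is what makes it possible to confirm, for a given workload, whether retrieval or generation dominates before deciding where to optimize. Both classes accept an injectable `clock` so the expiry and timing logic can be tested without sleeping.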