How a leading delivery platform built a GenAI system to support 1000+ engineers

From infrastructure bottlenecks to processing tens of millions of requests across 150+ AI models, one major food delivery company had to rethink how it managed its AI operations at scale.

About

A leading delivery platform that connects users with restaurants, grocery stores, and retailers, providing fast and efficient services.

Industry

Online food delivery

Company Size

10,000+ employees

Headquarters

North America

Founded

Early 2010s

Why Portkey

Unified files and batch API, fine-tuning API

Dozens of LLM Providers
1,000+ Engineers
Billions of AI Requests
The growing complexity of AI at scale

By early 2024, this company had expanded its AI initiatives across multiple teams, supporting use cases in fraud detection, customer service automation, and internal tooling. What started as a series of isolated experiments had grown into a vast AI ecosystem, with over 400 engineers building AI-powered features on top of 150+ models spread across multiple cloud providers.

Managing this scale came with serious infrastructure challenges. The machine learning platform team was under pressure to maintain performance, reliability, and cost efficiency while enabling fast-paced AI development.

The challenges of multi-provider AI infrastructure

As AI became more deeply integrated across teams, the delivery platform started experiencing mounting operational headaches.

Their engineers had to connect to dozens of different LLM providers, each with its own API to integrate, rate limits to monitor, and performance characteristics that varied widely between services. When calls to these services failed (which happened regularly at their scale), they needed fallback systems that could quickly step in and rescue those requests.
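
As a rough sketch of what gateway-level failover can look like, here is a minimal fallback-with-retry configuration in the style of Portkey's config schema. The virtual key names and models are illustrative placeholders, not the company's actual setup:

```python
# Minimal sketch of gateway-side failover, following Portkey's config schema.
# Virtual keys and model names below are hypothetical placeholders.
from portkey_ai import Portkey

config = {
    "strategy": {"mode": "fallback"},       # try targets in order until one succeeds
    "retry": {"attempts": 3},               # retry transient failures first
    "targets": [
        {"virtual_key": "openai-prod"},     # primary provider
        {"virtual_key": "anthropic-prod"},  # backup that rescues failed requests
    ],
}

client = Portkey(api_key="PORTKEY_API_KEY", config=config)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Categorize this support ticket."}],
)
```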

High-volume AI workloads can get expensive fast, especially without proper optimization. To reduce unnecessary costs, they wanted to set up request deduplication and caching for common queries.
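
At the gateway level, this is typically a small configuration change. A hedged sketch, again in the style of Portkey's config schema with placeholder values:

```python
# Sketch: serving repeated or near-duplicate prompts from cache instead of
# re-calling the provider. Mode and TTL values here are illustrative.
config = {
    "virtual_key": "openai-prod",  # hypothetical placeholder
    "cache": {
        "mode": "semantic",  # "simple" gives exact-match deduplication instead
        "max_age": 3600,     # seconds before a cached response expires
    },
}
```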

All this while keeping customer data secure and meeting compliance requirements.

The challenge wasn’t just about technology; it was about creating an infrastructure that balanced flexibility for developers with the control required at the enterprise level.

We needed a way to give teams the freedom to innovate while maintaining control over costs and security. The challenge wasn't just technical—it was about building a platform that could scale with our growth.

~ Engineering Lead, AI Platform Team

Finding the right solution

The team explored several options before discovering Portkey’s open-source AI Gateway. They were looking for a solution that could handle large-scale production workloads without sacrificing enterprise security. Portkey stood out for its:

  • Battle-tested gateway with sub-5ms latency

  • Support for all major LLM providers, enabling seamless provider switching

  • Intelligent request deduplication and caching to reduce redundant calls

  • Comprehensive cost monitoring and optimization tools

  • Enterprise-grade security architecture for compliance and data protection

Implementing an AI infrastructure built for scale

To maintain control over sensitive data, the company deployed Portkey in a hybrid model—running the data plane within their VPC while using Portkey’s cloud-based control and optimization features. This approach ensured that:

  • Sensitive data remained within their infrastructure

  • Observability across all AI requests was maintained

  • Cost tracking and usage monitoring were streamlined
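
A minimal sketch of this hybrid pattern, assuming the open-source gateway is self-hosted inside the VPC (for example via its Docker image) and clients simply point at the internal endpoint. The hostname below is a made-up placeholder:

```python
# Hybrid deployment sketch: the data plane (the gateway itself) runs inside
# the company's VPC, so prompts and completions never leave their network,
# while the cloud control plane handles configs, monitoring, and analytics.
from portkey_ai import Portkey

client = Portkey(
    api_key="PORTKEY_API_KEY",
    base_url="http://ai-gateway.internal:8787/v1",  # self-hosted gateway in the VPC
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Draft a refund explanation."}],
)
```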

The platform now processes tens of millions of AI requests per quarter. It automatically retries failed requests, dynamically reroutes traffic across LLM providers, and optimizes API calls through caching.

The question wasn’t whether we could build this infrastructure ourselves—we absolutely could. The question was whether we should dedicate our best engineers to infrastructure instead of AI products that drive business value.

~ Platform Director, AI Division

The impact: More efficiency, lower costs, and faster iteration

After implementing Portkey, the delivery platform saw tangible benefits that went beyond solving technical hurdles.

The system absorbed a 3,100x increase in traffic without breaking a sweat. Even during peak load, with traffic spiking to 1,800 RPS, the platform remained stable.

Reliability is built in. Smart fallback logic has already rescued nearly half a million failed requests, ensuring a seamless user experience. Caching prevents redundant API calls, and optimized routing directs traffic to the most cost-effective providers. Together, these systems have reduced overall LLM spend, saving over $500,000 to date.
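
Cost-aware routing of this kind is usually expressed as weighted load balancing. A hedged sketch in the same config style, with made-up weights and keys:

```python
# Sketch: weighted routing so a cheaper provider absorbs most traffic
# while a pricier one covers the remainder. Weights are illustrative.
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "cost-effective-provider", "weight": 0.8},
        {"virtual_key": "premium-provider", "weight": 0.2},
    ],
}
```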

Adding a new LLM provider, once a week-long integration effort, now takes hours. Teams can test and deploy new models quickly, accelerating time-to-value across use cases.

The system is now used by more than 1,000 engineers across 350+ workspaces. It routes traffic across Anthropic, OpenAI, Vertex, and others, automatically retrying failed requests, optimizing calls through caching, and maintaining 99.99% uptime across billions of requests.

Lessons for AI teams scaling their operations

Based on this delivery platform's experience, teams building large-scale AI systems should keep several points in mind.

Build with growth in mind from day one. Selecting infrastructure that can expand with your needs saves you from painful rebuilds down the road. What works for ten engineers won't necessarily work for a thousand.

Watch what's happening in your system. Without good visibility into costs and request patterns, small inefficiencies can quickly grow into major expenses. The platform team found that detailed monitoring helped them spot and fix problems before they impacted the bottom line.

Be ready for things to get complicated. Running AI at scale involves many moving parts, especially when you're using multiple models from different providers. You'll need smart systems that can handle failures gracefully, retry when needed, and switch between providers seamlessly.

The path this delivery company took shows how crucial it is to approach AI infrastructure strategically rather than tactically. As AI becomes central to more business functions, the right foundation is what separates sustainable growth from constant firefighting.

As AI adoption grows, having a flexible yet controlled infrastructure will be the difference between scalable innovation and unmanageable complexity.


If you’d like to see what Portkey can do for you, book a demo with us today.

