How a leading delivery platform built a GenAI system to support 1000+ engineers

From infrastructure bottlenecks to processing tens of millions of requests across 150+ AI models, one major food delivery company had to rethink how it managed its AI operations at scale.

About

A leading delivery platform that connects users with restaurants, grocery stores, and retailers, providing fast and efficient services.

Industry

Online food delivery

Company Size

10,000+ employees

Headquarters

North America

Founded

Early 2010s

Favorite Feature

Unified files and batch API, fine-tuning API

150+

AI Models

1000+

Engineers

1BN+

AI Requests

The growing complexity of AI at scale

By early 2024, this company had expanded its AI initiatives across multiple teams, supporting use cases in fraud detection, customer service automation, and internal tooling. What started as a series of isolated experiments had grown into a vast AI ecosystem, with over 400 engineers building AI-powered features on top of 150+ models spread across multiple cloud providers.

Managing this scale came with serious infrastructure challenges. The machine learning platform team was under pressure to maintain performance, reliability, and cost efficiency while enabling fast-paced AI development.

The challenges of multi-provider AI infrastructure

As AI became more deeply integrated across teams, the delivery platform started experiencing mounting operational headaches.

Their engineers had to connect to dozens of LLM providers, each with its own API to integrate, rate limits to monitor, and performance characteristics that varied widely between services. When calls to these services failed (which happened regularly at their scale), they needed backup systems that could step in quickly and rescue those requests.

High-volume AI workloads can get expensive fast, especially without proper optimization. To reduce unnecessary costs, they wanted to set up request deduplication and caching for common queries.
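To make that goal concrete, here is a minimal sketch of the kind of deduplication-and-caching layer a team might otherwise build themselves, assuming an exact-match, in-process cache. The `call_llm` callable is a hypothetical stand-in for a real provider client; a production system would typically use a shared store such as Redis and often semantic matching for near-duplicate prompts.

```python
import hashlib
import json
import time

# Illustrative only: an exact-match, in-process cache keyed on the
# full request payload, with a simple TTL.
_cache: dict[str, tuple[float, str]] = {}
TTL_SECONDS = 300

def cache_key(model: str, messages: list[dict]) -> str:
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(model: str, messages: list[dict], call_llm) -> str:
    key = cache_key(model, messages)
    hit = _cache.get(key)
    if hit and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]  # cache hit: the duplicate request never reaches the provider
    response = call_llm(model, messages)  # call_llm is a hypothetical provider client
    _cache[key] = (time.time(), response)
    return response
```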

And all of this had to happen while keeping customer data secure and meeting compliance requirements.

The challenge wasn’t just about technology - it was about creating an infrastructure that balanced flexibility for developers with the control required at an enterprise level.

We needed a way to give teams the freedom to innovate while maintaining control over costs and security. The challenge wasn't just technical—it was about building a platform that could scale with our growth.

~ Engineering Lead, AI Platform Team

Finding the right solution

The team explored several options before discovering Portkey’s open-source AI Gateway. They were looking for a solution that could handle large-scale production workloads without sacrificing enterprise security. Portkey stood out for its:

  • Battle-tested gateway with sub-5ms latency

  • Support for all major LLM providers, enabling seamless provider switching (see the sketch after this list)

  • Intelligent request deduplication and caching to reduce redundant calls

  • Comprehensive cost monitoring and optimization tools

  • Enterprise-grade security architecture for compliance and data protection
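As a hedged illustration of how these pieces fit together, the sketch below routes a request through the gateway with a fallback strategy, automatic retries, and simple caching. It follows Portkey's published Python SDK and gateway-config format, but the virtual keys, model name, and parameter values are placeholders rather than this company's actual setup.

```python
from portkey_ai import Portkey

# Gateway config following Portkey's documented schema; every key name
# and value below is an illustrative placeholder.
config = {
    "strategy": {"mode": "fallback"},             # try targets in order
    "targets": [
        {"virtual_key": "openai-prod"},           # primary provider credentials
        {"virtual_key": "anthropic-backup"},      # used if the primary fails
    ],
    "retry": {"attempts": 3},                     # retry transient failures
    "cache": {"mode": "simple", "max_age": 300},  # cache identical requests
}

client = Portkey(api_key="PORTKEY_API_KEY", config=config)

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize today's failed orders."}],
)
print(completion.choices[0].message.content)
```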

Implementing an AI infrastructure built for scale

To maintain control over sensitive data, the company deployed Portkey in a hybrid model—running the data plane within their VPC while using Portkey’s cloud-based control and optimization features (a sketch of this setup follows the list below). This approach ensured that:

  • Sensitive data remained within their infrastructure

  • Observability across all AI requests was maintained

  • Cost tracking and usage monitoring were streamlined
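A minimal sketch of what that hybrid wiring can look like, assuming a self-hosted instance of Portkey's open-source gateway running inside the VPC; the internal URL, port, and virtual key below are hypothetical placeholders, not the company's real configuration.

```python
from portkey_ai import Portkey

# Point the SDK at a gateway running inside the VPC (placeholder URL),
# so prompts and responses stay on the company's own network while
# Portkey's cloud features handle management and analytics.
client = Portkey(
    base_url="http://gateway.internal.example:8787/v1",  # hypothetical in-VPC endpoint
    api_key="PORTKEY_API_KEY",
    virtual_key="openai-prod",  # placeholder provider credentials
)

reply = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Health check"}],
)
```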

The platform now processes tens of millions of AI requests per quarter. It automatically retries failed requests, dynamically reroutes traffic across LLM providers, and optimizes API calls through caching.

The question wasn’t whether we could build this infrastructure ourselves—we absolutely could. The question was whether we should dedicate our best engineers to infrastructure instead of AI products that drive business value.

~ Platform Director, AI Division

The impact: More efficiency, lower costs, and faster iteration

After implementing Portkey, the delivery platform saw tangible benefits that went beyond just solving their technical challenges.

Their system now handles tens of millions of AI requests without breaking a sweat, running smoothly even during peak traffic periods. Smart retry mechanisms rescue hundreds of thousands of failed requests, maintaining a seamless experience for users. Additionally, caching prevents redundant API calls, while optimized routing directs traffic to the most cost-effective provider for each type of request. Together, these led to a remarkable 40% reduction in their overall LLM costs.

Security remains rock-solid with zero incidents since deployment. The hybrid architecture keeps sensitive data protected while still leveraging cloud-based management features.

The technical team gained agility too. Previously, adding a new LLM provider took weeks of integration work. Now they can bring new models online in hours, creating 10x faster deployment cycles for AI features across the organization.

Perhaps most impressive is the reliability—the platform maintains 99.99% uptime, ensuring AI capabilities are available whenever and wherever they're needed throughout the business.

Lessons for AI teams scaling their operations

Based on this delivery platform's experience, teams building large-scale AI systems should keep several points in mind.

Build with growth in mind from day one. Selecting infrastructure that can expand with your needs saves you from painful rebuilds down the road. What works for ten engineers won't necessarily work for a thousand.

Watch what's happening in your system. Without good visibility into costs and request patterns, small inefficiencies can quickly grow into major expenses. The platform team found that detailed monitoring helped them spot and fix problems before they impacted the bottom line.

Be ready for things to get complicated. Running AI at scale involves many moving parts, especially when you're using multiple models from different providers. You'll need smart systems that can handle failures gracefully, retry when needed, and switch between providers seamlessly.
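As an illustrative pattern (not the platform's actual code), the sketch below combines jittered exponential-backoff retries with provider failover; the provider names and `call_provider` callable are hypothetical stand-ins for real client code.

```python
import random
import time

# Illustrative pattern only: retry with exponential backoff, then fall
# through to the next provider in the list.
def robust_completion(prompt: str, providers: list[str], call_provider,
                      max_attempts: int = 3) -> str:
    for provider in providers:
        for attempt in range(max_attempts):
            try:
                return call_provider(provider, prompt)
            except Exception:
                # jittered exponential backoff before the next attempt
                time.sleep(2 ** attempt + random.random())
    raise RuntimeError("all providers exhausted")

# Usage: try the primary provider first, then fail over to a backup.
# answer = robust_completion("Where is my order?", ["openai", "anthropic"], call_provider)
```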

The path this delivery company took shows how crucial it is to think about AI infrastructure strategically rather than tactically. As AI becomes central to more business functions, the right foundation is what separates sustainable growth from constant firefighting.

As AI adoption grows, having a flexible yet controlled infrastructure will be the difference between scalable innovation and unmanageable complexity. If you’d like to see what Portkey can do for you, book a demo with us today.

