Claude Code best practices for enterprise teams

Claude Code is one of the most widely used AI coding agents today.

But when adoption moves beyond a handful of developers, the operational gaps show up fast. API keys get scattered. Costs spiral without anyone noticing. There's no visibility into who's using what, no guardrails, and no fallback when a provider goes down.

This guide covers the best practices every team should follow when running Claude Code at scale, and how to actually implement them.

Two ways to access Claude Code today

There are two ways to use Claude Code today.

  1. With a subscription (Pro, Max, Team, or Enterprise), you authenticate with your Claude account and get a fixed usage allocation included in your plan. Pro is $20/month, Max starts at $100/month, Team is $25/seat/month with premium seat upgrades for Claude Code access, and Enterprise is custom pricing. Usage limits reset every five hours and are shared between Claude's web/desktop apps and Claude Code. No API key required.
  2. With the Anthropic API (pay-as-you-go), you authenticate with an API key from the Anthropic Console and pay per token. This gives you dedicated limits that don't compete with your web usage, and full cost visibility in the Console dashboard.

Where this breaks down at scale

With subscriptions, usage is a black box. There's no dashboard showing per-developer token consumption or cost breakdowns. The only way to see actual usage is by parsing JSONL files in each developer's local ~/.claude/ directory. Admins on Team and Enterprise plans can set per-user spend caps and enable extra usage at API rates, but the visibility is limited. You can set a ceiling, but you can't see what's happening underneath it in real time.
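If you do need numbers from a subscription plan today, per-developer usage has to be scraped from those local JSONL files. A minimal sketch of that idea, assuming each line is a JSON object that may carry a message.usage block with input_tokens and output_tokens (the schema is undocumented and can change between versions, so verify against your own ~/.claude/ files first):

```python
import json

def sum_token_usage(jsonl_lines):
    """Sum input/output tokens across Claude Code session log lines.

    Assumes each line is a JSON object that may carry a
    message.usage block -- an undocumented, version-dependent
    schema, so treat field names as assumptions.
    """
    totals = {"input_tokens": 0, "output_tokens": 0}
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than crash
        usage = record.get("message", {}).get("usage", {})
        totals["input_tokens"] += usage.get("input_tokens", 0)
        totals["output_tokens"] += usage.get("output_tokens", 0)
    return totals
```

Even then, this is per-machine, after the fact, and only as reliable as the local files, which is exactly the visibility gap described above.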

With the API, you get better visibility but new problems. Every developer needs an API key. Keys get shared in Slack, committed to repos, stored in .env files. Revoking access means hunting down every copy. There's no built-in way to scope a key to a specific team or project, set per-key budget limits, or route traffic through fallback providers if Anthropic's API goes down.

And with either approach, there's no native way to:

  - isolate teams so one group's runaway session doesn't eat into another's capacity
  - apply guardrails to Claude Code's inputs and outputs
  - load balance across providers or set up automatic failover
  - switch between Anthropic, Bedrock, and Vertex AI without reconfiguring every developer's machine
  - get a single pane of glass showing cost, usage, and performance across the entire org

This is the gap an AI gateway fills. It sits between Claude Code and your LLM provider, giving platform teams the control layer that doesn't exist natively. Developers keep using Claude Code exactly as they do today. The AI gateway handles everything else. Here's how you can set it up:

1. Set up a credential hierarchy, don't hand out raw API keys

The first thing most teams get wrong is distributing raw provider API keys to individual developers. The better approach is to build a credential hierarchy through the gateway.

At the org level, store your actual provider API keys (Anthropic, Bedrock, Vertex AI) in one central place. Developers never see or touch these. At the team or project level, issue scoped API keys that inherit the provider credentials from above but have their own budget limits, rate limits, and access controls. At the developer level, individuals get their own key (or inherit their team's) that lets them use Claude Code normally. They can't exceed the limits set on their key, and they can't access the underlying provider credentials.

This gives platform teams a single place to rotate provider credentials, revoke access instantly by disabling a key, and scope every request to a team or individual without changing how developers work.
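Conceptually, the hierarchy is a chain where each key references its parent and can only add tighter limits. A hypothetical sketch of that model (the GatewayKey type and its fields are illustrative, not any gateway's actual API):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class GatewayKey:
    """Illustrative model of one node in a credential hierarchy."""
    name: str
    parent: Optional["GatewayKey"] = None
    monthly_budget_usd: Optional[float] = None  # None = no cap at this level
    spent_usd: float = 0.0

    def can_spend(self, amount: float) -> bool:
        """A request must fit under this key's cap and every ancestor's."""
        key = self
        while key is not None:
            if (key.monthly_budget_usd is not None
                    and key.spent_usd + amount > key.monthly_budget_usd):
                return False
            key = key.parent
        return True

    def record_spend(self, amount: float) -> None:
        """Attribute spend to this key and roll it up the hierarchy."""
        key = self
        while key is not None:
            key.spent_usd += amount
            key = key.parent
```

The useful property is that a developer key never holds a provider credential at all: revoking it, or exhausting its team's budget, cuts access without touching the org-level keys.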

2. Set budget and rate limits before anyone starts coding

Claude Code can burn through API credits fast, especially during agentic loops where it makes multiple sequential calls to reason through a problem. Without limits, a single runaway session can rack up significant costs overnight.

Set budget limits at the team or project level before distributing access. Enforce limits based on total cost or token consumption, scoped to individual developers, teams, or departments. For larger orgs, isolate teams into separate workspaces so each operates within its own spending boundary.

Budget limits cap total spend. Rate limits cap the velocity of requests. Both matter.

If you're sharing provider capacity across teams, especially on Bedrock or Vertex AI accounts with region-level quotas, one team's heavy usage can starve another. Per-team or per-developer rate limits prevent any single user from overwhelming the shared pool.
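Rate limiting at the gateway usually amounts to a token bucket per key or team. A minimal sketch of the mechanism (not any specific gateway's implementation):

```python
import time

class TokenBucket:
    """Per-team rate limiter: sustains `rate` requests/sec,
    with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per team keeps a heavy team from starving the shared quota.
buckets = {"payments": TokenBucket(rate=5, capacity=10),
           "search": TokenBucket(rate=5, capacity=10)}
```

A rejected request gets a 429 back instead of silently draining the region-level quota everyone else depends on.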

3. Get full visibility into every request

You can't manage what you can't see. Every Claude Code request should be logged with metadata: who made it, which team, which project, what model, how many tokens, what it cost, how long it took.

Detailed observability lets you break down costs by team or project, debug issues back to specific sessions, spot anomalies like sudden usage spikes, and build audit trails for compliance. If your org is in a regulated industry, this isn't optional.

4. Tag requests for cost attribution

Logging alone isn't enough if everything lands in one undifferentiated bucket. With metadata, you can tag every request with team, project, developer, and environment identifiers. This turns raw logs into actionable cost attribution.

When the finance team asks "how much did the payments team spend on Claude Code last month?", you should be able to answer in seconds, not days.
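Once requests carry tags, that question becomes a group-by over the logs. A sketch, assuming each log record carries the metadata described above (the team, month, and cost_usd field names are illustrative, not a fixed schema):

```python
from collections import defaultdict

def cost_by_team(log_records, month: str):
    """Aggregate per-request cost by the `team` metadata tag.

    `log_records` is an iterable of dicts; field names here are
    illustrative placeholders for whatever your gateway logs.
    """
    totals = defaultdict(float)
    for rec in log_records:
        if rec.get("month") == month:
            totals[rec.get("team", "untagged")] += rec.get("cost_usd", 0.0)
    return dict(totals)
```

The "untagged" bucket is worth keeping: anything landing there means a key was issued without attribution metadata and should be fixed.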

5. Set up provider fallbacks so outages don't stop work

If your only provider goes down, every developer on your team is stuck. Claude is available through Anthropic's direct API, AWS Bedrock, and Google Vertex AI. Configure automatic failover so that if one provider is unavailable, requests route to another without developer intervention.

You can also load balance across providers to distribute traffic, stay within rate limits, and improve overall reliability. Developers shouldn't need to know or care which provider is handling their request.
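In a gateway like Portkey, this is expressed as a routing config rather than client-side code. A sketch of the general shape (the @anthropic-prod and @bedrock-prod provider slugs are placeholders for your own; check your gateway's config reference for the exact schema):

```json
{
  "strategy": { "mode": "fallback" },
  "targets": [
    { "provider": "@anthropic-prod" },
    { "provider": "@bedrock-prod" }
  ]
}
```

Because the config lives at the gateway, changing the fallback order or adding a third target never requires touching a developer's settings.json.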

6. Apply guardrails to inputs and outputs

Claude Code operates in the terminal with the same permissions as the developer. Without guardrails, sensitive data can leak into prompts, and outputs can include content that violates org policies.

Validate both prompts and responses. Filter for PII before prompts leave your network. Enforce token or prompt length limits. Run content safety checks. Block prompt injection patterns. Layer in whatever policies your security team requires.
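The input side of that can start as simple pattern checks before a prompt leaves your network. A toy sketch only (production guardrails need a dedicated PII/content-safety service, and these patterns are far from exhaustive):

```python
import re

# Illustrative patterns only -- a real deployment should use a
# dedicated PII detection service, not a handful of regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def redact_pii(prompt: str) -> str:
    """Replace matched PII with a labeled placeholder before forwarding."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED_{label.upper()}]", prompt)
    return prompt
```

Running this at the gateway rather than on each laptop means the policy is enforced uniformly and updated in one place.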

7. Make provider switching easy

Locking into a single provider means you can't optimize for cost, latency, or availability. You should be able to swap providers or models without touching developer workflows.

This is useful for cost optimization (routing simpler tasks to cheaper models), experimenting with new providers, or responding to pricing changes without a fire drill.

How Portkey makes all of this possible

Every best practice above requires an operational layer between Claude Code and your LLM provider. That's what Portkey is.

Portkey acts as a unified LLM gateway that sits between Claude Code and any provider: Anthropic, Bedrock, or Vertex AI. It gives you the credential hierarchy, budget controls, rate limits, request logging, metadata tagging, provider fallbacks, guardrails, and multi-provider routing described above, all without changing how developers use Claude Code.

Setup takes about two minutes. Edit ~/.claude/settings.json (user-level) or .claude/settings.json (project-level):

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.portkey.ai",
    "ANTHROPIC_AUTH_TOKEN": "YOUR_PORTKEY_API_KEY",
    "ANTHROPIC_CUSTOM_HEADERS": "x-portkey-api-key: YOUR_PORTKEY_API_KEY\nx-portkey-provider: @anthropic-prod",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "claude-sonnet-4-20250514",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "claude-haiku-4-20250514"
  },
  "model": "claude-sonnet-4-20250514"
}

Portkey uses a single ANTHROPIC_BASE_URL for all providers. You don't need provider-specific URLs. The config pattern stays the same across Anthropic, Bedrock, and Vertex AI. Only the provider slug and model names change.
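For example, pointing the same setup at a Bedrock-backed provider is a slug and model-name swap. A sketch (the @bedrock-prod slug and the Bedrock model ID below are placeholders; use your own provider slug and the model IDs your Bedrock account actually exposes):

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.portkey.ai",
    "ANTHROPIC_AUTH_TOKEN": "YOUR_PORTKEY_API_KEY",
    "ANTHROPIC_CUSTOM_HEADERS": "x-portkey-api-key: YOUR_PORTKEY_API_KEY\nx-portkey-provider: @bedrock-prod"
  },
  "model": "anthropic.claude-sonnet-4-20250514-v1:0"
}
```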

Once traffic flows through Portkey, everything described in this guide is available out of the box: scoped API keys with budget and rate limits, full request logs with metadata and cost attribution, composable routing configs for fallbacks and load balancing, guardrail integrations with partners like AWS Bedrock Guardrails and Azure Content Safety, and provider flexibility without workflow changes.

Get started with the Claude Code integration docs or book a demo for a walkthrough of the enterprise setup.