Building vs. Buying an MCP Gateway

A capable engineer can stand up a working MCP gateway in an afternoon. Wrap a couple of tools, wire up the JSON-RPC handshake, route a call from Claude through to a server and back — it works, the demo lands, and the team gets a green light. Then it goes to staging, and the real shape of the problem appears. We build and operate MCP Manager, so this page is not a sales pitch that building is impossible. It plainly isn’t. It’s an honest account of what you are actually signing up for when you own a gateway in production — written from the inside, by the people maintaining one. If you’re genuinely on the fence, read this first.

The proxy is the part you can see

The routing layer — the reverse proxy that negotiates capabilities between a client and a set of tool servers — is real work, but it’s a couple of weeks of it, and coding agents make it faster still. Call it five percent of an enterprise deployment. The other ninety-five percent is everything that makes the gateway safe to put sensitive traffic through: brokering identity, holding credentials, inspecting traffic in flight, attributing every action, and keeping all of it current as the protocol moves underneath you. None of that shows up in the afternoon demo. All of it is blocking before the first real user. It’s the same reason a team that can validate a password doesn’t conclude they’ve built an identity provider. The hard part was never the part you could see.

It looks like an API gateway. It behaves nothing like one.

The instinct is to reach for the familiar pattern: a reverse proxy, some OAuth, structured logs, a rate limiter. Engineers have built each of those before. But an API gateway guards a fixed internal estate whose routes change slowly and only by a developer’s hand. An MCP gateway sits between AI agents and a sprawling, fast-moving ecosystem of third-party servers, and the traffic is a different animal:

It’s volatile. An agent stuck in a loop can fire hundreds of calls a minute. Load arrives in bursts you didn’t schedule.
It’s natural language wrapped in tool definitions. A payload that reads as ordinary developer text to a generic filter can carry an instruction to exfiltrate data. The structural context — which tool, which server, which identity, what came before — is the signal, and content-only inspection throws it away.
The threat model is new and still moving. The catalog of MCP-specific attacks is growing, and getting more sophisticated, faster than generic guardrails can keep up. (The Security Overview walks through the current set.)
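To make the volatility point concrete, here is a minimal sketch of a sliding-window guard that caps how often one identity can call one tool. The class name `BurstGuard`, its parameters, and the choice to key on identity and tool are assumptions made for illustration, not a description of how any particular gateway does it.

```python
from collections import defaultdict, deque


class BurstGuard:
    """Sliding-window call limiter keyed by (identity, tool). Hypothetical sketch."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._calls = defaultdict(deque)  # key -> timestamps of recent allowed calls

    def allow(self, identity: str, tool: str, now: float) -> bool:
        """Return True and record the call if it fits in the window, else False."""
        recent = self._calls[(identity, tool)]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True
```

The clock is passed in as `now` so the behavior is easy to test; a real deployment would more likely read `time.monotonic()` and would also need shared state across gateway instances, which this sketch ignores.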

What surfaces after the demo

This is the part teams don’t price in, because you can’t see it from the prototype. A few of the walls you hit — not a to-do list, just an honest picture of the terrain:

Authentication is two-sided, and nobody implements it the same way. Your gateway has to be an OAuth authorization server to your agents and an OAuth client to every upstream server at once. OAuth is a standard; the implementations are not. One provider rotates refresh tokens on every use and another never does; one omits expiry entirely; one answers an expired token with a clean 401 and another returns 200 OK with the failure buried in the body. Each divergence is a quiet week of work and a new way for a user’s agent to silently lose access on a Thursday afternoon.
The ecosystem is young and rough at the edges. Sessions drop and have to be recovered without spiralling into a retry loop. Some servers only signal a dead session through a prose error string, not a status code. Transports differ in their framing and their failure modes. None of this is in the spec’s happy path; all of it shows up in production.
You can inspect traffic, or you can stream it — not both for free. The moment you want to catch a leaked secret or a poisoned tool result, you have to hold the response to look at it instead of passing bytes straight through. Do that without care and you’ve traded away either latency or memory. This is a genuine architectural tension, not a setting.
Governance cannot live in the prompt. Telling a model it has read-only access is not enforcement — given a tool that can write, a model may use it if its reasoning concludes that would help. That isn’t a jailbreak; it’s the model doing its job. Real enforcement has to be deterministic and sit below the model, which is another system to design, build, and defend. (See Runtime Protections and Feature Governance.)
Identity is a graph, not a column. Who may call which tool, under whose credentials, attributed to whom — across users, teams, the whole organization, and every connection — is a multi-tenant isolation problem from day one, not a field you add later.
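The governance point above can be sketched as a default-deny check that runs in the gateway before a call is forwarded, regardless of what the model was told in its prompt. Everything named here (`ToolCall`, `PolicyDenied`, the `ALLOWED` table, the example identity and tool names) is hypothetical, and a flat allowlist is a deliberate simplification of the identity graph described in the last item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    identity: str
    server: str
    tool: str


class PolicyDenied(Exception):
    """Raised when a call is not explicitly permitted."""


# Hypothetical allowlist: identity -> set of (server, tool) pairs it may call.
ALLOWED = {
    "analyst@example.com": {("crm", "read_contact"), ("crm", "search_contacts")},
}


def enforce(call: ToolCall, allowed=ALLOWED) -> ToolCall:
    """Default-deny: anything not listed is rejected before it reaches upstream."""
    if (call.server, call.tool) not in allowed.get(call.identity, set()):
        raise PolicyDenied(f"{call.identity} may not call {call.server}/{call.tool}")
    return call
```

The design choice worth noting is that the decision depends only on the call's identity, server, and tool, so the same request always gets the same answer no matter how the model reasoned its way to it.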

The authentication problem alone, just the first item above, is already a large one, because your one gateway is both sides of OAuth at once and no two upstreams agree.

We’re deliberately not handing you the blueprint here. The point isn’t how each of these is solved; it’s that there are far more of them than the whiteboard suggests, and they keep arriving.
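Without prescribing a design, a small sketch shows what absorbing the upstream divergences described above can look like for a single token-refresh response. The field names `access_token`, `refresh_token`, `expires_in`, and `error` come from the OAuth 2.0 token response format; the function, the `TokenState` type, and the way each quirk is handled are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


class UpstreamAuthError(Exception):
    """Raised when the upstream signals failure, by status code or in the body."""


@dataclass
class TokenState:
    access_token: str
    refresh_token: Optional[str]
    expires_at: Optional[float]  # None means the upstream gave no expiry


def normalize_token_response(
    status: int, body: dict, previous_refresh: Optional[str], now: float
) -> TokenState:
    # Some upstreams report failure inside a 200 body, so check both places.
    if status != 200 or "error" in body:
        raise UpstreamAuthError(body.get("error", f"http_{status}"))
    if "access_token" not in body:
        raise UpstreamAuthError("missing_access_token")
    expires_in = body.get("expires_in")
    return TokenState(
        access_token=body["access_token"],
        # Rotating providers send a new refresh token; others omit it, so keep the old one.
        refresh_token=body.get("refresh_token", previous_refresh),
        # A missing expiry is recorded as unknown rather than guessed.
        expires_at=now + float(expires_in) if expires_in is not None else None,
    )
```

Each branch here corresponds to one of the divergences in the list, and a real upstream may well need branches this sketch does not have.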

The spec moves, and you inherit its roadmap

A one-time build is a fiction. The Model Context Protocol is actively evolving — transports have shifted, auth patterns have changed, and new capabilities land regularly. Own a gateway and you inherit that roadmap whether or not it’s on yours. On top of the spec itself, every upstream vendor changes its own OAuth behavior, deprecates a scope, or bumps an API version on its own schedule, and each change is a potential silent breakage you have to chase down. That’s not project work that ends; it’s an operating cost that doesn’t. A modest integration surface is a permanent half-person at the very least, and it grows with every server you add.
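One small piece of that upkeep is version negotiation: MCP clients and servers agree on a protocol version string during initialization, and whoever owns the gateway has to keep its supported list current. A minimal sketch, assuming a hypothetical `SUPPORTED_VERSIONS` list (the date strings are examples of published protocol revisions, not a claim about what any given gateway supports) and the rule that a server echoes the requested version when it supports it and otherwise offers its newest:

```python
# Hypothetical list, ordered oldest to newest. Keeping it current is the ongoing cost.
SUPPORTED_VERSIONS = ["2024-11-05", "2025-03-26", "2025-06-18"]


def negotiate_version(requested: str, supported=SUPPORTED_VERSIONS) -> str:
    """Echo the client's version if supported, otherwise offer the newest one we have."""
    if requested in supported:
        return requested
    return supported[-1]
```

The function is trivial; the cost the paragraph describes lies in everything a new entry in that list implies for transports, auth, and capability handling.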

You’d be maintaining a security product, not shipping a feature

This is the heart of it. Companies federate to Okta or Entra instead of writing their own authorization server, and buy an endpoint-protection product instead of writing their own — not because they couldn’t, but because security and governance infrastructure is a discipline of its own, best left to a team that does nothing else and learns from every customer at once. An MCP gateway is exactly that kind of platform. Two costs in particular tend to be discovered late:

Bus factor. A gateway one motivated senior engineer built in a quarter becomes load-bearing infrastructure two years later that nobody else fully understands.
Compliance creep. Audit logging starts as a nice-to-have and becomes mandatory the moment you sell into a regulated industry. A SOC 2 Type II report alone requires a months-long observation window, and the controls — immutable audit trails, retention policies, access reviews — have to be designed in from the start. Retrofitting them into a gateway that wasn’t built for them can cost as much as building it did.
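As an illustration of why audit controls are easier to design in than to retrofit, here is a minimal sketch of a hash-chained, append-only log in which each entry commits to the one before it. The function names and entry layout are assumptions, and this makes tampering detectable rather than impossible; real immutability would also depend on where and how the log is stored.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event whose hash covers both the event and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute the chain from the start; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because every hash depends on all earlier entries, bolting this onto a gateway that already has years of unchained logs leaves the old history unverifiable, which is the retrofit cost in miniature.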

Detection gets sharper the more it sees

There’s a subtler reason this is hard to do well alone: the quality of threat detection is largely a function of how much malicious traffic you’ve already seen. A team defending MCP traffic across many organizations turns a single attack on one customer into a defense for all of them — the dataset, the detection models, and the response playbooks all compound. An in-house gateway only ever sees its own traffic, so it starts near zero against each new technique and stays a step behind a threat landscape that’s moving quickly. You’d be asking one platform team to keep pace, alone, with the whole ecosystem’s worth of adversaries.

A rough sense of the bill

Exact figures depend entirely on your environment, so treat these as illustrative order-of-magnitude ranges, not a quote:

Layer	Rough effort
The MCP proxy itself	2–4 weeks
SSO federation (integration + licensing)	weeks, plus annual license cost
SCIM provisioning	4–8 weeks
Per-upstream OAuth	~1 week each to build — then permanent upkeep
Admin console, audit logging, retention	weeks to months
Compliance (e.g. SOC 2 Type II)	6+ months of calendar time

Stacked up, it’s the layers with no finish line that dominate: the visible proxy is the cheap base, and the perpetual cost lives in the per-upstream auth above it.

Added up, in-house first-year builds commonly land in the low-to-mid six figures and take six to nine months before the first user is governed, and then the maintenance line never goes away. The proxy you can see is rarely more than a rounding error against that total. And the maintenance line is the one that quietly hurts most: every engineer-week spent chasing a vendor’s changed OAuth scope is a week not spent on the product only your company can build.

We’ve watched capable teams start down the build road with a strong platform group and a clear plan, and arrive at the same realization a few months in: the routing layer shipped on schedule, and everything around it (the per-upstream auth, the audit trail an auditor would accept, the identity model that survives a reorg) quietly became a standing program no one had staffed. The build estimate was for the five percent.
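To see how the per-upstream row scales, the quantified rows of the table can be added up for a given number of upstream servers. This is a rough sketch only: it uses the table’s illustrative ranges, treats the number of upstreams as an input you supply, and leaves out every row the table gives without a number (SSO federation, admin console and audit logging, compliance, and all ongoing upkeep).

```python
# Week ranges copied from the quantified rows of the table above.
PROXY_WEEKS = (2, 4)
SCIM_WEEKS = (4, 8)
WEEKS_PER_UPSTREAM = 1  # "~1 week each to build", before permanent upkeep


def quantified_build_weeks(upstreams: int) -> tuple:
    """Return (low, high) engineer-weeks for the quantified rows only."""
    per_upstream_total = WEEKS_PER_UPSTREAM * upstreams
    low = PROXY_WEEKS[0] + SCIM_WEEKS[0] + per_upstream_total
    high = PROXY_WEEKS[1] + SCIM_WEEKS[1] + per_upstream_total
    return low, high
```

For ten upstreams this gives 16 to 22 engineer-weeks, of which the proxy itself is 2 to 4, and that is before any of the unquantified rows or the upkeep that follows.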

What adopting a gateway gets you instead

The flip side of every cost above is what you don’t carry when you adopt a gateway rather than build one:

	Build in-house	Adopt a gateway
Time to first governed user	months	weeks
Per-upstream OAuth upkeep	yours, forever	absorbed for you
Spec & vendor-change tracking	yours to chase	handled upstream
Compliance evidence (SOC 2, ISO 27001)	a build line item	a vendor checkbox
Threat detection	only your own traffic	sharpens across every customer
Where your engineers spend their time	maintaining a gateway	your own product

None of this makes building wrong everywhere. Adopting a gateway just moves the burden off your roadmap and onto a team whose whole job is to carry it.

When building it yourself is the right call

We’d rather you make this decision clear-eyed than regret it, so here’s the honest version. Building in-house is the right answer when all of these hold:

Your integration surface is narrow and stable — a handful of internal services you control, with no plan to add a long tail of third-party SaaS. The per-upstream OAuth burden, which is where the cost compounds, shrinks dramatically.
You have platform capacity available now — not theoretical headroom, but engineers who can own the gateway indefinitely and aren’t on your product’s critical path.
You have a hard requirement no vendor can meet — a fully air-gapped environment, an exotic compliance regime, or integration with a proprietary internal identity system.
You’re not on a clock — your security team can credibly hold the line on AI adoption for the better part of a year while you build.

If even one of those isn’t true, the math gets ugly faster than it looked at the whiteboard — and the bypass risk grows, because every month without a governed path is a month teams wire up ungoverned MCP servers on their own.

The middle path: own some layers, adopt others

The most considered teams we talk to don’t treat this as all-or-nothing. They split the stack by where the build economics are worst. Proprietary internal tooling — homegrown services that will never appear in a public catalog — is theirs to own, and a thin proxy in front of it is a reasonable build. The integration-and-governance layer — per-user OAuth across a long tail of third-party SaaS, identity, audit, compliance — is the one with the ugliest economics, so it’s the first they hand to a vendor. The useful question is rarely “build or buy” but “which layers do we own, and which do we let someone else carry?”

A few questions worth answering honestly

Before committing to build, the questions that tend to predict regret aren’t about the proxy — they’re about everything after it:

How many third-party integrations will you need in twelve months? In twenty-four?
Who owns the fix when an upstream changes its OAuth flow next quarter?
Have you budgeted SSO licensing and SCIM, or assumed manual provisioning at a hundred-plus users?
Do you need SOC 2 Type II evidence for this system, and has the observation window been scoped?
If the build stalls, what’s the exit — and does it mean buying anyway, on top of the sunk cost?

If two or more of those are uncomfortable, the build case is weaker than it looked on the whiteboard. For most organizations the honest question isn’t whether the team could build a gateway. It’s whether the next six to nine months of platform engineering are better spent building one, or building the thing only your company can.

Further reading

Architecture & Trust

How the gateway is hardened as the control point in the path of every call.

Security Overview

The MCP-specific threats a gateway has to address, and how each is met.

Enterprise Strategy & Lockdown

Where a gateway fits in an enterprise AI control stack, and how to lock it down.

Hosting & Data Residency

Where MCP Manager runs, what stays in your environment, and the self-hosting question.
