AI and Automation

What is MCP and why your SaaS needs an MCP server in 2026

MCP is the open protocol AI agents use to call your SaaS. Here is what it is, how it works, and when your product should ship a server in 2026.

April 19, 2026 · 7 min read

A Model Context Protocol (MCP) server is a standardized endpoint that exposes your SaaS tools, data, and actions to AI agents through a single protocol based on JSON-RPC 2.0. It is the bridge that lets Claude, ChatGPT, Cursor, VS Code, Copilot Studio, and any other MCP-capable client invoke your product without a custom integration per client.

In 2026, MCP is no longer a research curiosity. Anthropic introduced the protocol in November 2024, reached 97 million monthly SDK downloads by December 2025, and in the same month handed governance to the Agentic AI Foundation under the Linux Foundation, co-founded with Block and OpenAI. If your SaaS has endpoints worth calling from an AI agent, the question in 2026 is not whether you need an MCP server, but how soon you can ship one that holds up under real traffic.

The 30-second version

REST APIs were designed for deterministic callers: apps, scripts, partners. Agents are not deterministic callers. They discover capabilities at runtime, choose tools by natural-language intent, and compose calls across services. MCP gives them a protocol to do that without a bespoke wrapper per product. You write one MCP server. Every compliant client can use it.

Where MCP came from and why it spread

Anthropic introduced MCP in late November 2024 to solve the integration tax every AI product was paying: M clients times N tools equals M times N custom adapters. The protocol flips that to M plus N. OpenAI adopted it in April 2025. Microsoft folded it into Copilot Studio in July 2025. By early 2026, an industry tracker reported 28% of Fortune 500 companies had deployed MCP servers for production AI workflows, and Forrester projected 30% of enterprise application vendors would ship their own MCP server during 2026.

The specification lives at modelcontextprotocol.io. The latest revision at time of writing is 2025-11-25.

How an MCP server actually works

An MCP server is a process that speaks JSON-RPC 2.0 over one of two transports:

  • STDIO for local servers launched by the client (think a binary on the user's machine).
  • Streamable HTTP for remote servers reachable over the network, now the dominant transport for SaaS use cases.
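
Whichever transport you pick, every message on the wire is a JSON-RPC 2.0 envelope. Here is a sketch of the discovery exchange described below; the shapes follow the specification, but the tool itself is invented for illustration:

```typescript
// Client asks what the server offers (the discovery step described below).
const listRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
  params: {},
};

// Server replies with machine-readable tool descriptions the model can choose from.
const listResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    tools: [
      {
        name: "list_invoices", // hypothetical tool
        description: "List recent invoices for the authenticated account",
        inputSchema: {
          type: "object",
          properties: { limit: { type: "number" } },
        },
      },
    ],
  },
};
```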

The server exposes three primitives:

  • Tools. Callable actions (create invoice, query orders, run SQL). The client lists them at runtime and invokes them on the user's behalf.
  • Resources. Readable data (a file, a record, a query result). The client fetches them to populate the model's context.
  • Prompts. Server-owned prompt templates a client can surface to the user as shortcuts.

On connect, the client performs a capability handshake: it asks what the server offers, receives machine-readable descriptions, and presents them to the model. The model picks a tool by intent, the client invokes it, the server runs it, the result goes back into the conversation. No hardcoded endpoint. No SDK version to pin. No glue code in the client.
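
To make that concrete, here is a minimal sketch of a server exposing that one hypothetical tool, built with the official TypeScript SDK (@modelcontextprotocol/sdk). The server name and the fetchInvoices helper are invented; the SDK handles the handshake and tool listing for you:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Hypothetical backend call standing in for your real API.
async function fetchInvoices(limit: number) {
  return [{ id: "inv_456", amountCents: 4200, status: "paid" }].slice(0, limit);
}

const server = new McpServer({ name: "acme-billing", version: "1.0.0" });

// One tool. The client discovers it via tools/list; the model picks it by intent.
server.tool(
  "list_invoices",
  "List recent invoices for the authenticated account",
  { limit: z.number().int().min(1).max(50).default(10) },
  async ({ limit }) => ({
    content: [{ type: "text", text: JSON.stringify(await fetchInvoices(limit)) }],
  })
);

// STDIO transport for a locally launched server; swap in Streamable HTTP for remote.
await server.connect(new StdioServerTransport());
```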

Why REST alone is no longer enough

A common question: "We already have a REST API, so why ship an MCP server on top of it?" Three reasons.

Runtime discovery. Agents do not read OpenAPI specs the way a developer does. MCP is designed for agents to query capabilities at connect time, pick the right tool, and adapt as your product changes. New tool on the server? The next connection sees it.

Token economics. Calling a REST API from an LLM means serializing arguments, stuffing the response back into the context window, and paying for every token in both directions. Benchmarks put MCP at roughly 15% to 25% slower per call than a direct REST call due to JSON-RPC overhead, but at 50% to 80% fewer tokens consumed by the model, which is the line item that drives inference cost.

Consumer integration cost. Every client that wants to call your REST API writes its own wrapper, handles its own auth flow, and maintains its own retries. With MCP, the client writes zero integration code for your product. It points at your server URL and goes.

This does not make REST obsolete. High-throughput, deterministic, app-to-app traffic still belongs on REST or gRPC. MCP sits on top for the agent layer.

Security: what changes when your product is agent-addressable

Remote MCP servers use OAuth 2.1 with PKCE, and the SHA-256 (S256) challenge method is mandatory: the specification explicitly forbids the plain method and requires clients to prove they hold the authorization code. The 2025-11-25 revision added role-based authorization annotations so servers can gate individual tools behind specific user roles.
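
The S256 transform itself is small. A sketch of how a client derives the challenge it sends, per RFC 7636 (Node.js crypto; variable names are illustrative):

```typescript
import { randomBytes, createHash } from "node:crypto";

// code_verifier: a high-entropy random string the client keeps secret.
const codeVerifier = randomBytes(32).toString("base64url");

// code_challenge: BASE64URL(SHA-256(code_verifier)) — the S256 method.
// At token exchange the server re-derives this from the verifier, proving
// the party redeeming the authorization code is the one that requested it.
const codeChallenge = createHash("sha256").update(codeVerifier).digest("base64url");
```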

The attack surface is real. Over 2025, researchers cataloged more than a thousand publicly exposed MCP servers with weak or missing auth, plus a family of credential-handling bugs across popular SDKs. A survey of over a thousand enterprise technology leaders reported 62% identified security as their top concern when deploying AI agents.

Treat an MCP server the way you treat a public API: threat-model it, scope tokens narrowly, log every tool call, and rotate. Do not let the novelty of the protocol talk you out of ordinary hygiene.

When your SaaS needs an MCP server

You need one when any of these is true:

  • Your users already try to use your product from inside Claude, ChatGPT, Cursor, Copilot, or a custom agent, and the experience is clunky.
  • You sell to developers or operations teams who live in Claude Code or VS Code.
  • Your product holds data useful to an agent mid-task (CRM records, feature flags, dashboards, tickets, documents).
  • A competitor in your category has shipped one. The 2026 buyer expects it.

You can defer shipping one when your product is consumed almost entirely through your own UI, your API has no read or write operations worth automating, or your users are not technical and will not install agent tooling.

What to expose (and what to hide)

The temptation when wrapping an existing API is to expose everything. Resist. A good MCP surface is a curated set of verbs the agent can use safely:

  • Read operations that return structured data with stable shapes.
  • High-signal write operations that are idempotent or gated by a confirmation step.
  • Search and query primitives rather than raw pagination plumbing, which agents still struggle to drive correctly.

Leave out destructive administrative endpoints, anything that requires complex state machines the agent cannot reason about, and anything that depends on UI context the server cannot verify. If a tool can delete a production table, it should not live on an MCP server a junior engineer connected to a chat client.
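
A common pattern for the write side is a two-step confirmation baked into the tool itself. Continuing the SDK sketch from earlier (the tool name and the cancelAtPeriodEnd helper are invented):

```typescript
// Assumes the `server` and `z` from the earlier SDK sketch.
// Hypothetical backend call, idempotent by design: cancelling twice is a no-op.
async function cancelAtPeriodEnd(subscriptionId: string): Promise<void> {}

server.tool(
  "cancel_subscription",
  "Cancel a subscription at period end. Pass confirm: true only after the user approves.",
  { subscriptionId: z.string(), confirm: z.boolean().default(false) },
  async ({ subscriptionId, confirm }) => {
    if (!confirm) {
      // First call is a dry run: the agent relays this to the user,
      // then retries with confirm: true once the user has approved.
      return {
        content: [{
          type: "text",
          text: `Would cancel ${subscriptionId} at period end. Call again with confirm: true to proceed.`,
        }],
      };
    }
    await cancelAtPeriodEnd(subscriptionId);
    return { content: [{ type: "text", text: `Cancelled ${subscriptionId}.` }] };
  }
);
```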

Hosted vs self-built

Two paths exist.

Self-built. You write an MCP server in TypeScript, Python, or Go using the official SDKs, deploy it next to your API, and own the transport, auth, and observability. Best when the product has non-trivial business logic, when you want deep control, or when compliance rules out third-party handling of user tokens.

Hosted MCP platforms. Vendors expose existing SaaS APIs as MCP servers and handle auth, tool curation, and rate limiting. Faster to market, narrower control, and a third party in the token path. Fine for a v1, weaker as a long-term position once agents become a primary interface.

A pragmatic path is self-built for the tools that define your product, and hosted for the peripheral integrations that merely support it (for example, exposing a partner CRM you resell).

Adjacent concepts worth knowing

  • Tool calling (OpenAI function calling, Anthropic tools). The per-vendor mechanism an MCP client uses under the hood to invoke your server's tools.
  • A2A (Agent to Agent protocol). Complementary, not competing. MCP connects agents to tools; A2A connects agents to each other.
  • llms.txt. A static manifest at your site root that advertises which canonical pages AI crawlers should index. Distinct from MCP but often deployed in the same 2026 AI-readiness checklist (see our piece on Generative Engine Optimization).


Frequently asked questions

Do we need a separate MCP server per client (Claude, ChatGPT, Cursor)?
No. A single MCP server works for every compliant client. That is the point of the standard. You ship one server, register it once, and Claude, ChatGPT, Cursor, VS Code, Copilot Studio, and any future agent-capable client connect to the same URL. If you find yourself writing client-specific branches, you are doing something wrong or you are hitting a client-specific transport quirk worth filing upstream.
Is MCP a replacement for OpenAPI?
No. OpenAPI describes a REST surface to developers and code generators. MCP describes a tool surface to AI agents at runtime. They solve different problems and usually coexist: REST + OpenAPI for humans and traditional apps, MCP for agents on top of the same backend. Several teams auto-generate the MCP layer from the OpenAPI document, but the mapping is rarely one-to-one, because the surface worth exposing to an agent differs from the one a developer consumes.
What does it cost to run an MCP server in production?
Baseline infrastructure is comparable to a small API service: a container, a database connection, TLS. The variable cost is traffic. Agent-driven traffic is lumpy and token-sensitive, so you pay more attention to response size and pagination than to request volume. Most teams we see absorb MCP hosting into their existing API budget; the real investment is in defining a safe tool surface and the auth layer around it, not the runtime.
