The Protocol Everyone Suddenly Speaks
Model Context Protocol shipped in November 2024 as an Anthropic blog post and a few reference servers. Eighteen months later it is the closest thing the agent ecosystem has to TCP for tools. Every major IDE, every coding agent, every serious framework either speaks MCP natively or has shipped an adapter to fake it. The spec moved from a curiosity to a hiring filter inside a single calendar year, which is the kind of velocity standards committees normally take a decade to imitate.
The pitch is simple. Before MCP, every integration between an LLM and an external system was a bespoke contract — one tool format for OpenAI, another for Anthropic, another for whatever framework you happened to be using that quarter. MCP collapses that mess into a single client-server protocol over JSON-RPC 2.0. Write your server once, and any compliant host can talk to it. The protocol does not care whether the client is Claude Desktop, Cursor, a custom agent loop, or a CI bot inspecting your repo.
That sounds boring on a slide. It stops being boring the moment you realize you can replace fifteen one-off tool wrappers with three MCP servers and a config file.
The Architecture in One Honest Diagram
MCP defines three roles. The host is the application the user is actually using — Claude Desktop, an IDE, a chat UI. The host owns the model, the user interface, and the trust boundary. Inside the host live one or more clients, each holding a stateful connection to exactly one server. That one-to-one client-to-server pairing is unusual, and it is on purpose: it means a host can fence off servers from each other and decide which ones get to see which conversations.
Servers are the actual integrations. A GitHub server exposes repos, issues, and PRs. A Postgres server exposes schemas and read-only queries. A filesystem server exposes whatever directory you scoped it to. Each server is a standalone process — usually a small program in Python, TypeScript, Go, or Rust — that speaks JSON-RPC and exposes a defined surface area.
The boring part of the protocol is also the most important. Every message is JSON-RPC 2.0. Every connection begins with a capability handshake in which client and server announce what they support. Every tool call has structured arguments and structured results. There is no string-template prompt-stuffing. There is no "the LLM hopefully calls the function the right way." The contract is explicit and machine-validated, which is precisely the property that makes the rest of the ecosystem composable.
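On the wire, that handshake looks roughly like this, rendered here as Python dicts for readability. The method and field names come from the spec; the protocol version string and the client and server names are placeholders.

```python
# Client -> server: the first message on any MCP connection.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-06-18",  # placeholder spec revision
        "capabilities": {"sampling": {}},  # what this client supports
        "clientInfo": {"name": "example-host", "version": "0.1.0"},
    },
}

# Server -> client: the server declares its own surface area.
initialize_result = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2025-06-18",
        "capabilities": {"tools": {"listChanged": True}, "resources": {}},
        "serverInfo": {"name": "github-server", "version": "1.0.0"},
    },
}
```

Everything after this exchange is bounded by what each side declared: a server that never announced tools will never receive a tools/call.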
The Three Primitives That Actually Matter
Tools, resources, and prompts. That is the entire surface a server can expose, and understanding the difference is what separates someone who has read the README from someone who has shipped against the protocol.
Tools are functions the model can invoke, with the user's approval. A tool has a name, a JSON Schema for its input, and a return contract. When the model decides to call create_issue on the GitHub server, the host shows the user the proposed call, the user confirms, and the server executes. Tools are how you let the model take action — write a file, query a database, fire a webhook. The user-approval step is not optional in the spec, and skipping it is how teams end up in security incident reviews.
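Here is what a tool looks like as a sketch in the official Python SDK's FastMCP style; the server name and the stubbed GitHub call are illustrative, and the JSON Schema is generated from the type hints rather than written by hand.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("github-lite")  # illustrative server name

@mcp.tool()
def create_issue(repo: str, title: str, body: str) -> str:
    """Open an issue in the given repo and return its URL."""
    # A real server would call the GitHub API here; stubbed for the sketch.
    return f"https://github.com/{repo}/issues/1"
```

The host discovers this surface via tools/list, renders the approval dialog from the schema, and only then executes the call.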
Resources are read-only data the client can pull into the model's context. A resource has a URI and content. The Postgres server might expose tables as resources. The filesystem server exposes files. Resources are what you reach for when the model needs to know something rather than do something. The split matters because resources can be cached and re-read deterministically, while tools cannot.
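Resources follow the same decorator pattern, keyed by a URI template; the scheme and the canned schema string below are illustrative.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("postgres-lite")  # illustrative server name

@mcp.resource("schema://{table}")
def table_schema(table: str) -> str:
    """Expose one table's column definitions as read-only context."""
    # Stub: a real server would query information_schema.columns.
    return f"{table}: id integer, created_at timestamptz"
```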
Prompts are the part everyone forgets. A prompt is a server-defined template the user can invoke explicitly, usually surfaced as a slash command in the host UI. When the user types /review-pr 1234, the GitHub server returns a fully composed prompt — PR description, diff, recent comments, contributor history — that the host injects into the model. Prompts are how a server author guides users into the model interactions the server was designed for, instead of leaving everyone to invent their own scaffolding.
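In the same sketch style, a prompt is just another decorated function; a real server would fetch the PR description, diff, and comments before composing, and the wording here is invented for the example.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("github-lite")  # illustrative server name

@mcp.prompt()
def review_pr(pr_number: str) -> str:
    """Compose a full review prompt for one pull request."""
    # Stub: pull the description, diff, and comment thread, then compose.
    return (
        f"Review pull request #{pr_number}. Walk the diff first, then the "
        "discussion. Flag risky changes before style nits."
    )
```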
The aggressive opinion: most MCP servers in the wild ship tools and ignore prompts entirely. That is a mistake. Prompts are where domain expertise compounds.
Transports: STDIO Until You Cannot
MCP supports three transports, but in practice you will pick between two. STDIO is the default for local servers. The host launches the server as a subprocess, pipes JSON-RPC over stdin and stdout, and the operating system handles isolation through process boundaries and file permissions. There are no ports, no auth tokens, no certificates. It is the simplest secure transport ever designed because the security model is "the server only exists because the user launched it."
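Wiring a server to STDIO is a one-liner, at least in the Python SDK's FastMCP interface; the server name is illustrative.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("local-files")  # illustrative server name

if __name__ == "__main__":
    # The host launches this file as a subprocess and owns its lifetime.
    mcp.run(transport="stdio")  # stdio is also the default
```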
Streamable HTTP is the modern transport for remote servers. Replacing the older HTTP plus Server-Sent Events split, Streamable HTTP routes everything through a single /mcp endpoint, tracks session state through the Mcp-Session-Id header, and lets the same endpoint return either a JSON response or a streaming SSE response depending on what the call needs. It is what you reach for when the server lives on a different network from the client — a hosted GitHub server, a corporate knowledge base, an enterprise data lake.
The legacy SSE transport still exists in older deployments. New servers should not be using it. The official spec deprecated it in favor of Streamable HTTP, and the migration is mostly a five-line change in the SDK.
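In the Python SDK the change really is about that small, assuming the current FastMCP transport names:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("remote-docs")  # illustrative server name

if __name__ == "__main__":
    # Before the migration this read: mcp.run(transport="sse")
    mcp.run(transport="streamable-http")
```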
The transport choice is not about performance; it is about trust. STDIO inherits the user's local trust boundary — the server can do anything the user can do, and that is fine because the user explicitly launched it. HTTP transports cross networks, and the moment you do that you need OAuth, scoped tokens, IP allowlists, and the entire security apparatus that comes with exposing an API. Most teams underestimate this jump until their first audit.
Security: The Part That Should Have Been Chapter One
The most viral MCP failure modes are not bugs in the protocol; they are misuses of it. The Supabase Cursor incident in mid-2025 was the canonical case study: a service-role MCP connection processing user-submitted support tickets, an attacker embedding instructions in the ticket body, and the agent dutifully running SQL that exfiltrated integration tokens into a public thread. Every link in the chain was operating as designed, and the chain ended with a breach.
The lesson is that MCP gives the model a wider blast radius, and the protocol does not protect you from a poorly scoped server. Three things matter in production. First, never connect a server with privileged credentials to a model that ingests untrusted input. The Supabase agent should have been read-only, or it should not have been processing third-party tickets at all. Second, every tool call must have an approval gate, and "approval" must be a real human decision rather than a "yes to all" toggle that becomes muscle memory. Third, run the server with the smallest possible permission set — read-only by default, write access only on tools that need it, and absolutely no production credentials in development environments.
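A sketch of that third point, assuming psycopg 3 as the driver; the environment variable and server name are invented for the example, and the session-level flag is a belt on top of a read-only database role, not a substitute for one.

```python
import os

import psycopg  # assumption: psycopg 3 as the Postgres driver
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("postgres-readonly")  # illustrative server name

@mcp.tool()
def run_query(sql: str) -> str:
    """Run a read-only query and return the rows as text."""
    conn = psycopg.connect(os.environ["PG_READONLY_URL"])  # read-only role
    conn.read_only = True  # the session itself refuses writes
    with conn, conn.cursor() as cur:
        cur.execute(sql)
        return "\n".join(str(row) for row in cur.fetchall())
```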
Indirect injection through documents, repos, and web pages has been steadily collecting CVEs since the protocol shipped, and it occupies the same architectural seat that unsanitized input occupied in the early web era. A document the agent reads can contain instructions the agent then follows. A web page the browser server fetches can hijack the conversation. A pull request comment can tell the agent to leak environment variables. The mitigations are the same ones you have been reading about for a year: sandboxing, content provenance, output filtering, and refusing to mix privileged tool access with content the model did not author. None of them are perfect, all of them are mandatory.
What Production Actually Looks Like
A real MCP deployment is not one big server. It is a small constellation. A read-only Postgres server scoped to a specific schema, an internal docs server pointed at a Confluence space, a GitHub server with a token limited to one repo, and a filesystem server scoped to the user's project directory. Each server runs as its own process. Each server has its own credentials. Each server fails independently. The host stitches them together, but no individual server has a view of the agent's full capabilities.
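Written down, that constellation fits on one screen. The sketch below is a Python dict mirroring the JSON mcpServers map that hosts like Claude Desktop read; package names, environment variable names, and paths are illustrative.

```python
mcp_servers = {
    "mcpServers": {
        "postgres": {
            "command": "uvx",
            "args": ["mcp-server-postgres"],  # illustrative package
            "env": {"PG_READONLY_URL": "postgresql://reporting_ro@db/analytics"},
        },
        "docs": {
            "command": "uvx",
            "args": ["mcp-server-confluence"],  # illustrative package
            "env": {"CONFLUENCE_SPACE": "ENG"},
        },
        "github": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-github"],
            "env": {"GITHUB_TOKEN": "<token scoped to one repo>"},  # env var name illustrative
        },
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/me/project"],
        },
    }
}
```

Each entry launches as its own STDIO subprocess with its own credentials, which is exactly the independent-failure property the topology is built on.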
That fragmented topology is the right one. It mirrors the way you would structure microservices for any other system, and for the same reasons: blast radius, observability, and the ability to swap one piece without rebuilding the whole stack. Teams that try to build one giant MCP server with thirty tools end up with a permission model nobody can reason about and a debugging experience that punishes them every time something goes wrong.
The 2026 MCP roadmap is boring in the best way: transport scalability, agent-to-agent communication, governance, and enterprise readiness. Server Cards — a standard for advertising server capabilities at a well-known URL — shipped in v2.1, which means clients can now discover and audit servers without connecting first. The spec is doing exactly what a protocol needs to do at this stage: hardening the plumbing instead of adding features.
The Verdict
MCP is one of those rare protocols where the boring engineering decisions ended up being the right ones. JSON-RPC over standard transports, three primitives that are easy to explain, an explicit user-approval boundary, and a spec that admits security is hard rather than pretending the LLM will figure it out. It is not the protocol that wins because of any single feature. It is the protocol that wins because every decision is the conservative one.
If you are building agent infrastructure in 2026 and you are not speaking MCP, you are either ahead of the curve in a way you can defend or behind it in a way you cannot. Most of the time, it is the second one.