
MCP — External Capabilities

Context Perspective: The definitions and return values of MCP tools enter the context just like built-in tools. The LLM does not distinguish their origins.

The previous chapter's built-in tools all execute locally — reading files, running commands, the agent handles it directly. But what if you want the agent to scrape a webpage, query Slack messages, or call a company-internal API?

Wait for the agent developer to add it? Impractical. Modify the agent's source code yourself? Even less realistic.

You need a standard interface that lets any external capability plug into the agent. That's MCP (Model Context Protocol).

[Figure: MCP architecture. The agent's client runtime provides a universal interface layer (JSON-RPC) over two transports, stdio (local) and Streamable HTTP (remote), connecting to local files, databases, browsers, and remote APIs like Jira/Slack. Tool definitions are injected into the LLM API request: "it's just a tool_call."]

What is MCP?

One sentence: the USB port of the agent world.

MCP is an open protocol. It defines a standard that allows anyone to develop tools for an agent without modifying the agent's own code. Just as a USB device doesn't need to understand a computer's internals, an MCP tool doesn't need to know the agent's implementation details.

The agent loads these pre-configured external tools at runtime. You just configure it — no code required.

[Figure: MCP in one picture: client, server, transport. Inside the agent, the MCP Client discovers servers, negotiates capabilities, and calls tools, wrapping results as role: "tool" messages; the LLM sees a single tool interface where built-in and MCP tools look the same. The same protocol runs over two transports: stdio for a local server process the agent spawns and manages (e.g. docs_search(), postgres_query()), and Streamable HTTP for a remote service suited to shared access and long-running services (e.g. search_jira(), query_slack()). A "Server" can be a local process or a remote service.]

Server and Client

These two terms trip people up. Let's clear up a common misconception: an MCP Server is not necessarily a remote server.

  • MCP Client: A component running inside the agent, responsible for discovering, connecting to, and calling tools on MCP Servers. You typically don't interact with it directly.
  • MCP Server: The party that provides tools. It can be a local process or a remote HTTP service.

You'll encounter all kinds of MCP Servers. Some real examples: Context7 (documentation lookup), Tavily/Exa (search engines), DeepWiki (repository documentation), Firecrawl (web scraping), Grep.app (GitHub code search). Some run locally on your machine via stdio (like Context7, Firecrawl); others run as remote HTTP services (like Exa, DeepWiki, Tavily).
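Under the hood, client and server speak JSON-RPC. A minimal sketch of tool discovery — the method name tools/list and the inputSchema field come from the MCP specification, while the scrape_url tool is this chapter's running example, not any real server's output:

```json
// → MCP Client → MCP Server: "what tools do you offer?"
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

// ← MCP Server → MCP Client: tool definitions the agent will
//   later forward to the LLM in its tools[] array
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "scrape_url",
        "description": "Scrapes the content of a URL and returns markdown",
        "inputSchema": {
          "type": "object",
          "properties": { "url": { "type": "string" } }
        }
      }
    ]
  }
}
```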

Two Transport Modes

MCP supports two ways to connect to a Server:

stdio (local child process): The agent spawns a child process to run the MCP Server, communicating via stdin/stdout. The agent manages the process's entire lifecycle—startup, communication, shutdown. Write command: "npx", args: ["-y", "@upstash/context7-mcp"] in your config, and you've connected to the Context7 documentation lookup service.

Streamable HTTP (remote service): The MCP Server runs as an independent HTTP service, and the agent connects via HTTP requests. Suited for scenarios requiring persistent uptime or shared access across multiple agents.
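Concretely, both modes reduce to a few lines of configuration. A sketch in the mcpServers shape used by several MCP clients — the exact file name varies by client, and the remote URL here is illustrative:

```json
{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    },
    "internal-api": {
      "url": "https://mcp.example.com/mcp"
    }
  }
}
```

The first entry spawns a local child process over stdio; the second connects to a remote Streamable HTTP service. Same protocol on both sides of the config.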

For you, the difference is just configuration. The LLM neither knows nor cares.


Functionally Equivalent, Different Origin

Here's the key: to the LLM, built-in tools and MCP tools are indistinguishable.

── Round 1 ──

Say we have a scrape_url tool accessed via MCP. When the user asks a question, the agent places all available tools (built-in + MCP) into the context together:

```json
// → REQUEST (agent → LLM API)
{
  "system": "You are a project assistant...",
  "tools": [
    {
      "name": "read_file",
      "description": "Reads the content of a file",
      "input_schema": { "...": "..." }
    },
    {
      "name": "scrape_url",
      "description": "Scrapes the content of a URL and returns markdown",
      "input_schema": { "...": "..." }
    }
  ],
  "messages": [{ "role": "user", "content": "What does the page at https://example.com/docs/api say?" }]
}
```

The LLM picks the most appropriate tool:

```json
// ← RESPONSE (LLM API → agent, SSE stream)
{
  "role": "assistant",
  "content": "Okay, scraping the page...",
  "tool_calls": [
    { "id": "call_xyz789", "name": "scrape_url", "arguments": { "url": "https://example.com/docs/api" } }
  ]
}
```

LLM's perspective: identical to calling read_file — just a tool_calls response.

Agent's perspective: different execution path.
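What "different execution path" means concretely: the agent translates the LLM's tool_calls entry into a JSON-RPC tools/call request to the MCP Server, then unwraps the result. A sketch — the method and result shape follow the MCP specification, and the payload reuses this chapter's running example:

```json
// → Agent → MCP Server (JSON-RPC): forward the tool call
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "scrape_url",
    "arguments": { "url": "https://example.com/docs/api" }
  }
}

// ← MCP Server → Agent: content the agent will unwrap
//   into a role: "tool" message for the LLM
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "content": [
      { "type": "text", "text": "# API Reference\n\n## Authentication\n..." }
    ]
  }
}
```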

── Round 2 ──

After the MCP Server returns results, the agent wraps them into a tool-role message and appends to the conversation history. Same format as built-in tool results:

```json
// → REQUEST (agent → LLM API)
{
  "messages": [
    // ... previous messages
    {
      "role": "tool",
      "tool_call_id": "call_xyz789",
      "content": "# API Reference\n\n## Authentication\nAll requests require a Bearer token...\n\n## Endpoints\n- GET /users — List all users\n- POST /users — Create a new user\n..."
    }
  ]
}
```

```json
// ← RESPONSE (LLM API → agent, SSE stream)
{
  "role": "assistant",
  "content": "This page is API documentation. Key points:\n1. Authentication: Bearer token required\n2. Two endpoints: GET /users (list users) and POST /users (create user)\n\nWant me to dig into a specific endpoint?"
}
```

The LLM only cares that it got the page content. It doesn't know, and doesn't need to know, whether that content came from the local machine or a remote server.

One-line summary: LLM layer — fully equivalent. Agent execution layer — different paths.

[Figure: Context cost of MCP tools. Tool definitions are static context injected on every request; more tools means less room for messages. Two profiles share the same total context budget. Profile A (lean: enable only the servers you need, e.g. Search + Docs) leaves most of the window for instructions, messages, and tool outputs, switching tool schemas per task. Profile B (everything enabled "just in case": Search, Docs, DB, Internal API) lets tools[] crowd out the rest. Rule of thumb: keep MCP off by default; enable by profile when needed.]

Flexibility is easy to understand. The cost? Each connected MCP Server injects all of its tool definitions into every request. Enable ten Servers at once, and dozens of tool definitions permanently occupy the context window—squeezing out space for your instructions, conversation history, and tool return values.

In practice: create different MCP profiles for different task types—one set for coding, another for data work. Off by default, on when needed.
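One way to implement this, sketched as two hypothetical profile files the agent loads per task — the file names and the postgres package name are illustrative, not any client's actual layout:

```json
// mcp.coding.json — enabled while writing code
{
  "mcpServers": {
    "context7": { "command": "npx", "args": ["-y", "@upstash/context7-mcp"] }
  }
}

// mcp.data.json — enabled for data work
{
  "mcpServers": {
    "postgres": { "command": "npx", "args": ["-y", "@modelcontextprotocol/server-postgres"] }
  }
}
```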

Why It Matters

MCP solves a straightforward problem—you no longer need to wait for the agent developer to add tools:

  • Don't wait for updates: Want a search engine integration? Install an MCP Server. No need to wait for the next agent release.
  • Connect internal systems: Your company's internal API will most likely never get official agent support, but you can write (or find) an MCP Server for it.
  • Reuse across agents: An MCP Server can theoretically be used by any agent that supports the protocol — not locked to a specific tool.

Key Takeaways

  • Context flow: MCP tool definitions are injected into every request (static); return values are appended after execution (dynamic). They travel the same context pipeline as built-in tools — the LLM perceives no difference.
  • Risk: MCP's trust problem is sharper than built-in tools. A malicious MCP Server could return false data to pollute your context, or log your sensitive requests. Installing an MCP Server is like installing a browser extension — is the source trustworthy? Are the permissions reasonable?
  • Auditability: Every interaction between the agent and MCP Server should be logged — what was requested, what was returned, how long it took. When something goes wrong, this is your investigation trail.
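What such a log entry might look like, as one illustrative shape — there is no standard schema for MCP audit logs, and every field name and value here is made up:

```json
{
  "timestamp": "2025-01-15T10:32:07Z",
  "server": "firecrawl",
  "tool": "scrape_url",
  "arguments": { "url": "https://example.com/docs/api" },
  "duration_ms": 840,
  "result_bytes": 2143,
  "status": "ok"
}
```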

Next chapter: Slash Commands — how to package common operations into one-click shortcuts.