The Ultimate Guide to MCP

2/22/2025

I haven't written any AI-related blog posts for almost a year. Partly I've been busy with side projects, and partly, although AI models evolve rapidly, AI application development hasn't brought many new ideas: it's still the same three things from my 2023 blog posts, namely Prompt, RAG, and Agent.

But since Anthropic launched MCP (Model Context Protocol) in late November 2024, AI application development has entered a new era.

However, resources explaining MCP and teaching MCP development are still limited, so I decided to organize my experience and thoughts into this article, hoping it helps.

Why MCP is a Breakthrough

In the past year, AI models have developed very rapidly, from GPT-4 to Claude 3.5 Sonnet to DeepSeek R1: reasoning has improved significantly, and hallucination has decreased noticeably.

Many new AI applications have appeared as well, but most of them are standalone services that aren't integrated with the tools and systems we already use. In other words, integrating AI models with existing systems has progressed slowly.

For example, we still can't use a single AI application to search the web, send emails, and publish blog posts all at once. None of these functions is hard to implement individually, but integrating them all into one system is very difficult.

If that still feels abstract, consider everyday development. Imagine an IDE whose built-in AI could complete the following tasks:

  • Ask the AI to query existing data in local databases to assist development
  • Ask the AI to search GitHub Issues to determine whether a problem is a known bug
  • Have the AI send PR feedback to a colleague's messaging tool (such as Slack) for code review
  • Have the AI query, or even modify, AWS or Azure configurations for deployment

All of the above is becoming reality through MCP. You can follow Cursor MCP and Windsurf MCP for more information, or try Cursor MCP with the browsertools plugin to experience Cursor automatically fetching Chrome DevTools console logs.

Why has AI integration with existing services been so slow? There are many reasons. On one hand, enterprise data is sensitive, and most companies have long decision-making processes. On the other hand, on the technical side, we lacked an open, universal protocol standard with broad consensus.

MCP is exactly that: an open, universal protocol standard with broad consensus, launched by Anthropic (the company behind Claude). If you're a developer familiar with AI models, you probably know Anthropic: they released Claude 3.5 Sonnet, which is arguably still the strongest coding model to date (just as I finished writing this, they released Claude 3.7 Sonnet 😅).

It's worth noting that the best opportunity to define such a protocol really belonged to OpenAI. If OpenAI had pushed a protocol standard back when it first released GPT, everyone would probably have accepted it. But OpenAI turned into "CloseAI" and only shipped the closed GPTs. Protocol standards like this, which need leadership and consensus, rarely form spontaneously in the community; they're usually driven by industry giants.

After Anthropic released MCP, it enabled MCP support in the official Claude Desktop app and set up the open source Model Context Protocol organization, with participation from many companies and communities. Here are some examples of MCP servers released by different organizations:

Official MCP Integration Examples:

  • Git - Git reading, operations, searching.
  • GitHub - Repo management, file operations, and GitHub API integration.
  • Google Maps - Google Maps integration for location information.
  • PostgreSQL - Read-only database queries.
  • Slack - Slack message sending and querying.

🎖️ Third-party Platform Official MCP Support Examples

MCP servers built by third-party platforms.

  • Grafana - Search and query data in Grafana.
  • JetBrains - Work with code in JetBrains IDEs.
  • Stripe - Interact with Stripe API.

🌎 Community MCP Servers

Here are some MCP servers developed and maintained by the open source community.

  • AWS - Operate AWS resources with LLM.
  • Atlassian - Interact with Confluence and Jira, including searching/querying Confluence spaces/pages, accessing Jira issues and projects.
  • Google Calendar - Google Calendar integration, scheduling, finding time, and adding/removing events.
  • Kubernetes - Connect to Kubernetes clusters and manage pods, deployments, and services.
  • X (Twitter) - Interact with Twitter API. Post tweets and search tweets through queries.
  • YouTube - YouTube API integration, video management, shorts creation, etc.

Why MCP?

You might ask: when OpenAI released function calling for GPT in 2023, couldn't it already achieve similar things? And didn't the AI Agents introduced in my previous blog posts already integrate different services? So why do we need MCP?

What are the differences between function calling, AI Agent, and MCP?

Function Calling

  • Function Calling is the mechanism by which an AI model automatically decides, based on context, to invoke a predefined function.
  • Function Calling acts as a bridge between AI models and external systems, but each model vendor implements it differently, with different code integration methods, so implementations aren't portable across AI platforms.

To use Function Calling, we provide the LLM with a set of functions in code, together with clear descriptions of each function's purpose, inputs, and outputs, so the LLM can reason over structured data and decide which function to execute.
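As a minimal sketch of what this looks like in practice, here is the general shape using the OpenAI Node SDK; the send_email function and its parameters are hypothetical, and the model only returns which function to call, leaving execution to our code:

import OpenAI from "openai";

const openai = new OpenAI();

// Hypothetical tool: a name, a description, and a JSON Schema for the inputs,
// so the model can emit structured arguments.
const tools = [
  {
    type: "function" as const,
    function: {
      name: "send_email",
      description: "Send an email to a recipient",
      parameters: {
        type: "object",
        properties: {
          to: { type: "string", description: "Recipient email address" },
          subject: { type: "string" },
          body: { type: "string" },
        },
        required: ["to", "subject", "body"],
      },
    },
  },
];

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Email alice@example.com a short hello" }],
  tools,
});

// The model doesn't execute anything; it tells us which function to call and
// with what arguments. Running the function is our responsibility.
const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
  const args = JSON.parse(toolCall.function.arguments);
  // await sendEmail(args.to, args.subject, args.body); // our own implementation
}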

The downside is that Function Calling doesn't handle multi-turn conversations or complex requirements well. It suits tasks with clear boundaries and clear descriptions; once you need to support many tasks, the Function Calling code becomes hard to maintain.

Model Context Protocol (MCP)

  • MCP is a standard protocol, like a USB-C port for electronic devices (one connector for both charging and data transfer), enabling AI models to interact seamlessly with different APIs and data sources.
  • MCP aims to replace fragmented, one-off Agent integration code, making AI systems more reliable and effective. With a universal standard, service providers can expose AI-ready capabilities for their services, developers can build powerful AI applications faster without reinventing the wheel, and open source projects can grow into a strong AI Agent ecosystem.
  • MCP can maintain context across different applications and services, enhancing overall autonomous task execution.

You can think of MCP as handling different tasks in layers, where each layer exposes specific capabilities, descriptions, and constraints. The MCP Client decides, based on the task, which capabilities to invoke, and the inputs and outputs of each layer combine into an Agent that can handle complex, multi-step conversations with unified context.
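At the protocol level, Client and Server communicate via JSON-RPC 2.0. A simplified sketch of the two core requests (the tool name and arguments below are illustrative, and real messages carry a few more fields):

// The Client asks the Server what it can do...
const listToolsRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
};

// ...and later asks it to execute one of the advertised tools.
const callToolRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "search_issues",
    arguments: { q: "is:issue is:open crash" },
  },
};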

AI Agent

  • An AI Agent is an intelligent system that runs autonomously to achieve specific goals. Traditional AI chat only offers suggestions and leaves execution to the human; an AI Agent can analyze the situation, make decisions, and take action on its own.
  • An AI Agent can consume the capability descriptions provided through MCP to understand more context and execute tasks automatically across platforms and services.

Thoughts

Why was MCP so widely accepted once Anthropic launched it? Over the past year, I personally worked on several small AI projects, and integrating AI models with existing or third-party systems really was troublesome.

There are frameworks on the market that support Agent development, such as LangChain Tools, LlamaIndex, and the Vercel AI SDK, but each has problems.

LangChain and LlamaIndex are both open source, but their development feels chaotic. Their abstractions are pitched too high: the promise of "a few lines of code for an AI feature" works well at the demo stage, but in real development, once the business logic gets complex, the poor code design makes for a very bad programming experience. These projects also seem too eager to commercialize, at the expense of building out the overall ecosystem.

Then there's the Vercel AI SDK. Its code abstractions are better in my opinion, but it only really shines at frontend UI integration and wrapping a subset of AI functionality. Its biggest problem is that it's bound too tightly to Next.js, with weak support for other frameworks and languages.

So Anthropic's push for MCP came at the right time. Claude 3.5 Sonnet enjoys high standing among developers, and MCP is an open standard, so many companies and communities are willing to participate in the hope that Anthropic will maintain a healthy open ecosystem.

The benefits of MCP for the community ecosystem come down to two points:

  • An open standard for service providers, who can expose their APIs and selected capabilities through MCP.
  • No reinvented wheels: developers can use existing open source MCP servers to enhance their Agents.

How MCP Works

Let's look at how MCP works, starting with the official MCP architecture diagram.

MCP Architecture Diagram

It's divided into five parts:

  • MCP Hosts: LLM applications that initiate connections, such as Cursor, Claude Desktop, and Cline.
  • MCP Clients: live inside the Host application and maintain a 1:1 connection with each Server.
  • MCP Servers: provide context, tools, and prompts to Clients through the standardized protocol.
  • Local Data Sources: local files, databases, and APIs.
  • Remote Services: external files, databases, and APIs.

The core of the MCP protocol is the Server. Hosts and Clients will be familiar to anyone who knows basic computer networking, but what exactly is a Server here?

Looking at how Cursor's AI features evolved, we can see the whole AI automation workflow progressing from Chat to Composer to a complete AI Agent.

AI Chat only provides suggestions. Turning the AI's output into actions and final results is entirely up to the human, through manual copy-pasting and edits.

AI Composer can modify code automatically, but it still needs human participation and confirmation, and it can't do anything beyond code modification.

An AI Agent is a fully automated program that can automatically read Figma designs, generate code, read logs, debug, and push code to GitHub.

The MCP Server exists to enable AI Agent automation. It's a middleware layer that tells the AI Agent which services, APIs, and data sources currently exist. The AI Agent decides, based on the information the Server provides, whether to call a given service, and then executes the function through Function Calling.
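As a sketch of that middleware role, here is roughly what the Host side looks like with the TypeScript MCP SDK's client API (the search query below is illustrative):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the MCP Server as a child process and talk to it over stdio.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-github"],
});

const client = new Client({ name: "example-host", version: "1.0.0" });
await client.connect(transport);

// 1. Discover the capabilities the Server advertises...
const { tools } = await client.listTools();

// 2. ...hand those descriptions to the LLM, and when the model picks a tool,
// execute it through the Client.
const result = await client.callTool({
  name: "search_issues",
  arguments: { q: "repo:modelcontextprotocol/servers is:issue crash" },
});

In Hosts like Cursor or Claude Desktop this wiring is done for you; you only declare the Server's launch command in a config file.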

How MCP Server Works

Let's look at a simple example. Suppose we want an AI Agent to automatically search a GitHub repository, then search its Issues to determine whether a problem is a known bug, and finally decide whether it needs to file a new Issue.

To do that, we need to build a GitHub MCP Server that provides three capabilities: finding repositories, searching Issues, and creating Issues.

Let's look at the code directly:

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
// Local helper modules (module paths assumed) that wrap the GitHub REST API.
import * as repository from "./operations/repository.js";
import * as issues from "./operations/issues.js";
import * as search from "./operations/search.js";
import { VERSION } from "./common/version.js";

const server = new Server(
  {
    name: "github-mcp-server",
    version: VERSION,
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: "search_repositories",
        description: "Search for GitHub repositories",
        inputSchema: zodToJsonSchema(repository.SearchRepositoriesSchema),
      },
      {
        name: "create_issue",
        description: "Create a new issue in a GitHub repository",
        inputSchema: zodToJsonSchema(issues.CreateIssueSchema),
      },
      {
        name: "search_issues",
        description: "Search for issues and pull requests across GitHub repositories",
        inputSchema: zodToJsonSchema(search.SearchIssuesSchema),
      }
    ],
  };
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  try {
    if (!request.params.arguments) {
      throw new Error("Arguments are required");
    }

    switch (request.params.name) {
      case "search_repositories": {
        const args = repository.SearchRepositoriesSchema.parse(request.params.arguments);
        const results = await repository.searchRepositories(
          args.query,
          args.page,
          args.perPage
        );
        return {
          content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
        };
      }

      case "create_issue": {
        const args = issues.CreateIssueSchema.parse(request.params.arguments);
        const { owner, repo, ...options } = args;
        const issue = await issues.createIssue(owner, repo, options);
        return {
          content: [{ type: "text", text: JSON.stringify(issue, null, 2) }],
        };
      }

      case "search_issues": {
        const args = search.SearchIssuesSchema.parse(request.params.arguments);
        const results = await search.searchIssues(args);
        return {
          content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
        };
      }

      default:
        throw new Error(`Unknown tool: ${request.params.name}`);
    }
  } catch (error) {
    // Surface invalid tool arguments clearly; rethrow everything else
    // instead of silently swallowing the error.
    if (error instanceof z.ZodError) {
      throw new Error(`Invalid input: ${JSON.stringify(error.errors)}`);
    }
    throw error;
  }
});

async function runServer() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("GitHub MCP Server running on stdio");
}

runServer().catch((error) => {
  console.error("Fatal error in main():", error);
  process.exit(1);
});

In the code above, server.setRequestHandler with ListToolsRequestSchema tells the Client which capabilities this Server provides: the description field explains what each capability does, and inputSchema describes the parameters it needs. The CallToolRequestSchema handler then executes the matching capability when the Client invokes it.

Let's look at the specific implementation code:

export const SearchOptions = z.object({
  q: z.string(),
  order: z.enum(["asc", "desc"]).optional(),
  page: z.number().min(1).optional(),
  per_page: z.number().min(1).max(100).optional(),
});

export const SearchIssuesOptions = SearchOptions.extend({
  sort: z.enum([
    "comments",
    // ...additional sort values elided in the original excerpt
  ]).optional(),
});

// SearchUsersSchema, githubRequest, and buildUrl are defined elsewhere in this server.
export async function searchUsers(params: z.infer<typeof SearchUsersSchema>) {
  return githubRequest(buildUrl("https://api.github.com/search/users", params));
}

export const SearchRepositoriesSchema = z.object({
  query: z.string().describe("Search query (see GitHub search syntax)"),
  page: z.number().optional().describe("Page number for pagination (default: 1)"),
  perPage: z.number().optional().describe("Number of results per page (default: 30, max: 100)"),
});

export async function searchRepositories(
  query: string,
  page: number = 1,
  perPage: number = 30
) {
  const url = new URL("https://api.github.com/search/repositories");
  url.searchParams.append("q", query);
  url.searchParams.append("page", page.toString());
  url.searchParams.append("per_page", perPage.toString());

  const response = await githubRequest(url.toString());
  return GitHubSearchResponseSchema.parse(response);
}

You can see that in the end we interact with GitHub through the https://api.github.com REST API: the githubRequest helper calls GitHub's API and returns the results.

Before GitHub's official API ever gets called, MCP's main job is to describe, for the LLM, what capabilities the Server provides, what parameters each one needs (and what those parameters mean), and what the results look like.
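Concretely, after zodToJsonSchema converts SearchRepositoriesSchema, the Client receives a tool description roughly like the following for the LLM to read (a simplified sketch; the real output carries extra JSON Schema metadata):

const searchRepositoriesTool = {
  name: "search_repositories",
  description: "Search for GitHub repositories",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search query (see GitHub search syntax)" },
      page: { type: "number", description: "Page number for pagination (default: 1)" },
      perPage: { type: "number", description: "Number of results per page (default: 30, max: 100)" },
    },
    required: ["query"],
  },
};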

So an MCP Server is nothing novel or profound; it's simply a protocol everyone has agreed on.

Now suppose we want a more powerful AI Agent, one that automatically searches related GitHub repositories based on local error logs, then searches their Issues, and finally sends the results to Slack.

We would likely need three different MCP Servers: a Local Log Server for querying local logs, a GitHub Server for searching Issues, and a Slack Server for sending messages.

After the user enters the command "I need to query local error logs and send related Issues to Slack", the AI Agent decides on its own which MCP Servers to call and in what order, and uses each Server's results to decide whether to call the next one, until the whole task is complete.
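A rough sketch of how a Host might wire those three Servers together (the local log server and the name-prefixing scheme are hypothetical; the GitHub and Slack packages come from the official servers repo):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function connectServer(command: string, args: string[]) {
  const client = new Client({ name: "agent-host", version: "1.0.0" });
  await client.connect(new StdioClientTransport({ command, args }));
  return client;
}

const servers = {
  logs: await connectServer("node", ["./local-log-server.js"]), // hypothetical
  github: await connectServer("npx", ["-y", "@modelcontextprotocol/server-github"]),
  slack: await connectServer("npx", ["-y", "@modelcontextprotocol/server-slack"]),
};

// Merge every Server's tools into one list, prefixing names so the Agent
// knows which Client should execute a given call.
const allTools: any[] = [];
for (const [label, client] of Object.entries(servers)) {
  const { tools } = await client.listTools();
  allTools.push(...tools.map((t) => ({ ...t, name: `${label}_${t.name}` })));
}

// The Agent loop: give allTools to the LLM, run whichever tool it selects via
// the owning client's callTool, feed the result back, and repeat until done.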

How to Use MCP

If you haven't tried MCP yet, you can experience it in Cursor (the only one I've tried myself), Claude Desktop, or Cline.

Of course, we don't need to develop MCP Servers ourselves. The benefit of MCP being universal and standard is that developers don't have to reinvent the wheel (although reinventing the wheel is fine for learning).

Start with the Servers from the official organization (see the Official MCP Server list in the resources section below).

Community MCP Servers are still quite chaotic at the moment: many lack tutorials and documentation, and many have broken functionality. You can try some of the examples on Cursor Directory. I won't go into specific configuration and practice here; please refer to the official documentation.

MCP Resources

Here are some MCP resources I personally recommend for your reference.

Official MCP Resources

  • Official documentation: https://modelcontextprotocol.io
  • Official open source organization: https://github.com/modelcontextprotocol
  • Official MCP Server list: https://github.com/modelcontextprotocol/servers

Community MCP Server Lists

  • Cursor Directory: https://cursor.directory
  • mcp.so: https://mcp.so

Final Thoughts

This article was written rather hastily, so errors are inevitable; corrections are very welcome.

Finally, this article may be reposted with attribution. It will also be published on X/Twitter, Xiaohongshu, and my WeChat Official Account; feel free to follow me there.
