Why is this MCP thing everywhere?
A realistic look at the three letters that have taken over the internet.
It feels impossible to get through X these days without seeing yet another company announce their MCP server.
But what does MCP actually mean? Why does it need a server? And why are so many AI companies building for it?
This post will follow up on critically acclaimed Technically correspondent Jason Ganz’s earlier note on MCP.
The TL;DR
The Model Context Protocol helps models like Claude talk to and work with external systems in a standardized way.
AI models today struggle mightily with accessing external data from the tools you use every day
The Model Context Protocol is a standardized way for AI models to talk to external systems like databases, Google Maps, and more
Via MCP, models can pull data from these systems, or even take action in them (like sending an email)
Providers need to build MCP servers that allow models to integrate with their systems
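Under the hood, MCP messages are based on JSON-RPC 2.0: the model's client sends a request like `tools/call`, and the provider's server runs the tool and replies. Here's a minimal sketch of that exchange, assuming a hypothetical `send_email` tool; the tool name and arguments are illustrative, not from any real MCP server:

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking a server to run one of its tools."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Toy server side: a registry mapping tool names to plain functions.
# A real MCP server would also advertise these via a tools/list method.
TOOLS = {
    "send_email": lambda args: f"Sent '{args['subject']}' to {args['to']}",
}

def handle_request(raw):
    """What a toy server does with an incoming tools/call message."""
    req = json.loads(raw)
    result = TOOLS[req["params"]["name"]](req["params"]["arguments"])
    return {"jsonrpc": "2.0", "id": req["id"], "result": {"content": result}}

request = make_tool_call(1, "send_email", {"to": "a@b.com", "subject": "Hi"})
response = handle_request(json.dumps(request))
print(response["result"]["content"])  # → Sent 'Hi' to a@b.com
```

The point of the standard is that every server speaks this same request/response shape, so a model only has to learn it once.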
MCP is probably the “it” topic in AI right now, so let’s dive in.
AI needs access to your data (but like, in a good way)
Today’s AI models are awesome, yes. But until recently they lacked a pretty major feature: the ability to access specific datasets. Allow me to explain.
ChatGPT and Claude were pre-trained on massive internet-scale datasets, so they know pretty much everything that’s out there on the public web. This is why Claude is great if you want to finish your poem, generate a college essay, or take a best guess at why your heel hurts when you run. These are all things that are out there on the web.
You might think that this means that these models know everything on the internet, but that’s not exactly true. Training on the internet dataset doesn’t mean that models now have an encyclopedic knowledge of every single fact; it’s a hazier, more general kind of intelligence. Kind of like when you read a book. You might not remember every single detail, but you will definitely remember the main themes and ideas (plus a few details, too).
If you don’t believe me, give a model this little test. Here’s a fairly straightforward question to answer:
The data to answer this question is not hard to find on the web. Here’s the roster of the Stanley Cup-winning Philadelphia Flyers all the way back in 1975, complete with heights for each player. Conceivably, since Claude has been trained on most of the internet, it should be able to just pull this data out of memory and analyze it. But it can’t. Instead, it starts searching the web:
And it doesn’t even search the web for the raw data: it searches the web for my exact question. It then gets stuck in a bit of a loop, ends up writing some analysis code, and outputs a completely incorrect and useless answer. AI models are trained on the web, but don’t know everything about the web.
You could, of course, go yourself and manually gather all of the data, then paste it into your prompt. But that’s a ton of work…and is the context window of the model even large enough to hold all of that data?
It’s also hard for professional developers and businesses to do this stuff. Connecting AI models to external data sources is difficult and time-consuming, and requires a bunch of specialized expertise. You’ll need one connector to get your Snowflake data into Claude. Then a separate one (with totally different code) for your marketing data in Hubspot. If you want to hook up a model to a server and run code, that’s a third thing…you get the gist.
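The pain here is combinatorial: if every model needs its own bespoke connector to every tool, the number of integrations grows as models × tools. With one shared protocol, each model ships one client and each tool ships one server, so it grows as models + tools instead. A quick back-of-the-envelope sketch (the product names are just examples):

```python
# Integration math: bespoke connectors vs. one shared protocol.
models = ["Claude", "ChatGPT", "Gemini"]           # M = 3
tools = ["Snowflake", "Hubspot", "Code runner"]    # N = 3

bespoke = len(models) * len(tools)     # one connector per (model, tool) pair
shared = len(models) + len(tools)      # one MCP client per model + one server per tool

print(bespoke, shared)  # → 9 6
```

The gap widens fast: at 10 models and 100 tools, that’s 1,000 bespoke connectors versus 110 MCP implementations.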
So what’s a person who wants to connect an AI model to another system supposed to do?