Generative AI, RAG and MCP: when the model needs to know about your company

14/11/2025

A language model can draft contracts, summarize reports, and answer technical questions. But it doesn't know how much you billed last month or what your return policy says. For generative AI to be useful in a business context, it needs access to real-world data. RAG and MCP are two distinct approaches to solving that problem.

Imagine asking a generative AI assistant to prepare a summary of the status of your active projects. The model gives you a structured response, well-written and impeccably formatted. But the data is made up. The names don't exist. The figures don't add up.

It's not a flaw in the model. It's a design limitation. Language models generate text based on patterns learned during training, not on up-to-date information from your company. They don't have access to your CRM, your database, or your internal documents. That's where RAG and MCP come in.

The challenge isn't whether generative AI works. It's how to make it work with your data.

What does it mean to connect a model with real data?

When we talk about connecting a generative AI model to business data, we're not talking about retraining it. Training a large model is expensive, time-consuming, and, in most cases, unnecessary.

The goal is to provide context, much as technical debt reduction is approached: not starting from scratch, but working with what's already there. The aim is for the model to have access to the relevant information at the moment it answers a question about your company. It shouldn't "know" the answer by heart; it should be able to look it up.

There are two main approaches to doing this, and they are complementary: RAG (Retrieval-Augmented Generation) and MCP (Model Context Protocol). Each solves a different part of the problem.

RAG: the model searches before responding

RAG is an architectural pattern that has been in use since 2020, when Meta formalized it in a research paper. The idea is simple: before the model generates a response, a search system retrieves the most relevant pieces of information from your database and passes them to it as context.

The workflow works like this: The user asks a question. The system converts that question into a numerical vector (an embedding). It searches a vector database for the most similar documents or fragments, retrieves them, and sends them to the model along with the original question. The model generates the answer using that information as a reference.
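The flow above can be sketched in a few lines of plain Python. The bag-of-words "embedding" here is a toy stand-in for a real embedding model, and the indexed chunks are invented for illustration; in production the index lives in a vector database:

```python
import re
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Pre-indexed chunks (in a real system, stored in a vector database).
chunks = [
    "Returns are accepted within 30 days of purchase.",
    "Support tickets are answered within one business day.",
    "Invoices are issued on the first day of each month.",
]
index = [(c, embed(c)) for c in chunks]

def retrieve(question: str, k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

question = "How many days do customers have for returns?"
context = retrieve(question)
prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
```

The prompt built at the end is what actually gets sent to the model: the retrieved fragments plus the original question.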

It's like giving someone an exam with their notes in front of them. The student doesn't need to remember anything; they just need to read carefully what's in front of them.

Where RAG works well

Internal knowledge bases. Product manuals, technical documentation, company policies, FAQs. RAG is especially effective when the information is in relatively static text documents.

Customer service. A chatbot with RAG can answer specific questions about your products by searching your actual documentation, instead of making up generic answers.

Semantic search. Instead of searching for exact keywords, RAG allows users to ask questions in natural language and get answers based on the actual content of your documents.

The limitations of RAG

RAG works well with static or slowly changing information. But it has clear limitations.

It only reads, it doesn't act. RAG retrieves information and passes it to the model, but it can't perform actions: it can't create a ticket in Jira, update a record in your CRM, or send an email.

Quality depends on indexing. If your documents are poorly structured, if the chunking (the way documents are divided) is inadequate, or if the database isn't updated frequently, the results will be poor. Garbage in, garbage out.
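As a rough illustration of what chunking involves, here is a minimal fixed-size splitter with overlap; the sizes are arbitrary, and real pipelines usually split on semantic boundaries (paragraphs, headings) rather than raw character counts:

```python
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows with overlap,
    so a sentence cut at one boundary still appears whole in a neighbor."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

A chunk size that is too small loses context; one that is too large dilutes the similarity search. Tuning these values is part of the iteration work mentioned later.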

It doesn't connect to real-time data. RAG works with pre-indexed information. If you need the model to check the current status of an order or the balance of an account, RAG isn't sufficient.

MCP: a standard for the model to use tools

MCP (Model Context Protocol) is an open protocol released by Anthropic in late 2024. If RAG gives the model information to read, MCP gives it tools to use.

The most direct analogy: RAG is like giving someone a library. MCP is like giving them a phone, access to a computer, and permission to manage things.

MCP defines a communication standard between the model and external systems. Instead of each integration being built ad hoc, MCP establishes a common protocol that any tool can implement.

An MCP server can expose data (query a database, read a file), but also actions: create a record, send a notification, update a field.
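The shape of what an MCP server exposes can be illustrated without the actual SDK: each tool is a name, a schema describing its inputs that the model can read, and a handler that does the work. Everything below (the registry, the tool name, the fake data) is invented for illustration, not the real protocol implementation:

```python
import json

# Illustrative registry mimicking what an MCP server exposes:
# a named tool, an input schema, and a handler behind it.
TOOLS = {}

def tool(name: str, schema: dict):
    def register(fn):
        TOOLS[name] = {"schema": schema, "handler": fn}
        return fn
    return register

@tool("open_tickets", schema={"type": "object",
                              "properties": {"team": {"type": "string"}}})
def open_tickets(team: str) -> dict:
    # In a real server this would query the live ticketing system.
    fake_db = {"support": 12, "billing": 3}
    return {"team": team, "open": fake_db.get(team, 0)}

def call(name: str, arguments: dict) -> str:
    # The model picks a tool by name and sends structured arguments;
    # the server returns a structured result it can reason over.
    result = TOOLS[name]["handler"](**arguments)
    return json.dumps(result)
```

The key point is the contract: the model never touches the ticketing system directly. It sees tool names and schemas, and the server mediates every call.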

What does MCP make possible?

Access to real-time data. The model can query the current state of a system, not an indexed copy from three hours ago. How many open tickets are there right now? MCP can answer that.

Action execution. The model doesn't just read: it can do things. Create a task in your project manager, update a field in your CRM, generate a report in your reporting system. All through a standardized protocol.

Tool composition. A model with access to multiple MCP servers can combine information from different sources into a single response. Query the CRM, cross-reference with billing data, and generate an executive summary—all in one conversation.

Interoperability. As an open protocol, any provider can create an MCP server for their service. There are already connectors for Slack, Google Drive, GitHub, SQL databases, REST APIs, and dozens more tools. The ecosystem grows every week.

The limitations of MCP

It's a young protocol. MCP is less than two years old. The standard is evolving, implementations vary in maturity, and best practices are still being defined. It works, but don't expect the stability of a technology with a ten-year history.

It requires infrastructure. Each data source needs its own MCP server. If you want to connect the model to your ERP, someone has to build (or configure) that connector. In projects using Python and Django as the backend, integration is usually more straightforward. It's not plug and play in all cases, though.

Security is critical. Giving a model the ability to perform actions on your systems is powerful, but also risky. Permissions, authentication, and the limits of what the model can do need to be carefully designed.

RAG and MCP do not compete: they complement each other

A common mistake is to present RAG and MCP as alternatives. In practice, most serious enterprise generative AI implementations use both.

RAG is the way to go when you need the model to work with large volumes of documentation: manuals, contracts, historical data, regulations. Information that already exists, changes little, and that the model needs to consult to provide accurate answers.

MCP is the way to go when you need the model to interact with live systems: querying real-time data, executing actions, or combining information from multiple operational sources.

A concrete example: Imagine an internal assistant for the sales team. With RAG, it can answer questions about standard contract terms, discount policies, and product documentation.

With MCP, you can check the CRM to see the status of an opportunity, review the communication history, and create a follow-up task in the project manager.

Neither of them, on its own, covers the whole picture. Together, they do.
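A hypothetical dispatcher for the sales assistant above could route each question to documentation retrieval (RAG) or to a live-system tool (MCP). The keyword heuristic and the stubbed backends here are invented purely to show the split; a real assistant lets the model itself decide when to call a tool:

```python
def search_docs(question: str) -> str:
    # RAG path: answer from pre-indexed documentation (stubbed).
    return "Standard contracts allow a maximum discount of 15%."

def query_crm(question: str) -> str:
    # MCP path: call a live CRM tool (stubbed).
    return "Opportunity ACME-42 is in stage 'Negotiation'."

LIVE_KEYWORDS = ("status", "current", "right now")

def answer(question: str) -> str:
    # Crude heuristic: live-sounding questions go to the tool,
    # everything else goes to document retrieval.
    q = question.lower()
    route = query_crm if any(k in q for k in LIVE_KEYWORDS) else search_docs
    return route(question)
```

The same conversation can hit both paths: a question about discount policy reads the documentation, and the follow-up about a specific deal queries the CRM.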

What you need before you start

The technology is available. The APIs exist, the frameworks are documented, and vendors offer ready-to-use tools. But the technology is the easy part. What determines whether an enterprise generative AI project succeeds or fails lies before the code.

Organized data. If your internal documentation is a chaotic mess of duplicate PDFs, uncontrolled versions, and folders that no one has organized since 2019, RAG will index that chaos. The model will provide answers based on outdated or contradictory information. Before connecting AI, organize what you're going to feed it.

Defined use cases. "I want to implement AI in the company" is not a use case. "I want the support team to be able to resolve 40% of technical queries without escalating to engineering" is. Without a concrete use case, the project becomes an endless demo that never reaches production.

Data governance. Who can see what? Does the model have access to financial data? Customer information? HR data? The access policies you apply to your employees must also apply to the model. This isn't a technical detail: it's a legal requirement in many cases, especially under regulations like the GDPR.
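One way to make "the model inherits the user's permissions" concrete: tag every indexed chunk with an access label and filter by the requesting user's role before anything reaches the prompt. The labels, roles, and data below are invented for illustration:

```python
# Roles cleared to see each access label; anything not listed is denied.
ACCESS = {
    "public": {"employee", "manager", "finance"},
    "financial": {"finance", "manager"},
    "hr": {"manager"},
}

def allowed(chunks: list[tuple[str, str]], role: str) -> list[str]:
    # Drop any chunk whose label the role is not cleared for,
    # before the retrieval results are assembled into a prompt.
    return [text for text, label in chunks if role in ACCESS.get(label, set())]

chunks = [
    ("Return policy: 30 days.", "public"),
    ("Q3 revenue: 1.2M EUR.", "financial"),
    ("Salary bands for 2025.", "hr"),
]
```

Filtering at retrieval time, rather than trusting the model to withhold information, is what makes the policy enforceable.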

Iteration capacity. The first deployment won't be perfect. Embeddings will need tweaking, MCP servers will require additional permissions, and users will ask questions no one anticipated. You need a team (internal or external) with the ability to iterate quickly and adjust the implementation week by week.

The time is now, but rushing doesn't help

Generative AI connected to business data is no longer an experiment. Companies of all sizes are using it in production for customer support, internal operations, document analysis, and process automation.

But the difference between an implementation that generates value and one that generates frustration lies in the foundation. A company that organizes its data, defines its use cases, and chooses the appropriate architecture (RAG, MCP, or both) before writing a single line of code goes further than one that starts with the technology.

If your team already works with language models and needs to take the step of connecting them with real data, the first step isn't technical. It's deciding what specific problem you want to solve. The second is ensuring your data is ready to provide good answers. From there, the architecture practically defines itself.