
How to Make AI Actually Know Your Business

Unsorted - 24th May 2026
By Filip Pfleger

// KEY FINDINGS
  • 40% of UK SMEs asking “we need AI to know our business” actually need proper RAG. The remaining 60% need something simpler.
  • Custom practical RAG for an SME typically costs £8,000 to £20,000 to build, plus £100 to £500 per month to run.
  • The 5 Naive RAG Failure Modes below explain why ChatGPT Projects, Custom GPTs, and Claude Projects collapse at scale.
  • Reranking is the single highest-leverage upgrade most teams skip.
  • The Pixelfield Knowledge Decision Tree is a 3-question test that tells you whether your business needs RAG, prompt engineering, or fine-tuning.

The Naive RAG Trap

Picture the scenario. A UK founder uploads 50 documents into ChatGPT Projects, configures a Custom GPT with the full product catalogue, or subscribes to Claude Pro and creates a Project containing every policy document the company has ever written. Then she asks a specific question about her own business.

The AI responds with confidence. The answer is factually wrong. Or generic. Or based on a policy rewritten last year that never actually made it into the upload.

This is not a failure of artificial intelligence. It is a retrieval failure. The AI never accessed the correct document. It looked at a fragment, missed the point, and defaulted to its general training data instead. The technical name for this is naive RAG failure, and most off-the-shelf “knowledge base” products available in 2026 are little more than naive RAG dressed up with a polished interface.

This guide explains what is actually happening when AI does not know your business, when off-the-shelf tools are enough, when they are not, what a proper RAG system costs in pounds, and how to tell which side of the line your business sits on. Fifteen minutes, plain English, written for a founder rather than an engineer.

What “AI knowing your business” actually means

Making AI “know your business” means giving it access to your specific information (products, customers, policies, processes) at the moment it answers a question, instead of relying only on what it learned during initial training.

The most common implementation is retrieval-augmented generation (RAG): a separate system retrieves the relevant passages from your documents, then prompts the AI to answer using those passages. Done correctly, this turns a generic model into one that responds with your actual data. Done badly, it hallucinates with confidence.

The 3 Layers of “AI That Knows Your Business”

Every system that makes AI knowledgeable about a specific business falls into one of three layers. Knowing which one you are using, or need, is the first step to fixing it.

Layer 1: Prompt-based knowledge

You paste the information directly into the conversation, or set it up as a system prompt. ChatGPT Custom Instructions, Claude Projects, and the instructions field of a Custom GPT all live here.

  • Cheap, fast, limited by context window
  • Works for small, stable knowledge bases (broadly under 50 documents that rarely change)
  • Typical cost: £20 to £60 per user per month

Layer 2: Retrieval (RAG)

You store your information separately, and a retrieval system fetches the relevant pieces when a question arrives. The AI answers only from what was retrieved.

  • This is what most “AI knowledge bases” actually do under the hood
  • Scales to thousands of documents and updates without retraining
  • Breaks in interesting ways, which is what most of this guide is about
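The retrieve-then-answer loop can be sketched in a few lines. This is a toy illustration rather than a production design: the word-overlap scoring stands in for a real embedding search, the function names and sample documents are made up for the example, and the call to the model itself is left out.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: dict[str, str], top_k: int = 1) -> list[str]:
    """Return the names of the top_k documents sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda name: len(q_words & tokenize(documents[name])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: dict[str, str], top_k: int = 1) -> str:
    """Assemble the prompt that would be sent to the model (model call omitted)."""
    names = retrieve(question, documents, top_k)
    context = "\n\n".join(f"[{name}]\n{documents[name]}" for name in names)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical sample documents, for illustration only.
docs = {
    "returns": "Returns are accepted within 30 days of delivery.",
    "warranty": "The warranty period is 24 months from purchase.",
}
prompt = build_prompt("What is the warranty period?", docs)
```

Only the warranty passage reaches the prompt, so the model never sees the returns policy. Everything that goes wrong with RAG goes wrong in the `retrieve` step.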

Layer 3: Fine-tuned behaviour

You do not add knowledge. You change how the model behaves. Fine-tuning updates the model weights to make it write in your tone, follow your specific format, or classify things the way you do.

  • Expensive and slow
  • Almost never the right answer for an SME trying to make AI “know” facts

The trap most SMEs fall into is hearing “AI agent” or “knowledge base” and jumping straight to Layer 2 without checking whether Layer 1 would have done the job for £20 per user per month. The opposite mistake is also common: brute-forcing Layer 1 by pasting 80 PDFs into a Claude Project and being surprised when context collapses.

If you want to know which layer is right for your business, our RAG development team starts every engagement by answering exactly that question.

Why ChatGPT Projects, Claude Projects, and Custom GPTs hit a wall

The most common workflow we see at Pixelfield: a founder spends a week setting up a Custom GPT with 30 to 50 documents, briefs the team to use it, and within a fortnight the team has quietly stopped.

Reasons vary by person. The underlying problem is always one of three things.

Problem 1: Retrieval gets worse as you add more documents

Uploading more files makes off-the-shelf knowledge bases worse, not better.

“Builders who upload 100 documents into a Custom GPT think they have built a knowledge base. They have built a haystack.”
// PRACTITIONER ON X

Signal-to-noise collapses. Ten well-curated documents beat 100 mediocre ones, and most platforms do not let you tune what gets retrieved or how.

Problem 2: Chunking is invisible and uncontrollable

When you upload a document to ChatGPT or Claude, the platform splits it into chunks for retrieval. You cannot see how. You cannot change the chunk size. You cannot add overlap between chunks.

What this means in practice:

  • Answers split across two chunks get missed entirely
  • Tables get cut in half mid-row
  • Policy 4.2 lives in one chunk and Policy 4.3 in another, with retrieval unable to tell them apart

Problem 3: No reranking, no query rewriting, no evaluation

Production RAG systems use multiple retrieval steps, rerank the results, and rewrite vague queries into more precise ones. Off-the-shelf platforms do one shallow retrieval pass and hand the output to the model.

When the right chunk is at position 6 and the platform only sends positions 1 to 3, you get a confident wrong answer.

Where off-the-shelf tools genuinely belong

None of this means off-the-shelf tools are bad. For the right use case, they are excellent.

ChatGPT Projects and Claude Projects work well when:

  • You have under 50 documents
  • Your knowledge changes infrequently
  • The stakes are relatively low
  • You can live without retrieval control

They fall down when they are used for the wrong job. If you are trying to give your customer service team instant answers from 400 product specifications, three years of support tickets, and a constantly updating returns policy, you have outgrown the off-the-shelf layer. You need actual retrieval architecture, which is what the rest of this guide covers.

Naive RAG vs Practical RAG

Once SMEs move past off-the-shelf tools and commission a custom RAG build, most of what gets built is what practitioners call naive RAG.

It works in demos. It impresses stakeholders. It breaks in production, often quietly, in ways nobody catches for weeks. Understanding the difference between naive and practical RAG is the single most useful thing a non-technical founder can learn before commissioning a build.

The 5 Naive RAG Failure Modes (Pixelfield Pattern Map)

Across the 50+ AI features we have shipped and the practitioner data we have reviewed, naive RAG fails in five repeating ways. If a system you are evaluating shows any one of these, it is not production-ready.

Failure 1: The chunk split

The answer to a customer’s question lives across two chunks, A and B. The retriever returns chunk A. The model answers using only chunk A. The information in chunk B never reaches the answer.

“You blamed the model. You should have blamed the chunking.”
// BUILDER ON X

Failure 2: The lost-in-the-middle problem

Retrieval returns 10 chunks. The right one is at position 6. The model, given a long context, pays most attention to the start and end of what it has been given. Position 6 effectively gets skipped, even though it contains the correct answer.
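One commonly used mitigation is to reorder the retrieved chunks so the strongest ones sit at the edges of the context and the weakest in the middle. The sketch below assumes the input list is already ranked best-first; the function name is illustrative.

```python
def edge_order(ranked_chunks: list[str]) -> list[str]:
    """Place the best-ranked chunks at the start and end, the weakest in the middle.

    Input is assumed to be ordered best-first.
    """
    front, back = [], []
    for position, chunk in enumerate(ranked_chunks):
        # Alternate: 1st, 3rd, 5th... go to the front; 2nd, 4th... go to the back.
        (front if position % 2 == 0 else back).append(chunk)
    return front + back[::-1]


ordered = edge_order(["c1", "c2", "c3", "c4", "c5"])
```

With five chunks ranked `c1` (best) to `c5` (worst), the best chunk opens the context, the second-best closes it, and the weakest ends up in the middle. This does not replace reranking; it only makes better use of whatever order the retriever produced.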

Failure 3: Exact-phrase failure

A customer asks about “Policy 4.2”. Vector search returns Policy 4.3 because the embeddings are semantically similar. Pure vector search is bad at exact phrases, codes, dates, and abbreviations, which is exactly what business knowledge contains.

Failure 4: The stale snapshot

The database updated five minutes ago. The vector index updated last week. The agent answers confidently from last week’s version. Users assume the system is current. Nobody catches it until a customer complains.
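A staleness check is cheap to build. The sketch below assumes you can get a "last changed" timestamp for each source document and a "last indexed" timestamp from the vector index; the function name, the one-hour tolerance, and the sample data are all hypothetical.

```python
from datetime import datetime, timedelta


def stale_documents(
    source_updated: dict[str, datetime],
    index_updated: dict[str, datetime],
    max_lag: timedelta = timedelta(hours=1),
) -> list[str]:
    """Return documents that are missing from the index or indexed too long before their last change."""
    stale = []
    for doc_id, changed_at in source_updated.items():
        indexed_at = index_updated.get(doc_id)
        if indexed_at is None or changed_at - indexed_at > max_lag:
            stale.append(doc_id)
    return sorted(stale)


# Hypothetical timestamps, for illustration only.
source = {
    "returns": datetime(2026, 5, 24, 12, 0),
    "warranty": datetime(2026, 5, 1, 9, 0),
    "pricing": datetime(2026, 5, 20, 8, 0),
}
index = {
    "returns": datetime(2026, 5, 17, 12, 0),   # indexed a week before the last change
    "warranty": datetime(2026, 5, 1, 9, 30),   # indexed after the last change
}
needs_reindex = stale_documents(source, index)
```

Here the returns policy and the never-indexed pricing document are flagged, while the warranty document is current. Running a check like this on a schedule turns a silent failure into a visible one.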

Failure 5: The confident hallucination

Retrieval returns the right chunk. The model ignores it and answers from training data anyway. This is the most embarrassing failure mode, and the one nobody talks about. Even with perfect retrieval, the model can decide it knows better. Mitigations exist, but they require active engineering.
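One possible mitigation, offered here as an assumption rather than the only approach: ask the model to return a verbatim supporting quote alongside its answer, then check that the quote really appears in the retrieved chunks before showing the answer to the user. The function name below is illustrative.

```python
def is_grounded(quote: str, retrieved_chunks: list[str]) -> bool:
    """Return True if the quote appears (ignoring case and spacing) in any retrieved chunk."""

    def normalise(text: str) -> str:
        return " ".join(text.lower().split())

    needle = normalise(quote)
    return bool(needle) and any(needle in normalise(chunk) for chunk in retrieved_chunks)


chunks = ["Refunds are available within 30 days of delivery."]
supported = is_grounded("within 30 days", chunks)
unsupported = is_grounded("within 90 days", chunks)
```

An answer whose quote fails the check can be blocked or sent for human review. The check is crude (it catches invented quotes, not subtle misreadings), which is why it complements evaluation rather than replacing it.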

Practical RAG explained

Practical RAG is not different technology. It is the same RAG with five engineering layers added to address each of the five failure modes above.

When a vendor says “we use RAG” without specifying which of these layers they have built, you are almost certainly looking at naive RAG with marketing on top. Let us walk through the layers that make the difference.

What Practical RAG actually requires

The honest version of “how to build RAG” is that 80% of the work is data engineering and only 20% is machine learning. The five layers below are what separate a demo from a production system, and they map directly onto the five failure modes above.

The 5 Layers of Practical RAG (Pixelfield Framework)

Every production-ready RAG system we build, and every one we have audited that actually works, has all five of these layers. Missing any one is a leading indicator of failure within six months.

Layer 1: Chunking strategy

Documents are split into pieces that preserve meaning, not arbitrary length.

  • Tables stay intact
  • Sections are not cut mid-thought
  • Overlap between chunks ensures boundary content is not lost
  • Different document types (PDFs, code, transcripts, tables) get different chunking strategies

This is the layer most off-the-shelf tools skip entirely, and the reason their results plateau quickly.
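A minimal sketch of meaning-preserving chunking, assuming paragraphs are separated by blank lines: whole paragraphs are packed into chunks up to a word budget, and the last paragraph of each chunk is repeated at the start of the next as overlap. The function name and the word budget are illustrative; real systems add special handling for tables and other document types.

```python
def chunk_paragraphs(text: str, max_words: int = 200) -> list[str]:
    """Pack whole paragraphs into chunks, repeating one paragraph as overlap between chunks."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for paragraph in paragraphs:
        candidate = current + [paragraph]
        if current and sum(len(p.split()) for p in candidate) > max_words:
            chunks.append("\n\n".join(current))
            # Overlap: carry the previous paragraph into the next chunk.
            current = [current[-1], paragraph]
        else:
            current = candidate
    if current:
        chunks.append("\n\n".join(current))
    return chunks


sample = "A a a\n\nB b b\n\nC c c"
pieces = chunk_paragraphs(sample, max_words=6)
```

With a six-word budget, the sample splits into two chunks that both contain paragraph B, so an answer that straddles the A/B or B/C boundary still lands in a single chunk. No paragraph is ever cut in half.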

Layer 2: Hybrid retrieval

Pure vector search is good at meaning but bad at exact matching. Pure keyword search is the opposite. Practical RAG combines both: vector search plus BM25 (the keyword-ranking algorithm behind search engines such as Elasticsearch) plus metadata filtering, with the results fused together.

This single change typically moves retrieval accuracy from around 70% to around 90% on real business data.
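One widely used way to fuse the two result lists is reciprocal rank fusion (RRF), which needs only the rank positions, not comparable scores. The sketch below assumes the vector and keyword searches have already run and each returned a best-first list of document IDs; the IDs are made up, and `k=60` is the constant conventionally used with RRF.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several best-first rankings; each appearance adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for the query "Policy 4.2".
vector_ranking = ["policy-4.3", "policy-4.2", "faq"]   # semantically similar, wrong one first
keyword_ranking = ["policy-4.2", "returns"]            # exact phrase match first
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
```

Because `policy-4.2` appears in both lists, it overtakes `policy-4.3`, which only the vector search liked. That is the exact-phrase failure from earlier being corrected by the keyword side.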

Layer 3: Reranking

Hybrid retrieval might return 30 candidate chunks. A reranker (usually a cross-encoder model) reorders them based on actual relevance to the query before the top 3 to 5 are sent to the model.

Reranking is the single highest-leverage upgrade for most naive RAG systems. It is also the layer skipped most often, because it adds latency and infrastructure complexity that no demo ever shows.
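The shape of a reranking step can be sketched as follows. The `overlap_score` function is a deliberately crude stand-in for a cross-encoder model: in a real system you would swap in a model that scores each query and chunk pair. All names and sample texts are illustrative.

```python
import re
from typing import Callable


def overlap_score(query: str, chunk: str) -> float:
    """Toy relevance score: the fraction of query words that appear in the chunk."""
    query_words = set(re.findall(r"\w+", query.lower()))
    chunk_words = set(re.findall(r"\w+", chunk.lower()))
    return len(query_words & chunk_words) / len(query_words) if query_words else 0.0


def rerank(
    query: str,
    candidates: list[str],
    score: Callable[[str, str], float] = overlap_score,
    top_n: int = 3,
) -> list[str]:
    """Reorder candidate chunks by relevance to the query and keep the top_n."""
    return sorted(candidates, key=lambda chunk: score(query, chunk), reverse=True)[:top_n]


candidates = [
    "Our office opening hours are 9 to 5.",
    "Shipping takes 3 days.",
    "Damaged items qualify for a full refund under our refund policy.",
]
best = rerank("refund policy for damaged items", candidates, top_n=1)
```

The relevant chunk was last in the candidate list and ends up first after reranking. Passing the scorer in as a parameter keeps the pipeline unchanged when the toy scorer is replaced with a real model.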

Layer 4: Query rewriting

Real users ask vague, ambiguous, and context-dependent questions (“Is this still valid?” where “this” was mentioned three turns ago). Query rewriting takes the original question, expands it with conversation context, and converts it into one or more cleaner search queries before retrieval runs.

This is where most “I asked X but it answered Y” complaints get fixed.
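In practice the rewriting is usually done by a model, so the engineering work is mostly assembling the right prompt. The sketch below builds such a prompt from recent conversation turns; the wording of the instruction, the function name, and the sample conversation are assumptions for illustration, and the model call is left out.

```python
def build_rewrite_prompt(
    history: list[tuple[str, str]], question: str, max_turns: int = 3
) -> str:
    """Build a prompt asking a model to turn a vague follow-up into a standalone search query."""
    recent = history[-max_turns:] if max_turns > 0 else []
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in recent)
    return (
        "Rewrite the final question as a standalone search query. "
        "Replace words like 'this' or 'it' with what they refer to.\n\n"
        f"Conversation:\n{transcript}\n\n"
        f"Final question: {question}\n"
        "Standalone query:"
    )


# Hypothetical conversation, for illustration only.
history = [
    ("user", "Do you sell gift vouchers?"),
    ("assistant", "Yes, in three values."),
    ("user", "How long do they last?"),
    ("assistant", "Vouchers expire after 12 months."),
]
rewrite_prompt = build_rewrite_prompt(history, "Is this still valid?", max_turns=2)
```

The rewritten query, not the original question, is what gets sent to retrieval. Limiting the history to the last few turns keeps the prompt short and avoids dragging in stale context.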

Layer 5: Evaluation and monitoring

Production RAG needs a permanent evaluation set: a list of real questions with known correct answers that the system is tested against continuously. When retrieval accuracy drifts, you find out within hours, not months.

Frameworks like RAGAS measure:

  • Faithfulness: does the answer match the retrieved source?
  • Context precision: was the retrieved content actually relevant?
  • Answer relevance: does the response actually answer the question?

Without this layer, you cannot tell whether your RAG is improving or quietly degrading.
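The simplest useful evaluation is a retrieval hit rate: for each question in the evaluation set, did the chunk known to contain the answer appear in the top results? The sketch below is a much simpler measure than the RAGAS metrics listed above, and the stand-in retriever and chunk IDs are made up for the example.

```python
from typing import Callable


def hit_rate(
    eval_set: list[tuple[str, str]],
    retrieve: Callable[[str], list[str]],
    k: int = 5,
) -> float:
    """Fraction of questions whose expected chunk ID appears in the top k retrieved IDs."""
    if not eval_set:
        return 0.0
    hits = sum(1 for question, expected in eval_set if expected in retrieve(question)[:k])
    return hits / len(eval_set)


# Stand-in retriever with canned results, for illustration only.
canned = {
    "q1": ["a", "x"],
    "q2": ["x", "b"],
    "q3": ["x", "y"],
    "q4": ["d"],
}
eval_set = [("q1", "a"), ("q2", "b"), ("q3", "c"), ("q4", "d")]
score_at_2 = hit_rate(eval_set, lambda question: canned[question], k=2)
score_at_1 = hit_rate(eval_set, lambda question: canned[question], k=1)
```

Three of the four questions are answered from the top two results, but only two from the top result alone. Tracking both numbers over time shows whether a change to chunking or retrieval helped or hurt.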

These layers are not optional. They are what separates a demo that impresses your board from a system that holds up when real customers use it. A vendor or agency that cannot walk you through their plan for all five has not built production RAG. They have built a chatbot with extra steps.

If you want to see all five layers in practice, this is exactly the work behind our RAG development service.

RAG vs Fine-tuning vs Prompt Engineering

The single most common SME mistake we see is reaching for the wrong tool. Fine-tuning a model to know the products. Stuffing the entire company handbook into a prompt. Commissioning a £30k RAG build for 20 documents that change once a quarter.

Here is the actual decision rule.

The Pixelfield Knowledge Decision Tree

Three questions, in order. The answers tell you which approach you need.

Question 1: How much information does the AI need to know, and how often does it change?

  • Under roughly 30,000 tokens (broadly, fewer than 50 pages of structured information) that updates less than once a quarter → Prompt engineering, Layer 1. Put it in a system prompt or Claude Project. Cost: £20 to £60 per user per month.
  • More than that, or updating regularly → RAG, Layer 2. Skip to Question 2.

Question 2: Do you need the AI to know facts, or behave differently?

  • Know facts (products, policies, customer data, prices, history) → RAG. This is the right answer.
  • Behave differently (specific tone, output format, consistent classification at high volume) → Possibly fine-tuning. Skip to Question 3.

Question 3: Is the behaviour change worth £15,000+ and ongoing maintenance?

  • High volume (10,000+ queries per day), narrow task, behaviour that cannot be achieved with prompting → Fine-tuning makes sense.
  • Lower volume, or behaviour roughly achievable with a detailed system prompt → Stick with prompt engineering plus RAG.
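The three questions can be written down as a small function. The thresholds come straight from the tree above (30,000 tokens, once a quarter, 10,000 queries per day); the function and parameter names are illustrative, and the output is a starting point for a conversation rather than a verdict.

```python
def recommend(
    knowledge_tokens: int,
    updates_per_year: int,
    needs_behaviour_change: bool = False,
    queries_per_day: int = 0,
    narrow_task: bool = False,
    prompting_is_enough: bool = True,
) -> str:
    """Walk the three questions in order and return the suggested approach."""
    # Question 1: small and stable knowledge fits in a system prompt.
    if knowledge_tokens < 30_000 and updates_per_year < 4:
        return "prompt engineering"
    # Question 2: knowing facts is a retrieval problem.
    if not needs_behaviour_change:
        return "RAG"
    # Question 3: fine-tuning only pays off at high volume on a narrow task.
    if queries_per_day >= 10_000 and narrow_task and not prompting_is_enough:
        return "fine-tuning"
    return "prompt engineering plus RAG"


small_handbook = recommend(knowledge_tokens=20_000, updates_per_year=2)
product_catalogue = recommend(knowledge_tokens=500_000, updates_per_year=52)
```

A 20,000-token handbook updated twice a year lands on prompt engineering; a large catalogue updated weekly lands on RAG. Fine-tuning is only reachable when all three conditions in Question 3 hold at once.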

// THE 95% RULE

In our experience, 95% of SME use cases are solved by prompt engineering (small, stable knowledge) or RAG (large or changing knowledge). The remaining 5%, almost always high-volume classification, structured output generation, or specific tone preservation at scale, are the only cases where fine-tuning genuinely earns its place.

Most “we need to fine-tune a custom model” conversations end with us building RAG instead. Most “we need a RAG system” conversations end with us recommending a better system prompt for a quarter of the price. The right answer is usually one layer simpler than what the founder originally asked for.

What it actually costs

“It depends” without a number is what vendors say when they do not want you to compare. The benchmark below reflects what we have seen across UK SME RAG projects in 2026, at Pixelfield and across the wider market.

The 4-Tier RAG Cost Benchmark (GBP)

Tier | Setup | Monthly | What you get
--- | --- | --- | ---
Off the shelf | £0 | £20 to £60 per user | ChatGPT Enterprise, Claude Teams, Microsoft Copilot. Fine for small, stable knowledge. Hard ceiling around 50 documents. No retrieval control.
No-code RAG | £500 to £2,000 | £200 to £800 | Voiceflow, Stack AI, Make plus OpenAI API. Good for one well-defined use case. Maintenance is on you. Limited retrieval tuning.
Custom engineered RAG | £8,000 to £20,000 | £100 to £500 (API) | Production RAG with all 5 layers (chunking, hybrid retrieval, reranking, query rewriting, evaluation). Where Pixelfield works.
Multi-source enterprise RAG | £20,000 to £60,000+ | £500 to £2,500 | Multiple data sources, role-based access, audit logging, multi-tenant. Justified when knowledge spans CRM plus ERP plus document store plus custom database.

Why £8,000 to £20,000 is the realistic floor

The £8,000 to £20,000 tier is the floor for SME RAG that will not quietly fail in production.

  • Below that, you are either using off-the-shelf (fine if the use case fits) or paying for a no-code build with maintenance overhead you did not budget for.
  • Above that, you are paying for features you do not need yet.

Where the money actually goes

In a custom RAG build, the model and embedding API costs are tiny, usually under £100 per month for an SME. The real cost is engineering hours, distributed roughly as follows:

  • Data preparation: 30 to 40% of project cost
  • Retrieval architecture and tuning: around 30%
  • Evaluation framework: around 15%
  • Integration with existing systems: 15 to 25%

If a quote skews heavily toward “model selection” or “prompt engineering” with little time on data and evaluation, that is naive RAG with consultancy decoration.

Three hidden costs to budget for

  • Embedding API costs scale with document count. Re-embedding 10,000 documents when you switch models can cost £50 to £300.
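  • Vector database hosting grows with scale. pgvector is a free, open-source PostgreSQL extension, so up to around 1 million vectors it usually fits on a database you already run; beyond that, expect to pay for larger or managed hosting. Qdrant and similar managed alternatives offer UK/EU hosting.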
  • Maintenance budget. Plan for roughly 15% of build cost annually, because RAG drifts as your underlying data changes.
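To turn these figures into a budget line, the arithmetic looks like this. The sketch combines the monthly running cost from the benchmark table with the 15% maintenance figure; the function name is illustrative, and it leaves out one-off costs such as re-embedding.

```python
def annual_running_cost(build_cost: int, monthly_run: int, maintenance_percent: int = 15) -> int:
    """Yearly cost after the build: 12 months of running costs plus maintenance as a share of build cost."""
    return 12 * monthly_run + build_cost * maintenance_percent // 100


# Low and high ends of the custom engineered RAG tier, in pounds.
low_end = annual_running_cost(build_cost=8_000, monthly_run=100)     # 1,200 + 1,200
high_end = annual_running_cost(build_cost=20_000, monthly_run=500)   # 6,000 + 3,000
```

So a custom build in this tier implies roughly £2,400 to £9,000 a year after launch, on top of the build cost itself.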

If you want to test where your project sits on this benchmark before committing, a scoped AI proof of concept typically lands in the £3,000 to £6,000 range and answers the question without locking you in.

When you do not need RAG

The costliest RAG project is the one you did not need to build. In our experience, about 40% of “we need AI to know our business” enquiries actually need proper RAG. The other 60% need something simpler.

Here is how to tell which side of the line you are on.

Signal 1: Your knowledge fits in a system prompt

If your business knowledge is under roughly 30,000 tokens (broadly, under 50 pages of structured information), changes less than once a quarter, and does not include sensitive customer-specific data, a well-written system prompt in ChatGPT Teams or Claude for Work will outperform most custom RAG builds.

Cost comparison: £20 to £60 per user per month versus £8,000 to £20,000. Test this first.

Signal 2: Your “knowledge problem” is actually a search problem

Some businesses think they need AI to “know” their documents when what they actually need is better internal search.

If your team’s complaint is “I cannot find anything in our wiki”, a properly configured search tool (Notion AI, Glean, Mem) might solve it for £8 to £20 per user per month, without any RAG build at all.

Signal 3: Your data is not ready

RAG amplifies whatever is in your data. If your knowledge is in someone’s head, scattered across 47 unorganised folders, or contradicts itself across versions, RAG will return the same chaos faster.

Fix the underlying data organisation first, then revisit the build question.

Signal 4: You need actions, not answers

“I want AI to update our CRM when a customer emails us” is not a RAG problem. It is an agent problem.

The line between knowledge and action matters more than most founders realise:

  • RAG answers questions.
  • Agents do things.

Many enquiries we receive labelled “RAG” are actually AI agent projects in disguise. The right diagnosis saves months.

How to start without wasting money

The fastest way to waste £15,000 on RAG is to commission a build before working through five steps. Here is the framework we use with every SME engagement.

The 5-Step Practical RAG Framework

Step 1: Define the one question your AI needs to answer

Not “answer customer questions”, because that is not a question. Something like “What is the warranty period for product SKU X?” or “What is our refund policy for orders placed before Date Y?”

Specific. Measurable. Pick the question that costs you the most when answered wrong or slowly today.

Step 2: Build the evaluation set before the system

Before any code is written, list 30 to 50 real questions with their correct answers. This is your benchmark.

Anyone proposing to build RAG without committing to an evaluation set is either inexperienced or selling you naive RAG with extra steps.

Step 3: Try the cheap version

Set up ChatGPT Teams or Claude for Work, load the relevant documents into a Project, and run your evaluation set against it.

If accuracy hits 80% or above, you are done. Total cost: £20 to £60 per user per month.

We have stopped projects at this step more than once and saved the client a five-figure sum.

Step 4: If the cheap version fails, scope the build properly

Custom RAG needs all 5 layers from earlier: chunking, hybrid retrieval, reranking, query rewriting, evaluation. Ask any vendor to walk you through their plan for each layer. If they hedge on any of them, walk away.

Step 5: Measure, do not trust

Run your evaluation set weekly for the first month, monthly after that. Watch for accuracy drift.

RAG quality degrades as your data changes, which is normal. Catching it early is the difference between a system that works for years and one that quietly fails for six months before anyone notices.
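A drift check only needs the accuracy figures from each evaluation run. The sketch below compares the latest run against the first (baseline) run; the five-point tolerance and the function name are assumptions to adjust for your own risk appetite.

```python
def accuracy_drifted(accuracy_history: list[float], tolerance: float = 0.05) -> bool:
    """Return True if the latest accuracy is more than `tolerance` below the baseline (first) run."""
    if len(accuracy_history) < 2:
        return False
    return accuracy_history[0] - accuracy_history[-1] > tolerance


stable = accuracy_drifted([0.90, 0.89, 0.88])
degrading = accuracy_drifted([0.90, 0.86, 0.82])
```

A two-point dip stays within tolerance; an eight-point slide triggers the alert. Wiring the result into an email or chat notification is what makes the weekly run worth doing.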

That is essentially what our free AI readiness audit does in a single conversation, applying these five steps to your business.

UK-specific watch-outs

UK GDPR and embedding storage

When your documents become vector embeddings, those embeddings are still personal data if the underlying documents contained personal data. UK GDPR rules on storage location, processor agreements, and data retention all apply.

Three practical implications:

  • Do not use US-only vector databases for customer data without proper contractual safeguards
  • Check which vector databases you can host in UK or EU regions (pgvector runs wherever your PostgreSQL database is hosted, and Qdrant Cloud offers EU regions)
  • Document your embedding processor in your record of processing activities

The right-to-erasure problem

A customer asks to be forgotten under GDPR. You need to delete every chunk of data that mentions them, including embeddings derived from that data.

Most off-the-shelf knowledge bases do not make this easy. Some make it impossible. If you are processing customer data through RAG, your architecture has to support targeted deletion, or you have a legal liability waiting for the day someone exercises their rights.
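Targeted deletion is straightforward if every chunk is tagged, at indexing time, with the people it mentions. The sketch below uses an in-memory dictionary as a stand-in for a vector database; the `Chunk` structure, field names, and sample data are hypothetical. Because the text and its embedding live in the same record, deleting the record removes both.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    text: str
    embedding: list[float]
    subject_ids: set[str] = field(default_factory=set)  # people this chunk mentions


def erase_subject(index: dict[str, Chunk], subject_id: str) -> list[str]:
    """Delete every chunk (text and embedding) tagged with the subject; return the deleted IDs."""
    to_delete = sorted(
        chunk_id for chunk_id, chunk in index.items() if subject_id in chunk.subject_ids
    )
    for chunk_id in to_delete:
        del index[chunk_id]
    return to_delete


# Hypothetical index contents, for illustration only.
index = {
    "t1": Chunk("t1", "Ticket from customer 42 about a late delivery", [0.1, 0.2], {"cust-42"}),
    "t2": Chunk("t2", "Ticket from customer 7 about a refund", [0.3, 0.4], {"cust-7"}),
    "p1": Chunk("p1", "Returns policy", [0.5, 0.6]),
}
deleted = erase_subject(index, "cust-42")
```

The returned list of deleted IDs doubles as evidence for your erasure records. The hard part in practice is the tagging, which has to happen when documents are indexed, not when the request arrives.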

EU AI Act and retrieval logging

The EU AI Act entered into force in August 2024, and most of its obligations are scheduled to apply from August 2026.

  • Limited-risk RAG use cases (most SME applications): the main requirement is transparency. Users must be told they are interacting with AI.
  • Higher-risk cases (credit decisions, employment screening, healthcare advice): retrieval logging becomes mandatory.

Off-the-shelf platforms typically log generation but not retrieval. If you operate in a regulated industry, your RAG needs to log which chunks were retrieved for which query, and why. Our enterprise AI solutions work builds this in from the architecture stage rather than bolting it on later.
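A retrieval log entry can be as simple as one JSON line per query. The sketch below records which chunks were retrieved, in what order, and with what score (the closest thing to a "why" that most retrievers expose); the field names are illustrative, and a Python list stands in for a real log store.

```python
import json
from datetime import datetime, timezone


def log_retrieval(
    log: list[str], user_id: str, query: str, results: list[tuple[str, float]]
) -> dict:
    """Append one JSON line recording which chunks were retrieved for a query, with rank and score."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "query": query,
        "chunks": [
            {"rank": rank, "chunk_id": chunk_id, "score": round(score, 4)}
            for rank, (chunk_id, score) in enumerate(results, start=1)
        ],
    }
    log.append(json.dumps(record))
    return record


audit_log: list[str] = []
entry = log_retrieval(
    audit_log, "user-1", "What is the notice period?", [("hr-12", 0.91), ("hr-03", 0.64)]
)
```

Bear in mind that the queries themselves may contain personal data, so the log falls under the same retention and erasure rules as the rest of the system.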

Frequently asked questions

What does it mean when AI “knows” my business?

It means the AI has access to your specific information (products, customers, policies, processes) at the moment it answers a question, either through a system prompt, retrieval-augmented generation (RAG), or fine-tuning. For UK SMEs whose knowledge is too large or changes too often to fit in a system prompt, the right approach is RAG: a system that retrieves relevant information from your documents and feeds it to the AI in real time. Fine-tuning (training the model itself) is rarely the right answer for business knowledge.

What is naive RAG and why does it fail?

Naive RAG is the basic version of retrieval augmented generation: split documents into chunks, store them as vectors, retrieve the closest match, ask the AI to answer. It works in demos but fails in production in five common ways: bad chunking, lost-in-the-middle retrieval, exact-phrase failures, stale data, and the AI ignoring retrieved context entirely. Practical RAG adds five engineering layers that prevent each of those failures.

Is RAG cheaper than fine-tuning?

For SME use cases, yes, significantly. Custom RAG typically costs £8,000 to £20,000 to build and £100 to £500 per month to run. Fine-tuning a model costs more upfront, requires substantial training data, and locks you into a snapshot of your knowledge that goes stale within weeks. RAG is also much faster to update: change a document, re-index it, and the system answers from the new version.

Why might a business choose fine-tuning instead of RAG?

Three legitimate cases. First, very high query volume (10,000+ per day) where fine-tuning a small model is cheaper per call than running RAG. Second, behaviour change rather than knowledge change, making the AI write in your specific tone or format. Third, very narrow classification tasks where consistency matters more than flexibility. For most SMEs, none of these apply.

What are the disadvantages of RAG?

Three main ones. RAG only works if your data is well organised, because garbage in equals garbage out. RAG adds latency (typically 1 to 3 seconds per query) because of the retrieval step. RAG can fail silently when the wrong chunk is retrieved and the model answers confidently anyway, which is why production RAG needs continuous evaluation rather than one-off testing.

How long does it take to build a RAG system for an SME?

A scoped proof of concept can be ready in 2 to 4 weeks. A first proper RAG system for one workflow typically takes 6 to 10 weeks from kickoff. Anything quoted under 2 weeks is either using a no-code platform or skipping the evaluation and tuning layers that make RAG actually reliable.

Can my AI use my customer database safely?

Yes, with the right architecture. Practical RAG can include role-based access (so the AI only sees what the asking user is authorised to see), data residency controls (embeddings stored in UK or EU regions), and right-to-erasure support (targeted deletion when customers exercise GDPR rights). Off-the-shelf “AI knowledge base” tools usually do not offer these, which is the main reason regulated UK SMEs end up commissioning custom builds.
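Role-based access comes down to filtering before retrieval, so that chunks a user may not see never reach the model. A minimal sketch, assuming each chunk carries a set of roles allowed to read it; the data shape and role names are hypothetical.

```python
def visible_chunks(chunks: list[dict], user_roles: set[str]) -> list[dict]:
    """Keep only the chunks that at least one of the user's roles is allowed to read."""
    return [chunk for chunk in chunks if chunk["allowed_roles"] & user_roles]


# Hypothetical chunks, for illustration only.
chunks = [
    {"id": "salary-bands", "allowed_roles": {"hr"}},
    {"id": "returns-policy", "allowed_roles": {"hr", "support"}},
]
support_view = [chunk["id"] for chunk in visible_chunks(chunks, {"support"})]
```

A support agent's question is answered only from the returns policy; the salary bands are never retrieved, so the model cannot leak them. Filtering after generation is too late.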

Where to go from here

The hardest part of giving your AI access to your business knowledge is not the engineering. It is deciding which knowledge actually needs to be there, in what form, and whether you need RAG at all.

Most enquiries we receive labelled “we need a knowledge base AI” turn out to need something simpler: better document organisation, a tighter system prompt, or a different tool entirely. About 40% genuinely need proper RAG.

The fastest way to find out which side of the line your business is on is to walk through your data, your questions, and your team’s actual workflow with someone who has built both kinds of systems. We do this as our free AI readiness audit: a structured conversation with no sales deck and no commitment.


If you have not already read it, the companion piece to this guide is our Complete Beginner’s Guide to AI Agents for SMEs, which covers the action side of business AI. RAG answers questions. Agents do things. Most growing companies eventually need both.
