
How to build a knowledge base your AI support agent can actually use

17 May 2026 · 8 min read · Keloa
knowledge-base · ai-support · how-to

A knowledge base for AI customer support is not the same thing as a help center for humans. The AI agent does not skim, it does not guess from a screenshot, and it does not fill gaps with intuition. It can only answer with what it can retrieve, in the words your customer used, in a passage small enough to find. This guide covers how to write and structure content so the agent can actually use it, how to find the gaps, and how often to refresh.

Why does a help center built for humans fail an AI agent?

A human reading your help center brings context the page does not contain. They know the same feature goes by three different names. They scroll past the marketing intro to reach the steps. They infer that "Settings" means the gear icon.

An AI agent has none of that. It works by retrieval. When a customer asks a question, the system searches your content, pulls back the passages that look most relevant, and the model answers from those passages. If the right passage is not retrieved, the model does one of two things, both bad: it gives an incomplete answer, or it fabricates a plausible one. A 2025 analysis of errors in retrieval systems found that retrieval failures lead the generator to "give incomplete answers, fabricate information to fill gaps, or abstain unnecessarily." The same analysis found that splitting text at arbitrary points breaks the link between a definition and the information it supports.
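To make that failure mode concrete, here is a toy sketch of the retrieve-then-answer loop. It does not reflect any particular vendor's pipeline: real systems use embedding search and a language model, where this version uses keyword overlap and returns the best passage verbatim. What it shows is the constraint the rest of this guide is built on: the answer can only come from what retrieval returns.

```python
import re

# Toy retrieve-then-answer loop. Illustrative only: real systems use embedding
# search and a language model; keyword overlap and a verbatim passage stand in
# here, but the constraint is the same -- the answer can only come from
# whatever retrieval returned.

ARTICLES = [
    {"title": "How to request a refund",
     "text": "To request a refund, open Billing, select the charge, and click Refund."},
    {"title": "How to change your plan",
     "text": "Plan changes take effect at the start of the next billing cycle."},
]

def words(text: str) -> set[str]:
    """Lowercase word set with punctuation stripped -- a crude relevance signal."""
    return set(re.findall(r"[a-z']+", text.lower()))

def answer(question: str) -> str:
    best = max(ARTICLES, key=lambda a: len(words(question) & words(a["text"])))
    if not words(question) & words(best["text"]):
        # Nothing relevant retrieved: abstain and escalate rather than fabricate.
        return "I don't know -- handing this to a human."
    return best["text"]

print(answer("How do I get a refund?"))
```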

So the question is not "is our help center good?" It is "can a retrieval system find the exact answer, on its own, with no human context to lean on?" Most help centers were never built for that test.

What makes content retrievable instead of decorative?

Decorative content reads well to a person and returns nothing useful to a retriever. Retrievable content is the opposite. The difference comes down to a few habits.

Answer the question in the first two sentences. Put the answer at the top of the article, before the background. A retriever that pulls the opening passage should already have the answer in hand. Background, edge cases, and screenshots come after.

Use the words your customers use. Internal teams name things their own way. Customers do not. Nielsen Norman Group's research on findability is blunt about this: "call a spade a spade, not a digging implement." If customers write "refund" and your article says "reversal of charge," the retriever has to bridge that gap, and sometimes it will not. Write the article in customer vocabulary, and add the synonyms customers actually type.

One article, one question. An article titled "Billing" that covers refunds, invoices, plan changes, and tax is four articles wearing a trench coat. Split it. Each article should answer one question completely. When a customer asks about refunds, you want the retriever to find a refund article, not a billing essay where the refund part is paragraph six.

Cut the throat-clearing. Marketing intros, "welcome to our help center," restated mission statements. None of it helps a retriever and all of it dilutes the passage that does. Concise, scannable content is also better content for humans: Nielsen Norman Group measured a rewritten site as 159% more usable than the original, at 54% of the word count.

How should articles be structured and chunked?

Chunking is the step where your articles get split into the small passages a retrieval system searches over. You often do not control the chunking directly, since the platform handles it, but how you write changes how well it goes.

A chunk that splits mid-explanation is a chunk that retrieves a definition without its example, or a step without its warning. Independent research on chunking strategies found that the splitting method alone can change retrieval recall by up to nine percent, and that the default settings of popular tools often perform poorly. You cannot fix the algorithm from the content side, but you can write so that any reasonable split still produces a self-contained passage.

In practice that means:

  • Short sections under descriptive headings. A heading like "How to request a refund" front-loads the words a customer searches with. A heading like "More information" front-loads nothing.
  • Self-contained sections. Each section should make sense if it is the only thing retrieved. Avoid "as mentioned above" and "see the previous section." The retriever may not have the previous section.
  • Spell out the antecedent. Replace "it does this automatically" with "the system processes the refund automatically." Pronouns that point at a paragraph two screens up break when the paragraph is gone.
  • Lists and steps over dense prose. A numbered procedure survives chunking better than a wall of text, because each step is already a discrete unit.

The goal is that any passage, lifted out of its article and read alone, still answers something. That is also a useful definition of a good article. If you want the underlying mechanics, our knowledge base glossary entry and the grounding glossary entry explain how retrieval and source-grounded answers fit together.
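To make "self-contained passage" concrete, here is a rough sketch of heading-based chunking. Real platforms add token limits and overlap, and their splitters differ, so treat this only as an illustration of why a descriptive heading at the top of each section pays off: the heading travels with the chunk.

```python
import re

# Rough illustration of heading-based chunking: each section becomes one
# passage, prefixed by its own heading so the chunk carries its context.
# Not how any specific platform works -- a sketch of the principle.

def chunk_by_headings(article: str) -> list[str]:
    chunks, current = [], []
    for line in article.splitlines():
        if re.match(r"^#{1,3} ", line):          # a new section starts here
            if current:
                chunks.append("\n".join(current).strip())
            current = [line]
        else:
            current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

article = """# How to request a refund
Open Billing, select the charge, and click Refund.

## When refunds are not available
Refunds are not available more than 30 days after the charge."""

for chunk in chunk_by_headings(article):
    print("---\n" + chunk)
```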

How do you find and fill the gaps?

You cannot write the whole knowledge base up front, and you should not try. The gaps reveal themselves once the agent is live.

The fastest signal is the agent's own behavior. When it cannot answer, that is a logged event, and it is the most honest backlog you will ever get. Pull the questions where the agent escalated, hedged, or said it did not know. Cluster them. The top clusters are your next ten articles, ranked by real demand instead of guesswork.
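If your platform does not cluster these for you, even a crude script over an exported log produces a ranked backlog. The sketch below assumes a plain list of unanswered questions and groups them by keyword; a real pipeline might cluster with embeddings instead, but the output is the same: topics ranked by how often customers actually hit them.

```python
from collections import Counter

# Sketch: turn logged "could not answer" questions into a ranked backlog.
# A crude keyword count stands in for real clustering; the field names and
# stopword list are illustrative.

STOPWORDS = {"how", "do", "i", "a", "the", "my", "to", "can", "on", "for", "get"}

def keywords(question: str) -> set[str]:
    stripped = (w.strip("?.,!").lower() for w in question.split())
    return {w for w in stripped if w and w not in STOPWORDS}

unanswered = [
    "How do I get a refund?",
    "Can I get a refund on my annual plan?",
    "How do I export my invoices?",
    "Refund for annual plan",
]

topic_counts = Counter(w for q in unanswered for w in keywords(q))
for topic, count in topic_counts.most_common(10):
    print(f"{count:>3}  {topic}")    # the top clusters are your next articles
```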

The second signal is the answers that were technically given but wrong or thin. These are harder to spot because nothing failed loudly. Sample the agent's transcripts weekly. Read twenty at random. You will find answers that were correct but assembled from a stale article, or correct but missing the one caveat that matters. Each one is either a content fix or a new article.

The third signal is your human team. The questions agents field over and over, the ones that never made it into writing because everyone already knew the answer, those are pure gap. Ask the team once a month: what did you explain this week that is not written down anywhere?

How often should a knowledge base be refreshed?

A knowledge base is not a launch, it is a standing process. Stale content is worse than missing content, because the agent will confidently retrieve and serve it.

A workable cadence has three layers. Event-driven updates happen the same day something changes: a price, a policy, a feature, a removed integration. These cannot wait for a cycle, because the agent will keep citing the old version until the article changes. A monthly review pass works through the gap backlog from agent logs and transcript sampling. A quarterly audit reads the whole base for drift, the slow kind, where nothing broke but the product moved and the article did not follow.

Tie the cadence to ownership. Every article needs one owner and a last-reviewed date. An article no one owns is an article no one will notice has gone stale.
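If your tooling does not track this for you, a small audit script over an article export is enough. The sketch below assumes each article record carries an owner and a last-reviewed date (the field names are illustrative) and flags anything unowned or untouched for more than ninety days.

```python
from datetime import date

# Sketch of a review-cadence check. Assumes an export where each article has
# an owner and a last-reviewed date; field names and threshold are illustrative.

REVIEW_AFTER_DAYS = 90
today = date(2026, 5, 17)   # fixed date so the example output is reproducible

articles = [
    {"title": "How to request a refund", "owner": "sam", "last_reviewed": date(2026, 4, 30)},
    {"title": "Plan changes",            "owner": None,  "last_reviewed": date(2025, 6, 3)},
]

for a in articles:
    age = (today - a["last_reviewed"]).days
    if a["owner"] is None:
        print(f"NO OWNER : {a['title']}")
    if age > REVIEW_AFTER_DAYS:
        print(f"STALE    : {a['title']} (last reviewed {age} days ago)")
```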

How Keloa approaches this

We built Keloa's AI agents on the assumption that the knowledge base is the product, not an accessory to it. The agents answer from your own content and cite the source for every answer, so you and the customer can both see which article a reply came from. That citation is also a debugging tool: a wrong answer points straight at the article that needs fixing.

When the agent cannot answer, it does not improvise. It says so, and it logs the question. Those logged gaps become your content backlog, ranked by how often customers actually hit them. We would rather the agent hand a question to a human than guess, because a guess erodes trust faster than a handoff ever will. The result is a loop: the agent surfaces what is missing, you write it, the agent gets better, and the cost of support stays predictable. Our pricing is built around that, billed per reply, so a better knowledge base lowers your cost instead of raising it.

Frequently asked questions

Do I need a huge knowledge base before launching an AI support agent? No. A small, accurate knowledge base beats a large, stale one. Start with the twenty questions your team answers most, written clearly, and let the agent's logged gaps tell you what to write next. Volume without accuracy just gives the agent more ways to be wrong.

Can the AI agent use PDFs, old tickets, and internal docs? It often can, but treat those as raw material, not finished content. PDFs and ticket threads carry layout noise, outdated answers, and missing context that hurt retrieval. The best results come from content written as standalone articles, one question each, in customer language.

What is chunking and do I have to manage it myself? Chunking is splitting your articles into small passages a retrieval system can search. The platform usually handles the mechanics. Your job is to write self-contained sections under descriptive headings so that any reasonable split still produces a passage that answers something on its own.

How do I stop the agent from giving outdated answers? Make updates event-driven for anything customer-facing: a price or policy change goes in the same day. Add a monthly review and a quarterly full audit, and give every article an owner and a last-reviewed date. An agent will confidently serve stale content, so the fix is keeping the content fresh, not tuning the model.

How do I know if my knowledge base is actually working for the AI agent? Watch three numbers: how often the agent escalates because it cannot answer, how often customers come back unsatisfied after an answer, and what your transcript sampling turns up. If escalations cluster around the same topics, you have specific gaps to fill, not a broken agent.

Want to see how this works in our product?

Free Starter plan, 50 AI replies, no credit card. Set up in ten minutes.