22 September 2025

How Structured Knowledge Makes AI Less Wrong

By Asgeir Albretsen · 5 min read

ai, knowledge-base, structured-data, retrieval

Giving AI more context doesn't always help. The format of that information turns out to matter more than most people expect.

Here's something I learned by doing it wrong first: pasting your entire notes document into a chat with Claude does not make Claude smarter about your situation. Sometimes it makes things worse.

This surprised me. The model has a huge context window — it can handle hundreds of thousands of tokens without breaking a sweat. More context should mean better answers. Except it often doesn't. And the reason turns out to be less about what's in the context and more about how it's arranged.

The problem with walls of text

When you hand an AI a block of unstructured text — a journal entry, a long note about someone you know, a collection of meeting summaries — the model has to do several jobs at once. It needs to figure out what's a stable fact and what's a passing impression. What's current versus what's historical. What's a real preference versus something you mentioned once and don't care about anymore.

Unstructured text doesn't carry that information. "Sarah runs the marketing team" and "Sarah ran the marketing team before she moved to product" look similar on the surface. A human reading your notes would catch the distinction through context and tense. A model, working probabilistically across the whole document, might not — and it tends to resolve the ambiguity with a confidence it hasn't earned.

This is where hallucinations often come from. Not from the AI knowing nothing, but from the AI being handed facts it can't cleanly interpret. It fills the gaps. It blends the old and the new. It picks one reading of an ambiguous sentence and commits to it.

Orlando Ayala and Patrice Bechard wrote about precisely this in a paper presented at NAACL 2024. They deployed an enterprise RAG system that produced structured workflow outputs from natural language requirements, and found that structured retrieval (querying typed data rather than pulling unstructured document chunks) significantly reduced hallucination rates. The mechanism isn't complicated: when facts arrive pre-typed and pre-labeled, the model's job shifts from interpretation to use. It doesn't have to guess what something means.

What typed facts actually do

Contrast unstructured prose with something like this:

topic: communication_style
value: prefers short async updates over long meetings
source: stated directly
confidence: high

Four fields. No ambiguity. The model doesn't have to infer anything from sentence structure or surrounding context. It reads a fact and uses it.

This is what structured knowledge actually provides: not more information, but more interpretable information. The difference shows up sharply in personal context — the kind of thing people try to give AI by writing long system prompts or pasting their notes. Typed entities, discrete preference records, explicit relationships with labeled fields don't just compress the data. They remove the interpretive work that causes errors.
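To make the idea of a typed fact concrete, here is a minimal Python sketch of a record with the four fields shown above. The class name `Fact`, the `render` method, and the low/medium/high confidence scale are assumptions made for illustration; they don't describe any particular product's schema.

```python
from dataclasses import dataclass

# Assumed confidence scale; the article only shows "high".
ALLOWED_CONFIDENCE = {"low", "medium", "high"}


@dataclass(frozen=True)
class Fact:
    """One discrete, labeled fact (hypothetical schema)."""
    topic: str
    value: str
    source: str
    confidence: str

    def __post_init__(self):
        # Reject values outside the assumed scale so bad data fails early.
        if self.confidence not in ALLOWED_CONFIDENCE:
            raise ValueError(f"unknown confidence: {self.confidence!r}")

    def render(self) -> str:
        """Format the fact as labeled lines a model can read directly."""
        return "\n".join([
            f"topic: {self.topic}",
            f"value: {self.value}",
            f"source: {self.source}",
            f"confidence: {self.confidence}",
        ])


fact = Fact(
    topic="communication_style",
    value="prefers short async updates over long meetings",
    source="stated directly",
    confidence="high",
)
print(fact.render())
```

The point of the validation step is that a typed record can refuse to hold a value it can't interpret, which prose never does.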

There's a retrieval benefit too. If an AI needs to answer a question about a specific person, a structured system can pull exactly the relevant entity — name, relationship type, known preferences, open tasks — and nothing else. The model gets a tight, targeted context instead of a noisy document with maybe-relevant passages scattered throughout.
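A rough sketch of that targeted retrieval, assuming an in-memory dictionary keyed by lowercase name. `PersonEntity`, `context_for`, and all the sample data are hypothetical; a real system would sit on a database or index, but the shape of the lookup is the same.

```python
from dataclasses import dataclass, field


@dataclass
class PersonEntity:
    """A person record with discrete fields (hypothetical schema)."""
    name: str
    relationship: str
    preferences: list[str] = field(default_factory=list)
    open_tasks: list[str] = field(default_factory=list)


def context_for(name: str, store: dict[str, PersonEntity]) -> str:
    """Return labeled lines for one person and nothing about anyone else."""
    entity = store.get(name.lower())
    if entity is None:
        return f"No record for {name}."
    lines = [f"name: {entity.name}", f"relationship: {entity.relationship}"]
    lines += [f"preference: {p}" for p in entity.preferences]
    lines += [f"open_task: {t}" for t in entity.open_tasks]
    return "\n".join(lines)


# Illustrative data only.
store = {
    "sarah": PersonEntity(
        name="Sarah",
        relationship="colleague",
        preferences=["prefers short async updates over long meetings"],
        open_tasks=["send her the launch notes"],
    ),
    "tom": PersonEntity(name="Tom", relationship="friend"),
}

print(context_for("Sarah", store))
```

A question about Sarah gets Sarah's record. Tom's record never enters the context, so there is nothing for the model to mix up.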

Why personal knowledge is the hardest case

Most people who try to give an AI "context about themselves" do it through prose. A long system prompt. A document full of notes. Some combination of both. And this partly works — until the AI references a project you finished six months ago as ongoing, or conflates two people because their names appear in the same paragraph, or gives advice that's technically consistent with your notes but misses something obvious that any human would catch.

The hard part is that personal notes are designed for human reading. They're full of shorthand, references to past events, emotional context, implicit assumptions. They're useful precisely because you don't have to make everything explicit — your own memory fills the gaps. The AI doesn't have that memory. It's reading your notes cold, without the background you take for granted.

What a structured knowledge layer does is make the implicit explicit. Not by replacing the notes — you still write things however you want — but by maintaining typed entities underneath: a person record with discrete fields, a preference record with a confidence level and a source, a project with a status. When the AI answers a question, it queries that layer, not the document. The document stays for you.
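The two failure cases above (a finished project treated as ongoing, two people conflated) can be sketched as explicit fields. This is an illustration under assumptions: the names `ProjectStatus`, `Person`, `Project`, and `ongoing_projects`, and the sample records, are all made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class ProjectStatus(Enum):
    ACTIVE = "active"
    FINISHED = "finished"


@dataclass(frozen=True)
class Person:
    person_id: str  # stable identifier, so similar names never merge
    name: str
    role: str


@dataclass(frozen=True)
class Project:
    title: str
    status: ProjectStatus
    owner_id: str  # refers to Person.person_id


def ongoing_projects(projects: list[Project]) -> list[Project]:
    """Keep only projects explicitly marked active."""
    return [p for p in projects if p.status is ProjectStatus.ACTIVE]


# Illustrative data: two different people who share a first name.
people = {
    "p1": Person("p1", "Alex Berg", "designer"),
    "p2": Person("p2", "Alex Lind", "accountant"),
}
projects = [
    Project("Website redesign", ProjectStatus.FINISHED, owner_id="p1"),
    Project("Tax filing", ProjectStatus.ACTIVE, owner_id="p2"),
]

for project in ongoing_projects(projects):
    owner = people[project.owner_id]
    print(f"{project.title} (owner: {owner.name}, {owner.role})")
# prints "Tax filing (owner: Alex Lind, accountant)"
```

The finished project is filtered out by its status field rather than by the model's reading of tense, and each project points at one person by identifier rather than by a name that might appear twice.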

I keep coming back to a specific case: someone mentions in conversation that they hate being cc'd on emails unnecessarily. Buried in a note, that fact might get retrieved or it might not, might be weighted correctly or might not. Stored as preference: avoid unnecessary CC on emails, confidence: high, source: stated — the AI will use it. Every time. Without having to decide whether it was important enough to remember.

That's the actual value. Not smarter AI. Just less room for the AI to get confused.

Asgeir Albretsen is the founder of Harbor.
