14 September 2025

The hearsay in your notes

By Asgeir Albretsen · 4 min read

knowledge · ai · structured-data

Courts figured out in the 1600s that written statements and sworn testimony need to be treated differently. Your notes app hasn't.

In 1603, Sir Walter Raleigh was tried for treason. The central evidence against him was a written confession by Lord Cobham, an alleged co-conspirator, given to investigators outside the courtroom and read aloud at trial. Cobham himself was never called to testify. Raleigh demanded to confront his accuser. The judges refused. He was convicted and, fifteen years later, executed.

Lawyers spent the next century working out why that was wrong, and the answer became the rule against hearsay: a statement made outside the courtroom, written or spoken, is not the same as testimony given under oath and open to challenge. You can cross-examine a person. You cannot cross-examine a document.

Your notes app has no idea this distinction exists.

What your notes don't say

Scroll through any serious note-taker's archive and you'll find, sitting in the same list with the same font and the same timestamp format:

Things they witnessed directly
Things someone told them
Things they inferred
Things they half-remember and probably got wrong
Things that were true in 2021 and quietly stopped being true

None of that is marked. "Jonas hinted the deal might fall through" sits next to "Q3 numbers look fine — reviewed the contract" with no indication that one is third-hand hearsay about a deal that ultimately closed successfully, and the other is a personal assessment from six months ago.

When you're the only reader, it doesn't matter much. You remember Jonas. You remember that he tends toward catastrophizing. You remember the qualifier that the note didn't include. Your brain resolves the ambiguity automatically, drawing on context that never made it into the text.

But that context belongs to you in the moment you wrote the note. It doesn't belong to the note itself.

When the reader changes

AI makes this structural problem suddenly visible.

An AI connected to your knowledge base gives every note roughly equal weight. It has no way to know that Jonas was unreliable, that the deal closed, that "hinted" was doing epistemological work in that sentence. It processes the statement and may surface it confidently months later ("according to your notes, the deal with the client fell through") based on something that was wrong when you wrote it, attributed to someone who was wrong to begin with.

The failure isn't retrieval. The note was found successfully. The failure is that there was never any way to record what kind of evidence the note was in the first place.

This is what courts were trying to solve in the seventeenth century. A statement can fail as evidence not because it's false but because there's no way to assess how reliable it is — no way to know whether the person who said it was in a position to know, was being careful, had reasons to mislead. Mixing unreliable statements with reliable ones, with no distinguishing metadata, produces conclusions that are confidently wrong.

What hearsay law actually built

The exceptions to the hearsay rule are more interesting than the rule itself.

The "business records exception" — Federal Rules of Evidence 803(6) — allows records into evidence if they were made at or near the time of the event, by someone with firsthand knowledge, in the regular course of business. The reliability of the record follows from the circumstances of its creation. A contemporaneous entry written by the person who was there, as part of a systematic practice, is treated differently from a retrospective account assembled later from memory.

That's a design spec. Not for law — for a knowledge base.

Records made at the time, by the person with direct knowledge, in a structured format, are simply more trustworthy than notes reconstructed later. Scientists arrived at the same practice. Faraday kept some 16,000 numbered diary entries, dated and signed, specifically because a measurement you can't verify the origin of is worth very little. Lab notebooks developed their conventions (dated, witnessed, methodology documented) because "compound stable at 40°C" has completely different value depending on whether you measured it yourself, your colleague mentioned it, or you read it in a paper you didn't cite.

Most personal knowledge software has never borrowed this insight.

What structure partially fixes

Typed entities don't solve hearsay. But they narrow the problem.

A generic note about Jonas and the deal buries provenance in prose. A person record for Jonas, a project record for the deal, notes attached to each with explicit types — that moves the question from "every sentence in every note" to "every entity." And an entity record, because it has a schema, can carry fields that a blob of text cannot: source, date, confidence, last-verified.
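Here is a minimal sketch of what such a record could look like, in Python. Only the four fields (source, date, confidence, last-verified) come from the paragraph above. The class names, the three source categories, and the example date and confidence value are assumptions made up for illustration, not a description of any particular product.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class Source(Enum):
    """How the author came to know the claim (hypothetical categories)."""
    WITNESSED = "witnessed"  # saw or measured it directly
    TOLD = "told"            # someone else said it
    INFERRED = "inferred"    # the author's own conclusion


@dataclass
class Claim:
    text: str
    source: Source
    recorded_on: date
    confidence: float                     # 0.0 to 1.0, the author's own estimate
    told_by: Optional[str] = None         # who said it, when source is TOLD
    last_verified: Optional[date] = None  # None means never re-checked


@dataclass
class Entity:
    kind: str  # e.g. "person" or "project"
    name: str
    claims: list[Claim] = field(default_factory=list)


# The Jonas note, restated as a claim attached to the deal it is about.
# The date and the confidence value are invented for the example.
deal = Entity(kind="project", name="Client deal")
deal.claims.append(
    Claim(
        text="Deal might fall through",
        source=Source.TOLD,
        told_by="Jonas",
        recorded_on=date(2025, 3, 1),
        confidence=0.3,
    )
)
```

The claim still says what the note said, but a later reader can now see that it was told rather than witnessed, who told it, and that nobody has checked it since.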

You don't need to tag every sentence. You need structure at the level where provenance matters — the claim about a person, the decision you recorded, the preference you noted down. Those are the things an AI will retrieve and act on. Those are the things worth making legible.

The business records exception works because it anchors reliability in process. Regular practice, contemporaneous recording, someone with firsthand knowledge. The parallel for a knowledge base is simpler: typed entities, sourced explicitly, with a date attached. Not because it's philosophically tidy, but because a reader who wasn't there — including an AI — needs some way to weight what you knew against what you were told.
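To make "weight what you knew against what you were told" concrete, here is one possible sketch in Python. Everything in it is an assumption chosen for illustration: the function name, the base weights per source, and the 180-day half-life. It shows the shape of the idea, not a recommended formula.

```python
from datetime import date
from typing import Optional

# Hypothetical base weights per kind of evidence; the numbers are illustrative only.
BASE_WEIGHT = {"witnessed": 1.0, "inferred": 0.6, "told": 0.4}


def evidence_weight(
    source: str,
    recorded_on: date,
    today: date,
    last_verified: Optional[date] = None,
    half_life_days: float = 180.0,
) -> float:
    """Base weight for the source, halved for every half_life_days
    that have passed since the claim was recorded or last verified."""
    anchor = last_verified or recorded_on
    age_days = max((today - anchor).days, 0)
    decay = 0.5 ** (age_days / half_life_days)
    return BASE_WEIGHT[source] * decay


today = date(2025, 9, 14)
# Something Jonas said 180 days ago, never re-checked.
told = evidence_weight("told", date(2025, 3, 18), today)
# Something observed directly today.
seen = evidence_weight("witnessed", today, today)
print(told, seen)
```

With these made-up numbers the old second-hand claim ends up with a fraction of the weight of the fresh first-hand one. The exact values matter less than the fact that the ranking is computed from recorded provenance rather than guessed from prose.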

Raleigh was convicted on what one man allegedly said another man said. The note in your knowledge base that Jonas hinted the deal might fall through has the same evidentiary status. Courts were shown why that matters in 1603. It took them another century to build the rule. We're still early.

Asgeir Albretsen is the founder of Harbor.
