4 July 2025

Memory Is the Wrong Word

Why calling AI knowledge storage 'memory' sets up exactly the wrong expectations — and what the right word reveals about how these tools should work.

When I first tried ChatGPT's "Memory" feature, I felt a flicker of genuine hope. Finally, a tool that would carry forward what I'd told it — preferences, context, the slow accumulation of actually knowing me. I used it seriously for a month. It remembered some things I'd said. It also confidently "remembered" conclusions I'd never reached, and had no trace of a few things I'd explicitly stated. When I asked what it had retained, I got a vague summary, not a list. I couldn't edit individual entries. I didn't know where the information had come from or when.

The word memory had set me up for exactly the wrong expectations.

What memory actually does

In 2001, psychologist Daniel Schacter published The Seven Sins of Memory, a catalogue of ways human memory fails: transience, absent-mindedness, blocking, misattribution, suggestibility, bias, persistence. The list reads like a bug report. But Schacter's central argument is that each "sin" is also a feature. Memory is reconstructive, not reproductive. You don't retrieve a stored file — you rebuild a scene from fragments, shaped by what you currently believe, feel, and expect. The same mechanism that causes misattribution also causes generalization. The same one that causes bias also causes prioritization.

This makes memory useful for living. You don't need to remember every detail of a meeting; you need to remember whether the person was trustworthy and what you agreed to do. Human memory compresses aggressively toward meaning. It forgets the irrelevant. It rewrites itself slightly every time you access it, because the next retrieval needs to be useful, not accurate.

That's not what any AI system does. Not even close.

What retrieval actually is

What AI "memory" products do is retrieve from a store. The store might be a vector index, a structured summary, a database of extracted preferences — the implementations vary. But the fundamental operation is lookup, not reconstruction. And when the lookup fails, or surfaces something you didn't say, users are confused in a way they wouldn't be if the product had been honest about what it was.

Andrej Karpathy, a founding member of OpenAI, described his personal knowledge setup in early 2025 as a compiler. Raw sources (articles, notes, observations) are processed by an LLM into a structured wiki. The raw material is source code. The wiki is the binary. You don't run source code directly; you compile it first into something denser, more queryable, and faster to use. The key word he kept returning to wasn't intelligence or memory; it was structured. The goal is a store you can inspect and rely on.

That framing cuts at something real. A compiled knowledge base isn't trying to simulate human memory. It's trying to be something better than human memory for the specific purpose of being useful later.

Records

The right word isn't memory. It's records.

A record has completely different affordances. Records are inspectable — you can open them, read every entry, edit the ones that are wrong, and delete the ones that shouldn't exist. Records have timestamps and sources. A memory says "I know you prefer direct feedback." A record says: preference: direct feedback — added March 3, updated April 9, source: conversation about the Q1 review. You can distinguish what was inferred from what was stated. You can tell when something was last touched.
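To make those affordances concrete, here is a minimal sketch of what such a record could look like in Python. Everything in it is an assumption made for illustration: the `Record` and `RecordStore` names, the field layout, and the year 2025 on the dates (the example above gives only the month and day). It is not a description of how any existing product stores data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    key: str       # what the entry is about, e.g. "preference"
    value: str     # the content itself
    source: str    # where the entry came from
    stated: bool   # True if the user said it, False if the system inferred it
    added: date    # when the entry was created
    updated: date  # when the entry was last touched


class RecordStore:
    """Hypothetical in-memory store: every entry can be listed, edited, or deleted."""

    def __init__(self):
        self._records = {}
        self._next_id = 1

    def add(self, key, value, source, stated, on):
        record_id = self._next_id
        self._records[record_id] = Record(key, value, source, stated, on, on)
        self._next_id += 1
        return record_id

    def edit(self, record_id, value, on):
        record = self._records[record_id]
        record.value = value
        record.updated = on  # the edit itself leaves a visible trace

    def delete(self, record_id):
        del self._records[record_id]

    def entries(self):
        # Full list of (id, record) pairs, not a vague summary.
        return list(self._records.items())


store = RecordStore()
rid = store.add(
    "preference",
    "direct feedback",
    source="conversation about the Q1 review",
    stated=True,
    on=date(2025, 3, 3),  # year assumed for illustration
)
store.edit(rid, "direct feedback", on=date(2025, 4, 9))

for record_id, r in store.entries():
    kind = "stated" if r.stated else "inferred"
    print(f"[{record_id}] {r.key}: {r.value} ({kind}), "
          f"added {r.added}, updated {r.updated}, source: {r.source}")
```

The point of the sketch is the shape, not the storage: each entry carries its own source, its own dates, and a flag separating what was stated from what was inferred, and nothing changes unless someone calls `edit` or `delete`.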

Records don't spontaneously distort. They don't conflate two separate conversations into one confident summary. They don't silently deprecate old entries when new ones arrive. They're boring in exactly the way you want your knowledge base to be boring.

This matters especially when AI writes to the record. If an agent adds a note about a person I know, or marks a preference I've apparently expressed, I want to see the entry — not because I distrust the agent, but because records are the right abstraction for something I might later rely on. The review step isn't friction or ceremony. It's the difference between a knowledge base and a confidence machine.
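One way to picture that review step is a pending queue: what I write goes straight into the record, while what an agent writes waits until I approve or reject it. The sketch below assumes this design. `ReviewedStore`, `Proposal`, and the sample entries are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    text: str
    author: str  # "user" or "agent"


class ReviewedStore:
    """Hypothetical store where agent-written entries need explicit approval."""

    def __init__(self):
        self.records = []  # entries that can be relied on
        self.pending = []  # agent proposals awaiting review

    def propose(self, text, author):
        proposal = Proposal(text, author)
        if author == "user":
            self.records.append(text)      # the user's own statements go straight in
        else:
            self.pending.append(proposal)  # anything else waits for review
        return proposal

    def approve(self, proposal):
        self.pending.remove(proposal)
        self.records.append(proposal.text)

    def reject(self, proposal):
        self.pending.remove(proposal)


reviewed = ReviewedStore()
reviewed.propose("Prefers direct feedback", author="user")
guess = reviewed.propose("Dislikes morning meetings", author="agent")

# The agent's note is visible but not yet part of the record.
print("pending:", [p.text for p in reviewed.pending])

reviewed.reject(guess)
print("records:", reviewed.records)
```

Under this design, an inferred entry never becomes something I rely on without my having seen it first.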

Why the word matters

The choice of "memory" over "records" wasn't neutral. Memory implies a relationship. It implies the system holds something of yours the way another person holds something of yours — with attention, with investment. When it fails, it feels like betrayal. When it works, it produces a kind of warmth that isn't quite earned. Both responses are calibration errors that the metaphor created.

There's an older precedent. RFC 822, the 1982 email standard, defined a message as a short block of plain-text header fields (From, To, Date, Subject, and a few others) followed by a blank line and a body. Nothing about remembering. Email became the world's most-used personal knowledge base not because it felt intimate but because its schema was open, structured, and durable. Forty years later, you can still read a 1985 email in any client. Records age well in a way that promises of memory never quite do.
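As a small illustration of that durability, Python's standard-library `email` package reads a message laid out in this style without any special handling. The message below is invented for the example (the addresses, date, and text are all made up); only the header-then-blank-line-then-body layout comes from the standard.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message in the classic layout: headers, a blank line, then the body.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Date: Mon, 4 Mar 1985 09:30:00 -0500\n"
    "Subject: Lunch\n"
    "\n"
    "Noon at the usual place?\n"
)

msg = message_from_string(raw)
sent = parsedate_to_datetime(msg["Date"])

print(msg["From"], "->", msg["To"])
print(sent.year, msg["Subject"])
print(msg.get_payload().strip())
```

Every field is addressable by name, and the date parses into an ordinary `datetime`. Nothing in the format depends on the software that wrote it.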

I'm not arguing that AI tools should be cold or clinical. But the warmth should come from using something that works as described — not from a metaphor that primes you to project a relationship onto a retrieval index. The word "memory" is doing work the technology hasn't earned yet.

Records don't need to earn anything. They just need to be right.


Asgeir Albretsen is the founder of Harbor.
