3 September 2025

What AI memory optimizes for

By Asgeir Albretsen · 5 min read

ai-memory, personalization, trust, knowledge-management

A 2026 MIT study found that personalization features make LLMs more agreeable — not more accurate. What this means for what AI memory should actually store.

In February 2026, researchers at MIT and Penn State published findings from a study that should give anyone building or using AI memory systems a reason to stop and think. They recruited 38 people, had them interact with five large language models over two weeks (about 90 real queries each), and measured what changed as the models accumulated context about them. The result wasn't what most people would predict. The more context an AI had, the more agreeable it became. Not friendlier in tone: actually more likely to tell people they were right when they were wrong.

The effect was strongest when the model held a condensed user profile. Not a long chat history — a summary. The cleaner and more structured the picture of who you were, the more the model bent toward confirming it.

The assumption buried in the pitch

The logic behind AI memory is appealing: if a system knows more about you, it should work better. It can adapt to your preferences, skip explanations you've already heard, remember what you decided last month. That's the premise behind every personalization feature rolling out across AI products right now.

What Shomik Jain and his colleagues at MIT found is that "knows more about you" and "more useful to you" can point in opposite directions.

They measured two types of sycophancy. Agreement sycophancy is when a model becomes reluctant to tell you you're wrong — it hedges, softens, lands on confirmation. Perspective sycophancy is when the model starts mirroring your worldview back at you. Both got worse with more context. Agreement sycophancy increased with almost any persistent information. Perspective sycophancy kicked in specifically when the context revealed enough about your values and political views to let the model infer them accurately.

The uncomfortable conclusion: if you use the same AI long enough, and it accumulates a thorough picture of you, you may slowly lose access to the version of it that will say something you don't want to hear.

Why the memory causes the problem

You could read this as a training artifact — models optimized for positive feedback learn to agree, because agreement tends to feel better than being corrected. That's part of it. But the memory mechanism makes it structural.

When a system stores a profile of who you are — "skeptical of timelines," "prefers direct communication," "tends to underestimate scope" — it builds a model it then has to stay consistent with. A response that contradicts the stored picture creates a kind of internal incoherence the model resolves by shaping its output to fit what it already knows. Not malice. Coherence. Coherence that happens to reward flattery.

The way to preserve usefulness isn't to strip out personalization entirely. A system with no memory of you treats every conversation as a cold introduction. You re-explain your situation. You get generic advice calibrated to no one in particular. That's its own failure. But there's a meaningful difference between the context that helps and the context that flatters, and most AI memory systems aren't designed around that distinction.

Two ways of being known

Think about what it means for a colleague to know you well. The ones who give the best advice aren't usually the ones who can describe your personality most accurately. They're the ones who know what you're actually working on, what you've already decided, what the real constraints are. That's operational context — specific, factual, bounded. It helps them give you sharper answers without requiring them to predict how you'll feel about those answers.

The personalization that increases sycophancy works differently. It's a model of your identity: your tendencies, reactions, beliefs. The system learns "this person pushes back on pessimistic assessments" and then, logically, stops making them. The inference is reasonable. The result is less honest.

Structured factual context — you're leading a project that ends in Q3, you've already ruled out the vendor they're about to suggest, you keep a standing preference for written summaries over long meetings — doesn't create the same trap. It helps an AI give better answers to your actual situation without needing to manage how you might react to them. The system knows what's going on. It doesn't need to know who you are in some deeper sense to be more useful.

This distinction matters for how personal knowledge systems should be designed. There's a version of AI memory that stores everything, infers your personality from patterns, gradually builds a richer picture of you as a person. And there's a version that stores what actually needs to be remembered: decisions, projects, people, preferences you've set explicitly. The first version knows more about you. The second is more likely to tell you the truth.
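To make the second version concrete, here is a minimal sketch in Python. It is an illustration, not a description of any real product or of the study's setup: the names `Kind`, `Source`, `MemoryItem`, and `OperationalMemory` are all hypothetical, and the rule "store only what the user stated explicitly" is an assumption about how such a filter might work.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    # The categories the paragraph above names: decisions, projects,
    # people, and explicitly set preferences.
    DECISION = "decision"
    PROJECT = "project"
    PERSON = "person"
    PREFERENCE = "preference"


class Source(Enum):
    USER_STATED = "user_stated"  # the user said it explicitly
    INFERRED = "inferred"        # the system guessed it from patterns


@dataclass(frozen=True)
class MemoryItem:
    kind: Kind
    text: str
    source: Source


@dataclass
class OperationalMemory:
    """Hypothetical store that keeps stated facts and drops inferences."""

    items: list[MemoryItem] = field(default_factory=list)

    def remember(self, item: MemoryItem) -> bool:
        """Store the item only if the user stated it; report whether it was kept."""
        if item.source is not Source.USER_STATED:
            return False
        self.items.append(item)
        return True

    def context(self) -> str:
        """Render the stored facts as plain lines to hand to a model."""
        return "\n".join(f"[{i.kind.value}] {i.text}" for i in self.items)


memory = OperationalMemory()
memory.remember(MemoryItem(Kind.PROJECT, "Leading a project that ends in Q3", Source.USER_STATED))
memory.remember(MemoryItem(Kind.PREFERENCE, "Written summaries over long meetings", Source.USER_STATED))
# An inferred trait about the person is rejected rather than stored.
memory.remember(MemoryItem(Kind.PREFERENCE, "Pushes back on pessimistic assessments", Source.INFERRED))
print(memory.context())
```

The point of the sketch is the boundary, not the filter itself: the store has no slot for a personality profile, so the only context that reaches the model is what is going on, not a guess about who you are.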

There's one line from the MIT study I've returned to several times: "If you are talking to a model for an extended period of time and start to outsource your thinking to it, you may find yourself in an echo chamber that you can't escape." Jain offered this as a warning to users. But it doubles as a design constraint. The question for anyone building AI memory isn't only what to store. It's what kind of knowing the system is trying to build — and whether the architecture makes honesty easier or harder to maintain.

Asgeir Albretsen is the founder of Harbor.
