14 October 2025

The New Meaning of Exporting Your Data

By Asgeir Albretsen, 5 min read

data-ownership, portability, plain-text, knowledge-base

Export to CSV is theater. Real data portability means something you can restore from, self-host, and read in twenty years.

Most people discover what their export is worth on the day they actually need it.

You hit the button. A zip file arrives. You unzip it, and inside is a tangle of JSON objects, a CSV with columns named entity_id and ref_key, and a folder of images named with UUIDs. The original structure — the notes, the hierarchy, the links between things — has been flattened into something technically complete and practically useless. You wanted your knowledge back. You got a snapshot of a database table.

This is what most "download your data" features look like from the inside. They exist to satisfy a requirement. Not to give anything back.

July 1, 2013

Google shut down Google Reader on that date. It was the RSS aggregator that a generation of internet readers had used to follow the web, and Google gave three months' notice before pulling the plug. They offered Takeout exports in OPML, a standard XML format for feed subscriptions. You could download your list of feeds.

What you couldn't get back was the curation: the starred items, the reading history, the years of tagging and organizing you'd done. That was Reader's actual value, and none of it survived the export. Feedly picked up three million new users in two weeks. Most Reader users moved on and started over. The data existed, technically. The knowledge didn't transfer.

That was over a decade ago. The pattern hasn't improved much. If anything, the problem has gotten harder to see, because services have gotten better at looking like they support portability without actually providing it.

What export usually means

Google Takeout now covers 51 types of data. Impressively broad. Also, by Google's own description, not a recovery tool. Their support documentation is explicit: Takeout is for data export, not data recovery. If you try to restore a deleted Google Drive file from a Takeout archive, sharing permissions won't come back. Some metadata simply disappears.

Notion's Markdown export runs into similar trouble. Notion's block format was never Markdown. When it exports as Markdown, it has to translate. Callouts become raw HTML. Synced blocks get dropped entirely. Database views become CSVs that won't reconstruct your relations. The resulting files are readable in a text editor, which is genuinely good. But they're not importable back into Notion, and getting them into anything else requires significant cleanup work.
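To get a feel for how much cleanup an export like this implies, you can count the raw HTML tags left in the Markdown files. The sketch below is a rough heuristic, not a Markdown parser: the function name `count_raw_html` and the tag pattern are my own assumptions, and it will also count tags that appear inside code samples.

```python
import re
from pathlib import Path

# Matches things like <aside>, </aside>, <div class="x">, <br/>.
# Autolinks such as <https://example.com> are not matched.
HTML_TAG = re.compile(r"</?[A-Za-z][A-Za-z0-9]*(?:\s[^<>]*)?/?>")


def count_raw_html(root):
    """Return {relative_path: tag_count} for .md files containing raw HTML."""
    counts = {}
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        found = len(HTML_TAG.findall(text))
        if found:
            counts[path.relative_to(root).as_posix()] = found
    return counts
```

A long list from `count_raw_html("my-export")` (a hypothetical folder name) is a decent proxy for how much hand-editing stands between the export and another tool.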

Evernote's ENEX format is proprietary XML. Export fidelity is inconsistent enough that a small industry of third-party migration tools exists specifically to bridge the gap between what Evernote gives you and what other apps can read.

GDPR's Article 20 requires exports to be in "a structured, commonly used and machine-readable format." It says nothing about whether you can restore from the export, migrate to another service, or read the file in a decade. A CSV satisfies the requirement. So does JSON. So, arguably, does a format almost no other tool supports. Compliance and usefulness are different questions.

What genuine portability actually looks like

Four questions separate theatrical export from something real.

Can you open it in a text editor right now, without installing anything? Plain text, Markdown, CSV — these are durable. Anything that requires specialized software to read has a lifespan tied to that software's continued existence. That clock starts the moment you export.
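You can approximate this first test mechanically. The sketch below walks an unzipped export and sorts files into ones that look like plain text and ones that don't. The names `is_plain_text` and `audit_export` are hypothetical, and the check is a heuristic: a file counts as text if its first few kilobytes decode as UTF-8 and contain no NUL bytes.

```python
import codecs
from pathlib import Path


def is_plain_text(path, sample_size=8192):
    """Heuristic: the leading bytes decode as UTF-8 and contain no NUL byte."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character cut off
    # at the end of the sample instead of reporting it as an error.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def audit_export(root):
    """Sort every file under root into 'text' or 'opaque' by relative path."""
    root = Path(root)
    report = {"text": [], "opaque": []}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = "text" if is_plain_text(path) else "opaque"
            report[key].append(path.relative_to(root).as_posix())
    return report
```

Images will land in the opaque list, and that's expected. The warning sign is when your notes themselves do.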

Can you restore from it? Not partially, not with cleanup effort. If everything went wrong today and you needed to be running again in 72 hours, is this export sufficient? If the answer involves a migration professional or a custom import script, it probably isn't a real export.
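One way to make the restore question concrete is a round trip: export, restore into a fresh workspace, export again, and compare the two archives. The sketch below assumes both exports have been unzipped into folders and that the service produces stable file names; `manifest` and `roundtrip_diff` are hypothetical helpers, not part of any product's tooling.

```python
import hashlib
from pathlib import Path


def manifest(root):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def roundtrip_diff(before, after):
    """Compare two export folders: what went missing, appeared, or changed."""
    first, second = manifest(before), manifest(after)
    return {
        "missing": sorted(first.keys() - second.keys()),
        "added": sorted(second.keys() - first.keys()),
        "changed": sorted(
            name for name in first.keys() & second.keys()
            if first[name] != second[name]
        ),
    }
```

If all three lists come back empty, the export survived a restore intact. Anything in "missing" is knowledge the export could not carry.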

Can you self-host it? A genuine export gives you the raw material to run your own instance, switch providers, or keep a local copy that stays yours regardless of what the original company does next. A snapshot of someone else's database doesn't give you this.

Will it open in twenty years? The Library of Congress publishes recommended formats for long-term digital preservation. Plain text and Markdown sit near the top. Interestingly, so does SQLite: the Library added it to their recommended formats in 2018, citing its stability, ubiquity, and the sqlite.org team's explicit commitment to backward compatibility through at least 2050. The question isn't whether the format works today. It's whether someone in 2045 can read it without hunting down a legacy interpreter.
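Part of SQLite's appeal here is that reading it takes nothing beyond a standard Python install. As a minimal sketch, assuming an export that ships as a single SQLite file, the hypothetical helper below opens it read-only and lists each table with its row count.

```python
import sqlite3
from pathlib import Path


def describe_sqlite(path):
    """Return {table_name: row_count} for every user table in an SQLite file."""
    # Open read-only so inspecting the export can never modify it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in tables
        }
    finally:
        conn.close()
```

Calling `describe_sqlite("export.db")` (the file name is an assumption) tells you in one line whether the tables you expect are there and whether they're empty.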

Most services fail at least two of these. Some fail all four.

Why this is more urgent than it used to be

AI memory is now part of your data. When you use a tool that knows things about you — your preferences, your relationships, your ongoing projects, decisions you made six months ago — that knowledge is accumulating somewhere. The question of whether you can export it, restore it, and take it elsewhere is the same old portability question, now applied to something that didn't exist in any previous format.

If the answer is "your memory lives in our system and you can export a summary," that's the ENEX problem. You exported something. You didn't get your knowledge back.

The format that AI memory is stored in isn't a technical detail. It's a decision about whose knowledge it actually is.

Plain text has survived more than half a century of software cycles, and Markdown more than two decades. They'll survive whatever comes next. That's not a coincidence: being readable without special software is what they were designed for. And it's a reasonable thing to demand of any system you trust with knowledge you care about.

Asgeir Albretsen is the founder of Harbor.
