Patrick White

The Text Distance

Why AI is eating software first, law second, and surgery last

In a recent conversation with Dwarkesh Patel, Dario Amodei made the case that we're "near the end of the exponential"—that a "country of geniuses in a data center" is one to three years away. Dwarkesh pushed back, and his skepticism centered on a specific example: his video editors.

The argument went like this. His editors learn his preferences over months. They build up context about his audience, his taste, the trade-offs he cares about. When will AI be able to do that? And until it can, how much of the economy can it really transform?

It's a reasonable question. But it mistakes what the bottleneck actually is.

* * *

Here's a thing I notice that most people talking about AI don't seem to notice: the domains where AI is already transformative are the domains that were already text.

Software engineering is going first. This isn't because code is simple. It isn't because SWE is the least skilled knowledge work. It's because code is text in, text out. The codebase is text. The requirements are text. The output is text. The verification is text. Everything about the domain—including the artifacts, the tools, and the feedback loops—lives in the native medium of a linguistic intelligence.

An LLM writing code isn't translating. It's operating at home.

* * *

In Browser Agents Aren't the Future, I argued that forcing an AI through a visual interface is like making a geometer use surveying instruments. The core insight was about what a large language model is: a fundamentally linguistic intelligence. It emerged from text. It thinks in text. Its native medium is language. And programming—the linguistic formalization of action—is the closest thing to operating in that native medium while doing real work in the world.

What I didn't fully draw out in that piece is the corollary: every domain has a text distance. And that distance predicts, with surprising accuracy, how fast AI will transform it.

The speed at which AI transforms a domain is a function of how close that domain already is to text.

Think about it as a spectrum. At one end, you have domains that are almost entirely textual—the artifacts are text, the reasoning is text, the output is text. At the other end, you have domains that are fundamentally embodied—spatial, physical, requiring hands and eyes and presence.

Code is pure text. Already being transformed. Claude Code, Cursor, Copilot—these aren't early experiments. They're the new default. Engineers at Anthropic don't write code anymore; they direct it. Dario says this in the interview: "We have engineers who don't write any code."

Law is next, and it's already happening. Legal research, brief writing, contract analysis—these are text-native tasks. Real lawyers are already reporting that GPT-5.2 Pro writes better research memos than their junior associates. The intelligence is there. What's missing is the interface—nobody has built the Claude Code of law. The work still happens in Word, which means lawyers are copy-pasting between a chatbot and a document editor, which is the 2026 equivalent of printing out your emails to read them. The moment someone builds a tool that lets AI operate directly on legal documents the way Claude Code operates on a codebase, law goes the same direction as software engineering. Not because the AI got smarter. Because the interface caught up with the intelligence.

Finance and accounting are close behind—structured data plus text, clear verification criteria, well-defined rules. Auditing is already being eaten.

Medicine splits in half. The diagnostic and research side is deeply textual—papers, notes, lab results, imaging reports. AI is already better than most physicians at differential diagnosis given a text description of symptoms. But the practice of medicine—the physical exam, the bedside manner, the procedural skill—is embodied. The text distance of "figure out what's wrong" is short. The text distance of "perform the surgery" is long.

Video editing—Dwarkesh's example—is revealing because it seems embodied but mostly isn't. The actual intellectual work of editing—what makes a good clip, what the audience responds to, what the pacing should be—is judgment that can be fully expressed in text. We demonstrated this accidentally while discussing the Dario interview: I pulled the transcript, identified the key sections, found the most interesting exchanges. That is the editor's core cognitive work. The part that requires a visual interface—the timeline, the cuts, the transitions—is just tool manipulation. Give the AI an editing API instead of a GUI, and the bottleneck evaporates.
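To make "an editing API instead of a GUI" concrete, here is a minimal sketch of what such an interface could look like. Everything in it is hypothetical: the `Clip` type, the `to_edl` helper, and the edit-decision-list line format are illustrations of the idea that cut selections can be pure text, not any real editor's API.

```python
from dataclasses import dataclass

@dataclass
class Clip:
    start: float  # seconds into the source video
    end: float
    label: str

def to_edl(clips):
    """Render clip selections as a minimal edit decision list:
    one plain-text line per cut, which downstream tooling could apply."""
    lines = []
    for i, c in enumerate(sorted(clips, key=lambda c: c.start), 1):
        lines.append(f"{i:03d}  {c.start:08.2f} -> {c.end:08.2f}  {c.label}")
    return "\n".join(lines)

# A model that has read the transcript only needs to emit
# (start, end, label) triples -- text in, text out.
selections = [
    Clip(912.0, 958.5, "Dario on engineers who don't write code"),
    Clip(131.0, 170.0, "Dwarkesh's video-editor objection"),
]
print(to_edl(selections))
```

The point of the sketch is the shape of the interface, not the details: once the cuts are a text artifact, selecting them is exactly the transcript-reading work described above, and applying them is a tooling problem.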

* * *

This reframes the whole "diffusion" debate from the interview. Dario talks about economic diffusion as a speed limit—fast, but not infinitely fast. Dwarkesh treats the gap between AI capability and economic impact as evidence that AI isn't as capable as claimed. They're both wrong, or at least imprecise.

What they're calling "diffusion" is largely text distance. It's not that enterprises are slow to adopt. It's not that the models lack intelligence. It's that most domains don't yet have the text-native interface that lets AI operate in its native medium.

Coding had this interface from day one. The terminal is already text-native. The IDE is already text-native. So AI walked in and got to work. Law will get there when someone builds the legal equivalent of Claude Code. Medicine will get there—at least on the diagnostic side—when patient records and medical knowledge have a text-native workflow around them. Every domain will get there eventually, because...

Building text-native interfaces is itself a coding problem. Which is the domain AI is already best at.

This is the recursive move that makes the whole thing accelerate. The better AI gets at code, the faster it can build the bridges into every other text-adjacent field. Every API built, every structured data format created, every workflow tool that lets AI operate beneath the GUI instead of through it—each one shortens the text distance for another domain.

We're not waiting for AI to get smarter. We're waiting for the world to become more legible to a linguistic intelligence. And AI is building the legibility itself.

* * *

Dwarkesh's video editor example is actually the perfect illustration of the confusion. He sees that AI can't sit down at Final Cut Pro and edit his videos the way a human editor does after six months on the job. He concludes something about the intelligence. But what he's really observing is text distance—the gap between what the AI can understand and what it can directly act on.

Give the AI the transcript and ask it to identify the best clips? It can do that now. It could do that two years ago. Ask it to learn Dwarkesh's specific preferences from ten examples of clips he loved and ten he cut? That's in-context learning—exactly the thing Dario says already works well at a million tokens. Ask it to make the actual cuts in a timeline? That's the bottleneck. Not intelligence. Interface.

The brilliant new hire can't do their job because nobody gave them a laptop.

* * *

If text distance is the real variable, then the question everyone should be asking isn't "when will AI be smart enough?" It's "how fast can we close the text distance for each domain?"

And the answer is: as fast as we can write the code. Which is getting very fast indeed.