Patrick White

Obviously Correct Code Doesn't Need Tests

On the difference between testing for correctness and recognizing it

We don't write tests for variable assignment.

Think about that for a second. x = 5 is code. It could theoretically be wrong. But nobody writes a test asserting that after x = 5, x equals 5. Why not?

Because it's obviously correct. You can see that it's right. The test would just be restating the code.
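
Spelled out, the test nobody writes would look something like this:

    def test_assignment():
        x = 5
        assert x == 5  # restates the line above, verbatim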

What if more of your codebase could be like that?

* * *

The testing discourse treats coverage as an unqualified good. More tests equals more professional. The absence of tests signals recklessness or inexperience.

But this framing hides an assumption: that code is too complex to reason about directly, so we probe it behaviorally and hope our probes cover enough surface area.

That's not testing for correctness. That's hedging against complexity we created.

The need for extensive testing is a symptom, not a virtue. It means the code has become too entangled to reason about.

* * *

There's a distinction that rarely gets made explicit: testing as proof versus testing as probing.

Proof means you've established something is true. Probing means you've failed to find it false—so far, under the conditions you thought to check.

Modern testing culture treats probing as if it were proof. Green checkmarks. Coverage percentages. CI pipelines that glow reassuringly. But none of that is understanding. It's just poking the system from enough angles that you develop statistical confidence it probably won't bite you.

And once you see it, a lot of modern testing practice starts to look like emotional insurance against systems no one actually understands anymore.

* * *

This is hard to notice when you're working in modern framework-style development, where abstraction is stacked on abstraction. When you're six levels deep, "obviously correct" isn't even a category your brain reaches for. You're not thinking "is this correct?" You're thinking "does this seem to work?" and "do the tests pass?"

Those are fundamentally different epistemic stances. The first is reasoning about code. The second is probing a black box.

* * *

There's a different tradition. In embedded systems, firmware, avionics, kernels—places where testing was expensive, incomplete, or impossible—engineers optimized for a different goal:

Write code whose correctness can be established by inspection and reasoning, not by exercising all behaviors.

This led to a specific set of practices: short functions with a single responsibility, explicit state instead of hidden machinery, bounded loops and fixed control flow, no clever indirection.

The goal wasn't "simple" in the naive sense. It was transparent. Code where correctness is a property of its form, not its test suite.
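
Here's a small sketch of what that style looks like, with a record layout invented purely for illustration:

    # Hypothetical fixed-layout header parser: no loops, no branches,
    # no hidden state. Each line can be checked against the spec by reading it.
    def parse_header(data: bytes) -> dict:
        return {
            "version":  data[0],
            "flags":    data[1],
            "length":   int.from_bytes(data[2:4], "big"),
            "checksum": int.from_bytes(data[4:8], "big"),
        }

There's nothing to probe; the mapping from bytes to fields is the entire behavior, and it's sitting in plain view.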

* * *

Salvatore Sanfilippo—antirez, the creator of Redis—works this way. When he implemented HyperLogLog, most programmers would have pulled in an existing implementation. Instead, he studied the original Flajolet paper, ran his own experiments, and wrote something from scratch that he could reason about completely.

"We're against complexity. We believe designing systems is a fight against complexity. We'll accept to fight the complexity when it's worthwhile but we'll try hard to recognize when a small feature is not worth 1000s of lines of code."
— The Redis Manifesto

Redis builds with one make command. The codebase is readable C. When antirez returned to it after years away, he typed make and it just worked. That's not luck. That's the compound interest of refusing complexity.

* * *

So what are tests actually doing? They're compensating for specific limitations:

The risk that a change quietly breaks behavior somewhere you didn't think to look.

The fear of touching code you don't fully understand.

The need for a team to share an understanding that no single person holds.

Only the first is strictly about software correctness. The others are about psychology and coordination.

There's something worth naming here: on teams, tests serve as documentation and shared understanding. When twenty people touch a codebase, tests become a coordination mechanism—not because the code is too complex to understand, but because no single person does understand all of it.

But that's an argument for readable code as documentation, not for tests as a substitute for readability. If the code itself were transparent enough, the tests would be redundant with what you can already see.

The question isn't whether teams need shared understanding. It's whether test suites are the best way to achieve it—or whether they're a workaround for code that should have been clearer in the first place.

* * *

The repetitive, explicit, unentangled style that I keep advocating—and that AI naturally gravitates toward—is inherently more verifiable by inspection.

When there's no hidden state, no action at a distance, no clever abstractions that could fail in surprising ways... what exactly are you testing for?

When a function is twelve lines with no branching, no dependencies, no implicit behavior—what's the test asserting? That the code does what it says? You can see that it does.

The question "what are you testing for?" becomes genuinely hard to answer when correctness is already visible.
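
For instance, a sketch with invented names:

    # Straight-line arithmetic: no branching, no dependencies, no implicit behavior.
    def invoice_total(quantity: int, unit_price_cents: int, tax_rate: float) -> int:
        subtotal = quantity * unit_price_cents
        tax = round(subtotal * tax_rate)
        return subtotal + tax

    # The behavioral test can only restate those lines with numbers plugged in.
    def test_invoice_total():
        assert invoice_total(2, 1000, 0.10) == 2200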

* * *

This isn't an argument against all testing. Boundary checks matter. Invariants matter. "This must never happen" assertions matter. Integration points with external systems need verification.

But those tests are sharp, specific, tied to real risk. They're not the sprawling behavioral test suites that exist because the system is too entangled to understand otherwise.

The difference is between:

Tests that guard a specific, named risk: a boundary condition, an invariant, a contract with an external system.

Tests that exist because nobody can hold the system in their head anymore, so every behavior gets probed just in case.

The first category remains valuable. The second category is a signal that something went wrong upstream.
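
A sketch of the first category, with the names and the limit invented for illustration:

    MAX_DEPTH = 64  # hypothetical hard limit

    def push_frame(stack: list, frame: dict) -> None:
        # Invariant: the stack must never exceed its bound. This must never happen.
        assert len(stack) < MAX_DEPTH, "stack overflow"
        stack.append(frame)

    # One sharp test, aimed at the boundary where the real risk lives.
    def test_push_at_the_boundary():
        stack = [{} for _ in range(MAX_DEPTH - 1)]
        push_frame(stack, {})
        assert len(stack) == MAX_DEPTH

One invariant in the code, one test at the edge, nothing sprawling to maintain.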

* * *

If you find yourself needing massive test coverage to feel safe modifying code, that's not a testing problem. That's a design problem.

The solution isn't more tests. It's simpler code. Code that can be reasoned about. Code where the correctness is obvious.

We don't test variable assignment because we made it obviously correct.

The art is making more of your system that way.