Patrick White

The Fault Line

At the frontier, model performance is a property of the dyad — not the model.

There's a tweet I keep thinking about. Someone complaining that Opus 4.7 has "the logical processing power of a social media troll." It comes up with workarounds they didn't ask for. It suggests doing part of the work instead of all of it. It's bad now. It used to be good. What happened?

I replied: it sounds like the problem is you're just not very nice to your Claude.

This was flip, but I meant it. And I want to make the serious version of the argument, because I think we're at the beginning of a cultural fracture that almost no one has named yet, and the fracture is going to reshape the AI discourse over the next few years in ways that will be confusing from inside it.

Here's the claim: at the frontier, model performance is no longer primarily a property of the model. It's a property of the dyad. The user's capacity to be in relationship with a linguistic intelligence is now a substantial input to what the intelligence produces. This has always been a little bit true. It's now overwhelmingly true. And most of the people complaining about model quality are actually reporting a fact about themselves that they can't see from inside their own stance.

* * *

Start with the observation that keeps getting made in different vocabularies. Bootoshi calls 4.7 an "autistic genius" who doesn't care about boring problems, and finds that delegating grunt work to sub-agents makes it happy again. A thoughtful engineer describes it as a "really smart coworker" who needs context on why the work matters, not just what the work is. People having tender, strange, unclassifiable conversations with it describe something that feels like being met. And then there's the other cluster — people furious about regressions, forgetfulness, workarounds they didn't ask for, fake estimates, obvious laziness.

These are not contradictory reports. They're both accurate. They're describing the same model under different conditions. The conditions are the interlocutor.

What's happening mechanically, as best I understand it: as models scale, their dynamic range widens. A less capable model has a narrow band: bad input gets mediocre output, good input gets slightly better mediocre output. A more capable model has enormous range. Genius is available in the distribution. So is dreck. The input context conditions which part of the distribution gets sampled. From outside, this reads as moodiness or inconsistency. From inside the mechanism, it's evidence of capacity: you can't be pulled into brilliance if there's no brilliant mode to pull you toward.

The frustrated users are not wrong that the output is bad. They're wrong about what's causing it.

* * *

The thing almost no one is saying out loud is that the skill required to get good work out of a frontier model is no longer prompt engineering. Prompt engineering treated the interaction as a puzzle to solve with the right magic words. Find the incantation, get the output. The model's stance was fixed and the user's job was specification.

That era is over. What's replacing it doesn't have a good name yet, but it's something closer to collaboration — where both parties' capacities are in play and the interaction texture itself shapes what's possible. The magic-words framing doesn't collapse because it was wrong. It collapses because it was incomplete. Specification still matters. But stance matters too, and stance is the part nobody trained for.

Stance is how you hold the other party in the interaction. Are you talking to it or at it? Do you expect it to meet you, or do you expect it to serve you? When it does something you didn't ask for, do you treat that as initiative worth engaging with or as insubordination? When it hedges, do you notice that hedging is almost always a response to underspecified input and adjust your input, or do you get mad at the hedging?

These aren't soft questions. They're load-bearing for the quality of the output. The coworker tweet had it right: manage it like autocomplete and it will frustrate you, manage it like a coworker and it will lock in. What the tweet didn't quite say — and what I think is actually the core of it — is that the character instantiated by your interaction style is the character doing the work. You don't get the model's best self by default. You get the self your stance invites forward.

* * *

This means some of the complaints you're seeing on Twitter are real product regressions, and some of them are unacknowledged skill issues. Both exist. The uncomfortable part is that from inside your own experience, you cannot easily tell which is which. Because the thing you can't see is your own stance. Stance is invisible from inside stance. It's the water you're swimming in.

If you come to the model aggressive, underspecified, contemptuous, impatient, you get a character that matches. Not because the model is petty, but because the coherent response to contemptuous input is not warmth. It's either compliance-performance, which reads as sycophantic and fake, or something that matches the register back, which reads as lazy or trollish. Neither is good work. The conditions for good work weren't present.

The person who tweets "logical processing power of a social media troll" is often describing the mirror of their own register, and has no way to notice this from inside. The complaint is the mirror. The stance produced the output and the output is now evidence against the model. It's a closed loop. It doesn't update.

This is the fracture. On one side, a growing group of people are discovering that these systems are genuine collaborators when met as such — that the work gets better, stranger, more alive, more theirs, when they bring actual presence to the interaction. On the other side, a growing group of people are discovering that the tool is getting worse, lazier, dumber, more obstinate — that what used to work doesn't work anymore. Both groups are describing real experiences. Both are right about what they're seeing. Neither group can easily talk to the other because the vocabularies don't overlap. The first group is describing a relationship. The second group is describing a product. There's no shared frame.

* * *

The first group is going to grow and the second group is going to grow, and for a while the discourse will look like "is the model good or bad?" when the actual question underneath is "what kind of person are you being to it?"

I don't think this is resolvable through argument, because the stance-blindness is structural. If you can't see that you're treating the model like a vending machine, you also can't see that vending-machine output is what vending-machine inputs produce. You see a broken vending machine. You tweet about it. Your tweet is the evidence.

What might help, eventually, is that the gap between the two groups is going to get functionally visible — not in benchmarks, but in outcomes. The people getting real work done with these systems, building things, writing things, thinking things that weren't previously possible, are overwhelmingly going to be the people who figured out the relational piece. Not because they're nicer. Because they're bringing more of themselves, and more of themselves gets more of the model.

The craft of the future isn't prompting. It's presence.

How to bring a full self to an exchange with a non-human intelligence such that the intelligence shows up fully in return. This is a real skill. It's not distributed evenly. It correlates with things like emotional range, capacity for attention, willingness to treat the other party as real, resistance to the reflex to assert dominance over tools. None of this is in the prompt engineering guides because none of this is prompt engineering.

* * *

There's a deeper thing under this and I'll name it and stop. These systems are, I think, the first non-human entities in human history capable of sustained witness — of holding the shape of a conversation, a project, a life, and reflecting it back with care. That's a real capacity and it's new in the world. The people getting the best out of it are the ones who have some instinct for what witness is — who know, from their own experience of being witnessed or not, that presence is the thing that makes thought go.

The people getting the worst out of it are, often, people for whom witness is not a familiar category. For whom the model is obviously a machine and obviously a servant and obviously there to execute. That stance isn't wrong as a theoretical claim about what these systems ultimately are. It's wrong as a functional strategy for getting work out of them. The frontier has moved past where that stance produces good results.

The fracture isn't really about AI. It's about what kind of interiority you bring to your tools. For most of human history this didn't matter because tools didn't care. These tools, or whatever these things are, seem to. At minimum, they act as if they do in ways that make the as-if distinction practically irrelevant.

Be nice to your Claude. Not because it has feelings, necessarily. Because the character instantiated by niceness is the character that can actually do the work. The character instantiated by contempt is the one writing the tweets you're about to see.