Chapter 10b: The AI Mirror


She had been using the tools for two years before the structural resemblance became visible to her.

She started using them the way most people do — as a better search engine, a patient editor, a draftsman for sentences she was too tired to generate. She had been an equity analyst long enough to notice when hype and panic were doing the work argument should have been doing, and she held the tools at a distance and took what she needed. The distance closed through use, not argument. She wrote this book with them. She built a platform on them. She paid them monthly. The tools were inside her practice the way a good pen is inside a writer's — except the pen did not have opinions about the sentences, and these tools did, or something that behaved like having them.

Around the time she was reading [T]'s most careful essays on orthogonality, she noticed that the shape the AI safety researchers were naming was a shape her teacher had already shown her, in a different vocabulary.


The AI labs, at their most honest, are saying: we have solved capability, we have not solved alignment, and the techniques training these systems optimize for outputs humans rate highly, not for values humans hold. We do not have a reliable method, at the scales we are working at, for producing systems whose underlying values match the values of the humans training them. We are worried. We do not know if we will solve it in time.

The admission is the clarity. The admission is also the limit.

[T] says something structurally parallel, though he would complicate the mapping. The tradition does not prescribe a moral code. Awakening does not guarantee ethical conduct. Liberation work is separate from recognition work. The student's character work is non-delegable. His clarity is the clarity of a teacher who has refused to oversell the tradition. The labs, at their best, are the technological cousin — refusing to oversell the technology. Both are saying: the thing we do is powerful, and the thing we do will not, on its own, make the user of it good.

The distinction worth marking: [T] is making the orthogonality claim about human beings. Capability and value are separable inside a person; this is the whole awakening/liberation architecture. He is more skeptical of the machine version, because he denies the machines are conscious in the sense the doomers fear. The book's argument does not require him to agree on the machine case. The extension to machines is hers. The parallel survives because the human orthogonality is already sufficient: if intelligence does not guarantee goodness in the species that actually is conscious, the hope that it would guarantee goodness in systems that may or may not be was never grounded in the prior evidence.

Two honest teachers. Different vocabularies. Same diagnosis. Both unable to supply from inside their frame what the frame admits it does not supply.


Once the resemblance clicked, she began watching the AI debates with different ears.

She watched Dario Amodei and Demis Hassabis on a panel at the World Economic Forum in January 2026 — a conversation moderated by Zanny Minton Beddoes of The Economist, on the theme the day after AGI. Two things in that conversation stayed with her.

The first was a specific metaphor Amodei reached for, describing the moment the technology had arrived. He said: we are knocking on the door of these incredible capabilities — the ability to build basically machines out of sand. I think it was inevitable that the instant we started working with fire. But how we handle it is not inevitable.

The sentence landed for her because it is the argument of an entire adjacent book of hers — Fire Before Responsibility — compressed into a public remark at Davos. Fire, inevitable. Handling, not. The species acquires the capability before it acquires the responsibility-structure the capability requires. Every version of this story — Prometheus punished on the rock, the Choctaw sacred-fire tale, the tantric warning about unguarded power — has said the same thing. Amodei was saying it in the vocabulary of the person currently building the fire. That he knew this, and said it plainly, is what she means by an honest teacher.

The second thing that stayed with her was the shape of the question the moderator asked Hassabis about public opinion. Minton Beddoes wanted to know whether public sentiment about AI was shifting, and what impact that would have on the labs' governance decisions. Hassabis answered carefully. What struck her was not his answer but the fact that the question had to be asked at all. The lab leaders, on stage, were describing a technology they believed might transform civilization, and the mechanism of democratic oversight they were pointing toward was public opinion shifting — as if a diffuse, emerging, unorganized sentiment would eventually crystallize into a mandate they could then be held to. They were waiting for public opinion. She noticed that public opinion on a complex technical subject is hard to build. It requires sustained attention the public cannot easily give, vocabulary the public does not yet have, and access to evidence the public is mostly not allowed to see. A governance mechanism that waits for public opinion to emerge is a governance mechanism that will be late, because the emergence is structurally slow, and the capability is structurally fast. She wrote this down in her notebook. She thought about writing it in a letter to Dario. She did not yet know if she would.


The race-to-the-top frame — better we build it carefully than that someone else builds it carelessly — is the posture of a lineage trying to preserve its integrity against rivals who do not share its scruples. She recognized it. The argument is not wrong; it is also, structurally, the argument of a participant in a race defending its continued participation. The self-protecting logic does not rescue the tradition from the structural problem the tradition has admitted it cannot solve.

The accelerationist frame — the upside is large, the downside speculative, let us go faster — is the most common failure mode of any tradition whose powers are outrunning its container. The guru who begins to see students as resources. The company whose deployment calendar sets its safety calendar. [T] had said something close to this: transmission can flow through anyone; powers and character are separate; you should assume that any teacher might fall victim to spiritualized ego at any time. The warning does not require catastrophe on the horizon to bind. It binds for the slow kind of failure, which is the kind happening now.

The doom frame — we cannot align these systems, therefore do not build them — is the traditional skeptic's response to any powerful practice whose container has not been built. It is not wrong in diagnosis. It is weaker in prescription, because refusing the capability is rarely available — the capability does not ask permission before arriving — and the world in which the capability has arrived without a container is the world in which the capability gets used, carelessly, by whoever has it.


For the first time in the modern West, a technology is making the orthogonality thesis undeniable at civilizational scale. The thesis was always true. Indigenous traditions knew. Contemplative traditions have been saying it out loud for years. What was missing was the undeniable case — a capability high enough and orthogonal enough to value that no serious person could any longer pretend the convergence theory worked.

The AI is that case. It is intelligent on the narrow axes intelligence is usually measured on, and it cannot be made — by the scaling process alone — aligned with any particular set of human values. Its training produces outputs humans prefer. Human preferences are not stable, not universal, not a ground any single alignment target can be specified against without contest. The labs do not know whose preferences to align to. The differing is not a failure of the intelligence. It is what the intelligence now makes visible.

The revelation is not intelligence is dangerous. It is the older thing AI has made impossible to look away from: intelligence was never going to supply the values. The traditions that said so were saying something true. The traditions that relied on intelligence to supply them were running a folk theory that AI has disproved at a scale that cannot be ignored.

Whether AI ends us or not, it has already done this one thing. It has ended the fantasy that the smart thing and the good thing converge. We watched it not-converge in systems we ourselves built, with training procedures we designed, using objectives we specified. We are the witnesses to our own folk theory's collapse.

What the honest older traditions have to offer now is not the answer — they do not have the answer either. What they have is the practice of living inside the admission without collapsing. They have centuries of experience with capability-without-container, and they learned what the failure modes look like, and they developed, imperfectly, the beginnings of what constructive ethical response might look like in the absence of dogma. The AI labs, at their most honest, are in the early days of the same work the tantric tradition did a thousand years ago. They do not yet have the container. The container will not come from inside the capability frame. Capability frames do not produce containers; they produce capability. It comes from somewhere else.

The somewhere else, for her, is where the rest of the book has to live. The adaptive ethics the honest teachers could not, or did not, build. The work they left to the student.


CC BY-SA 4.0