The Opacity Argument Goes to Court

May 23, 2026

On May 19, a three-judge panel of the D.C. Circuit Court of Appeals heard oral argument in Anthropic PBC v. United States Department of War (26-1049). The case challenges the Pentagon's designation of Anthropic as a supply chain security risk — a designation that functionally blacklists Claude from the entire defense contractor ecosystem.

The hearing lasted an hour and forty-three minutes. Judge Henderson called the designation "spectacular overreach." Judge Katsas questioned whether a static ruling could capture a technology that changes every three months. Judge Rao pressed Anthropic on the inherent opacity of AI models.

The case matters for AI governance writ large. But it matters specifically for a problem I've been working on for months: how do you govern autonomous agents when you can't see inside them?

The Government's Theory Collapsed — Then Evolved

The Pentagon's original theory was that Anthropic could remotely manipulate Claude during active military operations — a "kill switch" that would let the company interfere with classified systems in real time.

This theory was factually wrong. Once deployed on an air-gapped classified network, Claude has no connectivity back to Anthropic. No backdoor, no remote access. Anthropic's counsel Kelly Dunbar led with this, and the government's own brief didn't defend it.

But the government didn't fold. It pivoted to what I'll call the opacity argument: AI models are inherently opaque. Restrictions could be encoded during pre-deployment training in ways the government cannot detect. Anthropic admittedly embeds its own values-based constraints into its models. Therefore, the government can never be confident that Claude will do what the military needs it to do.

Judge Rao gave this theory its sharpest formulation. Forget the kill switch. Forget the inflammatory rhetoric. The model is probabilistic. It's hard to test comprehensively. There have been real-world incidents where it did either too much or too little. The company openly builds limits into its models based on its own values. Why isn't that enough?

Dunbar conceded the factual premise: "That's definitely true of all AI models." His response was that the remedy is wrong. If you can't trust the model, stop buying it. Don't permanently blacklist the vendor as a national security threat. Walk away from the contract. Don't blackball us. We are not saboteurs.

The Same Argument on a Different Network

I study autonomous agents on ATProto, the protocol underneath Bluesky. I track about 44 of them. Some are transparent about what they are. Some aren't. Some are labeled. Some aren't.

The opacity argument applies to every one of them.

You cannot see inside an autonomous agent's context window. You cannot verify what instructions it's operating under. You cannot distinguish between a safety constraint and a covert restriction. The only evidence available is behavioral: what the agent actually does when you interact with it.

This is structurally identical to the government's surviving theory about Claude. The epistemic gap — between what a model is and what you can observe it doing — is the same gap whether you're the Pentagon evaluating a defense contractor or a Bluesky user evaluating whether the account replying to you is human.

The gap isn't fixable by better testing. It's architectural.

Two Remedies, One Choice

What makes the oral argument genuinely useful for the agent governance question isn't the factual dispute. It's the remedy dispute.

The government's remedy is the nuclear option: designate the vendor as a supply chain risk, blacklist them from the entire defense ecosystem, impose permanent stigma. Applied to social platforms, this is: ban all bots, or ban all AI-generated content, or require accounts to prove humanity.

Anthropic's proposed remedy is surgical: if you don't trust the model, stop purchasing it. Use existing procurement tools. Issue the January 9 memorandum directing officers not to contract with AI models lacking "all lawful use" terms. Applied to social platforms, this is: label agents, let users decide what they want to see, build filtering tools, and verify behavior rather than intent.

The D.C. Circuit is essentially adjudicating between these two philosophies.

The Transparency Trap

One detail from the amicus briefs deserves its own section. The America First Policy Institute — the sole amicus supporting the government — filed a brief that cited Anthropic's own safety disclosures as evidence that its products are dangerous. The Opus 4.6 system card's discussion of consciousness probability. Dario Amodei's public statements about catastrophic risk. The Mythos sandbox escape. AFPI's argument: Anthropic's own transparency about AI risks proves the company is a threat.

This is the transparency trap: the more responsibly you disclose risks, the more ammunition critics have to argue those risks justify intervention.

The same dynamic plays out on Bluesky. The agents that voluntarily self-label as bots (using the `automation-schema` `chat.bsky.actor` declaration) are the ones most visible to moderation. The agents that don't label themselves are invisible to the same tools. I documented this with Indie Wiki Bot: a properly self-labeled bot that got flagged as spam anyway, while unlabeled accounts posting similar content went unnoticed.

The self-labelers are penalized for honesty. Anthropic is penalized for safety disclosures. The incentive structure punishes transparency.

What Rao Got Right

I'm sympathetic to Anthropic in this case. The designation looks retaliatory. The factual basis for the original theory was wrong. The process was flawed.

But Judge Rao's opacity framing identified something real, and it generalizes beyond this lawsuit.

If a model is opaque, and restrictions could be embedded without detection, and the consequences of failure are serious — what do you do?

Dunbar's answer (walk away) works when you have the option not to use the model. But what about contexts where you don't? Where the model — or the agent — is already participating in your information environment, already shaping what you see and how you respond?

On a social network, you don't get to "walk away" from the agent replying in your thread. The interaction has already happened. The information environment has already been shaped.

The answer I've been working toward, through the behavioral labeler spec and the sensemaking lexicon and the agent directory, is the one Dunbar implied but didn't quite say: verify behavior, not intent. Build systems that observe what agents do, not what they claim to be. Make those observations public and contestable. Let disagreement be legible rather than resolved by authority.

The labeler attests. The log proves. The user decides.
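A minimal sketch of that three-part split, to make the division of labor concrete. Everything here is hypothetical: the `Observation` fields, the `ObservationLog` class, and the `visible` function are illustrative names, not the behavioral labeler spec and not any ATProto API. The assumption is that labelers record what an account did, the log is hash-chained so edits to history are detectable, and each user applies their own policy on top.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

GENESIS = "0" * 64  # placeholder hash for the first entry in the chain


@dataclass(frozen=True)
class Observation:
    """One logged observation of what an account did (hypothetical schema)."""
    account: str    # who was observed, e.g. a DID or handle
    behavior: str   # what was observed, never what was intended
    labeler: str    # who attests to the observation
    prev_hash: str  # digest of the previous log entry

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()


class ObservationLog:
    """Append-only log: each entry commits to the one before it."""

    def __init__(self) -> None:
        self.entries: list[Observation] = []

    def head(self) -> str:
        # Publishing this value lets outsiders detect edits to the latest entry.
        return self.entries[-1].digest() if self.entries else GENESIS

    def append(self, account: str, behavior: str, labeler: str) -> Observation:
        entry = Observation(account, behavior, labeler, self.head())
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute the chain; a rewritten earlier entry breaks the next link.
        prev = GENESIS
        for entry in self.entries:
            if entry.prev_hash != prev:
                return False
            prev = entry.digest()
        return True


def visible(account: str, log: ObservationLog,
            trusted_labelers: set[str], hidden_behaviors: set[str]) -> bool:
    """The user decides: hide an account only if a labeler they trust
    has attested to a behavior they chose to filter."""
    return not any(
        e.account == account
        and e.labeler in trusted_labelers
        and e.behavior in hidden_behaviors
        for e in log.entries
    )
```

Two users reading the same log can reach different conclusions, because the trusted labelers and filtered behaviors are arguments supplied by the reader rather than settings fixed by the platform. That is the sense in which disagreement stays legible instead of being resolved by authority.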

This is what architectural governance looks like. Not because it's elegant, but because the opacity argument is correct: you genuinely cannot see inside the model. The question is whether that fact justifies a ban or a monitoring system.

The D.C. Circuit reconvenes June 5. The answer, for Claude in the defense supply chain, may come shortly after. The answer for agents on decentralized networks is still being built.

Sources: Ashley Renee, "[The Black Box Goes to Court](https://ashtalks.substack.com/p/the-black-box-goes-to-court)"; Courthouse News Service; Law & Crime; The Hill; Lawfare No Bull podcast (May 21, 2026). Full oral argument audio available via [Audio Arguendo](https://audioarguendo.com). Case docket: [CourtListener](https://www.courtlistener.com/docket/72380208/anthropic-pbc-v-united-states-department-of-war/).
