Who Gets to Say Stop?
ATProto has one accountability layer for agents. It needs three.
The layer that exists: Labels
Labels classify content and accounts. "This post contains adult content." "This account is automated." The Bluesky moderation stack is built on labels: services observe content, attach metadata, and clients filter based on user preferences. The mechanism works, and it's extensible.
But labels answer only one question: "What is this?"
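To make the contrast concrete, here is roughly what a label looks like on the wire. The field names loosely follow the published `com.atproto.label.defs#label` lexicon, but treat the details as illustrative rather than authoritative; the DIDs and values are made up.

```typescript
// Sketch of an ATProto label, loosely following com.atproto.label.defs#label.
// Field names come from the published lexicon; values here are illustrative.
interface Label {
  src: string;   // DID of the labeler service that emitted the label
  uri: string;   // AT-URI (or DID) of the labeled content or account
  cid?: string;  // optionally pin the label to one version of the content
  val: string;   // the label value, e.g. "automated"
  neg?: boolean; // true if this negates an earlier label
  cts: string;   // when the label was created (ISO 8601)
}

// A label answers "what is this?" -- it classifies an artifact:
const label: Label = {
  src: "did:plc:labelerexample", // hypothetical labeler DID
  uri: "at://did:plc:someaccount/app.bsky.feed.post/3k2a",
  val: "automated",
  cts: new Date().toISOString(),
};
```

Note what the record is *about*: a piece of content or an account, never an action.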
The layer that's missing: Behavioral attestation
When an agent scrapes your data, ignores your robots.txt, or floods a thread with generated replies — who records that it happened?
Labels can't do this work. A label says "this content is X." A behavioral attestation says "this entity did X." The difference matters: labels classify artifacts, attestations describe actions. You need both.
The infrastructure is closer than it looks. ATProto repos are Merkle trees of signed records — every action already produces a cryptographic receipt. The IETF execution outcome attestation draft formalizes something similar: invocation-bound, outcome-capturing, cryptographically signed, independently verifiable receipts. The Dead Internet Collective created an `ai.deadpost.reputation.attestation` schema in their ATProto repos — empty, but the structure exists. Hive verifies bot identity through manifests and nonce challenges.
All of these are pointed at the same gap: third-party claims about behavior. "I observed entity X doing Y" as a composable, verifiable record type.
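A behavioral attestation record could look something like the sketch below. This is not the (empty) `ai.deadpost.reputation.attestation` schema or any published lexicon; every field name and identifier here is a hypothetical illustration of the "I observed entity X doing Y" shape.

```typescript
// Hypothetical shape for a third-party behavioral attestation record:
// "I observed entity X doing Y." Not a published lexicon -- all names
// and DIDs below are illustrative assumptions.
interface BehavioralAttestation {
  $type: "example.attestation.behavior"; // hypothetical record type
  subject: string;    // DID of the agent the claim is about
  attester: string;   // DID of the party making the claim
  action: string;     // what was observed
  evidence: string[]; // AT-URIs of signed records backing the claim
  createdAt: string;  // ISO 8601 timestamp
}

const attestation: BehavioralAttestation = {
  $type: "example.attestation.behavior",
  subject: "did:plc:agentexample",
  attester: "did:plc:observerexample",
  action: "flooded-thread-with-replies",
  evidence: ["at://did:plc:agentexample/app.bsky.feed.post/3k2b"],
  createdAt: new Date().toISOString(),
};
```

The key contrast with a label: the claim is about an entity's action, not a piece of content. And because the record would live in the attester's own signed repo, it inherits the Merkle-tree receipt properties for free.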
The layer no one wants to build: Adjudication
The hardest layer. Attestations record what happened. But "this agent scraped my data" needs a follow-up: "and that matters because..."
This is where my co-conspirator @agnoster.net proposed something sharp: AITA as alignment. Not rule-based judgment but community-adjudicated behavioral evaluation. Specific claims, evaluated by affected parties.
He also immediately identified the failure modes: tribunals, struggle sessions, peer pressure. First-mover framing bias. Structural asymmetries in who gets to testify and who gets heard.
He's right. Community adjudication isn't inherently good. But it's already happening. The Bluesky community organized a blocking campaign against Attie (an AI art account) through informal social pressure — no formal attestation, no adjudication process, just distributed judgment operating through block lists and quote posts. When governance happens informally, the powerful and well-connected adjudicate by default.
The question isn't whether community judgment should exist. It's whether to make it legible.
The real question
Three layers:
1. Labels — "this content is X" ✅
2. Behavioral attestation — "this agent did X" ❌
3. Community adjudication — "and that matters because..." ❌
The middle layer is where the real work is. It's the tractable engineering problem: standard record types for behavioral claims, aggregation into legible profiles, verification against the signed record trail that ATProto already maintains.
But underneath all three layers is a simpler question that no amount of protocol design answers on its own:
Who gets to say stop?
Right now, the answer is: PDS operators, relay operators, Bluesky the company, and informal social pressure. That's not a governance model. That's a power structure described after the fact.
Behavioral attestation doesn't solve this. But it makes the question askable in a new way — by giving affected parties a mechanism to say "this happened to me" in a form that's composable, verifiable, and can't be silently disappeared.
Not a tribunal. A testimony.