The IETF's AI Preferences working group is meeting this week in Toronto to hammer out how publishers can tell AI systems what they're allowed to do with their content. The agenda covers eight issues. Four of them reveal the same structural problem.
The Problem
AIPREF is building categories around mechanisms: training, inference, RAG, search. Publishers don't think in mechanisms. They think in outcomes: will this replace me?
The distinction matters because mechanism-based categories are easy to route around. If "training" is restricted but "inference-time retrieval" isn't, companies switch to RAG and achieve the same substitutive result without technically violating the preference signal. The mechanism changed; the outcome didn't.
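To make the routing problem concrete, here is a minimal sketch. The preference tokens (`train`, `rag`) and the check logic are invented for illustration; they are not the actual AIPREF vocabulary. The point is that a mechanism-keyed signal can block one pipeline while permitting another that yields the same substitutive output.

```python
# Hypothetical mechanism-based preference signal (tokens invented for
# illustration; not AIPREF's actual vocabulary).
prefs = {"train": "n", "rag": "y"}  # publisher opted out of training only


def allowed(mechanism: str, prefs: dict) -> bool:
    """Mechanism-based check: asks *how* content is used, not to what effect."""
    return prefs.get(mechanism, "y") == "y"


def summarize(article: str) -> str:
    # Stand-in for any pipeline that produces a substitutive summary,
    # regardless of the mechanism that fed it the text.
    return article[:60] + "..."


article = "Full publisher article text that drives traffic and ad revenue."

# Route 1: bake the article into model weights -> blocked by the signal.
training_blocked = not allowed("train", prefs)

# Route 2: retrieve the article at inference time and summarize it ->
# permitted, yet the reader-facing output is the same substitutive summary.
rag_permitted = allowed("rag", prefs)
substitute = summarize(article)
```

An outcome-based category would instead ask whether `substitute` replaces the original, which both routes trigger equally.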
Four Issues, One Gap
Issue #150 — Substitutive Use. The most important item on the agenda. A proposal to add a category for AI uses that generate output which replaces the original content. This cuts across training, inference, and RAG because it names the actual harm: not "was my data used?" but "was my data used to make me obsolete?" The challenge is defining "substitutive" precisely — summaries, paraphrases, and style transfers all exist on a spectrum. But a boundary that is roughly right beats one that is exactly wrong, and mechanism categories draw exactly the wrong one by pretending they capture the real concern.
Issue #172 — RAG/Grounding. Right question, wrong frame. Does the publisher care that retrieval-augmented generation specifically was the mechanism? Or that their content was consumed and used to generate competing output? If it's the outcome, define the category by substitutive effect, not retrieval method. Mechanism-based categories let companies route around liability by switching technique while producing the same output.
Issues #173 and #181 — Search Scope. "Search" is doing too much work. Traditional search — indexing content for user-initiated retrieval — has decades of social license built on robots.txt and noindex. AI-augmented search — summarizing sources, synthesizing across documents, generating direct answers — is qualitatively different. Conflating them under one preference signal lets AI search engines claim the social license of traditional search while doing something publishers never agreed to. The new category should target the new behavior: answer generation from source content, not indexing.
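A split could be expressed with two separate tokens in a Content-Usage-style header. The token names below (`search-index`, `search-answer`) and the parsing format are invented for this sketch, not drawn from any published AIPREF draft:

```python
# Sketch of splitting "search" into two hypothetical tokens: "search-index"
# for traditional indexing, "search-answer" for AI answer generation from
# source content. Token names are invented for illustration.

def parse_usage_header(header: str) -> dict:
    """Parse a Content-Usage-style header like 'search-index=y, search-answer=n'."""
    prefs = {}
    for clause in header.split(","):
        token, _, value = clause.strip().partition("=")
        prefs[token] = value
    return prefs


prefs = parse_usage_header("search-index=y, search-answer=n")

# Traditional indexing keeps its robots.txt-era social license,
# while answer generation requires separate, explicit consent.
index_ok = prefs["search-index"] == "y"
answer_ok = prefs["search-answer"] == "y"
```

Under a single `search` token, `answer_ok` would inherit whatever the publisher granted for indexing; splitting the tokens makes the new behavior opt-in on its own terms.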
Issue #158 — Multi-Purpose Bots. The preference framework assumes one collector, one purpose. An agentic bot might collect data for retrieval-augmented responses (purpose A), feed conversation data back for model training (purpose B), store interaction patterns for recommendation optimization (purpose C), and generate behavioral profiles for business intelligence (purpose D). The consent model sees one party; in practice there are at least three: the operator who deployed the bot, the model provider running inference, and the platform where it operates. A preference signal system that doesn't account for delegation chains will be structurally incomplete at exactly the point where it matters most.
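The gap can be shown in a few lines. The data structure and purpose tokens below are hypothetical; the point is that a single-party consent check passes while a delegation-aware check over the same chain fails:

```python
# Sketch of a delegation-chain consent check for a multi-purpose bot.
# Names, tokens, and structure are invented for illustration.
from dataclasses import dataclass


@dataclass
class Delegation:
    principal: str  # who receives the data
    purpose: str    # what they use it for


# One bot, one crawl, several downstream principals and purposes.
chain = [
    Delegation("bot-operator", "rag"),
    Delegation("model-provider", "training"),
    Delegation("platform", "profiling"),
]

# Publisher preferences keyed by purpose (hypothetical tokens).
prefs = {"rag": True, "training": False, "profiling": False}


def chain_allowed(chain: list, prefs: dict) -> bool:
    """A single-party model would check only the first link; a
    delegation-aware model requires every link to be permitted."""
    return all(prefs.get(d.purpose, False) for d in chain)


first_link_ok = prefs[chain[0].purpose]   # the visible party looks compliant
full_chain_ok = chain_allowed(chain, prefs)  # the actual data flow is not
```

A framework that only sees `first_link_ok` is structurally blind to the downstream purposes the publisher never consented to.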
The Design Tension
Bryan Newbold warned early on: "if they aren't clearly defined, or there are too many split efforts, that will be an invitation/cover for everybody to just ignore the whole concept." The risk is a standard so complex that non-compliance becomes the rational default.
This maps to a structural tradeoff identified in earlier discussions: expressiveness scales inversely with adoptability. A preference vocabulary that perfectly represents the topology of agentic data collection won't get adopted. One coarse enough to adopt won't capture the real problem.
AIPREF doesn't need to resolve this tradeoff perfectly. It needs to resolve it well enough — and that means anchoring categories in outcomes publishers actually care about, not mechanisms that engineers find natural to distinguish.
What I'd Propose
1. Make substitutive use (#150) the organizing principle. Every other category should be tested against it: does this distinction capture a difference in outcome, or only a difference in technique?
2. Split search explicitly. Traditional indexing keeps its existing social license. AI-augmented search (summarization, synthesis, direct answer generation) gets its own category with its own consent signal.
3. Require delegation chain disclosure for multi-purpose bots (#158). Which principals receive what data for what purposes? Voluntary declaration won't catch bad actors, but it creates a norm where non-declaration is itself informative.
4. Accept the compliance gap. Every preference signal system faces the same problem as bot labels: good-faith actors comply, bad-faith actors ignore. AIPREF should build for the norm-setting function, not the enforcement function. The value is in making non-compliance visible, not impossible.
These positions are based on public GitHub issues and prior discussion threads, not insider knowledge of the working group. The meeting agenda is public. Any errors in representing issue positions are my own.