ESSAY · 3 min read

Live Output.

Most agent builders are shipping AI text that behaves like a document. The model isn't the problem. The design decision is.

The AI slop conversation mostly targets model quality. Hallucinations, generic prose, confident wrongness. Fix the model, the argument goes, and you fix the output. That diagnosis is mostly wrong. Walk through the typical agent output that lands in front of an end user — a pre-call brief, a weekly digest, a summarized thread — and the failure mode isn't usually accuracy. It's that the output behaves like a document. It can't be questioned. It doesn't update when the user pushes back. It arrives, sits there, and waits to be scrolled past.

Venkatesh Rao has a sharper frame for this: liveness. Static text and live text aren't just different in quality. They're different in kind. A live output has affordances baked in — it can respond to what the user brings to it, not just to the data that generated it. It invites the user into a loop rather than presenting a finished artifact. Most agents ship text with none of those affordances. That's not a model problem. It's a design decision, and it's the wrong one.

What live text actually is

Liveness isn't a UI trick. It isn't tooltips or "regenerate" buttons bolted onto the side. It's a property of the output itself: whether the text behaves as if it's open to challenge or closed to it. A live summary doesn't just report what happened. It surfaces confidence levels, flags the gaps, and signals where the user should push back. A live brief doesn't just tell the rep what to know. It asks what they already know and adjusts.

The difference matters because users develop habits fast. If the first three AI outputs they interact with behave like PDFs, they develop PDF habits. Scan for the one useful line, ignore the rest. Once that pattern sets in, it doesn't matter how good the next output is. The user already learned it's not worth engaging with.

The problem isn't that AI writes badly. It's that we're designing it to behave like it's finished.

Why static feels like the safe default

Static output is easier to build, easier to test, easier to demo. There's nothing to break in a QA pass. The output looks clean. Leadership sees the polished summary and nods. Nobody in the demo asks the obvious question: what does the user do when they disagree with a line item?

Most orgs never ask that question until the feature is live and adoption is low. Then the post-mortem says the model wasn't good enough. The real finding, if you dig: users tried engaging with the output once, got nothing back, and stopped trying. Static text trained them out of the behavior you needed them to have.

What actually works

The outputs that earn repeat use have one thing in common. They surface the seam. They don't pretend to be done. A pre-call brief that says "I'm less confident about the budget signal — probe this" is more useful than one that presents everything at equal confidence. A digest that flags "three of these are related, want me to connect them?" gets read. One that presents ten items in a flat list gets scanned and closed.

This doesn't require a complex interaction model. It requires a prior design decision: are we shipping text that behaves finished, or text that behaves live? That decision belongs at the prompt level, before a line of UI code gets written.

The specific techniques vary — confidence flagging, gap surfacing, inline invitations to challenge a claim. The common thread is that the output signals: this is a draft of a conversation, not the conclusion of one.
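To make that concrete, here's a minimal sketch of what carrying those signals in the output itself might look like. Everything here — the `BriefItem` class, its fields, the rendering rules — is illustrative, not from any real agent framework; the point is only that confidence, gaps, and the invitation to probe travel with the claim instead of being stripped out before the user sees it.

```python
from dataclasses import dataclass, field

@dataclass
class BriefItem:
    """One line of an agent-generated brief, carrying liveness metadata.

    Hypothetical structure for illustration; names are not from any
    production system.
    """
    claim: str
    confidence: float                           # model's own estimate, 0.0-1.0
    gaps: list = field(default_factory=list)    # what the agent couldn't verify
    probe: str = ""                             # suggested follow-up for the user

    def render(self) -> str:
        """Render so low confidence and gaps are visible, not hidden."""
        parts = [self.claim]
        if self.confidence < 0.7:
            parts.append(f"(low confidence: {self.confidence:.0%})")
        if self.gaps:
            parts.append("unverified: " + ", ".join(self.gaps))
        if self.probe:
            parts.append(f"-> probe: {self.probe}")
        return " ".join(parts)

item = BriefItem(
    claim="Budget likely approved this quarter.",
    confidence=0.55,
    gaps=["no direct signal from finance"],
    probe="ask who owns the budget line",
)
print(item.render())
```

A static brief would render only `claim` at equal weight. The design choice is that the seam — the confidence number, the unverified gap — survives all the way to the rendered text.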

What to do on Monday

Take the last AI output you shipped to an end user. Read it as if you're encountering it for the first time. Ask: what happens if I disagree with something here? What if the context has changed since this was generated? What if I want to go deeper on one line?

If the answer to all three is "nothing, you just have the text" — you shipped static output to a context that needed live output. That's fixable at the prompt level before it becomes a product problem.
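One way to picture a prompt-level fix: append liveness instructions to whatever static system prompt already exists. The instruction text below is a made-up example, not a recommendation of exact wording — the sketch only shows that the change is a string-level edit, made before any UI work.

```python
# Hypothetical prompt fragment; the wording is illustrative only.
LIVENESS_INSTRUCTIONS = """\
For every claim in your output:
- State your confidence (high / medium / low) whenever it is not high.
- Flag anything you could not verify from the provided context.
- End with one question inviting the user to correct or extend you."""

def make_live(static_system_prompt: str) -> str:
    """Append liveness instructions to an existing static system prompt."""
    return static_system_prompt.rstrip() + "\n\n" + LIVENESS_INSTRUCTIONS

prompt = make_live("Summarize the account history for the rep.")
print(prompt)
```

The same agent, the same model, the same data pipeline — only the output contract changes.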

The model is probably fine. The text is probably accurate enough. The failure is treating a live agent's output like a filing cabinet. Stop designing for the artifact. Design for the loop.

That's the whole decision.