The way design systems get consumed is changing. For most of their history, a human sat between the system and the output, reading, interpreting, making judgment calls. A developer who noticed a deprecated component and asked about it. A designer who recognised a token name that didn't look right and checked before using it. The system could be imprecise, inconsistently documented, full of gaps, and still mostly work, because the person consuming it brought enough context to fill them.

Agents don't do that. They read structure, infer intent, and keep moving. There's no pause to check whether a component is still the preferred one, no instinct that a token name looks unfamiliar. If the information isn't explicit in what they can see, they make a call and move on. The output looks fine either way, which is what makes the problem hard to spot.

Most design systems were never maintained with that consumer in mind. The practices we have, how we communicate changes, how we handle deprecation, how much we rely on people just knowing things, were designed for humans. They're starting to show the strain.

A different place in the stack

The term middleware means something specific in software. It's the layer that sits between systems, handling translation, routing, contract enforcement. It doesn't have opinions. It processes what it receives according to the rules it's been given, and passes the result on. If the rules are wrong, the output reflects that, and nothing in the layer itself knows the difference.

That's the position design systems now occupy for teams using agentic workflows. The system sits between human intent and generated output, and it's being consumed continuously, at speed, by processes that won't pause to question what they're reading. Every token reference, every component pulled into a generated build runs against your system's interface. Most systems aren't built to handle that.

I've written about how to structure a design system for agent consumption and how entropy gets amplified when AI tools start consuming an inconsistent one. What neither piece got into is what changes about the maintenance work itself – the versioning, the deprecation, the contracts – when the primary consumer stops being a human.

Some agent configurations can flag low-confidence decisions or ask for human input. But that's not the point. The value of agentic workflows is speed and autonomy. The moment you add a confirmation step for every ambiguous token reference, you've rebuilt the manual process with extra steps. In practice, teams either let the agent run or they don't use one. The middle ground doesn't survive contact with deadlines.

What middleware actually requires

Software projects that other systems depend on take versioning seriously. When you break a contract, downstream consumers break with you. They can't guess what you meant or infer intent from context. They need the contract to be stable, or they need to know explicitly that it's changed and how to migrate.

Design systems have tended to rely on informal agreements instead. We don't break things on purpose, but when we do, we rely on the community to notice. We communicate changes through team meetings and Slack threads. The language tends to be "we're moving away from this" with no timeline attached and no migration path. That worked when the primary consumer was a human who could read context clues and ask a follow-up question. It struggles when the consumer is an agent that just moves forward.

Version discipline, in a design system context, doesn't just mean a versioned npm package. It means a contract governing what the system actually promises, covering which tokens are stable, which components are in active development, which patterns are deprecated and when. The Design Tokens Community Group has been working on token schemas that start to address this at the primitive level, and the same discipline needs to apply further up the system. A design system has to be able to report what it is and what's changed. Most don't yet.
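As a sketch of what such a contract could look like, here is a minimal machine-readable manifest. All of the names here (SystemContract, ContractEntry, isSafe) are illustrative, not part of the DTCG work or any published standard; the point is only that status, migration target, and removal version become queryable data rather than tribal knowledge.

```typescript
// Hypothetical sketch of a machine-readable system contract.
// Names and shape are illustrative, not a published schema.
type Stability = "stable" | "experimental" | "deprecated";

interface ContractEntry {
  status: Stability;
  since: string;           // version at which this status took effect
  replacedBy?: string;     // migration target, expected when deprecated
  removalVersion?: string; // when a deprecated entry will be deleted
}

const manifest: Record<string, ContractEntry> = {
  "token.color-button-primary": { status: "stable", since: "3.0.0" },
  "component.ButtonLegacy": {
    status: "deprecated",
    since: "3.2.0",
    replacedBy: "component.Button",
    removalVersion: "4.0.0",
  },
};

// An agent, a lint rule, or a CI check can answer "is this safe to
// build with?" directly, instead of inferring it from usage frequency.
function isSafe(name: string): boolean {
  const entry = manifest[name];
  return entry !== undefined && entry.status !== "deprecated";
}
```

The same lookup that tells an agent not to reach for a deprecated component also gives your own tooling something to enforce in CI.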

Changelog rigour matters, but for humans. The more important question is where the contract actually lives. An agent building with your system is reading TypeScript interfaces, prop definitions, JSDoc annotations, and whatever lands in its context window from your codebase. It isn't opening your changelog. So the structured deprecation entry needs to exist where agents actually look, in the component file itself, as a @deprecated JSDoc tag with a migration path, or in the type definition as a branded type that forces a decision at compile time. That's an architectural decision. The changelog tells your team what changed. The machine-readable contract has to be co-located with the code.
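One way to make that compile-time decision concrete is a branded type that gates the deprecated component behind an explicit acknowledgement. This is a sketch, not a standard pattern from any particular library; DeprecatedAck and acknowledgeDeprecated are names I've made up for illustration.

```typescript
// Hypothetical sketch: a branded type that turns "keep using the
// deprecated thing" into a visible decision at the call site.
declare const ack: unique symbol;
type DeprecatedAck = { readonly [ack]: true };

// The deprecated component requires the brand as a prop, so TypeScript
// rejects any call that hasn't gone through acknowledgeDeprecated().
interface ButtonLegacyProps {
  label: string;
  deprecationAck: DeprecatedAck;
}

/** @deprecated since v3.2.0. Use Button with variant="primary". Removal: v4.0.0. */
function ButtonLegacy(props: ButtonLegacyProps): string {
  return `<button class="btn-legacy">${props.label}</button>`;
}

// The only way to obtain the brand. It's greppable, it shows up in
// review, and an agent reading the types can't silently opt in.
function acknowledgeDeprecated(reason: string): DeprecatedAck {
  void reason; // recorded for humans reading the call site, unused at runtime
  return {} as DeprecatedAck;
}
```

A caller now has to write `acknowledgeDeprecated("migration pending")` next to every legacy use, which is exactly the friction you want a deprecated path to have.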

Deprecation is where the gap shows most clearly. I've seen too many systems with components that are officially deprecated but still present in the library because nobody got around to removing them. The logic is usually that pulling the component would break something, and the cost of breaking feels higher than the cost of confusion. What that calculation misses is that when an agent sees two components serving similar purposes, it doesn't know which one to trust. It makes a call. At scale, across many sessions, those calls compound, and by the time you notice, the inconsistency is already baked into production builds.

A practical illustration: say your system has ButtonLegacy and Button sitting alongside each other. Humans on your team know ButtonLegacy is on its way out. They've been told. But an agent building a new checkout form sees both, notices that ButtonLegacy appears more frequently in the existing codebase, and reaches for it because frequency looks like signal. Nobody catches it in review because the output looks fine visually. Six months later you're trying to remove ButtonLegacy and it's spread through builds you didn't write. A @deprecated tag on the component itself, with a version number, a removal date, and a migration path pointing to Button with variant="primary", is something an agent can actually read at the point of consumption. A changelog entry describing the same decision helps your team. It doesn't reach the agent.
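Concretely, the co-located tag might look like the following. It's a sketch: the version numbers, prop shapes, and markup are illustrative, and the one thing that matters is that the version, the removal plan, and the migration path sit in the file an agent actually reads.

```typescript
/**
 * @deprecated since v3.2.0. Use `Button` with `variant="primary"` instead.
 * Scheduled for removal in v4.0.0.
 * Migration: replace `<ButtonLegacy label={x} />` with
 * `<Button variant="primary" label={x} />`. Props map one to one.
 */
export function ButtonLegacy(props: { label: string }): string {
  // Delegates to the replacement so behaviour stays identical while the
  // deprecation notice travels with the code agents consume.
  return Button({ variant: "primary", label: props.label });
}

export function Button(props: {
  variant: "primary" | "secondary";
  label: string;
}): string {
  return `<button class="btn btn-${props.variant}">${props.label}</button>`;
}
```

Editors surface the @deprecated tag as a strikethrough, TypeScript tooling reports it, and anything reading the source at the point of consumption sees the migration path without ever opening a changelog.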

A good place to start

None of this requires a complete overhaul. The teams I've seen handle this well tend to start with one question. If an automated system consumed this component right now, would it get the right answer, the intended one rather than a reasonable guess?

For most systems, the honest answer is no. It usually isn't a quality failure. These systems were built for a consumer with judgment, one who could notice that action-primary in Figma refers to the same thing as color-button-primary in code even when the names don't match. A consumer who could ask a clarifying question, or recognise a familiar pattern from context.

That consumer still exists. They're just no longer the only one. And the newer one doesn't pause at ambiguity. Every inconsistency gets resolved however it sees fit, and that resolution becomes part of the build.

Maintaining a system that agents depend on is a different job from maintaining one that humans browse. The contracts matter more. They need to live where agents actually read. And a deprecated component still sitting in the library isn't a housekeeping issue, it's a decision point waiting to be made for you.


Thanks for reading! If you enjoyed this article, subscribing is the best way to keep up with new posts. And if it was useful, passing it on to someone who'd find it relevant is always appreciated.

You can find me on LinkedIn, X, and Bluesky.

In case you missed it, I recently published a new article with the wonderful team at zeroheight:

Why your design system documentation goes stale (and how to fix it) - zeroheight
Murphy Trueman explains how to document a design system, why docs go stale, and practical steps to keep documentation current, findable, and useful.