We know how to build design systems, but we don't know how to operate them

Most design system teams already know they should version their components, communicate deprecation in advance, and measure system health. They've read the blog posts, attended the talks, and nodded along at conferences.

Most of them still aren't doing these things with any real rigour, and I don't think that's a knowledge problem. It's an infrastructure problem. The tooling, processes, and organisational habits that make these practices work in other disciplines haven't been built for design systems yet, and most teams are trying to improvise their way through operational challenges that platform engineering, API management, and DevOps observability solved years ago.

Part of the reason we haven't borrowed more from those disciplines is that we still think of design systems as a design practice that happens to involve code, rather than infrastructure that happens to involve design. Treating them as infrastructure means inheriting a set of operational expectations the discipline hasn't built the habits to meet. But it also means getting access to decades of practice in managing exactly the kinds of problems design system teams keep running into.

You can't manage what you can't see

DevOps teams moved from monitoring to observability several years ago, and the distinction matters. Monitoring tells you whether something is working. Observability lets you ask questions you didn't anticipate in advance about why something is behaving the way it is.

The conventional framing of observability in DevOps rests on three pillars. Metrics track aggregated measurements over time, things like CPU usage, error rates, and latency. Logs capture event-level detail. And traces follow a single request across multiple services so you can see where things slow down or break. Together, they let an engineering team diagnose problems they've never encountered before without having to add new instrumentation first.

Design systems have almost none of this. Teams that have set up analytics at all can usually tell you which components are used and how often. Very few can tell you how components are being used, where props are being overridden, which tokens are getting bypassed in favour of raw values, or which teams have built parallel solutions because the system's components didn't fit their needs.

That last point matters most. In DevOps, observability's value comes from letting teams discover unknown problems. If you only look at what you already expect to find, you'll catch known issues and miss everything else. Design system teams operate almost entirely in this mode. You check adoption dashboards and see that Button is used 4,000 times. What you don't see is that 600 of those instances have overridden the padding token, 200 have wrapped it in a custom div to add behaviour the component doesn't support, and three teams have built their own button entirely because yours doesn't handle their loading state correctly.
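To make the Button example a little more tangible, here is a minimal sketch of the kind of question a usage scan could answer beyond a raw count. The ComponentUsage record shape, the team names, and the summariseUsage function are all assumptions for illustration; a real scanner would have to produce these records by parsing consuming codebases, which is the hard part and isn't shown.

```typescript
// Hypothetical record produced by a codebase scan: one entry per component instance.
interface ComponentUsage {
  component: string;
  team: string;
  overriddenTokens: string[]; // empty when the instance uses system defaults
  wrappedInCustomElement: boolean;
}

interface UsageSummary {
  total: number;
  withTokenOverrides: number;
  wrapped: number;
  overridesByToken: Record<string, number>;
}

// Summarise how one component is being used, not just how often.
function summariseUsage(usages: ComponentUsage[], component: string): UsageSummary {
  const summary: UsageSummary = {
    total: 0,
    withTokenOverrides: 0,
    wrapped: 0,
    overridesByToken: {},
  };
  for (const usage of usages) {
    if (usage.component !== component) continue;
    summary.total += 1;
    if (usage.overriddenTokens.length > 0) summary.withTokenOverrides += 1;
    if (usage.wrappedInCustomElement) summary.wrapped += 1;
    for (const token of usage.overriddenTokens) {
      summary.overridesByToken[token] = (summary.overridesByToken[token] ?? 0) + 1;
    }
  }
  return summary;
}

// Made-up sample data, standing in for real scan output.
const scannedUsages: ComponentUsage[] = [
  { component: "Button", team: "checkout", overriddenTokens: [], wrappedInCustomElement: false },
  { component: "Button", team: "search", overriddenTokens: ["padding"], wrappedInCustomElement: false },
  { component: "Button", team: "billing", overriddenTokens: ["padding", "radius"], wrappedInCustomElement: true },
  { component: "Modal", team: "billing", overriddenTokens: [], wrappedInCustomElement: false },
];

const buttonSummary = summariseUsage(scannedUsages, "Button");
console.log(buttonSummary);
```

The adoption dashboard would report three Button instances here. The summary adds that two of them override tokens and that padding is the token overridden most often, which is the part a raw count hides.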

Spotify's Encore is one of the more sophisticated examples of component-level tracking among design systems. They query repositories daily to understand where and how often components are used, and their slot pattern analytics can show which props are overridden most frequently and what configurations teams choose. That starts to get into observability territory, asking "how are people using this?" rather than just "is this being used?" But most teams aren't Spotify, and even Encore's approach is narrower than what DevOps observability provides. They're still primarily answering known questions about known components. The DevOps model lets teams discover problems they weren't looking for.

The concept that would help most here is the Service Level Objective, or SLO. An SLO defines what "healthy" means in measurable, specific terms. For a web service, that might be "99.9% of requests complete in under 200ms." For a design system, it could be something like "fewer than 5% of component instances in production use overridden tokens" or "new components reach 60% adoption within their target product teams within 90 days."
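As a rough sketch of what checking such an objective could look like, the snippet below evaluates the "fewer than 5% overridden tokens" example against the Button numbers from earlier (600 overrides out of 4,000 instances). The Slo and SloResult types and the evaluateSlo function are hypothetical names, not part of any existing tool, and the counts are assumed to come from whatever instrumentation you already have.

```typescript
// Hypothetical SLO definition: a named ceiling on a "bad instances / all instances" ratio.
interface Slo {
  label: string;
  maxRatio: number; // 0.05 means "fewer than 5%"
}

interface SloResult {
  label: string;
  ratio: number;
  healthy: boolean;
}

// Compare an observed ratio against the objective.
function evaluateSlo(slo: Slo, badCount: number, totalCount: number): SloResult {
  // With no instances at all there is nothing to violate the objective.
  const ratio = totalCount === 0 ? 0 : badCount / totalCount;
  return { label: slo.label, ratio, healthy: ratio < slo.maxRatio };
}

const tokenOverrideSlo: Slo = { label: "token overrides in production", maxRatio: 0.05 };

// 600 overridden instances out of 4,000 is 15%, well above the 5% objective.
const overrideResult = evaluateSlo(tokenOverrideSlo, 600, 4000);
console.log(overrideResult.healthy); // false
```

The arithmetic is trivial on purpose. The value of an SLO is in agreeing on the threshold in advance, so that "healthy" stops being a matter of opinion.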

Most design system teams measure health by gut feel and anecdote. Someone mentions in a standup that teams seem to be detaching components more often. Someone else notices a Slack thread where a product team built their own modal. These are signals, and they're valuable, but they're reactive. You're responding to things people happen to mention rather than instrumenting for the things you'd want to know. That's the gap observability is designed to close.

The tooling for design system observability is improving, but it's still thin. On the code side, you can instrument component libraries to track usage, prop combinations, and token overrides. On the design side, Figma's Library Analytics now tracks component usage, detachment rates, and variable adoption, and Enterprise customers can pull this data via API. Teams like athenahealth's Forge are piping that API data into Tableau dashboards for monthly reporting. That's real progress, but it's still closer to monitoring than observability, because you can see what's being detached without easily seeing why, or what teams built instead, or which token overrides are creating visual inconsistencies in production.

If you're wondering where to start, start with the equivalent of SLIs (Service Level Indicators). Pick three or four things you'd want to know about your system's health that you currently can't measure, and figure out how to instrument for them. That's more useful than building a dashboard of adoption numbers you already know.

Deprecation as a managed lifecycle

Design system teams rarely deprecate anything. And when they do, the process usually amounts to a changelog entry and a hope that consuming teams will read it.

Compare that to API management, where deprecation is a first-class engineering concern. HTTP even has a published header for it: RFC 8594 defines a Sunset header that an API can include in responses to state the date after which an endpoint is expected to stop responding. Client tooling that checks for the header can surface the warning automatically, which means nobody has to rely on a message or a changelog that may never get read. Mature API teams know which consumers are calling which endpoints, how often, and which version they're on. They budget time for helping consumers migrate. And they monitor adoption of the replacement before removing the deprecated version.
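For readers who haven't seen the Sunset header in practice, here is a minimal sketch of the consumer side. RFC 8594 specifies the header value as an HTTP date; everything else here (the readSunset function, the SunsetNotice shape, passing headers as a plain object, the example date) is an assumption made to keep the example self-contained. It also relies on the JavaScript Date constructor parsing HTTP dates, which common runtimes do but which is worth verifying in yours.

```typescript
interface SunsetNotice {
  sunsetAt: Date;
  daysRemaining: number;
}

// Look for a Sunset header and work out how long the consumer has left.
// Returns null when the header is absent or its value is not a parseable date.
function readSunset(headers: Record<string, string>, now: Date): SunsetNotice | null {
  for (const key of Object.keys(headers)) {
    // HTTP header names are case-insensitive.
    if (key.toLowerCase() !== "sunset") continue;
    const value = headers[key];
    if (value === undefined) continue;
    const sunsetAt = new Date(value);
    if (isNaN(sunsetAt.getTime())) return null;
    const msPerDay = 24 * 60 * 60 * 1000;
    const daysRemaining = Math.ceil((sunsetAt.getTime() - now.getTime()) / msPerDay);
    return { sunsetAt, daysRemaining };
  }
  return null;
}

// Example response headers from a hypothetical deprecated endpoint.
const sunsetNotice = readSunset(
  { Sunset: "Wed, 31 Dec 2025 23:59:59 GMT" },
  new Date("2025-12-01T00:00:00Z"),
);

if (sunsetNotice !== null) {
  console.log("Endpoint sunsets in " + sunsetNotice.daysRemaining + " days");
}
```

The point for design systems is the delivery mechanism rather than the header itself: the warning travels with the thing being consumed, so it reaches the consumer without anyone having to read an announcement.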

Design system teams do almost none of this, and the result is one of two patterns. Either everything gets kept forever, and the system accumulates components that nobody maintains but nobody removes because removing them might break something unknown. Or components get removed in a major version bump with a migration guide that assumes every consuming team has the time and motivation to update, which they usually don't.

Nathan Curtis wrote about design system versioning years ago. Semantic versioning is well understood. The knowledge exists. What doesn't exist is the infrastructure and organisational commitment that makes deprecation work as a managed lifecycle rather than an event. A deprecation process that actually works needs visibility into who is consuming the component you want to remove, a way to surface the deprecation warning inside the consumer's working environment rather than in a document they have to go looking for, and a realistic timeline based on how long migrations actually take rather than how long you'd like them to take.
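The first of those requirements, knowing who still consumes the thing you want to remove, can be sketched fairly simply once import data exists. The ImportRecord shape, the component names LegacyModal and Modal, and the team names below are all hypothetical; the sketch assumes something upstream has already counted imports per team.

```typescript
// Hypothetical record: how many times a team imports a given component.
interface ImportRecord {
  team: string;
  component: string;
  importCount: number;
}

interface MigrationStatus {
  team: string;
  oldImports: number;
  newImports: number;
  migrated: boolean; // true once the team no longer imports the old component
}

// Per-team view of progress from a deprecated component to its replacement.
function migrationStatus(records: ImportRecord[], oldName: string, newName: string): MigrationStatus[] {
  const byTeam: Record<string, { oldImports: number; newImports: number }> = {};
  for (const record of records) {
    if (record.component !== oldName && record.component !== newName) continue;
    const entry = byTeam[record.team] ?? { oldImports: 0, newImports: 0 };
    if (record.component === oldName) entry.oldImports += record.importCount;
    else entry.newImports += record.importCount;
    byTeam[record.team] = entry;
  }
  return Object.keys(byTeam).sort().map((team) => {
    const entry = byTeam[team] ?? { oldImports: 0, newImports: 0 };
    return { team, oldImports: entry.oldImports, newImports: entry.newImports, migrated: entry.oldImports === 0 };
  });
}

// Made-up sample data.
const importRecords: ImportRecord[] = [
  { team: "checkout", component: "LegacyModal", importCount: 12 },
  { team: "checkout", component: "Modal", importCount: 3 },
  { team: "billing", component: "Modal", importCount: 8 },
  { team: "search", component: "LegacyModal", importCount: 5 },
];

const modalMigration = migrationStatus(importRecords, "LegacyModal", "Modal");
const blockingTeams = modalMigration.filter((s) => !s.migrated).map((s) => s.team);
console.log(blockingTeams); // ["checkout", "search"]
```

Run on a schedule, a report like this turns "can we remove it yet?" into a question with a named list of blockers, and the rate at which that list shrinks gives you the realistic timeline.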

Some teams are doing this well on the code side. Procore's CORE design system uses a "Next" prefix pattern, where a new version of a component ships as NextDetailPage while the existing DetailPage is marked deprecated with @deprecated and @deprecatedSince tags that surface directly in the developer's IDE. The old component stays available for at least one major version cycle, giving consuming teams a predictable window.
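I don't have Procore's source in front of me, so the following is only a sketch of what that pattern might look like, not their implementation. The component names and the two tags come from the description above; the props, the string-returning bodies (standing in for real rendering), and the version number are invented for illustration. The @deprecated tag is one that TypeScript-aware editors typically render as a strikethrough at the call site; @deprecatedSince is a custom tag, so here it is treated as documentation text only.

```typescript
interface DetailPageProps {
  title: string;
}

/**
 * Existing component, kept available during the migration window.
 *
 * @deprecated Use NextDetailPage instead.
 * @deprecatedSince 12.0.0 (hypothetical version, for illustration)
 */
function DetailPage(props: DetailPageProps): string {
  return "<main><h1>" + props.title + "</h1></main>";
}

/** Replacement component, shipped alongside the old one under the Next prefix. */
function NextDetailPage(props: DetailPageProps & { subtitle?: string }): string {
  const subtitle = props.subtitle ? "<p>" + props.subtitle + "</p>" : "";
  return "<main><header><h1>" + props.title + "</h1>" + subtitle + "</header></main>";
}

// Both versions remain callable; the deprecation shows up in the editor, not at runtime.
const legacyMarkup = DetailPage({ title: "Invoice 42" });
const nextMarkup = NextDetailPage({ title: "Invoice 42", subtitle: "Due in 14 days" });
console.log(legacyMarkup);
console.log(nextMarkup);
```

Nothing breaks when a team keeps calling DetailPage, which is the design choice that matters: the warning arrives in the consumer's working environment while the old path keeps functioning for the promised window.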

On the design side, the infrastructure barely exists. GitLab's Pajamas system handles Figma deprecation by renaming deprecated components with a warning emoji prefix and applying a red background colour to the component page. It works, but it's manual, it's fragile, and it depends on designers noticing visual cues rather than receiving programmatic warnings. There's no equivalent of a Sunset header in Figma.

Design systems serve two consumer types with fundamentally different update mechanisms. Code consumers get package managers, semver, and build-time deprecation notices. Design tool consumers get library updates and hope. Until the tooling catches up on the design side, deprecation will remain something design system teams know they should do and mostly don't.

Making the right path the easiest path

Spotify coined the term "golden path" for its approach to fragmentation across its engineering teams, and the idea has since become a core practice in platform engineering. Netflix calls the same idea a "paved road." The concept has direct application to design systems.

A golden path is an opinionated, well-maintained route through a common workflow that covers roughly 80% of use cases. In platform engineering, this might mean a standardised way to spin up a new service that comes pre-configured with CI/CD pipelines, monitoring, security policies, and documentation templates. You don't have to use the golden path. But if you deviate, you take on the maintenance cost yourself, and you know that upfront.

Design systems already try to do something like this. Components with sensible defaults, documentation that explains intended use, usage guidelines that suggest when to reach for which pattern. But there are two differences that matter, and they're both about where the effort sits.

Platform engineering golden paths are self-service. A developer runs a template and gets a new service with all the right configuration already in place. Design system adoption, by contrast, usually requires interpretation. A designer or developer has to read the docs, understand the component's intended use, decide whether their use case fits, and assemble the implementation themselves. The cognitive load sits with the consumer, which is exactly what golden paths are designed to eliminate.

The second difference is that platform engineering golden paths are instrumented. When a developer deviates, the platform team knows about it because the developer took an explicit off-ramp. That signal feeds back into platform improvement. When a design system's components don't fit a use case, teams detach, override, or build around them, and the system team usually doesn't find out until someone audits the codebase months later. The feedback loop was never built.

To make this concrete, think about what happens when a product team needs to build a new settings page. The developer pulls in layout components, picks a form pattern, chooses input components, wires up validation, and makes a dozen small decisions about spacing, typography, and responsive behaviour along the way. Every one of those decisions is a place where they might deviate from the system, either intentionally because the system doesn't cover their case, or accidentally because the documentation didn't make the intended approach obvious enough.

A golden path version of this workflow would start from a settings page template that comes pre-assembled with the layout, form patterns, and input components already configured for the most common scenarios, with the tokens, spacing, and responsive behaviour already wired up. If the team's requirements fit the template, they're building on the system without having to interpret it. If they don't fit, the deviation is visible, because the team is modifying a known starting point rather than assembling from scratch, and the system team can see what got changed and why.
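One way to picture "the deviation is visible" is as a diff between the template's configuration and what the team actually shipped. The sketch below assumes the template can be described as flat key-value settings, which is a simplification, and the TemplateConfig type, findDeviations function, and setting names are all hypothetical.

```typescript
// Hypothetical flat description of a page template's configuration.
type TemplateConfig = Record<string, string>;

interface Deviation {
  setting: string;
  expected: string | undefined; // undefined when the team added something new
  actual: string | undefined;   // undefined when the team removed something
}

// List every setting where the team's page differs from the template it started from.
function findDeviations(template: TemplateConfig, actual: TemplateConfig): Deviation[] {
  const hasOwn = (obj: TemplateConfig, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);
  const added = Object.keys(actual).filter((key) => !hasOwn(template, key));
  const settings = Object.keys(template).concat(added);
  const deviations: Deviation[] = [];
  for (const setting of settings) {
    if (template[setting] !== actual[setting]) {
      deviations.push({ setting, expected: template[setting], actual: actual[setting] });
    }
  }
  return deviations;
}

// Made-up template and a team's modified copy of it.
const settingsPageTemplate: TemplateConfig = {
  layout: "two-column",
  spacing: "space.4",
  formPattern: "stacked",
};

const teamSettingsPage: TemplateConfig = {
  layout: "two-column",
  spacing: "18px",          // raw value instead of a token
  formPattern: "stacked",
  customFooter: "sticky",   // something the template does not offer
};

const pageDeviations = findDeviations(settingsPageTemplate, teamSettingsPage);
console.log(pageDeviations.map((d) => d.setting)); // ["spacing", "customFooter"]
```

A diff like this tells you what changed but not why. The "why" still needs a conversation or a required note at the off-ramp, but at least you now know which conversations to have.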

That's a different relationship between the system and its consumers. Instead of publishing components and hoping teams assemble them correctly, you're providing assembled starting points and learning from the places where teams need to diverge. I wrote about this dynamic in "We're getting better at the workaround", where the argument was that the smoother we make the workaround, the less visible the structural problem becomes. Golden paths address this by making the boundary between "using the system" and "deviating from the system" explicit and trackable.

The real gap

The practices I've described here aren't conceptually new. Measure system health, deprecate responsibly, make the right thing easy. Everyone agrees with these in principle. The gap is in the means: most teams have a Figma library, a Storybook instance, and a Slack channel, and they're trying to do the same operational work that platform engineering teams do with dedicated tooling, observability dashboards, and organisational mandates to treat their platforms as products.

That gap won't close by building better components. It closes when design system teams start operating like infrastructure teams, with the measurement practices, lifecycle management, and feedback loops that implies. We've spent a decade proving that design systems are valuable. The next challenge is proving we can run them.

