When conversation leads: Rethinking design systems for a world where intent comes first
How conversational interaction is changing what design systems are actually for
We’re no longer navigating. We’re describing.
Instead of clicking through UI, you describe your problem in plain language. The interface still shows up – buttons appear, confirmations land – but they respond to intent you’ve already expressed. The conversation leads. The UI follows. That breaks a core assumption design systems still depend on.
Most systems assume interaction starts with a component. Conversation breaks that immediately.
Conversational interaction is already changing how teams ship – often faster than their design systems can adapt. The conversational AI market is projected to grow from $13B in 2024 to over $50B by 2030, and that growth is forcing uncomfortable questions about what design systems are actually responsible for now.
By “conversation”, I don’t mean chat interfaces replacing screens. I mean intent – spoken or typed – becoming the first move. The UI still matters, just not as the starting point.
The component problem
For the last decade, we’ve built design systems around visual primitives. Buttons, form fields, navigation patterns, cards. These are the parts we assemble into interfaces.
In a conversational interface, a “button” becomes a decision point in language.
“Set a timer for 10 minutes” is intent, not UI selection.
The screen still matters – you see the timer count down – but the interaction already happened somewhere else. That needs different building blocks.
In traditional design systems, we’ll happily maintain 15 button variants (primary, secondary, ghost, destructive, and so on). In conversational systems, the work moves to intent, missing information, and recovery. We’re designing behaviour, not objects.
What consistency means when conversation leads
One of the primary jobs of a design system is ensuring consistency. But when conversation leads, consistency takes on a different shape.
Visual consistency is the easy part. You can spot a mismatched button or drifting spacing immediately. Conversational consistency is harder to see. You feel it when something sounds off – when confirmation, error, and fallback all speak differently.
That consistency lives in language, not layout.
Atlassian’s system is useful here. Their content guidelines treat words as first-class – voice, tone, inclusive language. Not metaphorically, but structurally.
Unlike visual components, these don’t have neat boundaries or predictable states. They move with context, which makes them harder to systematise – not less important.
The building blocks of conversation
If buttons and cards aren’t the right mental model, what is?
Stop thinking in components, and a different set of primitives emerges. Some define structure. Others shape experience.
In conversational design, the primitives are intents, utterances, and slots.
Intent is the job. Utterance is the phrasing. Slots are the details you still need before you can do anything useful. (Note: in conversational systems, “slots” means data parameters – the pieces of information needed to fulfil an intent. This differs from visual design systems, where “slots” has recently come to mean composition points for inserting custom content.)
If someone says “Play some relaxing music,” they’re expressing intent, with details folded into the language itself.
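To make those primitives concrete, here’s a minimal sketch of how an intent might be modelled in code. The shape and names (IntentDefinition, PlayMusic, genre) are illustrative assumptions, not taken from any particular framework:

```typescript
// Hypothetical intent definition: one user goal, the phrasings that
// map to it, and the slots needed before the system can act.
interface SlotDefinition {
  name: string;
  required: boolean;
  prompt: string; // what to ask if the slot is missing
}

interface IntentDefinition {
  name: string;
  utterances: string[]; // sample phrasings that resolve to this intent
  slots: SlotDefinition[];
}

const playMusic: IntentDefinition = {
  name: "PlayMusic",
  utterances: [
    "Play some relaxing music",
    "Put on something calm",
    "I want to hear some jazz",
  ],
  slots: [
    { name: "genre", required: false, prompt: "Any particular style?" },
  ],
};
```

Notice how “relaxing” rides along inside the utterance – the slot is filled by the language itself, not by a separate form field.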
I used to think conversation design lived mostly in UX writing. Then I tried mapping a real multi-turn flow with engineers. That assumption collapsed almost immediately – the hard part was the interaction logic.
Multi-turn conversations are where things get genuinely complex – maintaining context across exchanges, handling interruptions gracefully, managing who takes initiative when. If your system can’t support that, things fall apart quickly.
You’re still using visual components. The music player still has play and pause. But the primary design challenge has moved from arranging elements on a screen to choreographing dialogue.
Traditional systems document component anatomy – spacing, colour, states. Conversational systems need to document memory, acknowledgement, and interruption.
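As a sketch of what that documentation might have to formalise – the field names below are assumptions, not a standard:

```typescript
// Hypothetical dialogue state: the things a conversational system
// carries between turns, which component anatomy never had to describe.
interface DialogueState {
  activeIntent: string | null;             // what the user is trying to do
  filledSlots: Record<string, string>;     // details gathered so far
  history: { speaker: "user" | "system"; text: string }[]; // memory
  pendingAcknowledgement: string | null;   // confirmation owed to the user
  interruptedIntent: string | null;        // where to resume after a detour
}
```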
Tone as infrastructure
In visual systems, tone and voice often live in a separate content guide. Important, but secondary.
In conversational systems, tone is infrastructure, not optional metadata.
When the PatternFly team started documenting conversational patterns, they emphasised a simple truth: a system is only as good as how it speaks. Language builds trust, sets expectations, and carries users through moments of uncertainty.
The difference between “Let me check that” and “I’m thinking” matters (the latter inappropriately anthropomorphises AI behaviour).
If these rules aren’t written down, teams fill the gaps themselves. The product starts to sound inconsistent fast.
A comprehensive conversational design system needs to define personality (formal or casual?), conversational patterns (how errors are handled, how input is acknowledged), context awareness (what the system remembers), and turn-taking (when the system waits versus leads).
These decisions sit a layer deeper than the UI. They need the same level of care.
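One way to picture those decisions written down – everything below is a hypothetical schema, a sketch of the behavioural layer rather than an existing standard:

```typescript
// Hypothetical configuration for the behavioural layer of a
// conversational design system.
interface ConversationConfig {
  personality: {
    formality: "formal" | "casual";
    persona: string; // e.g. "helpful colleague, never a character"
  };
  patterns: {
    onError: string;       // template for graceful failure
    onAcknowledge: string; // how input is confirmed
  };
  contextAwareness: {
    remembersAcrossSessions: boolean;
    maxTurnsOfHistory: number;
  };
  turnTaking: {
    systemMayInitiate: boolean; // does the system ever lead?
    silenceTimeoutMs: number;   // how long to wait before re-prompting
  };
}
```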
The multimodal reality
Conversation doesn’t replace screens. It changes their role.
Voice is just one modality. The deeper change is intent leading interaction – whether spoken, typed into a command bar, or directed at a copilot.
Multimodal interfaces – combining text, voice, and visual elements – are already transforming how people interact with systems. Voice assistants show visual confirmations. Copilots surface suggestions alongside code. Text, voice, and UI work together.
Once interaction spans language and UI, design systems stop being about reuse and start being about coordination.
You still need visual components. But you also need conversational patterns. The two have to align – and that raises harder questions about ownership. Who maintains consistency when interaction crosses modalities? Where does the design system boundary sit when a feature spans voice, text, and screen?
If your UI says “medication” but your conversational layer says “drug” or “pill”, trust erodes. Consistency now spans modalities, which means governance needs to as well.
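A small, hypothetical illustration: terminology treated as a shared token, the way visual systems treat colour, consumed by every modality:

```typescript
// Hypothetical vocabulary tokens: one canonical term per concept,
// shared by the UI layer and the conversational layer so the product
// never says "medication" on screen and "pill" out loud.
const vocabulary = {
  medication: {
    canonical: "medication",
    avoid: ["drug", "pill"], // recognised in user input, never used in output
  },
} as const;

function systemTerm(concept: keyof typeof vocabulary): string {
  return vocabulary[concept].canonical;
}
```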
So what’s a “component” in a conversational system?
This is where the question stops being useful.
The point isn’t finding equivalents to buttons. It’s recognising that conversational systems need different building blocks entirely.
Most teams treat conversational UX as a copy problem. That’s why their systems fall apart the moment a conversation goes off-script. Intent changes mid-flow, context adjusts, users interrupt themselves. Copy can’t hold that.
What replaces components looks more like patterns for behaviour.
Dialogue patterns work like reusable conversational structures. Take password reset – it involves verification, failure handling, and context handoff, not just messaging. Similar to how visual systems document “login flow” or “checkout process,” but for language.
Intent libraries catalogue recognisable user goals with the various ways people might express them. Similar to how visual design systems document component variants, but the variance is linguistic.
Slot definitions describe structured information that can be collected through conversation, with guidelines for how to prompt for missing information and validate responses.
Tone playbooks document how personality manifests in different scenarios. How the system sounds when things go wrong versus when they go right, how it adapts based on user sentiment or context.
Response templates provide frameworks for structuring information in audio or text form, considering constraints like working memory (keep spoken responses under 30 seconds) and cognitive load (break complex information into smaller chunks).
These aren’t components in the traditional sense. They’re patterns that keep conversation coherent.
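To ground the idea, here’s one sketch of what a documented dialogue pattern could look like in code – the password-reset structure from above, with names and shape that are illustrative assumptions:

```typescript
// Hypothetical dialogue pattern: password reset as a reusable
// conversational structure, not a block of copy.
interface DialogueStep {
  id: string;
  prompt: string;                        // what the system says
  slot?: string;                         // information collected at this step
  validate?: (value: string) => boolean; // slot validation, if any
  onFailure: string;                     // recovery, not a dead end
}

const passwordReset: DialogueStep[] = [
  {
    id: "verify-identity",
    prompt: "First, what's the email on your account?",
    slot: "email",
    validate: (value) => value.includes("@"),
    onFailure: "That doesn't look like an email address – could you try again?",
  },
  {
    id: "confirm-reset",
    prompt: "Thanks. I've sent a reset link to that address.",
    onFailure: "I couldn't send the link just now. Want me to try again?",
  },
];
```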
What trips teams up next
Conversation adapts. It doesn’t sit still long enough to inspect the way a modal or button does. The visual interface is still there – showing the timer, confirming the action – but the logic that got you there is invisible.
This doesn’t make design systems irrelevant. It forces them to evolve.
The purpose stays the same – shared understanding, consistency at scale, systems that hold up over time. But the implementation looks very different.
You can see the shape of this already. Google’s conversation design guidelines focus on context and multi-turn flows. Alexa’s docs centre on intents and dialogue structure. Same impulse, just not always called a design system. Yet.
I’ve spent years arguing that design systems should be treated as APIs, not libraries. That thinking matters even more here. A conversational design system is an API for dialogue. Inputs, outputs, constraints, and rules – shaped by how people actually speak.
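In that framing, the contract looks less like a component prop sheet and more like a function over dialogue state. A hypothetical signature, not a real library:

```typescript
// Hypothetical "API for dialogue": each turn maps (utterance, context)
// to (response, updated context), with the design system supplying
// the constraints and rules that shape both sides.
interface TurnContext {
  activeIntent: string | null;
  filledSlots: Record<string, string>;
}

interface TurnInput {
  utterance: string;
  context: TurnContext;
}

interface TurnOutput {
  response: string;
  context: TurnContext; // conversation state is an explicit output
  uiHints?: string[];   // optional visual components to surface alongside
}

type DialogueHandler = (input: TurnInput) => TurnOutput;
```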
Designing these systems feels less like arranging a toolbar and more like managing a bullpen rotation. Timing, memory, contingency. Which means harder questions.
How do you version control a conversation flow? What does A/B testing tone look like in practice? What does accessibility mean when the interface has no screen? How do conversational patterns and visual components work together without creating friction?
Components aren’t disappearing. The definition is stretching. If design systems don’t stretch with it, they won’t break loudly, they’ll just get bypassed.
This article is also available on Medium, where I share more posts like this. If you’re active there, feel free to follow me for updates.
I’d love to stay connected – join the conversation on X and Bluesky, or connect with me on LinkedIn to talk design systems, digital products, and everything in between. If you found this useful, sharing it with your team or network would mean a lot.