Tutorial
We gave our design system a brain
We added MCP expecting a remote control. The emergent effect was stranger: pointed at our own system, it started auditing us, grading our palette and naming against sourced best practice, citations and all. The same connection also teaches the platform, ingests a scattered old system and writes real code.

This is the sequel. If you have not met our MCP server yet, start with "Talk to your design system": it covers connecting any AI to your project and editing your live system in plain language. This piece is about what we did not plan for.
We added MCP so an AI could read and edit our system. We expected a remote control. The emergent effect was stranger: pointed at our own work, it started auditing us. It graded our colour system against a sourced rubric, confirmed the parts we had got right, and flagged that our Dark theme silently inherits nine colours from Light instead of overriding them, citing the principle behind why that is usually drift. We had not noticed. A design system that reviews its own owners was not on the roadmap.
That is possible because the same endpoint carries more than your tokens. It also carries a curated body of design-system best practice and the entire product manual, and the AI on the other end reads almost anything you hand it. Three kinds of knowledge down one pipe, and it stops being a remote control. Here is the fun stuff.
TL;DR
- It audits you. Point it at your own system and it grades your palette, naming and theming against a sourced body of best practice, telling a hard rule from a house preference, with real citations.
- It only writes when you say so. Reading and grading need no permissions. Changes need a write token, and any change undoes in one step. Available on every plan, including the free one.
- It is the manual too. Ask how anything in Zaklad works and it answers from the live docs, with a deep link.
- It ingests a mess. Hand it a scattered old system (exports, CSS, a brand PDF) and it sorts the lot into clean layers, flagging anything ambiguous.
- It writes real code. Point an agent at it in a repo and your live system becomes the source it builds from, whether that is a new app or an existing screen pulled onto the system.
- It is one source of truth. Any tool reads today's system over the same link, build scripts and CI included, never a stale file.
Then we gave it knowledge
It started as a way to read and edit the system. Then the connection got two more things: a curated, sourced guide to what a good design system looks like (palette, naming, theming, typography, spacing and more), and the entire Zaklad manual, the same docs you can read on the site. So one endpoint carries what your system is, what good looks like, and how the platform works. The AI reads all three at once, and that is where it stops being a remote control and starts being something stranger.
Ask it like a guru
This is the part that surprised us. Because the guide and a worked reference system are right there next to your data, the AI can grade your work, not just describe it. It reads your tokens, pulls the relevant principles, compares you against a reference system, and tells you where you have drifted. Point it at your own work and ask the awkward question:
you grade our colour system and cite everything
claude reading your tokens, the rubric and the reference over MCP
zaklad → list_principles(color) · get_reference_design_system(color)
→ get_theme_coverage(Dark) · validate_current
claude graded against the rubric, with sources:
✓ names read by role, not appearance · background, border,
container · "name tokens by role" (universal · N. Curtis)
✓ 12 foundation ramps, OKLCH-even · "author colour in OKLCH"
(universal · CSS Color 4)
✓ validator clean · every semantic slot resolves
⚠ Dark overrides 126 of 135 colours but inherits 9 from Light
deliberate, or drift? · "flip colours per theme" (Zaklad view)
Every finding is labelled by where the advice comes from: a widely held best practice, a Zaklad house preference, or a wider tooling opinion. It can show you the real source on request (a spec, a paper, a name you will recognise). So it tells you that a semantic token holding a literal hex should reference a foundation step instead and points you at why, while flagging that a sort-order convention is only our house style, which you are free to overrule. And where the craft genuinely disagrees, it says so and shows you both sides instead of faking a consensus.
The first time it graded us, we pushed back: those inherited Dark colours were intentional. It did not dig in, and it did not fold. It laid out the principle, the source, and the nine tokens, and let us judge. A couple turned out to be deliberate. The rest were not. Being argued out of a sloppy call by your own design system is a strange feeling, and a good one.
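The "overrides 126 of 135, inherits 9" figure is at heart a set comparison. Here is a minimal sketch of that idea, assuming each theme is a flat map of token name to value. That shape, the function name and the example token names are all assumptions for illustration, not how get_theme_coverage is actually implemented.

```python
def theme_coverage(base: dict[str, str], override: dict[str, str]) -> dict:
    """Split the base theme's tokens into overridden and inherited sets."""
    overridden = sorted(name for name in base if name in override)
    inherited = sorted(name for name in base if name not in override)
    return {"total": len(base), "overridden": overridden, "inherited": inherited}


# Hypothetical token names and values, for illustration only.
light = {
    "color.background.default": "#ffffff",
    "color.text.default": "#111111",
    "color.border.default": "#d0d0d0",
    "color.brand.default": "#5b76e2",
}
dark = {
    "color.background.default": "#111111",
    "color.text.default": "#f5f5f5",
    "color.brand.default": "#8fa2f0",
}

report = theme_coverage(light, dark)
print(
    f"Dark overrides {len(report['overridden'])} of {report['total']}, "
    f"inherits {len(report['inherited'])}: {report['inherited']}"
)
```

A list like this cannot tell you whether an inherited token is deliberate or drift. That judgement is still yours, which is why the audit asks rather than assumes.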
And notice how little of this changes anything. You can interrogate your own system all day and never touch a token:
you make a high-contrast version of our dark theme for accessibility
zaklad drafted "Dark HC" as a new theme · nothing else touched
✓ body text on surface now 8.1:1, was 4.2:1 · clears WCAG AAA
✓ borders lifted to 3:1 against their backgrounds
⚠ brand.default fails on the darkest surface
proposed a lighter brand step for this theme only
review the diff, then keep it or throw it away
you can we borrow from Material 3's colour roles without copying it?
claude read the public Material 3 role model, mapped onto your tokens:
· their container roles → your surface.raised + .sunken
· their tonal steps → your existing foundation ramp, no new values
✓ nothing changed · this is a proposal, say the word
Notice how little of that writes anything. It grades, compares and proposes, and only changes a token when you hand it a write token and say go, in the same conversation. An opinionated guru that shows its working and keeps its hands off until you ask. The docs cover asking the AI to review your design system and the standards behind it.
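The contrast figures in the high-contrast draft (8.1:1, 4.2:1, 3:1) are WCAG contrast ratios. If you want to check one by hand, the WCAG 2.x definition can be sketched as below. The function names are ours, and the sketch only handles six-digit hex colours. The thresholds used are the usual ones for normal-size text, 4.5:1 for AA and 7:1 for AAA.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of a six-digit hex colour such as '#5b76e2'."""
    h = hex_colour.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(colour_a: str, colour_b: str) -> float:
    """Contrast ratio between two colours; the order of arguments is irrelevant."""
    lighter, darker = sorted(
        (relative_luminance(colour_a), relative_luminance(colour_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def wcag_level(ratio: float) -> str:
    """Grade a ratio for normal-size body text."""
    if ratio >= 7:
        return "AAA"
    if ratio >= 4.5:
        return "AA"
    return "fail"


print(round(contrast_ratio("#ffffff", "#000000"), 1))  # 21.0, the maximum
print(wcag_level(8.1), wcag_level(4.2))                # AAA fail
```

By this yardstick the old 4.2:1 body text sat just under AA, and the 8.1:1 draft clears AAA with room to spare.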
A design system used to be something you maintained. Connected, it becomes something that tells you when you’re the one drifting.
Nothing sticks until you say so
Acting on what it finds is the bit people get nervous about, so it is worth being precise. Writing is opt-in and reversible. Without a write token it can only read and advise. With one, every change it makes is yours to keep or undo in a single step. So you can let it loose on your live system and never hold your breath:
you warm up our brand, indigo runs a little cold
claude update_theme_override on the primary slots · Light and Dark
zaklad ✓ applied · Light was indigo.500 #5B76E2, now a warmer step
✓ Button, Spinner and Slider pick it up automatically
claude re-checked contrast against the rubric · still AA on both themes
you hmm, not quite. undo that
zaklad ✓ undo_last_change · back exactly as it was
Every change is scoped to what you allow and reversible in one step, not an archaeology dig through your history. That is what makes it safe to hand an AI the keys to the one thing every other tool depends on.
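One way to picture "opt-in and reversible in one step" is a store that refuses writes without permission and remembers the previous values of each batch it applies. The sketch below is a toy model under those assumptions. The class, the writable flag (standing in for a write token) and the replacement colour are hypothetical, and nothing here describes Zaklad's internals.

```python
from typing import Optional


class TokenStore:
    """Hypothetical in-memory token store with batch writes and one-step undo."""

    def __init__(self, tokens: dict[str, str], writable: bool = False):
        self.tokens = dict(tokens)
        self.writable = writable  # stands in for "a write token was supplied"
        self._history: list[dict[str, Optional[str]]] = []

    def apply(self, changes: dict[str, str]) -> None:
        """Apply a batch of changes as a single undoable step."""
        if not self.writable:
            raise PermissionError("read-only connection: no write token")
        # Record the previous value of every touched token (None if it was new).
        self._history.append({name: self.tokens.get(name) for name in changes})
        self.tokens.update(changes)

    def undo_last_change(self) -> bool:
        """Restore the state before the most recent batch, if there is one."""
        if not self._history:
            return False
        for name, old in self._history.pop().items():
            if old is None:
                self.tokens.pop(name, None)
            else:
                self.tokens[name] = old
        return True


store = TokenStore({"color.brand.default": "#5b76e2"}, writable=True)
store.apply({"color.brand.default": "#6d6fe0"})  # hypothetical warmer step
store.undo_last_change()
print(store.tokens)  # {'color.brand.default': '#5b76e2'}
```

Recording the inverse of each batch, rather than a full snapshot, is what keeps undo a single cheap step however large the system is.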
One connection keeps going
The audit is the part that caught us off guard. But the same three kinds of knowledge power a lot more, and we lean on all of it weekly. A quick tour.
It is the manual. Because the docs are on the same link, you can ask how the product itself works and get an answer with a deep link, not a guess:
you how do I switch themes in the generated package?
zaklad set data-theme on a parent element, or call setTheme("dark")
docs → /docs?tab=developers#switching-themes
It ingests a mess. Plenty of teams are not starting fresh. They have a system spread across an old export, a pile of CSS variables and a brand PDF nobody has opened in a year. The AI reads all of that and builds clean foundation, semantic and component layers as it goes, cleaning names to your conventions:
you here's our old export, a CSS file and our brand PDF. sort it into our layers
claude read all three · building tokens over MCP
zaklad ✓ 140 raw values → foundation
✓ 52 aliases → semantic, matched to your conventions
✓ 38 names cleaned up · --c-blu-2 → color.brand.500
⚠ 6 ambiguous values flagged for you, not guessed
If you would rather skip the AI for a clean import, Zaklad reads DTCG JSON directly too: see bringing an existing system in.
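The first pass of that sorting, raw values into foundation and aliases into semantic, is mechanical enough to sketch. The example below assumes the old system is a plain CSS file of custom properties, treats literal values as foundation candidates and var() references as semantic candidates, and flags any alias whose target is not declared. The layer rules and variable names are our simplification. The real ingestion is done by the AI and also covers renaming, which this sketch leaves out.

```python
import re

DECLARATION = re.compile(r"--([\w-]+)\s*:\s*([^;]+);")
ALIAS = re.compile(r"^var\(\s*--([\w-]+)\s*\)$")


def sort_into_layers(css: str) -> dict[str, dict[str, str]]:
    """Split CSS custom properties into foundation, semantic and flagged."""
    declared = {name: value.strip() for name, value in DECLARATION.findall(css)}
    layers: dict[str, dict[str, str]] = {"foundation": {}, "semantic": {}, "flagged": {}}
    for name, value in declared.items():
        alias = ALIAS.match(value)
        if alias is None:
            layers["foundation"][name] = value           # literal value
        elif alias.group(1) in declared:
            layers["semantic"][name] = alias.group(1)    # alias to a known value
        else:
            layers["flagged"][name] = value              # target missing: ask, do not guess
    return layers


# Hypothetical legacy stylesheet, for illustration only.
legacy_css = """
:root {
  --c-blu-2: #5b76e2;
  --space-2: 8px;
  --button-bg: var(--c-blu-2);
  --card-bg: var(--c-gry-9);
}
"""

layers = sort_into_layers(legacy_css)
for layer, entries in layers.items():
    print(layer, entries)
```

Here --card-bg points at a variable that was never declared, so it ends up flagged rather than silently assigned a value.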
It writes real code from your live system. Point an agent like Claude Code at the connection from inside a repo and your system becomes the source it builds from. The case that wins people over is the unglamorous one: an app that already exists, full of hardcoded colours, pulled back onto the system:
you we already have a Settings screen. refactor it onto our design system
claude read your live tokens and components over MCP
✓ swapped 18 hardcoded hex values for your semantic tokens
✓ replaced the local Button with your component
✓ pulled spacing onto your scale, no more magic numbers
⚠ 1 custom shadow has no home in your system · left it, flagged it
It did not guess at your brand. It read the live system, found where the screen had drifted from it, and pulled it back, leaving anything genuinely new for you to decide on. Change a token later and you regenerate from the same source rather than hand-patching a stale copy.
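The "swapped hardcoded hex values for your semantic tokens" step starts with finding those values and checking whether any token already holds them. Here is a rough sketch of that lookup, assuming the tokens have already been fetched into a flat name-to-hex map. The token names, the sample source line and the matching rule (exact colour match only) are assumptions, and an agent doing this for real reads context that a regex cannot.

```python
import re

HEX = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


def normalise(hex_colour: str) -> str:
    """Lower-case a hex colour and expand three-digit shorthand to six digits."""
    h = hex_colour.lstrip("#").lower()
    if len(h) == 3:
        h = "".join(ch * 2 for ch in h)
    return "#" + h


def find_hardcoded_colours(source: str, tokens: dict[str, str]):
    """Return (line number, literal, matching token name or None) per hex literal."""
    by_value = {normalise(value): name for name, value in tokens.items()}
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in HEX.finditer(line):
            literal = match.group(0)
            findings.append((line_no, literal, by_value.get(normalise(literal))))
    return findings


# Hypothetical tokens and source, for illustration only.
tokens = {"color.brand.default": "#5b76e2", "color.text.inverse": "#ffffff"}
source = 'const card = { background: "#1f1f1f", color: "#FFF", border: "#5B76E2" };'

findings = find_hardcoded_colours(source, tokens)
for line_no, literal, token in findings:
    print(line_no, literal, "->", token or "no token, flag for review")
```

The same lookup doubles as a drift check: a literal with no matching token is the kind of finding that would fail a CI run.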
It is the source of truth for everything else. A DESIGN.md export is the easiest on-ramp there is (see the DESIGN.md guide), but a file is a snapshot. The connection is live, so any tool you point at it reads today's system: build scripts, a second AI workflow, and CI. Wire it into CI and every pull request is checked against the system as it stands now, so drift fails the build instead of surfacing in a design review three weeks later:
ci checking the diff against the live system over MCP
zaklad ⚠ Card.tsx uses #1f1f1f, not a token
⚠ 2 spacing values off the scale (13px, 27px)
✓ everything else resolves to current tokens
→ failing the check · drift caught before merge
You are limited by your imagination
Look at what one connection turned into: a consultant who has read every standard and cites its sources, a remote control for your system, and the product manual, all answering in plain language and all kept honest by scopes, locks and an undo button. We have shown you the paths we use every day. They are not the edges.
Here are a few we did not even get to. Diff your last two releases and it writes the upgrade guide, the chore nobody volunteers for:
you diff our last two releases and draft the migration notes
claude reading both release snapshots over MCP
claude 12 changes · here is the upgrade guide:
✓ renamed color.accent → color.brand · safe find-and-replace
⚠ removed space.xxs · 3 components used it, remap to space.xs
✓ added a high-contrast theme, opt-in
drafted as MIGRATION.md, edit before you ship it
Ask whether your dark theme is genuinely complete or just inheriting light, and it reads the coverage and answers. Ask why a colour is what it is, and it walks the chain from component to semantic to the raw value. Try to rename a token and it tells you which dependents would dangle before you commit. Lock your brand ramp in the editor and it cannot touch it, write token or not. The connection is the same. Only the question changes.
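Both "why is this colour what it is" and "what would dangle if I renamed this" come down to following aliases. Here is a small sketch of that walk, assuming the system is exported as DTCG-style JSON, where a token is an object with a $value and an alias is written as {path.to.token}. The token tree and function names are made up for the example, and the sketch only finds direct dependents.

```python
import re

ALIAS = re.compile(r"^\{([^{}]+)\}$")


def flatten(tree: dict, prefix: str = "") -> dict:
    """Flatten a nested DTCG-style tree into {dotted.path: $value}."""
    flat = {}
    for key, node in tree.items():
        if key.startswith("$") or not isinstance(node, dict):
            continue  # skip group metadata such as $type or $description
        path = f"{prefix}.{key}" if prefix else key
        if "$value" in node:
            flat[path] = node["$value"]
        else:
            flat.update(flatten(node, path))
    return flat


def resolve_chain(flat: dict, name: str) -> list:
    """Follow aliases from a token down to its raw value, returning every hop."""
    chain = [name]
    while True:
        value = flat[chain[-1]]
        alias = ALIAS.match(value) if isinstance(value, str) else None
        if alias is None:
            return chain + [value]
        target = alias.group(1)
        if target in chain:
            raise ValueError(f"alias cycle at {target}")
        if target not in flat:
            raise KeyError(f"{chain[-1]} points at missing token {target}")
        chain.append(target)


def dependents_of(flat: dict, name: str) -> list:
    """Tokens that alias `name` directly and would dangle if it were renamed."""
    return sorted(k for k, v in flat.items() if v == "{" + name + "}")


# Hypothetical three-layer system, for illustration only.
tokens = {
    "color": {
        "brand": {"500": {"$type": "color", "$value": "#5b76e2"}},
        "action": {"primary": {"$value": "{color.brand.500}"}},
    },
    "button": {"background": {"$value": "{color.action.primary}"}},
}

flat = flatten(tokens)
print(" -> ".join(resolve_chain(flat, "button.background")))
print(dependents_of(flat, "color.brand.500"))
```

Run against the toy tree, the chain reads component, then semantic, then foundation, then the raw hex, which is the same walk the answer describes in prose.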
And none of it is a trap. Everything here runs on an open format: the same DTCG JSON that brings a system in takes it back out. You can let an AI audit, refactor and scaffold against your system and still walk away with everything in a standard file any tool can read. There is no version of this where you are stuck.
It costs nothing to find out. MCP is on every plan, including the free one, and you do not need to be a developer to use it. So here is a dare: connect yours, ask it the one honest question about your own system you have been putting off, and see if it tells you something you did not want to hear. Part one gets you connected in two minutes. Then create a project, point your AI at it, and find out what your design system says about itself.


