What is a design system audit and when you need one
A design system audit is a structured review of token usage, component drift, accessibility compliance, governance gaps, and adoption rate that reveals where a product's UI has decayed since launch. It produces a ranked list of fixes, a layer-level compliance score, and a repair roadmap with owners, dates, and order of operations.
The audit is the map. The work after it is the repair, and the repair is usually three to five times larger than the report.
The question "what is a design system audit" usually arrives on a Monday. A designer opens a Figma file, switches to the production site, and realises the two no longer share much more than a logo. A developer pushes a card component for the fourth time this quarter because the one in the shared library no longer fits the brief. An accessibility scan lights up with contrast failures that were green at launch. Somewhere in parallel, a rebrand quote lands on the CEO's desk. The audit is the answer to a question none of these people have asked out loud yet: how much of the system still works.
Who needs one: product teams that shipped a design system, ran it for 18 months or more across two or more consumer projects, and now spend more time explaining which button is the real button than building features. The output is a ranked list of specific fixes, a measurable compliance score per layer, and a roadmap with owners, dates, and the order of operations.
The 30-second version
You have a design system. The shipped product has drifted from it. You do not know by how much. You cannot convince anyone to fund the repair without numbers. A design system audit produces the numbers, the specific locations of the damage, the governance gap that caused it, and the order in which to fix them. It does not rebrand. It does not rewrite. It names what is wrong so the team can decide what to do with money that would otherwise go to replacing a system that mostly still works.
Why design system audits exist
Design systems were supposed to solve design debt. In practice they created a new kind: the gap between the system as designed and the system as shipped. Every sprint that skips a DS review, every component re-implemented because someone was in a hurry, every hex value typed where a token should live widens the gap.
Forrester's research on design system ROI reports up to a 671% return on investment when systems are actively governed (Forrester via Autentika, 2025). Figma's data science team, in a controlled task study, found designers completed objectives 34% faster with access to a design system than without (Figma, Measuring the value of design systems, February 2025). Both numbers collapse the moment the system stops matching production.
The most useful reframe comes from a March 2026 piece that names the problem directly: the real debt is not design debt, it is workflow debt (Webflowforge, March 2026). Audits that only inventory visual compliance miss the governance and process layers that caused the drift. A useful audit looks at both.
There is also external pressure. The European Accessibility Act came into force on 28 June 2025, with EN 301 549 and WCAG 2.2 as the presumptive conformance standard, and fines up to three million euros per violation in some member states (Level Access EAA guide). Accessibility drift that used to be a backlog ticket is now a legal exposure.
The five layers of a serious audit
Any audit that skips one of these layers is producing a report that will not change anything. Each layer produces a specific artifact the team can act on.
Tokens
Pull every token reference from the consumer codebase. For a CSS design system this means every var(--ds-*) usage in source files. For each token, count usages inside DS components vs. one-off references inside page or route code. The ratio is the canary: a healthy system shows 80% or more of token references living inside components; a drifting system shows hex values, magic pixel widths, and hardcoded font weights scattered in page files.
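The counting logic is simple enough to sketch. The Python fragment below assumes DS components live under a components/ directory and everything else is consumer code; the paths, the `--ds-` prefix, and the file glob are illustrative and should be adjusted to the repository being audited:

```python
import re
from pathlib import Path

TOKEN_RE = re.compile(r"var\(--ds-[\w-]+\)")  # DS token references
HEX_RE = re.compile(r"#[0-9a-fA-F]{3,8}\b")   # hardcoded color values

def scan(root: str) -> dict:
    """Tally token references vs hardcoded hex values across CSS files.

    Assumes DS components live under a components/ directory and
    everything else is consumer code; adjust to your layout.
    """
    counts = {"component_tokens": 0, "page_tokens": 0, "hardcoded": []}
    for path in Path(root).rglob("*.css"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        n_tokens = len(TOKEN_RE.findall(text))
        if "components" in path.parts:
            counts["component_tokens"] += n_tokens
        else:
            counts["page_tokens"] += n_tokens
        for value in HEX_RE.findall(text):  # each hex is a violation candidate
            counts["hardcoded"].append((str(path), value))
    return counts
```

The 80% canary falls directly out of the two token counters, and the hardcoded list is the raw input for the per-file violation ranking.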
The scan itself is mechanical. Tools like design-lint, a DTIF-native linter, can fail a pull request when design conventions regress, which is the fastest way to stop the drift from growing while you repair what is already there. On the Figma side, plugins like False Tokens highlight every element in a frame that uses a non-token value, producing a gap inventory without touching code.
The artifact this layer should produce: a ranked list of token violations per file, a token compliance percentage per page, and a shortlist of missing tokens. If a dozen files hardcode the same value, the token is probably missing from the system, not broken in the files.
Components
For every component in the system, answer three questions against the shipped product: is it used, is it used unmodified, and is it used in the contexts the system intended. Components used once across all consumers are candidates for deletion. Components re-implemented in five or more page files are candidates for promotion into the DS or for a new variant.
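A first mechanical pass over the three questions can be sketched in a few lines of Python. The example assumes JSX-style .tsx consumers and uses the thresholds from the text (used once or less, re-implemented five or more times); the regexes are deliberately crude and will need tuning for a real codebase:

```python
import re
from collections import Counter
from pathlib import Path

USAGE_RE = re.compile(r"<([A-Z]\w+)")            # JSX-style component usage
LOCAL_DEF_RE = re.compile(r"function ([A-Z]\w+)")  # locally defined components

def component_usage(root: str, ds_components: set) -> dict:
    """Tally DS component usage across consumer files and flag outliers.

    `ds_components` is the set of canonical DS exports. Components used
    once or never are deletion candidates; local components defined five
    or more times are promotion candidates.
    """
    used = Counter()
    local_defs = Counter()
    for path in Path(root).rglob("*.tsx"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        used.update(USAGE_RE.findall(text))
        for name in LOCAL_DEF_RE.findall(text):
            if name not in ds_components:  # re-implementation outside the DS
                local_defs[name] += 1
    return {
        "deletion_candidates": sorted(c for c in ds_components if used[c] <= 1),
        "promotion_candidates": sorted(n for n, k in local_defs.items() if k >= 5),
    }
```

This only answers "is it used"; the harder questions, unmodified and in-context, are where the visual regression tooling below earns its keep.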
This layer is where most audits stop at inventory. The useful audits go further and compare each instance against the canonical version. Visual regression tools like Chromatic can run Storybook stories against shipped pages, surface pixel-level deltas automatically, and plug the review back into CI so the next drift does not ship silently. Storybook Connect, a Figma plugin co-maintained by the Storybook and Chromatic teams, embeds stories directly in Figma so designers can compare the design variant against the coded version in the same file.
On the design side, Figma's Library Analytics, generalised in February 2025, exposes usage data for components, styles, and variables directly inside Figma (Figma, Making Metrics Matter, February 2025). A component with zero detach events in six months is mature. A component detached dozens of times is a design problem masquerading as a component.
Accessibility
Run axe-core or an equivalent rule engine on every unique route. Log color-contrast failures, focus-trap bugs, missing alt attributes, keyboard-navigation dead ends, and focus-order anomalies. Storybook's accessibility addon can catch most of these at component level, but route-level issues such as skip links, landmark regions, and dynamic content only appear at runtime on the shipped pages.
The standard to benchmark against is EN 301 549, which currently references WCAG 2.1 and is being updated to WCAG 2.2, at conformance level AA. This is the presumptive standard under the European Accessibility Act, in force since 28 June 2025. Non-compliance is not a design critique anymore, it is a market-access risk: products can be removed from EU markets, and fines reach three million euros in some jurisdictions.
The artifact: a severity-ranked list of accessibility violations with WCAG reference plus location plus fix, a compliance score by route, and a short list of recurring issues that a single DS-level fix could eliminate across the product.
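Assembling the ranked list from saved scan output is straightforward. A Python sketch, assuming one axe-core JSON result file per route (axe's standard `violations` array, with its `impact` levels):

```python
import json
from pathlib import Path

# axe-core impact levels, most severe first
IMPACT_RANK = {"critical": 0, "serious": 1, "moderate": 2, "minor": 3}

def rank_violations(results_dir: str) -> list:
    """Merge per-route axe-core JSON results into one severity-ranked list.

    Assumes each file in `results_dir` is the JSON output of an axe run
    for a single route, named after that route.
    """
    rows = []
    for path in Path(results_dir).glob("*.json"):
        report = json.loads(path.read_text())
        for v in report.get("violations", []):
            rows.append({
                "route": path.stem,
                "rule": v["id"],
                "impact": v.get("impact", "minor"),
                "nodes": len(v.get("nodes", [])),
            })
    # severity first, then number of affected nodes
    return sorted(rows, key=lambda r: (IMPACT_RANK[r["impact"]], -r["nodes"]))
```

Rules that recur across many routes at the top of this list are the candidates for a single DS-level fix.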
Governance
The most skipped and the most valuable layer. Ask five people on the team the same five questions: who can add a token, who can promote a component to the DS, what is the review path, who owns migrations when a token changes, who signs off on a breaking change. If the five answers diverge, the audit's most important finding is not a CSS problem. It is that nobody owns the governance loop, which is why the system drifted in the first place.
IBM Carbon offers a working reference here. The Carbon team built an internal tool called Beacon that lets product managers and adopter teams self-evaluate their adoption maturity, and paired it with an exemption process. Teams that did not want to adopt Carbon 10 could apply for an exemption, but the exemption process required executive sign-off and a real client-impact justification. In practice, the exemption path was harder than the adoption path, which is exactly the governance outcome the Carbon team was designing for (Knapsack, Lessons learned from Carbon for IBM.com).
The artifact: a governance map with roles, review steps, and decision owners; a list of ambiguous ownership zones; and a recommended RFC template for breaking changes.
Adoption
Measure the percentage of screens built with DS components versus custom code. The Salesforce Lightning Design System, the most cited adoption case, is credited with a 60% productivity increase and a 70% reduction in CSS after disciplined adoption across Sales Cloud, Service Cloud, and the broader platform (Netguru, ROI of Design Systems). IBM Cloud, after migrating to Carbon 10 with patterns and governance, reduced user journeys by seven steps and lifted its NPS score by 57 points. Those are the shape of numbers an adopting team should be able to produce. If the team cannot produce any such number, the adoption claim is not credible.
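The measurement itself reduces to a ratio per screen. A minimal Python sketch; the input shape, screen name mapped to (ds_count, custom_count), is illustrative and would be fed from the component inventory:

```python
def adoption_rate(screens: dict) -> dict:
    """Per-screen adoption: share of component instances from the DS.

    `screens` maps screen name -> (ds_count, custom_count). A screen
    with no components at all scores 0.0 rather than dividing by zero.
    """
    return {
        name: round(100 * ds / (ds + custom), 1) if ds + custom else 0.0
        for name, (ds, custom) in screens.items()
    }
```

The number matters less than the distribution: a product averaging 60% because every screen is at 60% has a different problem than one where half the screens are at 95% and half at 25%.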
Low adoption is a diagnosis, not a verdict. It usually maps to one of four root causes: poor documentation, inconsistent component naming, missing patterns because teams built their own when the DS did not have what they needed, or missing deprecation because legacy components are still available and win by habit. The audit should identify which of the four is active and quantify it per consumer.
The methodology we use
A credible audit is short and structured. For a consumer product with 40 to 80 components and a single active codebase, five working days of a senior auditor produce a serious report. Larger systems, with two or more consumers or 100+ components, take two to three weeks. Timelines stretch when governance is unclear, when the system is undocumented, or when the codebase has more hidden dialects than expected.
The scoring model is deliberately simple: each of the five layers is rated on a 0 to 2 scale. Zero means the layer is broken: no tokens enforced, components routinely re-implemented, AA accessibility failing, no one owns governance, adoption below 20%. One means the layer works in part. Two means the layer is actively healthy. A system that scores below 6 out of 10 is in drift; below 4 and the question shifts from repair to replatform.
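The model is small enough to express directly. A Python sketch of the five-layer score and the two thresholds from the text:

```python
LAYERS = ("tokens", "components", "accessibility", "governance", "adoption")

def audit_score(ratings: dict) -> tuple:
    """Score five layers rated 0-2 each (10 max) and classify the system.

    Below 6 the system is in drift; below 4 the question shifts from
    repair to replatform.
    """
    assert set(ratings) == set(LAYERS)
    assert all(0 <= r <= 2 for r in ratings.values())
    total = sum(ratings.values())
    if total < 4:
        verdict = "replatform"
    elif total < 6:
        verdict = "drift"
    else:
        verdict = "healthy"
    return total, verdict
```

A system scoring 1 on every layer, nothing broken, nothing healthy, lands at 5 and in drift, which matches the intuition the scale is built for.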
A typical five-day shape, for one consumer and 40 to 80 components:
- Day 1. Kickoff, codebase scan, token and component inventory.
- Day 2. Visual regression, component consistency check, Figma-to-code comparison.
- Day 3. Accessibility pass, axe scan on every unique route, WCAG reference.
- Day 4. Governance interviews with five team members on the same five questions, adoption metrics pull.
- Day 5. Scoring, prioritisation, roadmap drafting, stakeholder review.
The deliverable is a report with five layer-level sections, a compliance score, a ranked list of 20 to 40 concrete fixes, and a 30/60/90-day repair plan with owners and estimated hours.
After the audit: the deprecation cycle
The audit ends, the findings arrive, and now someone has to repair without breaking the product. The discipline that works here is a formal deprecation strategy. A breaking change moves through three phases: deprecation, migration, removal.
Mark the old component or token as deprecated in both the code repository and the design library. Add lint warnings that flag deprecated usage in pull requests. Write a migration guide that names the new component and shows a before-and-after example. Use an RFC process for anything that would break more than one consumer; the RFC turns the breaking change into a conversation instead of a surprise (Zeroheight, deprecating in design systems).
Support at least one major version back. Two is kinder, three becomes a tax on the DS team. A component that sees less than 5% adoption six months after release is a fast-track deprecation candidate; keeping it alive costs more than deleting it. The same logic applies in reverse to heavily-used legacy components: if 80% of screens still render the old Button after a new one ships, the new Button has a documentation or API problem, not an adoption problem.
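The two thresholds can be written down as a small triage rule. A Python sketch for a replaced component; the function shape and the return strings are illustrative, not a prescribed process:

```python
def triage(new_pct: float, old_pct: float, months_live: int) -> str:
    """Apply the two adoption thresholds from the text.

    `new_pct` / `old_pct` are the share of screens rendering the new
    and the legacy version of a replaced component.
    """
    if months_live >= 6 and new_pct < 5:
        # under 5% adoption six months in: keeping it costs more than deleting it
        return "fast-track deprecation of the new component"
    if old_pct >= 80:
        # the old version still wins by habit: the new one has a docs/API problem
        return "fix the new component's docs or API before deprecating"
    return "continue the standard deprecation cycle"
```

The point of writing it down is that both branches look like adoption failures from a dashboard, but only one of them is solved by deleting code.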
When to run an audit, and when not to
Run an audit when:
- The product has shipped for 18 months or more with a design system in place.
- Two or more consumer codebases use the same system.
- Engineers or designers complain more than once a sprint that the DS does not have what the team needs.
- A rebrand, a major redesign, or a platform migration is planned within the next six months.
- Accessibility compliance deadlines are active (EU AA since 28 June 2025; US Section 508 refresh cycles; public sector procurement thresholds).
- A major DS release with new tokens, a new component API, or a theme overhaul is under consideration; the audit scopes the cost of the upgrade honestly.
Do not run an audit when:
- The product is less than a year old. There is not enough drift to measure, and the interesting gaps are still in design, not in production.
- No one is allocated to act on the findings. An audit report that sits unread in a Notion page for six months costs more than it saves. The audit is half the work; the repair is the other half, and the second half has to be funded.
- The team is in the middle of a rewrite. Audit after the rewrite stabilises, not during. Mid-rewrite audits produce artefacts that are stale the day they ship.
Adjacent concepts
A design system audit sits in a family of related reviews. Knowing what overlaps with what prevents scope creep during the engagement.
- Component audit. Narrower scope. Inventory of components and their variants only. No governance, no adoption, no accessibility beyond the component level. Useful as a preamble to a full audit when the component library is large.
- Accessibility audit. Deeper on WCAG conformance, often run by a specialist firm for legal sign-off. A DS audit that finds serious accessibility failures usually escalates that specific layer to a dedicated accessibility audit for remediation.
- Design debt assessment. Broader than a DS audit. Looks at the product's overall UX decisions and whether the patterns shipped to production still match the user needs they were built for. A product can pass a DS audit with high scores and still be a design-debt disaster if the underlying patterns are wrong for the current user.
- Token audit. A subset of the DS audit focused only on the token layer, usually performed before a theme refresh or a brand refresh to scope the blast radius of any color or spacing change.