Teaching agents product design at Vercel

Vercel AI · John Phamous · 6/25/2026 · ~6 min read

Quick Take

Vercel teaches coding agents product design by keeping accepted product decisions in the code repository, so agents can load that context and keep designs consistent. The system combines structured routing with deterministic lint rules, which helps agents produce UI that matches established design standards while keeping findings traceable and leaving review decisions to humans.

Key Points

Vercel integrates product decisions into the code repository for agent learning.
The system features structured routing for design consistency and traceability.
Deterministic lint rules help agents produce UI that adheres to standards.
Weekly evidence intake workflow collects design feedback for continuous improvement.
Human reviewers validate agent suggestions before merging changes.

Article Content

Coding agents can produce working UI fast, but the harder problem has a different shape. They can copy your product's style, match its patterns, and try to follow its conventions. What they cannot do is understand why those patterns exist. Code shows agents what shipped, not why one component, phrase, or interaction became your standard. That reasoning lives in design reviews, PR comments, Slack threads, and with the people who were in the room. For an agent, context that isn't in the codebase doesn't exist.

Vercel is an agent-native team. We treat accepted product decisions like code, keeping them in the repository, reviewing changes against them, and making them available to every agent working there. The way we do this is through the product-design skill. It's a system with three parts.

Any team can build the same structure around their own standards. The skill lives inside the repository alongside the code it governs.

Here's a simplified view of its structure:

SKILL.md resolves the request mode first: shape, implement, review, copy, or harden. This keeps audits from becoming edits and copy passes from expanding into redesigns. It skips backend-only work, telemetry, console errors, generated files, and tests with no shipped UI impact.

The skill routes to canonical sources instead of duplicating them. Component APIs, design-system rules, accessibility criteria, and interaction guidance stay with their owners.

Routing is specific to both task and surface. Material changes load product-judgment and interface-quality first. Copy, component, layout, interaction, accessibility, and resilience work each route to focused references. A modal loads destructive-action patterns and canonical verbs. A settings form loads labels, validation, progressive disclosure, and accessible-name guidance.
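The routing described above can be sketched as a small lookup. This is a minimal illustration, not Vercel's implementation: the mode and surface names come from the text, while the `resolveReferences` function, the reference identifiers, and the assumption that only `review` mode is read-only are hypothetical.

```typescript
// Hypothetical sketch of task-and-surface routing. Reference names mirror the
// prose; how they map to files in a real repository is an assumption.
type Mode = "shape" | "implement" | "review" | "copy" | "harden";
type Surface = "modal" | "settings-form";

// Loaded first for material changes, per the article.
const MATERIAL_FIRST: string[] = ["product-judgment", "interface-quality"];

// Focused references per surface.
const SURFACE_REFERENCES: Record<Surface, string[]> = {
  modal: ["destructive-action-patterns", "canonical-verbs"],
  "settings-form": ["labels", "validation", "progressive-disclosure", "accessible-names"],
};

interface RouteRequest {
  mode: Mode;
  surface: Surface;
  materialChange: boolean;
}

interface Route {
  references: string[];
  mayEdit: boolean; // assumption: an audit ("review") never turns into an edit
}

function resolveReferences(req: RouteRequest): Route {
  const focused = SURFACE_REFERENCES[req.surface];
  const references = req.materialChange ? [...MATERIAL_FIRST, ...focused] : [...focused];
  return { references, mayEdit: req.mode !== "review" };
}
```

Resolving the mode before loading anything keeps the loaded context narrow: a copy pass on a modal never pulls in layout guidance it does not need.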

You can use this simplified structure as a starting point and replace the paths and standards with your own:

Routing is only part of what makes the skill useful. The other part is how findings stay traceable once the skill produces them. Copy rules have stable IDs and point to their canonical sources:

When Vercel Agent proposes a patch, it validates the change in a secure Vercel Sandbox with the repository's builds, tests, and linters before posting the suggestion.
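One way to picture copy rules with stable IDs is a registry that every finding must resolve against. The sketch below is an assumption-heavy illustration: the `COPY-001` ID, the `docs/copy/destructive-actions.md` path, and the `traceFinding` helper are invented here and are not taken from Vercel's repository.

```typescript
// Hypothetical rule registry: the ID format and source path are placeholders.
interface CopyRule {
  id: string;      // stable identifier that findings cite
  summary: string; // the decision in one line
  source: string;  // canonical document that owns the rule
}

interface Finding {
  ruleId: string;
  file: string;
  message: string;
}

const COPY_RULES: CopyRule[] = [
  {
    id: "COPY-001",
    summary: "Destructive actions use Verb + Noun",
    source: "docs/copy/destructive-actions.md",
  },
];

// Resolve a finding to its rule so a reviewer can follow it back to the source.
function traceFinding(finding: Finding, rules: CopyRule[]): string {
  const rule = rules.find((r) => r.id === finding.ruleId);
  if (!rule) {
    throw new Error(`Unknown rule ID: ${finding.ruleId}`);
  }
  return `${finding.file}: ${finding.message} [${rule.id}, see ${rule.source}]`;
}
```

Rejecting findings with unknown IDs is a deliberate choice in this sketch: a finding that cannot be traced to a canonical source is treated as an error instead of a suggestion.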

We prefer deterministic checks when a linter can enforce a rule reliably. Linters are fast and cheap to run, so developers and coding agents get feedback while they work instead of waiting for a later review. Code can count two or three static options, so a linter can recommend radio buttons. Naming the right object and consequence for a destructive action requires product context, so the skill handles it.
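The radio-button case shows why some rules suit a linter: the check only needs to count. A minimal sketch follows, assuming a simplified `FieldSpec` shape in place of a real AST; the type, the function name, and the message wording are hypothetical.

```typescript
// Hypothetical, simplified stand-in for what a lint rule would read from the AST.
interface FieldSpec {
  name: string;
  control: "select" | "radio";
  // A literal list of options, or "dynamic" when options are computed at runtime.
  options: string[] | "dynamic";
}

// Returns a lint message, or null when the rule has nothing to say.
function checkStaticOptions(field: FieldSpec): string | null {
  if (field.control !== "select") return null;
  if (field.options === "dynamic") return null; // cannot be counted statically
  const count = field.options.length;
  if (count === 2 || count === 3) {
    return `"${field.name}" has ${count} static options; consider radio buttons so every choice is visible.`;
  }
  return null;
}
```

The check stays silent whenever it cannot count, which keeps it deterministic. Deciding what a destructive action should be called has no such countable signal, so it stays with the skill.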

Examples in the codebase include rules that:

Each rule explains why the pattern is a problem and suggests a concrete fix. Some rules autofix safe migrations, such as replacing deprecated Tailwind utility names. Accepted decisions can take several forms:

The lint rule below shows how one product guideline is encoded as a deterministic check:

Each of these catches a class of mistake automatically, freeing code review for the decisions that actually require judgment.
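As a rough illustration of an autofix for a safe migration, and not Vercel's actual rule, the sketch below rewrites a few Tailwind utilities that were deprecated in v3 and removed in v4 (`flex-shrink-*`, `flex-grow-*`, `overflow-ellipsis`). The `fixClassName` helper is hypothetical, and it assumes variant prefixes are separated by the last colon in a token, which does not hold for arbitrary values that contain colons.

```typescript
// Renames from Tailwind's deprecated utility names to their current equivalents.
const RENAMED_UTILITIES = new Map<string, string>([
  ["flex-shrink", "shrink"],
  ["flex-shrink-0", "shrink-0"],
  ["flex-grow", "grow"],
  ["flex-grow-0", "grow-0"],
  ["overflow-ellipsis", "text-ellipsis"],
]);

interface FixResult {
  output: string;    // the rewritten class string
  changed: string[]; // one "old -> new" entry per rewritten token
}

function fixClassName(className: string): FixResult {
  const changed: string[] = [];
  const output = className
    .split(/\s+/)
    .filter(Boolean)
    .map((token) => {
      // Keep variant prefixes such as "md:" or "hover:" untouched.
      const split = token.lastIndexOf(":");
      const prefix = token.slice(0, split + 1);
      const utility = token.slice(split + 1);
      const replacement = RENAMED_UTILITIES.get(utility);
      if (replacement === undefined) return token;
      changed.push(`${token} -> ${prefix}${replacement}`);
      return prefix + replacement;
    })
    .join(" ");
  return { output, changed };
}
```

A rename table like this is safe to autofix because each entry is a one-to-one substitution with identical output CSS. Anything that changes behavior would report without fixing.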

Lint rules are deterministic, but agent behavior can vary, so we test the skill on interfaces it has not seen before. An agent edits a before state, then a judge checks the results against a rubric. Evals come from shipped examples documented in the skill. Holdouts hide their expected edits, testing whether the guidance generalizes. We also run fixtures without the skill to measure whether it changed the agent's behavior. We score rule correctness separately from similarity to the shipped result.
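The two measurements described above, skill versus no skill and rule correctness versus similarity, can be kept apart in the scoring code. This is a sketch under assumptions: the `EvalRun` shape, the 0 to 1 similarity scale, and the `summarize` function are hypothetical, and the judge that fills in these numbers is out of scope.

```typescript
// Hypothetical record of one judged run on one fixture.
interface EvalRun {
  fixture: string;
  withSkill: boolean;  // false = baseline run without the skill
  rulesPassed: number; // rubric items the judge marked correct
  rulesTotal: number;  // assumed to be greater than zero
  similarity: number;  // 0..1 closeness to the shipped result (assumed scale)
}

function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((a, b) => a + b, 0) / values.length;
}

function summarize(runs: EvalRun[]) {
  const correctness = (skill: boolean) =>
    mean(runs.filter((r) => r.withSkill === skill).map((r) => r.rulesPassed / r.rulesTotal));
  const similarity = (skill: boolean) =>
    mean(runs.filter((r) => r.withSkill === skill).map((r) => r.similarity));
  return {
    // Reported separately so a run can score well on rules while diverging
    // from shipped code that contained a flaw.
    ruleCorrectness: { withSkill: correctness(true), baseline: correctness(false) },
    similarity: { withSkill: similarity(true), baseline: similarity(false) },
    // How much the skill changed rule-following relative to the baseline.
    skillUplift: correctness(true) - correctness(false),
  };
}
```

Keeping `similarity` out of `skillUplift` is the point of the design: a high similarity score alone would reward reproducing whatever shipped.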

Shipped code can contain a flaw that the agent should improve instead of reproduce.

Product standards change as components, names, workflows, and failure states change, and every update needs evidence and human review. Our weekly evidence-intake workflow collects design feedback that may improve product-design. It searches Slack conversations and preserves links to Figma files, pull requests, review comments, and previews as evidence. When evidence is incomplete, it records the code or commit needed for verification.

The workflow separates collection from judgment:

Every candidate links to its source and remains pending. A comment from an experienced reviewer can raise its priority, but every candidate still needs evidence. Automation ends with the review packet. A human decides whether a candidate becomes agent guidance, a lint rule, an example, an eval, or no change. Accepted changes go into the narrowest relevant file and pass the relevant checks before merging.
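The split between collection and judgment can be modeled so that automation is only able to produce pending candidates. The sketch below is an illustration under assumptions: the `Candidate` shape, the numeric priority, and the function names are hypothetical, while the five outcomes come from the text.

```typescript
// The five outcomes a human reviewer can choose, as listed in the article.
type Outcome = "agent-guidance" | "lint-rule" | "example" | "eval" | "no-change";

// Hypothetical candidate record produced by the intake automation.
interface Candidate {
  summary: string;
  sourceUrl: string;  // every candidate links to its source
  evidence: string[]; // links to Figma files, pull requests, previews, and so on
  priority: number;
  status: "pending" | "decided";
  outcome?: Outcome;
  decidedBy?: string;
}

// Automation side: can only create pending candidates.
function collectCandidate(summary: string, sourceUrl: string, evidence: string[] = []): Candidate {
  return { summary, sourceUrl, evidence, priority: 0, status: "pending" };
}

// A reviewer comment raises priority but never changes status.
function raisePriority(candidate: Candidate): Candidate {
  return { ...candidate, priority: candidate.priority + 1 };
}

// Human side: a named reviewer picks the outcome, and evidence is mandatory.
function decide(candidate: Candidate, reviewer: string, outcome: Outcome): Candidate {
  if (candidate.evidence.length === 0) {
    throw new Error("Candidate needs evidence before a decision");
  }
  return { ...candidate, status: "decided", outcome, decidedBy: reviewer };
}
```

Because `collectCandidate` and `raisePriority` never set an outcome, nothing the automation does can turn feedback into guidance without a person calling `decide`.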

Our setup reflects Vercel's product, components, and review history, but other teams can adapt the structure to their own standards. Choose one product surface where the same review comments keep appearing: destructive actions, error states, settings forms, empty states, or navigation. Collect examples from shipped code and real reviews, and write down the decision, why it matters, exceptions, and the source. Avoid starting with broad adjectives like clear, polished, or intuitive. Agents need observable decisions. "Destructive actions use Verb + Noun" is usable. "Buttons should be clear" is not.

Fill in the fields specific to your surface before expanding to others. Tell agents when to load the skill in persistent repository instructions, and define the files and surfaces it covers along with the areas it must skip. In separate Next.js evals, agents failed to invoke an available skill in 56% of cases. Test the trigger separately from the guidance, because failing to load the skill and failing to follow a rule are different problems.
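The four things to write down for each decision (the decision, why it matters, exceptions, and the source) fit a small record type, and a crude check can flag entries that lean on broad adjectives. This is a sketch under assumptions: the `DecisionRecord` shape, the `isObservable` helper, the example record contents, and the word-list heuristic are hypothetical, and a word list is no substitute for a human reading the entry.

```typescript
// Hypothetical record for one accepted decision.
interface DecisionRecord {
  decision: string;     // what to do, stated so it can be checked
  why: string;          // why it matters
  exceptions: string[]; // known cases where it does not apply
  source: string;       // where the decision was made or shipped
}

// The adjectives the article warns against starting with.
const VAGUE_ADJECTIVES = ["clear", "polished", "intuitive"];
const VAGUE_PATTERN = new RegExp(`\\b(${VAGUE_ADJECTIVES.join("|")})\\b`, "i");

// Heuristic: a usable record names a reason and a source, and its decision
// does not rely on one of the vague adjectives.
function isObservable(record: DecisionRecord): boolean {
  if (record.why.trim() === "" || record.source.trim() === "") return false;
  return !VAGUE_PATTERN.test(record.decision);
}

const usable: DecisionRecord = {
  decision: "Destructive actions use Verb + Noun",
  why: "The label names the object that will be affected", // illustrative reason
  exceptions: [],
  source: "design review (placeholder)",
};

const vague: DecisionRecord = {
  decision: "Buttons should be clear",
  why: "Clarity helps users", // illustrative reason
  exceptions: [],
  source: "design review (placeholder)",
};
```

Here `isObservable(usable)` returns `true` and `isObservable(vague)` returns `false`, matching the two example statements in the text.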

Read on vercel.com
