The use of artificial intelligence agents in software development is entering a more mature phase. It is no longer just about asking a model to write code and hoping it gets it right, but about better organizing which part of the work should be handled by the most capable model, which part can be executed by a cheaper model, and where the human developer should step in. In this context, shadcn/improve appears as an agent skill with a simple idea: use the strongest model to audit a project and write improvement plans, but not to implement the changes directly.
The proposal fits a problem that is becoming increasingly common in technical teams. Advanced models are good at understanding complex codebases, detecting technical debt, prioritizing risks, and writing specifications. But they are also expensive. If they are used for every minor change, every refactor, or every repetitive adjustment, the bill can grow very quickly. shadcn/improve tries to separate intelligence from execution: the expensive model thinks and plans; another, cheaper agent implements under precise instructions.
The plan as the product, not the generated code
The philosophy behind shadcn/improve can be summarized in one sentence: the plan is the product. The skill does not modify the project’s source code. Its job is to inspect the repository, identify findings, prioritize them, and generate self-contained implementation plans in Markdown inside a plans/ folder. Then another agent, or a human developer, can execute those plans.
This nuance matters because it changes the way AI is used in development. Instead of delegating an entire task to an agent that improvises as it goes, the system first forces the problem to be documented, file paths to be cited, relevant excerpts of the current state to be included, verifiable steps to be defined, test commands to be listed, and stop conditions to be made explicit. It is a way to turn AI usage into a workflow closer to how a tech lead works: review, prioritize, specify, and then execute with control.
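As a rough illustration of what such a self-contained plan might carry, the sketch below models one in Python and renders it to Markdown. The field names, section headings, and sample values are assumptions made for illustration; the actual format that shadcn/improve writes is not described here, and code excerpts are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    instruction: str     # what the executor should do
    verify_command: str  # command that proves the step worked
    expected: str        # what that command should report


@dataclass
class Plan:
    title: str
    problem: str
    paths: list[str]
    steps: list[Step]
    stop_conditions: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the plan as a Markdown document (hypothetical layout)."""
        lines = [f"# {self.title}", "", "## Problem", self.problem, "", "## Files"]
        lines += [f"- {p}" for p in self.paths]
        lines += ["", "## Steps"]
        for i, step in enumerate(self.steps, start=1):
            lines.append(f"{i}. {step.instruction}")
            lines.append(f"   - Verify: `{step.verify_command}` (expect: {step.expected})")
        lines += ["", "## Stop conditions"]
        lines += [f"- {c}" for c in self.stop_conditions]
        return "\n".join(lines) + "\n"


# Hypothetical plan content, invented for this example.
example = Plan(
    title="Deduplicate date formatting helpers",
    problem="Two modules implement the same formatting logic.",
    paths=["src/utils/date.py", "src/reports/format.py"],
    steps=[
        Step(
            instruction="Move the shared helper into src/utils/date.py",
            verify_command="pytest tests/test_date.py",
            expected="all tests pass",
        )
    ],
    stop_conditions=["Stop if src/utils/date.py no longer defines the helper."],
)
```

Calling `example.to_markdown()` returns a document that could be saved under plans/ and handed to any executor, with no dependency on the session that produced it.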
| Command | Main use |
|---|---|
| /improve | Full repository audit, prioritized findings, and plans |
| /improve quick | Fast, lower-cost review of critical points |
| /improve deep | Exhaustive audit by packages and categories |
| /improve security | Security-focused review |
| /improve branch | Audit limited to the current branch changes |
| /improve next | Project evolution suggestions based on evidence |
| /improve plan <description> | Create a specific plan without a prior audit |
| /improve review-plan <file> | Critique and refine an existing plan |
| /improve execute <plan> | Delegate execution to a cheaper agent and review its work |
| /improve reconcile | Update the plan backlog, verify, unblock, or retire items |
Installation is presented with a direct command: npx skills add shadcn/improve. It works in agents compatible with the Agent Skills format and generates plans in plain text. This makes the result independent from a specific model session. A plan can be read by another agent, another developer, or the team itself during an internal review.
Why separating audit and execution can save money
The approach makes economic sense. In AI-assisted programming, the most expensive parts are usually those that require deeper understanding: reading the repository, understanding conventions, assessing impact, detecting real problems, and avoiding false positives. This is where a high-end model can provide more value. By contrast, implementing a concrete list of steps, running tests, applying mechanical changes, or fixing a localized duplication can be delegated to a cheaper model if the plan is well written.
This does not mean any cheap model can handle any change. The key lies in the quality of the specification. shadcn/improve writes plans for “the weakest plausible executor”, meaning an agent that has never seen the advisor session and may have far less reasoning ability. That is why it includes context, exact paths, code excerpts, verification commands, completion criteria, and explicit boundaries.
| Phase | Recommended model | Reason |
|---|---|---|
| Repository reconnaissance | Powerful model | Needs to understand architecture, stack, and conventions |
| Problem audit | Powerful model | Requires technical judgment and prioritization |
| Plan writing | Powerful model | Must produce complete and verifiable instructions |
| Guided implementation | Cheaper model or human | Follows defined and bounded steps |
| Test execution | Cheap agent or local environment | Mechanical work with verifiable output |
| Final review | Powerful model or senior developer | Checks intent, scope, and diff quality |
The savings do not come only from using the expensive model less. They may also come from reducing failed iterations. Many AI development costs appear when the agent changes too much, breaks tests, goes out of scope, or invents a solution because it did not understand the context. If the plan includes stop conditions, expected commands, and scope limits, the executor has less room to drift.
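The arithmetic behind this argument can be sketched in a few lines. Every number below (prices per million tokens, token counts) is a made-up assumption chosen only to make the comparison visible; none of them comes from the project or from any provider's price list.

```python
# Hypothetical figures, chosen only to illustrate the arithmetic.
STRONG_PRICE = 15.0           # dollars per million tokens (assumed)
CHEAP_PRICE = 1.0             # dollars per million tokens (assumed)
PLANNING_TOKENS = 400_000     # audit + plan writing (assumed)
EXECUTION_TOKENS = 1_600_000  # one implementation attempt (assumed)


def cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of a number of tokens at a given price per million."""
    return tokens / 1_000_000 * price_per_million


def workflow_cost(execution_price: float, failed_attempts: int = 0) -> float:
    """Planning always runs on the strong model; execution runs at
    `execution_price` and is repeated once per failed attempt."""
    planning = cost(PLANNING_TOKENS, STRONG_PRICE)
    execution = cost(EXECUTION_TOKENS, execution_price) * (1 + failed_attempts)
    return planning + execution


all_strong = workflow_cost(STRONG_PRICE)
split = workflow_cost(CHEAP_PRICE)
all_strong_one_failure = workflow_cost(STRONG_PRICE, failed_attempts=1)
split_one_failure = workflow_cost(CHEAP_PRICE, failed_attempts=1)
```

Under these assumed numbers, the all-strong workflow costs about $30.00 and the split workflow about $7.60. A single failed execution attempt raises them to about $54.00 and $9.20, which is why a plan that prevents drift matters more when the executor is expensive.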
Auditing with evidence, not generic recommendations
Another relevant aspect is that the skill does not aim to produce generic “best practices” lists. During the audit, it dispatches subagents across categories such as correctness, security, performance, test coverage, technical debt, dependencies, developer experience, documentation, and product direction. Every finding must be backed by evidence from the repository itself, with references to files and lines.
Then the advisor rereads the cited points before showing them. This second review attempts to reduce false positives, correct wrong attributions, and record rejections so they do not reappear in future runs. In the example shared by the project, a supposed SSRF alert linked to an https_proxy variable is rejected as expected behavior because it follows a standard convention used by many CLI tools.
| Audit category | What it can detect |
|---|---|
| Correctness | Logical errors, edge cases, or inconsistent behavior |
| Security | Real risks backed by code evidence |
| Performance | Expensive algorithms, inefficient queries, or duplicated work |
| Tests | Critical areas without enough coverage |
| Technical debt | Duplications, broken abstractions, or incomplete migrations |
| Dependencies | Updates, incompatibilities, or problematic packages |
| DX | Friction in development, testing, or deployment |
| Documentation | Incomplete or outdated instructions |
| Direction | Product ideas justified by the repository’s current state |
This insistence on evidence is useful because one of the biggest problems with coding agents is noise. A model can detect “problems” that are actually deliberate decisions, accepted debt, or project-specific patterns. If every finding must cite concrete code and pass an internal review, the output looks more like a technical audit and less like a list of generic tips.
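A mechanical part of that review can be sketched as follows: check that each finding's citation really matches the code, and skip findings that were rejected in an earlier run. This is only an illustration. The `Finding` fields, the fingerprint scheme, and the in-memory `files` mapping are assumptions, and the judgment call itself (for example, deciding that honoring https_proxy is expected behavior) stays with the advisor.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    category: str
    path: str
    line: int     # 1-based line number cited as evidence
    excerpt: str  # text the finding claims to quote from that line
    claim: str

    def fingerprint(self) -> str:
        """Stable ID that ignores the line number, which shifts between runs."""
        raw = f"{self.category}|{self.path}|{self.excerpt.strip()}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


def evidence_holds(finding: Finding, files: dict[str, str]) -> bool:
    """True if the cited line exists and contains the quoted excerpt."""
    source = files.get(finding.path)
    if source is None:
        return False
    lines = source.splitlines()
    if not 1 <= finding.line <= len(lines):
        return False
    return finding.excerpt.strip() in lines[finding.line - 1]


def triage(
    findings: list[Finding],
    files: dict[str, str],
    rejections: dict[str, str],
) -> list[Finding]:
    """Keep findings whose citation checks out and that were not rejected before.

    `rejections` maps a fingerprint to the recorded reason for rejecting it.
    """
    kept = []
    for finding in findings:
        if finding.fingerprint() in rejections:
            continue  # already rejected in an earlier run
        if not evidence_holds(finding, files):
            continue  # the citation does not match the code
        kept.append(finding)
    return kept
```

Once the advisor rejects a finding, storing `rejections[finding.fingerprint()] = "expected behavior"` keeps it from resurfacing in the next audit.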
Plans designed to survive outside the session
The plans generated by shadcn/improve are self-contained. They include the commit they were written against, so the executor can run a drift check before touching anything. If the code has changed too much, the executor must stop and report the issue instead of improvising.
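A drift check of this kind could look roughly like the sketch below. It assumes the plan records a reference commit and a list of file paths, and it treats any change to one of those files as drift. Both the function names and that threshold are assumptions, since the criteria the skill actually applies are not specified here.

```python
from __future__ import annotations

import subprocess


def changed_since(commit: str, repo: str = ".") -> list[str]:
    """Files that differ between the plan's reference commit and HEAD."""
    result = subprocess.run(
        ["git", "diff", "--name-only", commit, "HEAD"],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def drifted_paths(plan_paths: list[str], changed: list[str]) -> list[str]:
    """Plan files that were modified after the plan was written."""
    touched = set(changed)
    return [path for path in plan_paths if path in touched]


def should_stop(plan_paths: list[str], changed: list[str]) -> bool:
    """Assumed policy: stop as soon as any file named in the plan has changed."""
    return bool(drifted_paths(plan_paths, changed))
```

An executor would call `changed_since(reference_commit)` once, pass the result to `should_stop`, and report the drifted paths instead of proceeding when it returns `True`.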
They also include “verification gates”. Each step ends with a command and an expected output. This makes success more measurable. The agent does not have to decide whether something “looks done”; it must pass tests, linting, compilation, or other criteria defined by the repository itself.
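The idea of a verification gate can be sketched as a command paired with an expected output. In this minimal version, a gate passes only if the command exits with status 0 and its standard output contains the expected text. The `Gate` type and the substring comparison are assumptions; a real plan might compare output differently.

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gate:
    command: list[str]  # argument list, run without a shell
    expected: str       # text that must appear in standard output


def run_gate(gate: Gate, timeout: float = 60.0) -> bool:
    """Run one gate: success needs exit code 0 and the expected text."""
    result = subprocess.run(
        gate.command, capture_output=True, text=True, timeout=timeout
    )
    return result.returncode == 0 and gate.expected in result.stdout


def first_failure(gates: list[Gate]) -> Optional[int]:
    """Index of the first failing gate, or None when every gate passes."""
    for index, gate in enumerate(gates):
        if not run_gate(gate):
            return index
    return None


# Stand-in commands that use the current Python interpreter, so the
# example runs anywhere; a real plan would name the repository's own
# test, lint, or build commands.
passing = Gate([sys.executable, "-c", "print('ok')"], expected="ok")
failing = Gate([sys.executable, "-c", "import sys; sys.exit(1)"], expected="")
```

Stopping at the first failing gate mirrors the intent described above: the executor reports which step failed rather than deciding for itself whether the work "looks done".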
| Plan property | Value for the team |
|---|---|
| Inline context | The executor does not depend on the original conversation |
| Exact paths | Reduces unnecessary exploration |
| Code excerpts | Clarifies the current state before the change |
| Verified commands | Uses the repository’s real tools |
| Completion criteria | Avoids ambiguous closures |
| Stop conditions | Prevents a smaller model from improvising |
| Reference commit | Detects whether the plan has become outdated |
| Scope limits | Reduces unwanted side changes |
This approach can fit well in teams that already use issues, pull requests, and reviews. Plans can be published as GitHub issues with --issues, so the work lands where the team already manages its backlog. For organizations that want to introduce agents without losing control, this is a practical idea: AI does not replace the process; it writes better work items that then enter the process.
Isolated execution and result review
The skill also includes /improve execute <plan>, which dispatches a cheaper executor in an isolated worktree, hands it the plan, and then reviews the result. The workflow goes back through the completion criteria, checks whether the diff respects the scope, and issues a verdict: approve, request revision, or block and refine the plan.
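The scope check and the three verdicts can be illustrated with a small sketch. The mapping from conditions to verdicts below is an assumption made for the example (out-of-scope changes or unmet criteria lead to a revision request; a plan the executor could not follow leads to a block), not the skill's documented logic.

```python
from __future__ import annotations

from enum import Enum
from pathlib import PurePosixPath


class Verdict(Enum):
    APPROVE = "approve"
    REVISE = "request revision"
    BLOCK = "block and refine the plan"


def _inside(path: str, root: str) -> bool:
    """True if `path` is `root` itself or lies underneath it."""
    p, r = PurePosixPath(path), PurePosixPath(root)
    return p == r or r in p.parents


def out_of_scope(changed: list[str], allowed: list[str]) -> list[str]:
    """Changed files that fall outside every path the plan allows."""
    return [c for c in changed if not any(_inside(c, a) for a in allowed)]


def review(
    changed: list[str],
    allowed: list[str],
    criteria_passed: bool,
    plan_was_followable: bool = True,
) -> Verdict:
    """Assumed decision rule for the three verdicts."""
    if not plan_was_followable:
        return Verdict.BLOCK
    if out_of_scope(changed, allowed) or not criteria_passed:
        return Verdict.REVISE
    return Verdict.APPROVE
```

Comparing path components rather than string prefixes avoids treating a sibling directory such as src/api2 as part of src/api.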
Merging the change remains in the user’s hands. This matters from a security and control perspective. The agent can prepare a proposal, but it does not make the final decision to integrate it. For many teams, this separation may be the difference between using AI as an assistant and letting it modify a product without enough supervision.
| Hard rule | Why it matters |
|---|---|
| The skill does not modify source code | Reduces risk during the audit phase |
| It only writes to plans/ | Limits the scope of its direct changes |
| It does not run commands that mutate the working tree | Avoids side effects during analysis |
| It does not reproduce secrets | Only reports location and credential type |
| Executors work in disposable worktrees | Isolates changes and makes review easier |
| The merge remains in the user’s hands | Keeps human control over the repository |
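The rule about secrets, reporting where a credential is and what kind it is without repeating its value, can be sketched like this. The two patterns are common, simplified heuristics picked for illustration and the `SecretLocation` type is hypothetical; how the skill itself detects credentials is not described here.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Simplified, illustrative patterns; real scanners use many more.
PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


@dataclass(frozen=True)
class SecretLocation:
    path: str
    line: int  # 1-based
    kind: str  # credential type only; the matched value is never stored


def locate_secrets(path: str, text: str) -> list[SecretLocation]:
    """Return where credentials appear and of what type, not their values."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append(SecretLocation(path, number, kind))
    return hits
```

Because the report type has no field for the matched text, a plan built from it cannot leak the credential by accident.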
There is also /improve reconcile, designed to clean up the backlog: verify plans that have already been executed, investigate blocked ones, refresh plans that have drifted, and retire findings that were fixed through another path. This part is less flashy than the initial command, but it may be one of the most useful. Improvement plans age quickly if the repository changes every day.
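The four reconcile outcomes can be pictured as a small decision function over backlog items. The status labels, the two boolean flags, and the order of the checks are all assumptions for the sake of the example; the skill's real bookkeeping is not specified here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BacklogItem:
    name: str
    status: str                 # "open", "executed", or "blocked" (assumed labels)
    drifted: bool = False       # plan files changed since the reference commit
    still_present: bool = True  # the finding still reproduces in current code


def next_action(item: BacklogItem) -> str:
    """Pick one reconcile action for a backlog item (assumed ordering)."""
    if item.status == "executed":
        return "verify"       # confirm that the executed plan really landed
    if not item.still_present:
        return "retire"       # fixed through another path
    if item.status == "blocked":
        return "investigate"  # find out what stopped the executor
    if item.drifted:
        return "refresh"      # rewrite the plan against current code
    return "keep"
```

Running `next_action` over every item in plans/ would give a team a short list of what to verify, investigate, refresh, or retire before scheduling new work.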
A signal of where coding agents are heading
shadcn/improve is not just a curious tool for users of Claude Code, Codex, or other compatible environments. It represents a broader trend: development agents need processes, limits, and intermediate products. Asking “fix my repo” is too open-ended. Asking “audit, prioritize, write verifiable plans, and let someone else execute” is much more controllable.
This pattern resembles how human teams work. An architect or tech lead usually does not implement every change. They analyze, decide priorities, write tickets, review proposals, and validate results. AI can provide value in that role if it has access to the repository, understands conventions, and produces plans that others can execute.
The idea also fits the price war in AI models. If the most capable models remain expensive, companies will need to reserve them for tasks where their reasoning actually makes a difference. Mechanical execution, repetitive changes, and tests can move to cheaper models or automated tools.
It does not replace technical review, but it can raise the baseline
The real usefulness will depend on each repository. A small project may not need such a structured audit. A large monorepo, with technical debt, pending migrations, and several teams, may get much more value from it. It will also depend on the quality of the model used as advisor and on the team’s discipline in reviewing the plans before executing them.
It should not be presented as a replacement for a senior developer. It can, however, act as a multiplier. It can find duplications, turn scattered findings into clear plans, generate actionable issues, and prevent a cheap agent from working without context. For many teams, that is already a meaningful step forward.
AI-assisted programming is moving beyond the era of improvised prompts. The next step will be workflows where expensive models reason, cheap models execute, tests verify, and humans keep the final decision. shadcn/improve points exactly in that direction: less magic, more process, and plans that can actually be reviewed.
Frequently asked questions
What is shadcn/improve?
It is an Agent Skill that audits a repository, detects possible improvements, and writes Markdown implementation plans for other agents or humans to execute.
How is it installed?
The project says it can be installed with the command npx skills add shadcn/improve in environments compatible with the Agent Skills format.
Does it make changes to the code?
Not directly. The skill writes plans in the plans/ folder. Execution can be delegated to another agent in an isolated worktree, but the final merge remains under the user’s control.
Why can it reduce AI costs?
Because it allows teams to use an expensive model for understanding and planning, and cheaper models for executing well-scoped tasks with verification commands and clear limits.
