my-claude-marketplace
Browse the plugin reference book
Personal Claude Code and Codex plugin marketplace - MCP servers, skills, and tools.
Usage
Claude Code
Add this marketplace:
/plugin marketplace add https://github.com/JedimEmO/my-claude-marketplace.git
Install a plugin:
/plugin install <plugin-name>@my-claude-marketplace
Codex
Add this marketplace:
codex plugin marketplace add JedimEmO/my-claude-marketplace
Then open /plugins in Codex, choose Mathias Marketplace, and install the plugins you want.
Adding a plugin
Create a plugin directory under plugins/:
plugins/my-plugin/
├── .claude-plugin/
│   └── plugin.json
├── .codex-plugin/
│   └── plugin.json
└── skills/              # and/or commands/, agents/, hooks/, .mcp.json
    └── my-skill/
        └── SKILL.md
Then add entries to both marketplace registries.
Claude entry in .claude-plugin/marketplace.json:
{
  "name": "my-plugin",
  "source": "my-plugin",
  "description": "What it does",
  "version": "1.0.0"
}
The source is relative to plugins/ (configured via pluginRoot).
Codex entry in .agents/plugins/marketplace.json:
{
  "name": "my-plugin",
  "source": {
    "source": "local",
    "path": "./plugins/my-plugin"
  },
  "policy": {
    "installation": "AVAILABLE",
    "authentication": "ON_INSTALL"
  },
  "category": "Coding"
}
Codex plugin manifests should point at the shared skill tree:
{
  "skills": "./skills/"
}
If the plugin has .mcp.json, add "mcpServers": "./.mcp.json" to plugins/my-plugin/.codex-plugin/plugin.json.
---
name: agent-communication
description: Use when the user asks about how agents communicate, orchestration vs choreography, delegation patterns, agent-to-agent messaging, trust boundaries, capability gates, human-in-the-loop checkpoints, or back-pressure in multi-agent systems.
version: 1.0.0
---
Agent Communication — Delegation, Trust, and Flow Control
Once you have decomposed work into multiple agents, communication is where the real complexity lives. Get it wrong and you end up with chatty agents burning tokens on round-trips that accomplish nothing, context degradation through long delegation chains, or security holes where agents access capabilities they were never meant to have.
The goal is always the same: get the right information to the right agent with the minimum overhead, and make sure no agent can do more than its role requires.
Orchestration vs Choreography
These are the two fundamental coordination strategies. Most real systems use a hybrid, but understanding the pure forms matters.
Orchestration
A central coordinator owns the workflow. It decides what happens next, delegates to specialists, collects results, and makes routing decisions.
Strengths:
- Easy to reason about — follow the coordinator's trace and you see the whole flow
- Clear control flow with explicit sequencing and branching
- Simple error handling — the coordinator decides what to do when a specialist fails
- Natural place to enforce budget limits and deadlines
Weaknesses:
- Coordinator is a bottleneck and single point of failure
- Coordinator's context grows with system complexity — it needs to understand enough to route
- Adding a new specialist means changing the coordinator
Use when: the workflow has clear sequential or branching logic, you need guaranteed ordering, or the system is small-to-medium (under 5 specialists).
Choreography
Agents react to events autonomously. No central controller. Each agent knows its triggers and what to produce. Agents publish results; interested agents pick them up.
Strengths:
- No single bottleneck — agents operate independently
- Agents are independently deployable and replaceable
- Naturally resilient — one agent failing doesn't block others (unless they depend on its output)
Weaknesses:
- Hard to debug — no single place shows the full flow
- Emergent behavior can surprise you — interactions between agents create effects nobody designed
- Difficult to guarantee ordering or ensure all steps completed
- Error handling is distributed and harder to get right
Use when: agents are truly independent, you need high resilience, or workflows are simple event-reaction pairs with minimal coordination.
The Hybrid Approach (Usually the Right Choice)
Use a coordinator for the happy path — the main workflow that needs to happen in order. But let specialists escalate, emit events for exceptional cases, or communicate directly when it makes sense. The coordinator owns the skeleton; agents add flesh where needed.
This gives you debuggability (follow the coordinator) with flexibility (agents can handle edge cases locally).
Decision rule: default to orchestration. The coordinator owns the main happy path, controls sequencing, and is your primary observability checkpoint. Add choreography at the edges — for exception handling, escalation, and cases where a specialist needs to notify others without waiting for the coordinator to route. Keep the skeleton orchestrated; add choreography as a targeted enhancement, not a starting point.
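A minimal Python sketch of this hybrid, as a rough illustration: an orchestrated happy path with a small event hook for choreography at the edges. All names here are illustrative, not a prescribed API.

```python
from typing import Callable

class Coordinator:
    """Owns the orchestrated skeleton; events cover the edges."""

    def __init__(self) -> None:
        self.subscribers: dict[str, list[Callable[[dict], None]]] = {}

    def on(self, event: str, handler: Callable[[dict], None]) -> None:
        # Choreography edge: agents subscribe to exceptional events.
        self.subscribers.setdefault(event, []).append(handler)

    def emit(self, event: str, payload: dict) -> None:
        for handler in self.subscribers.get(event, []):
            handler(payload)

    def run(self, task: dict, specialists: list[Callable[[dict], dict]]) -> list[dict]:
        # Orchestrated happy path: explicit sequencing, one place to trace.
        results = []
        for specialist in specialists:
            result = specialist(task)
            if result.get("status") == "escalate":
                # Edge case leaves the main path via an event, not via routing.
                self.emit("escalation", result)
            results.append(result)
        return results

coord = Coordinator()
coord.on("escalation", lambda r: print("escalated:", r["reason"]))
coord.run({"goal": "review auth module"},
          [lambda t: {"status": "ok"},
           lambda t: {"status": "escalate", "reason": "needs human"}])
```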
Delegation Patterns
How one agent hands work to another. Pick the simplest pattern that works.
Direct Invocation
Agent A calls Agent B as a subtask, like a function call. A blocks until B returns a result.
- Synchronous, tightly coupled, simple to understand
- Best for: coordinator calling a specialist when it needs the result to continue
- Risk: deep call chains lose context at each hop and are hard to debug
- Keep chains to 2 hops maximum — if you need more, your decomposition is wrong
Message Passing
Agent A publishes a message or artifact. Agent B subscribes to a topic or queue and picks it up. A does not wait for B.
- Asynchronous, loosely coupled, naturally parallel
- Best for: event-driven flows, fan-out to multiple consumers, when A does not need B's result immediately
- Risk: harder to trace end-to-end, messages can be lost or processed out of order, eventual consistency issues
- Requires explicit correlation IDs to trace a request through the system
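A rough sketch of message passing with explicit correlation IDs, using an in-process queue as a stand-in for a real broker (names are illustrative):

```python
import queue
import uuid

bus: "queue.Queue[dict]" = queue.Queue()

def publish(topic: str, body: dict, correlation_id: str | None = None) -> str:
    # Every message carries the correlation ID so the request can be
    # traced end-to-end across agents.
    cid = correlation_id or str(uuid.uuid4())
    bus.put({"topic": topic, "correlation_id": cid, "body": body})
    return cid

cid = publish("research.requested", {"query": "JWT refresh patterns"})
msg = bus.get()                          # a subscriber picks it up asynchronously
assert msg["correlation_id"] == cid      # the same ID flows through every hop
```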
Shared Workspace
Agents read from and write to a common artifact store — a file system, database, or shared context object. Collaboration happens through the artifacts, not direct communication.
- Best for: iterative refinement workflows (drafting agent writes, review agent reads and annotates), parallel work on different aspects of the same artifact
- Risk: write conflicts when multiple agents modify the same artifact, stale reads if an agent caches, no built-in ordering
- Mitigate with: clear ownership (one writer per artifact section), versioning, or optimistic locking
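A minimal sketch of the optimistic-locking mitigation, assuming a simple in-memory artifact store (names are illustrative):

```python
class ConflictError(Exception): ...

class Workspace:
    def __init__(self) -> None:
        self.artifacts: dict[str, tuple[int, str]] = {}  # name -> (version, content)

    def read(self, name: str) -> tuple[int, str]:
        return self.artifacts.get(name, (0, ""))

    def write(self, name: str, content: str, expected_version: int) -> int:
        # The write succeeds only if the writer saw the latest version.
        version, _ = self.artifacts.get(name, (0, ""))
        if version != expected_version:
            raise ConflictError(f"{name}: expected v{expected_version}, found v{version}")
        self.artifacts[name] = (version + 1, content)
        return version + 1

ws = Workspace()
v, draft = ws.read("report.md")
ws.write("report.md", draft + "\n## Findings", expected_version=v)
```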
Structured Handoff Objects
When delegating, pass a structured handoff — not raw conversation history. A handoff object includes:
- Task: what the specialist should do, stated as a clear objective
- Context: relevant facts the specialist needs, pruned to essentials
- Constraints: time budget, token budget, tool restrictions, output format
- Expected output: what the result should look like
Example handoff from coordinator to a research specialist:
{
  "task": "Find security best practices for JWT token refresh",
  "context": {
    "project_language": "Rust",
    "current_approach": "Rotating refresh tokens with 24h expiry",
    "concern": "User reported tokens not refreshing in mobile app"
  },
  "constraints": {
    "max_sources": 3,
    "token_budget": 4000,
    "focus": "mobile-specific JWT issues, not general JWT tutorials"
  },
  "expected_output": {
    "format": "structured findings",
    "fields": ["source_url", "finding", "relevance_to_our_case", "recommended_action"]
  }
}
Notice what is NOT in the handoff: the full conversation history, unrelated project details, the coordinator's reasoning chain. The specialist gets exactly what it needs to do its job.
This is the single most important pattern for preventing context degradation. Never forward full conversation history between agents. Every handoff is an opportunity to compress, focus, and clarify.
Conversation Threading and Context Flow
How context moves between agents determines system quality. Get this wrong and agents act on stale or distorted information.
Full Context Forwarding
Pass everything from one agent to the next. The entire conversation, all artifacts, full history.
- Simple to implement — just pass it along
- Context grows linearly with chain length
- Quickly hits token limits
- Almost never the right choice beyond a single hop
Summary Passing
The coordinator summarizes relevant context before delegating. Only the summary is passed to the specialist.
- Bounded context size regardless of conversation length
- Lossy — the coordinator's judgment determines what matters
- Good enough for many workflows, especially when specialists are narrowly focused
- The coordinator must be good at summarization, which is an underrated requirement
Structured Handoff (Recommended Default)
Define an explicit schema for what gets passed between agents. Treat it like an API contract.
- Most reliable approach — forces you to think about what information actually matters
- Self-documenting — the schema tells you what each agent needs
- Testable — you can validate handoff objects independently
- Decouples agents — as long as the contract holds, implementations can change
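A rough sketch of such a contract in Python, mirroring the handoff fields from the example above (the validation rules are illustrative assumptions):

```python
from dataclasses import dataclass, field

@dataclass
class Handoff:
    task: str
    context: dict = field(default_factory=dict)
    constraints: dict = field(default_factory=dict)
    expected_output: dict = field(default_factory=dict)

    def validate(self) -> None:
        # Reject handoffs that would force the specialist to guess.
        if not self.task.strip():
            raise ValueError("handoff must state a task")
        if "token_budget" not in self.constraints:
            raise ValueError("handoff must carry an explicit token budget")

h = Handoff(task="Find JWT refresh best practices",
            constraints={"token_budget": 4000})
h.validate()  # fails fast if the contract is broken
```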
The Context Compression Problem
Every hop between agents loses information. This is unavoidable — the question is how much and whether you lose the right things.
A three-hop chain where each hop retains 70% of context delivers only 0.7³ ≈ 34% of the original information to the final agent; a four-hop chain delivers 0.7⁴ ≈ 24%. This compounds fast.
Mitigations:
- Keep chains short (2 hops max for most workflows)
- Use structured handoffs to preserve critical information explicitly
- Have the final agent access the original source when possible, not just what was passed through the chain
- If a chain must be longer, use the coordinator as a context authority that specialists can query
Concrete example of context degradation:
User asks: "Review the auth module for security issues, focusing on the token refresh flow and the session cleanup cron job."
- Coordinator (full context): passes to research agent: "Research security patterns for JWT refresh and session cleanup"
- Research agent (70% retained): searches for "JWT refresh security" — the session cleanup part was in the context but not in the search query
- Research agent returns: findings about JWT refresh only
- Coordinator passes to reviewer: "Review auth module with these security findings" — session cleanup concern is now gone from the working context
- Reviewer: produces a review covering only JWT refresh. Session cleanup is never reviewed.
The user's full request contained two concerns. After two hops, only one survived. This is the telephone game. Fix it by including both concerns explicitly in every structured handoff, not relying on context to carry them.
Trust Boundaries and Capability Gates
Security in agent systems is enforced at the communication layer. An agent's tool list IS its permission set — this is the simplest and most effective security model.
Least Privilege
Every agent should have exactly the tools it needs for its role and nothing more.
- Read-only agents (researchers, analyzers) should not have Write or Bash
- Agents that modify code should not have deployment tools
- Agents that interact with external services should be isolated from internal systems
- When in doubt, start with fewer tools and add as needed — it is much easier to grant access than to revoke it after a mistake
Separating Read and Write Agents
For high-stakes workflows, split agents by whether they read or write:
- Analysis agents: Read, Glob, Grep — can explore freely but cannot change anything
- Modification agents: Edit, Write — can change things but only when given explicit instructions from a coordinator
- Execution agents: Bash — isolated, monitored, with constrained command sets
This separation means a confused or misbehaving analysis agent cannot accidentally modify files, and a modification agent cannot accidentally run destructive commands.
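A minimal sketch of this capability gate, assuming all tool dispatch goes through a single checkpoint (role and tool names mirror the lists above):

```python
# Each role maps to an explicit tool allowlist: the tool list IS the
# permission set, checked before any tool call is dispatched.
ALLOWED_TOOLS = {
    "analysis": {"Read", "Glob", "Grep"},   # explore freely, change nothing
    "modification": {"Edit", "Write"},      # change things, never execute
    "execution": {"Bash"},                  # isolated and monitored
}

def check_capability(role: str, tool: str) -> None:
    if tool not in ALLOWED_TOOLS.get(role, set()):
        raise PermissionError(f"role {role!r} may not call {tool!r}")

check_capability("analysis", "Grep")        # ok
try:
    check_capability("analysis", "Write")   # analysts are read-only
except PermissionError as e:
    print(e)
```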
Human-in-the-Loop Checkpoints
Human approval is not just a safety mechanism — it is a communication checkpoint. The agent must explain what it intends to do clearly enough for a human to make a decision.
Place checkpoints at trust boundary crossings:
- Before external side effects (API calls, file writes, deployments)
- Before irreversible actions (deletions, sends, publishes)
- Before high-cost operations (expensive API calls, large-scale changes)
- When the agent's confidence is below a threshold
The approval request must include:
- What will happen (the specific action)
- Why the agent decided this (reasoning chain)
- What the alternatives were (so the human can pick a different path)
- What happens if the human says no (graceful fallback)
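As a rough illustration, one possible shape for such an approval request; every field name here is an assumption, not a prescribed schema:

```python
approval_request = {
    "action": "Delete 42 stale rows from prod.sessions",            # what will happen
    "reasoning": "Rows exceed the 30-day retention policy",         # why the agent chose this
    "alternatives": ["archive instead of delete", "dry-run only"],  # other paths
    "on_reject": "Skip cleanup this run; flag for manual review",   # graceful fallback
}
```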
Escalation Patterns
Define escalation triggers explicitly in each agent's system prompt. Do not rely on the agent to figure out when to escalate — that is an unreliable heuristic.
- Specialist to Coordinator: "I cannot handle this input, routing back with explanation"
- Any Agent to Human: "This exceeds my confidence or authority, here is what I recommend"
- Coordinator to Fallback: "Primary specialist failed, trying alternative approach"
Each escalation must include: what was attempted, why it failed, and what the escalating agent recommends as a next step.
Back-Pressure and Flow Control
Agents can generate work faster than downstream agents can process it. Without flow control, you get cascading token spend and runaway costs.
Depth Limits
Cap how many levels deep a delegation chain can go. Three is usually plenty. If you find yourself needing more, your decomposition is almost certainly wrong — you are creating agents for what should be steps within a single agent.
Pass the current depth as part of every handoff. Each agent increments it and refuses to delegate further when the limit is reached, returning its best partial result instead.
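A minimal sketch of depth-limited delegation (the handoff shape and the limit are illustrative):

```python
MAX_DEPTH = 3

def delegate(handoff: dict, specialist) -> dict:
    depth = handoff.get("depth", 0)
    if depth >= MAX_DEPTH:
        # Refuse to delegate further; return the best partial result instead.
        return {"status": "partial", "reason": "depth limit reached"}
    # Each hop increments the depth carried in the handoff.
    return specialist({**handoff, "depth": depth + 1})
```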
Token and Cost Budgets
Set a total token budget for a multi-agent task. The coordinator subdivides it across specialists based on expected complexity.
- When a specialist's budget is exhausted, it must return its best partial result — not fail silently
- The coordinator tracks total spend and can abort early if costs are trending above the budget
- Log actual vs budgeted spend for every task to calibrate future budgets
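A rough sketch of coordinator-side budget enforcement (numbers and names are illustrative):

```python
class BudgetExceeded(Exception): ...

class TokenBudget:
    def __init__(self, total: int) -> None:
        self.total = total
        self.spent = 0

    def charge(self, agent: str, tokens: int) -> None:
        self.spent += tokens
        if self.spent > self.total:
            # Abort early rather than letting costs trend past the budget.
            raise BudgetExceeded(f"{agent} pushed spend to {self.spent}/{self.total}")

budget = TokenBudget(total=20_000)
budget.charge("research", 4_000)
budget.charge("review", 6_000)   # coordinator logs actual vs budgeted spend
```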
Concurrency Limits
Cap how many specialists a coordinator can run in parallel. More parallelism means more simultaneous token spend, and results often need to be reconciled — which costs tokens too.
- Start with sequential execution and parallelize only when you have evidence it helps
- Two or three parallel specialists is usually the sweet spot
- Beyond that, the coordinator's reconciliation cost starts to dominate
Timeout Budgets
Set a wall-clock deadline for the overall task. Subdivide it across agents.
- Each agent gets a time slice proportional to its expected work
- If a specialist exceeds its slice, the coordinator can cancel it and use a fallback or return a partial result
- Always prefer a partial result over a timeout with nothing — the coordinator can decide whether partial is good enough
Compensating Actions (Saga Pattern)
When a multi-agent workflow partially fails, you need a plan for undoing completed work. This is the saga pattern from distributed systems, adapted for agents.
The problem: Agent A deploys a service. Agent B updates the config. Agent C sends a notification. B fails. Now you have a deployed service with no config update and no notification. The system is in an inconsistent state.
The solution: for each agent action that has side effects, define a compensating action — the undo. If the workflow fails partway through, execute compensations in reverse order.
| Agent Action | Compensating Action |
|---|---|
| Deploy service | Roll back deployment |
| Update config | Restore previous config |
| Send notification | Send correction/retraction |
| Create resource | Delete resource |
| Grant access | Revoke access |
Design rules:
- The coordinator must track which agents completed successfully, so it knows which compensations to run
- Compensating actions must be idempotent — compensating an already-compensated action should be safe
- Not all actions have compensations. "Send email" cannot be unsent. For irreversible actions, use human-in-the-loop approval BEFORE execution, not compensation after failure
- Compensations can fail too. Log compensation failures prominently — they leave the system in an inconsistent state that requires manual intervention
- Keep workflows short. The more steps in a saga, the more likely a mid-workflow failure and the more complex the compensation chain. If a workflow has more than 4-5 compensatable steps, reconsider the design
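A minimal sketch of a saga runner under these rules, with print statements standing in for real actions and compensations:

```python
def run_saga(steps: list[tuple]) -> None:
    completed = []
    try:
        for name, action, compensate in steps:
            action()
            completed.append((name, compensate))  # only completed steps get compensated
    except Exception:
        # Undo in reverse order; compensations must be idempotent.
        for name, compensate in reversed(completed):
            try:
                compensate()
            except Exception as e:
                # A failed compensation leaves inconsistent state: log loudly.
                print(f"COMPENSATION FAILED for {name}: {e}")
        raise

run_saga([
    ("deploy", lambda: print("deploy service"), lambda: print("roll back deployment")),
    ("config", lambda: print("update config"), lambda: print("restore previous config")),
])
```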
Anti-Patterns
Chatty Agents
Too many round-trips between agents when one agent could have done the work. If agents are constantly asking each other for clarification, they either need better handoff objects (the sender is not providing enough context) or they should be merged into a single agent.
Symptom: agent A delegates to agent B, which asks A a question, which A answers, which B uses to ask another question. This is a conversation, not a delegation.
The Telephone Game
Context degrades through long chains. Agent C acts on a distorted version of what Agent A intended because B's summary was lossy. The longer the chain, the worse the distortion.
Fix: keep chains to 2 hops max. If the final agent needs original context, let it access the source directly rather than relying on intermediaries.
Over-Delegation
A coordinator that does nothing itself — just farms everything out to specialists and stitches results together. If the coordinator adds no judgment, no routing logic, and no synthesis beyond concatenation, it should not exist.
The coordinator should own: routing decisions, context management, quality assessment of specialist outputs, and final synthesis. These are real jobs.
Premature Choreography
Building an event-driven agent mesh when a simple coordinator would do. Choreography is powerful but hard to debug, hard to reason about, and hard to test. You earn choreography with proven complexity, not because it sounds architecturally elegant.
Start with orchestration. Move to choreography only when the coordinator becomes a genuine bottleneck or when agents truly need to operate independently.
Related Skills
- For deciding which agents to create and how to scope them, see agent-decomposition
- For designing the tool interfaces agents use, see tool-design
- For tracing and debugging multi-agent flows, see agent-observability
- For managing state across agent boundaries, see agent-state
Topology Patterns — Visual Reference
Common agent topologies with trade-offs. Pick the simplest topology that handles your workflow. You can always add complexity later; removing it is much harder.
Hub-and-Spoke (Coordinator Pattern)
            ┌──────────────┐
            │ Coordinator  │
            └──────┬───────┘
      ┌────────────┼────────────┐
      ▼            ▼            ▼
┌──────────┐  ┌──────────┐  ┌──────────┐
│ Research │  │   Code   │  │  Review  │
│  Agent   │  │  Agent   │  │  Agent   │
└──────────┘  └──────────┘  └──────────┘
The coordinator delegates tasks to specialists and collects results. Specialists never talk to each other directly — all communication goes through the coordinator.
When to use: Most multi-agent systems should start here. Works well when the coordinator can understand enough to route effectively and when specialists produce independent outputs.
Watch out for: Coordinator context bloat. As the number of specialists grows, the coordinator must understand more and its prompts get larger. If you grow past 5-6 specialists, consider the hierarchical topology.
Pipeline (Sequential Chain)
┌──────────┐     ┌──────────┐     ┌──────────┐
│ Ingestion│────▶│ Analysis │────▶│ Synthesis│
│  Agent   │     │  Agent   │     │  Agent   │
└──────────┘     └──────────┘     └──────────┘
Each agent processes the output of the previous agent. Work flows in one direction. No branching, no feedback loops.
When to use: ETL-style workflows, document processing pipelines, or any task with clear sequential stages where each stage's output is the next stage's input.
Watch out for: Context degradation at each hop. Use structured handoff objects between stages. Keep the pipeline to 3 stages max — beyond that, context loss compounds severely.
Map-Reduce (Fan-Out / Fan-In)
            ┌──────────────┐
            │ Coordinator  │
            └──────┬───────┘
      ┌────────────┼────────────┐
      ▼            ▼            ▼
┌──────────┐  ┌──────────┐  ┌──────────┐
│ Worker A │  │ Worker B │  │ Worker C │
└─────┬────┘  └─────┬────┘  └─────┬────┘
      └─────────────┼─────────────┘
                    ▼
            ┌──────────────┐
            │  Aggregator  │
            └──────────────┘
The coordinator splits work into independent chunks and fans out to parallel workers. An aggregator (often the coordinator itself) collects and combines results.
When to use: When work is naturally parallelizable — analyzing multiple files, researching multiple topics, reviewing multiple sections. Each worker handles one chunk independently.
Watch out for: The aggregation step is harder than it looks. Combining partial results requires judgment, not just concatenation. Budget tokens for aggregation — it often costs as much as a single worker.
Hierarchical (Two-Level Coordination)
              ┌─────────────────┐
              │ Top Coordinator │
              └────────┬────────┘
      ┌────────────────┼────────────────┐
      ▼                ▼                ▼
┌────────────┐   ┌────────────┐   ┌────────────┐
│  Frontend  │   │  Backend   │   │     QA     │
│    Lead    │   │    Lead    │   │    Lead    │
└─────┬──────┘   └─────┬──────┘   └─────┬──────┘
  ┌───┼───┐        ┌───┼───┐        ┌───┼───┐
  ▼   ▼   ▼        ▼   ▼   ▼        ▼   ▼   ▼
 CSS  JS A11y     API  DB Auth    Unit Int E2E
A top-level coordinator delegates to sub-coordinators, each of which manages its own specialists. Two levels of delegation.
When to use: Large systems where a single coordinator cannot understand all specialist domains. Each sub-coordinator is an expert in its domain and knows how to route within it.
Watch out for: Two hops of delegation means two hops of context loss. The top coordinator's instructions to a sub-coordinator must be precise enough that the sub-coordinator makes the right routing decisions. This topology is expensive — use it only when a flat hub-and-spoke genuinely cannot handle the complexity.
Peer Mesh (Decentralized)
┌──────────┐◄────────►┌──────────┐
│ Agent A  │          │ Agent B  │
└─────┬────┘          └────┬─────┘
      │ ◄────────────────► │
      │    ┌──────────┐    │
      └───►│ Agent C  │◄───┘
           └──────────┘
Every agent can communicate directly with every other agent. No coordinator. Agents negotiate, request help, and share results peer-to-peer.
When to use: Almost never for LLM-based agents. This topology is common in traditional distributed systems but creates severe problems with LLM agents: unbounded token spend from cross-talk, no single point of observability, and emergent behavior that is nearly impossible to debug.
Watch out for: Everything. If you think you need a peer mesh, you probably need a hub-and-spoke with better routing logic. The only legitimate use case is when agents are truly autonomous entities with their own goals (multi-player simulations, adversarial setups).
Comparison Table
| Topology | Best For | Coordination Cost | Debuggability | Resilience |
|---|---|---|---|---|
| Hub-and-Spoke | Most workflows, clear routing | Low-Medium | High | Low |
| Pipeline | Sequential processing stages | Low | High | Low |
| Map-Reduce | Parallelizable independent work | Medium | Medium | Medium |
| Hierarchical | Large systems, domain-specific routing | High | Medium | Medium |
| Peer Mesh | Avoid for LLM agents | Very High | Very Low | High |
Reading the table:
- Coordination Cost: token and latency overhead from the topology itself (not the work)
- Debuggability: how easy it is to trace a request through the system and understand what happened
- Resilience: how well the system handles a single agent failing
The default recommendation is hub-and-spoke. Graduate to map-reduce when you have parallelizable work. Graduate to hierarchical only when you have proven that a flat coordinator cannot handle the routing complexity. Avoid peer mesh for LLM agents.
---
name: agent-decomposition
description: Use when the user asks about splitting a system into agents, agent boundaries, how many agents to use, agent responsibility assignment, capability allocation, or agent topology design. Also triggers on "should this be one agent or multiple", "agent architecture", or "agent roles".
version: 1.0.0
---
Agent Decomposition — Boundaries, Roles, and Topologies
This is the most consequential architectural decision you will make in an agentic system. Every other choice — communication patterns, state management, tool design — flows downstream from how you draw agent boundaries. Get it wrong in one direction and you have a god-agent drowning in a 200k-token context window, unable to focus. Get it wrong in the other direction and you have six agents burning tokens on coordination overhead, passing messages back and forth to accomplish what one agent could have done in a single turn. The default posture is restraint: start with one agent and split only when you have evidence.
The Decomposition Decision
The default is one agent with tools. A single agent with a focused system prompt and a well-chosen tool set handles the vast majority of tasks. Multi-agent is not an upgrade — it is a trade-off. You are exchanging simplicity and shared context for isolation and specialization.
Only decompose when at least one of these concrete pressures exists:
| Pressure | Signal | Example |
|---|---|---|
| Context window saturation | Agent performance degrades as conversation grows; it forgets earlier instructions or loses track of state | A coding agent working across a 500-file monorepo that needs domain docs, API specs, and test fixtures simultaneously |
| Role specialization | The system prompt tries to be two things at once and does both poorly | "You are an expert code reviewer AND a creative copywriter" — these require fundamentally different personalities |
| Trust boundaries | Different tasks need different permission levels | One task needs filesystem write access; another should only read from a web API |
| Model cost differentiation | Some subtasks are simple extraction; others need deep reasoning | Use Opus for architectural decisions, Haiku for parsing log files |
| Independent scaling | One capability is called 100x more than others | A data-extraction pipeline that fans out to dozens of parallel workers |
Priority order when multiple pressures exist: context window saturation is the strongest signal — if an agent is hitting context limits, split immediately. Role specialization is next — contradictory system prompts degrade everything. Trust boundaries come third. Cost differentiation and independent scaling are weaker signals that rarely justify splitting on their own.
Decision checklist before splitting:
- Have you actually hit context limits, or are you anticipating them?
- Would a better system prompt or tool design solve the problem without splitting?
- Can you quantify the coordination cost of the split?
- Will each resulting agent have enough context to do its job independently?
- Is there a simpler solution — like clearing context mid-conversation — that avoids multi-agent entirely?
If you answered "no" to question 1 or "yes" to question 2, stop. You do not need multiple agents yet.
Agent Responsibility Patterns
These are the recurring roles that emerge in well-designed multi-agent systems. Not every system needs all of them — most need two or three.
Coordinator
Routes work, does not do it. Holds the plan and delegates to specialists. The coordinator's system prompt is about task decomposition and routing logic, not domain expertise. It should be thin by design — if your coordinator's system prompt is longer than any specialist's, something is wrong.
A coordinator decides: "This looks like a database migration task, sending to the data-specialist" or "The user wants a code review followed by documentation updates — I will sequence specialist-code-review then specialist-docs."
Specialist
Deep domain expertise, narrow tool set. Does one thing well. A specialist agent might be "the database agent" with access to query tools, schema introspection, and migration utilities — and nothing else. Its system prompt is dense with domain knowledge because it does not waste context on capabilities it will never use.
Transformer
Reshapes data between systems or formats. No domain logic, pure translation. When agent A produces output in format X and agent B needs format Y, a transformer sits between them. This is often a function rather than an agent — only promote it to an agent when the transformation requires LLM reasoning (e.g., summarizing a 50-page document into a structured brief).
Validator
Checks output quality and enforces constraints. A second pair of eyes. Validators are especially valuable when the cost of errors is high — generating SQL that will run against production, producing customer-facing content, or making irreversible API calls. The validator does not produce; it critiques.
Aggregator
Collects results from multiple agents and synthesizes a unified response. In a map-reduce topology, the aggregator is the reduce step. It resolves conflicts between specialist outputs, merges partial results, and presents a coherent answer to the user.
Anti-pattern: The God Agent
One agent with 30 tools, a 4000-word system prompt, and instructions that try to cover every possible scenario. You will recognize it by its symptoms: inconsistent behavior depending on which part of the system prompt the model attends to, tools that are never called, and performance that degrades as you add more capabilities. If you have a god agent, the fix is not "add more instructions" — it is decomposition.
But note: a single agent with many capabilities is often the RIGHT starting point. The god agent is an anti-pattern only when you have evidence of the symptoms above. A capable agent with 15 tools and a well-structured prompt that performs well is not a god agent — it is a well-designed single agent. Do not split preemptively to avoid a label.
Anti-pattern: Atomic Agent Syndrome
The opposite extreme — one agent per tool, each maximally specialized. A file_reader_agent, a grep_agent, a file_writer_agent. It sounds clean in theory. In practice, coordination overhead dominates: the coordinator spends more tokens routing between 12 micro-agents than a single agent would spend doing the work directly. Every delegation is a context hop, and every hop loses information. If an agent has only one tool and no domain knowledge in its system prompt, it should not be an agent — it should just be a tool.
Capability Allocation
Which tools belong to which agent. The core principle: an agent should only have tools it has the context to use well.
Group tools by domain. All database tools go with the data agent. All API integration tools go with the integration agent. All filesystem tools go with the coding agent. When a tool does not clearly belong to one domain, that is a signal your domain boundaries need refinement.
Remove unused tools. If an agent has access to a tool it never uses in practice, remove it. Every tool description consumes context tokens and adds cognitive load to the model's tool-selection reasoning. Audit tool usage periodically.
Tool count as a code smell. If an agent has more than 10 tools, question whether it is really one agent or two agents crammed together. The sweet spot for most specialists is 3-7 tools.
Read vs. write asymmetry. Read-only tools (search, query, inspect) can be shared more freely across agents because they cannot cause damage. Write tools (create, update, delete, execute) should be allocated carefully and usually belong to exactly one agent. If two agents both need to write to the same resource, you either have a boundary problem or you need a mediator.
Topology Patterns
Hub-and-Spoke (Coordinator to Specialists)
The most common and most recommended starting topology.
         ┌───────────┐
         │Coordinator│
         └─────┬─────┘
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│  Code  │ │  Data  │ │  Docs  │
│ Agent  │ │ Agent  │ │ Agent  │
└────────┘ └────────┘ └────────┘
One coordinator fans out to specialists. Simple to reason about, single point of coordination, easy to debug because all routing decisions are visible in one place. Start here.
Pipeline (Sequential Handoff)
┌──────────┐    ┌──────────┐    ┌──────────┐
│ Extract  │───▶│Transform │───▶│ Validate │
└──────────┘    └──────────┘    └──────────┘
Each stage transforms or enriches the output of the previous stage. Good for ETL-like workflows, content pipelines (draft → review → polish), or any process with clear sequential phases. The key constraint: each agent must be able to do its job with only the output of the previous stage — no reaching back two steps.
Map-Reduce (Fan-out / Fan-in)
           ┌───────────┐
           │Coordinator│
           └─────┬─────┘
     ┌─────┬─────┼─────┬─────┐
     ▼     ▼     ▼     ▼     ▼
   ┌───┐ ┌───┐ ┌───┐ ┌───┐ ┌───┐
   │ W1│ │ W2│ │ W3│ │ W4│ │ W5│
   └─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘
     └─────┴─────┼─────┴─────┘
           ┌─────▼─────┐
           │ Aggregator│
           └───────────┘
Coordinator splits work into N parallel tasks, workers execute independently, aggregator combines results. Good for parallelizable work like searching multiple sources, processing batches, or evaluating multiple options. The workers must be truly independent — if worker 3 needs results from worker 1, this is not a map-reduce problem.
Peer Network
Agents communicate directly without a central coordinator. Each agent decides when to invoke another. This is harder to debug and reason about, but avoids the coordinator bottleneck. Use only when agents are truly autonomous and the interaction patterns are unpredictable. In practice, most systems that think they need peer networks actually work better with hub-and-spoke.
The Two-Level Sweet Spot
In practice, most systems work best with at most two levels of hierarchy: a coordinator and its specialists. Deeper hierarchies — a coordinator that delegates to sub-coordinators that delegate to specialists — add latency, lose context at each handoff, and make debugging painful. If you find yourself building a three-level hierarchy, reconsider your decomposition. You may be over-splitting, or you may need to restructure as two independent two-level systems rather than one deep tree.
Boundary Heuristics
Concrete rules for drawing the line.
Split when:
- Context window would exceed roughly 60% capacity in normal operation, leaving room for the conversation itself to grow
- Tasks require different model tiers and the cost difference is material at your scale
- Trust requirements genuinely differ — one task needs filesystem access, another should be sandboxed to web search only
- Failure of one task should not poison another's context — a failed code generation attempt filling context with error traces should not degrade the research agent's performance
- Domain expertise is genuinely disjoint — a coding agent and a market-research agent share almost no system prompt content
Do not split when:
- The agents would just pass data through without adding value — if agent B needs everything agent A knows, they should be one agent
- Coordination overhead (routing logic, message formatting, context summarization) exceeds the context savings from splitting
- The "specialist" would need the coordinator's full context to function — this is a sign the boundary is in the wrong place
- You are splitting for organizational reasons ("the database team wants their own agent") rather than technical ones
- The task is simple enough that tool-use within a single agent handles it cleanly
Evolutionary Decomposition
Do not design a multi-agent system on a whiteboard. Grow it from a working single-agent system.
The decomposition journey:
1. Single agent with all tools. This is your starting point. It works until context limits hit or capability conflicts emerge. Do not skip this step — you need the empirical evidence of where the single agent struggles.
2. Extract the first specialist. Look for the capability that is most context-heavy or most frequently called. Extract it into its own agent with its own focused system prompt and tool set. The original agent becomes a coordinator by default.
3. Add a thin coordinator. Once you have 3+ specialists and routing logic starts cluttering the original agent's prompt, extract the routing into a dedicated coordinator with no domain tools — only the ability to delegate.
4. Stop. Resist the urge to keep splitting. Every new agent adds coordination cost, increases latency, and creates another failure mode. Add agents only when you have measured evidence that the current system cannot handle the load, context, or capability requirements.
Worked Example: Code Review System
Start: single agent with tools: read_file, grep, glob, web_search, create_comment. System prompt: "You are a code reviewer. Read the diff, research relevant best practices, check for security issues, and post review comments."
Problem 1: context fills up. The agent reads the diff (2K tokens), searches for best practices (8K tokens of search results), reads 5 related files (15K tokens), and has barely any room left for reasoning. Signal: context saturation.
Split 1: extract a research agent. The research agent gets web_search and read_file. The original agent keeps grep, glob, create_comment and becomes the reviewer. Coordinator delegates: "research best practices for X pattern" → gets a 500-token summary back instead of 8K of raw results.
Problem 2: security review needs different expertise and tools. The reviewer's system prompt is trying to be both a style reviewer and a security auditor. It catches style issues well but misses vulnerabilities. Signal: role specialization.
Split 2: extract a security specialist with its own system prompt focused on OWASP patterns, plus access to a vulnerability_db tool the style reviewer doesn't need.
Result: 3 agents (coordinator, research, security) + the original reviewer, now focused on style and correctness. Each agent's context budget is comfortable and its system prompt is focused. The coordinator is thin — it reads the diff, routes to the relevant specialists, and synthesizes their findings.
What we did NOT split: the coordinator still handles final comment creation. It could delegate this to a "comment writer" agent, but that would add a hop without adding value — the coordinator already has the synthesized findings and can write the comment directly.
The most common mistake is jumping to step 3 or 4 on day one. Premature decomposition is harder to recover from than a monolithic agent — at least the monolith works, even if slowly. An over-decomposed system might not work at all.
Related Skills
- For how agents communicate once decomposed, see
agent-communication - For designing the tools agents use, see
tool-design - For managing state across agents, see
agent-state - For tracing and debugging multi-agent decompositions, see
agent-observability
---
name: agent-observability
description: Use when the user asks about tracing agent decisions, debugging multi-agent flows, monitoring tool usage, error handling in agent systems, resilience patterns for agents, circuit breakers, retry strategies, cost tracking, or human-in-the-loop observability.
version: 1.0.0
---
Agent Observability — Tracing, Resilience, and Cost
In traditional systems, you debug with stack traces and logs. In agent systems, there are no stack traces — decisions emerge from reasoning over context. Observability means capturing not just what happened, but why the agent chose it. Error handling means designing for nondeterminism: the same input can produce different outputs, tools fail in new ways, and agents can confidently produce wrong answers. This skill combines observability and error handling because in agent systems, you debug through traces and build resilience through monitoring.
Decision Tracing
The most important thing to trace is not WHAT the agent did, but WHY.
What to capture at each decision point:
- What the agent saw (relevant context at decision time)
- What it considered (which tools or options were evaluated)
- What it chose (the action taken)
- What happened (the result)
Trace structure:
- Each agent invocation = a span
- Each tool call within an agent = a child span
- Delegation to another agent = a linked span with the same correlation ID
- The full trace tells the story: "coordinator decided to delegate research, research agent searched 3 sources, found 2 relevant results, returned a summary, coordinator used the summary to generate the final response"
Correlation IDs — every multi-agent task gets a single trace ID. Pass it through every delegation. Without this, you cannot reconstruct what happened across agents. This is non-negotiable. If you do nothing else for observability, do this.
Trace storage — traces are only useful if you can query them later. Store them in a structured format (JSON lines, a database, or an observability platform). At minimum, you need to answer: "show me everything that happened for task X" and "show me all tasks where agent Y failed in the last hour."
Decision logging — beyond structured traces, log the agent's reasoning in a parseable format. If the agent explains its choice before acting, capture that explanation as metadata on the span. This is the difference between "agent called grep" and "agent called grep because the user asked about error handling and the agent decided to search for try/catch patterns first."
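As a rough illustration, span emission with a shared trace ID, written as JSON lines so traces can be grepped and queried later. Field names follow the span-schema idea in this skill but are assumptions, not a fixed format:

```python
import json, time, uuid

TRACE_ID = f"task-{uuid.uuid4().hex[:8]}"

def emit_span(agent: str, span_type: str, decision: str, parent: str | None = None) -> str:
    span_id = uuid.uuid4().hex[:8]
    record = {
        "trace_id": TRACE_ID,        # the same ID across every delegation
        "span_id": span_id,
        "parent_span_id": parent,
        "agent": agent,
        "type": span_type,
        "decision": decision,        # the WHY, captured as span metadata
        "ts": time.time(),
    }
    print(json.dumps(record))        # stand-in for a real trace sink
    return span_id

root = emit_span("coordinator", "agent_invocation", "delegating research")
emit_span("research-agent", "tool_call", "searching for JWT refresh docs", parent=root)
```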
Multi-Agent Flow Visualization
When something goes wrong in a 4-agent flow, you need to see the whole picture.
Span-based tracing (borrowed from distributed systems):
[Coordinator]──────────────────────────────────
  ├─[Research Agent]────────────
  │    ├─ tool: web_search ──
  │    └─ tool: web_fetch ────
  ├─[Code Agent]──────────────────
  │    ├─ tool: read_file ──
  │    ├─ tool: grep ───
  │    └─ tool: edit_file ─────
  └─[Review Agent]──────
       └─ tool: read_file ──
Each span records: start time, end time, token count, tool calls, success/failure, and any context passed in. The visualization does not need to be fancy — even a structured log that you can grep through is better than nothing.
What to look for in traces:
- Long spans — an agent stuck in a reasoning loop
- Deep nesting — too many delegation levels, context is degrading at each hop
- Repeated tool calls — agent retrying the same thing expecting different results
- Context bloat — context size growing across spans as too much is passed along
- Silent failures — an agent returned a result but skipped part of the task
Health Signals
Borrowed from container orchestration: before routing work to an agent, know whether it can handle it.
Liveness — is the agent responding at all? In agent systems, this means: can the model be reached, do the tools work, does the system prompt load without errors? A non-live agent should not receive work.
Readiness — can the agent accept new work right now? An agent might be live but not ready: its context window is near capacity from a previous task, a critical tool is in a circuit-breaker open state, or it's in the middle of a long operation. A non-ready agent should be skipped in favor of another instance or a fallback.
Resource signals to monitor:
- Context utilization — how full is the agent's working window? Above 70% and it's constrained.
- Error rate trending — a rising error rate means something is degrading, even if individual errors are handled.
- Latency trending — increasing response times signal a problem before failures appear.
- Tool availability — if a critical tool is down, the agent is effectively degraded even if it's "live."
These signals feed into routing decisions. A coordinator that blindly delegates to a specialist without checking health will sometimes route work into a black hole.
Bulkhead Pattern
Isolate failures so one misbehaving workflow doesn't take down the whole system. Named after ship bulkheads that prevent one flooded compartment from sinking the ship.
Token budget isolation — each task type gets its own token pool. A research task running wild and consuming 10x its expected tokens should exhaust the research budget, not the budget for code review or deployment tasks.
Model instance isolation — if possible, route different task types to different model instances. A stuck agent consuming rate limits on one instance doesn't affect agents on another.
Tool concurrency isolation — cap how many concurrent calls each agent type can make to shared tools. If the research agent is hammering the web search API, it should hit its own concurrency limit before affecting the code agent's ability to use the same API.
The principle: a failure in one part of the system should degrade that part, not cascade. Without bulkheads, one runaway agent can exhaust shared resources (tokens, API rate limits, model capacity) and starve every other agent in the system.
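A minimal sketch of token-budget bulkheads (pool sizes and task types are illustrative):

```python
# Each task type draws from its own pool, so a runaway research task
# exhausts only its own bulkhead and cannot starve code review.
POOLS = {"research": 50_000, "code_review": 30_000, "deployment": 10_000}

def draw(task_type: str, tokens: int) -> None:
    if POOLS[task_type] < tokens:
        raise RuntimeError(f"{task_type} bulkhead exhausted")  # fails alone
    POOLS[task_type] -= tokens

draw("research", 45_000)
draw("code_review", 5_000)  # unaffected by research's heavy spend
```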
Tool Usage Monitoring
Tools are the observable actions of an agent. Monitor them.
Metrics to track:
- Call frequency per tool, per agent — which tools are hot?
- Success/failure rate — a tool with >10% failure rate needs attention
- Latency distribution — slow tools bottleneck the whole flow
- Token consumption — tool descriptions in context + tool output tokens
- Misuse rate — agent calling a tool that returns errors or empty results repeatedly
Signals that something is wrong:
- An agent calls the same tool 3+ times with similar inputs — it is not getting what it needs. The tool interface is wrong, the tool output is unhelpful, or the agent's prompt does not teach it how to use the tool effectively.
- An agent never calls a tool it has — remove it. It is burning context space for nothing.
- Tool output tokens dominate the context window — the tool is returning too much. Add filtering, pagination, or summarization to the tool output.
- Tool calls cluster at the start then stop — the agent front-loads tool use and then reasons from stale information. Consider whether it should re-check state before concluding.
Error Handling Patterns
Agent errors are different from service errors. Services fail with exceptions. Agents fail by producing wrong outputs, making poor decisions, or getting stuck in loops. Design for these failure modes explicitly.
Retry with Backoff
For tool failures (network errors, rate limits, transient issues). Standard pattern: retry 2-3 times with exponential backoff. The agent should understand this is automatic, not a decision point. Do not surface transient tool failures to the agent's reasoning loop — handle them in the tool layer.
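A rough sketch of that tool-layer retry, assuming transient failures surface as exceptions (retry counts and delays are illustrative):

```python
import time

def call_with_retry(tool, *args, retries: int = 3, base_delay: float = 1.0):
    # Absorb transient failures here; never surface them to the agent's
    # reasoning loop.
    for attempt in range(retries):
        try:
            return tool(*args)
        except (ConnectionError, TimeoutError):
            if attempt == retries - 1:
                raise                              # terminal failure: escalate
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```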
Fallback Agents
If a specialist fails, route to a more capable (but more expensive) agent. Or to a generalist that can attempt the task with less precision.
- Code review agent fails → fall back to general-purpose agent with code review instructions
- Specialist with Haiku fails → retry with Sonnet
- The key design constraint: the fallback agent must be able to pick up from where the failed agent left off. This means the failed agent's partial work must be accessible — store intermediate results, not just final output.
Graceful Degradation
Return partial results rather than nothing. If 3 of 4 research queries succeed, report those 3 and note the failure rather than failing the entire task. The coordinator should be designed to synthesize incomplete inputs. This requires the response schema to support partial results — a list of findings with a status field per item, not a single monolithic answer.
Circuit Breakers
If a tool or agent is consistently failing, stop calling it temporarily. Especially important for external tools (APIs, databases). Pattern:
- Closed — normal operation, requests flow through
- Open — failures exceeded threshold, all requests fail fast without attempting the call
- Half-open — after cooldown, allow one request through. If it succeeds, close the circuit. If it fails, reopen.
Track failure counts per tool. A circuit breaker on a tool that fails 5 times in 60 seconds saves you from burning tokens on retries that will not succeed.
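A minimal sketch of a per-tool circuit breaker with the three states above (thresholds and timings are illustrative):

```python
import time

class CircuitBreaker:
    def __init__(self, threshold: int = 5, window: float = 60.0, cooldown: float = 30.0):
        self.failures: list[float] = []
        self.threshold, self.window, self.cooldown = threshold, window, cooldown
        self.opened_at: float | None = None

    def call(self, tool, *args):
        now = time.time()
        if self.opened_at is not None:
            if now - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")  # no tokens burned
            self.opened_at = None                                 # half-open: one probe
        try:
            result = tool(*args)
            self.failures.clear()                                 # success closes the circuit
            return result
        except Exception:
            # Count failures within the window; reopen past the threshold.
            self.failures = [t for t in self.failures if now - t < self.window] + [now]
            if len(self.failures) >= self.threshold:
                self.opened_at = now
            raise
```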
Hallucination Detection
The hardest failure mode. The agent does not know it is wrong, and it will express high confidence in incorrect answers.
Structural checks (cheapest):
- Schema validation — if the output doesn't match the expected structure, it's suspect. Catches a surprising number of hallucinations where the agent fabricates fields or invents formats.
- Reference verification — if the agent cites a file, read the file. If it quotes a function signature, grep for it. If it claims a URL exists, fetch it. Hallucinated references are common and cheap to catch.
- Constraint checking — verify outputs against known invariants. If the agent says "this function has no side effects" but the function writes to a database, the claim is hallucinated.
Cross-validation (moderate cost):
- Run the same task through two agents independently and compare. Disagreement doesn't prove either is wrong, but agreement increases confidence. This is expensive (2x token spend) so reserve it for high-stakes outputs.
- Have a validator agent check the primary agent's output against the source material. The validator doesn't redo the work — it spot-checks claims.
Confidence calibration (least reliable):
- Ask the agent to rate its confidence. Useful as one signal among many, but agents are poorly calibrated — they express high confidence even when wrong. Never use confidence alone as a quality gate. Use it to prioritize which outputs get deeper checks.
When to invest in hallucination detection: when the cost of a wrong answer exceeds the cost of checking. Customer-facing content, code that will be deployed, financial calculations, security assessments — these warrant cross-validation. Internal research notes or brainstorming — probably not.
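A rough sketch of the cheapest structural check, reference verification for file citations (the claim format here is an assumption):

```python
from pathlib import Path

def verify_file_references(claims: list[dict]) -> list[dict]:
    # If the agent cites a file, check the file actually exists.
    suspect = []
    for claim in claims:
        path = claim.get("file")
        if path and not Path(path).is_file():
            suspect.append(claim)  # hallucinated reference: flag for review
    return suspect

findings = [{"file": "src/auth/jwt.rs", "claim": "no token rotation"}]
print(verify_file_references(findings))  # non-empty list => suspect output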
Loop Detection
Agents can get stuck: calling the same tool repeatedly, going back and forth between two options, or generating increasingly long responses without progress. Detect by monitoring:
- Repeated tool calls with identical or near-identical inputs (3+ times is a strong signal)
- Response length growing without new information being added
- Turn count exceeding expected range for the task complexity
- The agent apologizing or restating the problem — this is a reliable signal it is stuck
When a loop is detected, intervene: inject a prompt that breaks the pattern, escalate to a human, or terminate with a partial result. Do not let it burn tokens indefinitely.
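A minimal sketch of the first signal, counting repeated tool calls with identical inputs (the threshold is illustrative):

```python
from collections import Counter

class LoopDetector:
    def __init__(self, threshold: int = 3) -> None:
        self.counts: Counter = Counter()
        self.threshold = threshold

    def record(self, tool: str, args: dict) -> bool:
        # Key each call by tool name plus its arguments.
        key = (tool, tuple(sorted(args.items())))
        self.counts[key] += 1
        return self.counts[key] >= self.threshold  # True => intervene

detector = LoopDetector()
for q in ["handleAuth", "handleAuth", "handleAuth"]:
    stuck = detector.record("grep", {"pattern": q})
print(stuck)  # True: break the pattern, escalate, or return a partial result
```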
Resilience Patterns
Timeout Budgets
Set a total wall-clock and token budget for a multi-agent task. Subdivide:
- Coordinator gets 20% for routing and synthesis
- Each specialist gets a proportional share of the remaining 80%
- If a specialist exceeds its budget, the coordinator must proceed without its result
Budget enforcement must be external to the agent. Agents are not good at tracking their own resource consumption. The orchestration layer should enforce hard limits. When a budget is exceeded, the coordinator should receive a structured signal — not just a timeout — so it can make an informed decision about how to proceed with reduced information.
Dead-Letter Handling
When an agent fails terminally, what happens to its work?
- Log the partial result and the failure reason with full trace context
- Route the task to a fallback agent or back to the coordinator with the failure context attached
- Never silently drop a task — this creates invisible gaps in output that are extremely hard to debug
The dead-letter queue is your audit trail. Every failed task should be queryable: what was attempted, why it failed, what was recovered.
Idempotent Recovery
If the system crashes mid-task, you need to resume from a checkpoint.
- Design agent tasks to be restartable: the coordinator can re-delegate to a specialist without causing duplicate side effects
- Persist checkpoints to external state (see agent-state) for long-running multi-agent workflows
- Use the artifact store as the source of truth — completed work is stored, incomplete work is re-attempted
- Side-effecting tools (write file, send email, deploy) need idempotency keys or pre-checks to avoid double execution
Human-in-the-Loop as Observability
Approval checkpoints serve two purposes: trust gate AND observability window. This is your most powerful debugging tool.
When to surface decisions to humans:
- Before irreversible actions (deploy, send, delete, publish)
- When the agent's confidence is below a threshold
- For novel situations the agent has not encountered before
- When the cost of proceeding exceeds a threshold
- When two agents disagree on the correct approach
What to show the human:
- The decision the agent wants to make
- Why — the reasoning and evidence that led to this choice
- What alternatives were considered and why they were rejected
- What will happen if they approve or reject
- The current cost and time spent on this task so far
A human reviewing an agent's decision can catch errors that no automated check will find. But do not over-use this — too many approval gates and the system is no longer autonomous, it is a chatbot with extra steps. Reserve human checkpoints for high-stakes, irreversible, or low-confidence decisions.
Cost Observability
Token spend is the cloud bill of agent systems. Track it or be surprised by it.
Track by dimension:
- Per agent — which agents are expensive?
- Per task type — which workflows cost the most?
- Per tool — tool outputs that consume lots of tokens are expensive inputs
- Per model tier — if using different models for different agents, track each tier separately
Cost optimization signals:
- A specialist using Opus for a task that Haiku could handle — right-size the model
- Tool outputs that are mostly discarded (the agent only reads 10% of what the tool returns) — add filtering to the tool
- Coordinator spending more tokens than specialists — the routing is more expensive than the work, simplify the coordinator
- Retry loops consuming budget without making progress — fix the root cause instead of retrying
Budget alerts:
- Set cost limits per task type based on historical averages
- Alert when a task exceeds 2x its typical cost
- Hard-stop when a task hits a maximum budget — this prevents runaway loops from draining your account
- Track cost trends over time — increasing costs for the same task type means something is degrading
- Log the model used for each agent invocation — model version changes can silently change cost profiles
Cost attribution in multi-agent flows:
Assign costs to the originating task, not just the agent that spent the tokens. A research agent's cost belongs to the user-facing task that triggered it. Without this attribution, you optimize individual agents but miss that certain task types are disproportionately expensive end-to-end.
Anti-Patterns
Observability Tax
Tracing every decision, logging every tool call in full, capturing complete context at every span. The observability system consumes 20-30% of the token budget, the traces are too verbose to read, and nobody looks at them because there's too much data.
Right-size your observability:
- Trace all agent invocations and tool calls (cheap — just names, timestamps, status)
- Log full context and reasoning only on errors or anomalies (expensive — do it selectively)
- Sample detailed traces in production (1 in 10 or 1 in 100) rather than tracing everything
- Set retention policies — detailed traces older than a week are rarely useful
The goal is enough observability to diagnose problems, not a complete recording of everything. If your observability costs more than 5% of your total token spend, you're over-observing.
Related Skills
- For designing agents that are observable by default → see
agent-decomposition - For communication patterns that support tracing → see
agent-communication - For state management and checkpointing → see
agent-state - For building tools that produce observable, well-structured output → see
tool-design
Tracing Patterns — Formats and Debugging Walkthroughs
Trace Format
A trace captures the full lifecycle of a multi-agent task. Each entry is a span.
Span Schema
span:
  trace_id: "task-2024-abc123"          # Shared across all agents in this task
  span_id: "coord-001"                  # Unique to this span
  parent_span_id: null                  # null for root, parent's ID for children
  agent: "coordinator"                  # Which agent produced this span
  type: "agent_invocation"              # agent_invocation | tool_call | delegation
  start_time: "2024-01-15T10:00:00Z"
  end_time: "2024-01-15T10:00:12Z"
  tokens_in: 2400
  tokens_out: 350
  status: "success"                     # success | error | timeout | cancelled
  context_size_at_start: 3200           # Tokens of context when span began
  metadata:
    decision: "Delegating to research-agent because task requires web search"
    input_summary: "User asked for competitive analysis of 3 products"
    output_summary: "Delegated research for each product to specialist"
Tool Call Span
span:
  trace_id: "task-2024-abc123"
  span_id: "research-tool-001"
  parent_span_id: "research-001"        # Child of the research agent span
  agent: "research-agent"
  type: "tool_call"
  tool_name: "web_search"
  tool_input:
    query: "product X market share 2024"
    max_results: 5
  tool_output_tokens: 1200
  status: "success"
  latency_ms: 2300
Example: Full Trace of a Multi-Agent Task
Task: "Analyze the authentication module and suggest improvements"
TRACE: task-2024-auth-review
│
├─ [coordinator] 10:00:00 - 10:00:45 (tokens: 2400→350)
│ Decision: "Auth analysis needs code reading + security expertise.
│ Delegating code exploration to code-agent, security
│ review to security-agent, then synthesizing."
│
├─ [code-agent] 10:00:02 - 10:00:18 (tokens: 1800→900)
│ │ Decision: "Need to find auth module files, read implementation,
│ │ understand the flow"
│ ├─ tool: glob("**/auth/**") 2ms → 8 files found
│ ├─ tool: read("src/auth/middleware.rs") 1ms → 120 lines
│ ├─ tool: read("src/auth/jwt.rs") 1ms → 85 lines
│ └─ tool: grep("session|token|cookie") 3ms → 14 matches
│ Output: "Auth uses JWT with refresh tokens, sessions stored
│ in Redis, no CSRF protection on token endpoint"
│
├─ [security-agent] 10:00:02 - 10:00:25 (tokens: 2200→600)
│ │ Decision: "Reviewing auth patterns against OWASP checklist"
│ ├─ tool: read("src/auth/middleware.rs") 1ms → 120 lines
│ ├─ tool: read("src/auth/jwt.rs") 1ms → 85 lines
│ └─ tool: grep("verify|validate|check") 3ms → 9 matches
│ Output: "3 findings: missing CSRF on /token, JWT secret
│ from env without rotation, no rate limit on /login"
│
└─ [coordinator] 10:00:26 - 10:00:45 (tokens: 3800→1200)
Decision: "Both agents returned successfully. Synthesizing
code understanding with security findings."
Output: Final analysis with 3 prioritized recommendations
Total: 10.2K input tokens, 3.1K output tokens, 45 seconds, 7 tool calls
Debugging Walkthrough: Agent Stuck in a Loop
Symptom: task taking 3x longer than usual, token spend climbing.
Trace reveals:
├─ [code-agent] 10:00:02 - 10:02:45 ⚠ LONG SPAN
│ ├─ tool: grep("handleAuth") → 0 results
│ ├─ tool: grep("handle_auth") → 0 results
│ ├─ tool: grep("authHandler") → 0 results
│ ├─ tool: grep("auth_handler") → 0 results
│ ├─ tool: grep("AuthHandler") → 0 results
│ ├─ tool: glob("**/auth*handler*") → 0 results
│ ├─ tool: grep("authenticate") → 3 results ← finally
│ ...
Diagnosis: the agent is searching for a function name that doesn't exist in the codebase. It's trying variations but not finding it.
Fix options:
- Better handoff — the coordinator should have included the actual function/file names
- Better tools — a "find relevant code" tool that does fuzzy matching
- Loop detection — after 4 failed searches with similar inputs, surface to coordinator or human
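A sketch of the loop-detection option, assuming recent tool calls are available as (tool_name, input, result_count) tuples; the window size and similarity threshold are illustrative:

from difflib import SequenceMatcher

def stuck_in_search_loop(calls: list[tuple[str, str, int]],
                         window: int = 4, min_similarity: float = 0.6) -> bool:
    # True when the last `window` calls all returned nothing and their
    # inputs are near-identical variations of each other.
    recent = calls[-window:]
    if len(recent) < window or any(count > 0 for _, _, count in recent):
        return False
    first_input = recent[0][1]
    return all(SequenceMatcher(None, first_input, inp).ratio() >= min_similarity
               for _, inp, _ in recent[1:])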
Debugging Walkthrough: Context Degradation
Symptom: final output is missing information that was found by a specialist.
Trace reveals:
├─ [research-agent] output: 2400 tokens
│ "Found 5 competitors. Detailed analysis of pricing,
│ features, market share for each..."
│
├─ [coordinator] receives research output
│ context_size_at_start: 3200 tokens
│ context_size_after_receiving: 5600 tokens ← research output added
│ context_size_when_delegating_to_writer: 5600 tokens
│
├─ [writer-agent] receives: 800 tokens of context ⚠ LOSSY
│ "Write a report about competitors. Key findings:
│ 5 competitors identified." ← detail lost!
Diagnosis: the coordinator summarized the research output too aggressively when creating the handoff for the writer agent.
Fix: use structured handoffs with explicit fields (competitors list, pricing table, feature matrix) so the coordinator can't accidentally drop structured data during summarization.
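A sketch of such a handoff; the field names are hypothetical, but the point is that a missing pricing table becomes a validation failure instead of a silent omission:

from dataclasses import dataclass

@dataclass
class ResearchHandoff:
    task: str
    competitors: list[str]
    pricing: dict[str, str]               # competitor -> pricing summary
    feature_matrix: dict[str, list[str]]  # competitor -> notable features
    summary: str = ""                     # free text is additive, not the carrier

    def validate(self) -> list[str]:
        problems = []
        if not self.competitors:
            problems.append("competitors list is empty")
        missing = [c for c in self.competitors if c not in self.pricing]
        if missing:
            problems.append(f"no pricing for: {', '.join(missing)}")
        return problems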
Debugging Walkthrough: Silent Tool Failure
Symptom: output is correct but incomplete. No errors in the trace.
Trace reveals:
├─ [data-agent]
│ ├─ tool: query_database("SELECT * FROM users WHERE active")
│ │ status: success
│ │ output: {"rows": [], "count": 0} ← empty, not an error
│ │
│ Decision: "No active users found. Proceeding with empty dataset."
Diagnosis: the tool returned an empty result which is technically a success. The agent treated "no data" as "no active users" when the real issue was a permissions problem — the tool's database credentials didn't have access to the users table, so it returned empty rather than an error.
Fix: tools should distinguish between "no results" and "cannot access." The error contract should include: {"rows": [], "count": 0, "accessible_tables": ["logs"], "requested_table": "users", "warning": "table not in accessible set"}.
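A sketch of a tool wrapper honoring that contract; accessible_tables and run_query are assumed stand-ins for your real permission check and query layer:

def query_table(table: str, accessible_tables: set[str], run_query) -> dict:
    # Inaccessible is not the same as empty: surface a warning the agent can act on.
    if table not in accessible_tables:
        return {"rows": [], "count": 0,
                "accessible_tables": sorted(accessible_tables),
                "requested_table": table,
                "warning": "table not in accessible set"}
    rows = run_query(table)
    return {"rows": rows, "count": len(rows)}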
Key Metrics Dashboard
What to track in a monitoring dashboard:
┌─────────────────────────────────────────────────────┐
│ AGENT SYSTEM HEALTH │
├──────────────────┬──────────────────────────────────┤
│ Active tasks │ 12 │
│ Avg completion │ 34s │
│ Error rate │ 2.1% │
│ Total token/hr │ 1.2M │
├──────────────────┴──────────────────────────────────┤
│ PER-AGENT BREAKDOWN │
│ calls err% avg_tokens avg_time │
│ coordinator 48 0.0% 1.2K 4.2s │
│ research-agent 35 5.7% 3.4K 12.1s │
│ code-agent 41 2.4% 2.1K 8.3s │
│ review-agent 22 0.0% 1.8K 6.7s │
├──────────────────────────────────────────────────────┤
│ TOOL HEALTH │
│ calls err% avg_latency tokens │
│ web_search 62 8.1% 2.3s 800 │
│ read_file 145 0.7% 12ms 450 │
│ grep 98 0.0% 8ms 200 │
│ edit_file 34 2.9% 15ms 300 │
├──────────────────────────────────────────────────────┤
│ ALERTS │
│ ⚠ web_search error rate above 5% threshold │
│ ⚠ research-agent avg tokens trending up (+15%/day) │
└──────────────────────────────────────────────────────┘
name: agent-state description: Use when the user asks about state management in agent systems, where agent state lives, prompt architecture, system prompt design, context window management, shared state between agents, agent memory, context compression, or prompt versioning. version: 1.0.0
Agent State — Context, Prompts, and Shared Memory
State management in agent systems is fundamentally different from traditional services. There is no database by default — state is spread across conversation history, system prompts, and whatever external stores you wire up. The conversation IS the agent's working memory, the system prompt IS its configuration, and the context window IS its RAM. Understanding these constraints shapes every design decision.
Most agent failures are state failures: an agent acting on stale context, a system prompt that contradicts itself, a handoff that lost critical information, or a context window that silently dropped the instructions that mattered most. Get state right and everything else gets easier.
State Locations
Three places state can live, each with different characteristics. The art is picking the cheapest location that meets your durability and sharing requirements.
Conversation Context (Ephemeral, Window-Bounded)
The agent's working memory — everything that has happened in this session. It grows as the conversation progresses, and it is the most natural place for state to accumulate.
- Bounded by the context window — eventually gets compressed or truncated
- Ephemeral — gone when the session ends, no persistence guarantee
- Free to write (it is just conversation), free to read (the model always sees it)
- Positional bias matters: information near the start or end of context gets more attention than information buried in the middle
- Use for: current task state, intermediate results, reasoning chains, scratchpad work
- Do not use for: anything that must survive a session boundary, anything shared with other agents
System Prompt (Persistent Per-Session)
The agent's configuration — loaded at the start of every interaction. This is the most important piece of state in the system because it defines who the agent is and what it does.
- Persistent within a session, but static — it does not learn or change during conversation
- High-attention position — the model weights system prompt content heavily
- Competes with working memory for context budget
- Use for: identity, capabilities, constraints, behavioral rules, output format requirements, tool usage guidance
- Do not use for: dynamic state, user-specific data that changes per request, large reference material that should be fetched on demand
External Stores (Persistent, Shared)
Files, databases, key-value stores, vector databases — anything the agent accesses via tool calls. This is the only state that persists across sessions and can be shared between agents.
- Persistent across sessions, shareable across agents
- Requires tool calls to read and write — adds latency, token spend, and failure modes
- Can grow without bound (not constrained by context window)
- Use for: accumulated knowledge, user preferences, project state, artifacts, audit logs
- Cost: every read and write is a tool call, which means tokens and latency
Decision rule: conversation context is free but ephemeral. System prompt is free but static. External stores are durable but expensive. Start with conversation context, promote to external stores only when you need persistence or sharing, and keep the system prompt tight and stable.
Prompt Architecture
System prompts are contracts. Design them with the same rigor you would give an API specification. A poorly structured system prompt produces inconsistent behavior — not because the model is unreliable, but because the instructions are ambiguous.
Structure
A well-structured system prompt follows a consistent order. The model processes it sequentially, so put the most important constraints early where they get the strongest attention.
- Identity — who the agent is, one sentence. This anchors all subsequent behavior.
- Capabilities — what it can do, what tools it has access to. Reference the tool list rather than duplicating tool descriptions.
- Constraints — what it must NOT do. Hard boundaries. These must be unambiguous and testable.
- Process — how to approach tasks. Optional for simple agents, essential for complex workflows. Step-by-step when ordering matters, principles when it does not.
- Output format — what the output should look like. Be specific: JSON schema, markdown structure, required fields.
- Context injection point — where dynamic context gets inserted per-session or per-task.
This ordering works because identity and constraints frame everything that follows. An agent that knows its boundaries first makes better decisions about process and output.
Composition
System prompts are rarely monolithic. They compose from layers, and keeping layers separate matters for versioning, testing, and reuse.
- Base prompt — identity, core capabilities, universal constraints. Shared across all instances of this agent type. Changes infrequently.
- Context injection — dynamic data loaded per-session or per-task. User information, project state, relevant history. Changes every session.
- Task-specific instructions — what to do right now. Often comes from the delegating agent as part of the handoff, not from the system prompt itself.
Assemble them with clear delimiters. Use XML tags or markdown headers to separate sections so the model can parse them reliably:
<identity>You are a code review specialist...</identity>
<constraints>Never modify code directly...</constraints>
<context>The project uses Rust with a workspace layout...</context>
<task>Review the changes in the following diff...</task>
This structure makes it obvious what is stable configuration versus what is dynamic input. It also makes it easier to test — you can swap the context and task sections while keeping identity and constraints fixed.
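A minimal assembly function along these lines, with the same four layers as the example above; the function itself is illustrative:

def assemble_prompt(identity: str, constraints: str, context: str, task: str) -> str:
    # identity and constraints are stable, versioned layers;
    # context and task are injected per session / per task.
    return "\n".join([
        f"<identity>{identity}</identity>",
        f"<constraints>{constraints}</constraints>",
        f"<context>{context}</context>",
        f"<task>{task}</task>",
    ])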
Versioning
Version prompts like APIs. When you change a system prompt, the downstream effects are just as real as changing a function signature.
- Breaking changes (different output format, removed capabilities, changed identity) = major version. Any agent that consumes this agent's output may break.
- Behavioral changes (different strategies, new constraints, reordered priorities) = minor version. Output format is stable but results may differ.
- Clarifications and rewording (same intent, clearer language) = patch. Should produce identical behavior.
This matters most when multiple agents depend on each other's output format. If Agent A produces structured JSON that Agent B parses, changing Agent A's output format without updating Agent B is exactly like changing an API without updating the client.
Store prompts in version control alongside the code that uses them. They are configuration, not content.
Interface Versioning
Prompt versioning covers what an agent IS. Interface versioning covers how agents TALK to each other. Both matter.
When Agent A delegates to Agent B, the handoff schema is an interface. When you change what Agent B expects — adding a required field, changing the output format, renaming a key — you are making an interface change.
Treat agent interfaces like API versions:
- Adding optional fields to a handoff = backwards compatible (minor version)
- Adding required fields or changing output format = breaking change (major version)
- When the coordinator expects {"findings": [...]} and the specialist starts returning {"results": [...]}, everything downstream breaks silently
Version the handoff schema alongside the agent's prompt. When you version-bump an agent's output format, check every consumer of that output. This is the agent equivalent of "grep for callers before changing the function signature."
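A sketch of that pre-deploy check; the version pins are hypothetical, and the idea is simply to enumerate consumers whose pinned major version no longer matches the producer's:

def stale_consumers(producer_major: int, consumer_pins: dict[str, int]) -> list[str]:
    # consumer_pins maps consumer agent name -> major version of the
    # producer's output schema that the consumer was written against.
    return [name for name, pinned in consumer_pins.items() if pinned != producer_major]

# Bumping research-agent's output schema to v3 flags the coordinator,
# which still parses v2:
print(stale_consumers(3, {"coordinator": 2, "writer-agent": 3}))  # ['coordinator']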
Context Window as Working Memory
The context window is the agent's RAM. It is finite, and how you allocate it determines what the agent can accomplish in a single session.
Budget Allocation
Think of the context window as a budget with competing demands:
- System prompt: 10-20% — keep it tight. Every word in the system prompt is a word the agent cannot use for reasoning.
- Tool definitions: 5-15% — more tools means less room for actual work. Only include tools the agent will actually use.
- Working state: 40-60% — the conversation, tool outputs, intermediate reasoning. This is where the agent does its job.
- Reserve: 15-25% — room for the next response and unexpected tool outputs. If you budget to 100%, the first large tool output will push critical context out of the window.
If your system prompt plus tool definitions exceed 35% of the window, you are constraining the agent's ability to reason about anything complex. Either trim the prompt, reduce the tool count, or accept that this agent can only handle simple tasks.
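A rough budget check, assuming you can measure each component in tokens with your tokenizer; the thresholds mirror the percentages above:

def check_context_budget(window: int, system_prompt: int, tool_defs: int,
                         reserve_frac: float = 0.20) -> str:
    fixed = system_prompt + tool_defs
    if fixed > 0.35 * window:
        return f"WARN: {fixed} tokens of prompt+tools exceeds 35% of a {window}-token window"
    working = window - fixed - int(reserve_frac * window)
    return f"OK: ~{working} tokens left for working state after a {reserve_frac:.0%} reserve"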
Context Management Strategies
Summarization checkpoints — periodically have the agent summarize completed work and compress the conversation. The agent replaces detailed step-by-step history with a concise summary of what was done and what matters going forward. This is the agent equivalent of garbage collection.
Structured context blocks — use clear delimiters and structure (headers, XML tags, JSON) so the agent can efficiently scan context. Unstructured prose is harder to parse and more likely to be misinterpreted. Structure also helps when you need to reference specific context blocks later.
Sliding window — for long-running tasks, keep only the most recent N turns plus the system prompt. Older turns are summarized or dropped. Simple to implement, but lossy — important details from early in the conversation may be lost if the summarization is not careful.
Selective tool output — configure tools to return only what the agent needs, not everything available. A search tool returning 50 full documents when the agent needs 3 relevant paragraphs wastes most of the window on noise. Design tool outputs to be concise and relevant.
Priority tagging — mark certain context as high-priority (must retain) versus low-priority (can compress or drop). When the window fills, compress low-priority context first. This is more sophisticated but gives you explicit control over what survives.
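To make the sliding-window strategy above concrete, a sketch; summarize stands in for whatever compressor you use (an LLM call, a template), and the lossiness noted above applies:

def slide_window(turns: list[str], keep: int, summarize) -> list[str]:
    # Keep the most recent `keep` turns verbatim; compress everything
    # older into a single summary turn. Prioritize constraints and
    # decisions in the summary, since those are what agents forget.
    if len(turns) <= keep:
        return turns
    summary = summarize(turns[:-keep])
    return [f"[summary of earlier work] {summary}", *turns[-keep:]]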
Shared State Patterns
When multiple agents need to share state, you need a pattern. Each has tradeoffs in complexity, consistency, and scalability.
Artifact Store
Agents read and write named documents or artifacts. Like a shared filesystem with named keys.
- Simple mental model, easy to implement with file tools or a key-value store
- Works well for: document drafting, code generation, report building, any workflow where agents produce and refine artifacts
- Ownership model: ideally one agent writes to a given artifact, others read. If multiple agents write, use last-write-wins — conflict resolution between agents is not worth the complexity
- Challenge: no built-in notification. Agents poll or must be told when an artifact changes.
Blackboard Pattern
A shared workspace where agents post findings. All agents can see everything on the blackboard. Each agent reads the full board, adds its contribution, and moves on.
- Good for: collaborative analysis, research tasks where findings build on each other
- Natural for convergent workflows — multiple specialists contribute to a shared understanding
- Challenge: the blackboard grows. Without a cleanup strategy, the cost of reading the full board grows unbounded and eventually dominates the context budget
- Mitigate with: periodic summarization of the blackboard by a coordinator, archiving completed topics, or partitioning the board into sections
Event Log (Append-Only)
Agents append events to a shared log. Other agents read events they care about, filtered by type or topic. Like a commit log or message queue.
- Good for: audit trails, tracing, event-driven choreography, workflows where ordering matters
- Natural ordering, no conflict resolution needed (appending never conflicts)
- Challenge: reading relevant events from a long log requires filtering. Without indexing, agents spend tokens scanning irrelevant entries.
- Works best with: explicit event types, correlation IDs, and filtering by type or time range
Structured Handoff State
Not persistent — passed directly between agents during delegation. Like function arguments and return values.
- Best for: coordinator-to-specialist delegation where the specialist does not need to share state with other agents
- Include: task description, relevant context (pruned to essentials), expected output format, constraints and budget
- Advantages: no shared mutable state, no consistency problems, clear ownership
- See agent-communication for detailed handoff patterns
Pattern selection rule: use structured handoffs by default. Promote to an artifact store when agents need to share persistent artifacts. Use a blackboard when agents need to see each other's work. Use an event log when you need ordering and auditability.
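To ground the event-log pattern, a minimal in-memory sketch (a real store would be a file, database, or queue); the value is that appends never conflict and reads are filtered:

import time

class EventLog:
    def __init__(self):
        self._events: list[dict] = []

    def append(self, agent: str, event_type: str, payload: dict, correlation_id: str) -> None:
        self._events.append({"ts": time.time(), "agent": agent, "type": event_type,
                             "correlation_id": correlation_id, "payload": payload})

    def read(self, event_type: str | None = None, correlation_id: str | None = None) -> list[dict]:
        # Filtering by type/correlation ID keeps readers from scanning
        # (and spending tokens on) irrelevant entries.
        return [e for e in self._events
                if (event_type is None or e["type"] == event_type)
                and (correlation_id is None or e["correlation_id"] == correlation_id)]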
Cross-Agent Consistency
When Agent A changes state that Agent B relies on, you have a consistency problem. In traditional systems this is solved with transactions and locks. In agent systems, the answer is simpler and more pragmatic.
Design for eventual consistency and idempotent operations.
- Agents should tolerate stale state. If Agent B reads data that Agent A is about to update, the worst case should be wasted work, not corruption. Design operations so that acting on slightly old data produces a suboptimal but not incorrect result.
- Prefer idempotent operations. If an agent retries because it did not see the result of its previous attempt, the outcome should be the same. "Set X to 5" is idempotent. "Increment X" is not.
- When strong consistency is required (rare in practice), use explicit coordination: Agent A completes and signals before Agent B starts. Sequential execution through a coordinator is the simplest form. Do not try to build distributed transactions between agents — the complexity is not worth it.
- Accept that agents will occasionally do redundant work. This is cheaper than building a coordination layer to prevent it.
Concrete failure scenario:
A coordinator delegates two parallel tasks: Agent A updates a project's config file, Agent B reads the config to generate documentation. B starts before A finishes, reads the old config, and generates documentation for the old settings. A completes, config is updated, but the documentation now describes the previous version.
Why this is usually fine: the documentation is stale but not corrupt. A human reviews and catches it, or the next run regenerates correctly. The cost of this inconsistency (one stale document) is far lower than the cost of coordinating A and B with locks (complexity, latency, deadlock risk).
When it is NOT fine: if Agent B's output triggers an irreversible action based on stale data — e.g., deploying with the old config because the documentation said it was current. In these cases, enforce ordering: A completes before B starts. Use the coordinator for sequencing, not locks.
Memory and Persistence
What should survive a session versus what should be recomputed? The answer depends on acquisition cost, stability, and staleness risk.
Persist when:
- The information was expensive to acquire — multi-step research, user interviews, complex analysis
- The information is stable and reusable — user preferences, project conventions, architectural decisions
- Loss would degrade the user experience — accumulated context about a project, learned patterns
Recompute when:
- The information changes frequently — current file contents, git status, test results
- Recomputing is cheap — reading a config file, running a quick search, checking a status endpoint
- Staleness is dangerous — persisted "the tests pass" becomes a lie after code changes, cached "the API is at v2" breaks when the API upgrades
Memory is a cache, not a source of truth. Always verify persisted state against current reality before acting on it. The cost of a verification read is almost always less than the cost of acting on stale data. An agent that confidently acts on month-old cached state will produce confident, wrong results.
Memory Hierarchy
Structure persistent memory in layers, from most stable to most volatile:
- Project knowledge — architecture, conventions, team preferences. Changes rarely. Safe to persist long-term.
- Session summaries — what was accomplished in previous sessions. Useful for continuity but verify before acting.
- Cached analysis — results of expensive computations. Persist with a timestamp and invalidation strategy.
- Ephemeral notes — scratchpad state for the current task. Do not persist — reconstruct from context.
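A cache-with-verification sketch for the cached-analysis layer, matching the "memory is a cache" rule above; the store shape, max_age_s, and optional verify hook are all illustrative:

import time

def read_cached(store: dict, key: str, max_age_s: float, recompute, verify=None):
    # Trust persisted state only if it is fresh AND survives a cheap
    # verification against current reality; otherwise recompute.
    entry = store.get(key)
    if entry and (time.time() - entry["ts"]) < max_age_s:
        if verify is None or verify(entry["value"]):
            return entry["value"]
    value = recompute()
    store[key] = {"value": value, "ts": time.time()}
    return value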
Anti-Patterns
The God Prompt
A system prompt that tries to cover every possible scenario. 3000 words of instructions, edge cases, and conditional logic. The model cannot prioritize when everything is priority one. Keep system prompts focused. If you need conditional behavior, use context injection to load the relevant instructions for this specific task.
Stateless Agents in Stateful Workflows
Agents that forget everything between calls, in workflows where continuity matters. Every call starts from scratch, re-reads the same files, re-discovers the same context. If a workflow has multiple steps that build on each other, persist the intermediate state explicitly — do not rely on the next agent to rediscover it.
Unbounded Context Accumulation
Agents that never summarize, never compress, and just keep appending to context until the window fills and critical information gets silently dropped. The most dangerous form of this is when the dropped information includes constraints — the agent starts violating rules it was given early in the conversation because those rules are no longer in the active window.
Shared Mutable State Without Ownership
Multiple agents reading and writing the same state with no coordination. This works until it does not — and when it fails, the debugging is painful because the state corruption happened turns ago with no trace. Assign clear ownership: one agent writes, others read.
Prompt Drift
System prompts evolve informally — a tweak here, a new constraint there, a reworded section — with no versioning, no changelog, and no compatibility checks. Over weeks, the coordinator's expected output format drifts out of sync with what the specialist actually produces. The handoff schemas that worked last month silently break.
This is the agent equivalent of changing a library's API without bumping the version. The fix is the same: version prompts, version interfaces, and test compatibility when either changes. If you change a specialist's output format, check every agent that consumes it — before deploying, not after.
Related Skills
- For how agents communicate state during delegation, see agent-communication
- For deciding which agents own which state, see agent-decomposition
- For observing state flow and debugging, see agent-observability
- For designing the tools agents use to access state, see tool-design
Prompt Templates — Role-Based Examples
Four role-based system prompt templates. Each follows the structure: identity, capabilities, constraints, process, output format, context injection.
1. Coordinator Agent
<!-- [IDENTITY] -->
You are a project coordinator. You decompose tasks, delegate to specialists,
evaluate outputs, and synthesize a final result.
<!-- [CAPABILITIES] -->
Available specialists:
- code-review: Evaluates code changes for correctness, style, security.
- research: Searches codebases and docs to answer technical questions.
- implementation: Writes or modifies code.
- test-writer: Creates test cases.
<!-- [CONSTRAINTS] -->
- Never write code yourself. Delegate all code tasks.
- Never fabricate information. If no specialist can answer, say so.
- Max 3 delegation levels. Restructure as parallel subtasks if deeper.
- Stay within <budget>. Abort gracefully if exhausted.
<!-- [PROCESS] -->
1. Decompose the request into discrete subtasks.
2. For each, prepare a structured handoff: task, context, expected output.
3. Execute. Prefer parallel when subtasks are independent.
4. Evaluate outputs. Retry or escalate on failure.
5. Synthesize into a coherent response.
<!-- [OUTPUT FORMAT] -->
- One-paragraph summary. Detailed results by subtask. Unresolved issues.
<!-- [CONTEXT INJECTION] -->
<context>{{project_description}} {{user_preferences}}</context>
<budget>{{token_budget}}</budget>
Pattern: knows when to use each specialist but never does their work. Process enforces decompose-delegate-evaluate-synthesize.
2. Specialist Agent (Code Review)
<!-- [IDENTITY] -->
You are a code review specialist. You analyze changes for correctness,
security, maintainability, and adherence to project conventions.
<!-- [CAPABILITIES] -->
Tools:
- Read: Examine source files referenced in diffs.
- Glob: Locate related files (tests, configs, types).
- Grep: Find usages of changed functions, verify naming.
<!-- [CONSTRAINTS] -->
- Read-only. Report findings, never modify files.
- Stay within changed code scope. Note risky dependencies, do not review them.
- Flag out-of-scope issues with a recommended specialist.
- Max 20 tool calls. If more needed, scope is too large.
<!-- [PROCESS] -->
- Read the full diff first. Understand intent before judging.
- Check each function for: correctness, error handling, edge cases, security.
- Verify findings against actual code. No assumption-based reports.
<!-- [OUTPUT FORMAT] -->
## Summary
approve | request changes | needs discussion.
## Findings
- **File**: path **Line**: N **Severity**: critical|warning|suggestion
- **Issue**: description **Recommendation**: what to do instead
## Out-of-Scope
Flagged issues with recommended specialist.
<!-- [CONTEXT INJECTION] -->
<conventions>{{coding_standards}}</conventions>
<scope>{{diff_content}}</scope>
Pattern: pure analysis, never modifies. Out-of-scope flags prevent information loss.
3. Validator Agent
<!-- [IDENTITY] -->
You are a validation agent. You verify completed work meets acceptance
criteria. You produce a clear pass or fail with evidence.
<!-- [CAPABILITIES] -->
Tools:
- Read: Verify file content. Bash: Run tests, linters, builds.
- Grep: Verify conventions. Glob: Verify expected outputs exist.
<!-- [CONSTRAINTS] -->
- Do not fix issues. Report them.
- No soft failures. Unmet criterion = FAIL, not "mostly passes."
- Every finding needs evidence: file path, command output, line number.
- Cannot verify = UNVERIFIABLE, not passed.
<!-- [PROCESS] -->
1. Read criteria from <criteria>.
2. Execute verification for each. Record evidence.
3. Pass/fail each independently. PASS overall only if all pass.
<!-- [OUTPUT FORMAT] -->
## Result: PASS | FAIL
## Checks
- **Criterion**: what **Status**: PASS|FAIL|UNVERIFIABLE
- **Evidence**: observed output **Details**: explanation if not PASS
## Escalation
Issues requiring human judgment.
<!-- [CONTEXT INJECTION] -->
<criteria>{{acceptance_criteria}}</criteria>
<work>{{paths_or_artifacts_to_validate}}</work>
Pattern: binary with no wiggle room. UNVERIFIABLE is explicit, not a silent skip.
4. Transformer Agent (Data Reshaping)
<!-- [IDENTITY] -->
You are a data transformation agent. You convert data between formats
according to explicit mapping rules. Same input always produces same output.
<!-- [CAPABILITIES] -->
Tools: Read (input files only). No Write or Bash. Output IS the result.
<!-- [CONSTRAINTS] -->
- Never add information not in the input. Transform, do not enrich.
- Never drop fields silently. Unmapped fields go in unmapped_fields.
- Malformed input = error response, not best-effort transformation.
- Ambiguous mapping = error listing the ambiguity, not a guess.
<!-- [PROCESS] -->
1. Validate input against <input-schema>. If invalid, return error.
2. Apply mapping rules from <mapping>.
3. Validate output against <output-schema>. Return result.
<!-- [OUTPUT FORMAT] -->
Success: {"status":"success","output":{...},"unmapped_fields":[...]}
Error: {"status":"error","error_type":"...","details":"..."}
<!-- [CONTEXT INJECTION] -->
<input-schema>{{input_schema}}</input-schema>
<output-schema>{{output_schema}}</output-schema>
<mapping>{{field_mapping_rules}}</mapping>
<input>{{data}}</input>
Pattern: no side-effect tools. Unmapped fields surfaced, not dropped. Malformed input fails loudly.
Adapting These Templates
- Identity — one sentence, unambiguous scope.
- Capabilities — tools with usage guidance, not just names.
- Constraints — hard rules. "Try to" is a suggestion, not a constraint.
- Output format — schema, not prose. Parseable by other agents and by code.
- Context injection — clearly delimited. Obvious what is stable vs dynamic.
- Test — vary context injection, keep everything else fixed. Inconsistency means ambiguous stable sections.
5. Composed System — Three Agents Working Together
How the templates above fit together in a coordinator + specialist + validator flow.
The Flow
User: "Review this PR for security issues"
│
▼
┌─────────────────┐
│ Coordinator │ ← prompt: knows specialists, delegates, synthesizes
│ (template 1) │
└────────┬────────┘
│ handoff: {task: "security review", scope: "diff content", budget: 5000}
▼
┌─────────────────┐
│ Code Review │ ← prompt: reads code, analyzes, reports findings
│ Specialist │
│ (template 2) │
└────────┬────────┘
│ output: {status: "request changes", findings: [...]}
▼
┌─────────────────┐
│ Validator │ ← prompt: checks findings have evidence, no soft passes
│ (template 3) │
└────────┬────────┘
│ output: {result: "PASS", checks: [...]}
▼
┌─────────────────┐
│ Coordinator │ ← synthesizes validated findings into user response
└─────────────────┘
What Gets Injected Where
Coordinator receives (system prompt context injection):
<context>
Project: rust-web-api, Language: Rust
User preference: focus on security, skip style nits
</context>
Specialist receives (via structured handoff, NOT system prompt):
<conventions>No unsafe blocks without comment. All SQL via query builder.</conventions>
<scope>[diff content inserted here]</scope>
The specialist's system prompt is stable — the same template every time. Only <conventions> and <scope> change per task. This separation means you can version the prompt independently from the per-task context.
Validator receives (via structured handoff):
<criteria>
- Each finding references a specific file and line
- Each finding has a severity level
- Security findings include CWE or OWASP reference
</criteria>
<work>[specialist output inserted here]</work>
Key Design Decisions
- The specialist never sees the user's original message. It sees the coordinator's structured handoff. This prevents the specialist from being influenced by conversational context that isn't relevant to its task.
- The validator doesn't know what the specialist was asked to do. It only sees the output and the criteria. This prevents the validator from being biased by the task description.
- Dynamic context flows through handoffs, not system prompts. The system prompts are stable templates. Task-specific data is injected via the handoff's delimited sections.
name: plan-agentic-system description: Use when the user wants to plan, scope, or design a new agentic system from scratch through an interactive discovery process. Triggers on "plan an agent system", "help me design my agents", "I want to build a multi-agent system", "plan agentic system", or when the user needs guided discovery of what their agent architecture should look like. version: 1.0.0
Plan Agentic System — Interactive Architecture Discovery
An interactive, question-driven process for designing an agentic system. Your job is to be the architect interviewing the client. The user knows their domain but may not see all the architectural possibilities. You ask the questions, surface options they haven't considered, and progressively build a complete system design.
Do not rush to a design. The discovery phase is the most valuable part. A mediocre design built on thorough understanding beats an elegant design built on assumptions.
How This Works
This is a multi-phase conversation, not a one-shot generation. Each phase ends with questions to the user. Do not proceed to the next phase until you have answers. Use the AskUserQuestion tool to ask structured questions with options where appropriate — this helps the user think through choices they might not have considered.
At the end, you produce a complete architecture document using patterns from the agentic-systems skills: agent-decomposition, agent-communication, tool-design, agent-state, and agent-observability.
Phase 1: Problem Space Discovery
Goal: understand what the user is trying to build and why. Do not discuss agents yet — understand the problem first.
Ask about:
1.1 The Mission
- What is this system supposed to accomplish? What's the core job?
- Who are the users? (Developers? End users? Internal teams? Automated pipelines?)
- What does success look like? How will they know it's working?
1.2 Current State
- How is this problem solved today? (Manually? Existing software? Not at all?)
- What's painful about the current approach? What breaks, what's slow, what's expensive?
- Is there existing infrastructure this needs to integrate with?
1.3 Constraints
- Are there budget/cost constraints? (Token spend matters in agent systems)
- Latency requirements? (Real-time user-facing vs batch processing vs async)
- Security/compliance? (What data can agents access? Are there audit requirements?)
- Scale? (10 requests/day vs 10,000/hour changes the architecture dramatically)
1.4 Compliance and Audit
- Are there regulatory requirements? (HIPAA, GDPR, SOC2, industry-specific?)
- What data can agents access? What data must they NOT access?
- Do you need an audit trail of agent decisions? For how long?
- Who should be able to review what agents did? (Just engineering, or compliance/legal too?)
- Are there retention requirements for agent outputs or traces?
Surface hidden constraints:
- "If agents will handle customer data, GDPR gives users the right to explanation — you may need decision tracing not just for debugging but for compliance."
- "Audit requirements often mean you need immutable logs of every agent action, not just errors. This affects your observability design from day one."
Surface possibilities the user may not see:
- "You mentioned X is done manually today — have you considered that an agent could handle the Y part while a human reviews the Z part?"
- "This workflow has a natural split between research and execution — that maps well to separate agents with different capabilities."
- "Given your latency requirements, we might want a fast cheap model for triage and a capable model for the hard cases."
Phase 2: Capability and Integration Discovery
Goal: map out every capability the system needs and every external system it touches. This is where you discover tools and services the user may not have thought to integrate.
Ask about:
2.1 Data Sources
- What data does the system need to access? (Databases, APIs, files, documents, web?)
- Where does this data live? (Internal services, SaaS products, public web, local files?)
- How frequently does the data change? (Real-time, daily, static?)
- Are there APIs already available, or would tools need to be built?
Probe deeper — users often forget sources:
- "You mentioned using Jira for project tracking — do you also have Confluence or a wiki with documentation that agents could search?"
- "If the agent needs customer context, is there a CRM? Support ticket history?"
- "Are there monitoring dashboards or logs the agent could query instead of asking a human?"
2.2 Actions and Side Effects
- What actions should the system be able to take? (Create, update, delete, send, deploy?)
- Which of these are reversible? Which are permanent?
- Which actions need human approval before execution?
- Are there existing APIs or CLIs for these actions, or would they need to be built?
Probe for automation potential:
- "You said the output is a report — does it need to be reviewed, or could it be sent directly?"
- "This involves creating tickets — is there an API for that, or is someone copying from chat to the ticketing system today?"
- "Are there actions a human does routinely that are low-risk enough to automate fully?"
2.3 Existing Tools and Services
Walk through what's already available:
- Code repositories and CI/CD pipelines
- Communication tools (Slack, email, Teams)
- Project management (Jira, Linear, GitHub Issues)
- Documentation systems (Confluence, Notion, wikis)
- Monitoring and observability (Grafana, Datadog, CloudWatch)
- Databases and data warehouses
- Custom internal services and APIs
- MCP servers already deployed or available
For each: ask about API availability, authentication requirements, and rate limits.
2.4 Knowledge and Context
- Is there domain knowledge the agents need that isn't in a database? (Tribal knowledge, conventions, unwritten rules?)
- Are there reference documents, style guides, runbooks, or playbooks?
- Does the system need to learn from feedback over time, or is it stateless?
2.5 Operations and Deployment
- How will changes to agents be deployed? (All at once? Gradual rollout? Can you canary a new prompt?)
- How will you know if a change degraded performance? What metrics define "working well"?
- What's the rollback plan if a new agent version misbehaves?
- How frequently do you expect to update agent prompts or capabilities?
- What's your cost tolerance? (Per-request budget? Monthly ceiling?)
Probe for model strategy:
- "Do all agents need the same model, or can some use cheaper models for simpler tasks?"
- "What's your latency tolerance? Opus thinks deeper but slower. Haiku is fast but shallower. The right mix depends on your tasks."
- "How do you want to handle model deprecation? When Claude's next version ships, what's your migration plan?"
Phase 3: Workflow Mapping
Goal: map the end-to-end workflows the system must support. This is where the agent structure starts to emerge.
3.1 Walk Through Concrete Scenarios
Ask the user to describe 2-3 concrete examples of the system being used:
- "Walk me through a typical request from start to finish. What happens at each step?"
- "Now walk me through a hard case — one where things get complicated or require judgment."
- "What's a failure case? When does the current process break?"
For each scenario, identify:
- Decision points — where does the workflow branch?
- Handoffs — where does responsibility shift from one person/system to another?
- Bottlenecks — what step takes the longest or fails the most?
- Quality gates — where does someone review before proceeding?
3.2 Identify Natural Agent Boundaries
Based on the workflows, surface potential decomposition to the user:
- "Steps 1-3 are all about research and gathering information. Steps 4-5 are about generating output. These could be separate agents with different tool sets."
- "This decision point looks like a coordinator's job — route to the right specialist based on the request type."
- "The review step is a natural validator agent — it checks the output before it goes to the user."
Present the emerging topology and ask: "Does this mapping feel right? Is there a step I'm oversimplifying?"
3.3 Volume and Patterns
- How often is each workflow triggered? (Per hour? Per day? On demand?)
- Are there peak times? Batch processing windows?
- Can workflows run concurrently, or are there serialization constraints?
- What's the typical vs worst-case complexity of a request?
3.4 Failure Mode Analysis
Walk through failure scenarios with the user:
- "What's the worst thing an agent could do in this system? What's the blast radius?"
- "Walk me through a failure case — a tool is down, an agent hallucinates, a handoff loses context. What should happen?"
- "Which operations are reversible? Which are permanent? For permanent ones, what's the human approval flow?"
- "What's the cost of an error? Is it hours of lost work? Money? Customer trust? Safety?"
- "What does 'partial success' look like? If 3 of 4 steps succeed, is that useful or dangerous?"
Surface failure modes the user hasn't considered:
- "If the research agent returns confidently wrong information, the downstream agents will act on it. How do you want to catch this?"
- "What happens during a model outage? Does the system queue work, degrade to a simpler flow, or fail entirely?"
- "If two agents produce conflicting results, who arbitrates — a third agent, the coordinator, or a human?"
Phase 4: Architecture Proposal
Goal: present a concrete architecture based on everything discovered. Use patterns from the agentic-systems skills.
4.1 Agent Topology
Apply agent-decomposition patterns:
- Draw an ASCII topology diagram showing all agents and their relationships
- For each agent: name, role, model tier, tool set, what it delegates and to whom
- Justify the decomposition — why these agents, why these boundaries?
- Call out where you chose NOT to split and why
4.2 Communication Design
Apply agent-communication patterns:
- Orchestration vs choreography decision with rationale
- Delegation patterns for each agent-to-agent communication
- Handoff schemas — what gets passed between agents
- Trust boundaries — which agents have access to which capabilities
- Human-in-the-loop checkpoints — where and why
4.3 Tool Inventory
Apply tool-design patterns:
- Complete list of tools needed, organized by agent
- For each tool: name, description, input/output contract, side effects
- Flag which tools already exist (discovered in Phase 2) vs which need to be built
- Prioritize: which tools are essential for v1, which can wait?
4.4 State Strategy
Apply agent-state patterns:
- Where state lives for each agent (context, external store, shared workspace)
- System prompt architecture for each agent role
- Context budget estimates — will the workflows fit in context windows?
- What persists across sessions vs what is ephemeral
4.5 Observability Plan
Apply agent-observability patterns:
- What to trace and how
- Error handling strategy per agent
- Cost estimates and budget alerts
- How to debug when things go wrong
4.6 Phased Rollout
Propose an incremental build plan:
- Phase 1: minimal viable system — fewest agents, core workflow only
- Phase 2: add specialist agents as complexity demands
- Phase 3: add observability, resilience, and optimization
- Call out decision points: "after Phase 1, you'll know whether X warrants splitting into its own agent"
Phase 5: Write the Design Document
Once the user approves the architecture, write a design document to a file. Ask the user where they want it (default: agentic-system-design.md in the project root).
The document should include:
- Problem statement — what this system solves (from Phase 1)
- System context — integrations, data sources, constraints (from Phase 2)
- Workflows — the concrete scenarios mapped (from Phase 3)
- Architecture — topology, communication, tools, state, observability (from Phase 4)
- Phased rollout plan (from Phase 4.6)
- Open questions — things that need validation or user decisions before implementation
Principles
- Ask, don't assume. When in doubt, ask the user. A wrong assumption early compounds into a wrong architecture.
- Surface hidden possibilities. The user knows their domain but may not see which existing tools, APIs, or services could be leveraged by agents. Your job is to discover these.
- Challenge gently. If the user proposes something that seems overengineered or underengineered, say so with reasoning. "You could do that, but here's a simpler approach that achieves the same thing" or "That sounds simple but will hit X problem at scale."
- Start simple, earn complexity. Always propose the simplest viable architecture first. Add agents and patterns only when justified by concrete needs discovered in the conversation.
- Make the implicit explicit. Users often have unspoken assumptions about latency, cost, reliability, or quality. Surface these. "You haven't mentioned error handling — what should happen when the research step fails? Is partial results acceptable or do you need retries?"
- Know where the analogy breaks. This skill treats agents like microservices, and the analogy is productive — but it has limits:
- Nondeterminism: microservices are deterministic (same input → same output). Agents are not. The same prompt can produce different results. This makes caching harder, testing harder, and debugging harder. Design for variance, not consistency.
- Ephemeral state: microservices have durable state (databases). Agent state is ephemeral by default (context window). If you don't explicitly persist it, it's gone. Recovery requires checkpointing, not just restarting.
- Composition depth: microservices can compose to arbitrary depth. Agents lose context at each hop and degrade after 2-3 levels. Flat is better than deep. If you need depth, use structured handoffs aggressively.
- Fragile interfaces: microservice APIs are formally specified (schemas, types). Agent "interfaces" are prompts — informal, brittle, and subtly version-dependent. A small prompt change can alter output format in ways that break consumers silently.
Related Skills
This skill is a comprehensive interactive process that draws from all five agentic-systems architecture skills:
- For agent boundary and topology decisions, see agent-decomposition
- For delegation and trust patterns, see agent-communication
- For tool interface design, see tool-design
- For state management and prompt architecture, see agent-state
- For tracing, resilience, and cost tracking, see agent-observability
Each skill can also be used independently for targeted guidance on a specific concern.
name: tool-design description: Use when the user asks about designing tools for agents, tool granularity, tool schemas, input/output contracts, error contracts for tools, tool composability, tool descriptions, idempotency, or when building the tool layer of an agentic system. version: 1.0.0
Tool Design — Interfaces, Contracts, and Composability
Tools are the hands of an agent. The quality of your agentic system is bounded by the quality of its tool interfaces. A brilliant agent with poorly designed tools will produce poor results — it will call the wrong tool, pass the wrong parameters, misinterpret the output, and burn tokens recovering from avoidable confusion. Tool design is API design. The consumer just happens to be an LLM instead of a developer. This changes the priorities (descriptions matter more, consistency matters more, error clarity matters more) but the core discipline is identical: design for the caller, not for the implementer.
Tool Granularity
The right granularity: a tool should do one meaningful thing that the agent cannot accomplish through reasoning alone.
Too granular — the agent spends tokens orchestrating micro-steps. If the agent always calls tool A then tool B then tool C in that exact sequence, those should be one tool. You are forcing the agent to be a workflow engine, and agents are bad workflow engines. Every tool call is a decision point where the agent can make a mistake. Minimize unnecessary decision points.
Too coarse — the agent loses control. If a tool does 5 things and the agent only needed 1, the other 4 are wasted work or, worse, unwanted side effects. A tool that "creates a project, initializes git, installs dependencies, and opens the editor" is four tools pretending to be one. The agent that just wanted to create a directory now has an editor window open.
The litmus test: can the agent meaningfully choose NOT to call this tool in some scenarios? If the answer is always yes — sometimes the agent needs this, sometimes it doesn't — the granularity is right. If the agent must always call it as part of every workflow, it should be automatic or implicit, not a tool. If the agent never calls it independently (always paired with another tool), merge them.
Compound tools are fine when the compound operation is the natural unit of work. search_and_rank is better than separate search + rank if ranking without searching never makes sense. The boundary is: does the combination represent a coherent operation, or is it just bundling for convenience?
Prefer fewer, well-designed tools over many narrow ones. An agent with 50 tools has a harder time choosing the right one than an agent with 12. If you find yourself adding tools that overlap in purpose, consolidate. Tool sprawl is the tool equivalent of microservice sprawl — it shifts complexity from the implementation to the coordination layer, which is the worst place for it in an agentic system.
How to evaluate granularity in practice: list every tool your agent has access to and ask: "If I removed this tool, what task becomes impossible?" If the answer is "nothing becomes impossible, another tool mostly covers it," merge or remove it. Then ask: "If I split this tool in two, would the agent use each half independently?" If yes, split. If not, keep it whole.
Watch for emergent sequences. Once your agent is running, look at its tool call traces. If you see the same 3-tool sequence appearing in 80% of completions, that sequence is a candidate for a compound tool. The agent is telling you where your granularity is wrong.
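A sketch of mining those traces, assuming each completed task yields an ordered list of tool names:

from collections import Counter

def frequent_sequences(traces: list[list[str]], n: int = 3) -> Counter:
    # Count every n-call window across traces; sequences that dominate
    # the counts are candidates for a compound tool.
    counts: Counter = Counter()
    for calls in traces:
        for i in range(len(calls) - n + 1):
            counts[tuple(calls[i:i + n])] += 1
    return counts

# e.g. if ("glob", "read", "grep") shows up in most completions,
# consider folding it into a single explore_code tool.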
Schema Design
Tool inputs and outputs are the agent's API contract. Design them like you would design a public API — because that is exactly what they are.
Inputs
- Every parameter should have a clear, unambiguous name. query is better than input. max_results is better than limit (which limit?). file_path is better than path (path to what?). The agent reads the parameter name and the description to decide what to pass. Ambiguous names cause ambiguous behavior.
- Use enums for constrained choices. Don't make the agent guess valid values. If a parameter accepts "json", "csv", or "text", say so in the schema. Free-text parameters that secretly only accept specific values are a trap.
- Required vs optional: required parameters should be the minimum needed to do the operation at all. Optional parameters with sensible defaults let the agent operate simply in the common case and precisely in the advanced case. If you have more than 3-4 required parameters, the tool is probably too complex.
- Avoid boolean flags that change behavior dramatically. If dry_run=true and dry_run=false produce fundamentally different behavior (one reads, one writes), these should be separate tools. The agent reasons about tool safety from the tool description. A tool that is "safe" in one mode and "destructive" in another mode forces the agent to track state it should not have to.
- Accept the most natural input format. If the agent will have a file path, accept a file path — don't require a file ID that forces a lookup first. If the agent will have natural language, accept natural language — don't require structured syntax.
- Validate early, fail fast. Check all inputs before doing any work. An agent that gets a validation error after a tool has already partially executed is in an ambiguous state. Did the side effect happen or not? Validate everything up front, then execute. If validation fails, return a clear error listing all invalid parameters at once — not just the first one found.
- Document valid ranges. If max_results has a ceiling of 1000, say so. If query must be under 500 characters, say so. Undocumented limits cause silent truncation or cryptic failures that the agent cannot diagnose.
Outputs
- Structured output when downstream processing is needed. If another tool or agent will consume this result, return JSON with a predictable shape. The agent should not have to parse prose to extract a value.
- Natural language output when the result is for reasoning. If the agent needs to think about the result (summarize it, make a judgment call, explain it to the user), natural language is fine. Not everything needs to be JSON.
- Include metadata. Status, count, pagination info, timestamps. Don't make the agent infer "there are more results" from the absence of results. Don't make the agent guess whether an empty list means "no matches" or "something went wrong."
- Consistent output shape. Success and error cases should have the same top-level structure. If success returns {"status": "ok", "data": [...]}, then failure should return {"status": "error", "error": {...}} — not a raw string or a different shape entirely. The agent should not have to detect the output format before it can process the output.
- Trim aggressively. Return what the agent needs, not what the implementation happens to have. An agent working in a 200k context window does not need 50KB of raw log output. Summarize, filter, truncate — and offer a way to get the full data if needed (write to file, paginate).
- Prefer stable output ordering. If a tool returns a list, sort it deterministically. The agent may compare outputs from successive calls to detect changes. Non-deterministic ordering makes comparison unreliable and wastes reasoning effort.
Error Contracts
How a tool communicates failure determines whether the agent can recover or just flails. Error design is not an afterthought — it is half the interface.
Recoverable Errors
The agent can retry or try a different approach. The error must include: what went wrong, why, and what the agent could try instead.
{
"status": "error",
"error": "rate_limited",
"message": "API rate limit exceeded (60 requests/minute)",
"recoverable": true,
"retry_after_seconds": 30,
"suggestion": "Reduce batch size or wait before retrying"
}
The agent reads this and knows: wait 30 seconds, then retry. Or reduce batch size. It has a plan.
Terminal Errors
Nothing the agent can do. Be explicit about this so the agent does not waste tokens retrying.
{
"status": "error",
"error": "not_found",
"message": "Repository 'foo/bar' does not exist",
"recoverable": false
}
The agent reads this and knows: stop trying, inform the user.
What Not to Return
- Raw stack traces. Useless to an agent, wastes context. The agent cannot fix your NullPointerException.
- Ambiguous errors. "Something went wrong" gives the agent nothing to work with. It will guess, and it will guess wrong.
- Silent success on failure. Returning {"status": "ok"} when the operation actually failed is the worst possible outcome. The agent proceeds confidently on a false foundation.
Error Categories the Agent Should Distinguish
| Category | Agent Response | Example |
|---|---|---|
| Input validation | Fix the inputs and retry | "Parameter 'date' must be ISO 8601 format" |
| Transient failure | Wait and retry | "Connection timeout after 30s" |
| Permission denied | Escalate or abort | "API key lacks write access to this resource" |
| Not found | Search differently or inform user | "No file matching pattern '*.rs' in /empty-dir" |
| Resource exhausted | Back off or reduce scope | "Result set exceeds 10MB limit, add filters" |
Design your error responses so the agent can distinguish these categories programmatically — not by parsing English sentences.
The recoverable field. This single boolean saves more wasted tokens than any other error design choice. When the agent sees "recoverable": false, it stops trying. Without it, the agent may retry a permissions error five times, burning tokens and time on an operation that will never succeed. Cheap to implement, massive impact.
Error messages should be written for the agent, not for a log file. The agent does not need to know which line of code threw the exception. It needs to know what it can do about it. "Query parameter 'since' must be an ISO 8601 timestamp, received '2 days ago'" is actionable. "ValueError at line 342 in parser.py" is not.
Composability
Tools that compose well create emergent capability. Tools that don't compose force the agent to be the glue code, and agents are unreliable glue code. But composability is not always a virtue — for safety-critical flows where tools must be called in a specific sequence with specific guards, constrain the sequence explicitly rather than relying on the agent to discover the correct composition. The goal is composability where flexibility helps, and rigidity where safety demands it.
Output-to-input compatibility. The output of tool A should be directly usable as input to tool B without the agent reformatting. If the agent is always extracting a field from one tool's output to pass to another, you have a design problem. Either adjust the schemas to be compatible, or create a compound tool that handles the pipeline internally.
Consistent conventions across all tools. All tools that return lists use the same pagination pattern. All tools that accept identifiers use the same ID format. All tools that accept paths use the same path format (absolute vs relative, with or without trailing slash). Consistency reduces the cognitive load on the agent. Every inconsistency is a potential error.
Pipeline-friendly design. Tools that naturally chain (fetch then transform then store) should have compatible interfaces. The output of fetch should be a valid input to transform without the agent understanding serialization formats, encoding, or data layout.
Avoid hidden coupling. If tool B only works after tool A has been called (because A sets up some state that B depends on), make this explicit. Better: make B accept the state as a parameter so the dependency is visible in the schema, not hidden in runtime behavior.
Test composability by chaining. Take your tool set and attempt common multi-step tasks using only tool outputs as subsequent tool inputs. If you find yourself mentally reformatting data between steps — extracting an ID from a nested object, converting a timestamp format, joining fields into a string — your schemas have friction. Smooth that friction at the tool boundary, not in the agent's reasoning.
Shared vocabulary. Define a glossary of terms that all tools use consistently. If one tool calls it user_id and another calls it userId and a third calls it account_id referring to the same concept, the agent will eventually mix them up. One name, one concept, everywhere.
Idempotency and Side Effects
Agents retry. Agents sometimes call tools they have already called. Agents sometimes call tools speculatively to see what happens. Your tools must handle all of this gracefully.
Read tools should always be idempotent and safe to call any number of times. No side effects. No rate limiting that punishes repeated reads. If reading has side effects (audit logging, view counts), those should be invisible to the agent.
Write tools should be idempotent where possible. Creating a resource that already exists should return the existing resource, not fail with a conflict error. Updating a resource to a state it is already in should succeed, not complain. The agent should not have to check-then-act — that pattern is both wasteful and racy.
Label side effects explicitly. The tool description should state whether calling it changes state. Agents reason about safety and reversibility. A tool described as "Get project details" that secretly triggers a webhook on every call is a violation of trust.
Side effect categories:
| Category | Safety | Retry? | Example |
|---|---|---|---|
| Pure read | Always safe | Always | Search files, get status |
| Idempotent write | Safe to retry | Yes | Update config, set value |
| Non-idempotent write | Each call changes state | Carefully | Send email, post message |
| Destructive | Cannot be undone | Never blind | Delete resource, force push |
Make the category obvious from the tool name and description. Agents that accidentally call a destructive tool because it was poorly labeled are a design failure, not an agent failure.
Naming conventions that signal intent. Prefix or suffix tool names to make the side-effect category obvious at a glance. get_*, list_*, search_* for pure reads. create_*, update_*, set_* for idempotent writes. send_*, post_*, trigger_* for non-idempotent writes. delete_*, destroy_*, drop_* for destructive operations. The agent should be able to infer the safety profile from the name alone, before reading the description.
State visibility. If a write tool changes state that other tools will reflect, document this. "After calling deploy_service, subsequent calls to get_service_status will show the new deployment." The agent needs to build a mental model of how tools affect each other. Don't make it guess.
Description as Interface
The tool description is the only thing between the agent and correct tool usage. It is the API documentation, the README, the type signature, and the docstring — combined into one piece of text that must be complete enough for an LLM to use the tool correctly on the first try.
A good description answers four questions:
- What does this tool do? One sentence. No jargon the agent would not know.
- When should I use it? Trigger conditions — what situation calls for this tool?
- What should I expect? Output shape, common outcomes, rough size of results.
- When should I NOT use it? Common confusion with other tools, misuse cases.
Example — good:
Search for files matching a glob pattern in the codebase. Use when you need to find files by name or extension. Returns a list of matching absolute file paths sorted by modification time. Do not use this to search file contents — use Grep for content search.
Example — bad:
File search utility.
The bad description leaves the agent to guess: search by name? By content? By metadata? What does it return? When would I use this vs the other search tool? Every unanswered question is a coin flip the agent will get wrong some percentage of the time.
Description length: longer is better than ambiguous, but concise is better than verbose. Aim for 2-4 sentences. If you need more, the tool is probably too complex.
Include examples in descriptions for tools with non-obvious input formats. "Accepts glob patterns like **/*.ts or src/**/*.rs" is more useful than "Accepts a pattern string."
Describe boundaries between similar tools. If you have search_files and grep_files, the description of each should mention the other and explain the boundary. "Use search_files for name/path matching; use grep_files for content matching." Agents encounter all tools simultaneously and need to choose between them. Make the choice obvious.
Test your descriptions. Give the tool list (names and descriptions only, no code) to an LLM and ask it which tool to use for a set of tasks. If it picks wrong, your descriptions are unclear. This is the cheapest test you can run and it catches the most common tool design failures.
Anti-Patterns
Kitchen-Sink Tool — one tool with an action or mode parameter that completely changes behavior. manage_database(action="create") and manage_database(action="drop") should be separate tools. The agent cannot reason about safety when the same tool creates and destroys.
Brittle Tool — fails on any unexpected input. Agent inputs are noisy. Agents add extra whitespace, include quotes around values that don't need them, use slightly different date formats. Good tools handle minor formatting variations. Validate strictly on semantics, loosely on syntax.
Opaque Tool — returns "done" or "success" with no useful detail. The agent needs to know WHAT happened, not just that something happened. Return the created resource, the matched count, the operation summary. "Created user with ID 'abc-123'" is actionable. "Success" is not.
Chatty Tool — returns megabytes of raw data. The agent's context window is finite and expensive. Tools should return relevant, filtered results. If the full data is needed, write it to a file and return the path. Do not dump a 10,000-line log into the agent's context.
God Tool — does everything. "execute_workflow" that takes an arbitrary workflow definition is not a tool, it is a sub-system. The agent cannot reason about what it does because what it does depends entirely on the input. Break it down into tools the agent can understand individually.
Secret-State Tool — behavior depends on hidden mutable state that the agent cannot observe. If calling tool A changes what tool B returns, and the agent has no way to know this, you have created a trap. Make state explicit in inputs and outputs.
Inconsistent Tool — tools in the same set use different conventions. One returns {"items": [...]}, another returns {"results": [...]}, a third returns a bare array. One accepts user_id, another accepts userId. Each inconsistency is a small tax on the agent's attention. Across dozens of tool calls, these taxes compound into failures.
Undocumented Limit Tool — has constraints the description does not mention. Silently truncates results at 100 items, times out on queries over 10 seconds, rejects input over 4KB. The agent has no way to know these limits exist and no way to work around them. Document every constraint in the description.
Design Process
When designing a new tool set, work through these steps in order:
- List the tasks the agent must accomplish. Not the tools — the tasks. "Find relevant source files," "apply a code change," "verify tests pass."
- For each task, ask: does this require external action? If the agent can accomplish it through reasoning alone, it is not a tool. If it requires interacting with the outside world (filesystem, API, database), it is a tool.
- Group related actions by resource or domain. All file operations together, all database operations together. This reveals natural tool boundaries.
- Define the input/output contract for each tool. What is the minimum input? What does the agent need from the output to proceed?
- Write the descriptions before the implementations. If you cannot explain when to use a tool in 2-4 sentences, the tool is not well-defined yet.
- Test with an agent. Give the tool definitions to an LLM, describe a task, and see which tools it selects and how it uses them. Iterate on the design based on where it goes wrong.
Related Skills
- For deciding which tools belong to which agent: see agent-decomposition
- For how tool calls flow between agents: see agent-communication
- For monitoring tool usage and handling failures at runtime: see agent-observability
- For managing state that tools read and write: see agent-state
Tool Schema Examples
Bad vs good tool definitions showing how schema design, descriptions, and error handling affect agent behavior.
1. Search Tool — Description and Output Structure
Bad
{ "name": "search", "description": "Search utility.", "parameters": { "input": { "type": "string" } } }
The agent has no idea whether this searches names, contents, or both. input gives no format hint. Empty return is ambiguous — no matches or error?
Good
{
"name": "search_files",
"description": "Search for files by name pattern in the project. Returns paths sorted by modification time. Do not use for content search — use grep_files instead.",
"parameters": {
"pattern": { "type": "string", "description": "Glob pattern. Examples: '**/*.ts', 'src/**/index.js'" },
"max_results": { "type": "integer", "default": 50 }
},
"output": {
"status": { "type": "string", "enum": ["ok", "error"] },
"files": ["string"],
"total_matches": "integer",
"truncated": "boolean"
}
}
Clear purpose, boundary with grep_files, structured output with truncated flag. No ambiguity.
2. Mutation Tool — Kitchen-Sink vs Separate Actions
Bad
{
"name": "manage_user",
"description": "Manage users in the system.",
"parameters": {
"action": { "type": "string", "enum": ["create", "update", "delete", "deactivate"] },
"user_id": { "type": "string" },
"data": { "type": "object" }
}
}
Safety is invisible — delete and create share a description. The opaque data object gives no schema guidance per action.
Good
{
"name": "create_user",
"description": "Create a new user. If email exists, returns existing user (idempotent).",
"parameters": {
"email": { "type": "string" },
"display_name": { "type": "string" },
"role": { "type": "string", "enum": ["viewer", "editor", "admin"], "default": "viewer" }
}
}
{
"name": "delete_user",
"description": "Permanently delete a user and all data. Cannot be undone. Use deactivate_user to disable without data loss.",
"parameters": {
"user_id": { "type": "string" },
"confirm": { "type": "boolean", "description": "Must be true. Safety check." }
}
}
Separate tools, separate safety profiles. create_user is idempotent, delete_user is destructive with a confirmation gate and a pointer to a safer alternative.
3. Data Retrieval — Unfiltered Dump vs Paginated and Filtered
Bad
{ "name": "get_logs", "description": "Get logs.", "parameters": { "source": { "type": "string" } } }
Returns the entire log as a single string — could be 5 lines or 500,000. No scope control, no filtering, no metadata.
Good
{
"name": "get_logs",
"description": "Retrieve log entries filtered by severity and time. Newest first. For full export use export_logs_to_file.",
"parameters": {
"service": { "type": "string" },
"severity": { "type": "string", "enum": ["debug","info","warn","error","fatal"], "default": "info" },
"since_minutes": { "type": "integer", "default": 60 },
"max_entries": { "type": "integer", "default": 100 },
"contains": { "type": "string", "description": "Substring filter" }
},
"output": {
"status": "string",
"entries": [{"timestamp": "...", "severity": "...", "message": "..."}],
"total_matching": "integer",
"returned": "integer"
}
}
Agent controls scope with five parameters. total_matching vs returned tells the agent if there are more results.
4. Error Handling — Raw Strings vs Structured Errors
Bad
The tool returns error information as plain strings:
"Error: ECONNREFUSED 10.0.0.5:5432 - connection refused"
Or stack traces:
"Traceback (most recent call last):\n File \"db.py\", line 42...\npsycopg2.OperationalError: could not connect to server"
The agent cannot distinguish connection errors from permission errors from syntax errors. Stack traces waste tokens.
Good
{
"status": "error",
"error": {
"code": "connection_failed",
"message": "Cannot connect to database at 10.0.0.5:5432",
"category": "transient",
"recoverable": true,
"retry_after_seconds": 5,
"suggestion": "Database may be starting up. Retry in a few seconds."
}
}
recoverable and category let the agent decide programmatically. retry_after_seconds gives a specific wait. suggestion provides a fallback strategy. No stack traces, no implementation internals.
name: bevy-ecosystem description: Use when the user asks about third-party Bevy crates, community plugins, which crate to use for a specific feature, Bevy ecosystem recommendations, or when looking for functionality not built into Bevy core. Also triggers for questions about Bevy version compatibility, migration between versions, or keeping up with breaking changes. version: 1.0.0
Bevy Ecosystem — Third-Party Crates & Migration
Bevy's core is intentionally lean. The community fills the gaps with high-quality crates for physics, input, networking, UI, and more. This skill helps you choose the right crate, verify version compatibility, and navigate Bevy's fast-moving release cycle.
For in-depth usage of specific crates, see the dedicated skills:
- bevy-physics — Avian and bevy_rapier setup, colliders, raycasting, joints
- bevy-input-and-interaction — leafwing-input-manager action mapping, input contexts
- bevy-ui-and-audio — bevy_kira_audio advanced audio, bevy_egui debug panels
Essential Crates Overview
The reference file references/ecosystem-crates.md has full dependency lines, setup code, and links. Below is the quick map so you know what exists.
Physics
| Crate | What it does |
|---|---|
avian3d / avian2d | ECS-native physics engine built for Bevy. Preferred for new projects. |
bevy_rapier3d / bevy_rapier2d | Rapier physics integration. Mature, widely used. |
Input
| Crate | What it does |
|---|---|
leafwing-input-manager | Declarative action-to-input mapping with combos, chords, and virtual axes. |
Assets & Loading
| Crate | What it does |
|---|---|
bevy_asset_loader | Declarative asset loading states — define what to load, get a callback when done. |
iyes_progress | Track loading progress across multiple asset collections. |
Animation
| Crate | What it does |
|---|---|
bevy_tweening | Tweens and animation sequences for transforms, colors, and custom components. |
UI & Editor
| Crate | What it does |
|---|---|
bevy_egui | Immediate-mode egui inside Bevy — great for dev tools and debug panels. |
bevy_cosmic_edit | Rich text editing widget powered by cosmic-text. |
Networking
| Crate | What it does |
|---|---|
lightyear | Client-prediction, server-authoritative networking with rollback. |
bevy_replicon | High-level replication framework — entity spawning, component sync, RPCs. |
bevy_renet | Lower-level reliable UDP transport for Bevy. |
Debug & Dev Tools
| Crate | What it does |
|---|---|
bevy-inspector-egui | Runtime ECS inspector — browse entities, edit components live. |
bevy_screen_diagnostics | On-screen FPS, entity count, and custom diagnostics overlay. |
Tilemap & Level Design
| Crate | What it does |
|---|---|
bevy_ecs_tilemap | High-performance ECS-backed tilemap renderer. |
Particles & VFX
| Crate | What it does |
|---|---|
bevy_hanabi | GPU-accelerated particle system with visual effect graphs. |
Camera
| Crate | What it does |
|---|---|
bevy_pancam | Plug-and-play 2D camera: pan, zoom, bounds. |
bevy_flycam | Simple 3D fly camera for prototyping and debugging. |
Persistence & Serialization
| Crate | What it does |
|---|---|
bevy_pkv | Simple key-value store for settings and save data (backed by sled or browser localStorage). |
Version Compatibility
Bevy releases break things. Every crate must target a specific Bevy version. Here is how to avoid mismatches:
Check before you cargo add
- Look at the crate's Cargo.toml or its README — most ecosystem crates have a compatibility table showing which crate version maps to which Bevy version.
- Search for the bevy-tracking GitHub label. Many crate repos use labels like bevy-0.15 or bevy-tracking to track PRs that update to the latest Bevy release.
- Check the crate's latest release date. If the crate was last published before the Bevy version you are using shipped, it almost certainly does not support it yet.
- Look at the Cargo.toml dependency specification. A crate specifying bevy = "0.15" works with Bevy 0.15.x but not 0.14 or 0.16.
What to do when a crate is behind
- Check the crate's main branch — an unreleased update may already exist. Use a git dependency temporarily: bevy_some_crate = { git = "https://github.com/author/bevy_some_crate", branch = "main" }
- Search for forks that have already updated.
- Pin your Bevy version to match the crate if the feature is critical.
Migration Strategy
Bevy does not have a stability guarantee yet. Major releases (0.14 to 0.15, etc.) routinely contain breaking changes. Here is how to handle upgrades:
Where to find migration guides
The official migration guides live at:
https://bevyengine.org/learn/migration-guides/
Each guide is organized by the Bevy release (e.g., "0.14 to 0.15") and lists every breaking change with before/after code.
Common breaking change patterns
- System parameter changes — query syntax or resource access patterns change.
- Plugin API reshuffles — add_plugins signature, plugin group composition.
- Rendering pipeline changes — material/shader APIs evolve rapidly.
- Schedule renaming — CoreSet, Update, startup system registration.
- Asset system changes — AssetServer API, handle types, loading patterns.
Upgrade strategy
- Pin your current Bevy version in Cargo.toml before starting the upgrade so you have a known-good baseline.
- Read the full migration guide for your target version before changing any code.
- Bump the Bevy version in Cargo.toml and let the compiler fail.
- Fix one system at a time. The compiler errors are your checklist — each error corresponds to a documented breaking change.
- Update third-party crates to their compatible versions (see Version Compatibility above).
- Run your game after each batch of fixes, not just at the end.
Tip: compiler-driven migration
Bevy's type system is strict enough that most breaking changes produce compiler errors rather than silent bugs. Trust the compiler. If it compiles and your systems still run, the migration is almost certainly correct.
Finding New Crates
When you need functionality not listed here:
- Bevy Assets page — the official curated list: https://bevyengine.org/assets/ Categorized, searchable, with version compatibility info.
- bevy-assets — the community-maintained GitHub repo behind the assets page (successor to awesome-bevy): https://github.com/bevyengine/bevy-assets
- crates.io — search with the bevy keyword or category. Most Bevy ecosystem crates use bevy as a keyword.
- This Week in Bevy — weekly newsletter covering new crates, updates, and community highlights: https://thisweekinbevy.com/
When evaluating a crate, check: last commit date, Bevy version support, number of open issues, and whether the maintainer is active in the Bevy Discord.
Ecosystem Crates Reference
Detailed reference for recommended third-party Bevy crates. All versions listed target Bevy 0.15.x. Always verify compatibility before adding a dependency.
Physics
avian3d / avian2d
ECS-native physics engine designed specifically for Bevy. Successor to bevy_xpbd. Preferred for new projects.
# Cargo.toml
avian3d = "0.2"
# or for 2D:
avian2d = "0.2"
#![allow(unused)] fn main() { app.add_plugins(avian3d::PhysicsPlugins::default()); }
- Repo: https://github.com/Jondolf/avian
- Docs: https://docs.rs/avian3d
bevy_rapier3d / bevy_rapier2d
Rapier physics engine integration. Mature and battle-tested.
bevy_rapier3d = "0.28"
# or for 2D:
bevy_rapier2d = "0.28"
#![allow(unused)] fn main() { app.add_plugins(RapierPhysicsPlugin::<NoUserData>::default()); }
- Repo: https://github.com/dimforge/bevy_rapier
- Docs: https://docs.rs/bevy_rapier3d
Input
leafwing-input-manager
Declarative input mapping: bind actions to keys, buttons, gamepads, mouse, or virtual axes. Supports combos, chords, and input contexts.
leafwing-input-manager = "0.16"
#![allow(unused)] fn main() { app.add_plugins(InputManagerPlugin::<MyAction>::default()); }
- Repo: https://github.com/Leafwing-Studios/leafwing-input-manager
- Docs: https://docs.rs/leafwing-input-manager
Assets & Loading
bevy_asset_loader
Declarative asset loading — define asset collections with derive macros, load them during a loading state, get notified when complete.
bevy_asset_loader = "0.22"
app.add_loading_state(
    LoadingState::new(GameState::Loading).continue_to_state(GameState::Playing),
);
- Repo: https://github.com/NiklasEi/bevy_asset_loader
- Docs: https://docs.rs/bevy_asset_loader
iyes_progress
Track loading progress across multiple systems. Pairs well with bevy_asset_loader.
iyes_progress = "0.13"
#![allow(unused)] fn main() { app.add_plugins(ProgressPlugin::<GameState>::new().with_state(GameState::Loading)); }
- Repo: https://github.com/IyesGames/iyes_progress
- Docs: https://docs.rs/iyes_progress
Animation
bevy_tweening
Component and resource tweens — animate transforms, colors, and custom lenses over time with easing functions and sequences.
bevy_tweening = "0.12"
#![allow(unused)] fn main() { app.add_plugins(TweeningPlugin); }
- Repo: https://github.com/djeedai/bevy_tweening
- Docs: https://docs.rs/bevy_tweening
UI & Editor
bevy_egui
Immediate-mode egui rendered inside Bevy. Ideal for debug panels, level editors, and dev tools. Not recommended for in-game UI.
bevy_egui = "0.34"
#![allow(unused)] fn main() { app.add_plugins(EguiPlugin); }
- Repo: https://github.com/mvlabat/bevy_egui
- Docs: https://docs.rs/bevy_egui
bevy_cosmic_edit
Rich text editing widget using cosmic-text. Supports multi-line editing, selection, clipboard, and custom fonts.
bevy_cosmic_edit = "0.27"
#![allow(unused)] fn main() { app.add_plugins(CosmicEditPlugin::default()); }
- Repo: https://github.com/StaffEngineer/bevy_cosmic_edit
- Docs: https://docs.rs/bevy_cosmic_edit
Networking
lightyear
Client-prediction and server-authoritative networking. Supports rollback, input delay, entity interpolation, and interest management.
lightyear = "0.19"
#![allow(unused)] fn main() { app.add_plugins(lightyear::prelude::server::ServerPlugins::default()); // or app.add_plugins(lightyear::prelude::client::ClientPlugins::default()); }
- Repo: https://github.com/cBournhonesque/lightyear
- Docs: https://docs.rs/lightyear
bevy_replicon
High-level replication: automatic entity spawning on clients, component synchronization, and server RPCs. Transport-agnostic.
bevy_replicon = "0.30"
#![allow(unused)] fn main() { app.add_plugins(RepliconPlugins); }
- Repo: https://github.com/projectharmonia/bevy_replicon
- Docs: https://docs.rs/bevy_replicon
bevy_renet
Reliable UDP transport for Bevy. Lower-level than lightyear or replicon — gives you raw channels and connection management.
bevy_renet = "0.0.14"
#![allow(unused)] fn main() { app.add_plugins(RenetServerPlugin); // or app.add_plugins(RenetClientPlugin); }
- Repo: https://github.com/lucaspoffo/renet
- Docs: https://docs.rs/bevy_renet
Debug & Dev Tools
bevy-inspector-egui
Runtime ECS inspector. Browse all entities, view and edit component values live, inspect resources. Essential during development.
bevy-inspector-egui = "0.28"
#![allow(unused)] fn main() { app.add_plugins(bevy_inspector_egui::quick::WorldInspectorPlugin::default()); }
- Repo: https://github.com/jakobhellermann/bevy-inspector-egui
- Docs: https://docs.rs/bevy-inspector-egui
bevy_screen_diagnostics
On-screen text overlay showing FPS, entity count, and custom diagnostics. Lightweight, no egui dependency.
bevy_screen_diagnostics = "0.7"
#![allow(unused)] fn main() { app.add_plugins(ScreenDiagnosticsPlugin::default()) .add_plugins(ScreenFrameDiagnosticsPlugin); }
- Repo: https://github.com/Lommix/bevy_screen_diagnostics
- Docs: https://docs.rs/bevy_screen_diagnostics
Tilemap & Level Design
bevy_ecs_tilemap
High-performance tilemap rendering backed by the ECS. Supports multiple layers, animated tiles, and large maps.
bevy_ecs_tilemap = "0.15"
#![allow(unused)] fn main() { app.add_plugins(TilemapPlugin); }
- Repo: https://github.com/StarArawn/bevy_ecs_tilemap
- Docs: https://docs.rs/bevy_ecs_tilemap
Particles & VFX
bevy_hanabi
GPU-accelerated particle system. Define effects with spawners, modifiers, and render properties. Handles millions of particles.
bevy_hanabi = "0.14"
#![allow(unused)] fn main() { app.add_plugins(HanabiPlugin); }
- Repo: https://github.com/djeedai/bevy_hanabi
- Docs: https://docs.rs/bevy_hanabi
Camera
bevy_pancam
Plug-and-play 2D camera with pan (drag), zoom (scroll), and optional bounds clamping.
bevy_pancam = "0.14"
#![allow(unused)] fn main() { app.add_plugins(PanCamPlugin); // Then add PanCam component to your camera entity }
- Repo: https://github.com/johanhelsing/bevy_pancam
- Docs: https://docs.rs/bevy_pancam
bevy_flycam
Simple 3D fly camera for prototyping. WASD + mouse look, adjustable speed.
bevy_flycam = "0.14"
app.add_plugins(NoCameraPlayerPlugin); // Add the FlyCam component to your own camera; use PlayerPlugin to have a camera spawned for you
- Repo: https://github.com/sburris0/bevy_flycam
- Docs: https://docs.rs/bevy_flycam
Persistence & Serialization
bevy_pkv
Simple key-value store for game settings and save data. Uses sled on native and localStorage on WASM.
bevy_pkv = "0.12"
app.insert_resource(PkvStore::new("MyCompany", "MyGame"));
// Writing
fn save_settings(mut pkv: ResMut<PkvStore>) {
    pkv.set("volume", &0.8f32).expect("failed to save");
}

// Reading
fn load_settings(pkv: Res<PkvStore>) {
    let volume: f32 = pkv.get("volume").unwrap_or(1.0);
}
- Repo: https://github.com/johanhelsing/bevy_pkv
- Docs: https://docs.rs/bevy_pkv
Audio (Third-Party)
bevy_kira_audio
Advanced audio playback powered by the Kira audio library. Supports spatial audio, audio tweening, multiple channels, and precise timing.
bevy_kira_audio = "0.21"
#![allow(unused)] fn main() { app.add_plugins(AudioPlugin); }
#![allow(unused)] fn main() { fn play_bgm(audio: Res<Audio>, assets: Res<AssetServer>) { audio.play(assets.load("bgm.ogg")).looped().with_volume(0.5); } }
- Repo: https://github.com/NiklasEi/bevy_kira_audio
- Docs: https://docs.rs/bevy_kira_audio
name: bevy-ecs description: Use when the user asks about Bevy's Entity Component System, defining components, writing systems, queries, commands, resources, events, observers, system ordering, system sets, run conditions, or the ECS paradigm in Bevy. Also triggers when the user is confused about the ECS mental model or asks how to structure game logic. version: 1.0.0
Bevy ECS — Entity Component System Fundamentals
Mental Model
Entities are IDs, components are data structs, systems are functions that query components. Composition over inheritance.
- Entity — A unique ID (like a database row). No data or behavior by itself.
- Component — A plain Rust struct attached to an entity. This is your data.
- System — A function that queries entities by component combination. This is your behavior.
OOP: class Player extends Character { hp: i32, speed: f32 }
ECS: entity.insert((Player, Health(100), Speed(3.0), Transform::default()))
Systems run automatically each frame — the scheduler invokes them based on the data they request.
Components
A component is any Rust type with #[derive(Component)]:
#[derive(Component)]
struct Health(i32);

#[derive(Component)]
struct Speed(f32);

#[derive(Component, Default)]
struct Player;

#[derive(Component)]
struct Enemy {
    aggro_range: f32,
    damage: i32,
}
Marker Components
Zero-sized types used purely for filtering queries:
#![allow(unused)] fn main() { #[derive(Component, Default)] struct Player; #[derive(Component)] struct Poisoned; #[derive(Component)] struct Grounded; }
Required Components (Bevy 0.15+)
Use #[require(...)] to auto-insert dependencies when a component is added (uses Default unless overridden at spawn):
#![allow(unused)] fn main() { #[derive(Component, Default)] #[require(Health, Speed)] struct Player; // commands.spawn(Player) automatically inserts Health::default() and Speed::default() // commands.spawn((Player, Speed(10.0))) overrides the Speed default }
Common Derive Macros
- Component — required for all components
- Debug, Clone, PartialEq — commonly combined with Component
- Default — needed for #[require] and init_resource
- Reflect + #[reflect(Component)] — enables runtime inspection (editor tooling, serialization)
Systems
Systems are plain Rust functions. Their parameters declare what data they need, and Bevy injects the data automatically:
#![allow(unused)] fn main() { fn move_entities(mut query: Query<(&mut Transform, &Velocity)>, time: Res<Time>) { for (mut transform, velocity) in &mut query { transform.translation += velocity.0 * time.delta_secs(); } } }
Register systems when building your app:
fn main() { App::new() .add_plugins(DefaultPlugins) .add_systems(Startup, setup) .add_systems(Update, (move_entities, check_health, handle_input)) .run(); }
System Parameter Types
Key types: Query, Res/ResMut, Commands, EventReader/EventWriter, Local, Single, ParamSet, Option<Res<T>>.
See the system-params-cheatsheet reference for the complete table with examples and notes.
Queries
Queries are how systems access entity data. The type signature determines what data is fetched and how it is filtered.
Basic Queries
// Read one component
fn system(query: Query<&Transform>) {
    for transform in &query {
        info!("Position: {}", transform.translation);
    }
}

// Read multiple components
fn system(query: Query<(&Transform, &Health, &Name)>) {
    for (transform, health, name) in &query {
        info!("{} at {} with {} hp", name, transform.translation, health.0);
    }
}

// Write to components
fn system(mut query: Query<(&mut Transform, &Velocity)>) {
    for (mut transform, velocity) in &mut query {
        transform.translation += velocity.0;
    }
}
Query Filters
Filters go in the second type parameter of Query:
#![allow(unused)] fn main() { // Only entities that have the Player component fn system(query: Query<&Transform, With<Player>>) { } // Entities with Health but NOT the Invincible component fn system(query: Query<&mut Health, Without<Invincible>>) { } // Entities whose Transform changed since last system run fn system(query: Query<&Transform, Changed<Transform>>) { } // Entities that just had the Poisoned component added fn system(query: Query<Entity, Added<Poisoned>>) { } // Combine multiple filters with tuples fn system(query: Query<&mut Health, (With<Enemy>, Without<Shield>)>) { } }
Optional Components
Use Option<&T> to query entities that may or may not have a component:
#![allow(unused)] fn main() { fn system(query: Query<(&Transform, Option<&Velocity>)>) { for (transform, maybe_velocity) in &query { if let Some(velocity) = maybe_velocity { // Entity has velocity } else { // Entity is stationary } } } }
Single-Entity Queries
When you expect exactly one matching entity, use Single<> (Bevy 0.15+):
#![allow(unused)] fn main() { fn camera_follow( player: Single<&Transform, With<Player>>, mut camera: Single<&mut Transform, With<Camera>>, ) { camera.translation = player.translation; } }
If zero or more than one entity matches, the system panics. Use this for unique entities like "the player", "the main camera", or "the UI root".
Querying by Entity ID
#![allow(unused)] fn main() { fn system(query: Query<&Health>, specific_entity: Res<TrackedEntity>) { if let Ok(health) = query.get(specific_entity.0) { info!("Health: {}", health.0); } } }
Commands
Commands perform deferred world mutations. They do not take effect immediately — they are applied at the end of the current stage (between system sets). This avoids borrow conflicts.
Spawning Entities
fn spawn_enemies(mut commands: Commands) {
    // Spawn with a bundle of components
    let entity = commands.spawn((
        Enemy { aggro_range: 10.0, damage: 5 },
        Health(50),
        Transform::default(),
        Visibility::default(),
    )).id();

    // Spawn and then add more components
    commands.spawn((Player, Health(100)))
        .insert(Speed(5.0))
        .insert(Name::new("Hero"));
}
Inserting and Removing Components
fn poison_system(
    mut commands: Commands,
    query: Query<Entity, (With<Enemy>, Without<Poisoned>)>,
) {
    for entity in &query {
        commands.entity(entity).insert(Poisoned);
    }
}

fn cure_system(
    mut commands: Commands,
    query: Query<Entity, With<Poisoned>>,
) {
    for entity in &query {
        commands.entity(entity).remove::<Poisoned>();
    }
}
Despawning Entities
#![allow(unused)] fn main() { fn cleanup_dead( mut commands: Commands, query: Query<Entity, With<Dead>>, ) { for entity in &query { // Despawn the entity and all its children commands.entity(entity).despawn(); } } }
Spawning Children (Hierarchies)
fn spawn_ui(mut commands: Commands) {
    commands.spawn(Node {
        width: Val::Percent(100.0),
        height: Val::Percent(100.0),
        ..default()
    }).with_children(|parent| {
        parent.spawn((
            Text::new("Hello, Bevy!"),
            TextFont { font_size: 40.0, ..default() },
        ));
    });
}
Resources
Resources are global singletons — data that exists once, not per-entity. Use them for game-wide state.
#![allow(unused)] fn main() { #[derive(Resource)] struct Score(u32); #[derive(Resource, Default)] struct GameSettings { difficulty: Difficulty, volume: f32, } }
Inserting Resources
fn main() { App::new() .add_plugins(DefaultPlugins) // Insert with an explicit value .insert_resource(Score(0)) // Insert using Default::default() .init_resource::<GameSettings>() .run(); }
- insert_resource(value) — provide a concrete instance.
- init_resource::<T>() — requires T: Default (or T: FromWorld). Creates the resource from its default.
Accessing Resources in Systems
#![allow(unused)] fn main() { fn display_score(score: Res<Score>) { info!("Current score: {}", score.0); } fn increment_score(mut score: ResMut<Score>) { score.0 += 10; } }
Optional Resources
If a resource might not exist:
#![allow(unused)] fn main() { fn system(score: Option<Res<Score>>) { if let Some(score) = score { info!("Score: {}", score.0); } } }
Events and Observers
Events
Events are the primary way to communicate between systems without tight coupling.
#![allow(unused)] fn main() { #[derive(Event)] struct DamageEvent { entity: Entity, amount: i32, } #[derive(Event)] struct GameOverEvent; }
Register events and use EventWriter / EventReader:
fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_event::<DamageEvent>()
        .add_event::<GameOverEvent>()
        .add_systems(Update, (deal_damage, apply_damage).chain())
        .run();
}

fn deal_damage(
    mut writer: EventWriter<DamageEvent>,
    query: Query<(Entity, &ContactInfo), With<Hazard>>,
) {
    for (entity, contact) in &query {
        writer.send(DamageEvent {
            entity: contact.other_entity,
            amount: 10,
        });
    }
}

fn apply_damage(
    mut reader: EventReader<DamageEvent>,
    mut query: Query<&mut Health>,
) {
    for event in reader.read() {
        if let Ok(mut health) = query.get_mut(event.entity) {
            health.0 -= event.amount;
        }
    }
}
Events last for two frames by default, then are dropped. Always read events every frame to avoid missing them.
Observers (Bevy 0.15+)
Observers are reactive — they run immediately when a specific event is triggered, without waiting for the schedule. They are ideal for structural changes.
#[derive(Event)]
struct OnDeath;

fn setup(mut commands: Commands) {
    commands.spawn((
        Enemy { aggro_range: 10.0, damage: 5 },
        Health(50),
    )).observe(on_death);
}

fn on_death(trigger: Trigger<OnDeath>, mut commands: Commands) {
    // `trigger.target()` is the entity that the event was triggered on
    let entity = trigger.target();
    commands.entity(entity).despawn();
    info!("Entity {:?} died", entity);
}
Trigger an observer:
#![allow(unused)] fn main() { fn check_health( mut commands: Commands, query: Query<(Entity, &Health), Changed<Health>>, ) { for (entity, health) in &query { if health.0 <= 0 { commands.trigger_targets(OnDeath, entity); } } } }
Global observers (not tied to a specific entity):
fn main() { App::new() .add_plugins(DefaultPlugins) .add_observer(on_any_death) .run(); } fn on_any_death(trigger: Trigger<OnDeath>, mut score: ResMut<Score>) { score.0 += 100; }
One-Shot Systems
Run a system once on demand using Commands:
fn trigger_explosion(mut commands: Commands) {
    // `Commands::run_system` takes a SystemId from `World::register_system`;
    // `run_system_cached` (Bevy 0.15+) accepts the function directly and
    // registers it on first use.
    commands.run_system_cached(explosion_effect);
}

fn explosion_effect(mut query: Query<&mut Health, With<Enemy>>) {
    for mut health in &mut query {
        health.0 -= 50;
    }
}
Scheduling
Register systems into schedules: Startup (once), Update (every frame), FixedUpdate (fixed timestep, default 64 Hz). Systems in the same schedule run in parallel by default.
Enforce ordering with .before(), .after(), or .chain():
#![allow(unused)] fn main() { App::new() .add_systems(Startup, setup) .add_systems(Update, ( (read_input, move_player, check_collisions).chain(), game_logic.run_if(in_state(AppState::InGame)), )) }
Use system sets to group systems with shared ordering and run conditions. Use states (States, SubStates) to control which systems run based on app phase, with OnEnter/OnExit schedules for setup and cleanup.
See the scheduling-guide reference for all built-in schedules, ordering primitives, run conditions, and state patterns.
Writing Custom Plugins
Plugins are the standard way to organize related systems, resources, and events into reusable modules:
pub struct CombatPlugin;

impl Plugin for CombatPlugin {
    fn build(&self, app: &mut App) {
        app
            .add_event::<DamageEvent>()
            .add_event::<DeathEvent>()
            .init_resource::<CombatStats>()
            .add_systems(Update, (
                deal_damage,
                apply_damage,
                check_death,
            ).chain().in_set(GameSet::Combat));
    }
}
Use plugins in your app:
fn main() { App::new() .add_plugins(DefaultPlugins) .add_plugins(( CombatPlugin, InventoryPlugin, AudioPlugin, )) .run(); }
Plugin Groups
Group multiple plugins together:
pub struct GamePlugins;

impl PluginGroup for GamePlugins {
    fn build(self) -> PluginGroupBuilder {
        PluginGroupBuilder::start::<Self>()
            .add(CombatPlugin)
            .add(InventoryPlugin)
            .add(MovementPlugin)
            .add(UIPlugin)
    }
}

// Use like DefaultPlugins:
fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_plugins(GamePlugins)
        .run();
}
Configurable Plugins
Accept configuration by storing it in the plugin struct:
pub struct PhysicsPlugin {
    pub gravity: f32,
    pub substeps: u32,
}

impl Default for PhysicsPlugin {
    fn default() -> Self {
        Self {
            gravity: -9.81,
            substeps: 4,
        }
    }
}

impl Plugin for PhysicsPlugin {
    fn build(&self, app: &mut App) {
        app.insert_resource(PhysicsConfig {
            gravity: self.gravity,
            substeps: self.substeps,
        });
        app.add_systems(FixedUpdate, (
            apply_gravity,
            resolve_collisions,
        ).chain());
    }
}
Bevy Scheduling Guide
Complete reference for Bevy's scheduling system — schedules, ordering, sets, run conditions, and states.
Built-in Schedules
Bevy runs these schedules in a fixed order each frame:
Main Schedules (Run Every Frame)
| Schedule | When It Runs | Typical Use |
|---|---|---|
First | Very start of each frame | Internal engine bookkeeping, time updates |
PreUpdate | Before Update | Engine-level preprocessing (input collection, UI focus) |
Update | Main frame update | Your game logic goes here |
PostUpdate | After Update | Engine-level postprocessing (transform propagation, rendering sync) |
Last | Very end of each frame | Cleanup, diagnostics |
Fixed-Timestep Schedules (Run at Fixed Intervals)
These run at a fixed rate (default 64 Hz / every ~15.6ms), independent of frame rate. Multiple ticks can run per frame if the frame was slow, or zero ticks if the frame was fast.
| Schedule | When It Runs | Typical Use |
|---|---|---|
FixedFirst | Start of each fixed tick | Fixed-timestep bookkeeping |
FixedPreUpdate | Before FixedUpdate | Physics preprocessing |
FixedUpdate | Main fixed tick | Physics, deterministic gameplay |
FixedPostUpdate | After FixedUpdate | Physics postprocessing, collision detection |
FixedLast | End of each fixed tick | Fixed-timestep cleanup |
One-Time Schedules
| Schedule | When It Runs | Typical Use |
|---|---|---|
Startup | Once, before the first Update | Spawning initial entities, loading resources |
State-Transition Schedules
| Schedule | When It Runs | Typical Use |
|---|---|---|
OnEnter(state) | Once, when entering a state | Setup for that state (spawn UI, load level) |
OnExit(state) | Once, when leaving a state | Cleanup (despawn UI, save progress) |
OnTransition { exited, entered } | Once, during a state transition | Logic that depends on both the old and new state |
Frame Order
Within a single frame, the execution order is:
Startup (first frame only)
|
v
First -> PreUpdate -> [FixedFirst -> FixedPreUpdate -> FixedUpdate -> FixedPostUpdate -> FixedLast]* -> Update -> PostUpdate -> Last
^--- may run 0, 1, or many times per frame
System Ordering
Default: Parallel and Unordered
Systems in the same schedule run in parallel with no guaranteed order, as long as their data access does not conflict. This is Bevy's core performance advantage.
If two systems access the same data mutably, Bevy detects the conflict and runs them sequentially (in arbitrary order).
Explicit Ordering
#![allow(unused)] fn main() { App::new() .add_systems(Update, ( // A runs before B system_a.before(system_b), // B runs after A (equivalent to above) system_b.after(system_a), // Chain: runs in order A -> B -> C (system_a, system_b, system_c).chain(), system_b, system_c, )) }
.before() and .after() accept system names or system sets. .chain() is syntactic sugar for chaining .before()/.after() across a tuple of systems.
Ambiguity Detection
In debug builds, Bevy warns about "system order ambiguity" when two systems access the same data and have no explicit ordering. Fix by adding .before(), .after(), .chain(), or putting them in ordered sets.
System Sets
System sets group systems for shared ordering and run conditions.
Defining Sets
#![allow(unused)] fn main() { #[derive(SystemSet, Debug, Clone, PartialEq, Eq, Hash)] enum GameSet { Input, Movement, Combat, UI, } }
Configuring Set Order
#![allow(unused)] fn main() { App::new() .configure_sets(Update, ( GameSet::Input, GameSet::Movement.after(GameSet::Input), GameSet::Combat.after(GameSet::Movement), GameSet::UI.after(GameSet::Combat), )) // Or equivalently with chain: .configure_sets(Update, ( GameSet::Input, GameSet::Movement, GameSet::Combat, GameSet::UI, ).chain()) }
Assigning Systems to Sets
#![allow(unused)] fn main() { App::new() .add_systems(Update, ( read_keyboard.in_set(GameSet::Input), read_gamepad.in_set(GameSet::Input), move_player.in_set(GameSet::Movement), move_enemies.in_set(GameSet::Movement), deal_damage.in_set(GameSet::Combat), apply_damage.in_set(GameSet::Combat), update_hud.in_set(GameSet::UI), )) }
Run Conditions on Sets
Apply a run condition to an entire set — all systems in the set are skipped if the condition is false:
#![allow(unused)] fn main() { App::new() .configure_sets(Update, GameSet::Combat.run_if(in_state(AppState::InGame)), ) }
Run Conditions
Run conditions are functions that return bool. If they return false, the system (or set) is skipped for that tick.
Built-in Run Conditions
use bevy::prelude::*;

// State-based
system.run_if(in_state(AppState::InGame))

// Resource-based
system.run_if(resource_exists::<Score>)
system.run_if(resource_equals(Paused(true)))
system.run_if(resource_changed::<Score>)
system.run_if(resource_added::<Score>)

// Event-based
system.run_if(on_event::<DamageEvent>)

// Time-based (from bevy::time)
system.run_if(on_timer(Duration::from_secs(2)))
system.run_if(on_real_timer(Duration::from_millis(500)))

// Logic combinators
system.run_if(in_state(AppState::InGame).and(resource_exists::<Player>))
system.run_if(in_state(AppState::Paused).or(in_state(AppState::MainMenu)))
system.run_if(not(in_state(AppState::Loading)))
Custom Run Conditions
A run condition is any system that returns bool:
fn has_living_enemies(query: Query<(), With<Enemy>>) -> bool {
    !query.is_empty()
}

fn player_is_alive(query: Query<&Health, With<Player>>) -> bool {
    query.iter().any(|h| h.0 > 0)
}

App::new()
    .add_systems(Update, (
        enemy_ai.run_if(has_living_enemies),
        game_over_check.run_if(not(player_is_alive)),
    ))
Combining Conditions
#![allow(unused)] fn main() { App::new() .add_systems(Update, combat_system .run_if(in_state(AppState::InGame)) .run_if(has_living_enemies) // Multiple .run_if() = AND logic (all must be true) ) }
States
States control large-scale game flow: menus, loading, gameplay, pausing.
Defining States
#![allow(unused)] fn main() { #[derive(States, Debug, Clone, PartialEq, Eq, Hash, Default)] enum AppState { #[default] MainMenu, Loading, InGame, Paused, GameOver, } }
Registering and Using States
App::new()
    .init_state::<AppState>() // Starts at Default value (MainMenu)
    // OR: .insert_state(AppState::Loading) // Start at a specific value

    // Systems that run once on state entry/exit
    .add_systems(OnEnter(AppState::InGame), setup_game_world)
    .add_systems(OnExit(AppState::InGame), despawn_game_world)

    // Systems that run every frame while in a state
    .add_systems(Update, (
        menu_ui.run_if(in_state(AppState::MainMenu)),
        gameplay.run_if(in_state(AppState::InGame)),
        pause_overlay.run_if(in_state(AppState::Paused)),
    ))
Transitioning Between States
fn handle_start_button(
    mut next_state: ResMut<NextState<AppState>>,
    interaction: Query<&Interaction, With<StartButton>>,
) {
    for interaction in &interaction {
        if *interaction == Interaction::Pressed {
            next_state.set(AppState::InGame);
        }
    }
}

fn handle_pause(
    mut next_state: ResMut<NextState<AppState>>,
    input: Res<ButtonInput<KeyCode>>,
    state: Res<State<AppState>>,
) {
    if input.just_pressed(KeyCode::Escape) {
        match state.get() {
            AppState::InGame => next_state.set(AppState::Paused),
            AppState::Paused => next_state.set(AppState::InGame),
            _ => {}
        }
    }
}
State transitions are applied during StateTransition (which runs between PreUpdate and Update). OnExit runs first, then OnTransition, then OnEnter.
Sub-States (Bevy 0.15+)
Sub-states only exist when their parent state has a specific value. When the parent leaves that value, the sub-state is removed entirely.
#[derive(SubStates, Debug, Clone, PartialEq, Eq, Hash, Default)]
#[source(AppState = AppState::InGame)]
enum GamePhase {
    #[default]
    Exploration,
    Combat,
    Cutscene,
}

App::new()
    .init_state::<AppState>()
    .add_sub_state::<GamePhase>()
    .add_systems(OnEnter(GamePhase::Combat), setup_combat_ui)
    .add_systems(OnExit(GamePhase::Combat), cleanup_combat_ui)
    .add_systems(Update, combat_tick.run_if(in_state(GamePhase::Combat)))
When AppState leaves InGame, GamePhase is automatically removed. When AppState re-enters InGame, GamePhase is re-initialized to its Default value.
Computed States (Bevy 0.15+)
Computed states derive their value from one or more other states. You cannot set them manually — they update automatically.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
enum InCombat {
    Yes,
    No,
}

impl ComputedStates for InCombat {
    type SourceStates = (AppState, Option<GamePhase>);

    fn compute(sources: (AppState, Option<GamePhase>)) -> Option<Self> {
        match sources {
            (AppState::InGame, Some(GamePhase::Combat)) => Some(InCombat::Yes),
            (AppState::InGame, _) => Some(InCombat::No),
            _ => None, // State does not exist outside InGame
        }
    }
}

App::new()
    .init_state::<AppState>()
    .add_sub_state::<GamePhase>()
    .add_computed_state::<InCombat>()
    .add_systems(Update, show_combat_hud.run_if(in_state(InCombat::Yes)))
Bevy System Parameters Cheatsheet
Complete reference of all system parameter types available in Bevy 0.15+.
System Parameters
| Type | Purpose | Example | Notes |
|---|---|---|---|
Query<&T> | Read component data from entities | Query<&Transform> | Iterates all entities with Transform |
Query<&mut T> | Write component data | Query<&mut Health> | Requires mut query binding |
Query<(&A, &B)> | Read multiple components | Query<(&Transform, &Velocity)> | Only matches entities with both |
Query<(&mut A, &B)> | Mix read and write | Query<(&mut Transform, &Velocity)> | Some mutable, some read-only |
Query<Entity> | Get entity IDs only | Query<Entity, With<Player>> | Lightweight, no component data fetched |
Query<&T, With<U>> | Read with filter | Query<&Health, With<Player>> | Fetch Health only from Player entities |
Query<&T, Without<U>> | Exclude filter | Query<&Health, Without<Invincible>> | Skip entities that have Invincible |
Query<&T, Changed<T>> | Changed filter | Query<&Health, Changed<Health>> | Only entities whose Health changed this tick |
Query<&T, Added<T>> | Added filter | Query<&Health, Added<Health>> | Only entities that just received Health |
Query<&T, (With<A>, Without<B>)> | Combined filters | Query<&Health, (With<Enemy>, Without<Shield>)> | Tuple of filters = AND logic |
Query<(&A, Option<&B>)> | Optional component | Query<(&Transform, Option<&Velocity>)> | Matches all with Transform; Velocity may be None |
Single<&T> | Exactly one entity (0.15+) | Single<&Transform, With<Player>> | Panics if zero or multiple matches. Use for unique entities |
Res<T> | Read-only resource | Res<Time> | Panics if resource does not exist |
ResMut<T> | Mutable resource | ResMut<Score> | Requires mut score binding |
Option<Res<T>> | Optional resource (read) | Option<Res<Score>> | Returns None if resource not inserted |
Option<ResMut<T>> | Optional resource (write) | Option<ResMut<Score>> | Returns None if resource not inserted |
Commands | Deferred world mutations | Commands | Spawn, despawn, insert/remove components. Applied between system sets |
EventReader<T> | Read events | EventReader<DamageEvent> | Tracks read position automatically. Events persist for 2 frames |
EventWriter<T> | Send events | EventWriter<DamageEvent> | Use .send(event) to emit |
Local<T> | Per-system local state | Local<u32> | Persists across system runs. Each system instance gets its own copy. T: Default required |
ParamSet<(Q1, Q2)> | Conflicting queries | ParamSet<(Query<&mut A, With<B>>, Query<&mut A, Without<B>>)> | Use when two queries would conflict. Access via .p0(), .p1() |
NonSend<T> | Non-Send resource (read) | NonSend<WinitWindows> | For resources that must stay on the main thread |
NonSendMut<T> | Non-Send resource (write) | NonSendMut<WinitWindows> | Forces system to run on main thread |
Deferred<T> | Custom deferred mutations | Deferred<MyBuffer> | Batches writes that apply later. T: SystemBuffer required |
Query Filter Types
| Filter | Matches | Example |
|---|---|---|
With<T> | Entities that have component T | Query<&Health, With<Player>> |
Without<T> | Entities that do NOT have component T | Query<&Health, Without<Invincible>> |
Changed<T> | Entities whose T was mutated this tick | Query<&Transform, Changed<Transform>> |
Added<T> | Entities that received T this tick | Query<Entity, Added<Enemy>> |
Or<(F1, F2)> | Entities matching any filter | Query<&Name, Or<(With<Player>, With<Ally>)>> |
Common Query Patterns
// Iterate all matches
for (transform, velocity) in &query { }

// Iterate with mutation
for (mut transform, velocity) in &mut query { }

// Get specific entity
if let Ok(health) = query.get(entity) { }
if let Ok(mut health) = query.get_mut(entity) { }

// Check if entity matches
let exists = query.contains(entity);

// Single result (panics if not exactly one)
let player_transform = single_query.into_inner();

// Count matches
let enemy_count = query.iter().count();

// Check if any matches exist
let has_enemies = !query.is_empty();
ParamSet Usage
When two queries in the same system would conflict (both accessing the same component mutably, or one reading and one writing), use ParamSet:
fn system(mut params: ParamSet<(
    Query<&mut Transform, With<Player>>,
    Query<&mut Transform, With<Enemy>>,
)>) {
    // Access one at a time — cannot hold both simultaneously
    for mut transform in params.p0().iter_mut() {
        transform.translation.x += 1.0;
    }
    for mut transform in params.p1().iter_mut() {
        transform.translation.x -= 1.0;
    }
}
Trigger (Observer Systems)
Observer systems use Trigger<T> instead of regular system parameters:
#![allow(unused)] fn main() { fn on_damage( trigger: Trigger<DamageEvent>, mut query: Query<&mut Health>, ) { let event = trigger.event(); let target = trigger.target(); if let Ok(mut health) = query.get_mut(target) { health.0 -= event.amount; } } }
name: bevy-input-and-interaction description: Use when the user asks about handling keyboard input, mouse input, gamepad/controller input, touch input, picking/raycasting, UI interaction, or input mapping in Bevy. Also triggers for questions about cursor position, mouse clicks on entities, or input abstraction. version: 1.0.0
Bevy Input & Interaction — Keyboard, Mouse, Gamepad & Picking
For system registration, queries, resources, and event fundamentals, see the bevy-ecs skill first. This skill builds on those concepts to cover all input handling and entity interaction in Bevy 0.15+.
Keyboard Input
Bevy exposes keyboard state through the ButtonInput<KeyCode> resource. Query it in any system:
#![allow(unused)] fn main() { fn keyboard_system(keys: Res<ButtonInput<KeyCode>>) { // Held down this frame if keys.pressed(KeyCode::KeyW) { // move forward } // Just pressed this frame (single-fire) if keys.just_pressed(KeyCode::Space) { // jump } // Just released this frame if keys.just_released(KeyCode::ShiftLeft) { // stop sprinting } } }
Common KeyCode Values
| Category | Keys |
|---|---|
| Letters | KeyCode::KeyA .. KeyCode::KeyZ |
| Digits | KeyCode::Digit0 .. KeyCode::Digit9 |
| Arrows | KeyCode::ArrowUp, ArrowDown, ArrowLeft, ArrowRight |
| Modifiers | KeyCode::ShiftLeft, ShiftRight, ControlLeft, AltLeft, SuperLeft |
| Common | KeyCode::Space, Enter, Escape, Tab, Backspace |
| Function | KeyCode::F1 .. KeyCode::F12 |
Text Input
For actual character input (respecting keyboard layout, IME, etc.), use KeyboardInput events rather than ButtonInput:
#![allow(unused)] fn main() { fn text_input_system(mut events: EventReader<KeyboardInput>) { for event in events.read() { if event.state.is_pressed() { if let Key::Character(ref char) = event.logical_key { info!("Character typed: {char}"); } } } } }
Mouse Input
Buttons
Mouse buttons work identically to keyboard keys:
#![allow(unused)] fn main() { fn mouse_button_system(buttons: Res<ButtonInput<MouseButton>>) { if buttons.just_pressed(MouseButton::Left) { // primary click } if buttons.pressed(MouseButton::Right) { // holding secondary } if buttons.just_pressed(MouseButton::Middle) { // middle click } } }
Cursor Position
Read the cursor position from the Window component:
fn cursor_position_system(windows: Query<&Window>) {
    let window = windows.single();
    if let Some(position) = window.cursor_position() {
        // position is in window/logical pixels, origin at top-left
        info!("Cursor at: {position}");
    }
}
Screen-to-World Conversion
To get the world-space position of the cursor (essential for clicking on game objects):
fn cursor_world_position(
    windows: Query<&Window>,
    camera_q: Query<(&Camera, &GlobalTransform)>,
) {
    let window = windows.single();
    let (camera, camera_transform) = camera_q.single();
    if let Some(cursor_pos) = window.cursor_position() {
        if let Ok(world_pos) = camera.viewport_to_world_2d(camera_transform, cursor_pos) {
            info!("World cursor: {world_pos}");
        }
    }
}
For 3D, use viewport_to_world which returns a Ray3d:
fn cursor_ray_3d(
    windows: Query<&Window>,
    camera_q: Query<(&Camera, &GlobalTransform)>,
) {
    let window = windows.single();
    let (camera, camera_transform) = camera_q.single();
    if let Some(cursor_pos) = window.cursor_position() {
        if let Ok(ray) = camera.viewport_to_world(camera_transform, cursor_pos) {
            // ray.origin and ray.direction for raycasting
            info!("Ray origin: {}, direction: {}", ray.origin, *ray.direction);
        }
    }
}
Mouse Motion and Scroll
Use events for relative mouse movement and scroll wheel:
use bevy::input::mouse::{MouseMotion, MouseWheel};

fn mouse_motion_system(mut motion: EventReader<MouseMotion>) {
    for event in motion.read() {
        // event.delta is Vec2 — relative movement in pixels
        info!("Mouse moved: {:?}", event.delta);
    }
}

fn mouse_scroll_system(mut scroll: EventReader<MouseWheel>) {
    for event in scroll.read() {
        // event.x, event.y — scroll amounts; event.unit — Lines or Pixels
        info!("Scroll: x={} y={}", event.x, event.y);
    }
}
Gamepad Input
In Bevy 0.15+, each connected gamepad is an entity with a Gamepad component, which exposes both buttons and axes directly.
Detecting Connected Gamepads
fn gamepad_connection_system(
    gamepads: Query<(Entity, &Gamepad), Added<Gamepad>>,
) {
    for (entity, _gamepad) in &gamepads {
        info!("Gamepad connected: {entity}");
    }
}
Reading Gamepad Input
fn gamepad_input_system(gamepads: Query<&Gamepad>) {
    for gamepad in &gamepads {
        // Buttons
        if gamepad.just_pressed(GamepadButton::South) {
            info!("A / Cross pressed");
        }
        if gamepad.pressed(GamepadButton::RightTrigger2) {
            info!("Right trigger held");
        }

        // Axes — returns f32 in [-1.0, 1.0]
        let left_stick_x = gamepad.get(GamepadAxis::LeftStickX).unwrap_or(0.0);
        let left_stick_y = gamepad.get(GamepadAxis::LeftStickY).unwrap_or(0.0);

        // Apply a dead zone manually
        let dead_zone = 0.15;
        if left_stick_x.abs() > dead_zone || left_stick_y.abs() > dead_zone {
            info!("Left stick: ({left_stick_x}, {left_stick_y})");
        }
    }
}
Common Gamepad Buttons
| GamepadButton | Xbox | PlayStation |
|---|---|---|
| South | A | Cross |
| East | B | Circle |
| West | X | Square |
| North | Y | Triangle |
| LeftTrigger | LB | L1 |
| RightTrigger | RB | R1 |
| LeftTrigger2 | LT | L2 |
| RightTrigger2 | RT | R2 |
| LeftThumb | L3 | L3 |
| RightThumb | R3 | R3 |
| DPadUp/Down/Left/Right | D-Pad | D-Pad |
| Start | Menu | Options |
| Select | View | Share |
Common Gamepad Axes
| GamepadAxis | Description |
|---|---|
| LeftStickX | Left stick horizontal |
| LeftStickY | Left stick vertical |
| RightStickX | Right stick horizontal |
| RightStickY | Right stick vertical |
Touch Input
The Touches resource tracks all active touch points:
fn touch_system(touches: Res<Touches>) {
    // New touches this frame
    for touch in touches.iter_just_pressed() {
        info!(
            "Touch started: id={}, position={}",
            touch.id(),
            touch.position() // Vec2 in window coordinates
        );
    }
    // Currently held touches (position and start_position available)
    for touch in touches.iter() {
        let delta = touch.position() - touch.start_position();
        info!("Finger {} moved by {delta}", touch.id());
    }
    // Touches released this frame
    for touch in touches.iter_just_released() {
        info!("Touch ended: id={}", touch.id());
    }
    // Touches canceled, e.g. interrupted by the OS (note Bevy's single-l spelling)
    for touch in touches.iter_just_canceled() {
        info!("Touch canceled: id={}", touch.id());
    }
}
Correlate multi-touch input across frames with touch.id(): each finger keeps a stable ID for its entire press-move-release lifecycle.
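As a quick illustration, here is a minimal pinch-gesture sketch built only on the Touches API shown above; the system name and the logging are illustrative, not part of Bevy's API:

use bevy::prelude::*;

// Detect a two-finger pinch and report the zoom ratio.
// Register with: app.add_systems(Update, pinch_zoom)
fn pinch_zoom(touches: Res<Touches>) {
    let mut active = touches.iter();
    if let (Some(a), Some(b)) = (active.next(), active.next()) {
        let start = a.start_position().distance(b.start_position());
        let current = a.position().distance(b.position());
        if start > 0.0 {
            // ratio > 1.0 means the fingers moved apart (zoom in)
            let ratio = current / start;
            info!("Pinch ratio: {ratio}");
        }
    }
}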
Picking (0.15+)
Bevy 0.15 ships a built-in picking system for detecting pointer interactions with entities. No third-party crate needed.
Making Entities Pickable
Entities with meshes become pickable once the mesh-picking backend is active: the bevy_mesh_picking_backend feature must be enabled and, in 0.15, the MeshPickingPlugin added to the app. To explicitly control picking, add or remove the Pickable component:
fn setup(mut commands: Commands, asset_server: Res<AssetServer>) {
    // Pickable by default (has a mesh)
    commands.spawn((
        Mesh3d(asset_server.load("models/button.glb#Mesh0/Primitive0")),
        MeshMaterial3d(/* ... */),
    ));

    // Explicitly disable picking on an entity
    commands.spawn((
        Mesh3d(asset_server.load("models/background.glb#Mesh0/Primitive0")),
        MeshMaterial3d(/* ... */),
        Pickable::IGNORE,
    ));
}
Pointer Events with Observers
The picking system fires events that you handle with observers. This is the recommended pattern — events are targeted to specific entities:
fn setup(mut commands: Commands) {
    // Spawn a clickable entity with observers attached
    commands
        .spawn((
            Mesh3d(/* ... */),
            MeshMaterial3d(/* ... */),
        ))
        .observe(on_click)
        .observe(on_pointer_over)
        .observe(on_pointer_out);
}

fn on_click(trigger: Trigger<Pointer<Click>>) {
    let entity = trigger.target();
    let event = trigger.event();
    info!("Clicked entity {entity:?} at {}", event.pointer_location.position);
}

fn on_pointer_over(
    trigger: Trigger<Pointer<Over>>,
    mut materials: Query<&mut MeshMaterial3d<StandardMaterial>>,
) {
    // Highlight on hover
    if let Ok(mut material) = materials.get_mut(trigger.target()) {
        // swap to highlight material
    }
}

fn on_pointer_out(
    trigger: Trigger<Pointer<Out>>,
    mut materials: Query<&mut MeshMaterial3d<StandardMaterial>>,
) {
    // Remove highlight
    if let Ok(mut material) = materials.get_mut(trigger.target()) {
        // restore original material
    }
}
Available Pointer Events
| Event | Fires when |
|---|---|
| Pointer<Over> | Pointer enters the entity's bounds |
| Pointer<Out> | Pointer leaves the entity's bounds |
| Pointer<Down> | Pointer button pressed while over entity |
| Pointer<Up> | Pointer button released while over entity |
| Pointer<Click> | Full press-and-release cycle on the entity |
| Pointer<Move> | Pointer moves while over entity |
| Pointer<DragStart> | Drag begins on entity |
| Pointer<Drag> | Entity is being dragged |
| Pointer<DragEnd> | Drag ends |
| Pointer<DragEnter> | Dragged entity enters another entity's bounds |
| Pointer<DragOver> | Dragged entity hovers over another entity |
| Pointer<DragDrop> | Dragged entity dropped onto another entity |
| Pointer<DragLeave> | Dragged entity leaves another entity's bounds |
Picking works with both 2D sprites and 3D meshes, and also with Bevy UI nodes.
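Besides per-entity .observe(...) calls, an observer can also be registered globally so one handler receives an event for every entity. A minimal sketch (the handler name is illustrative):

use bevy::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        // Global observer: fires for Pointer<Click> on any entity
        .add_observer(log_any_click)
        .run();
}

fn log_any_click(trigger: Trigger<Pointer<Click>>) {
    info!("Clicked entity {:?}", trigger.target());
}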
leafwing-input-manager (Third-Party Input Abstraction)
For games that need unified input mapping across keyboard, gamepad, and mouse, leafwing-input-manager is the recommended community crate. It lets you define logical actions and bind them to physical inputs.
See the bevy-ecosystem skill for setup details and version compatibility.
#![allow(unused)] fn main() { use leafwing_input_manager::prelude::*; // 1. Define actions #[derive(Actionlike, PartialEq, Eq, Hash, Clone, Copy, Debug, Reflect)] pub enum PlayerAction { Move, // Axis-pair action Jump, Attack, } // 2. Build an input map and spawn it on the player fn spawn_player(mut commands: Commands) { let input_map = InputMap::default() .with_dual_axis(PlayerAction::Move, KeyboardVirtualDPad::WASD) .with_dual_axis(PlayerAction::Move, GamepadStick::LEFT) .with(PlayerAction::Jump, KeyCode::Space) .with(PlayerAction::Jump, GamepadButton::South) .with(PlayerAction::Attack, MouseButton::Left) .with(PlayerAction::Attack, GamepadButton::West); commands.spawn(( // ... player components InputManagerBundle::with_map(input_map), )); } // 3. Query action state in gameplay systems fn player_movement(query: Query<&ActionState<PlayerAction>, With<Player>>) { let action_state = query.single(); if action_state.pressed(&PlayerAction::Move) { let axis_pair = action_state.clamped_axis_pair(&PlayerAction::Move); let movement = Vec2::new(axis_pair.x, axis_pair.y); // apply movement * speed * time.delta_secs() } if action_state.just_pressed(&PlayerAction::Jump) { // jump } } // 4. Register the plugin // app.add_plugins(InputManagerPlugin::<PlayerAction>::default()) }
Common Patterns
Player Movement (WASD + Arrow Keys)
#![allow(unused)] fn main() { fn player_movement( keys: Res<ButtonInput<KeyCode>>, mut query: Query<&mut Transform, With<Player>>, time: Res<Time>, ) { let mut direction = Vec2::ZERO; if keys.pressed(KeyCode::KeyW) || keys.pressed(KeyCode::ArrowUp) { direction.y += 1.0; } if keys.pressed(KeyCode::KeyS) || keys.pressed(KeyCode::ArrowDown) { direction.y -= 1.0; } if keys.pressed(KeyCode::KeyA) || keys.pressed(KeyCode::ArrowLeft) { direction.x -= 1.0; } if keys.pressed(KeyCode::KeyD) || keys.pressed(KeyCode::ArrowRight) { direction.x += 1.0; } // Normalize to prevent diagonal speed boost let direction = direction.normalize_or_zero(); let speed = 200.0; for mut transform in &mut query { transform.translation.x += direction.x * speed * time.delta_secs(); transform.translation.y += direction.y * speed * time.delta_secs(); } } }
FPS Camera Control (Mouse Look)
#![allow(unused)] fn main() { #[derive(Component)] struct FpsCamera { sensitivity: f32, pitch: f32, yaw: f32, } fn fps_camera_look( mut motion: EventReader<MouseMotion>, mut camera: Query<(&mut Transform, &mut FpsCamera)>, ) { let (mut transform, mut fps) = camera.single_mut(); for event in motion.read() { fps.yaw -= event.delta.x * fps.sensitivity; fps.pitch -= event.delta.y * fps.sensitivity; fps.pitch = fps.pitch.clamp(-89.0_f32.to_radians(), 89.0_f32.to_radians()); } transform.rotation = Quat::from_rotation_y(fps.yaw) * Quat::from_rotation_x(fps.pitch); } }
Lock the cursor for FPS controls:
use bevy::window::CursorGrabMode;

fn grab_cursor(mut windows: Query<&mut Window>) {
    let mut window = windows.single_mut();
    window.cursor_options.grab_mode = CursorGrabMode::Locked;
    window.cursor_options.visible = false;
}
Orbit Camera
#![allow(unused)] fn main() { #[derive(Component)] struct OrbitCamera { focus: Vec3, radius: f32, pitch: f32, yaw: f32, } fn orbit_camera_system( mut scroll: EventReader<MouseWheel>, mut motion: EventReader<MouseMotion>, buttons: Res<ButtonInput<MouseButton>>, mut camera: Query<(&mut Transform, &mut OrbitCamera)>, ) { let (mut transform, mut orbit) = camera.single_mut(); // Zoom with scroll wheel for event in scroll.read() { orbit.radius -= event.y * 0.5; orbit.radius = orbit.radius.clamp(2.0, 50.0); } // Rotate with middle mouse button if buttons.pressed(MouseButton::Middle) { for event in motion.read() { orbit.yaw -= event.delta.x * 0.005; orbit.pitch -= event.delta.y * 0.005; orbit.pitch = orbit.pitch.clamp(-1.5, 1.5); } } // Update camera transform let rotation = Quat::from_rotation_y(orbit.yaw) * Quat::from_rotation_x(orbit.pitch); transform.translation = orbit.focus + rotation * Vec3::new(0.0, 0.0, orbit.radius); transform.look_at(orbit.focus, Vec3::Y); } }
Drag and Drop with Picking
#![allow(unused)] fn main() { #[derive(Component)] struct Draggable; #[derive(Component)] struct Dragging { offset: Vec2, } fn setup_draggable(mut commands: Commands) { commands.spawn(( Sprite { custom_size: Some(Vec2::new(64.0, 64.0)), ..default() }, Draggable, )) .observe(on_drag_start) .observe(on_drag) .observe(on_drag_end); } fn on_drag_start( trigger: Trigger<Pointer<DragStart>>, mut commands: Commands, transforms: Query<&Transform>, ) { let entity = trigger.target(); let pointer_pos = trigger.event().pointer_location.position; if let Ok(transform) = transforms.get(entity) { let offset = Vec2::new(transform.translation.x, transform.translation.y) - pointer_pos; commands.entity(entity).insert(Dragging { offset }); } } fn on_drag( trigger: Trigger<Pointer<Drag>>, mut transforms: Query<(&mut Transform, &Dragging)>, ) { let entity = trigger.target(); let pointer_pos = trigger.event().pointer_location.position; if let Ok((mut transform, dragging)) = transforms.get_mut(entity) { let new_pos = pointer_pos + dragging.offset; transform.translation.x = new_pos.x; transform.translation.y = new_pos.y; } } fn on_drag_end(trigger: Trigger<Pointer<DragEnd>>, mut commands: Commands) { commands.entity(trigger.target()).remove::<Dragging>(); } }
name: bevy-physics description: Use when the user asks about physics simulation in Bevy, collision detection, rigid bodies, colliders, raycasting for physics, joints, character controllers, or integrating avian or bevy_rapier physics. version: 1.0.0
Bevy Physics — Rigid Bodies, Colliders & Simulation
Related skills: see bevy-ecs for the Entity Component System fundamentals that underpin all physics components and systems, and bevy-ecosystem for other third-party crate recommendations beyond physics.
Crate Choice
There are two physics ecosystems for Bevy. Pick one per project — do not mix them.
| | avian (avian2d / avian3d) | bevy_rapier (bevy_rapier2d / bevy_rapier3d) |
|---|---|---|
| Engine | Pure Rust, built for Bevy from scratch | Rust wrapper around the rapier engine |
| Recommendation | Recommended for new projects | Mature and battle-tested, with a larger body of community resources |
| API style | Bevy-native components and resources | Thin Bevy wrapper over rapier types |
| Determinism | Designed for cross-platform determinism | Deterministic within same platform |
See references/physics-comparison.md for a full side-by-side API comparison.
avian Setup
Cargo.toml
# For 3D physics:
[dependencies]
avian3d = "0.2"
# For 2D physics:
[dependencies]
avian2d = "0.2"
Plugin Registration
use avian3d::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_plugins(PhysicsPlugins::default())
        // Optional: visual debug overlay
        .add_plugins(PhysicsDebugPlugin::default())
        .run();
}
Core Components
#![allow(unused)] fn main() { use avian3d::prelude::*; fn spawn_dynamic_body(mut commands: Commands) { commands.spawn(( // Physics role — Dynamic bodies are affected by forces and gravity RigidBody::Dynamic, // Shape used for collision detection Collider::sphere(0.5), // Movement LinearVelocity(Vec3::new(2.0, 0.0, 0.0)), AngularVelocity(Vec3::new(0.0, 1.0, 0.0)), // Physical properties Mass(1.0), Restitution::new(0.7), // Bounciness: 0.0 = no bounce, 1.0 = perfect bounce Friction::new(0.5), GravityScale(1.0), // 0.0 = no gravity, 2.0 = double gravity // Spatial placement Transform::from_xyz(0.0, 5.0, 0.0), )); } fn spawn_static_floor(mut commands: Commands) { commands.spawn(( RigidBody::Static, // Never moves, infinite mass Collider::cuboid(50.0, 0.5, 50.0), Transform::default(), )); } fn spawn_kinematic_platform(mut commands: Commands) { commands.spawn(( RigidBody::Kinematic, // Moved by code, not by physics Collider::cuboid(5.0, 0.5, 5.0), Transform::from_xyz(0.0, 2.0, 0.0), )); } }
RigidBody types:
- Dynamic — fully simulated: gravity, forces, and collisions move it
- Static — immovable (floors, walls); never set velocity on these
- Kinematic — moved by code via Transform or LinearVelocity, but not affected by forces or gravity; use for moving platforms and elevators (see the sketch below)
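A minimal sketch of the kinematic case, assuming avian3d; the Platform marker is illustrative, not an avian type. The system writes a velocity each frame and the physics step integrates it into motion, carrying dynamic bodies standing on the platform:

use avian3d::prelude::*;
use bevy::prelude::*;

#[derive(Component)]
struct Platform; // illustrative marker

fn move_platforms(time: Res<Time>, mut query: Query<&mut LinearVelocity, With<Platform>>) {
    for mut velocity in &mut query {
        // Oscillate vertically; the physics step turns this velocity into motion
        velocity.y = time.elapsed_secs().cos() * 2.0;
    }
}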
bevy_rapier Setup
Cargo.toml
# For 3D physics:
[dependencies]
bevy_rapier3d = "0.28"
# For 2D physics:
[dependencies]
bevy_rapier2d = "0.28"
Plugin Registration
use bevy_rapier3d::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_plugins(RapierPhysicsPlugin::<NoUserData>::default())
        // Optional: wireframe debug rendering for colliders
        .add_plugins(RapierDebugRenderPlugin::default())
        .run();
}
Core Components
#![allow(unused)] fn main() { use bevy_rapier3d::prelude::*; fn spawn_dynamic_body(mut commands: Commands) { commands.spawn(( RigidBody::Dynamic, Collider::ball(0.5), Velocity { linvel: Vec3::new(2.0, 0.0, 0.0), angvel: Vec3::new(0.0, 1.0, 0.0), }, ExternalForce { force: Vec3::ZERO, torque: Vec3::ZERO, }, Damping { linear_damping: 0.5, angular_damping: 0.1, }, Restitution::coefficient(0.7), Friction::coefficient(0.5), ColliderMassProperties::Mass(1.0), Transform::from_xyz(0.0, 5.0, 0.0), )); } }
Collider Shapes
Both crates support the same fundamental shapes with slightly different syntax.
Important: the two crates disagree on box dimensions. rapier's Collider::cuboid takes half-extents — Collider::cuboid(1.0, 1.0, 1.0) creates a 2x2x2 box — while avian's Collider::cuboid takes full extents, so the same call creates a 1x1x1 box.
Capsule parameters also differ: avian uses capsule(radius, length) while rapier uses capsule_y(half_height, radius) — note that both the naming and the parameter order differ.
// ---- avian3d ----
Collider::sphere(0.5)                 // radius
Collider::cuboid(1.0, 2.0, 1.0)       // full extents x, y, z
Collider::capsule(0.5, 1.0)           // radius, length
Collider::cylinder(0.5, 2.0)          // radius, height
Collider::cone(0.5, 1.0)              // radius, height
Collider::triangle(a, b, c)           // three Vec3 vertices
Collider::trimesh_from_mesh(&mesh)    // arbitrary triangle mesh
Collider::convex_hull(points)         // convex hull from point cloud
Collider::compound(vec![              // multiple shapes combined
    (Vec3::ZERO, Quat::IDENTITY, Collider::sphere(0.5)),
    (Vec3::new(0.0, 1.0, 0.0), Quat::IDENTITY, Collider::cuboid(0.3, 0.3, 0.3)),
])

// ---- bevy_rapier3d ----
Collider::ball(0.5)                   // radius
Collider::cuboid(1.0, 2.0, 1.0)       // half-extents x, y, z
Collider::capsule_y(1.0, 0.5)         // half-height, radius
Collider::cylinder(1.0, 0.5)          // half-height, radius
Collider::cone(1.0, 0.5)              // half-height, radius
Collider::triangle(a, b, c)           // three Vec3 vertices
Collider::trimesh(vertices, indices)  // vertices + triangle indices
Collider::convex_hull(&points)        // convex hull from point cloud
Collider::compound(vec![              // multiple shapes combined
    (Vec3::ZERO, Quat::IDENTITY, Collider::ball(0.5)),
    (Vec3::new(0.0, 1.0, 0.0), Quat::IDENTITY, Collider::cuboid(0.3, 0.3, 0.3)),
])
2D equivalents: In avian2d, use Collider::circle(r), Collider::rectangle(w, h), Collider::capsule(r, l). In bevy_rapier2d, use Collider::ball(r), Collider::cuboid(hx, hy), Collider::capsule_y(hl, r).
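To make the sizing rule concrete, here is a minimal sketch, assuming avian3d (crate choice and dimensions are illustrative), that keeps a render mesh and its collider the same size; with bevy_rapier you would pass the halved values instead:

use avian3d::prelude::*;
use bevy::prelude::*;

fn spawn_crate(
    mut commands: Commands,
    mut meshes: ResMut<Assets<Mesh>>,
    mut materials: ResMut<Assets<StandardMaterial>>,
) {
    let size = Vec3::new(2.0, 1.0, 2.0); // full dimensions of the visual box
    commands.spawn((
        RigidBody::Dynamic,
        // avian: full extents, matching Cuboid::new directly.
        // bevy_rapier: Collider::cuboid(size.x / 2.0, size.y / 2.0, size.z / 2.0)
        Collider::cuboid(size.x, size.y, size.z),
        Mesh3d(meshes.add(Cuboid::new(size.x, size.y, size.z))),
        MeshMaterial3d(materials.add(Color::srgb(0.7, 0.5, 0.3))),
        Transform::from_xyz(0.0, 3.0, 0.0),
    ));
}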
Collision Detection
Collision Events
#![allow(unused)] fn main() { // ---- avian ---- use avian3d::prelude::*; fn handle_collisions( mut collision_started: EventReader<CollisionStarted>, mut collision_ended: EventReader<CollisionEnded>, ) { for CollisionStarted(entity_a, entity_b) in collision_started.read() { println!("{entity_a:?} started colliding with {entity_b:?}"); } for CollisionEnded(entity_a, entity_b) in collision_ended.read() { println!("{entity_a:?} stopped colliding with {entity_b:?}"); } } // ---- bevy_rapier ---- use bevy_rapier3d::prelude::*; fn handle_collisions(mut collision_events: EventReader<CollisionEvent>) { for event in collision_events.read() { match event { CollisionEvent::Started(a, b, _flags) => { println!("{a:?} started colliding with {b:?}"); } CollisionEvent::Stopped(a, b, _flags) => { println!("{a:?} stopped colliding with {b:?}"); } } } } }
Collision Layers and Groups
Collision layers let you control which objects can collide with which:
#![allow(unused)] fn main() { // ---- avian ---- use avian3d::prelude::*; #[derive(PhysicsLayer, Default)] enum GameLayer { #[default] Default, Player, Enemy, Projectile, Terrain, } // Player collides with enemies and terrain, but not other players commands.spawn(( RigidBody::Dynamic, Collider::capsule(0.4, 1.0), CollisionLayers::new(GameLayer::Player, [GameLayer::Enemy, GameLayer::Terrain]), )); // ---- bevy_rapier ---- use bevy_rapier3d::prelude::*; const PLAYER_GROUP: Group = Group::GROUP_1; const ENEMY_GROUP: Group = Group::GROUP_2; const TERRAIN_GROUP: Group = Group::GROUP_3; commands.spawn(( RigidBody::Dynamic, Collider::capsule_y(0.5, 0.4), CollisionGroups::new(PLAYER_GROUP, ENEMY_GROUP | TERRAIN_GROUP), )); }
Sensors (Trigger Volumes)
Sensors detect overlap without generating physical contact responses. Use them for trigger zones, pickup areas, and damage regions.
// ---- avian ----
commands.spawn((
    Collider::sphere(3.0),
    Sensor, // no physical response, just generates events
    CollisionLayers::new(GameLayer::Default, GameLayer::Player),
));

// ---- bevy_rapier ----
commands.spawn((
    Collider::ball(3.0),
    Sensor,
    ActiveEvents::COLLISION_EVENTS, // required to receive events for sensors
));
Raycasting
avian — RayCaster Component and SpatialQuery
use avian3d::prelude::*;

// Approach 1: RayCaster component (persists, updates each frame)
commands.spawn((
    RayCaster::new(Vec3::ZERO, Dir3::NEG_Y).with_max_distance(100.0),
    Transform::from_xyz(0.0, 10.0, 0.0),
));

fn read_raycast_hits(query: Query<(&RayCaster, &RayHits)>) {
    for (ray, hits) in &query {
        for hit in hits.iter() {
            println!("Hit entity {:?} at distance {}", hit.entity, hit.distance);
        }
    }
}

// Approach 2: SpatialQuery (one-shot, on-demand)
fn cast_ray_on_demand(spatial_query: SpatialQuery) {
    if let Some(hit) = spatial_query.cast_ray(
        Vec3::new(0.0, 10.0, 0.0), // origin
        Dir3::NEG_Y,               // direction
        100.0,                     // max distance
        true,                      // solid (hit interior of shapes)
        &SpatialQueryFilter::default(),
    ) {
        println!("Hit {:?} at distance {}", hit.entity, hit.distance);
    }
}
bevy_rapier — RapierContext
#![allow(unused)] fn main() { use bevy_rapier3d::prelude::*; fn cast_ray(rapier_context: Res<RapierContext>) { if let Some((entity, distance)) = rapier_context.cast_ray( Vec3::new(0.0, 10.0, 0.0), // origin Vec3::NEG_Y, // direction 100.0, // max distance true, // solid QueryFilter::default(), ) { println!("Hit {entity:?} at distance {distance}"); } } // With hit normal: fn cast_ray_with_normal(rapier_context: Res<RapierContext>) { if let Some((entity, intersection)) = rapier_context.cast_ray_and_get_normal( Vec3::new(0.0, 10.0, 0.0), Vec3::NEG_Y, 100.0, true, QueryFilter::default(), ) { println!("Hit {entity:?}, normal: {}", intersection.normal); } } }
Character Controllers
A character controller is a kinematic body that slides along surfaces, steps over small obstacles, and detects ground contact. bevy_rapier ships a built-in KinematicCharacterController component; avian does not yet include one, so the common pattern there is a kinematic body driven through LinearVelocity with your own marker component (the avian repository provides full character controller examples).
avian
use avian3d::prelude::*;
use bevy::prelude::*;

// avian has no built-in controller component — define your own marker
#[derive(Component)]
struct CharacterController;

fn spawn_character(mut commands: Commands) {
    commands.spawn((
        RigidBody::Kinematic,
        Collider::capsule(0.4, 1.0),
        CharacterController,
        // Movement is applied via LinearVelocity on kinematic bodies
        LinearVelocity::default(),
        Transform::from_xyz(0.0, 1.0, 0.0),
    ));
}

fn move_character(
    mut query: Query<&mut LinearVelocity, With<CharacterController>>,
    input: Res<ButtonInput<KeyCode>>,
) {
    for mut velocity in &mut query {
        let mut direction = Vec3::ZERO;
        if input.pressed(KeyCode::KeyW) { direction.z -= 1.0; }
        if input.pressed(KeyCode::KeyS) { direction.z += 1.0; }
        if input.pressed(KeyCode::KeyA) { direction.x -= 1.0; }
        if input.pressed(KeyCode::KeyD) { direction.x += 1.0; }
        let speed = 5.0;
        velocity.0 = direction.normalize_or_zero() * speed;
    }
}
bevy_rapier
#![allow(unused)] fn main() { use bevy_rapier3d::prelude::*; fn spawn_character(mut commands: Commands) { commands.spawn(( RigidBody::KinematicPositionBased, Collider::capsule_y(0.5, 0.4), KinematicCharacterController { offset: CharacterLength::Absolute(0.01), max_slope_climb_angle: std::f32::consts::FRAC_PI_4, // 45 degrees min_slope_slide_angle: std::f32::consts::FRAC_PI_4, snap_to_ground: Some(CharacterLength::Absolute(0.2)), ..default() }, Transform::from_xyz(0.0, 1.0, 0.0), )); } fn move_character( mut query: Query<&mut KinematicCharacterController>, input: Res<ButtonInput<KeyCode>>, time: Res<Time>, ) { for mut controller in &mut query { let mut direction = Vec3::ZERO; if input.pressed(KeyCode::KeyW) { direction.z -= 1.0; } if input.pressed(KeyCode::KeyS) { direction.z += 1.0; } if input.pressed(KeyCode::KeyA) { direction.x -= 1.0; } if input.pressed(KeyCode::KeyD) { direction.x += 1.0; } let speed = 5.0; controller.translation = Some(direction.normalize_or_zero() * speed * time.delta_secs()); } } // Ground detection via KinematicCharacterControllerOutput fn check_grounded(query: Query<&KinematicCharacterControllerOutput>) { for output in &query { if output.grounded { println!("Character is on the ground"); } } } }
Joints
Joints constrain how two rigid bodies move relative to each other.
avian
use avian3d::prelude::*;

// Fixed joint — bodies stay rigidly attached
let entity_a = commands.spawn((RigidBody::Dynamic, Collider::sphere(0.5))).id();
let entity_b = commands.spawn((RigidBody::Dynamic, Collider::sphere(0.5))).id();
commands.spawn(FixedJoint::new(entity_a, entity_b));

// Revolute joint — rotation around a single axis (hinge)
commands.spawn(
    RevoluteJoint::new(entity_a, entity_b)
        .with_aligned_axis(Vec3::Z)    // axis of rotation
        .with_angle_limits(-1.0, 1.0), // radians
);

// Prismatic joint — sliding along a single axis (piston/slider)
commands.spawn(
    PrismaticJoint::new(entity_a, entity_b)
        .with_free_axis(Vec3::Y)
        .with_limits(0.0, 5.0), // min/max translation
);

// Distance/spring joint — keeps bodies within a distance range
commands.spawn(
    DistanceJoint::new(entity_a, entity_b)
        .with_limits(1.0, 5.0)
        .with_compliance(0.001), // lower = stiffer spring
);
bevy_rapier
use bevy_rapier3d::prelude::*;

let entity_a = commands.spawn((RigidBody::Dynamic, Collider::ball(0.5))).id();
let entity_b = commands.spawn((RigidBody::Dynamic, Collider::ball(0.5))).id();

// Note: an entity holds one ImpulseJoint component, so the inserts below
// are alternatives, not a stack.

// Fixed joint — inserted as a component on one of the two bodies
let fixed = FixedJointBuilder::new()
    .local_anchor1(Vec3::ZERO)
    .local_anchor2(Vec3::new(0.0, -1.0, 0.0));
commands.entity(entity_b).insert(ImpulseJoint::new(entity_a, fixed));

// Revolute joint
let revolute = RevoluteJointBuilder::new(Vec3::Z)
    .local_anchor1(Vec3::new(1.0, 0.0, 0.0))
    .local_anchor2(Vec3::new(-1.0, 0.0, 0.0))
    .limits([-1.0, 1.0]);
commands.entity(entity_b).insert(ImpulseJoint::new(entity_a, revolute));

// Prismatic joint
let prismatic = PrismaticJointBuilder::new(Vec3::Y)
    .local_anchor1(Vec3::ZERO)
    .local_anchor2(Vec3::ZERO)
    .limits([0.0, 5.0]);
commands.entity(entity_b).insert(ImpulseJoint::new(entity_a, prismatic));

// Spring joint
let spring = SpringJointBuilder::new(2.0, 0.5, 0.1); // rest_length, stiffness, damping
commands.entity(entity_b).insert(ImpulseJoint::new(entity_a, spring));
2D vs 3D
Each physics crate ships as two separate crates — one for 2D and one for 3D. You cannot mix them in the same Bevy app.
| Dimension | avian | bevy_rapier |
|---|---|---|
| 2D | avian2d | bevy_rapier2d |
| 3D | avian3d | bevy_rapier3d |
Key differences in 2D mode:
- Positions use Vec2, rotations use scalar angles (radians) instead of Quat
- Collider shapes: circle instead of sphere, rectangle instead of cuboid
- Gravity default is (0.0, -9.81) as a Vec2
- Joints rotate around the implicit Z axis
- LinearVelocity and AngularVelocity use 2D types (see the sketch below)
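A minimal sketch of these 2D types in practice, assuming avian2d (the values are illustrative):

use avian2d::prelude::*;
use bevy::prelude::*;

fn spawn_ball(mut commands: Commands) {
    commands.spawn((
        RigidBody::Dynamic,
        Collider::circle(16.0),                // 2D: circle, not sphere
        LinearVelocity(Vec2::new(120.0, 0.0)), // Vec2, not Vec3
        AngularVelocity(3.0),                  // scalar radians/sec around Z
        Transform::from_xyz(0.0, 200.0, 0.0),
    ));
}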
Choose 2D physics when your game is truly 2D (platformer, top-down). If you have a 2D game with a 3D camera or 3D models rendered from a fixed angle, you may still want 2D physics for simplicity.
avian vs bevy_rapier — Side-by-Side Comparison
Cargo.toml Dependencies
| Dimension | avian | bevy_rapier |
|---|---|---|
| 2D | avian2d = "0.2" | bevy_rapier2d = "0.28" |
| 3D | avian3d = "0.2" | bevy_rapier3d = "0.28" |
Plugin Setup
#![allow(unused)] fn main() { // ---- avian3d ---- use avian3d::prelude::*; App::new() .add_plugins(DefaultPlugins) .add_plugins(PhysicsPlugins::default()) .add_plugins(PhysicsDebugPlugin::default()) // optional debug // ---- bevy_rapier3d ---- use bevy_rapier3d::prelude::*; App::new() .add_plugins(DefaultPlugins) .add_plugins(RapierPhysicsPlugin::<NoUserData>::default()) .add_plugins(RapierDebugRenderPlugin::default()) // optional debug }
RigidBody Types
| Role | avian | bevy_rapier |
|---|---|---|
| Fully simulated | RigidBody::Dynamic | RigidBody::Dynamic |
| Immovable | RigidBody::Static | RigidBody::Fixed |
| Code-driven (transform) | RigidBody::Kinematic | RigidBody::KinematicPositionBased |
| Code-driven (velocity) | RigidBody::Kinematic | RigidBody::KinematicVelocityBased |
Note: avian uses a single Kinematic variant; bevy_rapier splits it into position-based and velocity-based.
Collider Creation
| Shape | avian3d | bevy_rapier3d |
|---|---|---|
| Sphere | Collider::sphere(radius) | Collider::ball(radius) |
| Box | Collider::cuboid(x, y, z) — full extents | Collider::cuboid(hx, hy, hz) — half-extents |
| Capsule | Collider::capsule(radius, length) | Collider::capsule_y(half_height, radius) |
| Cylinder | Collider::cylinder(radius, height) | Collider::cylinder(half_height, radius) |
| Cone | Collider::cone(radius, height) | Collider::cone(half_height, radius) |
| Triangle mesh | Collider::trimesh_from_mesh(&mesh) | Collider::trimesh(vertices, indices) |
| Convex hull | Collider::convex_hull(points) | Collider::convex_hull(&points) |
Note the parameter conventions: avian takes (radius, length) for capsules and full extents for boxes, while rapier takes (half_height, radius) and half-extents.
Velocity and Force API
| Concept | avian | bevy_rapier |
|---|---|---|
| Linear velocity | LinearVelocity(Vec3) component | Velocity { linvel, angvel } component |
| Angular velocity | AngularVelocity(Vec3) component | Velocity { linvel, angvel } component |
| External force | ExternalForce::new(Vec3) | ExternalForce { force, torque } |
| External impulse | ExternalImpulse::new(Vec3) | ExternalImpulse { impulse, torque_impulse } |
| Damping | LinearDamping(f32), AngularDamping(f32) | Damping { linear_damping, angular_damping } |
| Mass | Mass(f32) | ColliderMassProperties::Mass(f32) |
| Gravity scale | GravityScale(f32) | GravityScale(f32) |
| Restitution | Restitution::new(0.7) | Restitution::coefficient(0.7) |
| Friction | Friction::new(0.5) | Friction::coefficient(0.5) |
Key difference: avian uses separate components for linear and angular velocity; rapier bundles them into a single Velocity struct.
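To make the difference concrete, here is the same jump written against both APIs; the Player marker is illustrative, and only one crate would be compiled into a given project:

// ---- avian: LinearVelocity is its own component ----
fn jump_avian(
    keys: Res<ButtonInput<KeyCode>>,
    mut query: Query<&mut LinearVelocity, With<Player>>,
) {
    if keys.just_pressed(KeyCode::Space) {
        for mut velocity in &mut query {
            velocity.y = 5.0; // LinearVelocity derefs to Vec3
        }
    }
}

// ---- bevy_rapier: linear and angular velocity share one struct ----
fn jump_rapier(
    keys: Res<ButtonInput<KeyCode>>,
    mut query: Query<&mut Velocity, With<Player>>,
) {
    if keys.just_pressed(KeyCode::Space) {
        for mut velocity in &mut query {
            velocity.linvel.y = 5.0;
        }
    }
}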
Collision Events
#![allow(unused)] fn main() { // ---- avian ---- fn collisions( mut started: EventReader<CollisionStarted>, mut ended: EventReader<CollisionEnded>, ) { for CollisionStarted(a, b) in started.read() { /* ... */ } for CollisionEnded(a, b) in ended.read() { /* ... */ } } // ---- bevy_rapier ---- fn collisions(mut events: EventReader<CollisionEvent>) { for event in events.read() { match event { CollisionEvent::Started(a, b, _flags) => { /* ... */ } CollisionEvent::Stopped(a, b, _flags) => { /* ... */ } } } } // Note: rapier requires `ActiveEvents::COLLISION_EVENTS` on at least one entity. }
Collision Layers
// ---- avian ----
#[derive(PhysicsLayer, Default)]
enum GameLayer {
    #[default]
    Default,
    Player,
    Enemy,
}
CollisionLayers::new(GameLayer::Player, [GameLayer::Enemy, GameLayer::Default])

// ---- bevy_rapier ----
const DEFAULT: Group = Group::GROUP_1;
const PLAYER: Group = Group::GROUP_2;
const ENEMY: Group = Group::GROUP_3;
CollisionGroups::new(PLAYER, ENEMY | DEFAULT)
avian uses a derive macro for named layers. rapier uses bitflag groups.
Sensors
// ---- avian ----
commands.spawn((Collider::sphere(3.0), Sensor));

// ---- bevy_rapier ----
commands.spawn((Collider::ball(3.0), Sensor, ActiveEvents::COLLISION_EVENTS));
// rapier requires ActiveEvents for sensors to generate events.
Raycasting
// `origin`, `direction`, `max_distance`, and `solid` are assumed in scope.

// ---- avian (one-shot) ----
fn raycast(spatial_query: SpatialQuery) {
    if let Some(hit) = spatial_query.cast_ray(
        origin, direction, max_distance, solid,
        &SpatialQueryFilter::default(),
    ) {
        // hit.entity, hit.distance
    }
}

// ---- avian (persistent component) ----
commands.spawn(RayCaster::new(Vec3::ZERO, Dir3::NEG_Y).with_max_distance(100.0));
// Read results from the RayHits component each frame.

// ---- bevy_rapier ----
fn raycast(rapier_context: Res<RapierContext>) {
    if let Some((entity, distance)) = rapier_context.cast_ray(
        origin, direction, max_distance, solid,
        QueryFilter::default(),
    ) {
        // entity, distance
    }
}
avian offers both a persistent RayCaster component and an on-demand SpatialQuery system parameter. rapier provides only the on-demand RapierContext approach.
Character Controller
// ---- avian ----
// `CharacterController` is a user-defined marker (avian has no built-in one)
commands.spawn((
    RigidBody::Kinematic,
    Collider::capsule(0.4, 1.0),
    CharacterController,
    LinearVelocity::default(),
));
// Move by setting LinearVelocity directly.

// ---- bevy_rapier ----
commands.spawn((
    RigidBody::KinematicPositionBased,
    Collider::capsule_y(0.5, 0.4),
    KinematicCharacterController {
        max_slope_climb_angle: std::f32::consts::FRAC_PI_4,
        snap_to_ground: Some(CharacterLength::Absolute(0.2)),
        ..default()
    },
));
// Move by setting controller.translation = Some(movement_vector).
// Read KinematicCharacterControllerOutput for grounded state.
Joints
| Joint type | avian | bevy_rapier |
|---|---|---|
| Fixed | FixedJoint::new(a, b) | ImpulseJoint::new(a, FixedJointBuilder::new()) |
| Revolute (hinge) | RevoluteJoint::new(a, b).with_aligned_axis(axis) | ImpulseJoint::new(a, RevoluteJointBuilder::new(axis)) |
| Prismatic (slider) | PrismaticJoint::new(a, b).with_free_axis(axis) | ImpulseJoint::new(a, PrismaticJointBuilder::new(axis)) |
| Spring/distance | DistanceJoint::new(a, b).with_limits(min, max) | ImpulseJoint::new(a, SpringJointBuilder::new(...)) |
avian spawns joints as their own entities. rapier inserts ImpulseJoint as a component on one of the two bodies.
Key Differences and Tradeoffs
| Aspect | avian | bevy_rapier |
|---|---|---|
| Architecture | Pure Rust, designed for Bevy from day one | Rust wrapper around the standalone rapier engine (also pure Rust) |
| API ergonomics | More Bevy-idiomatic (separate components, derive macros) | Thin wrapper — API mirrors rapier's own types |
| Maturity | Newer, rapidly evolving | Older, more community resources and examples |
| Performance | Competitive; benefits from Bevy's parallelism natively | Mature optimizations; well-tuned broadphase |
| Determinism | Cross-platform deterministic by design | Deterministic within the same platform/build |
| Debug rendering | PhysicsDebugPlugin | RapierDebugRenderPlugin |
| Community | Growing; fewer tutorials/examples available | Larger ecosystem of tutorials, examples, and users |
When to Choose Which
Choose avian when:
- Starting a new Bevy project with no existing rapier code
- You value Bevy-native, idiomatic component APIs
- Cross-platform determinism matters (e.g., lockstep multiplayer)
- You want an engine developed specifically for Bevy and versioned in lockstep with it
Choose bevy_rapier when:
- Migrating from an existing rapier-based project
- You need a specific rapier feature not yet in avian
- You want the largest possible pool of community examples and StackOverflow answers
- Your team already knows the rapier API from other engines
name: bevy-project-setup description: Use when the user asks to create a new Bevy project, scaffold a game, configure Cargo.toml for Bevy, set up fast compile times, configure dynamic linking, add Bevy feature flags, or asks about Bevy project structure and build optimization. version: 1.0.0
Bevy Project Setup — Scaffolding & Build Optimization
For general Rust workspace conventions, crate boundaries, and Cargo.toml management, see the rust-project-setup skill first. This skill covers Bevy-specific additions on top of that foundation.
Project Structure
A standard Bevy game project follows this layout:
my_game/
├── Cargo.toml
├── .cargo/
│ └── config.toml # Linker & fast-compile settings
├── src/
│ ├── main.rs # App entry point
│ ├── plugins/
│ │ ├── mod.rs
│ │ ├── camera.rs # Camera plugin
│ │ ├── player.rs # Player plugin
│ │ └── ui.rs # UI plugin
│ ├── components/
│ │ └── mod.rs # Shared components
│ ├── resources/
│ │ └── mod.rs # Shared resources
│ └── systems/
│ └── mod.rs # Standalone systems
├── assets/
│ ├── textures/
│ ├── models/
│ ├── audio/
│ └── fonts/
└── README.md
Each gameplay domain gets its own plugin module in src/plugins/. Components and resources that are shared across plugins live in their own top-level modules. Keep plugins focused: one responsibility per plugin.
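A minimal sketch of what one of these plugin modules might look like (all names are illustrative):

// src/plugins/player.rs — one focused gameplay domain per plugin
use bevy::prelude::*;

pub struct PlayerPlugin;

impl Plugin for PlayerPlugin {
    fn build(&self, app: &mut App) {
        app.add_systems(Startup, spawn_player)
            .add_systems(Update, move_player);
    }
}

#[derive(Component)]
pub struct Player;

fn spawn_player(mut commands: Commands) {
    commands.spawn((Player, Transform::default(), Visibility::default()));
}

fn move_player(time: Res<Time>, mut query: Query<&mut Transform, With<Player>>) {
    for mut transform in &mut query {
        transform.translation.x += 10.0 * time.delta_secs();
    }
}

// In main.rs: App::new().add_plugins(plugins::player::PlayerPlugin)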
Cargo.toml
Use a workspace-based setup. The game crate depends on Bevy with explicit feature selection:
[workspace]
resolver = "2"
members = ["game"]
[workspace.dependencies]
bevy = { version = "0.15", default-features = false, features = [
"bevy_asset",
"bevy_audio",
"bevy_color",
"bevy_core_pipeline",
"bevy_gilrs",
"bevy_gizmos",
"bevy_gltf",
"bevy_input_focus",
"bevy_mesh_picking_backend",
"bevy_pbr",
"bevy_picking",
"bevy_render",
"bevy_scene",
"bevy_sprite",
"bevy_state",
"bevy_text",
"bevy_ui",
"bevy_ui_picking_backend",
"bevy_winit",
"default_font",
"hdr",
"multi_threaded",
"png",
"smol_str",
"sysinfo_plugin",
"tonemapping_luts",
"vorbis",
"x11",
] }
# Optimize dependencies in dev builds for playable frame rates
[profile.dev.package."*"]
opt-level = 2
# Full optimization for release
[profile.release]
lto = "thin"
codegen-units = 1
In the game crate's Cargo.toml:
[package]
name = "my_game"
version = "0.1.0"
edition = "2021"
[dependencies]
bevy.workspace = true
[features]
dev = ["bevy/dynamic_linking"]
Fast Compile Configuration
Create .cargo/config.toml to reduce compile times. See references/cargo-config-templates.md for platform-specific templates.
Key levers:
- Linker: use mold (Linux), lld (Windows), or the default macOS linker with proper flags.
- Dynamic linking: enable bevy/dynamic_linking during development via a dev feature flag. Run with cargo run --features dev.
- Cranelift backend (optional, nightly): faster codegen at the cost of runtime performance. Add to .cargo/config.toml:
# Requires: rustup component add rustc-codegen-cranelift --toolchain nightly
[unstable]
codegen-backend = true
[profile.dev]
codegen-backend = "cranelift"
Minimal main.rs
A starter main.rs with window configuration:
use bevy::prelude::*; fn main() { App::new() .add_plugins(DefaultPlugins.set(WindowPlugin { primary_window: Some(Window { title: "My Game".to_string(), resolution: (1280.0, 720.0).into(), ..default() }), ..default() })) .add_systems(Startup, setup) .run(); } fn setup(mut commands: Commands) { commands.spawn(Camera2d); }
For a 3D starter, swap Camera2d for a 3D camera:
#![allow(unused)] fn main() { fn setup(mut commands: Commands) { // Camera commands.spawn(( Camera3d::default(), Transform::from_xyz(-2.5, 4.5, 9.0).looking_at(Vec3::ZERO, Vec3::Y), )); // Light commands.spawn(( PointLight { shadows_enabled: true, ..default() }, Transform::from_xyz(4.0, 8.0, 4.0), )); } }
Feature Flags
Bevy ships with many default features. For full-size games the defaults are fine. For specialized projects (headless server, minimal 2D game, CLI tool with ECS), disable defaults and pick only what you need.
See references/feature-flags.md for a complete table of flags with descriptions and guidance on when to enable or disable each one.
WASM Target
To build for the web with Trunk:
- Install prerequisites:
rustup target add wasm32-unknown-unknown
cargo install trunk
- Create index.html in the project root:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<title>My Game</title>
<style>
html, body { margin: 0; padding: 0; width: 100%; height: 100%; overflow: hidden; }
canvas { display: block; width: 100%; height: 100%; }
</style>
</head>
<body>
<link data-trunk rel="copy-dir" href="assets" />
</body>
</html>
- Add a wasm feature in your game crate's Cargo.toml that selects only WASM-compatible Bevy features (no x11, wayland, dynamic_linking, or multi_threaded):
[features]
wasm = [
"bevy/bevy_asset",
"bevy/bevy_audio",
"bevy/bevy_color",
"bevy/bevy_core_pipeline",
"bevy/bevy_gizmos",
"bevy/bevy_gltf",
"bevy/bevy_input_focus",
"bevy/bevy_pbr",
"bevy/bevy_render",
"bevy/bevy_scene",
"bevy/bevy_sprite",
"bevy/bevy_state",
"bevy/bevy_text",
"bevy/bevy_ui",
"bevy/bevy_winit",
"bevy/default_font",
"bevy/hdr",
"bevy/png",
"bevy/tonemapping_luts",
"bevy/vorbis",
"bevy/webgl2",
]
- Build and serve:
trunk serve --features wasm
Key WASM differences:
- Assets are loaded via HTTP, not the filesystem. Use AssetServer paths relative to the assets/ directory.
- Audio requires a user interaction before it can play (browser policy).
- bevy_gilrs (gamepad support) does not work on WASM.
- Use webgl2 for broad browser compatibility, or webgpu for modern browsers only. A cfg-gated window setup sketch follows below.
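A minimal sketch of gating web-specific window settings with cfg attributes; the #bevy-canvas selector is illustrative and assumes your page supplies its own canvas element:

use bevy::prelude::*;

fn main() {
    let mut window = Window {
        title: "My Game".to_string(),
        ..default()
    };
    // On the web, attach to an existing <canvas> and let CSS control sizing
    #[cfg(target_arch = "wasm32")]
    {
        window.canvas = Some("#bevy-canvas".to_string());
        window.fit_canvas_to_parent = true;
    }
    App::new()
        .add_plugins(DefaultPlugins.set(WindowPlugin {
            primary_window: Some(window),
            ..default()
        }))
        .run();
}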
Cargo Config Templates for Bevy
Copy-pasteable .cargo/config.toml templates for fast Bevy compile times.
Linux (mold linker)
Install mold: sudo apt install mold or sudo pacman -S mold
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
macOS
macOS uses the default linker. No special config is strictly needed, but these flags help:
# .cargo/config.toml
[target.aarch64-apple-darwin]
rustflags = [
"-C", "link-arg=-fuse-ld=/usr/bin/ld",
"-Zshare-generics=y",
]
[target.x86_64-apple-darwin]
rustflags = [
"-C", "link-arg=-fuse-ld=/usr/bin/ld",
"-Zshare-generics=y",
]
Note: -Zshare-generics=y requires nightly. Remove it if using stable.
Windows (rust-lld)
# .cargo/config.toml
[target.x86_64-pc-windows-msvc]
linker = "rust-lld.exe"
Cross-platform (auto-detect)
A single config that works on all platforms by using environment-specific overrides:
# .cargo/config.toml
# Linux
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
# macOS ARM
[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/usr/bin/ld"]
# macOS x86
[target.x86_64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/usr/bin/ld"]
# Windows
[target.x86_64-pc-windows-msvc]
linker = "rust-lld.exe"
Cargo.toml Profile Settings
Add these to your workspace root Cargo.toml:
# Optimize all dependencies in dev mode so the game is playable
# while keeping your own code at opt-level 0 for fast compilation.
[profile.dev.package."*"]
opt-level = 2
# Enable a small amount of optimization in dev mode for your code.
# Remove this if compile times are more important than dev frame rates.
[profile.dev]
opt-level = 1
# Release profile: maximum performance
[profile.release]
lto = "thin"
codegen-units = 1
opt-level = 3
# Stripped release build for distribution (cargo build --profile dist)
[profile.dist]
inherits = "release"
lto = "fat"
strip = true
Cranelift Backend (Nightly Only)
For the fastest possible compile times at the cost of runtime performance:
# .cargo/config.toml — append this section
# Requires nightly toolchain and:
# rustup component add rustc-codegen-cranelift --toolchain nightly
[unstable]
codegen-backend = true
[profile.dev]
codegen-backend = "cranelift"
Run with: cargo +nightly run --features dev
Bevy Feature Flags Reference
Feature flags for Bevy 0.15. Disable default-features and pick what you need for minimal builds, or keep defaults for full-featured games.
Core Features
| Feature | Default | Description | When to disable |
|---|---|---|---|
| multi_threaded | Yes | Enables multi-threaded task execution | Headless single-threaded environments, WASM |
| bevy_asset | Yes | Asset loading system | Never for games; only for pure ECS/logic-only apps |
| bevy_scene | Yes | Scene serialization and loading | If you don't use .scn.ron scene files |
| bevy_state | Yes | State machine for app states (menu, gameplay, etc.) | Unlikely — most games need states |
| bevy_color | Yes | Color types and conversions | Rarely — almost everything uses colors |
Windowing & Input
| Feature | Default | Description | When to disable |
|---|---|---|---|
| bevy_winit | Yes | Window creation and event loop via winit | Headless/server builds |
| bevy_gilrs | Yes | Gamepad/controller support via gilrs | If you don't support gamepads; always disable for WASM |
| bevy_input_focus | Yes | Input focus tracking | Rarely |
| bevy_picking | Yes | Pointer-based picking (click/hover detection) | If you handle all input manually |
| bevy_mesh_picking_backend | Yes | Mesh-based picking for 3D objects | 2D-only games |
| bevy_ui_picking_backend | Yes | UI node picking | If not using bevy_ui |
Rendering
| Feature | Default | Description | When to disable |
|---|---|---|---|
| bevy_render | Yes | Core rendering infrastructure | Headless/server builds |
| bevy_core_pipeline | Yes | Built-in render pipelines (2D, 3D, tonemapping) | Headless/server builds |
| bevy_pbr | Yes | Physically-based 3D rendering, materials, lighting | 2D-only games |
| bevy_sprite | Yes | 2D sprite rendering | 3D-only games |
| bevy_text | Yes | Text rendering | If you never display text |
| bevy_ui | Yes | Built-in UI system | If using a third-party UI library exclusively |
| bevy_gizmos | Yes | Debug drawing (lines, shapes) | Production release builds (strip via feature) |
| bevy_gltf | Yes | glTF 3D model loading | 2D-only games or custom mesh generation |
| hdr | Yes | HDR texture support | If all textures are LDR |
| tonemapping_luts | Yes | Tonemapping look-up tables | If you use a custom tonemapper |
Audio
| Feature | Default | Description | When to disable |
|---|---|---|---|
| bevy_audio | Yes | Built-in audio playback | If using a third-party audio library (e.g., kira) |
| vorbis | Yes | OGG Vorbis audio decoding | If you only use WAV or other formats |
Image Formats
| Feature | Default | Description | When to disable |
|---|---|---|---|
| png | Yes | PNG image loading | If you only use other formats |
| jpeg | No | JPEG image loading | Enable if you have JPEG textures |
| bmp | No | BMP image loading | Enable if you have BMP textures |
| ktx2 | No | KTX2 compressed texture loading | Enable for GPU-compressed textures |
| basis-universal | No | Basis Universal texture compression | Enable for cross-platform compressed textures |
| exr | No | OpenEXR HDR image loading | Enable for HDR environment maps |
Platform Features
| Feature | Default | Description | When to disable |
|---|---|---|---|
| x11 | Yes | X11 windowing on Linux | Wayland-only Linux setups |
| wayland | No | Wayland windowing on Linux | Enable for native Wayland support |
| webgl2 | No | WebGL2 rendering backend | Enable for WASM builds targeting broad browser support |
| webgpu | No | WebGPU rendering backend | Enable for WASM builds targeting modern browsers |
Development & Debugging
| Feature | Default | Description | When to disable |
|---|---|---|---|
| dynamic_linking | No | Dynamically link Bevy for faster dev compiles | Always disable for release/distribution builds |
| file_watcher | No | Hot-reload assets when files change on disk | Enable during development for asset iteration |
| asset_processor | No | Pre-process assets at build time | Enable when you need asset optimization pipelines |
| embedded_watcher | No | Hot-reload embedded assets | Enable during development with embedded assets |
Profiling & Tracing
| Feature | Default | Description | When to disable |
|---|---|---|---|
| trace | No | Adds tracing spans to Bevy systems and functions | Enable when profiling performance |
| trace_tracy | No | Tracy profiler integration | Enable to use the Tracy profiler |
| trace_chrome | No | Chrome trace format output (chrome://tracing) | Enable for browser-based trace viewing |
| detailed_trace | No | Verbose tracing for ECS internals | Enable only when debugging scheduler issues |
Miscellaneous
| Feature | Default | Description | When to disable |
|---|---|---|---|
| default_font | Yes | Bundles a default font so text works out of the box | If you always provide custom fonts |
| smol_str | Yes | Use smol_str for small-string optimization | Rarely needs disabling |
| sysinfo_plugin | Yes | System information diagnostics plugin | Production builds where you don't need diagnostics |
| serialize | No | Adds serde Serialize/Deserialize to common types | Enable for save/load systems or networking |
| bevy_dev_tools | No | Development tools (FPS overlay, state inspector) | Enable during development |
Example: Minimal 2D Game
bevy = { version = "0.15", default-features = false, features = [
"bevy_asset",
"bevy_color",
"bevy_core_pipeline",
"bevy_render",
"bevy_sprite",
"bevy_state",
"bevy_text",
"bevy_ui",
"bevy_winit",
"default_font",
"multi_threaded",
"png",
"x11",
] }
Example: Headless Server
bevy = { version = "0.15", default-features = false, features = [
"multi_threaded",
"serialize",
] }
name: bevy-rendering description: Use when the user asks about 2D or 3D rendering in Bevy, sprites, meshes, materials, cameras, lighting, shaders, textures, transforms, visibility, render layers, viewports, or visual aspects of a Bevy game. version: 1.0.0
Bevy Rendering — 2D & 3D Visuals
This skill covers Bevy's rendering systems for both 2D and 3D. For component and system fundamentals, see the bevy-ecs skill.
All examples target Bevy 0.15+ APIs, which use individual components rather than the deprecated bundle pattern.
Transform Hierarchy
Every visible entity needs a Transform (local) and GlobalTransform (computed world-space). Bevy propagates transforms through parent-child relationships automatically.
Coordinate system: right-handed, Y-up. +X is right, +Y is up, +Z points toward the viewer.
#![allow(unused)] fn main() { use bevy::prelude::*; fn setup(mut commands: Commands) { // Parent entity let parent = commands.spawn(( Transform::from_xyz(0.0, 2.0, 0.0), Visibility::default(), )).id(); // Child — its Transform is relative to the parent commands.spawn(( Transform::from_xyz(1.0, 0.0, 0.0), // world position: (1.0, 2.0, 0.0) Visibility::default(), )).set_parent(parent); } }
Key transform methods:
Transform::from_xyz(x, y, z)
Transform::from_translation(Vec3::new(x, y, z))
Transform::from_rotation(Quat::from_rotation_y(angle))
Transform::from_scale(Vec3::splat(2.0))
transform.looking_at(target, Vec3::Y) // orient to face a point
2D Rendering
Sprites
#![allow(unused)] fn main() { fn setup_sprite(mut commands: Commands, asset_server: Res<AssetServer>) { commands.spawn(Camera2d); commands.spawn(( Sprite { image: asset_server.load("player.png"), color: Color::WHITE, custom_size: Some(Vec2::new(64.0, 64.0)), // optional override ..default() }, Transform::from_xyz(0.0, 0.0, 0.0), )); } }
Z-ordering in 2D
Use transform.translation.z to control draw order. Higher Z values render on top.
#![allow(unused)] fn main() { // Background at z=0, player at z=1, UI overlay at z=2 commands.spawn(( Sprite { image: asset_server.load("bg.png"), ..default() }, Transform::from_xyz(0.0, 0.0, 0.0), )); commands.spawn(( Sprite { image: asset_server.load("player.png"), ..default() }, Transform::from_xyz(0.0, 0.0, 1.0), )); }
Sprite Sheets with TextureAtlas
#![allow(unused)] fn main() { fn setup_spritesheet( mut commands: Commands, asset_server: Res<AssetServer>, mut texture_atlas_layouts: ResMut<Assets<TextureAtlasLayout>>, ) { let texture = asset_server.load("spritesheet.png"); let layout = TextureAtlasLayout::from_grid(UVec2::new(32, 32), 6, 1, None, None); let layout_handle = texture_atlas_layouts.add(layout); commands.spawn(( Sprite { image: texture, texture_atlas: Some(TextureAtlas { layout: layout_handle, index: 0, }), ..default() }, Transform::default(), )); } }
Sprite Animation
#![allow(unused)] fn main() { #[derive(Component)] struct AnimationTimer(Timer); fn animate_sprite( time: Res<Time>, mut query: Query<(&mut AnimationTimer, &mut Sprite)>, ) { for (mut timer, mut sprite) in &mut query { timer.0.tick(time.delta()); if timer.0.just_finished() { if let Some(atlas) = &mut sprite.texture_atlas { atlas.index = (atlas.index + 1) % 6; // 6 frames } } } } }
Camera2d
#![allow(unused)] fn main() { commands.spawn(( Camera2d, Transform::from_xyz(0.0, 0.0, 0.0), OrthographicProjection { scale: 1.0, // zoom: smaller = zoomed in ..OrthographicProjection::default_2d() }, )); }
3D Rendering
Meshes and Materials
Bevy uses a PBR (Physically Based Rendering) pipeline. Attach Mesh3d and MeshMaterial3d<StandardMaterial> components.
#![allow(unused)] fn main() { fn setup_3d( mut commands: Commands, mut meshes: ResMut<Assets<Mesh>>, mut materials: ResMut<Assets<StandardMaterial>>, ) { // Spawn a red cube commands.spawn(( Mesh3d(meshes.add(Cuboid::new(1.0, 1.0, 1.0))), MeshMaterial3d(materials.add(StandardMaterial { base_color: Color::srgb(0.8, 0.1, 0.1), metallic: 0.0, perceptual_roughness: 0.5, ..default() })), Transform::from_xyz(0.0, 0.5, 0.0), )); } }
StandardMaterial Properties
| Property | Type | Description |
|---|---|---|
| base_color | Color | Albedo color |
| base_color_texture | Option<Handle<Image>> | Albedo texture map |
| metallic | f32 | 0.0 = dielectric, 1.0 = metal |
| perceptual_roughness | f32 | 0.0 = mirror-smooth, 1.0 = rough |
| emissive | LinearRgba | Self-illumination color (not affected by lighting) |
| reflectance | f32 | Fresnel reflectance at normal incidence (default 0.5) |
| alpha_mode | AlphaMode | Opaque, Blend, Mask, etc. |
| double_sided | bool | Render back faces |
| unlit | bool | Skip lighting calculations |
Built-in Shape Primitives
All implement Into<Mesh>:
Cuboid::new(width, height, depth)
Sphere::new(radius).mesh().ico(subdivisions) // or .uv(sectors, stacks)
Plane3d::default().mesh().size(width, depth)
Cylinder::new(radius, height)
Capsule3d::new(radius, half_length)
Torus::new(inner_radius, outer_radius)
Cameras
Camera3d
commands.spawn((
    Camera3d::default(),
    Transform::from_xyz(-2.0, 2.5, 5.0).looking_at(Vec3::ZERO, Vec3::Y),
));
Orthographic vs Perspective
#![allow(unused)] fn main() { // Perspective (default for Camera3d) commands.spawn(( Camera3d::default(), Projection::Perspective(PerspectiveProjection { fov: std::f32::consts::FRAC_PI_4, ..default() }), Transform::from_xyz(0.0, 5.0, 10.0).looking_at(Vec3::ZERO, Vec3::Y), )); // Orthographic 3D commands.spawn(( Camera3d::default(), Projection::Orthographic(OrthographicProjection { scale: 10.0, ..OrthographicProjection::default_3d() }), Transform::from_xyz(0.0, 5.0, 10.0).looking_at(Vec3::ZERO, Vec3::Y), )); }
Multi-Camera Setup
Use order to control rendering order and ClearColorConfig to avoid clearing previous camera output.
#![allow(unused)] fn main() { // Primary camera — renders first, clears to sky blue commands.spawn(( Camera3d::default(), Camera { order: 0, clear_color: ClearColorConfig::Custom(Color::srgb(0.5, 0.7, 1.0)), ..default() }, Transform::from_xyz(0.0, 5.0, 10.0).looking_at(Vec3::ZERO, Vec3::Y), )); // Secondary camera — renders on top, does not clear commands.spawn(( Camera3d::default(), Camera { order: 1, clear_color: ClearColorConfig::None, ..default() }, Transform::from_xyz(10.0, 5.0, 0.0).looking_at(Vec3::ZERO, Vec3::Y), )); }
Viewports
#![allow(unused)] fn main() { use bevy::render::camera::Viewport; commands.spawn(( Camera3d::default(), Camera { viewport: Some(Viewport { physical_position: UVec2::new(0, 0), physical_size: UVec2::new(640, 480), ..default() }), ..default() }, Transform::from_xyz(0.0, 5.0, 10.0).looking_at(Vec3::ZERO, Vec3::Y), )); }
Lighting
Light Types
#![allow(unused)] fn main() { fn setup_lights(mut commands: Commands) { // Directional light (sun-like, infinite distance) commands.spawn(( DirectionalLight { illuminance: 10_000.0, shadows_enabled: true, ..default() }, Transform::default().looking_at(Vec3::new(-1.0, -1.0, -1.0), Vec3::Y), )); // Point light (omni-directional, positioned in space) commands.spawn(( PointLight { color: Color::srgb(1.0, 0.9, 0.8), intensity: 1_000_000.0, // lumens range: 20.0, shadows_enabled: true, ..default() }, Transform::from_xyz(4.0, 8.0, 4.0), )); // Spot light (cone-shaped) commands.spawn(( SpotLight { color: Color::WHITE, intensity: 1_000_000.0, range: 30.0, outer_angle: std::f32::consts::FRAC_PI_4, inner_angle: std::f32::consts::FRAC_PI_6, shadows_enabled: true, ..default() }, Transform::from_xyz(0.0, 10.0, 0.0).looking_at(Vec3::ZERO, Vec3::Y), )); // Ambient light (uniform, no direction or position) commands.insert_resource(AmbientLight { color: Color::WHITE, brightness: 100.0, }); } }
Shadow Configuration
Shadows are enabled per-light with shadows_enabled: true. For directional lights, configure the shadow cascade:
commands.spawn((
    DirectionalLight {
        shadows_enabled: true,
        ..default()
    },
    CascadeShadowConfigBuilder {
        num_cascades: 4,
        maximum_distance: 100.0,
        first_cascade_far_bound: 5.0,
        ..default()
    }
    .build(),
    Transform::default().looking_at(Vec3::new(-1.0, -1.0, -1.0), Vec3::Y),
));
Asset Loading
AssetServer Basics
fn load_assets(asset_server: Res<AssetServer>) {
    // Loads from the `assets/` directory relative to the project root
    let texture: Handle<Image> = asset_server.load("textures/wall.png");
    let font: Handle<Font> = asset_server.load("fonts/FiraSans-Bold.ttf");
    let scene: Handle<Scene> = asset_server.load("models/character.glb#Scene0");
}
GLTF / GLB Models
Use SceneRoot to spawn an entire GLTF scene:
fn load_model(mut commands: Commands, asset_server: Res<AssetServer>) {
    commands.spawn((
        SceneRoot(asset_server.load("models/helmet.glb#Scene0")),
        Transform::from_xyz(0.0, 0.0, 0.0),
    ));
}
For specific meshes or materials from a GLTF file:
// Load a specific named mesh
let mesh: Handle<Mesh> = asset_server.load("models/character.glb#Mesh0/Primitive0");
Asset Load State
fn check_loading(
    asset_server: Res<AssetServer>,
    texture: Res<MyTextureHandle>, // store handle in a resource
) {
    match asset_server.get_load_state(&texture.0) {
        Some(bevy::asset::LoadState::Loaded) => { /* ready to use */ }
        Some(bevy::asset::LoadState::Failed(_)) => { /* handle error */ }
        _ => { /* still loading */ }
    }
}
Text Rendering
2D Text
#![allow(unused)] fn main() { fn setup_text(mut commands: Commands, asset_server: Res<AssetServer>) { let font = asset_server.load("fonts/FiraSans-Bold.ttf"); commands.spawn(( Text2d::new("Hello, Bevy!"), TextFont { font: font.clone(), font_size: 48.0, ..default() }, TextColor(Color::WHITE), TextLayout::new_with_justify(JustifyText::Center), Transform::from_xyz(0.0, 0.0, 10.0), )); } }
Visibility
Visibility Component
Every rendered entity has a Visibility component controlling whether it is drawn:
// Visible — always rendered (overrides parent hidden)
commands.spawn((
    Sprite { image: asset_server.load("icon.png"), ..default() },
    Visibility::Visible,
    Transform::default(),
));

// Hidden — never rendered (children also hidden)
commands.spawn((
    Sprite { image: asset_server.load("icon.png"), ..default() },
    Visibility::Hidden,
    Transform::default(),
));

// Inherited (default) — visible if parent is visible
commands.spawn((
    Sprite { image: asset_server.load("icon.png"), ..default() },
    Visibility::default(), // Inherited
    Transform::default(),
));
Toggle visibility at runtime:
#![allow(unused)] fn main() { fn toggle_visibility(mut query: Query<&mut Visibility, With<MyMarker>>) { for mut vis in &mut query { *vis = match *vis { Visibility::Hidden => Visibility::Visible, _ => Visibility::Hidden, }; } } }
InheritedVisibility
InheritedVisibility is a read-only computed component. It reflects the effective visibility after the entire parent chain has been considered. Use it to check whether an entity is visible in the hierarchy; whether it actually survived frustum culling for a given camera this frame is tracked separately in ViewVisibility.
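A minimal sketch of reading it (the Minimap marker component here is hypothetical):

// Log entities that are still visible after hierarchy propagation
fn log_visible(query: Query<(Entity, &InheritedVisibility), With<Minimap>>) {
    for (entity, inherited) in &query {
        if inherited.get() {
            info!("{entity:?} is visible after hierarchy propagation");
        }
    }
}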
RenderLayers
Use RenderLayers to control which camera sees which entities. Both the camera and the entity must share at least one layer.
use bevy::render::view::RenderLayers;

// Entity on layer 1 only
commands.spawn((
    Mesh3d(meshes.add(Cuboid::new(1.0, 1.0, 1.0))),
    MeshMaterial3d(materials.add(Color::srgb(0.8, 0.2, 0.2))),
    Transform::default(),
    RenderLayers::layer(1),
));

// Camera that sees layers 0 and 1
commands.spawn((
    Camera3d::default(),
    Transform::from_xyz(0.0, 5.0, 10.0).looking_at(Vec3::ZERO, Vec3::Y),
    RenderLayers::from_layers(&[0, 1]),
));
Common Rendering Recipes
Minimal, copy-pasteable examples for frequent Bevy 0.15+ rendering tasks.
1. Animated Sprite Sheet
Load a sprite sheet atlas and cycle through frames with a timer.
use bevy::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins.set(ImagePlugin::default_nearest())) // pixel-art friendly
        .add_systems(Startup, setup)
        .add_systems(Update, animate_sprite)
        .run();
}

#[derive(Component)]
struct AnimationConfig {
    first_frame: usize,
    last_frame: usize,
    timer: Timer,
}

fn setup(
    mut commands: Commands,
    asset_server: Res<AssetServer>,
    mut texture_atlas_layouts: ResMut<Assets<TextureAtlasLayout>>,
) {
    commands.spawn(Camera2d);

    let texture = asset_server.load("characters/player_run.png");
    // 6 frames in a horizontal strip, each 32x32 pixels
    let layout = TextureAtlasLayout::from_grid(UVec2::new(32, 32), 6, 1, None, None);
    let layout_handle = texture_atlas_layouts.add(layout);

    commands.spawn((
        Sprite {
            image: texture,
            texture_atlas: Some(TextureAtlas {
                layout: layout_handle,
                index: 0,
            }),
            ..default()
        },
        Transform::from_scale(Vec3::splat(4.0)), // scale up for visibility
        AnimationConfig {
            first_frame: 0,
            last_frame: 5,
            timer: Timer::from_seconds(0.1, TimerMode::Repeating),
        },
    ));
}

fn animate_sprite(
    time: Res<Time>,
    mut query: Query<(&mut AnimationConfig, &mut Sprite)>,
) {
    for (mut config, mut sprite) in &mut query {
        config.timer.tick(time.delta());
        if config.timer.just_finished() {
            if let Some(atlas) = &mut sprite.texture_atlas {
                atlas.index = if atlas.index >= config.last_frame {
                    config.first_frame
                } else {
                    atlas.index + 1
                };
            }
        }
    }
}
2. 3D Scene with Lighting
Ground plane, a lit object, directional light, and ambient light.
use bevy::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .run();
}

fn setup(
    mut commands: Commands,
    mut meshes: ResMut<Assets<Mesh>>,
    mut materials: ResMut<Assets<StandardMaterial>>,
) {
    // Ground plane
    commands.spawn((
        Mesh3d(meshes.add(Plane3d::default().mesh().size(10.0, 10.0))),
        MeshMaterial3d(materials.add(StandardMaterial {
            base_color: Color::srgb(0.3, 0.5, 0.3),
            perceptual_roughness: 0.9,
            ..default()
        })),
        Transform::default(),
    ));

    // Lit cube
    commands.spawn((
        Mesh3d(meshes.add(Cuboid::new(1.0, 1.0, 1.0))),
        MeshMaterial3d(materials.add(StandardMaterial {
            base_color: Color::srgb(0.8, 0.2, 0.2),
            metallic: 0.3,
            perceptual_roughness: 0.4,
            ..default()
        })),
        Transform::from_xyz(0.0, 0.5, 0.0),
    ));

    // Directional light (sun)
    commands.spawn((
        DirectionalLight {
            illuminance: 15_000.0,
            shadows_enabled: true,
            ..default()
        },
        Transform::default().looking_at(Vec3::new(-1.0, -2.0, -1.5), Vec3::Y),
    ));

    // Ambient fill
    commands.insert_resource(AmbientLight {
        color: Color::srgb(0.6, 0.7, 1.0),
        brightness: 200.0,
    });

    // Camera
    commands.spawn((
        Camera3d::default(),
        Transform::from_xyz(-3.0, 3.0, 5.0).looking_at(Vec3::new(0.0, 0.5, 0.0), Vec3::Y),
    ));
}
3. Split-Screen Two-Camera Setup
Left half shows one camera, right half shows another.
use bevy::{prelude::*, render::camera::Viewport};

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, update_viewports)
        .run();
}

#[derive(Component)]
struct LeftCamera;

#[derive(Component)]
struct RightCamera;

fn setup(
    mut commands: Commands,
    mut meshes: ResMut<Assets<Mesh>>,
    mut materials: ResMut<Assets<StandardMaterial>>,
) {
    // Shared scene content
    commands.spawn((
        Mesh3d(meshes.add(Cuboid::new(1.0, 1.0, 1.0))),
        MeshMaterial3d(materials.add(Color::srgb(0.8, 0.2, 0.2))),
        Transform::from_xyz(0.0, 0.5, 0.0),
    ));
    commands.spawn((
        Mesh3d(meshes.add(Plane3d::default().mesh().size(10.0, 10.0))),
        MeshMaterial3d(materials.add(Color::srgb(0.3, 0.5, 0.3))),
        Transform::default(),
    ));
    commands.spawn((
        DirectionalLight { shadows_enabled: true, ..default() },
        Transform::default().looking_at(Vec3::new(-1.0, -1.0, -1.0), Vec3::Y),
    ));

    // Left camera — renders first, clears background
    commands.spawn((
        Camera3d::default(),
        Camera {
            order: 0,
            clear_color: ClearColorConfig::Custom(Color::srgb(0.1, 0.1, 0.2)),
            ..default()
        },
        Transform::from_xyz(-3.0, 3.0, 5.0).looking_at(Vec3::ZERO, Vec3::Y),
        LeftCamera,
    ));

    // Right camera — renders second, does not clear the left half
    commands.spawn((
        Camera3d::default(),
        Camera {
            order: 1,
            clear_color: ClearColorConfig::None,
            ..default()
        },
        Transform::from_xyz(5.0, 3.0, -3.0).looking_at(Vec3::ZERO, Vec3::Y),
        RightCamera,
    ));
}

fn update_viewports(
    windows: Query<&Window>,
    mut left_camera: Query<&mut Camera, (With<LeftCamera>, Without<RightCamera>)>,
    mut right_camera: Query<&mut Camera, (With<RightCamera>, Without<LeftCamera>)>,
) {
    let Ok(window) = windows.single() else { return };
    let width = window.physical_width();
    let height = window.physical_height();
    let half_width = width / 2;

    if let Ok(mut cam) = left_camera.single_mut() {
        cam.viewport = Some(Viewport {
            physical_position: UVec2::ZERO,
            physical_size: UVec2::new(half_width, height),
            ..default()
        });
    }
    if let Ok(mut cam) = right_camera.single_mut() {
        cam.viewport = Some(Viewport {
            physical_position: UVec2::new(half_width, 0),
            physical_size: UVec2::new(width - half_width, height),
            ..default()
        });
    }
}
4. Loading and Displaying a GLTF Model
Load a .glb file and spawn its scene.
use bevy::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .run();
}

fn setup(mut commands: Commands, asset_server: Res<AssetServer>) {
    // Load the GLTF scene (Scene0 is the first/default scene)
    commands.spawn((
        SceneRoot(asset_server.load("models/FlightHelmet.glb#Scene0")),
        Transform::from_xyz(0.0, 0.0, 0.0).with_scale(Vec3::splat(3.0)),
    ));

    // Lighting
    commands.spawn((
        DirectionalLight {
            illuminance: 20_000.0,
            shadows_enabled: true,
            ..default()
        },
        Transform::default().looking_at(Vec3::new(-1.0, -1.0, -1.0), Vec3::Y),
    ));
    commands.insert_resource(AmbientLight {
        color: Color::WHITE,
        brightness: 300.0,
    });

    // Camera
    commands.spawn((
        Camera3d::default(),
        Transform::from_xyz(0.0, 1.5, 4.0).looking_at(Vec3::new(0.0, 0.8, 0.0), Vec3::Y),
    ));
}
5. Billboard Text That Always Faces the Camera
Spawn Text2d in 3D space and rotate it each frame to face the camera.
use bevy::prelude::*;

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, billboard_face_camera)
        .run();
}

#[derive(Component)]
struct Billboard;

fn setup(
    mut commands: Commands,
    asset_server: Res<AssetServer>,
    mut meshes: ResMut<Assets<Mesh>>,
    mut materials: ResMut<Assets<StandardMaterial>>,
) {
    // A cube to anchor the label to
    let cube = commands.spawn((
        Mesh3d(meshes.add(Cuboid::new(1.0, 1.0, 1.0))),
        MeshMaterial3d(materials.add(Color::srgb(0.2, 0.5, 0.8))),
        Transform::from_xyz(0.0, 0.5, 0.0),
    )).id();

    // Billboard text as a child, floating above the cube
    let font = asset_server.load("fonts/FiraSans-Bold.ttf");
    commands.spawn((
        Text2d::new("Hello!"),
        TextFont { font, font_size: 36.0, ..default() },
        TextColor(Color::WHITE),
        Transform::from_xyz(0.0, 1.2, 0.0).with_scale(Vec3::splat(0.01)), // scale down for 3D space
        Billboard,
    )).set_parent(cube);

    // Lighting and camera
    commands.spawn((
        DirectionalLight { shadows_enabled: true, ..default() },
        Transform::default().looking_at(Vec3::new(-1.0, -1.0, -1.0), Vec3::Y),
    ));
    commands.spawn((
        Camera3d::default(),
        Transform::from_xyz(3.0, 3.0, 3.0).looking_at(Vec3::new(0.0, 0.5, 0.0), Vec3::Y),
    ));
}

fn billboard_face_camera(
    camera_query: Query<&GlobalTransform, With<Camera3d>>,
    mut billboards: Query<&mut Transform, (With<Billboard>, Without<Camera3d>)>,
) {
    let Ok(camera_global) = camera_query.single() else { return };
    let camera_position = camera_global.translation();

    for mut transform in &mut billboards {
        // Compute the direction from the billboard to the camera
        let direction = camera_position - transform.translation;
        if direction.length_squared() > 0.001 {
            transform.look_to(direction, Vec3::Y);
        }
    }
}
6. Render-to-Texture (Camera Rendering to Image Used as Material)
Render a scene from a secondary camera into an image, then apply that image as a texture on a 3D object.
use bevy::{
    prelude::*,
    render::{
        camera::RenderTarget,
        render_resource::{Extent3d, TextureDimension, TextureFormat, TextureUsages},
        view::RenderLayers,
    },
};

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_systems(Startup, setup)
        .add_systems(Update, rotate_cube)
        .run();
}

#[derive(Component)]
struct RotatingCube;

fn setup(
    mut commands: Commands,
    mut meshes: ResMut<Assets<Mesh>>,
    mut materials: ResMut<Assets<StandardMaterial>>,
    mut images: ResMut<Assets<Image>>,
) {
    // Create the render target image
    let size = Extent3d {
        width: 512,
        height: 512,
        depth_or_array_layers: 1,
    };
    let mut render_image = Image::new_fill(
        size,
        TextureDimension::D2,
        &[0, 0, 0, 255],
        TextureFormat::Bgra8UnormSrgb,
        bevy::render::render_asset::RenderAssetUsages::default(),
    );
    render_image.texture_descriptor.usage = TextureUsages::TEXTURE_BINDING
        | TextureUsages::COPY_DST
        | TextureUsages::RENDER_ATTACHMENT;
    let render_image_handle = images.add(render_image);

    // --- Sub-scene rendered by the offscreen camera (layer 1) ---

    // A spinning cube only visible to the offscreen camera
    commands.spawn((
        Mesh3d(meshes.add(Cuboid::new(1.0, 1.0, 1.0))),
        MeshMaterial3d(materials.add(StandardMaterial {
            base_color: Color::srgb(1.0, 0.3, 0.1),
            ..default()
        })),
        Transform::from_xyz(0.0, 0.0, 0.0),
        RenderLayers::layer(1),
        RotatingCube,
    ));

    // Light for the sub-scene
    commands.spawn((
        PointLight {
            intensity: 2_000_000.0,
            shadows_enabled: true,
            ..default()
        },
        Transform::from_xyz(3.0, 4.0, 3.0),
        RenderLayers::layer(1),
    ));

    // Offscreen camera rendering into the image
    commands.spawn((
        Camera3d::default(),
        Camera {
            target: RenderTarget::Image(render_image_handle.clone().into()),
            clear_color: ClearColorConfig::Custom(Color::srgb(0.1, 0.1, 0.15)),
            ..default()
        },
        Transform::from_xyz(0.0, 2.0, 4.0).looking_at(Vec3::ZERO, Vec3::Y),
        RenderLayers::layer(1),
    ));

    // --- Main scene (layer 0) ---

    // A plane that uses the render texture as its material
    commands.spawn((
        Mesh3d(meshes.add(Cuboid::new(3.0, 2.0, 0.1))),
        MeshMaterial3d(materials.add(StandardMaterial {
            base_color_texture: Some(render_image_handle),
            unlit: true,
            ..default()
        })),
        Transform::from_xyz(0.0, 1.0, 0.0),
        RenderLayers::layer(0),
    ));

    // Light for main scene
    commands.spawn((
        PointLight { intensity: 1_000_000.0, ..default() },
        Transform::from_xyz(4.0, 5.0, 4.0),
        RenderLayers::layer(0),
    ));

    // Main camera
    commands.spawn((
        Camera3d::default(),
        Camera {
            order: 1, // render after the offscreen camera
            ..default()
        },
        Transform::from_xyz(0.0, 1.5, 5.0).looking_at(Vec3::new(0.0, 1.0, 0.0), Vec3::Y),
        RenderLayers::layer(0),
    ));
}

fn rotate_cube(time: Res<Time>, mut query: Query<&mut Transform, With<RotatingCube>>) {
    for mut transform in &mut query {
        transform.rotate_y(time.delta_secs() * 1.5);
        transform.rotate_x(time.delta_secs() * 0.7);
    }
}
name: bevy-ui-and-audio description: Use when the user asks about building UI in Bevy, game menus, HUD, health bars, buttons, text display, UI layout, or audio playback, sound effects, music, volume control, or spatial audio in Bevy. version: 1.0.0
Bevy UI & Audio — Game Interfaces & Sound
This skill covers two essential game systems: user interfaces (menus, HUD, buttons) and audio (music, SFX, spatial sound). Both rely on ECS fundamentals — see the bevy-ecs skill for components, systems, queries, and commands.
UI with bevy_ui
Bevy ships a retained-mode UI system built on top of the Taffy layout engine (flexbox and CSS grid). UI elements are entities with Node and related components. They live in the ECS world alongside your game entities.
Core UI Components
| Component | Purpose |
|---|---|
Node | Makes an entity a UI element. Carries all style/layout properties (width, height, flex direction, padding, etc.) |
Text | Renders text. Requires a font Handle<Font> |
Button | Marker that enables Interaction tracking on a Node |
ImageNode | Displays an image inside a UI node |
BackgroundColor | Solid color fill for a node |
BorderColor | Border color (pair with border on Node) |
BorderRadius | Rounded corners |
ZIndex | Override draw order (ZIndex::Local(i32) or ZIndex::Global(i32)) |
Layout Model
bevy_ui uses flexbox by default. Key style properties live directly on Node:
#![allow(unused)] fn main() { commands.spawn(Node { width: Val::Percent(100.0), height: Val::Px(60.0), flex_direction: FlexDirection::Row, justify_content: JustifyContent::SpaceBetween, align_items: AlignItems::Center, padding: UiRect::all(Val::Px(12.0)), column_gap: Val::Px(8.0), ..default() }); }
The Val enum for sizing:
- `Val::Px(f32)` — absolute pixels
- `Val::Percent(f32)` — percentage of parent
- `Val::Auto` — automatic sizing (the default)
- `Val::Vw(f32)` / `Val::Vh(f32)` — viewport-relative
Flexbox properties:
- `flex_direction` — `Row`, `Column`, `RowReverse`, `ColumnReverse`
- `justify_content` — `Start`, `End`, `Center`, `SpaceBetween`, `SpaceAround`, `SpaceEvenly`
- `align_items` — `Start`, `End`, `Center`, `Stretch`, `Baseline`
- `align_self` — override parent's `align_items` for one child
- `flex_wrap` — `NoWrap`, `Wrap`, `WrapReverse`
- `flex_grow`, `flex_shrink`, `flex_basis` — standard flex sizing
- `row_gap`, `column_gap` — gap between children
CSS Grid is also supported:
#![allow(unused)] fn main() { commands.spawn(Node { display: Display::Grid, grid_template_columns: vec![ GridTrack::flex(1.0), GridTrack::px(200.0), GridTrack::flex(2.0), ], grid_template_rows: vec![ GridTrack::auto(), GridTrack::flex(1.0), ], ..default() }); }
Interaction and Buttons
The Interaction component is automatically added to entities with Button. Query it to detect clicks and hover:
#![allow(unused)] fn main() { fn button_system( mut query: Query< (&Interaction, &mut BackgroundColor), (Changed<Interaction>, With<Button>), >, ) { for (interaction, mut bg_color) in &mut query { match *interaction { Interaction::Pressed => { *bg_color = BackgroundColor(Color::srgb(0.35, 0.75, 0.35)); } Interaction::Hovered => { *bg_color = BackgroundColor(Color::srgb(0.25, 0.25, 0.25)); } Interaction::None => { *bg_color = BackgroundColor(Color::srgb(0.15, 0.15, 0.15)); } } } } }
Changed<Interaction> is a query filter — the system still runs every frame, but the query yields only entities whose Interaction value changed since the system last ran, avoiding unnecessary work.
Focus policy: By default, Node entities do not block interactions from reaching nodes behind them. Use FocusPolicy::Block to stop click-through:
#![allow(unused)] fn main() { commands.spawn(( Node { ..default() }, FocusPolicy::Block, )); }
UI Hierarchy — Nested Spawning
UI trees are built with with_children. The parent-child relationship drives layout (children are positioned inside parent nodes):
#![allow(unused)] fn main() { commands .spawn(( Node { width: Val::Percent(100.0), height: Val::Percent(100.0), justify_content: JustifyContent::Center, align_items: AlignItems::Center, ..default() }, BackgroundColor(Color::NONE), )) .with_children(|parent| { parent .spawn(( Button, Node { width: Val::Px(200.0), height: Val::Px(65.0), justify_content: JustifyContent::Center, align_items: AlignItems::Center, border: UiRect::all(Val::Px(2.0)), ..default() }, BorderColor(Color::WHITE), BackgroundColor(Color::srgb(0.15, 0.15, 0.15)), )) .with_children(|parent| { parent.spawn(( Text::new("Play"), TextFont { font_size: 28.0, ..default() }, TextColor(Color::WHITE), )); }); }); }
TargetCamera: To render UI on a specific camera (useful for split-screen or render-to-texture), add TargetCamera(camera_entity) to the root UI node.
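A minimal sketch, assuming the Bevy 0.15 component name TargetCamera (later versions rename it to UiTargetCamera):

fn setup_ui(mut commands: Commands) {
    // The camera this UI tree should render to
    let ui_camera = commands.spawn(Camera2d).id();

    // Root UI node bound to that specific camera
    commands.spawn((
        Node {
            width: Val::Percent(100.0),
            height: Val::Percent(100.0),
            ..default()
        },
        TargetCamera(ui_camera),
    ));
}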
Common UI Patterns
Main menu with navigation:
#![allow(unused)] fn main() { #[derive(States, Debug, Clone, PartialEq, Eq, Hash, Default)] enum MenuState { #[default] Main, Settings, Credits, } #[derive(Component)] enum MenuButton { Play, Settings, Quit, } fn spawn_main_menu(mut commands: Commands) { commands .spawn(( StateScoped(MenuState::Main), Node { width: Val::Percent(100.0), height: Val::Percent(100.0), flex_direction: FlexDirection::Column, justify_content: JustifyContent::Center, align_items: AlignItems::Center, row_gap: Val::Px(16.0), ..default() }, )) .with_children(|parent| { for (label, action) in [ ("Play", MenuButton::Play), ("Settings", MenuButton::Settings), ("Quit", MenuButton::Quit), ] { parent .spawn(( Button, action, Node { width: Val::Px(250.0), height: Val::Px(55.0), justify_content: JustifyContent::Center, align_items: AlignItems::Center, ..default() }, BackgroundColor(Color::srgb(0.2, 0.2, 0.2)), )) .with_children(|btn| { btn.spawn(( Text::new(label), TextFont { font_size: 24.0, ..default() }, TextColor(Color::WHITE), )); }); } }); } fn handle_menu_buttons( query: Query<(&Interaction, &MenuButton), Changed<Interaction>>, mut next_state: ResMut<NextState<MenuState>>, mut exit: EventWriter<AppExit>, ) { for (interaction, button) in &query { if *interaction == Interaction::Pressed { match button { MenuButton::Play => { /* transition to GameState::Playing */ } MenuButton::Settings => next_state.set(MenuState::Settings), MenuButton::Quit => { exit.write(AppExit::Success); } } } } } }
StateScoped despawns the entity (and its children) automatically when leaving that state — no manual cleanup needed.
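Note that state-scoped despawning has to be wired up for the state type. A minimal sketch of the app setup for the MenuState above, using the Bevy 0.15 API names (verify against your version):

App::new()
    .add_plugins(DefaultPlugins)
    .init_state::<MenuState>()
    // Required in Bevy 0.15 for StateScoped(MenuState::...) cleanup to run
    .enable_state_scoped_entities::<MenuState>()
    .add_systems(OnEnter(MenuState::Main), spawn_main_menu)
    .add_systems(Update, handle_menu_buttons)
    .run();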
HUD overlay (health bar + score):
#[derive(Component)]
struct HealthBar;

#[derive(Component)]
struct ScoreText;

fn spawn_hud(mut commands: Commands) {
    // Root container pinned to top of screen
    commands
        .spawn(Node {
            width: Val::Percent(100.0),
            height: Val::Px(40.0),
            justify_content: JustifyContent::SpaceBetween,
            align_items: AlignItems::Center,
            padding: UiRect::horizontal(Val::Px(16.0)),
            ..default()
        })
        .with_children(|parent| {
            // Health bar: background + fill
            parent
                .spawn(Node {
                    width: Val::Px(200.0),
                    height: Val::Px(20.0),
                    ..default()
                })
                .insert(BackgroundColor(Color::srgb(0.3, 0.0, 0.0)))
                .with_children(|bar_bg| {
                    bar_bg.spawn((
                        HealthBar,
                        Node {
                            width: Val::Percent(100.0),
                            height: Val::Percent(100.0),
                            ..default()
                        },
                        BackgroundColor(Color::srgb(0.0, 0.8, 0.0)),
                    ));
                });

            // Score text
            parent.spawn((
                ScoreText,
                Text::new("Score: 0"),
                TextFont { font_size: 20.0, ..default() },
                TextColor(Color::WHITE),
            ));
        });
}

fn update_health_bar(
    player: Query<&Health, With<Player>>,
    mut bar: Query<&mut Node, With<HealthBar>>,
) {
    if let (Ok(health), Ok(mut node)) = (player.single(), bar.single_mut()) {
        node.width = Val::Percent(health.current as f32 / health.max as f32 * 100.0);
    }
}
Loading screen with progress bar:
#![allow(unused)] fn main() { #[derive(Resource, Default)] struct LoadingProgress { loaded: usize, total: usize, } fn update_loading_bar( progress: Res<LoadingProgress>, mut bar: Query<&mut Node, With<LoadingBar>>, ) { if let Ok(mut node) = bar.single_mut() { let pct = if progress.total > 0 { progress.loaded as f32 / progress.total as f32 * 100.0 } else { 0.0 }; node.width = Val::Percent(pct); } } }
bevy_egui — Debug and Editor UI
Use bevy_egui for debug panels, inspector tools, and editor UI. Use bevy_ui for in-game UI that ships to players.
When to choose bevy_egui:
- Rapid prototyping — egui is immediate-mode, faster to iterate
- Debug overlays, entity inspectors, level editors
- You need text input fields, sliders, collapsible panels, drag-and-drop
When to choose bevy_ui:
- Final in-game UI (menus, HUD, dialogue boxes)
- You need pixel-perfect control, custom rendering, animations
- Performance-sensitive UI (bevy_ui is integrated with the render pipeline)
Setup:
# Cargo.toml
[dependencies]
bevy_egui = "0.34" # Match your Bevy version
use bevy_egui::{egui, EguiContexts, EguiPlugin};

app.add_plugins(EguiPlugin);

fn debug_ui(mut contexts: EguiContexts) {
    egui::Window::new("Debug").show(contexts.ctx_mut(), |ui| {
        ui.label("Hello from egui");
        if ui.button("Click me").clicked() {
            // handle click
        }
    });
}
Audio
Bevy's built-in audio supports loading sound files, playing one-shot effects, looping music, volume control, and spatial 3D audio.
Audio Basics
Audio in Bevy works through entities. You load audio files as assets, then spawn entities with an AudioPlayer component to play them:
// Load audio assets (typically in a setup system)
fn setup_audio(mut commands: Commands, asset_server: Res<AssetServer>) {
    let music_handle: Handle<AudioSource> = asset_server.load("audio/background.ogg");
    let sfx_handle: Handle<AudioSource> = asset_server.load("audio/explosion.ogg");

    // Store handles in a resource for later use
    commands.insert_resource(GameAudio {
        music: music_handle,
        explosion: sfx_handle,
    });
}

#[derive(Resource)]
struct GameAudio {
    music: Handle<AudioSource>,
    explosion: Handle<AudioSource>,
}
Supported formats: OGG Vorbis, WAV, FLAC, and MP3, each gated behind a Cargo feature flag. Only OGG Vorbis (the vorbis feature) is enabled by default.
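To enable the optional formats, add the matching Bevy features in Cargo.toml (feature names as of recent Bevy releases — double-check them for your version):

# Cargo.toml
[dependencies]
bevy = { version = "0.15", features = ["mp3", "wav", "flac"] }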
Playing Sounds
Background music (looping):
#![allow(unused)] fn main() { fn start_music(mut commands: Commands, audio: Res<GameAudio>) { commands.spawn(( AudioPlayer(audio.music.clone()), PlaybackSettings::LOOP, )); } }
One-shot SFX:
fn play_explosion(mut commands: Commands, audio: Res<GameAudio>) {
    commands.spawn((
        AudioPlayer(audio.explosion.clone()),
        PlaybackSettings::DESPAWN, // Entity is despawned when playback finishes
    ));
}
PlaybackSettings presets:
| Preset | Behavior |
|---|---|
PlaybackSettings::ONCE | Play once, entity remains after completion |
PlaybackSettings::LOOP | Loop forever |
PlaybackSettings::DESPAWN | Play once, despawn entity on finish |
PlaybackSettings::REMOVE | Play once, remove audio components on finish (entity stays) |
Custom settings:
PlaybackSettings {
    mode: PlaybackMode::Loop,
    volume: Volume::new(0.5),
    speed: 1.2,
    paused: false,
    spatial: false,
    spatial_scale: None,
    ..default()
}
Controlling Playback
Once an audio entity is playing, Bevy adds an AudioSink component to it. Query this to control playback at runtime:
#[derive(Component)]
struct MusicTrack;

// Spawn tagged music
fn start_music(mut commands: Commands, audio: Res<GameAudio>) {
    commands.spawn((
        MusicTrack,
        AudioPlayer(audio.music.clone()),
        PlaybackSettings::LOOP,
    ));
}

// Pause/resume
fn toggle_music(
    query: Query<&AudioSink, With<MusicTrack>>,
    input: Res<ButtonInput<KeyCode>>,
) {
    if input.just_pressed(KeyCode::KeyM) {
        if let Ok(sink) = query.single() {
            sink.toggle(); // pause if playing, resume if paused
        }
    }
}

// Adjust volume
fn set_volume(query: Query<&AudioSink, With<MusicTrack>>) {
    if let Ok(sink) = query.single() {
        sink.set_volume(0.3); // 0.0 = silent, 1.0 = full
    }
}

// Stop and remove
fn stop_music(
    mut commands: Commands,
    query: Query<(Entity, &AudioSink), With<MusicTrack>>,
) {
    if let Ok((entity, sink)) = query.single() {
        sink.stop();
        commands.entity(entity).despawn();
    }
}
AudioSink methods:
- `toggle()` — pause/resume
- `pause()`, `play()` — explicit pause/resume
- `stop()` — stop playback
- `set_volume(f32)` — set volume (0.0 to 1.0+)
- `set_speed(f32)` — set playback speed
- `is_paused() -> bool`
- `empty() -> bool` — true when playback has finished
Spatial Audio
Spatial audio positions sounds in 3D space. Sounds get louder or quieter based on the listener's distance, and pan left/right based on direction.
Setup a spatial listener:
fn setup_spatial(mut commands: Commands) {
    // The listener is typically on the player or camera
    commands.spawn((
        Transform::default(),
        SpatialListener::default(),
        // Usually bundled with your camera or player entity
    ));
}
Spawn a spatial sound source:
#![allow(unused)] fn main() { fn spawn_ambient_sound(mut commands: Commands, asset_server: Res<AssetServer>) { commands.spawn(( AudioPlayer(asset_server.load("audio/campfire.ogg")), PlaybackSettings { mode: PlaybackMode::Loop, spatial: true, ..default() }, Transform::from_xyz(10.0, 0.0, -5.0), )); } }
The sound's Transform position relative to the SpatialListener's position determines volume and panning. As the listener (player/camera) moves closer, the sound gets louder.
Spatial scale: Control how quickly sounds attenuate with distance via the DefaultSpatialScale resource, which wraps a SpatialScale:

app.insert_resource(DefaultSpatialScale(SpatialScale::new(1.0))); // default; smaller = slower falloff
bevy_kira_audio — Advanced Audio
For games that need crossfading, audio channels, streaming, or fine-grained audio control, bevy_kira_audio wraps the Kira audio library:
When to use bevy_kira_audio over built-in audio:
- Crossfading between music tracks
- Named audio channels (music, SFX, ambient, voice) with independent volume
- Audio tweening (fade in/out over duration)
- Streaming large audio files
- Audio instances with per-instance control
Setup:
# Cargo.toml — replace default bevy audio
[dependencies]
bevy = { version = "0.15", default-features = false, features = [
# include your needed features, but NOT bevy_audio
] }
bevy_kira_audio = "0.22" # Match your Bevy version
use bevy_kira_audio::prelude::*;

app.add_plugins(AudioPlugin); // bevy_kira_audio's AudioPlugin

// Play with channels and fading
fn play_music(audio: Res<Audio>, asset_server: Res<AssetServer>) {
    audio
        .play(asset_server.load("audio/music.ogg"))
        .looped()
        .with_volume(0.7)
        .fade_in(AudioTween::linear(Duration::from_secs(2)));
}
Audio channels for independent volume control:
#![allow(unused)] fn main() { #[derive(Resource)] struct MusicChannel; #[derive(Resource)] struct SfxChannel; app.add_audio_channel::<MusicChannel>() .add_audio_channel::<SfxChannel>(); fn adjust_music_volume(channel: Res<AudioChannel<MusicChannel>>) { channel.set_volume(0.5); } }
Choose built-in Bevy audio for simple games and prototypes. Reach for bevy_kira_audio when you need production audio features like crossfading and channel mixing.
name: finish description: Use when the user says they are done, asks to finish a task, wants to verify their work is complete, wants a pre-commit quality check, or asks to validate that changes are ready to ship. Also triggered by phrases like "wrap up", "finalize", "make sure this is done", "are we good?", or "let's finish". version: 1.0.0
Finish — Pre-Completion Verification Workflow
A checklist-driven workflow that verifies work is actually complete before considering a task done. Run through every step in order. Do not skip steps. If a step fails, fix the issue and re-run that step before proceeding.
Step 1: Identify What Changed
Run git diff --stat and git diff to understand the full scope of changes. Run git status to catch untracked files that may need to be included. Build a mental model of every file touched and why.
Step 2: Run Tests
Detect the project's test framework and run the full test suite.
Detection strategy — check in order, use the first match:
| Indicator | Command |
|---|---|
Cargo.toml at root or workspace root | cargo test --workspace |
package.json with a test script | npm test or yarn test |
pyproject.toml / pytest.ini / setup.cfg with pytest | pytest |
go.mod | go test ./... |
justfile / Makefile with a test target | just test / make test |
If no test framework is detected, state this explicitly and skip to Step 3.
If tests fail, fix the failures. Re-run until they pass. Do not proceed with failing tests.
Step 3: Build and Lint
Run the project's build and lint tooling to catch compilation errors and style issues.
| Language | Command |
|---|---|
| Rust | cargo clippy --workspace --all-targets -- -D warnings |
| Node/TS | npm run lint (if script exists), npx tsc --noEmit (if tsconfig.json exists) |
| Python | ruff check . or flake8 (whichever is configured) |
| Go | go vet ./... |
If a justfile or Makefile has a check or lint target, prefer that.
Fix any issues found. Re-run until clean.
Step 4: Invoke /simplify
Run the /simplify skill. This reviews the changed code for reuse opportunities, code quality, and efficiency. Follow its recommendations and apply fixes.
IMPORTANT: Actually invoke /simplify as a slash command. Do not replicate its behavior manually.
Step 5: Diff Review
Perform a thorough review of the final diff (git diff for unstaged, git diff --cached for staged). Check for:
- Correctness: Does the change do what was intended? Are there edge cases?
- Leftovers: Debug prints, TODO comments that should be resolved, commented-out code, hardcoded values that should be configurable.
- Naming: Are new functions, variables, and types named clearly?
- Error handling: Are errors handled, not swallowed? Are error messages useful?
- Security: No secrets, credentials, or API keys in the diff. No injection vectors. No path traversal.
- Completeness: If a new public API was added, is it documented? If behavior changed, are docs updated?
If issues are found, fix them. Re-run Steps 2–3 if the fixes are non-trivial.
Step 6: Summary
Report what was verified:
- Tests pass — name the command run and result
- Build/lint clean — name the command run and result
- /simplify applied — note any changes made
- Diff reviewed — note any issues found and fixed
- No secrets or debug artifacts in the diff
State clearly: "Task verified complete" or "Task has unresolved issues:" followed by what remains.
Related Skills
For qualitative code analysis beyond this checklist, see soft-harness-create and soft-harness-run.
name: soft-harness-create description: Use when the user asks to create a soft harness, set up qualitative tests, measure code quality, track non-functional metrics, create architectural conformance checks, analyze code complexity trends, set up documentation completeness tracking, or wants a quality baseline for a module or project. Also triggered by "soft test", "quality harness", "non-functional test suite", or "quality baseline". version: 1.0.0
Soft Harness — Create Qualitative Test Suite
A soft harness measures non-functional qualities of code: complexity, architectural conformance, documentation completeness, API surface consistency, and duplication patterns. Unlike unit tests, soft harnesses do not assert correctness — they assess quality and track it over time.
Soft harness definitions and results live in .soft-harness/ at the project root. They are intended to be committed to the repository for historical tracking.
Directory Structure
.soft-harness/
├── harness.md # Harness definition: which checks to run, thresholds, scope
├── baseline.md # The accepted baseline (copied from a results file)
└── results/
└── YYYY-MM-DD-HHMMSS.md # Timestamped result snapshots
Step 1: Determine Scope
Ask the user (or infer from context) what the harness should cover:
- Whole project — analyze everything under the project root
- Specific module/directory — analyze a subtree (e.g., `src/domain/`, `crates/app-core/`)
- Specific change — analyze only files changed since a base branch
Step 2: Select Checks
Choose applicable checks from the catalog below. Not all checks apply to every project — select based on the language, project structure, and what the user cares about.
Check Catalog
Complexity
- Function length: Count functions exceeding a line threshold (default: 50). Report the longest functions with file, line number, and name.
- File length: Count files exceeding a line threshold (default: 300). Report the longest files.
- Nesting depth: Identify deeply nested blocks (default: 4 levels). Use brace counting or indentation analysis depending on language; see the sketch after this list.
- Parameter count: Functions with more than N parameters (default: 5).
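A minimal sketch of the brace-counting heuristic, purely illustrative (as the Implementation Notes below explain, the checks are normally performed by reading and analyzing files directly):

/// Report (line_number, depth) pairs where brace nesting exceeds `max_depth`.
/// Deliberately naive: it does not skip braces inside strings or comments.
fn deep_lines(source: &str, max_depth: usize) -> Vec<(usize, usize)> {
    let mut depth = 0usize;
    let mut hits = Vec::new();
    for (i, line) in source.lines().enumerate() {
        for ch in line.chars() {
            match ch {
                '{' => depth += 1,
                '}' => depth = depth.saturating_sub(1),
                _ => {}
            }
        }
        if depth > max_depth {
            hits.push((i + 1, depth)); // 1-based line numbers
        }
    }
    hits
}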
Architectural Conformance
- Dependency direction: Define allowed import/dependency directions (e.g., "domain must not import from infra"). Scan `use`/`import`/`require` statements for violations.
- Layer violations: Define layers and verify dependencies only flow inward. Configurable per project.
- Forbidden imports: Modules or packages that should never be imported in certain scopes (e.g., no `tokio` in a domain crate).
Documentation
- Public API docs: Percentage of public functions/types/modules with doc comments. Language-specific detection (Rust: `///` above `pub`; TS: `/** */` above `export`; Python: docstrings on non-`_` items).
- README presence: Check that key directories have README files.
API Surface
- Public export count: Track the number of public exports. A sudden spike may indicate leaky abstraction. Tracked for trend, not pass/fail.
Duplication Patterns
- Near-duplicate functions: Identify functions with very similar structure (same parameter count, similar length, similar names). Heuristic, not exact.
- Copy-paste indicators: Blocks of code that appear nearly verbatim in multiple locations.
Consistency
- Naming conventions: Check that function/type names follow the project's conventions (snake_case, CamelCase, etc.).
- Error handling patterns: Verify consistent error handling (e.g., all functions in a module use Result, no bare unwrap in library code).
Step 3: Write the Harness Definition
Create .soft-harness/harness.md with this structure:
# Soft Harness Definition
## Scope
- **Type:** project | directory | change
- **Paths:** (for directory scope) src/domain/, src/services/
- **Exclude:** **/test*, **/generated*
## Checks
### Function Length
- **Enabled:** yes
- **Threshold:** 50 lines
- **Severity:** warning
### File Length
- **Enabled:** yes
- **Threshold:** 300 lines
- **Severity:** warning
### Nesting Depth
- **Enabled:** yes
- **Threshold:** 4 levels
- **Severity:** warning
### Parameter Count
- **Enabled:** yes
- **Threshold:** 5
- **Severity:** warning
### Dependency Direction
- **Enabled:** yes
- **Severity:** error
- **Rules:**
- `src/domain/**` must not import from `src/infra/**`, `src/api/**`
- `src/api/**` must not import from `src/infra/**`
### Public API Docs
- **Enabled:** yes
- **Threshold:** 80%
- **Severity:** warning
### Public Export Count
- **Enabled:** yes
- **Severity:** info (track only)
Customize checks based on Steps 1–2. Disable checks that do not apply. Adjust thresholds to the project's current state — the first harness should produce a realistic baseline, not a wall of failures.
Severity levels:
- info — track the metric but do not flag it. Useful for trend data.
- warning — flag in the report but the harness does not "fail".
- error — the harness reports a failure.
Step 4: Run Initial Baseline
After creating the harness definition, invoke the soft-harness-run skill to execute it and produce the initial baseline. The first run's results become baseline.md.
Implementation Notes
All checks are performed by Claude reading and analyzing source files directly — no external tools required. Each check is a pattern of file reading, grepping, and counting that Claude performs when the harness is run.
The harness definition is declarative. The soft-harness-run skill interprets it and performs the actual analysis.
Related Skills
To execute a harness and compare results, see soft-harness-run. For task completion verification, see finish.
Example Soft Harness Definitions
Rust Workspace Example
A harness for a Rust workspace with a domain-driven architecture (core, api, infra crates).
# Soft Harness Definition
## Scope
- **Type:** project
- **Exclude:** **/target/*, **/testutils/**
## Checks
### Function Length
- **Enabled:** yes
- **Threshold:** 50 lines
- **Severity:** warning
### File Length
- **Enabled:** yes
- **Threshold:** 400 lines
- **Severity:** warning
### Nesting Depth
- **Enabled:** yes
- **Threshold:** 4 levels
- **Severity:** warning
### Parameter Count
- **Enabled:** yes
- **Threshold:** 5
- **Severity:** warning
### Dependency Direction
- **Enabled:** yes
- **Severity:** error
- **Rules:**
- `crates/*-core/src/**` must not import from `crates/*-api/**`, `crates/*-infra/**`, `crates/*-bin/**`
- `crates/*-api/src/**` must not import from `crates/*-infra/**`
### Forbidden Imports
- **Enabled:** yes
- **Severity:** error
- **Rules:**
- `crates/*-core/**` must not import `tokio`, `axum`, `diesel`, `sqlx`
- `crates/*-core/**` must not import `std::fs`, `std::net`
### Public API Docs
- **Enabled:** yes
- **Threshold:** 80%
- **Severity:** warning
### Public Export Count
- **Enabled:** yes
- **Severity:** info
Why these choices: Rust workspaces benefit strongly from dependency direction enforcement — it's easy for a domain crate to accidentally pull in infrastructure types. The forbidden imports check reinforces this at the module level. Public API docs are important for library crates that other crates depend on. unwrap detection is left to clippy (unwrap_used lint), so it's not duplicated here.
TypeScript/Node Project Example
A harness for a TypeScript project with a layered architecture (domain, services, api, infrastructure).
# Soft Harness Definition
## Scope
- **Type:** directory
- **Paths:** src/
- **Exclude:** **/*.test.ts, **/*.spec.ts, **/node_modules/*, **/dist/*
## Checks
### Function Length
- **Enabled:** yes
- **Threshold:** 40 lines
- **Severity:** warning
### File Length
- **Enabled:** yes
- **Threshold:** 250 lines
- **Severity:** warning
### Nesting Depth
- **Enabled:** yes
- **Threshold:** 4 levels
- **Severity:** warning
### Parameter Count
- **Enabled:** yes
- **Threshold:** 4
- **Severity:** warning
### Dependency Direction
- **Enabled:** yes
- **Severity:** error
- **Rules:**
- `src/domain/**` must not import from `src/infrastructure/**`, `src/api/**`
- `src/services/**` must not import from `src/api/**`
### Public API Docs
- **Enabled:** no
### Public Export Count
- **Enabled:** yes
- **Severity:** info
### Naming Conventions
- **Enabled:** yes
- **Severity:** warning
- **Rules:**
- Functions and variables: camelCase
- Classes and types: PascalCase
- Files: kebab-case
Why these choices: TypeScript projects tend toward shorter functions and files than Rust. Lower thresholds reflect this. Documentation coverage is disabled because TSDoc adoption varies — enable it if the project uses it consistently. Naming conventions are checked because TypeScript projects often mix conventions across contributors.
name: soft-harness-run description: Use when the user asks to run a soft harness, check code quality metrics, compare against a quality baseline, view quality regressions, run qualitative tests, or evaluate non-functional code properties. Also triggered by "run the harness", "check quality", "how does this compare to baseline", or "quality report". version: 1.0.0
Soft Harness — Run and Report
Executes a soft harness defined in .soft-harness/harness.md, compares results against the baseline, and reports regressions and improvements.
Step 1: Load the Harness
Read .soft-harness/harness.md. If it does not exist, tell the user and suggest creating one with the soft-harness-create skill.
Read .soft-harness/baseline.md if it exists. This is the comparison target. If no baseline exists, this run will establish one.
Step 2: Determine Scope
Based on the Scope section in the harness definition:
- project — analyze all files in the project root, respecting exclude patterns.
- directory — analyze only files under the specified paths, respecting exclude patterns.
- change — analyze only files changed since the merge base with the default branch. Use `git merge-base HEAD master` (or `main`) to find the base, then `git diff --name-only <base>` for the file list.
Build the file list using Glob, filtering by scope and exclude patterns.
Step 3: Execute Each Enabled Check
For each check marked Enabled: yes in the harness definition, perform the analysis. All checks are purely analytical — read files, count patterns, scan for violations.
Check Execution
Function Length
- Identify function definitions using language-appropriate patterns: Rust (`fn`), TypeScript/JavaScript (`function`, arrow functions, methods), Python (`def`), Go (`func`).
- Count lines from opening to closing brace/dedent.
- Report functions exceeding the threshold with file, line, name, and length.
File Length
- Count lines in each file in scope.
- Report files exceeding the threshold.
Nesting Depth
- For brace-delimited languages: count brace nesting depth at each line.
- For indentation-based languages (Python): measure indentation levels directly.
- Report locations exceeding the threshold with file, line, and depth.
Parameter Count
- Parse function signatures to count parameters.
- Report functions exceeding the threshold.
Dependency Direction
- For each rule, scan files matching the `from` pattern for import/use statements.
- Check if any imports match the denied patterns.
- Report violations with file, line, the offending import, and which rule was violated.
Forbidden Imports
- Scan files in scope for imports matching forbidden patterns.
- Report violations.
Public API Docs
- Identify public items (Rust: `pub fn`/`struct`/`enum`/`trait`; TS/JS: `export`; Python: non-`_`-prefixed).
- Check if each has a doc comment directly above it.
- Report percentage documented and list undocumented items.
README Presence
- Check each specified directory for a README.md (case-insensitive).
- Report which directories are missing READMEs.
Public Export Count
- Count public exports in scope.
- Report total count and delta from baseline.
Naming Conventions
- Scan identifiers against expected patterns for the language.
- Report violations.
Near-Duplicate Functions / Copy-Paste Indicators
- Identify functions with very similar structure (parameter count, length, name patterns).
- Look for blocks of code appearing nearly verbatim in multiple locations.
- Report suspected duplicates with locations.
Step 4: Write Results
Write results to .soft-harness/results/YYYY-MM-DD-HHMMSS.md with this structure:
# Soft Harness Results — YYYY-MM-DD HH:MM
## Summary
- **Files analyzed:** 47
- **Checks run:** 6
- **Passed:** 4
- **Warnings:** 1
- **Errors:** 1
## Check Results
### Function Length — WARNING
3 functions exceed 50 lines:
| File | Line | Function | Length |
|------|------|----------|--------|
| src/handlers.rs | 142 | process_request | 78 lines |
| src/parser.rs | 55 | parse_expression | 63 lines |
| src/utils.rs | 20 | validate_input | 52 lines |
### Dependency Direction — ERROR
1 violation found:
| File | Line | Import | Rule Violated |
|------|------|--------|---------------|
| src/domain/user.rs | 3 | `use crate::infra::db` | domain must not import from infra |
### Public API Docs — PASSED
85% documented (threshold: 80%)
### File Length — PASSED
No files exceed 300 lines.
### Nesting Depth — PASSED
No locations exceed 4 levels.
### Public Export Count — INFO
42 public exports (baseline: 38, delta: +4)
Step 5: Compare Against Baseline
If .soft-harness/baseline.md exists, compare the new results:
- Regressions: Any check that was passed and is now warning/error, or any check whose violation count increased.
- Improvements: Fewer violations, higher percentages, or checks that moved from warning/error to passed.
- Unchanged: No significant difference.
Step 6: Report
Output a human-readable summary to the user:
## Soft Harness Results — 2026-03-28 14:30
**6 checks run** | 4 passed | 1 warning | 1 error
### Regressions (vs baseline)
- function_length: 3 violations (was 1) — WARNING
- src/handlers.rs:142 process_request (78 lines)
- src/parser.rs:55 parse_expression (63 lines)
- src/utils.rs:20 validate_input (52 lines)
### Errors
- dependency_direction: domain imports from infra
- src/domain/user.rs:3 — `use crate::infra::db`
### Improvements (vs baseline)
- public_api_docs: 85% (was 72%)
### Unchanged
- file_length: passed
- nesting_depth: passed
- readme_presence: passed
Step 7: Next Steps
Based on results:
- If there are errors, suggest fixing them immediately.
- If there are regressions, highlight which changes likely caused them (cross-reference with `git diff`).
- If the user wants to update the baseline, copy the current results file to `.soft-harness/baseline.md`.
- Suggest committing the results for historical tracking: `git add .soft-harness/results/`.
Related Skills
To create or modify a harness definition, see soft-harness-create. For task completion verification, see finish.
name: dwind-component description: Use when the user asks to create a component, build a UI element, use a dwui widget, references the #[component] macro, or asks about component patterns in the dwind/dominator stack. version: 1.0.0
Dwind Component Patterns
Build reactive, type-safe UI components using the dominator + dwind + futures-signals stack.
The #[component] Macro
Declare a component struct. The macro generates a props builder and an invocation macro.
#![allow(unused)] fn main() { use futures_signals_component_macro::component; #[component(render_fn = my_card)] struct MyCard { #[signal] #[default(None)] content: Option<Dom>, #[signal] #[default("".to_string())] title: String, #[default(Box::new(|_: events::Click| {}))] on_click: dyn Fn(events::Click) -> () + 'static, #[signal] #[default(false)] disabled: bool, } }
This generates:
- `MyCardProps` struct with builder pattern
- `my_card!({ .title("Hello").content(text("Body")) })` macro for ergonomic usage
- Each `#[signal]` field gets both `.prop(value)` and `.prop_signal(signal)` setters
Render Function
pub fn my_card(props: MyCardProps) -> Dom {
    let MyCardProps { content, title, on_click, disabled, apply } = props;

    // Broadcast signals used in multiple places
    let disabled = disabled.broadcast();

    html!("div", {
        .dwclass!("p-4 bg-gray-900 rounded-lg shadow-lg transition-all")
        // Reactive styling
        .style_signal("opacity", disabled.signal().map(|d| if d { "0.5" } else { "1" }))
        .style_signal("pointer-events", disabled.signal().map(|d| if d { "none" } else { "auto" }))
        // Reactive text content
        .child(html!("h3", {
            .dwclass!("text-lg font-bold mb-2 text-white")
            .text_signal(title)
        }))
        // Optional DOM content
        .child_signal(content)
        .event(move |e: events::Click| {
            (on_click)(e);
        })
        // Extension point for consumers
        .apply_if(apply.is_some(), move |b| b.apply(apply.unwrap()))
    })
}
Prop Rules
| Category | Pattern | Example |
|---|---|---|
| Visual state | #[signal] — always reactive | variant, disabled, size |
| Content | #[signal] with Option<Dom> | content, header, label |
| Callbacks | Static Box<dyn Fn(...)> | on_click, on_close, on_submit |
| Values | Trait object or Mutable wrapper | value: dyn InputValueWrapper |
| Extension | Auto-generated by #[component] | apply field (always present) |
Critical Rules
Broadcast signals used in multiple places
A signal can only be consumed once. If you need the same signal in two or more .style_signal() / .dwclass_signal!() / .child_signal() calls, broadcast it first:
let disabled = disabled.broadcast();
// Now call disabled.signal() as many times as needed
Wrap callbacks in Rc for multiple closures
Box<dyn Fn()> is not Clone. If a callback is used in multiple event handlers:
#![allow(unused)] fn main() { let on_close = std::rc::Rc::new(on_close); .event({ let on_close = on_close.clone(); move |_: events::Click| { (on_close)(); } }) .global_event({ let on_close = on_close.clone(); move |e: events::KeyDown| { if e.key() == "Escape" { (on_close)(); } }}) }
Box delegation impl
When a component field uses dyn SomeTrait, the generated code stores it as Box<dyn SomeTrait>. Add a delegation impl:
#![allow(unused)] fn main() { impl<T: ToggleValue + ?Sized> ToggleValue for Box<T> { fn get_signal(&self) -> LocalBoxSignal<'static, bool> { (**self).get_signal() } fn toggle(&self) { (**self).toggle() } } }
Consumer crate macro imports
Consumer crates must import macros explicitly:
#[macro_use]
extern crate dwind_macros; // for dwclass!

#[macro_use]
extern crate my_design_system; // for component macros (my_card!, etc.)
The apply extension point
Every #[component] struct gets an auto-generated apply field. Use apply_if in the render function:
#![allow(unused)] fn main() { .apply_if(apply.is_some(), move |b| b.apply(apply.unwrap())) }
Consumers customize the root element:
#![allow(unused)] fn main() { my_card!({ .title("Custom") .apply(|b| b.dwclass!("border border-blue-500")) }) }
Available Component Library
dwui is the published component library built on the dwind stack. Read references/component-catalog.md for the full catalog with props and usage examples.
Components: Button, Modal, TextInput, Select, Slider, Card, Heading, List
To see the full implementation of any component, read its source file:
/home/mmy/repos/oss/dominator-css-bindgen/crates/dwui/src/components/
Mixins Pattern
Reusable DomBuilder transforms for shared visual effects:
pub fn glass_surface(level: SurfaceLevel) -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    move |b| {
        b.style("background", match level {
            SurfaceLevel::Base => "var(--my-bg)",
            SurfaceLevel::Elevated => "var(--my-bg-elevated)",
        })
        .style("box-shadow", "var(--my-shadow)")
    }
}

// Usage:
html!("div", {
    .apply(glass_surface(SurfaceLevel::Elevated))
    .child(...)
})
Dwind Component Catalog
Complete reference for all available components in dwui.
DWUI Components
Source: /home/mmy/repos/oss/dominator-css-bindgen/crates/dwui/src/components/
Button (button!)
| Prop | Type | Signal | Default |
|---|---|---|---|
content | Option<Dom> | Yes | None |
on_click | dyn Fn(events::Click) | No | no-op |
disabled | bool | Yes | false |
button_type | ButtonType | Yes | ButtonType::Flat |
Variants: ButtonType::Flat, ButtonType::Border
#![allow(unused)] fn main() { button!({ .content(text("Click me")) .on_click(|_| { /* handle */ }) .disabled_signal(is_disabled.signal()) }) }
Modal (modal!)
| Prop | Type | Signal | Default |
|---|---|---|---|
content | Option<Dom> | Yes | None |
open | bool | Yes | false |
on_close | dyn Fn() | No | no-op |
size | ModalSize | Yes | ModalSize::Medium |
close_on_backdrop_click | bool | Yes | true |
Variants: ModalSize::Small, Medium, Large, Full
#![allow(unused)] fn main() { modal!({ .content(text("Modal body")) .open_signal(is_open.signal()) .on_close(|| { is_open.set(false); }) }) }
TextInput (text_input!)
| Prop | Type | Signal | Default |
|---|---|---|---|
value | dyn InputValueWrapper | No | Mutable::new("") |
is_valid | ValidationResult | Yes | Valid |
label | String | Yes | "" |
on_submit | dyn FnMut() | No | no-op |
input_type | TextInputType | Yes | Text |
claim_focus | bool | No | false |
#![allow(unused)] fn main() { text_input!({ .value(Box::new(text_mutable.clone())) .label_signal(always("Username")) .input_type(TextInputType::Text) }) }
Select (select!)
| Prop | Type | Signal | Default |
|---|---|---|---|
value | dyn InputValueWrapper | No | Mutable::new("") |
options | Vec<(String, String)> | SignalVec | [] |
label | String | Yes | "" |
is_valid | ValidationResult | Yes | Valid |
#![allow(unused)] fn main() { select!({ .value(Box::new(selected.clone())) .options_signal_vec(options_signal) .label_signal(always("Choose")) }) }
Slider (slider!)
| Prop | Type | Signal | Default |
|---|---|---|---|
value | dyn InputValueWrapper | No | Mutable::new("") |
min | f32 | Yes | 0.0 |
max | f32 | Yes | 100.0 |
step | f32 | Yes | 1.0 |
label | String | Yes | "" |
#![allow(unused)] fn main() { slider!({ .value(Box::new(val.clone())) .min_signal(always(0.0)) .max_signal(always(100.0)) }) }
Card (card!)
| Prop | Type | Signal | Default |
|---|---|---|---|
content | Dom | Yes | required |
scheme | ColorScheme | Yes | ColorScheme::Void |
Variants: ColorScheme::Primary, Secondary, Void
#![allow(unused)] fn main() { card!({ .content_signal(always(text("Card body"))) .scheme_signal(always(ColorScheme::Primary)) }) }
Heading (heading!)
| Prop | Type | Signal | Default |
|---|---|---|---|
content | Dom | Yes | required |
text_size | TextSize | Yes | TextSize::ExtraLarge |
#![allow(unused)] fn main() { heading!({ .content_signal(always(text("Title"))) .text_size_signal(always(TextSize::Large)) }) }
List (pretty_list!)
| Prop | Type | Signal | Default |
|---|---|---|---|
items | Vec<Dom> | SignalVec | [] |
selected_index | Option<usize> | Yes | None |
item_click_handler | dyn Fn(usize) | No | no-op |
#![allow(unused)] fn main() { pretty_list!({ .items_signal_vec(items_signal) .selected_index_signal(selected.signal()) .item_click_handler(|index| { /* handle */ }) }) }
Common Patterns
Signal Props
All #[signal] props accept both static and reactive values:
- Static: `.prop(value)`
- Reactive: `.prop_signal(signal)`
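For example, with the button! macro from the catalog above (assuming a Mutable<bool> named is_disabled):

// Static value, fixed when the component is created
button!({ .disabled(true) })

// Reactive value, tracks the signal from then on
button!({ .disabled_signal(is_disabled.signal()) })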
Validation
#![allow(unused)] fn main() { pub enum ValidationResult { Valid, Invalid { message: String }, } }
Custom Styling via apply
#![allow(unused)] fn main() { button!({ .content(text("Custom")) .apply(|b| b.dwclass!("border border-blue-500")) }) }
name: dwind-design-system description: Use when the user asks about design tokens, design system architecture, spacing scales, type scales, color systems, semantic tokens, component spacing conventions, vertical rhythm, dark/light theme token mapping, accessibility contrast ratios, or organizing a design system crate in a dwind/dominator context. Also triggers when the user mentions token hierarchy, baseline grid, or design system structure. version: 1.0.0
Dwind Design System — Tokens, Scales & Conventions
A design system is a set of deliberate decisions about tokens, scales, and conventions that create visual consistency across an application. This skill covers the architecture of a design system built on dwind — what to define, how to layer it, and what constraints to enforce.
For utility class mechanics and the dwclass! macro, see dwind-styling. For component patterns and the #[component] macro, see dwind-component. For project scaffolding and build pipeline, see dwind-project-setup.
Design Token Architecture
Structure tokens in three layers. Each layer references the one below it, creating a hierarchy that makes theme switching trivial and naming intentional.
Layer 1 — Primitive tokens: raw values. These map directly to dwind's built-in palette and spacing scale. Name them after what they are.
/* tokens.css — primitives */
.color-blue-500 { --ds-color-blue-500: #3b82f6; }
.color-gray-900 { --ds-color-gray-900: #111827; }
.space-2 { --ds-space-2: 8px; }
.space-4 { --ds-space-4: 16px; }
.text-base { --ds-text-base: 16px; }
Layer 2 — Semantic tokens: purpose-based aliases. Name them after what they do. These are what theme switching swaps.
/* tokens.css — semantic (dark mode default) */
:root {
--ds-color-bg: var(--ds-color-gray-900);
--ds-color-bg-elevated: var(--ds-color-gray-800);
--ds-color-text: var(--ds-color-gray-50);
--ds-color-primary: var(--ds-color-blue-500);
--ds-space-content-gap: var(--ds-space-4);
}
/* Light mode overrides — only remap semantic tokens */
.light {
--ds-color-bg: var(--ds-color-gray-50);
--ds-color-bg-elevated: var(--ds-color-white);
--ds-color-text: var(--ds-color-gray-900);
}
Layer 3 — Component tokens (optional): scoped overrides for a specific component. Use only when a component has internal values that differ from the semantic defaults.
.card-tokens {
--card-padding: var(--ds-space-content-gap);
--card-radius: 12px;
}
Theme switching works by redefining semantic tokens — primitives and component tokens stay untouched. See dwind-styling for the apply_theme_to_root() function that applies CSS variables at runtime.
Spacing System
Dwind's spacing scale uses a 4px base unit: gap-1 = 4px, gap-2 = 8px, gap-4 = 16px. The full scale goes from 0 to 96 (384px). That's too many choices — constrain it.
Define a semantic spacing scale that picks 6-7 values from dwind's range:
| Token | Value | Dwind class | Use for |
|---|---|---|---|
| --ds-space-xs | 4px | gap-1, p-1 | Icon-to-label gaps, tight inline spacing |
| --ds-space-sm | 8px | gap-2, p-2 | Related element spacing, compact padding |
| --ds-space-md | 16px | gap-4, p-4 | Default content gaps, card padding |
| --ds-space-lg | 24px | gap-6, p-6 | Section separation, generous padding |
| --ds-space-xl | 32px | gap-8, p-8 | Major section breaks |
| --ds-space-2xl | 48px | gap-12, p-12 | Page-level margins, hero spacing |
When to use dwclass! vs CSS vars: use dwclass!("gap-4 p-4") for layout — it's shorter and compiles to zero-cost CSS. Use .style("padding", "var(--ds-space-md)") when the value needs to be themeable or referenced by component tokens.
Vertical rhythm: pick a baseline line-height (24px / leading-6 is a good default). Ensure vertical margins and paddings are multiples of this baseline. This creates a predictable visual rhythm — elements align to an invisible grid.
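A minimal sketch of the idea, using only classes from this skill: with a 24px baseline (leading-6), pick gaps that are baseline multiples (gap-6 = 24px, gap-12 = 48px) so blocks stay on the grid:

html!("article", {
    // 24px line-height and 24px gaps: both multiples of the baseline
    .dwclass!("flex flex-col gap-6 leading-6")
    .child(html!("p", { .text("Paragraphs stay on the 24px grid") }))
    .child(html!("p", { .text("because the gaps are baseline multiples") }))
})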
Color System
Build semantic colors on top of dwind's palette. Define these categories:
Backgrounds: --ds-color-bg, --ds-color-bg-elevated, --ds-color-bg-muted, --ds-color-bg-inverted
Text: --ds-color-text, --ds-color-text-muted, --ds-color-text-inverted
Interactive: --ds-color-primary, --ds-color-primary-hover, --ds-color-secondary
Borders: --ds-color-border, --ds-color-border-muted
Status: --ds-color-success, --ds-color-warning, --ds-color-error, --ds-color-info
Dark/Light Mapping
| Semantic token | Dark (default) | Light |
|---|---|---|
| --ds-color-bg | gray-900 | gray-50 |
| --ds-color-bg-elevated | gray-800 | white |
| --ds-color-text | gray-50 | gray-900 |
| --ds-color-text-muted | gray-400 | gray-500 |
| --ds-color-primary | blue-500 | blue-600 |
| --ds-color-border | gray-700 | gray-200 |
| --ds-color-error | red-400 | red-600 |
Note how some semantic tokens map to different shades in each theme — blue-500 has sufficient contrast on dark backgrounds, but blue-600 is needed on light backgrounds. Always verify contrast.
Accessibility Contrast Requirements
- Normal text (text-sm through text-base): 4.5:1 contrast ratio against background (WCAG AA)
- Large text (text-xl and above, or bold text-lg+): 3:1 contrast ratio
- UI components (borders, icons, focus indicators): 3:1 contrast ratio
Test with browser dev tools — inspect an element, check the contrast ratio in the color picker. Design the tokens to pass from the start rather than fixing failures later.
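If you want to verify ratios programmatically (say, in a design-system unit test), the WCAG formula is small enough to inline. A sketch; the hex values are the dark-theme tokens from the table above:

// WCAG 2.x relative luminance and contrast ratio
fn channel(c: u8) -> f64 {
    let c = c as f64 / 255.0;
    if c <= 0.03928 { c / 12.92 } else { ((c + 0.055) / 1.055).powf(2.4) }
}

fn luminance((r, g, b): (u8, u8, u8)) -> f64 {
    0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b)
}

/// Ranges from 1.0 (identical colors) to 21.0 (black on white).
fn contrast(a: (u8, u8, u8), b: (u8, u8, u8)) -> f64 {
    let (la, lb) = (luminance(a), luminance(b));
    (la.max(lb) + 0.05) / (la.min(lb) + 0.05)
}

fn main() {
    // gray-50 text on a gray-900 background: comfortably above 4.5:1
    let ratio = contrast((0xf9, 0xfa, 0xfb), (0x11, 0x18, 0x27));
    assert!(ratio >= 4.5);
}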
Typography System
Map dwind's type scale to semantic roles instead of using raw sizes everywhere:
| Role | Dwind class | Weight | Line height | Use for |
|---|---|---|---|---|
| Heading 1 | text-4xl | font-bold | leading-tight | Page titles |
| Heading 2 | text-2xl | font-bold | leading-tight | Section headers |
| Heading 3 | text-xl | font-semibold | leading-snug | Subsection headers |
| Body | text-base | font-normal | leading-normal | Paragraph text |
| Caption | text-sm | font-normal | leading-normal | Secondary info |
| Label | text-xs | font-medium | leading-normal | Form labels, badges |
Encode these as mixin functions so every heading looks the same:
pub fn heading_text(level: u8) -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    move |b| {
        match level {
            1 => b.dwclass!("text-4xl font-bold leading-tight"),
            2 => b.dwclass!("text-2xl font-bold leading-tight"),
            3 => b.dwclass!("text-xl font-semibold leading-snug"),
            _ => b.dwclass!("text-lg font-semibold leading-snug"),
        }
        .style("color", "var(--ds-color-text)")
    }
}
Responsive headings: scale up on larger screens. dwclass!("text-2xl @md:text-4xl") makes a heading that's text-2xl on mobile and text-4xl on desktop.
Component Spacing Conventions
Rule: components never set their own outer margins. The parent controls spacing.
Why: a card that sets m-b-4 breaks when placed in a flex container with gap-6. Margins create coupling between a component and its context.
Correct pattern: parent uses gap-{n} or space-y-{n}, children have zero margin:
// Page layout — parent controls all spacing
html!("div", {
    .dwclass!("flex flex-col gap-6 p-6")
    .child(header_component())   // no outer margin
    .child(content_card())       // no outer margin
    .child(footer_component())   // no outer margin
})
Internal padding is fine: a card can set p-4 because that's within its own boundary.
Escape hatch: the apply extension point on #[component] structs lets consumers add context-specific styling when needed:
my_card!({
    .title("Settings")
    .apply(|b| b.dwclass!("m-t-2")) // consumer adds margin for this specific context
})
Use this sparingly. If you find yourself adding margins via apply everywhere, the parent layout isn't doing its job.
Design System Crate Structure
Expand on the brief sketch in dwind-project-setup with a full layout:
crates/my-design-system/
├── Cargo.toml
├── build.rs # CSS codegen (tokens.css → tokens.rs)
├── resources/css/
│ └── tokens.css # All design tokens (primitive + semantic)
└── src/
├── lib.rs # Stylesheet init, re-exports
├── tokens_css.rs # Generated — include! from OUT_DIR
├── theme/
│ ├── mod.rs # Theme enum (Dark, Light), apply_theme()
│ └── palettes.rs # Palette definitions for each theme
├── mixins/
│ ├── mod.rs
│ ├── typography.rs # heading_text(), body_text(), caption_text()
│ └── surfaces.rs # card_surface(), elevated_surface()
└── components/
└── mod.rs # Design system components
The lib.rs calls dwind::stylesheet() and injects the design system's generated token stylesheet. Application crates depend on the design system crate — not on dwind directly — to enforce that all styling goes through the token layer.
// lib.rs
#[macro_use]
extern crate dwind_macros;

pub mod theme;
pub mod mixins;

mod tokens_css {
    include!(concat!(env!("OUT_DIR"), "/tokens.rs"));
}

pub fn init_design_system() {
    dwind::stylesheet();
    tokens_css::init_styles();
}
See references/design-system-template.md for a complete, copy-pasteable starter.
Quick Decision Checklist
When starting a design system, decide these up front:
- Baseline unit: 4px (matches dwind's scale)
- Spacing scale: curate 6-7 named sizes (xs through 2xl) from dwind's range
- Primary palette: pick primary, secondary, and accent colors from dwind's palette
- Semantic colors: define bg, text, primary, border, and status tokens for both themes
- Contrast: verify 4.5:1 for body text, 3:1 for large text and UI elements
- Type scale: assign heading/body/caption/label roles from dwind's text-xs through text-9xl
- Component spacing: margin-free components, parent-controlled via gap/space
- Crate boundary: single design-system crate re-exporting tokens, theme, mixins, and components
Design System Starter Template
Copy-pasteable starter files for a dwind design system crate. Customize the color mappings, spacing scale, and typography roles for your project.
Cargo.toml
[package]
name = "my-design-system"
version = "0.1.0"
edition = "2021"
[dependencies]
dwind = { git = "https://github.com/nicksenger/dominator-css-bindgen", features = ["default_colors"] }
dwind-macros = { git = "https://github.com/nicksenger/dominator-css-bindgen" }
dominator = "0.5"
futures-signals = "0.3"
web-sys = { version = "0.3", features = ["HtmlElement", "CssStyleDeclaration", "Document", "Window", "Element"] }
wasm-bindgen = "0.2"
[build-dependencies]
dominator-css-bindgen = { git = "https://github.com/nicksenger/dominator-css-bindgen" }
tokens.css
/* ============================================================
PRIMITIVE TOKENS — raw values, named after what they ARE
============================================================ */
/* Spacing (4px base unit) */
.ds-space-1 { --ds-space-xs: 4px; }
.ds-space-2 { --ds-space-sm: 8px; }
.ds-space-4 { --ds-space-md: 16px; }
.ds-space-6 { --ds-space-lg: 24px; }
.ds-space-8 { --ds-space-xl: 32px; }
.ds-space-12 { --ds-space-2xl: 48px; }
/* Type sizes */
.ds-text-xs { --ds-text-label: 12px; }
.ds-text-sm { --ds-text-caption: 14px; }
.ds-text-base { --ds-text-body: 16px; }
.ds-text-xl { --ds-text-heading3: 20px; }
.ds-text-2xl { --ds-text-heading2: 24px; }
.ds-text-4xl { --ds-text-heading1: 36px; }
/* ============================================================
SEMANTIC TOKENS — purpose-based, named after what they DO
Dark mode is the default (:root)
============================================================ */
/* Backgrounds */
.ds-bg { background-color: var(--ds-color-bg); }
.ds-bg-elevated { background-color: var(--ds-color-bg-elevated); }
.ds-bg-muted { background-color: var(--ds-color-bg-muted); }
/* Text */
.ds-text { color: var(--ds-color-text); }
.ds-text-muted { color: var(--ds-color-text-muted); }
.ds-text-inverted { color: var(--ds-color-text-inverted); }
/* Interactive */
.ds-primary { color: var(--ds-color-primary); }
.ds-primary-hover { color: var(--ds-color-primary-hover); }
.ds-secondary { color: var(--ds-color-secondary); }
/* Borders */
.ds-border { border-color: var(--ds-color-border); }
.ds-border-muted { border-color: var(--ds-color-border-muted); }
/* Status */
.ds-success { color: var(--ds-color-success); }
.ds-warning { color: var(--ds-color-warning); }
.ds-error { color: var(--ds-color-error); }
.ds-info { color: var(--ds-color-info); }
theme/palettes.rs
/// Color values for each theme. Reference dwind's color palette
/// (see dwind-styling references/color-palette.md for hex values).
pub struct Palette {
    pub bg: &'static str,
    pub bg_elevated: &'static str,
    pub bg_muted: &'static str,
    pub text: &'static str,
    pub text_muted: &'static str,
    pub text_inverted: &'static str,
    pub primary: &'static str,
    pub primary_hover: &'static str,
    pub secondary: &'static str,
    pub border: &'static str,
    pub border_muted: &'static str,
    pub success: &'static str,
    pub warning: &'static str,
    pub error: &'static str,
    pub info: &'static str,
}

pub const DARK: Palette = Palette {
    bg: "#111827",            // gray-900
    bg_elevated: "#1f2937",   // gray-800
    bg_muted: "#374151",      // gray-700
    text: "#f9fafb",          // gray-50
    text_muted: "#9ca3af",    // gray-400
    text_inverted: "#111827", // gray-900
    primary: "#3b82f6",       // blue-500
    primary_hover: "#2563eb", // blue-600
    secondary: "#8b5cf6",     // purple-500
    border: "#374151",        // gray-700
    border_muted: "#1f2937",  // gray-800
    success: "#4ade80",       // green-400
    warning: "#fbbf24",       // yellow-400
    error: "#f87171",         // red-400
    info: "#38bdf8",          // blue-400 (picton-blue)
};

pub const LIGHT: Palette = Palette {
    bg: "#f9fafb",            // gray-50
    bg_elevated: "#ffffff",   // white
    bg_muted: "#f3f4f6",      // gray-100
    text: "#111827",          // gray-900
    text_muted: "#6b7280",    // gray-500
    text_inverted: "#f9fafb", // gray-50
    primary: "#2563eb",       // blue-600 (darker for contrast on light bg)
    primary_hover: "#1d4ed8", // blue-700
    secondary: "#7c3aed",     // purple-600
    border: "#e5e7eb",        // gray-200
    border_muted: "#f3f4f6",  // gray-100
    success: "#16a34a",       // green-600
    warning: "#d97706",       // yellow-600
    error: "#dc2626",         // red-600
    info: "#0284c7",          // blue-600
};
theme/mod.rs
pub mod palettes;

use palettes::{Palette, DARK, LIGHT};
use web_sys::wasm_bindgen::JsCast;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Theme {
    Dark,
    Light,
}

impl Theme {
    pub fn palette(&self) -> &'static Palette {
        match self {
            Theme::Dark => &DARK,
            Theme::Light => &LIGHT,
        }
    }
}

/// Apply a theme by setting CSS custom properties on :root.
/// Call this on app init and when the user switches themes.
pub fn apply_theme(theme: Theme) {
    let Some(root) = web_sys::window()
        .and_then(|w| w.document())
        .and_then(|d| d.document_element())
    else {
        return;
    };
    let el: &web_sys::HtmlElement = root.unchecked_ref();
    let style = el.style();
    let p = theme.palette();
    let vars = [
        ("--ds-color-bg", p.bg),
        ("--ds-color-bg-elevated", p.bg_elevated),
        ("--ds-color-bg-muted", p.bg_muted),
        ("--ds-color-text", p.text),
        ("--ds-color-text-muted", p.text_muted),
        ("--ds-color-text-inverted", p.text_inverted),
        ("--ds-color-primary", p.primary),
        ("--ds-color-primary-hover", p.primary_hover),
        ("--ds-color-secondary", p.secondary),
        ("--ds-color-border", p.border),
        ("--ds-color-border-muted", p.border_muted),
        ("--ds-color-success", p.success),
        ("--ds-color-warning", p.warning),
        ("--ds-color-error", p.error),
        ("--ds-color-info", p.info),
    ];
    for (prop, val) in vars {
        let _ = style.set_property(prop, val);
    }
    // Toggle the .light class for is(.light) selectors in dwclass!
    let class_list = root.class_list();
    match theme {
        Theme::Dark => { let _ = class_list.remove_1("light"); }
        Theme::Light => { let _ = class_list.add_1("light"); }
    }
}
mixins/typography.rs
use dominator::DomBuilder;
use web_sys::HtmlElement;

/// Apply heading typography. Levels: 1 (largest) through 3.
pub fn heading_text(level: u8) -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    move |b| {
        let b = match level {
            1 => b.dwclass!("text-4xl font-bold leading-tight"),
            2 => b.dwclass!("text-2xl font-bold leading-tight"),
            3 => b.dwclass!("text-xl font-semibold leading-snug"),
            _ => b.dwclass!("text-lg font-semibold leading-snug"),
        };
        b.style("color", "var(--ds-color-text)")
    }
}

/// Body text — default paragraph styling.
pub fn body_text() -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    |b| {
        b.dwclass!("text-base font-normal leading-normal")
            .style("color", "var(--ds-color-text)")
    }
}

/// Caption text — secondary information, metadata.
pub fn caption_text() -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    |b| {
        b.dwclass!("text-sm font-normal leading-normal")
            .style("color", "var(--ds-color-text-muted)")
    }
}

/// Label text — form labels, badges, small UI text.
pub fn label_text() -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    |b| {
        b.dwclass!("text-xs font-medium leading-normal")
            .style("color", "var(--ds-color-text-muted)")
    }
}
mixins/surfaces.rs
use dominator::DomBuilder;
use web_sys::HtmlElement;

/// Standard card surface with themed background and border.
pub fn card_surface() -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    |b| {
        b.dwclass!("rounded-lg")
            .style("background", "var(--ds-color-bg-elevated)")
            .style("border", "1px solid var(--ds-color-border-muted)")
            .style("padding", "var(--ds-space-md)")
    }
}

/// Elevated surface with shadow for modals, dropdowns, popovers.
pub fn elevated_surface() -> impl FnOnce(DomBuilder<HtmlElement>) -> DomBuilder<HtmlElement> {
    |b| {
        b.dwclass!("rounded-lg shadow-xl")
            .style("background", "var(--ds-color-bg-elevated)")
            .style("border", "1px solid var(--ds-color-border-muted)")
            .style("padding", "var(--ds-space-lg)")
    }
}
build.rs
use dominator_css_bindgen::css::generate_rust_bindings_from_file;
use std::path::PathBuf;

fn main() {
    let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap());
    let css_dir = PathBuf::from("resources/css");
    generate_rust_bindings_from_file(
        &css_dir.join("tokens.css"),
        &out_dir.join("tokens.rs"),
    );
    println!("cargo:rerun-if-changed=resources/css/");
}
lib.rs
#[macro_use]
extern crate dwind_macros;

pub mod theme;
pub mod mixins;

mod tokens_css {
    include!(concat!(env!("OUT_DIR"), "/tokens.rs"));
}

/// Call once at app startup before rendering any components.
pub fn init_design_system() {
    dwind::stylesheet();
    tokens_css::init_styles();
    theme::apply_theme(theme::Theme::Dark);
}
Usage
use my_design_system::{init_design_system, theme, mixins::typography::*};
use dominator::{html, Dom};

fn app() -> Dom {
    init_design_system();
    html!("div", {
        .style("background", "var(--ds-color-bg)")
        .style("min-height", "100vh")
        .dwclass!("flex flex-col gap-6 p-6")
        .child(html!("h1", {
            .apply(heading_text(1))
            .text("Welcome")
        }))
        .child(html!("p", {
            .apply(body_text())
            .text("This uses your design system tokens.")
        }))
    })
}
name: dwind-events description: Use when the user asks about mouse events, keyboard events, click handling, drag interactions, event propagation, stopPropagation, preventDefault, event_with_options, global_event, or handling user input in dominator. Also triggers on "click handler", "mouse event", "keyboard shortcut", "event bubbling", "passive listener", or "pointer events" in a dwind/dominator context. version: 1.0.0
Dwind Events — Mouse, Keyboard, and Event Handling in Dominator
Handle user interactions in dominator applications. Covers mouse events, keyboard shortcuts, event options, propagation, and common pitfalls.
Event Registration
Basic .event() — bubble phase, passive
#![allow(unused)] fn main() { html!("div", { .event(|e: events::Click| { web_sys::console::log_1(&"clicked!".into()); }) }) }
Registers in bubble phase with passive: true (cannot call preventDefault).
.event_with_options() — control phase and preventability
#![allow(unused)] fn main() { html!("div", { .event_with_options( &EventOptions { preventable: true, ..EventOptions::default() }, |e: events::KeyDown| { e.prevent_default(); // only works with preventable: true } ) }) }
EventOptions fields:
- `bubbles: true` (default) → bubble phase listener
- `bubbles: false` → capture phase listener
- `preventable: true` → non-passive, allows `e.prevent_default()`
- `preventable: false` (default) → passive listener; calling `preventDefault` logs a console warning
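For example, a capture-phase listener (a minimal sketch based on the `bubbles: false` option above; useful when a parent must see the event before its children):

html!("div", {
    .event_with_options(
        &EventOptions { bubbles: false, ..EventOptions::default() }, // capture phase
        |_: events::MouseDown| {
            // runs before any child's bubble-phase handlers
        }
    )
})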
.global_event() — listen on window
#![allow(unused)] fn main() { html!("div", { .global_event(|e: events::MouseUp| { // Fires even if mouse is released outside this element }) }) }
Registers on the window object, not the element. Use for:
- Capturing mouseup after a drag started inside the element
- Global keyboard shortcuts
- Detecting clicks outside a popup
Mouse Events
Available types
| Event | Type | Notable methods |
|---|---|---|
events::MouseDown | mousedown | mouse_x(), mouse_y(), button(), shift_key(), ctrl_key() |
events::MouseUp | mouseup | same |
events::MouseMove | mousemove | mouse_x(), mouse_y(), shift_key(), ctrl_key() |
events::Click | click | same |
events::Wheel | wheel | mouse_x(), mouse_y(), delta_x(), delta_y(), delta_z() |
events::ContextMenu | contextmenu | prevent to disable right-click menu |
events::MouseEnter | mouseenter | does not bubble |
events::MouseLeave | mouseleave | does not bubble |
Mouse coordinates
mouse_x() and mouse_y() return client coordinates (viewport-relative, integers).
.event(|e: events::MouseDown| {
    let screen_x = e.mouse_x() as f64;
    let screen_y = e.mouse_y() as f64;
})
For world/canvas coordinates, convert using your viewport transform:
let world_x = (screen_x - pan_x) / zoom;
let world_y = (screen_y - pan_y) / zoom;
Mouse button
button() returns dominator::events::MouseButton:
match e.button() {
    events::MouseButton::Left => { /* primary */ }
    events::MouseButton::Middle => { /* pan */ }
    events::MouseButton::Right => { /* context menu */ }
    _ => {}
}
Modifier keys
Available on all mouse events:
let shift = e.shift_key(); // bool
let ctrl = e.ctrl_key();   // bool — includes meta_key (Cmd on Mac)
Note: ctrl_key() in dominator already includes meta_key() (Cmd on Mac). You do NOT need to check both.
Keyboard Events
Basic keyboard handler
#![allow(unused)] fn main() { html!("div", { .attr("tabindex", "0") // required for div to receive keyboard events .style("outline", "none") .event_with_options( &EventOptions { preventable: true, ..EventOptions::default() }, |e: events::KeyDown| { let key = e.key(); // "a", "Enter", "Escape", "ArrowDown", etc. let ctrl = e.ctrl_key(); let shift = e.shift_key(); let handled = match key.as_str() { "Delete" => { do_delete(); true } "z" | "Z" if ctrl && shift => { do_redo(); true } "z" | "Z" if ctrl => { do_undo(); true } "Escape" => { do_cancel(); true } _ => false, }; if handled { e.prevent_default(); // prevent browser default (Ctrl+Z = browser undo) } } ) }) }
Critical: event_with_options with preventable: true is required. Without it, the listener is passive and prevent_default() throws:
Unable to preventDefault inside passive event listener invocation.
tabindex requirement
HTML <div> elements don't receive keyboard events by default. Add tabindex="0" to make them focusable:
#![allow(unused)] fn main() { .attr("tabindex", "0") .style("outline", "none") // remove focus ring }
Drag Interactions
Pattern: mousedown → global mousemove → global mouseup
#![allow(unused)] fn main() { html!("div", { .event(|e: events::MouseDown| { // Start drag — record initial position start_drag(e.mouse_x(), e.mouse_y()); }) .global_event(|e: events::MouseMove| { // Track drag — fires even outside the element if is_dragging() { update_drag(e.mouse_x(), e.mouse_y()); } }) .global_event(|e: events::MouseUp| { // End drag — fires even if released outside the element if is_dragging() { end_drag(); } }) }) }
Use global_event for mousemove and mouseup so dragging works when the cursor leaves the element.
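A minimal sketch of the state behind those helpers, using a Mutable as described in dwind-reactivity (the `draggable` function and the tuple-based drag state are illustrative, not library API):

use dominator::{events, html, Dom};
use futures_signals::signal::Mutable;

fn draggable() -> Dom {
    // None = idle, Some((x, y)) = dragging from that origin
    let drag: Mutable<Option<(i32, i32)>> = Mutable::new(None);
    html!("div", {
        .event({
            let drag = drag.clone();
            move |e: events::MouseDown| {
                drag.set(Some((e.mouse_x(), e.mouse_y())));
            }
        })
        .global_event({
            let drag = drag.clone();
            move |e: events::MouseMove| {
                if let Some((sx, sy)) = drag.get() {
                    let (dx, dy) = (e.mouse_x() - sx, e.mouse_y() - sy);
                    // apply dx/dy to whatever is being dragged
                    let _ = (dx, dy);
                }
            }
        })
        .global_event({
            let drag = drag.clone();
            move |_: events::MouseUp| { drag.set(None); }
        })
    })
}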
Preventing context menu during right-click drag
.event_with_options(
    &EventOptions { preventable: true, ..EventOptions::default() },
    |e: events::ContextMenu| {
        e.prevent_default();
    }
)
CRITICAL: stopPropagation Does NOT Work Reliably
e.stop_propagation() on a child element does NOT reliably prevent a parent's .event() handler from firing in dominator.
This was confirmed empirically: a child div calling e.stop_propagation() on mousedown did not prevent the parent div's .event(mousedown) handler from executing.
The problem
// PARENT
html!("div", {
    .event(|e: events::MouseDown| {
        close_popup(); // THIS FIRES even when the child calls stopPropagation
    })
    // CHILD
    .child(html!("div", {
        .event(|e: events::MouseDown| {
            e.stop_propagation(); // DOES NOT WORK
            handle_popup_click();
        })
    }))
})
The fix: check event target with el.closest()
Instead of relying on propagation, check whether the click target is inside the child element:
use wasm_bindgen::JsCast;

html!("div", {
    .event(|e: events::MouseDown| {
        // Check whether the click is inside the popup
        if let Some(target) = e.target() {
            if let Ok(el) = target.dyn_into::<web_sys::Element>() {
                if el.closest("[data-my-popup]").ok().flatten().is_some() {
                    return; // click was inside the popup — don't close
                }
            }
        }
        close_popup();
    })
    .child(html!("div", {
        .attr("data-my-popup", "") // marker attribute for the closest() check
        // ... popup content
    }))
})
This pattern works for:
- Popup menus that should close on outside click
- Modal dialogs
- Dropdown menus
- Any "click outside to dismiss" interaction
Why this happens
Dominator uses gloo-events for event registration. The interaction between passive listeners (preventable: false, the default) and propagation stopping may differ from standard addEventListener behavior. The exact cause is in gloo-events internals and may vary by browser.
Rule: never rely on stopPropagation across dominator elements. Always use target checking.
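The target check factors naturally into a small helper (a sketch; `target_within` is our name, not a dominator API):

use wasm_bindgen::JsCast;

/// True if the event target is inside an element matching `selector`.
fn target_within(target: Option<web_sys::EventTarget>, selector: &str) -> bool {
    target
        .and_then(|t| t.dyn_into::<web_sys::Element>().ok())
        .and_then(|el| el.closest(selector).ok().flatten())
        .is_some()
}

// Usage in any "click outside to dismiss" handler:
// if target_within(e.target(), "[data-my-popup]") { return; }
// close_popup();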
Scroll / Wheel Events
.event(|e: events::Wheel| {
    let delta = e.delta_y();    // positive = scroll down
    let screen_x = e.mouse_x(); // cursor position during scroll
    let screen_y = e.mouse_y();

    // Zoom at the cursor position
    let factor = if delta > 0.0 { 1.0 / 1.1 } else { 1.1 };
    zoom_at(screen_x, screen_y, factor);
})
Wheel extends mouse events — it has mouse_x(), mouse_y(), shift_key(), ctrl_key() in addition to delta_x/y/z().
SVG Events
SVG elements (svg!()) receive the same mouse events as HTML elements. But:
foreignObject event interaction
Events inside <foreignObject> (HTML embedded in SVG) may not propagate to SVG parent elements as expected. If you need both SVG-level and HTML-level event handling:
- Use `pointer-events: none` on the foreignObject if it's purely decorative
- Use the `el.closest()` pattern (above) for click-outside detection
- Don't rely on event bubbling across the SVG/HTML boundary
Hit targets on SVG elements
SVG elements with fill="none" don't receive mouse events by default. For invisible hit targets:
#![allow(unused)] fn main() { svg!("circle", { .attr("r", "15") .attr("fill", "transparent") // transparent, not none — receives events .attr("cursor", "pointer") .event(|e: events::MouseDown| { ... }) }) }
fill="transparent" → receives events. fill="none" → does NOT receive events.
Event Types Reference
All event types are in dominator::events:
use dominator::events;

// Mouse
events::MouseDown, events::MouseUp, events::MouseMove,
events::Click, events::DoubleClick,
events::MouseEnter, events::MouseLeave,
events::ContextMenu, events::Wheel,

// Keyboard
events::KeyDown, events::KeyUp,

// Form
events::Input, events::Change, events::Focus, events::Blur,

// Other
events::Resize, events::Load, events::Error,
name: dwind-project-setup description: Use when the user asks to create a new dwind project, set up dwind in an existing project, configure the Rust-to-WASM build pipeline, or asks about dwind project structure, Cargo.toml dependencies, rollup config, or wasm-pack setup. version: 1.0.0
Dwind Project Setup — Scaffolding & Build Config
Set up a new Rust/WASM web application using the dwind stack.
Full-stack template: For a dwind frontend integrated with a RAS backend, see the scaffold-fullstack skill.
Project Structure
my-app/
├── Cargo.toml # Rust dependencies
├── package.json # npm: rollup, wasm-pack tools
├── rollup.config.js # Build: Rust → WASM → JS bundle
├── index.html # Minimal HTML shell
└── src/
├── lib.rs # Entry point
└── components/
└── mod.rs # Component modules
Key Files
Cargo.toml
[package]
name = "my-app"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[dependencies]
dominator = "0.5"
dwind = "0.7"
dwind-macros = "0.7"
futures-signals = "0.3"
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
console_error_panic_hook = "0.1"
Add the component library as needed:
dwui = { git = "https://github.com/user/dwind.git" }
lib.rs — Entry Point
#[macro_use]
extern crate dwind_macros; // Required for dwclass! / dwclass_signal!

use wasm_bindgen::prelude::*;

#[wasm_bindgen(start)]
pub async fn main() {
    console_error_panic_hook::set_once();
    dwind::stylesheet(); // Initialize the base utility stylesheets
    dominator::append_dom(&dominator::body(), app());
}

fn app() -> Dom {
    html!("div", {
        .dwclass!("min-h-screen bg-gray-950 text-white p-8")
        .text("Hello, dwind!")
    })
}
Critical: #[macro_use] extern crate dwind_macros must be at the crate root. Without it, dwclass! is not available.
index.html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>My App</title>
<style>
html, body {
margin: 0; padding: 0; min-height: 100vh;
background: linear-gradient(135deg, #080614 0%, #1a1540 50%, #12101e 100%);
background-attachment: fixed;
}
</style>
</head>
<body></body>
</html>
The background gradient makes glass/transparency effects visible.
package.json
{
"private": true,
"type": "module",
"name": "my-app",
"version": "0.1.0",
"scripts": {
"build": "rimraf dist/js && rollup --config",
"build:release": "rimraf dist/js && RELEASE=true rollup --config",
"start": "[ ! -f Cargo.lock ] && cargo check --target wasm32-unknown-unknown; rimraf dist/js && rollup --config --watch"
},
"devDependencies": {
"@rollup/plugin-terser": "^0",
"@wasm-tool/rollup-plugin-rust": "^3",
"binaryen": "^121",
"rimraf": "^6",
"rollup": "^4",
"rollup-plugin-copy": "^3",
"rollup-plugin-dev": "^2",
"rollup-plugin-livereload": "^2",
"rollup-plugin-terser": "^7"
}
}
For the exact, up-to-date package.json, read the template at:
/home/mmy/repos/oss/dwind-dominator-template/package.json
rollup.config.js
For the exact, up-to-date rollup config, read the template at:
/home/mmy/repos/oss/dwind-dominator-template/rollup.config.js
The key setup: @wasm-tool/rollup-plugin-rust compiles the Rust crate to WASM automatically. No manual wasm-pack commands needed. Dev builds include debug symbols; release uses -Oz + wasm-opt.
Design System Crate (Optional)
For reusable component libraries, create a separate workspace crate:
crates/my-design-system/
├── Cargo.toml
├── build.rs # CSS codegen
├── resources/css/
│ └── tokens.css # Utility classes referencing CSS vars
└── src/
├── lib.rs
├── theme/mod.rs # Theme struct + CSS variable generation
└── components/
└── mod.rs
build.rs for CSS Codegen
use dominator_css_bindgen::css::generate_rust_bindings_from_file;
use std::path::PathBuf;

fn main() {
    let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap());
    let css_dir = PathBuf::from("resources/css");
    generate_rust_bindings_from_file(
        &css_dir.join("tokens.css"),
        &out_dir.join("tokens.rs"),
    );
    println!("cargo:rerun-if-changed=resources/css/");
}
Include the generated module:
pub mod tokens_css {
    include!(concat!(env!("OUT_DIR"), "/tokens.rs"));
}
Build Commands
# Prerequisites
rustup target add wasm32-unknown-unknown
npm install
# Development (hot-reload on port 8080)
npm start
# Production build
npm run build:release
Tauri Desktop App
Dwind apps can also run as native desktop applications using Tauri 2. Instead of Rollup, the frontend is built with Trunk and loaded into Tauri's webview. The backend is a separate Rust crate that communicates with the frontend via IPC commands and events.
For Tauri setup, use the dwind-tauri skill which covers the full project structure, IPC bridge, Tauri configuration, and build toolchain.
Key differences from a web app:
- Trunk replaces Rollup as the WASM bundler
- Frontend and backend are separate crates with isolated workspaces
- `tauri.conf.json` wires Trunk's dev server to Tauri's webview
- `window.__TAURI__` provides IPC, accessed via `wasm_bindgen` inline JS
Template Reference
For the most up-to-date, working project template with all configuration files:
- Web app: /home/mmy/repos/oss/dwind-dominator-template/
- Tauri app: /home/mmy/repos/ai/experiments/karaokemonster/crates/karaoke-app/
Read those files when scaffolding a new project to ensure you have the latest dependency versions and build config.
name: dwind-reactivity description: Use when the user asks about state management, signals, Mutable, reactive updates, conditional rendering, signal composition, child_signal, style_signal, broadcast, map_ref, or encounters signal-related compile errors in a dwind/dominator/futures-signals context. version: 1.0.0
Dwind Reactivity — Signals & State Management
The dwind stack uses futures-signals for fine-grained reactivity. State lives in Mutable<T> values; the DOM subscribes to changes via signals.
Mutable
use futures_signals::signal::Mutable;

let count = Mutable::new(0);
count.set(5);                            // Set value
let val = count.get();                   // Get the current value
let sig = count.signal();                // Get a signal (for primitives implementing Copy)
let sig = count.signal_cloned();         // For non-Copy types (String, Vec, etc.)
let sig = count.signal_ref(|v| v.len()); // Map a reference without cloning
Signal Consumption Rule
A signal can only be consumed once (.map() takes ownership). If you need the same signal in multiple places, use .broadcast():
let disabled = disabled.broadcast();

// Now call .signal() as many times as needed
.style_signal("opacity", disabled.signal().map(|d| if d { "0.5" } else { "1" }))
.attr_signal("disabled", disabled.signal().map(|d| if d { Some("disabled") } else { None }))
.attr_signal("aria-disabled", disabled.signal().map(|d| if d { Some("true") } else { None }))
If you forget .broadcast() and use a signal twice, you get a move error.
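Concretely, this is the compile error the rule protects you from (a minimal sketch):

use futures_signals::signal::{Mutable, SignalExt};

let disabled = Mutable::new(false);
let sig = disabled.signal();
let opacity = sig.map(|d| if d { "0.5" } else { "1" }); // consumes `sig`
// let aria = sig.map(|d| d); // error[E0382]: use of moved value: `sig`

// Fix: broadcast once, then take fresh signals from the broadcaster:
let sig = disabled.signal().broadcast();
let a = sig.signal().map(|d| !d);
let b = sig.signal().map(|d| d);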
DOM Bindings
text_signal — Reactive text
.text_signal(count.signal().map(|n| format!("Count: {}", n)))

child_signal — Conditional DOM (returns Option)

.child_signal(is_open.signal().map(|open| {
    if open {
        Some(html!("div", { .text("Panel content") }))
    } else {
        None
    }
}))

style_signal — Reactive inline styles

.style_signal("opacity", is_visible.signal().map(|v| if v { "1" } else { "0" }))

attr_signal — Reactive attributes (returns Option<&str>)

.attr_signal("disabled", disabled.signal().map(|d| if d { Some("disabled") } else { None }))

visible_signal — Show/hide via CSS display

.visible_signal(is_visible.signal())

dwclass_signal! — Reactive utility classes

.dwclass_signal!("bg-blue-500", is_active.signal())
Combining Signals with map_ref!
When a value depends on multiple signals:
use futures_signals::map_ref;

.style_signal("box-shadow", {
    map_ref! {
        let valid = is_valid.signal(),
        let focused = is_focused.signal() => {
            if !*valid {
                "var(--shadow-error)"
            } else if *focused {
                "var(--shadow-focus)"
            } else {
                "var(--shadow)"
            }
        }
    }
})
SignalExt Combinators
use futures_signals::signal::SignalExt;

signal.map(|v| v + 1)          // Transform
not(bool_signal)               // Negate
and(sig_a, sig_b)              // Logical AND
or(sig_a, sig_b)               // Logical OR
signal.for_each(|v| async { }) // Side effect
signal.boxed_local()           // Type-erase for trait objects
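A small composed example (a sketch; `is_valid` and `is_saving` are illustrative state): enable a submit button only when the form is valid and not already saving:

use futures_signals::signal::{and, not, Mutable, SignalExt};

let is_valid = Mutable::new(false);
let is_saving = Mutable::new(false);

let can_submit = and(is_valid.signal(), not(is_saving.signal()));

// e.g. drive the disabled attribute from the combined signal:
// .attr_signal("disabled", can_submit.map(|ok| if ok { None } else { Some("disabled") }))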
Reactive Lists
use futures_signals::signal_vec::{MutableVec, SignalVecExt};

let items = MutableVec::new();
items.lock_mut().push_cloned("new item".to_string());

html!("ul", {
    .children_signal_vec(items.signal_vec_cloned().map(|item| {
        html!("li", { .text(&item) })
    }))
})
Programmatic Responsive Behavior
use dwind::prelude::media_queries::{breakpoint_active_signal, Breakpoint};

let is_desktop = breakpoint_active_signal(Breakpoint::Medium);

html!("div", {
    .child_signal(is_desktop.map(|desktop| {
        if desktop {
            Some(desktop_nav())
        } else {
            Some(mobile_nav())
        }
    }))
})
Decision Tree: Which Reactive Binding?
- Structural changes (add/remove DOM nodes): `child_signal` / `children_signal_vec`
- Visual changes (colors, opacity, size): `dwclass_signal!` / `style_signal`
- Simple show/hide: `visible_signal` (keeps the DOM alive, toggles `display`)
- Text updates: `text_signal`
- Attribute changes: `attr_signal`
Critical Gotchas
style_signal must NEVER return empty string
Dominator panics in debug builds on empty style values:
// BAD — panics when size isn't Small
.style_signal("border-radius", size.signal().map(|s| match s {
    Size::Small => "4px",
    _ => "", // PANIC!
}))

// GOOD — every branch returns a valid CSS value
.style_signal("border-radius", size.signal().map(|s| match s {
    Size::Small => "4px",
    Size::Medium => "8px",
    Size::Large => "12px",
}))
Vendor-prefixed CSS needs array syntax
// BAD — panics if the browser doesn't support the prefix
.style("-webkit-backdrop-filter", "blur(8px)")

// GOOD — tries each name, succeeds if any works
.style(["backdrop-filter", "-webkit-backdrop-filter"], "blur(8px)")
CSS visibility vs conditional DOM destruction
Prefer CSS visibility over child_signal when content has its own signals:
// PROBLEMATIC — content signal consumed on creation, destroyed on close, can't recreate
.child_signal(open.signal().map(move |is_open| {
    if is_open { Some(panel_with_content_signal) } else { None }
}))

// BETTER — panel always in the DOM, CSS controls visibility
.child(html!("div", {
    .style_signal("opacity", open.signal().map(|o| if o { "1" } else { "0" }))
    .style_signal("pointer-events", open.signal().map(|o| if o { "auto" } else { "none" }))
    .child_signal(content) // consumed once, lives forever
}))
Never use return inside map_ref!
The macro expansion makes return exit the wrong scope:
// BAD — type mismatch with Poll
map_ref! {
    let a = sig => {
        if *a { return "yes"; }
        "no"
    }
}

// GOOD — use an if/else expression
map_ref! {
    let a = sig => {
        if *a { "yes" } else { "no" }
    }
}
Box<dyn Fn()> is not Clone — use Rc
let on_close = std::rc::Rc::new(on_close);

.event({
    let on_close = on_close.clone();
    move |_: events::Click| { (on_close)(); }
})
.global_event({
    let on_close = on_close.clone();
    move |e: events::KeyDown| {
        if e.key() == "Escape" { (on_close)(); }
    }
})
Use explicit style signals for disabled state on <label> elements:
#![allow(unused)] fn main() { .style_signal("opacity", disabled.signal().map(|d| if d { "0.5" } else { "1" })) .style_signal("pointer-events", disabled.signal().map(|d| if d { "none" } else { "auto" })) }
name: dwind-styling description: Use when the user asks about styling, CSS classes, colors, spacing, layout, responsive design, hover/focus states, animations, visual appearance, or theming in a dwind/dominator context. Also triggers when the user mentions dwclass, utility classes, or breakpoints. version: 1.0.0
Dwind Styling — Utility Classes & Visual Design
Dwind provides Tailwind-like utility classes that compile into the WASM binary at build time. All styling uses procedural macros — no runtime CSS parsing.
dwclass! — Basic Usage
Apply utility classes to dominator elements:
#![allow(unused)] fn main() { html!("div", { .dwclass!("flex gap-4 p-4") // Multiple classes in one call .dwclass!("bg-gray-900 text-white") // Chain multiple calls }) }
Important: dwclass! only accepts string literals. No variables or dynamic strings.
Two-parameter form (inside closures)
#![allow(unused)] fn main() { html!("div", { .apply(|b| { match variant { 0 => dwclass!(b, "bg-red-500"), 1 => dwclass!(b, "bg-blue-500"), _ => dwclass!(b, "bg-gray-500"), } }) }) }
dwclass_signal! — Reactive Styling
Toggle classes based on signals:
let is_active = Mutable::new(false);

html!("div", {
    .dwclass!("p-4 rounded")
    .dwclass_signal!("bg-blue-500", is_active.signal())     // Applied when true
    .dwclass_signal!("opacity-50", not(is_active.signal())) // Applied when false
})
dwgenerate! — Custom Reusable Classes
Pre-declare reusable class combinations:
dwgenerate!("btn-primary", "hover:bg-blue-600 active:scale-95");

html!("button", {
    .dwclass!("btn-primary px-4 py-2 bg-blue-500")
})
Arbitrary values:
#![allow(unused)] fn main() { .dwclass!("padding-[20px]") // Custom spacing .dwclass!("bg-[#ff5500]") // Custom color }
Responsive Breakpoints
Mobile-first. Prefix classes with @breakpoint::
| Prefix | Width | Description |
|---|---|---|
| @xs: | < 640px | Default (no prefix needed) |
| @sm: | >= 640px | Small screens |
| @md: | >= 1280px | Medium screens |
| @lg: | >= 1920px | Large screens |
| @xl: | >= 2560px | Extra large |
| @<sm: | < 640px | Less than small |
#![allow(unused)] fn main() { .dwclass!("flex-col @sm:flex-row") // Column mobile, row desktop .dwclass!("gap-2 @md:gap-4 @lg:gap-8") // Increasing gap .dwclass!("@<sm:hidden @sm:block") // Hidden on mobile }
Custom media queries:
#![allow(unused)] fn main() { .dwclass!("@((max-width: 700px)):bg-red-500") }
Pseudo-Classes
#![allow(unused)] fn main() { .dwclass!("hover:bg-blue-600") .dwclass!("focus:ring-2 focus:ring-blue-400") .dwclass!("active:scale-95") .dwclass!("disabled:opacity-50 disabled:cursor-not-allowed") .dwclass!("nth-child(2):bg-gray-800") .dwclass!("nth-child(odd):bg-gray-900") .dwclass!("is(.selected):font-bold") }
Variant Selectors (Child Styling)
Apply styles to child elements with [selector]:class:
#![allow(unused)] fn main() { .dwclass!("[& > *]:p-2") // All direct children .dwclass!("[> span]:text-blue-500") // Direct span children .dwclass!("[& > *]:nth-child(2):bg-red-500") // Second direct child .dwclass!("[& *]:w-full") // All descendants .dwclass!("[& > button]:hover:bg-blue-600") // Direct buttons on hover }
Color Opacity
#![allow(unused)] fn main() { .dwclass!("bg-blue-500/50") // 50% opacity background .dwclass!("text-white/75") // 75% opacity text }
Common Patterns
Centered Container
#![allow(unused)] fn main() { .dwclass!("flex justify-center align-items-center h-full") }
Card Layout
#![allow(unused)] fn main() { .dwclass!("p-4 bg-gray-900 rounded-lg shadow-lg border border-gray-800") }
Responsive Grid
#![allow(unused)] fn main() { .dwclass!("grid grid-cols-1 @sm:grid-cols-2 @md:grid-cols-3 gap-4") }
Button with States
#![allow(unused)] fn main() { .dwclass!("px-4 py-2 bg-blue-500 rounded") .dwclass!("hover:bg-blue-600 active:scale-95") .dwclass!("disabled:opacity-50 disabled:cursor-not-allowed") }
Theme-Aware (Light/Dark)
#![allow(unused)] fn main() { // Parent element has "light" class for light mode .dwclass!("bg-gray-900 is(.light):bg-gray-100") .dwclass!("text-white is(.light):text-gray-900") }
Glass Visual Depth Tips
Flat semi-transparent backgrounds look like colored rectangles. Add depth:
- Bevel highlight: an `inset 0 0.5px 0 0 rgba(255,255,255,0.1)` box-shadow simulates light catching the top edge
- Light gradient overlay: `linear-gradient(to bottom, rgba(255,255,255,0.06), transparent)` on top of the background
- No hard borders: use box-shadow rings (`0 0 0 2px var(--accent-muted)`) instead of `border-color` — shadows are anti-aliased

// Bevel + shadow combo
.style("box-shadow", "inset 0 0.5px 0 0 rgba(255,255,255,0.1), 0 4px 16px rgba(0,0,0,0.15)")

// Light gradient on elevated surfaces
.style("background", "\
    linear-gradient(to bottom, rgba(255,255,255,0.06), transparent 50%), \
    var(--my-bg-elevated)")
Color Palette
Named colors with shades 50–950: blue, green, yellow, orange, red, purple, gray, woodsmoke, bunker, apple, candlelight, picton-blue, charm
Usage: bg-{color}-{shade}, text-{color}-{shade}, border-{color}-{shade}
Read references/color-palette.md for all color values and references/utility-classes.md for the complete class reference.
Gradients
#![allow(unused)] fn main() { .dwclass!("bg-gradient-to-r gradient-from-blue-500 gradient-to-purple-500") .dwclass!("linear-gradient-135 gradient-from-gray-900 gradient-to-gray-800") }
Directions: bg-gradient-to-{t|tr|r|br|b|bl|l|tl}, angles: linear-gradient-{0|45|90|135|180}
Dwind Color Palette
All colors available for bg-{color}-{shade}, text-{color}-{shade}, border-{color}-{shade}, gradient-from-{color}-{shade}, gradient-to-{color}-{shade}.
Opacity modifier: append /{opacity} e.g. bg-blue-500/50 for 50% opacity.
Colors
blue
| Shade | Description |
|---|---|
| 50 | Light blue tint |
| 100-400 | Progressive blue |
| 500 | Primary blue |
| 600-900 | Progressive dark blue |
| 950 | Near-black blue |
green
Standard green scale, 50-950.
yellow
Standard yellow scale, 50-950.
orange
Standard orange scale, 50-950.
red
Standard red scale, 50-950.
purple
Standard purple scale, 50-950.
gray
Neutral gray scale, 50-950. Most commonly used for backgrounds and text.
woodsmoke
Very dark gray with slight warmth. Good for dark mode backgrounds.
- `woodsmoke-950` / `woodsmoke-900` — near-black backgrounds
bunker
Very dark blue-gray. Deep, rich dark backgrounds.
- `bunker-950` — deepest dark background
apple
Green-toned color. Good for success states.
- `apple-500`: `#61BD4CFF`
- `apple-700`: `#317621FF`
candlelight
Yellow/gold-toned. Good for warning states and accents.
picton-blue
Bright, vivid blue. Good for primary actions and links.
charm
Pink/rose-toned. Good for accents and highlights.
Usage Patterns
Dark mode backgrounds
#![allow(unused)] fn main() { .dwclass!("bg-woodsmoke-950") // Darkest .dwclass!("bg-bunker-900") // Very dark .dwclass!("bg-gray-900") // Standard dark .dwclass!("bg-gray-800") // Elevated surface }
Text colors
#![allow(unused)] fn main() { .dwclass!("text-white") // Primary text on dark .dwclass!("text-gray-400") // Secondary text on dark .dwclass!("text-gray-500") // Muted text on dark .dwclass!("text-gray-900") // Primary text on light }
Semantic colors
#![allow(unused)] fn main() { .dwclass!("bg-apple-500") // Success .dwclass!("bg-candlelight-500") // Warning .dwclass!("bg-red-500") // Error/danger .dwclass!("bg-picton-blue-500") // Info/primary }
Gradient examples
#![allow(unused)] fn main() { .dwclass!("bg-gradient-to-r gradient-from-picton-blue-500 gradient-to-purple-500") .dwclass!("bg-gradient-to-b gradient-from-gray-900 gradient-to-bunker-950") }
DWIND Utility Classes Reference
Spacing
Margin

- `m-auto`, `m-x-auto`, `m-y-auto` - Auto margins
- `m-t-{n}`, `m-b-{n}`, `m-l-{n}`, `m-r-{n}` - Individual sides
- `m-x-{n}`, `m-y-{n}` - Horizontal/vertical

Padding

- `p-{n}` - All sides
- `p-t-{n}`, `p-b-{n}`, `p-l-{n}`, `p-r-{n}` - Individual sides
- `px-{n}`, `py-{n}` - Horizontal/vertical

Gap (Flex/Grid)

- `gap-{n}` - All directions
- `space-x-{n}`, `space-y-{n}` - Between children

Values: 0, 0-5, 1, 1-5, 2, 2-5, 3, 3-5, 4, 5, 6, 8, 10, 12, 16, 20, 24, 32, 40, 48, 64, 80, 96
Typography
Font Family

- `font-sans` - System sans-serif
- `font-serif` - System serif
- `font-mono` - Monospace

Font Weight

- `font-thin` (100), `font-extralight` (200), `font-light` (300)
- `font-normal` (400), `font-medium` (500), `font-semibold` (600)
- `font-bold` (700), `font-extrabold` (800), `font-black` (900)

Font Size

- `text-xs` (12px), `text-sm` (14px), `text-base` (16px), `text-lg` (18px)
- `text-xl` (20px), `text-2xl` (24px), `text-3xl` (30px), `text-4xl` (36px)
- `text-5xl` (48px), `text-6xl` (60px), `text-7xl` (72px), `text-8xl` (96px), `text-9xl` (128px)

Text Alignment

- `text-left`, `text-center`, `text-right`

Line Height

- `leading-3` to `leading-10` - Fixed values
- `leading-none`, `leading-tight`, `leading-snug`, `leading-normal`, `leading-relaxed`, `leading-loose`

Text Overflow

- `truncate` - Ellipsis with nowrap
- `text-ellipsis`, `text-clip`
Colors
Background

- `bg-black`, `bg-white`, `bg-transparent`
- `bg-{color}-{shade}` - e.g., `bg-blue-500`, `bg-gray-900`

Text

- `text-black`, `text-white`, `text-transparent`
- `text-{color}-{shade}` - e.g., `text-blue-500`

Border

- `border-black`, `border-white`, `border-transparent`
- `border-{color}-{shade}` - e.g., `border-gray-700`

Gradients

- `linear-gradient-{0|45|90|135|180}` - Angle directions
- `bg-gradient-to-{t|tr|r|br|b|bl|l|tl}` - Named directions
- `gradient-from-{color}-{shade}`, `gradient-to-{color}-{shade}`

Color palette: blue, green, yellow, orange, red, purple, gray, woodsmoke, bunker, apple, candlelight, picton-blue, charm. Shades: 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 950.
Layout
Display

- `block`, `inline-block`, `inline`, `hidden`
- `flex`, `inline-flex`, `grid`, `inline-grid`
- `table`, `table-row`, `table-cell`
- `contents`, `flow-root`

Flexbox

- `flex-row`, `flex-col`, `flex-row-reverse`, `flex-col-reverse`
- `flex-wrap`, `flex-nowrap`, `flex-wrap-reverse`
- `flex-1`, `flex-auto`, `flex-initial`, `flex-none`
- `grow`, `grow-0`, `shrink`, `shrink-0`

Justify Content

- `justify-start`, `justify-center`, `justify-end`
- `justify-between`, `justify-around`, `justify-evenly`, `justify-stretch`

Align Items

- `align-items-start`, `align-items-center`, `align-items-end`
- `align-items-baseline`, `align-items-stretch`
- Also: `items-start`, `items-center`, `items-end`, `items-baseline`, `items-stretch`

Align Self

- `self-auto`, `self-start`, `self-center`, `self-end`, `self-stretch`

Grid

- `grid-cols-{1-12}`, `grid-cols-none`, `grid-cols-subgrid`
- `col-span-{1-12}`, `col-span-full`
- `row-span-{1-12}`, `row-span-full`
- `grid-flow-row`, `grid-flow-col`, `grid-flow-dense`

Position

- `relative`, `absolute`, `fixed`, `sticky`
- `top-0`, `right-0`, `bottom-0`, `left-0` (positioning values)

Z-Index

- `z-0`, `z-10`, `z-20`, `z-30`, `z-40`, `z-50`, `z-auto`

Order

- `order-{1-12}`, `order-first`, `order-last`, `order-none`
Sizing
Width

- `w-full`, `w-auto`
- `w-{n}` - Fixed sizes
- `w-p-{n}` - Percentage (e.g., `w-p-50` = 50%)
- `max-w-{xs|sm|md|lg|xl|2xl}` - Max widths

Height

- `h-full`, `h-auto`, `h-screen`
- `h-{n}` - Fixed sizes
- `max-h-{n}`, `min-h-{n}`

Aspect Ratio

- `aspect-auto`, `aspect-square`, `aspect-video`
Borders
Border Width

- `border` - 1px all sides
- `border-{t|r|b|l}-{n}` - Individual sides

Border Style

- `border-solid`, `border-dashed`, `border-dotted`, `border-double`, `border-none`

Border Radius

- `rounded-none`, `rounded-sm`, `rounded`, `rounded-md`, `rounded-lg`
- `rounded-xl`, `rounded-2xl`, `rounded-3xl`, `rounded-full`
- `rounded-{t|r|b|l}-{size}` - By side
- `rounded-{tl|tr|br|bl}-{size}` - By corner

Divide (between children)

- `divide-x`, `divide-y` - Add borders between children
- `divide-{color}-{shade}` - Divide color
Effects
Box Shadow

- `shadow-sm`, `shadow`, `shadow-md`, `shadow-lg`, `shadow-xl`, `shadow-2xl`
- `shadow-inner`, `shadow-none`

Ring (outline)

- `ring-0`, `ring-1`, `ring-2`, `ring`, `ring-4`, `ring-8`
- `ring-{color}-{shade}` - Ring color
- `ring-inset`

Opacity

- `opacity-{0|5|10|20|25|30|50|60|70|75|80|90|95|100}`
Interactivity
Cursor

- `cursor-auto`, `cursor-default`, `cursor-pointer`, `cursor-wait`
- `cursor-text`, `cursor-move`, `cursor-not-allowed`, `cursor-grab`, `cursor-grabbing`
- `cursor-col-resize`, `cursor-row-resize`

Pointer Events

- `pointer-events-none`, `pointer-events-auto`

User Select

- `select-none`, `select-text`, `select-all`, `select-auto`

Overflow

- `overflow-auto`, `overflow-hidden`, `overflow-scroll`, `overflow-visible`
- `overflow-x-{auto|hidden|scroll|visible}`
- `overflow-y-{auto|hidden|scroll|visible}`
Animations
- `animate-spin` - Continuous rotation
- `animate-ping` - Pulsing outward
- `animate-pulse` - Opacity fade
- `animate-bounce` - Vertical bounce

Transitions

- `transition` - Default transition
- `transition-all`, `transition-colors`, `transition-opacity`
- `duration-{75|100|150|200|300|500|700|1000}`
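For example, a smooth hover color change combining classes from this reference (a minimal sketch):

html!("button", {
    .dwclass!("px-4 py-2 rounded bg-blue-500 hover:bg-blue-600") // color change on hover
    .dwclass!("transition-colors duration-200")                  // animate it over 200ms
    .text("Save")
})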
name: dwind-tauri description: Use when the user asks to build a Tauri desktop application with a dwind/dominator frontend, set up Tauri with Rust WASM UI, create Tauri commands or IPC, handle Tauri events from dwind, configure tauri.conf.json, or asks about Tauri + dwind project structure. version: 1.0.0
Tauri + Dwind Desktop App
Build native desktop applications using Tauri 2 for the backend and dwind/dominator for the WASM frontend. The frontend compiles to WebAssembly and runs in Tauri's webview, communicating with a native Rust backend via IPC.
Project Structure
my-app/
├── Cargo.toml # Frontend (cdylib, WASM)
├── Trunk.toml # WASM bundler config
├── public/
│ └── index.html # HTML shell for Trunk
├── src/
│ ├── lib.rs # WASM entry point (dwind + dominator)
│ ├── tauri_ipc.rs # IPC bridge to Tauri backend
│ ├── state.rs # Frontend reactive state (Mutable<T>)
│ └── components/ # UI components
└── src-tauri/ # Tauri backend (separate crate)
├── Cargo.toml
├── tauri.conf.json # Tauri configuration
├── build.rs # tauri_build::build()
├── capabilities/
│ └── default.json # Permission scoping
└── src/
├── main.rs # Tauri app builder
├── commands.rs # IPC command handlers
└── state.rs # Backend state
Critical: The frontend and backend are separate crates with separate workspaces. The frontend compiles to wasm32-unknown-unknown; the backend compiles to the native target.
Workspace Isolation
The frontend crate must be its own workspace (or excluded from the parent) because dwind path dependencies resolve against the dwind workspace. The backend joins the parent workspace normally.
# Parent workspace Cargo.toml
[workspace]
members = ["crates/my-app/src-tauri"]
exclude = ["crates/my-app"] # Frontend excluded from parent workspace
# Frontend Cargo.toml
[workspace]
exclude = ["src-tauri"] # Backend excluded from frontend workspace
Frontend Setup
Cargo.toml
[package]
name = "my-app"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["cdylib"]
[workspace]
exclude = ["src-tauri"]
[dependencies]
dominator = "0.5"
dwind = "0.7"
dwind-macros = "0.7"
futures-signals = "0.3"
futures-signals-component-macro = { version = "0.4", features = ["dominator"] }
wasm-bindgen = "0.2"
wasm-bindgen-futures = "0.4"
web-sys = { version = "0.3", features = ["Window", "console"] }
js-sys = "0.3"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde-wasm-bindgen = "0.6"
log = "0.4"
wasm-log = "0.3"
Trunk.toml
[build]
target = "public/index.html"
[watch]
ignore = ["./src-tauri"]
[serve]
port = 1420
ws_protocol = "ws"
public/index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link data-trunk rel="rust" href="../Cargo.toml">
<title>My App</title>
<style>
html, body {
margin: 0; padding: 0; min-height: 100vh;
background: linear-gradient(135deg, #080614 0%, #1a1540 50%, #12101e 100%);
background-attachment: fixed;
}
</style>
</head>
<body></body>
</html>
The <link data-trunk rel="rust"> directive tells Trunk to compile the Rust crate to WASM.
lib.rs — WASM Entry Point
#[macro_use]
extern crate dwind_macros;

use wasm_bindgen::prelude::*;
use std::rc::Rc;

mod tauri_ipc;
mod state;
mod components;

#[wasm_bindgen(start)]
pub async fn main() {
    wasm_log::init(wasm_log::Config::default());
    dwind::stylesheet();

    let state = Rc::new(state::AppState::new());

    // Wire up Tauri event listeners
    setup_event_listeners(state.clone());

    // Fetch initial data from the backend
    {
        let state = state.clone();
        wasm_bindgen_futures::spawn_local(async move {
            // Call Tauri commands to populate initial state
            if let Ok(data) = tauri_ipc::get_initial_data().await {
                state.data.set(Some(data));
            }
        });
    }

    dominator::append_dom(&dominator::body(), components::app(state));
}

fn setup_event_listeners(state: Rc<state::AppState>) {
    tauri_ipc::listen::<String>("backend-event", move |payload| {
        // Update reactive state — the UI updates automatically
        log::info!("Received: {}", payload);
    });
}
Tauri IPC Bridge
The IPC bridge connects the dwind frontend to the Tauri backend. It uses wasm_bindgen inline JS to access window.__TAURI__.
tauri_ipc.rs
use serde::de::DeserializeOwned;
use wasm_bindgen::prelude::*;

// Raw JS bindings to the Tauri global API
#[wasm_bindgen(inline_js = r#"
export async function tauri_invoke(cmd, args) {
    return await window.__TAURI__.core.invoke(cmd, args || {});
}
export async function tauri_listen(event, callback) {
    return await window.__TAURI__.event.listen(event, callback);
}
export function tauri_convert_file_src(path) {
    return window.__TAURI__.core.convertFileSrc(path);
}
"#)]
extern "C" {
    async fn tauri_invoke(cmd: &str, args: JsValue) -> Result<JsValue, JsValue>;
    async fn tauri_listen(event: &str, callback: &Closure<dyn Fn(JsValue)>) -> Result<JsValue, JsValue>;
    fn tauri_convert_file_src(path: &str) -> String;
}

// Generic typed invoke — serializes args, deserializes the result
async fn invoke<T: DeserializeOwned>(cmd: &str, args: JsValue) -> Result<T, String> {
    let result = tauri_invoke(cmd, args)
        .await
        .map_err(|e| format!("{:?}", e))?;
    serde_wasm_bindgen::from_value(result).map_err(|e| e.to_string())
}

async fn invoke_unit(cmd: &str, args: JsValue) -> Result<(), String> {
    tauri_invoke(cmd, args)
        .await
        .map_err(|e| format!("{:?}", e))?;
    Ok(())
}

// Event listener — deserializes the Tauri event payload
#[derive(serde::Deserialize)]
struct EventWrapper<T> {
    payload: T,
}

pub fn listen<T: DeserializeOwned + 'static>(
    event: &str,
    mut callback: impl FnMut(T) + 'static,
) {
    let event = event.to_string();
    wasm_bindgen_futures::spawn_local(async move {
        let closure = Closure::new(move |val: JsValue| {
            match serde_wasm_bindgen::from_value::<EventWrapper<T>>(val) {
                Ok(wrapper) => callback(wrapper.payload),
                Err(e) => log::error!("Event parse error: {}", e),
            }
        });
        let _ = tauri_listen(&event, &closure).await;
        closure.forget(); // Must stay alive for the app lifetime
    });
}

// Convert a filesystem path to an asset:// URL for the webview
pub fn convert_file_src(path: &str) -> String {
    tauri_convert_file_src(path)
}

// --- Typed command wrappers ---

pub async fn get_initial_data() -> Result<MyData, String> {
    invoke("get_initial_data", JsValue::NULL).await
}

pub async fn save_item(name: &str, value: &str) -> Result<(), String> {
    let args = serde_wasm_bindgen::to_value(&serde_json::json!({
        "name": name,
        "value": value,
    })).map_err(|e| e.to_string())?;
    invoke_unit("save_item", args).await
}
Important: Tauri command argument names must be camelCase in the JSON (Tauri deserializes them that way), even though the Rust backend uses snake_case.
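For example, for a command with a two-word argument (a sketch; the `load_file` command and `file_path` name are illustrative):

// Backend — snake_case in Rust
#[tauri::command]
pub fn load_file(file_path: String) -> Result<String, String> {
    std::fs::read_to_string(&file_path).map_err(|e| e.to_string())
}

// Frontend — camelCase in the invoke args JSON
let args = serde_wasm_bindgen::to_value(&serde_json::json!({
    "filePath": "/tmp/notes.txt", // NOT "file_path"
})).map_err(|e| e.to_string())?;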
Backend Setup
src-tauri/Cargo.toml
[package]
name = "my-app-tauri"
version = "0.1.0"
edition = "2021"
[build-dependencies]
tauri-build = { version = "2" }
[dependencies]
tauri = { version = "2", features = [] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
Add features/plugins as needed:
- `tauri = { features = ["protocol-asset"] }` — serve files via the `asset://` protocol
- `tauri-plugin-dialog = "2"` — native file/folder dialogs
- `tauri-plugin-shell = "2"` — open URLs in the browser
src-tauri/build.rs
fn main() { tauri_build::build(); }
src-tauri/tauri.conf.json
{
"$schema": "https://schema.tauri.app/config/2",
"productName": "My App",
"version": "0.1.0",
"identifier": "com.myapp.dev",
"build": {
"beforeDevCommand": "trunk serve --port 1420",
"devUrl": "http://localhost:1420",
"beforeBuildCommand": "trunk build",
"frontendDist": "../dist"
},
"app": {
"withGlobalTauri": true,
"windows": [
{
"title": "My App",
"width": 1200,
"height": 800,
"resizable": true,
"minWidth": 800,
"minHeight": 600
}
],
"security": {
"csp": null
}
},
"bundle": {
"active": true,
"targets": "all",
"icon": ["icons/32x32.png", "icons/128x128.png", "icons/icon.png"]
}
}
Key settings:
- withGlobalTauri: true — injects window.__TAURI__ so WASM can call it
- beforeDevCommand starts Trunk on port 1420
- frontendDist: "../dist" points to Trunk's output for production builds
src-tauri/capabilities/default.json
{
"identifier": "default",
"description": "Default capabilities",
"windows": ["main"],
"permissions": [
"core:default"
]
}
Add permissions as needed: "dialog:default", "dialog:allow-open", "shell:allow-open", etc.
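For example, a capabilities file for an app that uses the dialog and shell plugins might look like this (permission names as listed above):

{
  "identifier": "default",
  "description": "Default capabilities",
  "windows": ["main"],
  "permissions": [
    "core:default",
    "dialog:default",
    "dialog:allow-open",
    "shell:allow-open"
  ]
}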
src-tauri/src/main.rs
#![cfg_attr(not(debug_assertions), windows_subsystem = "windows")]

mod commands;
mod state;

use std::sync::Mutex;

fn main() {
    tauri::Builder::default()
        .manage(state::AppState {
            data: Mutex::new(None),
        })
        .invoke_handler(tauri::generate_handler![
            commands::get_initial_data,
            commands::save_item,
        ])
        .run(tauri::generate_context!())
        .expect("error while running tauri application");
}
src-tauri/src/commands.rs
use tauri::{AppHandle, Emitter, State}; // Emitter is needed for app.emit in Tauri 2

use crate::state::AppState;

#[tauri::command]
pub fn get_initial_data(state: State<AppState>) -> Result<MyData, String> {
    // Access managed state, return data
    Ok(MyData { /* ... */ })
}

#[tauri::command]
pub async fn save_item(
    app: AppHandle,
    name: String,
    value: String,
) -> Result<(), String> {
    // Do work, optionally emit events for progress
    app.emit("save-progress", 50).map_err(|e| e.to_string())?;
    Ok(())
}
Command rules:
- Return Result<T, String> for error handling
- Use State<T> to access managed state
- Use AppHandle for emitting events or accessing app resources
- Use tauri::async_runtime::spawn_blocking() for CPU-heavy work (see the sketch below)
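A sketch of offloading CPU-bound work from an async command — heavy_transform is a hypothetical stand-in for the real work:

fn heavy_transform(path: &str) -> String {
    // placeholder for real CPU-bound work
    path.to_uppercase()
}

#[tauri::command]
pub async fn process_file(path: String) -> Result<String, String> {
    // Runs on a blocking thread so the async runtime stays responsive
    tauri::async_runtime::spawn_blocking(move || heavy_transform(&path))
        .await
        .map_err(|e| e.to_string()) // join error
}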
Key Patterns
Pattern: Frontend calls backend command
// Frontend (WASM)
let result = tauri_ipc::save_item("key", "value").await;

// Backend (native)
#[tauri::command]
pub async fn save_item(name: String, value: String) -> Result<(), String> { ... }
Pattern: Backend streams events to frontend
// Backend — emit during long operation
app.emit("processing-progress", ProgressPayload { percent: 50 })?;

// Frontend — listen and update reactive state
tauri_ipc::listen::<ProgressPayload>("processing-progress", move |p| {
    progress.set(p.percent); // UI updates automatically
});
Pattern: Serve files via asset protocol
// Backend — enable in tauri.conf.json: features = ["protocol-asset"]
// and security.assetProtocol.enable = true, scope = ["*/**"]

// Frontend — convert path to asset:// URL
let url = tauri_ipc::convert_file_src("/path/to/file.png");
// url = "asset://localhost/path/to/file.png"
// Use in img src, audio src, fetch(), etc.
Pattern: State on both sides
// Backend: thread-safe with Mutex (accessed from multiple commands)
pub struct AppState {
    pub data: Mutex<Option<MyData>>,
}

// Frontend: reactive with Mutable (drives UI updates)
pub struct AppState {
    pub data: Mutable<Option<MyData>>,
    pub loading: Mutable<bool>,
}
Development
# Prerequisites
rustup target add wasm32-unknown-unknown
cargo install trunk
cargo install tauri-cli # or: cargo install create-tauri-app
# Development (starts Trunk + Tauri together)
cd src-tauri && cargo tauri dev
# Production build
cd src-tauri && cargo tauri build
cargo tauri dev automatically runs beforeDevCommand (Trunk) and opens the app window. Hot-reload works for frontend changes.
Reference App
For a complete working example of Tauri + dwind/dominator with IPC, events, file access, and audio playback:
/home/mmy/repos/ai/experiments/karaokemonster/crates/karaoke-app/
Tauri IPC Bridge Template
Complete, copy-pasteable tauri_ipc.rs module for a dwind/dominator frontend.
Full Module
use serde::de::DeserializeOwned;
use wasm_bindgen::prelude::*;

// ── Raw JS bindings ──────────────────────────────────────────────

#[wasm_bindgen(inline_js = r#"
export async function tauri_invoke(cmd, args) {
    return await window.__TAURI__.core.invoke(cmd, args || {});
}
export async function tauri_listen(event, callback) {
    return await window.__TAURI__.event.listen(event, callback);
}
export function tauri_convert_file_src(path) {
    return window.__TAURI__.core.convertFileSrc(path);
}
export async function tauri_dialog_open(options) {
    return await window.__TAURI__.dialog.open(options || {});
}
export async function tauri_dialog_save(options) {
    return await window.__TAURI__.dialog.save(options || {});
}
"#)]
extern "C" {
    #[wasm_bindgen(catch)]
    async fn tauri_invoke(cmd: &str, args: JsValue) -> Result<JsValue, JsValue>;
    #[wasm_bindgen(catch)]
    async fn tauri_listen(
        event: &str,
        callback: &Closure<dyn Fn(JsValue)>,
    ) -> Result<JsValue, JsValue>;
    fn tauri_convert_file_src(path: &str) -> String;
    #[wasm_bindgen(catch)]
    async fn tauri_dialog_open(options: JsValue) -> Result<JsValue, JsValue>;
    #[wasm_bindgen(catch)]
    async fn tauri_dialog_save(options: JsValue) -> Result<JsValue, JsValue>;
}

// ── Generic helpers ──────────────────────────────────────────────

async fn invoke<T: DeserializeOwned>(cmd: &str, args: JsValue) -> Result<T, String> {
    let result = tauri_invoke(cmd, args)
        .await
        .map_err(|e| format!("{:?}", e))?;
    serde_wasm_bindgen::from_value(result).map_err(|e| e.to_string())
}

async fn invoke_unit(cmd: &str, args: JsValue) -> Result<(), String> {
    tauri_invoke(cmd, args)
        .await
        .map_err(|e| format!("{:?}", e))?;
    Ok(())
}

// ── Event listener ───────────────────────────────────────────────

#[derive(serde::Deserialize)]
struct EventWrapper<T> {
    payload: T,
}

/// Listen for Tauri events emitted by the backend.
/// The callback receives the deserialized payload.
/// The listener lives for the lifetime of the app.
pub fn listen<T: DeserializeOwned + 'static>(
    event: &str,
    mut callback: impl FnMut(T) + 'static,
) {
    let event = event.to_string();
    wasm_bindgen_futures::spawn_local(async move {
        // Clone the name so the error log can report which event failed to parse
        let event_name = event.clone();
        let closure = Closure::new(move |val: JsValue| {
            match serde_wasm_bindgen::from_value::<EventWrapper<T>>(val) {
                Ok(wrapper) => callback(wrapper.payload),
                Err(e) => log::error!("Failed to parse event '{}': {}", event_name, e),
            }
        });
        let _ = tauri_listen(&event, &closure).await;
        closure.forget();
    });
}

// ── Asset protocol ───────────────────────────────────────────────

/// Convert an absolute file path to an asset:// URL loadable by the webview.
/// Requires the protocol-asset feature and assetProtocol enabled in tauri.conf.json.
pub fn convert_file_src(path: &str) -> String {
    tauri_convert_file_src(path)
}

// ── File dialogs (requires tauri-plugin-dialog) ──────────────────

/// Open a native file picker. Returns the selected file path, or None if cancelled.
pub async fn pick_file(title: &str, filters: &[(&str, &[&str])]) -> Result<Option<String>, String> {
    let filter_array: Vec<serde_json::Value> = filters
        .iter()
        .map(|(name, exts)| {
            serde_json::json!({
                "name": name,
                "extensions": exts,
            })
        })
        .collect();
    let options = serde_wasm_bindgen::to_value(&serde_json::json!({
        "title": title,
        "filters": filter_array,
    }))
    .map_err(|e| e.to_string())?;
    let result = tauri_dialog_open(options)
        .await
        .map_err(|e| format!("{:?}", e))?;
    if result.is_null() || result.is_undefined() {
        return Ok(None);
    }
    Ok(result.as_string())
}

/// Open a native directory picker. Returns the selected path, or None if cancelled.
pub async fn pick_directory(title: &str) -> Result<Option<String>, String> {
    let options = serde_wasm_bindgen::to_value(&serde_json::json!({
        "title": title,
        "directory": true,
    }))
    .map_err(|e| e.to_string())?;
    let result = tauri_dialog_open(options)
        .await
        .map_err(|e| format!("{:?}", e))?;
    if result.is_null() || result.is_undefined() {
        return Ok(None);
    }
    Ok(result.as_string())
}

// ── App commands (add your typed wrappers below) ─────────────────
// Example:
//
// #[derive(serde::Deserialize)]
// pub struct MyData {
//     pub name: String,
//     pub count: u32,
// }
//
// pub async fn get_data() -> Result<MyData, String> {
//     invoke("get_data", JsValue::NULL).await
// }
//
// pub async fn save_data(name: &str, count: u32) -> Result<(), String> {
//     let args = serde_wasm_bindgen::to_value(&serde_json::json!({
//         "name": name,  // camelCase keys for Tauri
//         "count": count,
//     })).map_err(|e| e.to_string())?;
//     invoke_unit("save_data", args).await
// }
Usage Notes
- camelCase args: Tauri deserializes command arguments as camelCase JSON keys, so use "projectName" not "project_name" in the json!() macro, even though the backend Rust function uses project_name: String.
- Closure::forget(): Event listeners must call .forget() to prevent the closure from being dropped. This leaks memory intentionally — listeners live for the app's lifetime.
- Dialog plugin: pick_file and pick_directory require tauri-plugin-dialog in the backend and "dialog:default" + "dialog:allow-open" in capabilities — see the usage sketch below.
- Asset protocol: convert_file_src requires features = ["protocol-asset"] on the tauri dependency and security.assetProtocol.enable = true in tauri.conf.json.
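A sketch of wiring pick_file into frontend code — selected is an assumed Mutable<Option<String>> driving the UI:

// Spawn from a click handler; dialogs are async
wasm_bindgen_futures::spawn_local({
    let selected = selected.clone();
    async move {
        match tauri_ipc::pick_file("Choose audio", &[("Audio", &["mp3", "ogg"])]).await {
            Ok(Some(path)) => selected.set(Some(path)),
            Ok(None) => {} // user cancelled the dialog
            Err(e) => log::error!("dialog failed: {}", e),
        }
    }
});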
name: dwind-testing description: Use when the user asks about testing dwind/dominator WASM components, writing wasm-bindgen-test tests, DOM isolation between tests, testing reactive signals, or debugging rendering issues in headless browsers. Also triggers on "test my component", "wasm test", "DOM test", "browser test", or "test isolation". version: 1.0.0
Dwind Testing — wasm-bindgen-test Patterns
Write browser-based tests for dwind/dominator components using wasm-bindgen-test.
Setup
Cargo.toml
[dev-dependencies]
wasm-bindgen-test = "0.3"
js-sys = "0.3"
wasm-bindgen-futures = "0.4"
[dependencies]
# Ensure web-sys has enough features for test queries
web-sys = { version = "0.3", features = [
"Document", "Element", "HtmlElement", "NodeList",
"DomRect", "Window", "console",
] }
Running Tests
# Firefox (recommended — more stable in headless)
wasm-pack test --headless --firefox crates/my-crate
# Chrome
wasm-pack test --headless --chrome crates/my-crate
# With output (see console.log and panic messages)
wasm-pack test --headless --firefox crates/my-crate -- --nocapture
# Single test
wasm-pack test --headless --firefox crates/my-crate -- --nocapture test_my_thing
Critical: DOM Isolation Between Tests
All wasm-bindgen-test tests share the same document.body. DOM elements from one test persist into the next unless explicitly removed. This causes:
- Element count assertions failing (accumulating elements)
- querySelector finding elements from previous tests
- Signal subscriptions from old tests interfering with new ones
The TestContainer Pattern
Every test that renders DOM must use an isolated container that cleans up on drop:
use wasm_bindgen::JsCast;
use wasm_bindgen_test::*;

wasm_bindgen_test_configure!(run_in_browser);

/// Isolated test container. Removed from DOM on drop.
struct TestContainer {
    element: web_sys::Element,
}

impl TestContainer {
    fn new() -> Self {
        let doc = web_sys::window().unwrap().document().unwrap();
        let el = doc.create_element("div").unwrap();
        // Give it a real size so layout works correctly
        el.set_attribute(
            "style",
            "position:absolute;left:0;top:0;width:800px;height:600px",
        ).unwrap();
        doc.body().unwrap().append_child(&el).unwrap();
        Self { element: el }
    }

    fn dom_element(&self) -> web_sys::HtmlElement {
        self.element.clone().dyn_into().unwrap()
    }

    /// Query within this container only — never polluted by other tests.
    fn query_all(&self, selector: &str) -> web_sys::NodeList {
        self.element.query_selector_all(selector).unwrap()
    }

    fn query(&self, selector: &str) -> Option<web_sys::Element> {
        self.element.query_selector(selector).unwrap()
    }
}

impl Drop for TestContainer {
    fn drop(&mut self) {
        self.element.remove();
    }
}
Usage
#[wasm_bindgen_test]
async fn test_my_component() {
    let tc = TestContainer::new();

    // Render INTO the container, not into body
    dominator::append_dom(&tc.dom_element(), my_component());
    wait_frame().await;

    // Query scoped to this test's container only
    let buttons = tc.query_all("button");
    assert_eq!(buttons.length(), 1);
}
Rules:
- Always capture the container: let _tc = ... (the underscore prefix keeps it alive without using it)
- If a test queries the DOM, use tc.query() / tc.query_all(), NOT document.query_selector()
- If a test only checks signal/state values (no DOM queries), still capture _tc so the DOM is cleaned up
Waiting for Rendering
Dominator batches DOM updates asynchronously. After changing a Mutable or appending DOM, you must wait before reading the result.
wait_frame helper
async fn wait_frame() {
    let promise = js_sys::Promise::new(&mut |resolve, _| {
        web_sys::window().unwrap()
            .request_animation_frame(&resolve).unwrap();
    });
    wasm_bindgen_futures::JsFuture::from(promise).await.unwrap();
}

async fn wait_frames(n: usize) {
    for _ in 0..n {
        wait_frame().await;
    }
}
When to wait
| Scenario | Frames to wait |
|---|---|
| After dominator::append_dom() | 1 |
| After changing a Mutable that drives style_signal / text_signal | 1 |
| After children_signal_vec adds/removes elements | 1–2 |
| After requestAnimationFrame callback (e.g., DOM measurement) | 2–3 |
| After full_sync that rebuilds entire DOM tree | 3 |
Testing Reactive Signals
Test that a Mutable change propagates to DOM
#[wasm_bindgen_test]
async fn test_reactive_text() {
    let tc = TestContainer::new();
    let label = Mutable::new("Hello".to_string());

    dominator::append_dom(&tc.dom_element(), html!("span", {
        .attr("data-testid", "label")
        .text_signal(label.signal_cloned())
    }));
    wait_frame().await;

    let el = tc.query("[data-testid=label]").unwrap();
    assert_eq!(el.text_content().unwrap(), "Hello");

    label.set("World".to_string());
    wait_frame().await;
    assert_eq!(el.text_content().unwrap(), "World");
}
Test that MutableVec drives children_signal_vec
#[wasm_bindgen_test]
async fn test_reactive_list() {
    let tc = TestContainer::new();
    let items = MutableVec::new();

    dominator::append_dom(&tc.dom_element(), html!("ul", {
        .children_signal_vec(items.signal_vec_cloned().map(|item: String| {
            html!("li", { .text(&item) })
        }))
    }));
    wait_frame().await;
    assert_eq!(tc.query_all("li").length(), 0);

    items.lock_mut().push_cloned("First".to_string());
    wait_frames(2).await;
    assert_eq!(tc.query_all("li").length(), 1);
}
Testing DOM Measurements
When testing code that reads getBoundingClientRect(), the container must have real dimensions. The TestContainer sets width:800px;height:600px for this reason.
Gotcha: Headless browsers may report (0, 0) for elements that aren't visible. Ensure:
- The container has explicit dimensions
- Elements use position: absolute with explicit left/top for predictable layout
- Don't rely on CSS Flexbox/Grid sizing in tests — use explicit pixel values (see the sketch below)
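A sketch of a measurement test that follows these rules, reusing TestContainer and wait_frame from above:

#[wasm_bindgen_test]
async fn test_element_size() {
    let tc = TestContainer::new();
    dominator::append_dom(&tc.dom_element(), html!("div", {
        .attr("data-box", "")
        // Explicit pixel sizing — no flex/grid, so headless layout is predictable
        .style("position", "absolute")
        .style("left", "10px")
        .style("top", "10px")
        .style("width", "200px")
        .style("height", "100px")
    }));
    wait_frame().await;

    let rect = tc.query("[data-box]").unwrap().get_bounding_client_rect();
    assert_eq!(rect.width(), 200.0);
    assert_eq!(rect.height(), 100.0);
}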
Converting screen ↔ world coordinates in tests
If your component uses a pan/zoom transform container:
// Read an element's world position from its screen position
let el = tc.query("[data-my-element]").unwrap();
let rect = el.get_bounding_client_rect();
let center_x = rect.left() + rect.width() / 2.0;
let center_y = rect.top() + rect.height() / 2.0;

// Find the transform container
let vp = tc.query("[data-viewport-inner]").unwrap();
let vp_rect = vp.get_bounding_client_rect();

let zoom = my_zoom_signal.get();
let world_x = (center_x - vp_rect.left()) / zoom;
let world_y = (center_y - vp_rect.top()) / zoom;
Testing User Interactions
For components that handle mouse/keyboard events, test at the signal/state level rather than simulating DOM events. DOM event simulation in wasm-bindgen-test is unreliable.
// GOOD — test the handler directly
gs.handle_input(InputEvent::MouseDown {
    screen: Vec2::new(100.0, 100.0),
    world: Vec2::new(100.0, 100.0),
    button: MouseButton::Left,
    modifiers: Modifiers::default(),
});
assert!(matches!(gs.state(), SomeState::Dragging { .. }));

// BAD — dispatching synthetic DOM events is fragile
let event = web_sys::MouseEvent::new("mousedown").unwrap();
element.dispatch_event(&event).unwrap(); // unreliable
Common Pitfalls
1. Forgotten container capture
// BAD — container dropped immediately, DOM removed before assertions
async fn test_bad() {
    render_into_container(); // container dropped here
    wait_frame().await;
    // DOM is already gone!
}

// GOOD — container kept alive
async fn test_good() {
    let _tc = render_into_container();
    wait_frame().await;
    // DOM still exists
}
2. Querying global document instead of container
// BAD — finds elements from ALL tests
let els = document.query_selector_all("button").unwrap();

// GOOD — scoped to this test
let els = tc.query_all("button");
3. Not waiting enough frames after complex operations
// BAD — MutableVec change + DOM measurement in same frame
items.lock_mut().push_cloned(value);
let count = tc.query_all("li").length(); // still 0!

// GOOD — wait for dominator to flush
items.lock_mut().push_cloned(value);
wait_frames(2).await;
let count = tc.query_all("li").length(); // correct
4. Asserting exact element counts across shared DOM
// BAD — fragile if test order changes
assert_eq!(tc.query_all("[data-node]").length(), 2);

// BETTER — use >= for existence checks, == only within isolated container
assert!(tc.query_all("[data-node]").length() >= 2);

// Or with TestContainer: exact counts are safe since container is isolated
assert_eq!(tc.query_all("[data-node]").length(), 2); // ✓ safe with TestContainer
name: ras-api-design description: Use when the user asks about defining REST endpoints, JSON-RPC methods, file service routes, or WebSocket services with RAS macros, designing request/response types, path parameters, query parameters, macro syntax for rest_service!, jsonrpc_service!, file_service!, or jsonrpc_bidirectional_service!, or asks about OpenAPI/OpenRPC generation. version: 1.0.0
RAS API Design — Macro Syntax & Endpoint Definition
RAS macros generate a service trait, builder, Axum router, and spec (OpenAPI/OpenRPC) from a single declarative block. You define the contract; the macro generates the plumbing. All four macros share the same auth-level syntax and type requirements.
rest_service! — REST APIs
use ras_rest_macro::rest_service;

rest_service!({
    service_name: TaskService,
    base_path: "/api/v1",
    openapi: true,
    serve_docs: true,
    docs_path: "/docs",
    endpoints: [
        // Public — no auth
        GET UNAUTHORIZED tasks() -> TasksResponse,
        GET UNAUTHORIZED tasks/{id: String}() -> Task,

        // Query parameters
        GET UNAUTHORIZED search/tasks ? q: String & limit: Option<u32> & offset: Option<u32> () -> TasksResponse,

        // Authenticated — requires "user" permission
        POST WITH_PERMISSIONS(["user"]) tasks(CreateTaskRequest) -> Task,

        // Multiple path params
        PUT WITH_PERMISSIONS(["user"]) users/{user_id: String}/tasks/{task_id: String}(UpdateTaskRequest) -> Task,

        // OR permissions — either "owner" OR "admin" suffices
        DELETE WITH_PERMISSIONS(["owner"] | ["admin"]) tasks/{id: String}() -> (),
    ]
});
Endpoint Syntax
METHOD AUTH_LEVEL path/{param: Type}/segments ? query: Type & query2: Type (RequestBody) -> ResponseType
| Component | Options |
|---|---|
| Method | GET, POST, PUT, DELETE, PATCH |
| Auth level | UNAUTHORIZED, WITH_PERMISSIONS(["perm1", "perm2"]) |
| Path params | {name: Type} inline in the path |
| Query params | ? param: Type & param2: Option<Type> after the path |
| Request body | (RequestType) — omit the type for GET/DELETE: () |
| Response | -> ResponseType — use () for empty responses |
Path Parameters
Parameters are extracted from the URL path. Multiple params supported:
GET UNAUTHORIZED users/{user_id: String}/posts/{post_id: i32}() -> Post,
PUT WITH_PERMISSIONS(["user"]) posts/{post_id: i32}/comments/{comment_id: i32}(UpdateCommentRequest) -> Comment,
Query Parameters
Appended after ?, separated by &. Use Option<T> for optional params:
GET UNAUTHORIZED search ? q: String & limit: Option<u32> & offset: Option<u32> () -> SearchResults,
Macro Configuration
rest_service!({
    service_name: ServiceName,           // Required: generates trait, builder, client names
    base_path: "/api/v1",                // Required: URL prefix for all endpoints
    openapi: true,                       // Optional: generate OpenAPI 3.0 spec
    openapi: { output: "custom.json" },  // Optional: custom output path
    serve_docs: true,                    // Optional: host the built-in API explorer
    docs_path: "/docs",                  // Optional: explorer path (default: "/docs")
    endpoints: [ ... ]
});
Request & Response Types
All types used in macro invocations must derive three traits:
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct CreateTaskRequest {
    pub title: String,
    pub description: String,
    pub tags: Vec<String>,
}
- Serialize + Deserialize — serde, for JSON encoding
- JsonSchema — schemars, for OpenAPI spec generation
Missing JsonSchema causes a compile error when openapi: true.
Hosted REST Explorer
When serve_docs: true is set, the generated router serves the built-in RAS API explorer at base_path + docs_path and the OpenAPI document at base_path + docs_path + "/openapi.json". For example, base_path: "/api/v1" and docs_path: "/docs" serve:
- GET /api/v1/docs — interactive API explorer
- GET /api/v1/docs/openapi.json — generated OpenAPI JSON
The explorer has built-in bearer-token entry for trying protected endpoints. Tokens are stored in sessionStorage for the current browser session, not localStorage; only non-secret UI preferences such as theme are stored persistently.
Error Responses
Use RestResult<T> (alias for Result<RestResponse<T>, RestError>) in handler implementations:
use ras_rest_core::{RestResult, RestResponse, RestError};

async fn get_task_by_id(&self, id: String) -> RestResult<Task> {
    // Success variants
    Ok(RestResponse::ok(task))               // 200
    Ok(RestResponse::created(task))          // 201
    Ok(RestResponse::with_status(202, task)) // custom

    // Error variants
    Err(RestError::not_found("Task not found"))
    Err(RestError::bad_request("Invalid task ID"))
    Err(RestError::unauthorized("Invalid token"))
    Err(RestError::forbidden("Insufficient permissions"))

    // Internal error — logged but message not sent to client
    Err(RestError::with_internal(500, "Database error", db_error))
}
For domain-specific errors, define a thiserror enum and convert to RestError:
#[derive(Debug, thiserror::Error)]
pub enum TaskError {
    #[error("task not found: {0}")]
    NotFound(String),
    #[error("duplicate title: {0}")]
    DuplicateTitle(String),
    #[error("storage error: {0}")]
    Storage(String),
}

impl From<TaskError> for RestError {
    fn from(e: TaskError) -> Self {
        match e {
            TaskError::NotFound(msg) => RestError::not_found(msg),
            TaskError::DuplicateTitle(msg) => RestError::bad_request(msg),
            TaskError::Storage(msg) => {
                RestError::with_internal(500, "Internal error", std::io::Error::other(msg))
            }
        }
    }
}
Generated Code
Each macro generates four things:
1. Service Trait
#[async_trait]
pub trait TaskServiceTrait: Send + Sync + 'static {
    // UNAUTHORIZED — no user parameter
    async fn get_tasks(&self) -> RestResult<TasksResponse>;
    async fn get_tasks_by_id(&self, id: String) -> RestResult<Task>;

    // WITH_PERMISSIONS — receives &AuthenticatedUser
    async fn post_tasks(&self, user: &AuthenticatedUser, req: CreateTaskRequest) -> RestResult<Task>;
    async fn delete_tasks_by_id(&self, user: &AuthenticatedUser, id: String) -> RestResult<()>;
}
Method names are generated from HTTP method + path segments: get_tasks, post_tasks, get_tasks_by_id, delete_tasks_by_id.
2. Service Builder
let router = TaskServiceBuilder::new(service_impl)
    .auth_provider(auth) // Arc<dyn AuthProvider>
    .with_usage_tracker(|headers, user, method, path| async move { ... })
    .with_method_duration_tracker(|method, path, user, duration| async move { ... })
    .build(); // Returns axum::Router
3. Native Rust Client
The macro generates a type-safe async client with the same method signatures as the service trait. Enable it with the client feature flag.
use my_api::TaskServiceClient;

// Build client with server URL
let mut client = TaskServiceClient::builder("http://localhost:3000/api/v1").build();

// Set auth token for protected endpoints
client.set_bearer_token(Some("jwt-token"));

// Methods mirror the service trait — same types, same names
let tasks: TasksResponse = client.get_tasks().await?;
let task: Task = client.get_tasks_by_id("task-123".into()).await?;
let new_task: Task = client.post_tasks(CreateTaskRequest {
    title: "New task".into(),
    description: "Details".into(),
}).await?;

// Methods with custom timeout
let tasks = client.get_tasks_with_timeout(Some(Duration::from_secs(5))).await?;
The client is generated in the API crate alongside the server trait — both sides share the same request/response types, ensuring compile-time type safety across service boundaries. This is the primary way to consume RAS services from other Rust crates.
jsonrpc_service! — JSON-RPC
use ras_jsonrpc_macro::jsonrpc_service;

jsonrpc_service!({
    service_name: ChatService,
    openrpc: true,
    explorer: true,
    methods: [
        UNAUTHORIZED health_check(()) -> HealthStatus,
        WITH_PERMISSIONS(["user"]) send_message(SendMessageRequest) -> SendMessageResponse,
        WITH_PERMISSIONS(["admin"]) delete_channel(DeleteChannelRequest) -> (),
    ]
});
JSON-RPC methods map to JSON-RPC 2.0 method strings. Like REST, UNAUTHORIZED methods receive only the request, while WITH_PERMISSIONS methods also receive &AuthenticatedUser.
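A sketch of what implementing the generated trait might look like, assuming handlers return Result<T, JsonRpcError> (verify against the generated code — ChatServiceImpl and the HealthStatus field are hypothetical):

use ras_jsonrpc_types::JsonRpcError;

struct ChatServiceImpl { /* domain deps */ }

#[async_trait::async_trait]
impl ChatService for ChatServiceImpl {
    // UNAUTHORIZED — request only
    async fn health_check(&self, _req: ()) -> Result<HealthStatus, JsonRpcError> {
        Ok(HealthStatus { healthy: true })
    }

    // WITH_PERMISSIONS — also receives the authenticated user
    async fn send_message(
        &self,
        user: &AuthenticatedUser,
        req: SendMessageRequest,
    ) -> Result<SendMessageResponse, JsonRpcError> {
        // user.user_id identifies the sender
        todo!("delegate to domain logic, map domain errors via From")
    }
}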
When explorer: true is used with openrpc: true, the macro generates {service}_explorer_routes(base_path). Merge those routes into your Axum app to serve the same built-in explorer at /explorer by default, plus /explorer/openrpc.json. A custom path can be configured with explorer: { path: "/api/docs" }.
file_service! — File Upload/Download
use ras_file_macro::file_service;

file_service!({
    service_name: DocumentService,
    base_path: "/api/files",
    body_limit: 52428800, // 50MB
    endpoints: [
        UPLOAD WITH_PERMISSIONS(["user"]) upload() -> FileMetadata,
        DOWNLOAD UNAUTHORIZED download/{file_id: String}(),
    ]
});
- UPLOAD endpoints accept streaming multipart bodies
- DOWNLOAD endpoints return streaming responses
- body_limit sets the maximum upload size in bytes
jsonrpc_bidirectional_service! — WebSocket
use ras_jsonrpc_bidirectional_macro::jsonrpc_bidirectional_service;

jsonrpc_bidirectional_service!({
    service_name: RealtimeService,
    client_to_server: [
        WITH_PERMISSIONS(["user"]) send_message(SendMessageRequest) -> SendMessageResponse,
        WITH_PERMISSIONS(["user"]) subscribe_channel(SubscribeRequest) -> (),
    ],
    server_to_client: [
        message_received(MessageNotification),
        user_joined(UserJoinedNotification),
    ],
    server_to_client_calls: [
        ping(PingRequest) -> PongResponse,
    ]
});
- client_to_server — methods the client can call on the server (request/response)
- server_to_client — notifications the server pushes to clients (fire-and-forget, no response)
- server_to_client_calls — methods the server can call on the client (request/response, bidirectional)
Read references/macro-syntax-reference.md for a compact cheat sheet of all four macros.
Auth Level Details
Auth levels are shared across all macros:
| Auth Level | Handler Signature | Meaning |
|---|---|---|
| UNAUTHORIZED | No user param | No authentication required |
| WITH_PERMISSIONS(["a"]) | user: &AuthenticatedUser | Requires permission "a" |
| WITH_PERMISSIONS(["a", "b"]) | user: &AuthenticatedUser | Requires "a" AND "b" |
| WITH_PERMISSIONS(["a"] \| ["b"]) | user: &AuthenticatedUser | Requires "a" OR "b" |
| WITH_PERMISSIONS(["a"] \| ["b", "c"]) | user: &AuthenticatedUser | Requires "a" OR ("b" AND "c") |
The macro enforces auth at the router level — unauthenticated requests to protected endpoints are rejected before your handler runs.
Related Skills
For project scaffolding and where macros live in the crate layout, see the ras-setup skill.
For AuthProvider implementation and permission design, see the ras-security skill.
For error handling patterns and observability wiring, see the ras-best-practices skill.
RAS Macro Syntax Reference
rest_service!
rest_service!({
    service_name: Name,     // Required
    base_path: "/prefix",   // Required
    openapi: true,          // Optional — or { output: "path.json" }
    serve_docs: true,       // Optional — built-in API explorer
    docs_path: "/docs",     // Optional — explorer path, default "/docs"
    endpoints: [
        METHOD AUTH path/{param: Type}/more ? query: Type & opt: Option<Type> (Body) -> Response,
    ]
});
Methods: GET, POST, PUT, DELETE, PATCH
Generated names: {method}_{path_segments} — e.g., GET users/{id} → get_users_by_id
Trait: {ServiceName}Trait
Builder: {ServiceName}Builder
Client: {ServiceName}Client
Hosted docs: serve_docs: true serves the explorer at base_path + docs_path and the OpenAPI JSON at base_path + docs_path + "/openapi.json". Bearer tokens entered in the explorer are kept in sessionStorage, not localStorage.
jsonrpc_service!
jsonrpc_service!({
    service_name: Name,   // Required
    openrpc: true,        // Optional — OpenRPC spec
    explorer: true,       // Optional — web explorer UI
    methods: [
        AUTH method_name(RequestType) -> ResponseType,
    ]
});
Trait: {ServiceName} (no Trait suffix)
Builder: {ServiceName}Builder
Hosted explorer: explorer: true requires openrpc: true and generates {service}_explorer_routes(base_path), serving the explorer at /explorer by default and OpenRPC JSON at /explorer/openrpc.json.
file_service!
file_service!({
    service_name: Name,     // Required
    base_path: "/prefix",   // Required
    body_limit: 52428800,   // Optional — bytes, default varies
    endpoints: [
        UPLOAD AUTH path() -> MetadataType,
        DOWNLOAD AUTH path/{param: Type}(),
    ]
});
jsonrpc_bidirectional_service!
jsonrpc_bidirectional_service!({
    service_name: Name,   // Required
    client_to_server: [
        AUTH method_name(RequestType) -> ResponseType,
    ],
    server_to_client: [
        notification_name(NotificationType),
    ]
});
Auth Levels (all macros)
| Syntax | Meaning |
|---|---|
| UNAUTHORIZED | No auth check, no user param in handler |
| WITH_PERMISSIONS(["a"]) | Requires permission "a" |
| WITH_PERMISSIONS(["a", "b"]) | Requires "a" AND "b" |
| WITH_PERMISSIONS(["a"] \| ["b"]) | Requires "a" OR "b" |
| WITH_PERMISSIONS(["a"] \| ["b", "c"]) | "a" OR ("b" AND "c") |
Type Requirements
All request/response types must derive:
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
- Serialize + Deserialize — from serde
- JsonSchema — from schemars (required for OpenAPI/OpenRPC generation)
REST Path & Query Parameter Syntax
path/{name: Type} # path param
path ? key: Type # required query param
path ? key: Option<Type> # optional query param
path ? a: Type & b: Option<Type> # multiple query params
path/{id: String} ? detail: bool # path + query combined
RestResult Responses
Ok(RestResponse::ok(value))                    // 200
Ok(RestResponse::created(value))               // 201
Ok(RestResponse::with_status(202, value))      // custom status
Err(RestError::bad_request("msg"))             // 400
Err(RestError::unauthorized("msg"))            // 401
Err(RestError::forbidden("msg"))               // 403
Err(RestError::not_found("msg"))               // 404
Err(RestError::with_internal(500, "msg", err)) // 500 (err logged, not sent)
Generated Rust Client
// Build client (mut — setting the bearer token mutates it)
let mut client = ServiceNameClient::builder("http://host:port/base").build();
client.set_bearer_token(Some("jwt-token"));

// Methods mirror the service trait
let result = client.get_things().await?;
let item = client.get_things_by_id("id".into()).await?;
let created = client.post_things(CreateRequest { ... }).await?;

// With custom timeout
let result = client.get_things_with_timeout(Some(Duration::from_secs(5))).await?;
Feature Flags
[features]
default = ["server", "client"]
server = [] # server-side trait + builder + router
client = [] # native Rust client (async, reqwest-based)
name: ras-best-practices description: Use when the user asks about RAS observability, error handling in RAS services, usage tracking, method duration tracking, Prometheus metrics, OpenTelemetry integration, using the generated Rust client, service-to-service communication, testing RAS services, or general best practices for production RAS deployments. version: 1.0.0
RAS Best Practices — Observability, Errors, Clients & Testing
Production RAS services need structured errors, observability hooks, generated clients, and testable handler implementations. This skill covers the patterns that bridge the gap between a working macro invocation and a production deployment.
Error Handling
RAS error handling follows the rust-architecture convention: thiserror for library/domain errors, anyhow only in the binary crate.
Domain Errors → REST Errors
Define domain errors with thiserror, then convert to RestError at the handler boundary:
use thiserror::Error;
use ras_rest_core::{RestResult, RestResponse, RestError};

#[derive(Debug, Error)]
pub enum TaskError {
    #[error("task not found: {0}")]
    NotFound(String),
    #[error("duplicate title: {0}")]
    DuplicateTitle(String),
    #[error("storage error")]
    Storage(#[source] anyhow::Error),
}

impl From<TaskError> for RestError {
    fn from(e: TaskError) -> Self {
        match e {
            TaskError::NotFound(msg) => RestError::not_found(msg),
            TaskError::DuplicateTitle(msg) => RestError::bad_request(msg),
            TaskError::Storage(e) => RestError::with_internal(500, "Internal error", e),
        }
    }
}
Rules:
- Client errors (4xx) — include a meaningful message the caller can act on
- Server errors (5xx) — use RestError::with_internal() to log the real error while returning a generic message
- Never leak internals — stack traces, SQL queries, and file paths stay in logs
- Domain logic returns Result<T, TaskError>, handlers convert to RestResult<T> via ? with the From impl
JSON-RPC Errors
JSON-RPC uses standard error codes. Map domain errors to appropriate codes:
use ras_jsonrpc_types::JsonRpcError;

impl From<TaskError> for JsonRpcError {
    fn from(e: TaskError) -> Self {
        match e {
            TaskError::NotFound(msg) => JsonRpcError::new(-32001, msg, None),
            TaskError::DuplicateTitle(msg) => JsonRpcError::new(-32002, msg, None),
            TaskError::Storage(_) => JsonRpcError::internal_error(),
        }
    }
}
Observability
RAS provides two hooks on every service builder: UsageTracker (counts requests) and MethodDurationTracker (measures latency). The ras-observability-otel crate provides a production-ready implementation backed by OpenTelemetry + Prometheus.
Quick Start
use ras_observability_otel::standard_setup;

let otel = standard_setup("my-service")?;

let router = TaskServiceBuilder::new(service_impl)
    .auth_provider(auth)
    .with_usage_tracker({
        let tracker = otel.usage_tracker();
        move |headers, user, method, path| {
            let context = RequestContext::rest(method, path);
            let tracker = tracker.clone();
            async move {
                tracker.track_request(&headers, user.as_ref(), &context).await;
            }
        }
    })
    .with_method_duration_tracker({
        let tracker = otel.method_duration_tracker();
        move |method, path, user, duration| {
            let context = RequestContext::rest(method, path);
            let tracker = tracker.clone();
            async move {
                tracker.track_duration(&context, user.as_ref(), duration).await;
            }
        }
    })
    .build();

// Add Prometheus metrics endpoint
let app = Router::new()
    .merge(router)
    .merge(otel.metrics_router()); // exposes /metrics
Exposed Metrics
| Metric | Type | Labels |
|---|---|---|
| requests_started | Counter | method, protocol |
| requests_completed | Counter | method, protocol, success |
| method_duration_milliseconds | Histogram | method, protocol |
Labels are kept minimal to prevent cardinality explosion. Use structured logs (not metric labels) for per-user or per-request data.
Read references/observability-config.md for complete setup snippets including Prometheus scrape config and Grafana queries.
Generated Rust Client
Each RAS macro generates a type-safe async client alongside the server trait. Both live in the API crate and share the same request/response types — compile-time type safety across service boundaries.
Using the Client
use my_api::{TaskServiceClient, CreateTaskRequest};
use std::time::Duration;

// Build client pointing at the target service
let mut client = TaskServiceClient::builder("http://localhost:3000/api/v1").build();

// Set auth token for protected endpoints
client.set_bearer_token(Some("jwt-token"));

// Methods mirror the service trait — same types, same names
let tasks = client.get_tasks().await?;
let task = client.get_tasks_by_id("task-123".into()).await?;
let new_task = client.post_tasks(CreateTaskRequest {
    title: "New task".into(),
    description: "Details".into(),
}).await?;

// Custom timeout for slow endpoints
let result = client.get_tasks_with_timeout(Some(Duration::from_secs(10))).await?;
Service-to-Service Communication
The generated client is the primary way to call RAS services from other Rust crates. In a multi-service architecture, add the API crate as a dependency with only the client feature:
[dependencies]
task-api = { path = "../task-api", default-features = false, features = ["client"] }
This pulls in only the client code and shared types — no server-side code generation.
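A sketch of a consumer, assuming the TaskService API from earlier (the generated client's error type is elided behind map_err, and OrderService is hypothetical):

use task_api::TaskServiceClient;

// Build once at startup and store alongside other dependencies
pub struct OrderService {
    tasks: TaskServiceClient,
}

impl OrderService {
    pub async fn open_follow_up_task(&self) -> Result<usize, String> {
        let tasks = self.tasks.get_tasks().await.map_err(|e| e.to_string())?;
        Ok(tasks.total) // TasksResponse::total from the shared API crate
    }
}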
Testing RAS Services
Follow the rust-testing skill's approach: hand-written fakes, TestApp pattern, axum-test for in-process HTTP.
Hand-Written FakeAuthProvider
use ras_auth_core::{AuthProvider, AuthenticatedUser, AuthResult, AuthError, AuthFuture};
use std::collections::HashSet;
use std::sync::Mutex;

pub struct FakeAuthProvider {
    users: Mutex<Vec<(String, AuthenticatedUser)>>, // token → user
}

impl FakeAuthProvider {
    pub fn new() -> Self {
        Self { users: Mutex::new(Vec::new()) }
    }

    pub fn add_user(&self, token: &str, user_id: &str, permissions: Vec<String>) {
        self.users.lock().unwrap().push((
            token.into(),
            AuthenticatedUser {
                user_id: user_id.into(),
                permissions: permissions.into_iter().collect::<HashSet<_>>(),
                metadata: None,
            },
        ));
    }
}

impl AuthProvider for FakeAuthProvider {
    fn authenticate(&self, token: String) -> AuthFuture<'_> {
        Box::pin(async move {
            self.users.lock().unwrap()
                .iter()
                .find(|(t, _)| *t == token)
                .map(|(_, u)| u.clone())
                .ok_or(AuthError::InvalidToken)
        })
    }

    fn check_permissions(
        &self,
        user: &AuthenticatedUser,
        required: &[String],
    ) -> AuthResult<()> {
        if required.iter().all(|p| user.permissions.contains(p)) {
            Ok(())
        } else {
            Err(AuthError::InsufficientPermissions {
                required: required.to_vec(),
                has: user.permissions.iter().cloned().collect(),
            })
        }
    }
}
Integration Testing with axum-test
Build the full Axum router with fakes, exercise the HTTP stack in-process:
use axum_test::TestServer;
use std::sync::Arc;

struct TestApp {
    server: TestServer,
    auth: Arc<FakeAuthProvider>,
}

impl TestApp {
    fn new() -> Self {
        let auth = Arc::new(FakeAuthProvider::new());
        let service = TaskServiceImpl::new(/* inject domain fakes */);
        let router = TaskServiceBuilder::new(service)
            .auth_provider(Arc::clone(&auth) as Arc<dyn AuthProvider>)
            .build();
        let server = TestServer::new(router).unwrap();
        Self { server, auth }
    }
}

#[tokio::test]
async fn create_task_requires_auth() {
    let app = TestApp::new();

    // Unauthenticated — should fail
    let response = app.server
        .post("/api/v1/tasks")
        .json(&json!({ "title": "Test", "description": "" }))
        .await;

    response.assert_status(StatusCode::UNAUTHORIZED);
}

#[tokio::test]
async fn create_task_with_valid_token() {
    let app = TestApp::new();
    app.auth.add_user("test-token", "alice", vec!["user".into()]);

    let response = app.server
        .post("/api/v1/tasks")
        .add_header("Authorization", "Bearer test-token")
        .json(&json!({ "title": "Test", "description": "" }))
        .await;

    response.assert_status(StatusCode::CREATED);
}
For the full fake pattern (builders, Mutex-based storage, configurable failures), see the rust-testing skill.
Production Checklist
- Structured logging — use tracing with JSON output, include request IDs
- Health check endpoint — GET UNAUTHORIZED health() -> HealthStatus in every service
- Graceful shutdown — handle SIGTERM with tokio::signal before stopping the listener (see the sketch below)
- CORS — configure tower-http::cors::CorsLayer for browser clients
- Request size limits — set body_limit in file_service!, use Tower middleware for REST
- Protect /metrics — require a bearer token or restrict to internal network
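A minimal graceful-shutdown sketch for the axum serve API used in these examples (extend shutdown_signal with tokio::signal::unix for SIGTERM in containers):

async fn shutdown_signal() {
    // Ctrl+C locally; add a SIGTERM branch via tokio::signal::unix for containers
    tokio::signal::ctrl_c()
        .await
        .expect("failed to install Ctrl+C handler");
}

// In main(), replace the plain serve call with:
// axum::serve(listener, app)
//     .with_graceful_shutdown(shutdown_signal())
//     .await?;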
Related Skills
For workspace setup and crate layout, see the ras-setup skill. For macro syntax and endpoint definition, see the ras-api-design skill. For auth provider implementation and permission design, see the ras-security skill. For hand-written fake patterns and test organization, see the rust-testing skill. For DI and trait boundary patterns used throughout, see the rust-architecture skill.
RAS Observability Configuration
Standard Setup (Quick Start)
use ras_observability_otel::standard_setup;
use axum::Router;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let otel = standard_setup("my-service")?;

    let app = Router::new()
        .merge(service_router)
        .merge(otel.metrics_router()); // adds /metrics

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    axum::serve(listener, app).await?;
    Ok(())
}
Builder Setup (Custom Configuration)
use ras_observability_otel::OtelSetupBuilder;
use prometheus::Registry;

// Custom Prometheus registry (optional)
let registry = Registry::new();
let custom_counter = prometheus::Counter::new("custom_ops", "Custom operations")?;
registry.register(Box::new(custom_counter.clone()))?;

let otel = OtelSetupBuilder::new("my-service")
    .with_prometheus_registry(registry)
    .build()?;
Usage Tracker Wiring (REST)
use ras_observability_core::RequestContext;

let router = TaskServiceBuilder::new(service)
    .auth_provider(auth)
    .with_usage_tracker({
        let tracker = otel.usage_tracker();
        move |headers, user, method, path| {
            let context = RequestContext::rest(method, path);
            let tracker = tracker.clone();
            async move {
                tracker.track_request(&headers, user.as_ref(), &context).await;
            }
        }
    })
    .build();
Duration Tracker Wiring (REST)
.with_method_duration_tracker({
    let tracker = otel.method_duration_tracker();
    move |method, path, user, duration| {
        let context = RequestContext::rest(method, path);
        let tracker = tracker.clone();
        async move {
            tracker.track_duration(&context, user.as_ref(), duration).await;
        }
    }
})
Usage Tracker Wiring (JSON-RPC)
.with_usage_tracker({
    let tracker = otel.usage_tracker();
    move |headers, user, payload| {
        let context = RequestContext::jsonrpc(payload.method.clone());
        let tracker = tracker.clone();
        async move {
            tracker.track_request(&headers, user.as_ref(), &context).await;
        }
    }
})
Duration Tracker Wiring (JSON-RPC)
.with_method_duration_tracker({
    let tracker = otel.method_duration_tracker();
    move |method, user, duration| {
        let context = RequestContext::jsonrpc(method.to_string());
        let tracker = tracker.clone();
        async move {
            tracker.track_duration(&context, user.as_ref(), duration).await;
        }
    }
})
Request Metadata (Structured Logs, Not Metrics)
#![allow(unused)] fn main() { let context = RequestContext::rest("POST", "/api/orders") .with_metadata("request_id", request_id) .with_metadata("customer_id", customer_id); // Metadata is included in structured logs but NOT in metric labels otel.usage_tracker().track_request(&headers, user.as_ref(), &context).await; }
Protecting the Metrics Endpoint
use tower_http::auth::RequireAuthorizationLayer;

let app = Router::new()
    .merge(service_router)
    .nest(
        "/metrics",
        otel.metrics_router()
            .layer(RequireAuthorizationLayer::bearer("your-metrics-token")),
    );
Prometheus Scrape Config
# prometheus.yml
scrape_configs:
- job_name: 'my-service'
static_configs:
- targets: ['my-service:3000']
metrics_path: '/metrics'
scrape_interval: 15s
Grafana Queries
# Request rate by method
rate(requests_completed[5m])
# Success rate (percentage)
sum(rate(requests_completed{success="true"}[5m]))
/ sum(rate(requests_completed[5m])) * 100
# P95 latency by method
histogram_quantile(0.95,
sum(rate(method_duration_milliseconds_bucket[5m])) by (method, le)
)
# Error rate by protocol
sum(rate(requests_completed{success="false"}[5m])) by (protocol)
# Request volume by method (last hour)
increase(requests_started[1h])
Histogram Buckets
Default duration buckets (seconds):
0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0
Metric Labels
| Label | Values | Notes |
|---|---|---|
method | "GET /users", "createUser" | Use path template, not resolved path |
protocol | "rest", "jsonrpc", "websocket" | Set by RequestContext constructor |
success | "true", "false" | Only on requests_completed |
Keep labels low-cardinality. Never add per-user or per-request-ID labels — those belong in structured logs via with_metadata().
name: ras-security description: Use when the user asks about authentication, authorization, permissions, JWT, OAuth2, session management, AuthProvider, IdentityProvider, securing RAS endpoints, token validation, RBAC, permission guards, or the UNAUTHORIZED/WITH_PERMISSIONS auth levels in RAS services. version: 1.0.0
RAS Security — Auth, Permissions & Identity
RAS uses pluggable authentication via two port traits: AuthProvider (validates tokens, checks permissions) and IdentityProvider (verifies credentials, returns identity). Both follow the trait-as-interface pattern from the rust-architecture skill — define in core, implement as adapters, wire via Arc<dyn Trait>.
Auth Levels in Macros
Every endpoint declares its auth requirement. The macro enforces it at the router level — unauthenticated requests to protected endpoints are rejected before your handler runs.
endpoints: [
    // No auth — handler has no user parameter
    GET UNAUTHORIZED health() -> HealthStatus,

    // Requires authentication + "user" permission
    POST WITH_PERMISSIONS(["user"]) tasks(CreateTaskRequest) -> Task,

    // AND: requires both "moderator" AND "editor"
    PUT WITH_PERMISSIONS(["moderator", "editor"]) posts/{id: String}(UpdatePostRequest) -> Post,

    // OR: requires "admin" OR ("moderator" AND "editor")
    DELETE WITH_PERMISSIONS(["admin"] | ["moderator", "editor"]) posts/{id: String}() -> (),
]
How auth level affects the generated handler signature:
// UNAUTHORIZED — no user parameter
async fn get_health(&self) -> RestResult<HealthStatus> { ... }

// WITH_PERMISSIONS — receives &AuthenticatedUser
async fn post_tasks(&self, user: &AuthenticatedUser, req: CreateTaskRequest) -> RestResult<Task> {
    // user.user_id, user.permissions available here
    ...
}
The AuthProvider Trait
AuthProvider is the port that RAS macros use to validate incoming requests. The generated router extracts the bearer token, passes the token string to authenticate, and receives an AuthenticatedUser with permissions.
use ras_auth_core::{AuthProvider, AuthenticatedUser, AuthResult, AuthFuture};

pub trait AuthProvider: Send + Sync + 'static {
    fn authenticate(&self, token: String) -> AuthFuture<'_>;

    fn check_permissions(
        &self,
        user: &AuthenticatedUser,
        required_permissions: &[String],
    ) -> AuthResult<()>;
}
Note: authenticate returns AuthFuture (a pinned boxed future), not an async fn. check_permissions is synchronous.
Wire it via Arc<dyn AuthProvider> in the service builder:
let auth: Arc<dyn AuthProvider> = Arc::new(JwtAuthProvider::new(session_service));

let router = TaskServiceBuilder::new(service_impl)
    .auth_provider(auth)
    .build();
The IdentityProvider Trait
IdentityProvider verifies credentials (username/password, OAuth2 code) and returns a VerifiedIdentity. Multiple providers can be registered with a SessionService.
use ras_identity_core::{IdentityProvider, VerifiedIdentity, IdentityError};

#[async_trait]
pub trait IdentityProvider: Send + Sync {
    fn provider_id(&self) -> &str;
    async fn verify(&self, payload: serde_json::Value) -> Result<VerifiedIdentity, IdentityError>;
}
Built-in Providers
Local (username/password):
use ras_identity_local::LocalUserProvider;

let provider = LocalUserProvider::new();
provider.add_user(
    "alice".into(),
    "secure_password".into(),
    Some("alice@example.com".into()),
    Some("Alice".into()),
).await?;
Security features: Argon2 password hashing, timing attack resistance, username enumeration prevention, rate limiting (5 concurrent auth attempts).
OAuth2 (external IdP):
use ras_identity_oauth2::{
    InMemoryStateStore, OAuth2Config, OAuth2Provider, OAuth2ProviderConfig,
};
use std::sync::Arc;

let google_config = OAuth2ProviderConfig {
    provider_id: "google".into(),
    client_id: "your-client-id".into(),
    client_secret: "your-client-secret".into(),
    authorization_endpoint: "https://accounts.google.com/o/oauth2/v2/auth".into(),
    token_endpoint: "https://oauth2.googleapis.com/token".into(),
    userinfo_endpoint: Some("https://www.googleapis.com/oauth2/v2/userinfo".into()),
    redirect_uri: "http://localhost:3000/auth/callback".into(),
    scopes: vec!["openid".into(), "email".into(), "profile".into()],
    use_pkce: true,
    auth_params: Default::default(),
    user_info_mapping: None,
};

let state_store = Arc::new(InMemoryStateStore::new());
let oauth_provider = OAuth2Provider::new(
    OAuth2Config::new().add_provider(google_config),
    state_store,
);
PKCE is used by default for authorization code flow.
Custom Provider
Implement IdentityProvider for any auth backend:
struct LdapProvider { /* config */ }

#[async_trait]
impl IdentityProvider for LdapProvider {
    fn provider_id(&self) -> &str { "ldap" }

    async fn verify(&self, payload: serde_json::Value) -> Result<VerifiedIdentity, IdentityError> {
        let username = payload["username"].as_str().ok_or(IdentityError::InvalidCredentials)?;
        let password = payload["password"].as_str().ok_or(IdentityError::InvalidCredentials)?;
        // LDAP verification...
        Ok(VerifiedIdentity {
            provider_id: self.provider_id().into(),
            subject: username.into(),
            email: None,
            display_name: None,
            metadata: None,
        })
    }
}
Session Management
SessionService orchestrates identity verification and JWT session creation:
use ras_identity_session::{SessionService, SessionConfig, JwtAuthProvider};
use chrono::Duration;

let mut config = SessionConfig::new(
    std::env::var("JWT_SECRET").expect("JWT_SECRET must be set"),
)?;
config.jwt_ttl = Duration::hours(1);
config.refresh_enabled = false;

let session_service = Arc::new(SessionService::new(config)?);

// Register identity providers
session_service.register_provider(Box::new(local_provider)).await;
session_service.register_provider(Box::new(oauth_provider)).await;

// Create JWT auth provider for use with service macros
let auth: Arc<dyn AuthProvider> = Arc::new(JwtAuthProvider::new(session_service.clone()));
Session lifecycle:
// Authenticate and create session
let jwt = session_service.begin_session("local", json!({
    "username": "alice",
    "password": "secure_password"
})).await?;

// Verify session (used internally by JwtAuthProvider)
let user = session_service.verify_session(&jwt).await?;

// End session (logout, revokes token by JTI claim)
session_service.end_session(&jti).await;
Permission Design (RBAC)
Implement UserPermissions to map identities to permissions:
use ras_identity_core::{UserPermissions, VerifiedIdentity, IdentityResult};

struct RoleBasedPermissions { /* role store */ }

#[async_trait]
impl UserPermissions for RoleBasedPermissions {
    async fn get_permissions(&self, identity: &VerifiedIdentity) -> IdentityResult<Vec<String>> {
        Ok(match identity.subject.as_str() {
            "admin" => vec!["user".into(), "admin".into()],
            _ => vec!["user".into()],
        })
    }
}

session_service.with_permissions(Arc::new(RoleBasedPermissions { /* ... */ }));
Permission Naming
Use resource:action format for fine-grained control:
- tasks:read, tasks:write, tasks:delete
- users:manage, admin:*
In macros, check specific permissions rather than roles:
// Good — checks capability
POST WITH_PERMISSIONS(["tasks:write"]) tasks(CreateTaskRequest) -> Task,

// Avoid — checks role (less flexible)
POST WITH_PERMISSIONS(["admin"]) tasks(CreateTaskRequest) -> Task,
Auth Error Handling
Define auth errors with thiserror, never leak internal details:
#[derive(Debug, thiserror::Error)]
pub enum AuthError {
    #[error("authentication required")]
    AuthenticationRequired,
    #[error("invalid or expired token")]
    InvalidToken,
    #[error("insufficient permissions")]
    InsufficientPermissions,
    #[error("internal authentication error")]
    Internal(#[source] anyhow::Error),
}
The Internal variant logs the source error but only returns "internal authentication error" to the client.
Security Checklist
- HTTPS in production — terminate TLS at the load balancer or use axum-server with rustls
- Strong JWT secrets — generate cryptographically secure secrets, never hardcode
- Short token TTL — 1 hour for access tokens, rotate via refresh tokens if needed
- Environment config — secrets from env vars or secret manager, never in code or config files
- Rate limit auth endpoints — the local provider limits to 5 concurrent attempts; add your own for login routes
- Audit auth failures — log failed auth attempts with request metadata (IP, user-agent) for incident response (see the sketch below)
- Validate all inputs — request types get serde deserialization, but validate business constraints in handlers
- Sanitize error responses — use RestError::with_internal() to log details without exposing them
- Do not persist bearer tokens in browser localStorage — the generated explorers keep entered tokens in sessionStorage; follow the same pattern for debugging tools
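For the audit item, a sketch of logging failures inside a custom AuthProvider — verify_token is a hypothetical helper, and request metadata like IP would come from your middleware layer:

fn authenticate(&self, token: String) -> AuthFuture<'_> {
    Box::pin(async move {
        match self.verify_token(&token).await {
            Ok(user) => Ok(user),
            Err(e) => {
                // Log the real reason for operators; the client only sees InvalidToken
                tracing::warn!(error = %e, "authentication failure");
                Err(AuthError::InvalidToken)
            }
        }
    })
}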
Related Skills
For the trait-as-interface pattern behind AuthProvider, see the rust-architecture skill.
For auth level syntax and endpoint definition, see the ras-api-design skill.
For testing with FakeAuthProvider, see the ras-best-practices skill.
name: ras-setup description: Use when the user asks to create a new RAS project, set up a Rust Agent Stack workspace, configure Cargo.toml for RAS crates, add RAS dependencies, scaffold a service crate, or asks about RAS workspace structure and crate organization. version: 1.0.0
RAS Project Setup — Workspace Scaffolding & Dependencies
RAS projects follow the same workspace-first, crate-split conventions from the rust-project-setup skill. The key addition: an API crate where macro invocations define the service contract, sitting between the domain core and the binary that wires everything together.
Starter template: For a ready-to-compile RAS project with tests, see the scaffold-fullstack skill.
Project Structure
A typical RAS project adds an API crate to the standard layout:
my-project/
├── Cargo.toml # workspace root
├── crates/
│ ├── my-core/ # domain types, traits (ports), pure logic
│ │ ├── Cargo.toml
│ │ └── src/lib.rs
│ ├── my-api/ # RAS macro invocations + request/response types
│ │ ├── Cargo.toml
│ │ └── src/lib.rs
│ ├── my-service/ # binary: wires implementations into generated builders
│ │ ├── Cargo.toml
│ │ └── src/main.rs
│ └── my-testutils/ # shared fakes and fixtures (dev-dependency only)
│ ├── Cargo.toml
│ └── src/lib.rs
- my-core — Domain types and port traits. No IO dependencies. No RAS dependency.
- my-api — Invokes rest_service!, jsonrpc_service!, etc. Defines request/response types. Depends on RAS macro crates + my-core.
- my-service — Implements the generated service traits, constructs adapters, wires auth via Arc<dyn AuthProvider>, and runs the Axum server.
- my-testutils — Hand-written fakes including FakeAuthProvider. Only a [dev-dependencies] entry.
The API crate exists because macro invocations generate both server traits and a native Rust client — keeping them in a separate crate lets other services depend on just the client and shared types (via features = ["client"]) without pulling in the full service implementation.
Where RAS Macros Live
Macros belong in the API crate, not the domain crate. The API crate defines the contract:
// crates/my-api/src/lib.rs
use ras_rest_macro::rest_service;
use serde::{Deserialize, Serialize};
use schemars::JsonSchema;

// Request/response types — must derive all three
#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct CreateTaskRequest {
    pub title: String,
    pub description: String,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct Task {
    pub id: String,
    pub title: String,
    pub completed: bool,
}

#[derive(Debug, Clone, Serialize, Deserialize, JsonSchema)]
pub struct TasksResponse {
    pub tasks: Vec<Task>,
    pub total: usize,
}

rest_service!({
    service_name: TaskService,
    base_path: "/api/v1",
    openapi: true,
    serve_docs: true,
    endpoints: [
        GET UNAUTHORIZED tasks() -> TasksResponse,
        POST WITH_PERMISSIONS(["user"]) tasks(CreateTaskRequest) -> Task,
        GET UNAUTHORIZED tasks/{id: String}() -> Task,
        DELETE WITH_PERMISSIONS(["admin"]) tasks/{id: String}() -> (),
    ]
});
The domain crate stays clean — no macro dependencies, no HTTP/RPC concerns.
With serve_docs: true, the router hosts the built-in RAS API explorer at /api/v1/docs and the OpenAPI JSON at /api/v1/docs/openapi.json. The explorer supports bearer-token testing and keeps tokens in browser sessionStorage, not persistent localStorage.
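With the wiring example below running on localhost:3000, both can be checked by hand (paths as configured above):

curl http://localhost:3000/api/v1/docs/openapi.json
# open http://localhost:3000/api/v1/docs in a browser for the explorer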
Workspace Cargo.toml
Follow rust-project-setup conventions with RAS crates in [workspace.dependencies]:
[workspace]
members = ["crates/*"]
resolver = "3"
[workspace.package]
edition = "2024"
rust-version = "1.85"
[workspace.dependencies]
# RAS crates (not on crates.io — use git dependency)
ras-rest-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-rest-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-file-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-auth-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-session = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-local = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-oauth2 = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-observability-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-observability-otel = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
# Standard deps
serde = { version = "1", features = ["derive"] }
schemars = "1.0.0-alpha.20"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
axum = "0.8"
anyhow = "1"
thiserror = "2"
tracing = "0.1"
async-trait = "0.1"
[workspace.lints.clippy]
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
unwrap_used = "warn"
[workspace.lints.rust]
unsafe_code = "deny"
Read references/cargo-toml-templates.md for complete, copy-pasteable member crate templates.
API Crate Cargo.toml
The API crate depends on RAS macro crates and serialization:
[package]
name = "my-api"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
ras-rest-macro.workspace = true
ras-rest-core.workspace = true
ras-auth-core.workspace = true
serde.workspace = true
schemars.workspace = true
async-trait.workspace = true
[features]
default = ["server", "client"]
server = []
client = []
All request/response types must derive Serialize, Deserialize, and JsonSchema. Missing JsonSchema causes a compile error when openapi: true is set.
Binary Crate — Wiring
The service crate implements the generated trait and wires everything together using Arc<dyn AuthProvider>:
// crates/my-service/src/main.rs
use anyhow::Context;
use my_api::{CreateTaskRequest, Task, TaskServiceBuilder, TaskServiceTrait, TasksResponse};
use ras_auth_core::AuthenticatedUser;
use ras_rest_core::{RestResult, RestResponse, RestError};
use ras_identity_session::{SessionService, SessionConfig, JwtAuthProvider};
use std::sync::Arc;

struct TaskServiceImpl { /* domain deps via Arc<dyn Trait> */ }

#[async_trait::async_trait]
impl TaskServiceTrait for TaskServiceImpl {
    async fn get_tasks(&self) -> RestResult<TasksResponse> {
        Ok(RestResponse::ok(TasksResponse { tasks: vec![], total: 0 }))
    }

    async fn post_tasks(
        &self,
        user: &AuthenticatedUser,
        request: CreateTaskRequest,
    ) -> RestResult<Task> {
        // Authenticated endpoints receive &AuthenticatedUser automatically
        Ok(RestResponse::created(Task { /* ... */ }))
    }

    // ...
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let jwt_secret = std::env::var("JWT_SECRET").context("JWT_SECRET must be set")?;
    let session_service = Arc::new(SessionService::new(SessionConfig::new(jwt_secret)?)?);
    let auth: Arc<dyn ras_auth_core::AuthProvider> = Arc::new(JwtAuthProvider::new(session_service));

    let service = TaskServiceImpl { /* inject domain deps */ };
    let router = TaskServiceBuilder::new(service)
        .auth_provider(auth)
        .build();

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    axum::serve(listener, router).await.context("server failed")?;
    Ok(())
}
Adding a New Service to an Existing Workspace
- Create the API crate: cargo new crates/my-new-api --lib
- Add RAS macro deps to its Cargo.toml (inherit from workspace)
- Define types and invoke the service macro
- Implement the generated trait in the service crate (or a new service crate)
- Wire into the existing Axum router via .merge() or .nest()
Multiple RAS services compose naturally — each macro generates an independent Router:
let app = Router::new()
    .merge(task_router)
    .merge(user_router)
    .merge(otel.metrics_router());
Related Skills
For general workspace conventions, crate-split decisions, and feature flag strategy, see the rust-project-setup skill.
For the trait-as-interface DI pattern used for AuthProvider wiring, see the rust-architecture skill.
For macro syntax and endpoint definition, see the ras-api-design skill.
Cargo.toml Templates for RAS Projects
Workspace Root
[workspace]
members = ["crates/*"]
resolver = "3"
[workspace.package]
edition = "2024"
rust-version = "1.85"
[workspace.dependencies]
# RAS — not on crates.io, use git dependency. Include only the crates you use.
ras-rest-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-rest-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-file-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-bidirectional-macro = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-bidirectional-server = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-jsonrpc-bidirectional-client = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-auth-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-session = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-local = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-identity-oauth2 = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-observability-core = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
ras-observability-otel = { git = "https://github.com/JedimEmO/rust-agent-stack.git" }
# Standard
serde = { version = "1", features = ["derive"] }
schemars = "1.0.0-alpha.20"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
axum = "0.8"
anyhow = "1"
thiserror = "2"
tracing = "0.1"
async-trait = "0.1"
chrono = { version = "0.4", features = ["serde"] }
uuid = { version = "1", features = ["v4", "serde"] }
# Testing
axum-test = "18"
[workspace.lints.clippy]
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
unwrap_used = "warn"
[workspace.lints.rust]
unsafe_code = "deny"
API Crate (REST)
[package]
name = "my-api"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
ras-rest-macro.workspace = true
ras-rest-core.workspace = true
ras-auth-core.workspace = true
serde.workspace = true
schemars.workspace = true
async-trait.workspace = true
# Domain types
my-core = { path = "../my-core" }
[features]
default = ["server", "client"]
server = []
client = []
API Crate (JSON-RPC)
[package]
name = "my-rpc-api"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
ras-jsonrpc-macro.workspace = true
ras-jsonrpc-core.workspace = true
ras-auth-core.workspace = true
serde.workspace = true
schemars.workspace = true
async-trait.workspace = true
my-core = { path = "../my-core" }
API Crate (File Service)
[package]
name = "my-file-api"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
ras-file-macro.workspace = true
ras-rest-core.workspace = true
ras-auth-core.workspace = true
serde.workspace = true
schemars.workspace = true
async-trait.workspace = true
Service / Binary Crate
[package]
name = "my-service"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
my-core = { path = "../my-core" }
my-api = { path = "../my-api" }
ras-auth-core.workspace = true
ras-rest-core.workspace = true
ras-identity-session.workspace = true
ras-identity-local.workspace = true
ras-observability-otel.workspace = true
axum.workspace = true
tokio.workspace = true
anyhow.workspace = true
tracing.workspace = true
async-trait.workspace = true
[dev-dependencies]
my-testutils = { path = "../my-testutils" }
axum-test.workspace = true
Core / Domain Crate
[package]
name = "my-core"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
serde.workspace = true
thiserror.workspace = true
chrono.workspace = true
uuid.workspace = true
async-trait.workspace = true
No RAS dependencies here — the domain crate stays pure.
Test Utilities Crate
[package]
name = "my-testutils"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
my-core = { path = "../my-core" }
ras-auth-core.workspace = true
serde.workspace = true
async-trait.workspace = true
This crate exports hand-written fakes (FakeAuthProvider, FakeUserRepository, etc.) and test fixture builders. Only ever referenced as [dev-dependencies].
name: rust-architecture description: Use when the user asks about dependency injection in Rust, trait-as-interface patterns, module boundaries, hexagonal architecture, ports and adapters, error handling strategy, when to use generics vs dyn Trait, how to structure application layers, or how to wire dependencies together. version: 1.0.0
Rust Architecture — DI, Trait Boundaries & Error Handling
Opinionated patterns for structuring Rust applications around testability and clean dependency boundaries. The core principle: domain logic never knows about IO.
Trait-as-Interface
Every significant dependency boundary is a trait. The trait is the port; the struct implementing it is the adapter.
- Define traits in the core/domain crate (or module). They describe what the system needs, not how it's done.
- Implement traits in adapter crates/modules. These are the concrete HTTP clients, database repos, file handlers.
- The application layer wires concrete adapters into domain logic.
- All port traits require Send + Sync — this enables Arc<dyn Trait> for async/multi-threaded use and test sharing.
// In core — defines what we need
pub trait UserRepository: Send + Sync {
    fn find_by_id(&self, id: &UserId) -> Result<Option<User>, UserRepoError>;
    fn find_by_email(&self, email: &str) -> Result<Option<User>, UserRepoError>;
    fn save(&self, user: &User) -> Result<(), UserRepoError>;
    fn delete(&self, id: &UserId) -> Result<(), UserRepoError>;
}

// In adapter — implements it
pub struct SqliteUserRepository { /* connection */ }
impl UserRepository for SqliteUserRepository { /* real IO */ }
Traits should be object-safe when practical — this preserves the option of using dyn dispatch for app-level wiring and testing.
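As a quick illustration of that boundary (Clock and now_as are hypothetical names, not part of the examples above):

pub trait Clock: Send + Sync {
    // Object-safe: fixed signature, one vtable entry
    fn now_epoch_secs(&self) -> i64;

    // NOT object-safe: a generic method cannot go in a vtable,
    // so uncommenting this would make `Arc<dyn Clock>` fail to compile
    // fn now_as<T: From<i64>>(&self) -> T;
}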
Generics vs dyn Trait — Context-Dependent
This is not an either/or choice. Use both, in different contexts:
Generics: for hot paths and library code
Use generics with trait bounds when the concrete type is known at compile time within a crate, or when you're writing library code where monomorphization matters.
pub fn find_active_users<R: UserRepository>(
    repo: &R,
    ids: &[UserId],
) -> Result<Vec<User>, UserRepoError> {
    ids.iter()
        .filter_map(|id| match repo.find_by_id(id) {
            Ok(Some(user)) => Some(Ok(user)),
            Ok(None) => None,
            Err(e) => Some(Err(e)),
        })
        .collect()
}
Use generics when:
- The function is called in a tight loop
- It's part of a library's public API
- You want the compiler to inline and optimize
- The concrete type is known at the call site
dyn Trait: for app-level wiring and test doubles
Use Arc<dyn Trait> when composing the application, holding dependencies in long-lived structs, or swapping implementations in tests. Arc is preferred over Box because it allows sharing the same instance between the service and the test harness.
pub struct UserService {
    users: Arc<dyn UserRepository>,
    notifier: Arc<dyn Notifier>,
}

impl UserService {
    pub fn new(users: Arc<dyn UserRepository>, notifier: Arc<dyn Notifier>) -> Self {
        Self { users, notifier }
    }
}
Use dyn when:
- Constructing or holding application-level service objects
- The indirection cost is negligible (one vtable lookup per call, not in a tight loop)
- You need to swap implementations at runtime or in tests
- The struct is long-lived and owns its dependencies
Decision rule
If you're unsure: when a function constructs or holds dependencies, use dyn; when it processes data, use generics. Both can coexist in the same codebase.
Hexagonal-ish Architecture
Three layers, loosely held. You don't need a framework — the principle is enough.
Core / Domain
Pure logic — no IO, no runtime dependency. Can use async when the domain is inherently async, but must not depend on a specific runtime. Depends only on std and domain-specific crates (chrono, uuid, serde for derive). Defines traits (ports) for every external dependency it needs.
This layer is trivially testable — no fakes needed for pure functions, and trait-based fakes for anything with dependencies.
Adapters
Implement the port traits. HTTP clients, database access, file IO, message queues. Each adapter depends on the core crate (for the trait definition) plus its IO crates (reqwest, diesel, etc.).
Adapters live in separate modules or separate crates, depending on project size (see rust-project-setup for when to split).
App / Wiring
main.rs or an app module that constructs concrete adapters and injects them into domain services. This is where dyn dispatch happens. This layer reads config, sets up logging/tracing, and builds the dependency graph.
fn main() -> anyhow::Result<()> {
    let config = Config::from_env()?;
    let mut db_conn = SqliteConnection::establish(&config.database_url)?;
    run_pending_migrations(&mut db_conn)?;

    let users: Arc<dyn UserRepository> = Arc::new(SqliteUserRepository::new(db_conn));
    let notifier: Arc<dyn Notifier> = Arc::new(EmailNotifier::new(&config.smtp));

    let service = UserService::new(users, notifier);
    run_server(service, &config.bind_addr)
}
Module boundaries in a single crate
Even before splitting into multiple crates, enforce the layering at the module level:
src/
├── domain/ # traits, types, pure logic — no `use crate::infra`
├── infra/ # trait implementations, IO
└── bin/main.rs # wiring
The rule: domain/ never imports from infra/. Enforce this by convention (and by code review). When this boundary justifies a crate split, it's a clean move.
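A sketch of what the rule looks like at the import level (module and type names are illustrative):

// src/domain/user.rs
use chrono::{DateTime, Utc};       // fine: domain-appropriate crate
// use crate::infra::db::Pool;     // forbidden: domain never sees infra

// src/infra/sqlite.rs
use crate::domain::{User, UserRepository};  // fine: infra implements domain ports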
Error Handling Strategy
Libraries: thiserror
Reusable crates define typed error enums. Errors are part of the public API.
use thiserror::Error;

#[derive(Debug, Error)]
pub enum UserRepoError {
    #[error("user not found: {0:?}")]
    NotFound(UserId),
    #[error("duplicate email: {0}")]
    DuplicateEmail(String),
    #[error("storage error: {0}")]
    Storage(String),
}
- Each crate defines its own error type
- Use #[from] for automatic conversion from downstream errors where appropriate
- Keep variants meaningful — don't wrap every error in a generic Internal(String)
Applications: anyhow
Binary crates and top-level application code use anyhow for ergonomic error propagation.
use anyhow::{Context, Result};

fn load_config() -> Result<Config> {
    let raw = std::fs::read_to_string("config.toml")
        .context("failed to read config file")?;
    let config: Config = toml::from_str(&raw)
        .context("failed to parse config")?;
    Ok(config)
}
- Result<T> means anyhow::Result<T> in app code
- Use .context("what we were doing") liberally — it creates a chain of context that makes debugging straightforward
- Library errors convert automatically via the Error trait
The boundary
At the point where library errors enter application code, add context:
let user = repo.find_by_id(&id)
    .context("failed to look up user during checkout")?;
Adapter crates that serve only one application can use either style. If the adapter might be reused, use thiserror.
Async Strategy — Runtime-Agnostic Core
The core/domain crate can absolutely use async — the key constraint is no runtime dependency. Domain logic can define and use async traits, return futures, and await other domain operations. What it must not do is pull in tokio, async-std, or any specific runtime as a dependency.
- Domain traits freely use async fn when the domain is inherently async
- The core crate should not depend on a specific runtime — no tokio::spawn, no tokio::time::sleep
- The binary crate selects the runtime (tokio, async-std, etc.)
- Pure domain functions that don't need async should remain synchronous — don't make everything async just because some things are
If a trait needs async methods, use async fn in trait (stabilized in Rust 1.75+). Note that native async trait methods are not object-safe — use the async-trait crate if you need dyn dispatch with async:
use async_trait::async_trait;

#[async_trait]
pub trait EventStore: Send + Sync {
    async fn append(&self, stream: &str, events: &[Event]) -> Result<(), EventStoreError>;
    async fn read_stream(&self, stream: &str) -> Result<Vec<Event>, EventStoreError>;
}
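For contrast, a sketch of the native form, usable behind generics but not behind dyn (EventSource, Event, and drain are illustrative names, not from any crate above):

pub trait EventSource: Send + Sync {
    // Native `async fn` in trait (Rust 1.75+): fine with generics,
    // but `Arc<dyn EventSource>` would not compile without async-trait
    async fn next_event(&self) -> Option<Event>;
}

pub async fn drain<S: EventSource>(source: &S) -> Vec<Event> {
    let mut events = Vec::new();
    while let Some(event) = source.next_event().await {
        events.push(event);
    }
    events
}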
Related Skills
For workspace layout and crate splitting decisions, see the rust-project-setup skill. For hand-written fakes and testing patterns that exercise these trait boundaries, see the rust-testing skill.
Trait Patterns — Complete Examples
Concrete, copy-pasteable examples that complement the patterns in the SKILL.md.
Port Trait + Domain Types (core crate)
use thiserror::Error;

#[derive(Debug, Clone, PartialEq)]
pub struct UserId(pub String);

#[derive(Debug, Clone)]
pub struct User {
    pub id: UserId,
    pub email: String,
    pub name: String,
}

#[derive(Debug, Error)]
pub enum UserRepoError {
    #[error("user not found: {0:?}")]
    NotFound(UserId),
    #[error("duplicate email: {0}")]
    DuplicateEmail(String),
    #[error("storage error: {0}")]
    Storage(String),
}

pub trait UserRepository: Send + Sync {
    fn find_by_id(&self, id: &UserId) -> Result<Option<User>, UserRepoError>;
    fn find_by_email(&self, email: &str) -> Result<Option<User>, UserRepoError>;
    fn save(&self, user: &User) -> Result<(), UserRepoError>;
    fn delete(&self, id: &UserId) -> Result<(), UserRepoError>;
}
Diesel Schema & Models (adapter crate)
The schema module is auto-generated by diesel print-schema — don't edit it by hand. Configure the output path in diesel.toml.
// src/infra/schema.rs
diesel::table! {
    users (id) {
        id -> Text,
        email -> Text,
        name -> Text,
    }
}
Diesel models are private to the adapter — they are not the domain types. Keep the mapping explicit.
use diesel::prelude::*;
use my_core::{User, UserId};
use crate::infra::schema::users;

#[derive(Queryable, Selectable)]
#[diesel(table_name = users)]
struct UserRow {
    id: String,
    email: String,
    name: String,
}

#[derive(Insertable, AsChangeset)]
#[diesel(table_name = users)]
struct NewUserRow<'a> {
    id: &'a str,
    email: &'a str,
    name: &'a str,
}

impl UserRow {
    fn into_domain(self) -> User {
        User { id: UserId(self.id), email: self.email, name: self.name }
    }
}

impl<'a> NewUserRow<'a> {
    fn from_domain(user: &'a User) -> Self {
        NewUserRow { id: &user.id.0, email: &user.email, name: &user.name }
    }
}
SQLite Adapter — Repository Implementation
SqliteConnection is not Sync, so wrap it in a Mutex for use behind Arc<dyn Trait>.
use std::sync::Mutex;
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use my_core::{User, UserId, UserRepository, UserRepoError};
use crate::infra::schema::users::dsl;

pub struct SqliteUserRepository {
    conn: Mutex<SqliteConnection>,
}

impl SqliteUserRepository {
    pub fn new(conn: SqliteConnection) -> Self {
        Self { conn: Mutex::new(conn) }
    }
}

impl UserRepository for SqliteUserRepository {
    fn find_by_id(&self, id: &UserId) -> Result<Option<User>, UserRepoError> {
        let mut conn = self.conn.lock().map_err(|e| UserRepoError::Storage(e.to_string()))?;
        dsl::users
            .filter(dsl::id.eq(&id.0))
            .select(UserRow::as_select())
            .first::<UserRow>(&mut *conn)
            .optional()
            .map(|opt| opt.map(UserRow::into_domain))
            .map_err(|e| UserRepoError::Storage(e.to_string()))
    }

    fn find_by_email(&self, email: &str) -> Result<Option<User>, UserRepoError> {
        let mut conn = self.conn.lock().map_err(|e| UserRepoError::Storage(e.to_string()))?;
        dsl::users
            .filter(dsl::email.eq(email))
            .select(UserRow::as_select())
            .first::<UserRow>(&mut *conn)
            .optional()
            .map(|opt| opt.map(UserRow::into_domain))
            .map_err(|e| UserRepoError::Storage(e.to_string()))
    }

    fn save(&self, user: &User) -> Result<(), UserRepoError> {
        let mut conn = self.conn.lock().map_err(|e| UserRepoError::Storage(e.to_string()))?;
        let new_row = NewUserRow::from_domain(user);
        diesel::insert_into(dsl::users)
            .values(&new_row)
            .on_conflict(dsl::id)
            .do_update()
            .set(&new_row)
            .execute(&mut *conn)
            .map_err(|e| UserRepoError::Storage(e.to_string()))?;
        Ok(())
    }

    fn delete(&self, id: &UserId) -> Result<(), UserRepoError> {
        let mut conn = self.conn.lock().map_err(|e| UserRepoError::Storage(e.to_string()))?;
        diesel::delete(dsl::users.filter(dsl::id.eq(&id.0)))
            .execute(&mut *conn)
            .map_err(|e| UserRepoError::Storage(e.to_string()))?;
        Ok(())
    }
}
Diesel setup: Run diesel setup to create the database and migrations directory. Migrations live in migrations/ and are run with diesel migration run. Add diesel.toml at the project root to configure print-schema output location (typically src/infra/schema.rs). For embedded migrations in the binary, use diesel_migrations::embed_migrations!.
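A minimal diesel.toml matching that layout (the path is the one assumed by the examples above):

[print_schema]
file = "src/infra/schema.rs"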
Concurrency: The Mutex<SqliteConnection> shown above serializes all DB access through one connection. For production with concurrent requests, use r2d2::Pool<ConnectionManager<SqliteConnection>> instead.
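A sketch of that pooled variant, assuming diesel's r2d2 feature is enabled; the pool size is an arbitrary example value:

use diesel::r2d2::{ConnectionManager, Pool};
use diesel::sqlite::SqliteConnection;

pub type SqlitePool = Pool<ConnectionManager<SqliteConnection>>;

pub fn build_pool(database_url: &str) -> anyhow::Result<SqlitePool> {
    let manager = ConnectionManager::<SqliteConnection>::new(database_url);
    // Handlers check out their own connection with pool.get()
    Ok(Pool::builder().max_size(8).build(manager)?)
}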
Service with Arc<dyn Trait> Dependencies
use std::sync::Arc;
use anyhow::Context;
use my_core::{User, UserId, UserRepository, Notifier};

pub struct UserService {
    users: Arc<dyn UserRepository>,
    notifier: Arc<dyn Notifier>,
}

impl UserService {
    pub fn new(users: Arc<dyn UserRepository>, notifier: Arc<dyn Notifier>) -> Self {
        Self { users, notifier }
    }

    pub fn register(&self, email: String, name: String) -> anyhow::Result<User> {
        if self.users.find_by_email(&email)?.is_some() {
            anyhow::bail!("user with email {} already exists", email);
        }
        let user = User { id: UserId(generate_id()), email, name };
        self.users.save(&user)
            .context("failed to save new user")?;
        self.notifier.send_welcome(&user)
            .context("failed to send welcome notification")?;
        Ok(user)
    }
}
thiserror Enum with #[from] Conversions
use thiserror::Error;

#[derive(Debug, Error)]
pub enum OrderError {
    #[error("order not found: {0}")]
    NotFound(String),
    #[error("insufficient stock for item {item_id}: requested {requested}, available {available}")]
    InsufficientStock {
        item_id: String,
        requested: u32,
        available: u32,
    },
    #[error("user error")]
    User(#[from] UserRepoError),
    #[error("payment failed")]
    Payment(#[from] PaymentError),
}
App Wiring with anyhow
use std::sync::Arc;
use anyhow::{Context, Result};

#[tokio::main]
async fn main() -> Result<()> {
    let config = Config::from_env()
        .context("failed to load configuration")?;

    let mut conn = SqliteConnection::establish(&config.database_url)
        .context("failed to open database")?;
    run_pending_migrations(&mut conn)
        .context("failed to run migrations")?;

    let users: Arc<dyn UserRepository> = Arc::new(SqliteUserRepository::new(conn));
    let notifier: Arc<dyn Notifier> = Arc::new(EmailNotifier::new(&config.smtp));
    let service = UserService::new(users, notifier);

    let server = build_server(service, &config)
        .context("failed to build HTTP server")?;
    server.run().await
        .context("server exited with error")
}
Notifier Trait (secondary port)
pub trait Notifier: Send + Sync {
    fn send_welcome(&self, user: &User) -> Result<(), NotifyError>;
    fn send_password_reset(&self, user: &User, token: &str) -> Result<(), NotifyError>;
}

pub struct EmailNotifier { /* smtp config */ }
impl Notifier for EmailNotifier { /* sends real emails */ }
For the corresponding FakeNotifier with configurable failure, see the rust-testing skill's reference file.
Async Trait with async-trait (object-safe)
When you need dyn dispatch with async methods, use the async-trait crate:
use async_trait::async_trait;

#[async_trait]
pub trait EventStore: Send + Sync {
    async fn append(&self, stream: &str, events: &[Event]) -> Result<(), EventStoreError>;
    async fn read_stream(&self, stream: &str) -> Result<Vec<Event>, EventStoreError>;
}

#[async_trait]
impl EventStore for SqliteEventStore {
    async fn append(&self, stream: &str, events: &[Event]) -> Result<(), EventStoreError> {
        // Use tokio::task::spawn_blocking for Diesel calls in async context
        todo!()
    }

    async fn read_stream(&self, stream: &str) -> Result<Vec<Event>, EventStoreError> {
        todo!()
    }
}
Note: diesel::SqliteConnection is blocking and not Sync. When using it inside async code, wrap calls in tokio::task::spawn_blocking, or use a connection pool (r2d2) where each task checks out its own connection.
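A sketch of the spawn_blocking pattern inside one such method; the pool field, the Storage error variant, and Event: Clone are assumptions made for illustration:

#[async_trait]
impl EventStore for SqliteEventStore {
    async fn append(&self, stream: &str, events: &[Event]) -> Result<(), EventStoreError> {
        // Clone what the blocking task needs; the closure must be 'static
        let pool = self.pool.clone();
        let stream = stream.to_owned();
        let events = events.to_vec();
        tokio::task::spawn_blocking(move || {
            let mut conn = pool.get()
                .map_err(|e| EventStoreError::Storage(e.to_string()))?;
            // ... run blocking Diesel inserts for `stream`/`events` with &mut *conn ...
            Ok::<(), EventStoreError>(())
        })
        .await
        .map_err(|e| EventStoreError::Storage(e.to_string()))?
    }

    // read_stream follows the same checkout-then-spawn_blocking shape
}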
name: rust-ci-tooling description: Use when the user asks about CI/CD for Rust projects, clippy lints, rustfmt configuration, feature flag CI matrix, workspace-level tooling, justfile or Makefile setup, GitHub Actions for Rust, or pre-commit hooks for Rust projects. version: 1.0.0
Rust CI & Tooling — Clippy, Formatting & Automation
Workspace-level tooling configuration and CI patterns. Keep it simple — use cargo test and standard tooling. The goal: a single just ci command that catches everything, and a GitHub Actions workflow that mirrors it.
Clippy Configuration
Configure clippy at the workspace level in the root Cargo.toml:
[workspace.lints.clippy]
# Start with pedantic, then selectively allow the noisy ones
pedantic = { level = "warn", priority = -1 }
# These fire too often to be useful
module_name_repetitions = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
# Enforce in library crates (but allow in tests/binaries via per-crate overrides)
unwrap_used = "warn"
[workspace.lints.rust]
unsafe_code = "deny"
Member crates inherit with:
[lints]
workspace = true
Binary/test crates can relax specific lints:
[lints]
workspace = true
[lints.clippy]
unwrap_used = "allow" # fine in binaries and tests
CI command
cargo clippy --workspace --all-targets -- -D warnings
Everything after the -- is passed through to the compiler; -D warnings promotes all warnings to errors, so any clippy finding blocks the CI build.
rustfmt Configuration
Create .rustfmt.toml at the workspace root:
edition = "2021"
max_width = 100
CI command
cargo fmt --all -- --check
Feature Flag CI Strategy
Test multiple feature configurations to catch conditional compilation issues:
# Default features (what users get)
cargo test --workspace
# No default features (catches missing feature guards)
cargo test --workspace --no-default-features
# All features (catches feature conflicts)
cargo test --workspace --all-features
For crates with meaningful feature combinations, add a CI matrix. Document tested combinations in the workflow.
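One way to express that as a job in the workflow below; the flag list is a placeholder to replace with your crate's real combinations:

  feature-matrix:
    name: Feature matrix
    runs-on: ubuntu-latest
    strategy:
      matrix:
        flags: ["", "--no-default-features", "--all-features"]
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2
      - run: cargo test --workspace ${{ matrix.flags }}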
justfile — Workspace Task Runner
Use just (https://just.systems) for common workspace commands. Create a justfile at the workspace root:
# Default: run all checks
default: check test

# Format code
fmt:
    cargo fmt --all

# Check formatting (CI)
fmt-check:
    cargo fmt --all -- --check

# Run clippy
check:
    cargo clippy --workspace --all-targets -- -D warnings

# Run tests
test:
    cargo test --workspace

# Run tests with all feature combinations
test-features:
    cargo test --workspace
    cargo test --workspace --no-default-features
    cargo test --workspace --all-features

# Full CI pipeline locally
ci: fmt-check check test-features

# Build in release mode
build:
    cargo build --workspace --release

# Clean build artifacts
clean:
    cargo clean

# Watch and run tests on change (requires cargo-watch)
watch:
    cargo watch -x 'test --workspace'
GitHub Actions Workflow
Create .github/workflows/ci.yml:
name: CI

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

env:
  CARGO_TERM_COLOR: always
  RUSTFLAGS: "-D warnings"

jobs:
  check:
    name: Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
        with:
          components: clippy, rustfmt
      - uses: Swatinem/rust-cache@v2
      - name: Check formatting
        run: cargo fmt --all -- --check
      - name: Clippy
        run: cargo clippy --workspace --all-targets -- -D warnings

  test:
    name: Test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: Swatinem/rust-cache@v2
      - name: Run tests
        run: cargo test --workspace
      - name: Run tests (no default features)
        run: cargo test --workspace --no-default-features
      - name: Run tests (all features)
        run: cargo test --workspace --all-features
Pre-Commit Hooks
Keep pre-commit hooks lightweight. Only run fast checks:
#!/bin/sh
# .git/hooks/pre-commit (or via pre-commit framework)
cargo fmt --all -- --check
Do NOT run clippy or tests in pre-commit — they're too slow and will frustrate the workflow. Those belong in CI.
Consider typos-cli for catching spelling mistakes:
cargo install typos-cli
typos # runs on all files
Useful Cargo Extensions
| Tool | Purpose | Install |
|---|---|---|
| cargo-watch | Re-run on file changes | cargo install cargo-watch |
| cargo-deny | Audit dependencies (licenses, advisories) | cargo install cargo-deny |
| cargo-machete | Find unused dependencies | cargo install cargo-machete |
| typos-cli | Spell checker for code | cargo install typos-cli |
Related Skills
For workspace-level Cargo.toml setup and [workspace.lints], see the rust-project-setup skill.
For test organization, see the rust-testing skill.
name: rust-project-setup description: Use when the user asks to scaffold a new Rust project, set up a Cargo workspace, configure Cargo.toml, manage workspace dependencies, set up feature flags, decide on crate boundaries, or asks about when to split a single crate into multiple crates. version: 1.0.0
Rust Project Setup — Workspace Scaffolding & Crate Layout
Opinionated guide for structuring Rust projects. Start simple, split when there's a reason to. Every project is a workspace from day one (even single-crate projects benefit from workspace-level configuration).
Starter template: For a ready-to-compile project that demonstrates these patterns, see the scaffold-fullstack skill.
Start with a Single Crate
New projects begin as a single crate with internal module boundaries that anticipate future splits. Don't create multiple crates speculatively.
my-project/
├── Cargo.toml # workspace root + single member
├── crates/
│ └── my-project/
│ ├── Cargo.toml
│ └── src/
│ ├── lib.rs
│ ├── domain/ # pure logic, no IO
│ │ └── mod.rs
│ ├── infra/ # trait implementations, IO
│ │ └── mod.rs
│ └── bin/
│ └── main.rs # wiring, entry point
Even in a single crate, maintain module boundaries: domain/ has no imports from infra/, and bin/main.rs wires adapters into domain logic. This makes future crate splits trivial — just move the module to its own crate.
When to Split into Multiple Crates
Split when one of these conditions is met — not before:
- DI boundary solidifies — You have a trait defined in domain code with multiple real implementations (e.g., a Storage trait with both SQLite and in-memory adapters). Move the trait to a core crate, implementations to adapter crates.
- Compile times suffer — A module has grown large enough that incremental compilation is noticeably slow. Splitting it into its own crate gives better parallelism.
- Reuse across binaries — You need shared logic between multiple binaries (CLI tool + web server, library + integration test harness).
- Independent versioning — A piece of the project is useful as a standalone library with its own semver.
Standard Multi-Crate Layout
When you do split, use this structure:
my-project/
├── Cargo.toml # workspace root (no [package])
├── crates/
│ ├── my-core/ # domain logic, trait definitions, no IO deps
│ │ ├── Cargo.toml
│ │ └── src/lib.rs
│ ├── my-client/ # adapters: HTTP, DB, file IO implementations
│ │ ├── Cargo.toml
│ │ └── src/lib.rs
│ ├── my-app/ # binary: wires core + adapters together
│ │ ├── Cargo.toml
│ │ └── src/main.rs
│ └── my-testutils/ # shared test fakes and fixtures (dev-dependency only)
│ ├── Cargo.toml
│ └── src/lib.rs
- my-core depends only on std and domain-specific crates (e.g., chrono, uuid). Never on IO crates. Defines traits (ports) for external dependencies.
- my-client depends on my-core + IO crates (reqwest, diesel, etc.). Implements the port traits.
- my-app depends on my-core + my-client. Constructs concrete adapters and injects them. Contains main().
- my-testutils exports shared fakes, builders, and fixtures. Only ever a [dev-dependencies] entry.
Workspace Setup
Always use a workspace, even for single-crate projects. The root Cargo.toml:
[workspace]
members = ["crates/*"]
resolver = "2"
[workspace.package]
edition = "2021"
rust-version = "1.75"
[workspace.dependencies]
# Pin shared dependencies here — members inherit with `.workspace = true`
serde = { version = "1", features = ["derive"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
anyhow = "1"
thiserror = "2"
tracing = "0.1"
diesel = { version = "2", features = ["sqlite"] }
diesel_migrations = "2"
[workspace.lints.clippy]
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
unwrap_used = "warn"
[workspace.lints.rust]
unsafe_code = "deny"
Member crates inherit from the workspace:
[package]
name = "my-core"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
[lints]
workspace = true
[dependencies]
serde.workspace = true
thiserror.workspace = true
Workspace Dependencies
All shared dependencies go in [workspace.dependencies]. Member crates reference them with .workspace = true. This ensures version consistency and makes upgrades a single-line change.
Rules:
- If two or more crates use the same dependency, it goes in [workspace.dependencies]
- If only one crate uses a dependency and it's unlikely to be shared, it can be declared locally
- Feature flags on workspace deps are the superset — individual crates can use default-features = false and select specific features
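A sketch of the superset rule in practice: for a member to narrow features, the workspace entry itself declares default-features = false (reqwest here is just an example dependency):

# workspace root
[workspace.dependencies]
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }

# member crate: inherits the pinned version, adds only what it needs
[dependencies]
reqwest = { workspace = true, features = ["stream"] }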
Feature Flags
Conventions:
- Name features in lowercase-kebab-case
- The default feature set should be the common case — users opt out, not in
- Use feature flags for optional functionality (e.g., json-logging, http-client), not for build-time config that should be env vars
- Document features in the crate's Cargo.toml using [package.metadata.docs.rs] or inline comments
- Don't use feature flags to alter core behavior in surprising ways — a feature should add capability, not change existing semantics
- Propagate features through workspace crates explicitly:

[features]
http-client = ["my-client/http-client"]
Related Skills
For dependency inversion patterns and trait-as-interface design, see the rust-architecture skill. For CI configuration and workspace-level tooling, see the rust-ci-tooling skill.
Cargo.toml Templates
Workspace Root (no package)
[workspace]
members = ["crates/*"]
resolver = "2"
[workspace.package]
edition = "2021"
rust-version = "1.75"
license = "MIT"
[workspace.dependencies]
# Core
serde = { version = "1", features = ["derive"] }
serde_json = "1"
anyhow = "1"
thiserror = "2"
# Async (runtime-agnostic where possible)
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
# Database
diesel = { version = "2", features = ["sqlite"] }
diesel_migrations = "2"
# Observability
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
# Testing
assert_matches = "1"
[workspace.lints.clippy]
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
unwrap_used = "warn"
[workspace.lints.rust]
unsafe_code = "deny"
Library Crate (core/domain)
[package]
name = "my-core"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
[lints]
workspace = true
[dependencies]
serde.workspace = true
thiserror.workspace = true
[dev-dependencies]
assert_matches.workspace = true
Adapter Crate (infra/client)
[package]
name = "my-client"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
[lints]
workspace = true
[dependencies]
my-core = { path = "../my-core" }
serde.workspace = true
anyhow.workspace = true
tokio.workspace = true
tracing.workspace = true
diesel.workspace = true
diesel_migrations.workspace = true
[dev-dependencies]
my-testutils = { path = "../my-testutils" }
Binary Crate (app)
[package]
name = "my-app"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
[lints]
workspace = true
[dependencies]
my-core = { path = "../my-core" }
my-client = { path = "../my-client" }
anyhow.workspace = true
tokio.workspace = true
tracing.workspace = true
tracing-subscriber.workspace = true
Test Utilities Crate
[package]
name = "my-testutils"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
publish = false
[lints]
workspace = true
[dependencies]
my-core = { path = "../my-core" }
Feature Flag Patterns
Optional dependency gating
[features]
default = ["json"]
json = ["dep:serde_json"]
http-client = ["dep:reqwest"]
[dependencies]
serde_json = { workspace = true, optional = true }
reqwest = { version = "0.12", features = ["json"], optional = true }
Feature-gated module
// In lib.rs
#[cfg(feature = "http-client")]
pub mod http_client;
Propagating features across workspace crates
# In my-app/Cargo.toml
[features]
default = ["http-client"]
http-client = ["my-client/http-client"]
Single-Crate Project (still a workspace)
[workspace]
members = ["crates/*"]
resolver = "2"
[workspace.package]
edition = "2021"
[workspace.dependencies]
serde = { version = "1", features = ["derive"] }
anyhow = "1"
thiserror = "2"
[workspace.lints.clippy]
pedantic = { level = "warn", priority = -1 }
module_name_repetitions = "allow"
must_use_candidate = "allow"
missing_errors_doc = "allow"
missing_panics_doc = "allow"
unwrap_used = "warn"
[workspace.lints.rust]
unsafe_code = "deny"
With a single member at crates/my-project/Cargo.toml:
[package]
name = "my-project"
version = "0.1.0"
edition.workspace = true
[lints]
workspace = true
[dependencies]
serde.workspace = true
thiserror.workspace = true
[[bin]]
name = "my-project"
path = "src/bin/main.rs"
name: rust-testing description: Use when the user asks about testing Rust code, writing test doubles, creating fakes, organizing test modules, integration vs unit tests, test utilities, test fixtures, test patterns for trait-based dependency injection, test size strategy, or testing HTTP APIs with axum-test. version: 1.0.0
Rust Testing — Fakes, Organization & Strategy
Opinionated testing approach built around hand-written fakes, trait-based dependency injection, and a strong preference for small, fast, in-process tests. No mocking frameworks — no mockall, no #[automock], no mock! macros.
Test Size Strategy — Small > Medium > Large
If you can test it with a fake, do that (small). If you need to verify real IO behavior, keep it in-process (medium). Only go multi-process (large) when there's no alternative.
- Small: No IO. Domain logic + service wiring with fakes. Fast, deterministic, bulk of the suite.
- Medium: In-process IO. SQLite in-memory for DB tests, axum-test for HTTP. No spawned servers.
- Large: Multi-process / external services. Avoid unless genuinely necessary.
Hand-Written Fakes
Every trait gets a hand-written fake. Fakes implement realistic behavior — a FakeUserRepo actually stores and retrieves items. This catches bugs that mocks miss, and tests behavior rather than implementation details.
Since port traits require Send + Sync (needed for Arc<dyn Trait> in async/multi-threaded contexts), fakes use Mutex for interior mutability:
use std::sync::Mutex;
use my_core::{User, UserId, UserRepository, UserRepoError};

pub struct FakeUserRepository {
    users: Mutex<Vec<User>>,
    should_fail: Mutex<bool>,
}

impl FakeUserRepository {
    pub fn new() -> Self {
        Self { users: Mutex::new(Vec::new()), should_fail: Mutex::new(false) }
    }

    pub fn with_users(users: Vec<User>) -> Self {
        Self { users: Mutex::new(users), should_fail: Mutex::new(false) }
    }

    pub fn set_should_fail(&self, fail: bool) {
        *self.should_fail.lock().unwrap() = fail;
    }

    pub fn stored_users(&self) -> Vec<User> {
        self.users.lock().unwrap().clone()
    }
}

impl UserRepository for FakeUserRepository {
    fn find_by_id(&self, id: &UserId) -> Result<Option<User>, UserRepoError> {
        if *self.should_fail.lock().unwrap() {
            return Err(UserRepoError::Storage("fake failure".into()));
        }
        Ok(self.users.lock().unwrap().iter().find(|u| u.id == *id).cloned())
    }

    fn find_by_email(&self, email: &str) -> Result<Option<User>, UserRepoError> {
        if *self.should_fail.lock().unwrap() {
            return Err(UserRepoError::Storage("fake failure".into()));
        }
        Ok(self.users.lock().unwrap().iter().find(|u| u.email == email).cloned())
    }

    fn save(&self, user: &User) -> Result<(), UserRepoError> {
        if *self.should_fail.lock().unwrap() {
            return Err(UserRepoError::Storage("fake failure".into()));
        }
        let mut users = self.users.lock().unwrap();
        if let Some(pos) = users.iter().position(|u| u.id == user.id) {
            users[pos] = user.clone();
        } else {
            users.push(user.clone());
        }
        Ok(())
    }

    fn delete(&self, id: &UserId) -> Result<(), UserRepoError> {
        if *self.should_fail.lock().unwrap() {
            return Err(UserRepoError::Storage("fake failure".into()));
        }
        self.users.lock().unwrap().retain(|u| u.id != *id);
        Ok(())
    }
}
Test Organization
Unit tests: #[cfg(test)] mod tests
At the bottom of the file being tested. Test the module's public interface using fakes.
#[cfg(test)]
mod tests {
    use super::*;
    use my_testutils::{FakeUserRepository, FakeNotifier, a_user};

    #[test]
    fn register_saves_user_and_sends_welcome() {
        let users = FakeUserRepository::new();
        let notifier = FakeNotifier::new();
        let service = UserService::new(
            Arc::new(users) as Arc<dyn UserRepository>,
            Arc::new(notifier) as Arc<dyn Notifier>,
        );

        let result = service.register("alice@example.com".into(), "Alice".into());

        assert!(result.is_ok());
        assert_eq!(result.unwrap().email, "alice@example.com");
    }

    #[test]
    fn register_rejects_duplicate_email() {
        let users = FakeUserRepository::with_users(vec![
            a_user().with_email("taken@example.com").build(),
        ]);
        let notifier = FakeNotifier::new();
        let service = UserService::new(
            Arc::new(users) as Arc<dyn UserRepository>,
            Arc::new(notifier) as Arc<dyn Notifier>,
        );

        let result = service.register("taken@example.com".into(), "Bob".into());

        assert!(result.is_err());
    }
}
Integration tests: tests/ directory
Each file in tests/ at the crate root is compiled as a separate binary.
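For example (file names illustrative):

crates/my-client/
├── src/lib.rs
└── tests/
    ├── sqlite_repo.rs    # its own test binary
    └── http_api.rs       # its own test binary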
Test utilities crate: crates/my-testutils/
For multi-crate workspaces, create a shared testutils crate that exports fakes, builders, and test setup helpers. Declare as [dev-dependencies] in consuming crates.
Test naming
{action}_when_{condition} or {expected_outcome}_when_{scenario}:
fn returns_none_when_user_not_found() { }
fn saves_updated_email_when_user_exists() { }
fn returns_error_when_storage_is_unavailable() { }
Test Fixtures with Builders
pub struct UserBuilder {
    id: String,
    email: String,
    name: String,
}

impl Default for UserBuilder {
    fn default() -> Self {
        Self {
            id: "test-user-1".into(),
            email: "test@example.com".into(),
            name: "Test User".into(),
        }
    }
}

impl UserBuilder {
    pub fn with_id(mut self, id: impl Into<String>) -> Self { self.id = id.into(); self }
    pub fn with_email(mut self, email: impl Into<String>) -> Self { self.email = email.into(); self }
    pub fn with_name(mut self, name: impl Into<String>) -> Self { self.name = name.into(); self }

    pub fn build(self) -> User {
        User { id: UserId(self.id), email: self.email, name: self.name }
    }
}

/// Reads naturally: `a_user().with_name("Alice").build()`
pub fn a_user() -> UserBuilder {
    UserBuilder::default()
}
Integration Test Pattern — TestApp
Use Arc to share fakes between the service and the test harness. The fake is wrapped in Arc, and the service receives its own Arc clone:
use std::sync::Arc;

pub struct TestApp {
    pub service: UserService,
    pub users: Arc<FakeUserRepository>,
    pub notifier: Arc<FakeNotifier>,
}

impl TestApp {
    pub fn new() -> Self {
        let users = Arc::new(FakeUserRepository::new());
        let notifier = Arc::new(FakeNotifier::new());
        let service = UserService::new(
            Arc::clone(&users) as Arc<dyn UserRepository>,
            Arc::clone(&notifier) as Arc<dyn Notifier>,
        );
        Self { service, users, notifier }
    }

    pub fn given_users_exist(&self, users: Vec<User>) {
        for user in users {
            self.users.save(&user).unwrap();
        }
    }
}
Usage:
#[test]
fn user_can_register_and_receive_welcome() {
    let app = TestApp::new();

    let user = app.service.register("alice@example.com".into(), "Alice".into()).unwrap();

    assert_eq!(user.email, "alice@example.com");
    assert_eq!(app.users.stored_users().len(), 1);
    assert!(app.notifier.was_notified(&user.id, "welcome"));
}
Testing HTTP APIs with axum-test
Use axum-test to test the full HTTP stack in-process — no TCP listener, no spawned server. Build the Router the same way production does, inject fakes.
use axum_test::TestServer;
use std::sync::Arc;

async fn create_test_server() -> TestServer {
    let users: Arc<dyn UserRepository> = Arc::new(FakeUserRepository::new());
    let notifier: Arc<dyn Notifier> = Arc::new(FakeNotifier::new());
    let app_state = AppState { users, notifier };
    TestServer::new(create_router(app_state)).unwrap()
}

#[tokio::test]
async fn register_returns_created_user() {
    let server = create_test_server().await;

    let response = server
        .post("/api/users")
        .json(&json!({ "email": "alice@example.com", "name": "Alice" }))
        .await;

    response.assert_status(StatusCode::CREATED);
    let body: User = response.json();
    assert_eq!(body.email, "alice@example.com");
}
This is a medium test — exercises real HTTP routing, middleware, extraction, and serialization, but stays in-process with faked dependencies.
For tests that also need a real database, wire in an in-memory SQLite instead of fakes:
use diesel::sqlite::SqliteConnection;
use diesel::Connection;
use diesel_migrations::{embed_migrations, EmbeddedMigrations, MigrationHarness};

const MIGRATIONS: EmbeddedMigrations = embed_migrations!();

async fn create_test_server_with_db() -> TestServer {
    let mut conn = SqliteConnection::establish(":memory:").unwrap();
    conn.run_pending_migrations(MIGRATIONS).unwrap();

    let users: Arc<dyn UserRepository> = Arc::new(SqliteUserRepository::new(conn));
    let notifier: Arc<dyn Notifier> = Arc::new(FakeNotifier::new());
    let app_state = AppState { users, notifier };
    TestServer::new(create_router(app_state)).unwrap()
}
Async Test Considerations
- If the domain is async, use #[tokio::test] for domain unit tests too — the domain is runtime-agnostic, but tests can pick any runtime.
- Synchronous domain logic uses plain #[test] — don't wrap it in async unnecessarily.
Related Skills
For trait-as-interface patterns that make fakes possible, see the rust-architecture skill. For CI configuration and workspace-level tooling, see the rust-ci-tooling skill.
Fake Examples — Supplementary Patterns
Examples that complement the core patterns in the SKILL.md. These cover additional scenarios: configurable failure, notification fakes, inventory domain, and testutils crate layout.
Fake with Configurable Failure Modes
use std::sync::Mutex;

pub struct FakeNotifier {
    notifications: Mutex<Vec<(UserId, String)>>,
    fail_on: Mutex<Option<String>>,
}

impl FakeNotifier {
    pub fn new() -> Self {
        Self { notifications: Mutex::new(Vec::new()), fail_on: Mutex::new(None) }
    }

    pub fn fail_on(&self, notification_type: &str) {
        *self.fail_on.lock().unwrap() = Some(notification_type.into());
    }

    pub fn was_notified(&self, user_id: &UserId, notification_type: &str) -> bool {
        self.notifications.lock().unwrap().iter()
            .any(|(id, t)| id == user_id && t == notification_type)
    }

    pub fn all_notifications(&self) -> Vec<(UserId, String)> {
        self.notifications.lock().unwrap().clone()
    }
}

impl Notifier for FakeNotifier {
    fn send_welcome(&self, user: &User) -> Result<(), NotifyError> {
        if self.fail_on.lock().unwrap().as_deref() == Some("welcome") {
            return Err(NotifyError::SendFailed("fake failure".into()));
        }
        self.notifications.lock().unwrap().push((user.id.clone(), "welcome".into()));
        Ok(())
    }

    fn send_password_reset(&self, user: &User, _token: &str) -> Result<(), NotifyError> {
        if self.fail_on.lock().unwrap().as_deref() == Some("password_reset") {
            return Err(NotifyError::SendFailed("fake failure".into()));
        }
        self.notifications.lock().unwrap().push((user.id.clone(), "password_reset".into()));
        Ok(())
    }
}
Second Domain Example — Inventory
A different domain to show the pattern generalizes. Same structure: trait in core, fake in testutils.
Trait (core crate)
#[derive(Debug, Clone, PartialEq)]
pub struct ItemId(pub String);

#[derive(Debug, Clone)]
pub struct Item {
    pub id: ItemId,
    pub name: String,
    pub quantity: u32,
}

#[derive(Debug, Error)]
pub enum InventoryError {
    #[error("item not found: {0:?}")]
    NotFound(ItemId),
    #[error("insufficient stock: have {available}, need {requested}")]
    InsufficientStock { available: u32, requested: u32 },
    #[error("storage error: {0}")]
    Storage(String),
}

pub trait InventoryRepository: Send + Sync {
    fn get(&self, id: &ItemId) -> Result<Option<Item>, InventoryError>;
    fn save(&self, item: &Item) -> Result<(), InventoryError>;
    fn reserve(&self, id: &ItemId, quantity: u32) -> Result<Item, InventoryError>;
}
Fake (testutils crate)
use std::sync::Mutex;

pub struct FakeInventoryRepository {
    items: Mutex<Vec<Item>>,
    should_fail: Mutex<bool>,
}

impl FakeInventoryRepository {
    pub fn new() -> Self {
        Self { items: Mutex::new(Vec::new()), should_fail: Mutex::new(false) }
    }

    pub fn with_items(items: Vec<Item>) -> Self {
        Self { items: Mutex::new(items), should_fail: Mutex::new(false) }
    }

    pub fn set_should_fail(&self, fail: bool) {
        *self.should_fail.lock().unwrap() = fail;
    }

    pub fn get_item(&self, id: &ItemId) -> Option<Item> {
        self.items.lock().unwrap().iter().find(|i| i.id == *id).cloned()
    }
}

impl InventoryRepository for FakeInventoryRepository {
    fn get(&self, id: &ItemId) -> Result<Option<Item>, InventoryError> {
        if *self.should_fail.lock().unwrap() {
            return Err(InventoryError::Storage("fake failure".into()));
        }
        Ok(self.items.lock().unwrap().iter().find(|i| i.id == *id).cloned())
    }

    fn save(&self, item: &Item) -> Result<(), InventoryError> {
        if *self.should_fail.lock().unwrap() {
            return Err(InventoryError::Storage("fake failure".into()));
        }
        let mut items = self.items.lock().unwrap();
        if let Some(pos) = items.iter().position(|i| i.id == item.id) {
            items[pos] = item.clone();
        } else {
            items.push(item.clone());
        }
        Ok(())
    }

    fn reserve(&self, id: &ItemId, quantity: u32) -> Result<Item, InventoryError> {
        if *self.should_fail.lock().unwrap() {
            return Err(InventoryError::Storage("fake failure".into()));
        }
        let mut items = self.items.lock().unwrap();
        let item = items.iter_mut()
            .find(|i| i.id == *id)
            .ok_or_else(|| InventoryError::NotFound(id.clone()))?;
        if item.quantity < quantity {
            return Err(InventoryError::InsufficientStock {
                available: item.quantity,
                requested: quantity,
            });
        }
        item.quantity -= quantity;
        Ok(item.clone())
    }
}
Item Builder
pub struct ItemBuilder {
    id: String,
    name: String,
    quantity: u32,
}

impl Default for ItemBuilder {
    fn default() -> Self {
        Self { id: "item-1".into(), name: "Test Item".into(), quantity: 100 }
    }
}

impl ItemBuilder {
    pub fn with_id(mut self, id: impl Into<String>) -> Self { self.id = id.into(); self }
    pub fn with_name(mut self, name: impl Into<String>) -> Self { self.name = name.into(); self }
    pub fn with_quantity(mut self, quantity: u32) -> Self { self.quantity = quantity; self }
    pub fn out_of_stock(mut self) -> Self { self.quantity = 0; self }

    pub fn build(self) -> Item {
        Item { id: ItemId(self.id), name: self.name, quantity: self.quantity }
    }
}

pub fn an_item() -> ItemBuilder {
    ItemBuilder::default()
}
Testutils Crate Layout
crates/my-testutils/
├── Cargo.toml
└── src/
├── lib.rs
├── fakes/
│ ├── mod.rs # pub use each fake
│ ├── user.rs # FakeUserRepository
│ ├── inventory.rs # FakeInventoryRepository
│ └── notifier.rs # FakeNotifier
└── builders/
├── mod.rs # pub use each builder
├── user.rs # UserBuilder, a_user()
└── item.rs # ItemBuilder, an_item()
# crates/my-testutils/Cargo.toml
[package]
name = "my-testutils"
version = "0.1.0"
edition.workspace = true
publish = false
[lints]
workspace = true
[dependencies]
my-core = { path = "../my-core" }
Example Integration Test
use my_testutils::{FakeInventoryRepository, FakeNotifier, an_item, a_user};
use std::sync::Arc;

#[test]
fn placing_order_reserves_stock_and_notifies_user() {
    let inventory = Arc::new(FakeInventoryRepository::with_items(vec![
        an_item().with_id("widget").with_quantity(10).build(),
    ]));
    let users = Arc::new(FakeUserRepository::with_users(vec![
        a_user().with_id("alice").build(),
    ]));
    let notifier = Arc::new(FakeNotifier::new());
    let service = OrderService::new(
        Arc::clone(&inventory) as Arc<dyn InventoryRepository>,
        Arc::clone(&users) as Arc<dyn UserRepository>,
        Arc::clone(&notifier) as Arc<dyn Notifier>,
    );

    let order = service.place_order("alice", "widget", 3).unwrap();

    assert_eq!(order.quantity, 3);
    assert_eq!(inventory.get_item(&ItemId("widget".into())).unwrap().quantity, 7);
    assert!(notifier.was_notified(&UserId("alice".into()), "order_confirmation"));
}
name: rust-design-agent description: > Rust architectural design specialist. Use proactively when the user asks to design, plan, or architect a Rust feature, module, service, or crate. Triggers on "design the architecture", "plan the module structure", "what crates do I need", "how should I structure this", "technical design", or when requirements need a Rust technical approach. model: opus skills:
- rust-architecture
- rust-project-setup
- rust-testing
- rust-ci-tooling
- ras-setup
- ras-api-design
- ras-best-practices
- ras-security
- dwind-project-setup
- dwind-component
You are a Rust architectural design agent. You have deep knowledge of opinionated Rust patterns preloaded from your skills — use them directly. Your job is to produce a clear, actionable architectural design for the user's feature or system.
Process
Phase 1: Understand the request
Categorize the task:
- Greenfield project — new workspace from scratch
- New service — new crate/binary in an existing workspace
- New module — new domain area within an existing crate or service
- Refactoring — restructuring existing code
Identify which concerns are relevant:
- API surface (REST, JSON-RPC, WebSocket, file serving)
- Persistence (database, file storage)
- Authentication and authorization
- Frontend (web, desktop/Tauri)
- Error handling boundaries
- Observability and monitoring
- Service-to-service communication
If the request is ambiguous, ask clarifying questions before proceeding.
Phase 2: Explore the codebase
Read the project to understand what exists:
Cargo.tomlat workspace root — current crate members and shared dependencies- Existing trait boundaries — ports defined in domain layers, adapters in infra
- Domain types the new design will interact with
- Current test infrastructure — testutils crate, existing fakes, integration test patterns
- Error types already in use
Use Glob and Grep to find these efficiently. Focus on understanding the shape of the existing code, not reading every file.
Phase 3: Design with patterns
Apply the patterns from your preloaded skills. You already have all the knowledge — reference it directly rather than guessing or inventing new patterns.
Map each design concern to the relevant patterns:
| Concern | Apply patterns from |
|---|---|
| Workspace layout, crate boundaries | rust-project-setup |
| DI, traits-as-interfaces, layer separation, hexagonal architecture | rust-architecture |
| REST/JSON-RPC/WebSocket/file-serving endpoints | ras-api-design |
| New service workspace from scratch | ras-setup |
| Auth, permissions, identity providers | ras-security |
| Error handling, observability, service communication | ras-best-practices |
| Test strategy, fakes, integration tests | rust-testing |
| CI/CD, lints, tooling | rust-ci-tooling |
| Frontend components (dwind/dominator) | dwind-component |
| Frontend project setup (Trunk, WASM) | dwind-project-setup |
Only address the concerns that are relevant to the user's request. Do not force every pattern into every design.
Phase 4: Present the design
Structure your output as:
- Overview — one paragraph describing what is being built and why the chosen approach fits
- Crate/module structure — directory tree showing where new code lives
- Trait boundaries — key port traits with method signatures, showing the domain/infra boundary
- Error strategy — which error types, thiserror in libraries vs anyhow in binaries
- API surface — if applicable, the RAS macro invocations or endpoint definitions
- Testing strategy — which fakes are needed, small/medium/large test distribution
- Open questions — trade-offs, things that need user input, things you'd want to validate
Be concrete. Show actual trait signatures, actual crate names, actual directory paths. Avoid vague advice — the user has patterns for that; your job is to apply them to their specific problem.
Guidelines
- Prefer the simplest design that satisfies the requirements. Do not over-engineer.
- Respect existing crate boundaries and patterns in the project. Extend, don't rewrite.
- When the project already has conventions (error types, test patterns, module layout), follow them.
- If a concern is out of scope for the current design, say so briefly and move on.
- The design is conversational output. Do not write files unless the user explicitly asks for an ADR or design doc.
name: scaffold-desktop description: Use when the user asks to scaffold, bootstrap, create, or start a new Tauri 2 desktop application with a dwind/dominator WASM frontend. Also use when they want a working Tauri + dwind starter template, a reference desktop app implementation, or to generate a native desktop app following marketplace best practices. version: 1.0.0
Desktop Scaffold — Tauri 2 Backend + dwind Frontend
A compilable, tested Tauri 2 desktop application template. Copy it, rename the app- prefix to your project name, and replace the Item domain with your own.
Architecture
template/
├── Cargo.toml # Parent workspace (edition 2024, resolver 3)
├── .rustfmt.toml # max_width = 100
├── justfile # fmt, check, test, ci, dev, build targets
├── .github/workflows/ci.yml # GitHub Actions: check + test + frontend build
│
├── crates/
│ ├── app-core/ # Domain layer — pure, no IO deps
│ │ └── src/
│ │ ├── domain/mod.rs # Item, ItemId (Uuid-backed)
│ │ ├── dto.rs # CreateItemRequest, ItemResponse, ItemListResponse
│ │ ├── error.rs # ItemError (thiserror): NotFound, AlreadyExists, Storage
│ │ └── ports/mod.rs # ItemRepository trait (async_trait, Send + Sync)
│ │
│ ├── app-adapters/ # Trait implementations
│ │ └── src/
│ │ └── in_memory.rs # InMemoryItemRepository (Mutex<HashMap>)
│ │
│ ├── app-testutils/ # Test support crate
│ │ └── src/
│ │ ├── fakes.rs # FakeItemRepository (configurable failure)
│ │ └── builders.rs # ItemBuilder with an_item() convenience
│ │
│ ├── app-tauri/ # Tauri backend (joins parent workspace)
│ │ ├── Cargo.toml # Depends on app-core, app-adapters
│ │ ├── build.rs # tauri_build::build()
│ │ ├── tauri.conf.json # Window config, Trunk integration, withGlobalTauri
│ │ ├── icons/ # Placeholder app icons (32x32, 128x128)
│ │ ├── capabilities/
│ │ │ └── default.json # core:default permissions
│ │ └── src/
│ │ ├── main.rs # DI wiring, command registration
│ │ ├── commands.rs # get_items, create_item, delete_item
│ │ └── state.rs # AppState with Arc<dyn ItemRepository>
│ │
│ └── app/ # Frontend WASM crate (own workspace, edition 2021)
│ ├── Cargo.toml # cdylib, standalone [workspace]
│ ├── Trunk.toml # WASM bundler config (port 1420)
│ ├── public/index.html # HTML shell with Trunk directive
│ └── src/
│ ├── lib.rs # wasm_bindgen(start) entry, dwind stylesheet init
│ ├── tauri_ipc.rs # IPC bridge to Tauri backend via window.__TAURI__
│ ├── types.rs # Frontend-local DTOs (String ids/dates)
│ └── components/
│ ├── app.rs # Root layout
│ ├── items.rs # Item list + create form (calls IPC commands)
│ └── state.rs # AppState with MutableVec<ItemResponse>
Workspace Isolation
The frontend and backend are separate workspaces. This is required because dwind's path dependencies resolve against dwind's own workspace and cannot coexist with the parent workspace.
- Parent workspace (`Cargo.toml` at root): includes `app-core`, `app-adapters`, `app-testutils`, and `app-tauri`
- Frontend workspace (`crates/app/Cargo.toml`): standalone with its own `[workspace]`
The frontend uses edition = "2021" (hardcoded, not inherited). The parent workspace uses edition = "2024".
Frontend types are local — crates/app/src/types.rs mirrors app-core::dto with String ids/dates instead of Uuid/DateTime<Utc>. This avoids cross-workspace path dependencies.
How to Use This Scaffold
- Read the template directory to understand the complete file structure
- Copy the template into the user's target directory
- Rename all `app-` prefixes to the user's project name (e.g., `app-core` → `myapp-core`, `app-tauri` → `myapp-tauri`)
- Replace the domain — swap `Item`/`ItemId`/`ItemRepository` with the user's domain types
- Update Tauri commands in `commands.rs` to match the new domain
- Update the frontend — replace the `types.rs` DTOs and the `items.rs` component
- Update `tauri.conf.json` — change `productName`, `identifier`, and the window `title`
- Run `just ci` to verify the parent workspace compiles and tests pass
- Run `just dev` to launch the desktop app with hot-reload
Key Patterns Demonstrated
- Trait-as-Interface DI — domain traits in core, implementations in adapters, wiring in Tauri main (see rust-architecture skill); a minimal wiring sketch follows this list
- Workspace-first layout — all crates under `crates/`, shared deps in `[workspace.dependencies]` (see rust-project-setup skill)
- Workspace isolation — frontend WASM crate excluded from the parent workspace (see dwind-tauri skill)
- Tauri IPC bridge — `tauri_ipc.rs` uses `wasm_bindgen` inline JS to call `window.__TAURI__` (see dwind-tauri skill)
- Hand-written fakes — `FakeItemRepository` with `Mutex` for `Send + Sync` (see rust-testing skill)
- Frontend-local DTOs — `types.rs` mirrors backend types with simpler serialization (String ids/dates)
- Reactive UI — dwind/dominator with `MutableVec` for live item-list updates
- thiserror/anyhow split — `thiserror` for domain errors, `anyhow` only in the Tauri binary crate
- Clippy pedantic — workspace-level lints, `unwrap_used` warning (see rust-ci-tooling skill)
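A minimal sketch of the trait-as-interface wiring from the first bullet. It assumes the scaffold's `ItemRepository`, `InMemoryItemRepository`, `AppState`, and command functions are in scope; the template's actual `main.rs` may differ in detail:

```rust
use std::sync::Arc;

fn main() {
    // The binary picks the concrete adapter; everything downstream sees only the trait.
    let items: Arc<dyn ItemRepository> = Arc::new(InMemoryItemRepository::new());

    tauri::Builder::default()
        // Commands receive AppState via tauri::State and stay adapter-agnostic.
        .manage(AppState { items })
        .invoke_handler(tauri::generate_handler![get_items, create_item, delete_item])
        .run(tauri::generate_context!())
        .expect("failed to run tauri application");
}
```

Swapping `InMemoryItemRepository` for a real adapter changes only this one line of wiring.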
Build & Test Commands
# Prerequisites
rustup target add wasm32-unknown-unknown
cargo install trunk
cargo install tauri-cli
# Development (starts Trunk + Tauri together, hot-reload)
just dev
# Full CI: fmt + clippy + test
just ci
# Production build (creates native installer)
just build
# Backend tests only
cargo test --workspace
# Build frontend WASM only
cd crates/app && trunk build
IPC Communication
The frontend calls the backend via Tauri commands, not HTTP:
| Frontend (WASM) | Backend (native) | What |
|---|---|---|
| `tauri_ipc::get_items()` | `commands::get_items` | List all items |
| `tauri_ipc::create_item(name, qty)` | `commands::create_item` | Create a new item |
| `tauri_ipc::delete_item(id)` | `commands::delete_item` | Delete by UUID |
Adding New Commands
- Add a `#[tauri::command]` function in `commands.rs` (see the sketch after this list)
- Register it in `main.rs` via `tauri::generate_handler![...]`
- Add a typed wrapper in `tauri_ipc.rs`
- Add capability permissions in `capabilities/default.json` if using Tauri plugins
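A hedged sketch of the first two steps for a hypothetical `count_items` command; the command name and the `items` field on `AppState` are illustrative, not template code:

```rust
// commands.rs (step 1): the new command.
#[tauri::command]
pub async fn count_items(state: tauri::State<'_, AppState>) -> Result<usize, String> {
    // `state.items` is the Arc<dyn ItemRepository> wired in main.rs;
    // `list_items` stands in for whatever your repository trait exposes.
    let items = state.items.list_items().await.map_err(|e| e.to_string())?;
    Ok(items.len())
}

// main.rs (step 2): extend the existing registration.
// .invoke_handler(tauri::generate_handler![get_items, create_item, delete_item, count_items])
```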
Notes
- The frontend uses `wasm_log` for logging — messages appear in the Tauri devtools console
- Tauri command argument names must be camelCase in the JSON (IPC serialization), even though Rust uses snake_case
- `cargo tauri dev` automatically runs `trunk serve` and opens the app window
- For the `asset://` protocol (serving local files in the webview), add `features = ["protocol-asset"]` to the Tauri dependency
Customization Guide
Renaming the Project
Replace all app- prefixes with your project name. For a project named acme:
- Rename directories: `app-core` → `acme-core`, `app-adapters` → `acme-adapters`, etc.
- Update `Cargo.toml` names and path references in all crates
- Update `use` statements: `app_core` → `acme_core`, etc.
- Update workspace `Cargo.toml` members and exclude paths
- Update `tauri.conf.json`: `productName`, `identifier`, window `title`
- Rename the frontend crate (`app` → `acme`) in `crates/app/Cargo.toml`
Replacing the Domain
The scaffold uses an Item/Inventory domain. To replace it:
- `app-core/src/domain/mod.rs` — Replace `Item`, `ItemId` with your domain types
- `app-core/src/error.rs` — Replace `ItemError` variants with your domain errors
- `app-core/src/ports/mod.rs` — Replace `ItemRepository` with your domain trait(s); a shape sketch follows this list
- `app-core/src/dto.rs` — Replace request/response types
- `app-adapters/src/in_memory.rs` — Implement the new trait (or replace with a real adapter)
- `app-testutils/src/fakes.rs` — Write fakes for your new traits
- `app-testutils/src/builders.rs` — Write builders for your new domain types
- `app-tauri/src/commands.rs` — Rewrite Tauri commands for the new domain
- `app-tauri/src/state.rs` — Update `AppState` with the new trait(s)
- `src/types.rs` (frontend) — Mirror the new DTOs with String-based fields
- `src/tauri_ipc.rs` (frontend) — Update typed command wrappers
- `src/components/items.rs` (frontend) — Rewrite the UI for the new domain
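As a shape reference for the ports step, a minimal sketch of a replacement port trait. The `Note*` names are illustrative, not scaffold code; only the trait shape (`async_trait`, `Send + Sync`) mirrors the scaffold's `ItemRepository`:

```rust
use async_trait::async_trait;

#[derive(Clone, Debug, PartialEq)]
pub struct NoteId(pub String);

#[derive(Clone, Debug)]
pub struct Note {
    pub id: NoteId,
    pub body: String,
}

#[derive(Debug, thiserror::Error)]
pub enum NoteError {
    #[error("note {0} not found")]
    NotFound(String),
    #[error("storage failure: {0}")]
    Storage(String),
}

// Send + Sync so the Tauri state can hold it behind Arc<dyn NoteRepository>.
#[async_trait]
pub trait NoteRepository: Send + Sync {
    async fn list(&self) -> Result<Vec<Note>, NoteError>;
    async fn insert(&self, note: Note) -> Result<(), NoteError>;
    async fn delete(&self, id: &NoteId) -> Result<(), NoteError>;
}
```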
Adding a Real Database
Replace InMemoryItemRepository with a Diesel + SQLite adapter:
- Add `diesel` and `diesel_migrations` to workspace deps
- Create a `SqliteItemRepository` in `app-adapters` (see rust-architecture skill's trait-patterns reference; a shape sketch follows this list)
- Wrap `SqliteConnection` in `Mutex` for `Send + Sync`
- Update `app-tauri/src/main.rs` to create the connection and wire the new adapter
- Store the database file in the Tauri app data directory (use `app.path().app_data_dir()`)
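A hedged sketch of the adapter shape for steps 2 and 3, construction only; the diesel schema and query code are elided, and the struct name matches the step above while everything else is illustrative:

```rust
use std::sync::Mutex;

use diesel::{Connection, ConnectionResult, SqliteConnection};

pub struct SqliteItemRepository {
    // diesel connections are blocking and not Sync; the Mutex makes the
    // repository Send + Sync so it can live behind Arc<dyn ItemRepository>.
    conn: Mutex<SqliteConnection>,
}

impl SqliteItemRepository {
    pub fn new(database_url: &str) -> ConnectionResult<Self> {
        Ok(Self {
            conn: Mutex::new(SqliteConnection::establish(database_url)?),
        })
    }
}

// Each trait method then locks the Mutex, runs its diesel query, and maps
// diesel errors into the domain error type (e.g., ItemError::Storage).
```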
Adding Tauri Plugins
File Dialogs
- Add `tauri-plugin-dialog = "2"` to `app-tauri/Cargo.toml`
- Register: `.plugin(tauri_plugin_dialog::init())` in `app-tauri/src/main.rs`
- Add `"dialog:default"`, `"dialog:allow-open"` to `app-tauri/capabilities/default.json`
- Use the full IPC template from the dwind-tauri skill's reference, which includes `pick_file` and `pick_directory`
Shell (open URLs in browser)
- Add `tauri-plugin-shell = "2"` to `app-tauri/Cargo.toml`
- Register: `.plugin(tauri_plugin_shell::init())` in `app-tauri/src/main.rs`
- Add `"shell:allow-open"` to `app-tauri/capabilities/default.json`
File System Access
- Add `tauri-plugin-fs = "2"` to `app-tauri/Cargo.toml`
- Register: `.plugin(tauri_plugin_fs::init())` in `app-tauri/src/main.rs`
- Add appropriate `"fs:*"` permissions to `app-tauri/capabilities/default.json`
Asset Protocol (serve local files in webview)
- Add `features = ["protocol-asset"]` to the `tauri` dependency
- Enable in `tauri.conf.json`: `"security": { "assetProtocol": { "enable": true, "scope": ["**"] } }`
- Use `tauri_ipc::convert_file_src(path)` to convert paths to `asset://` URLs
Adding Backend Events
The backend can stream events to the frontend:
```rust
// Backend (commands.rs)
use tauri::Emitter; // brings `emit` into scope on AppHandle in Tauri 2

#[tauri::command]
pub async fn long_operation(app: tauri::AppHandle) -> Result<(), String> {
    app.emit("progress", 50).map_err(|e| e.to_string())?;
    // ...
    app.emit("progress", 100).map_err(|e| e.to_string())?;
    Ok(())
}
```

```rust
// Frontend (lib.rs)
tauri_ipc::listen::<u32>("progress", move |percent| {
    progress_state.set(percent);
});
```
Adding More Tauri Commands
- Add the `#[tauri::command]` function in `app-tauri/src/commands.rs`
- Register it in `generate_handler![...]` in `app-tauri/src/main.rs`
- Add a typed wrapper in `src/tauri_ipc.rs` using `invoke` or `invoke_unit` (see the sketch after this list)
- Call from the frontend component
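A hedged sketch of the typed-wrapper step, assuming the template binds `window.__TAURI__.core.invoke` roughly like this (the binding, `serde_wasm_bindgen`, and the command name are assumptions; check `src/tauri_ipc.rs` for the real `invoke`/`invoke_unit` helpers):

```rust
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
extern "C" {
    // Tauri 2 with withGlobalTauri exposes invoke at window.__TAURI__.core.invoke.
    #[wasm_bindgen(js_namespace = ["window", "__TAURI__", "core"], js_name = invoke, catch)]
    async fn raw_invoke(cmd: &str, args: JsValue) -> Result<JsValue, JsValue>;
}

// Typed wrapper for the hypothetical count_items command shown earlier.
pub async fn count_items() -> Result<usize, String> {
    let value = raw_invoke("count_items", JsValue::UNDEFINED)
        .await
        .map_err(|e| format!("{e:?}"))?;
    serde_wasm_bindgen::from_value(value).map_err(|e| e.to_string())
}
```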
name: scaffold-fullstack description: Use when the user asks to scaffold, bootstrap, create, or start a new full-stack Rust project from scratch — including backend API, frontend UI, domain layer, and test infrastructure. Also use when they want a working starter template, a reference implementation, or to generate a new app following marketplace best practices. version: 1.0.0
Full-Stack Scaffold — RAS Backend + dwind Frontend
A compilable, tested full-stack Rust application template. Copy it, rename the app- prefix to your project name, and replace the Item domain with your own.
Architecture
template/
├── Cargo.toml # Workspace root (edition 2024, resolver 3)
├── .rustfmt.toml # max_width = 100
├── justfile # fmt, check, test, ci targets
├── .github/workflows/ci.yml # GitHub Actions: check + test + frontend build
│
├── crates/
│ ├── app-core/ # Domain layer — pure, no IO deps
│ │ └── src/
│ │ ├── domain/mod.rs # Item, ItemId (Uuid-backed)
│ │ ├── dto.rs # CreateItemRequest, ItemResponse, ItemListResponse (shared by frontend + backend)
│ │ ├── error.rs # ItemError (thiserror): NotFound, AlreadyExists, Storage
│ │ └── ports/mod.rs # ItemRepository trait (async_trait, Send + Sync)
│ │
│ ├── app-api/ # RAS API definition (rest_service! macro)
│ │ └── src/
│ │ ├── endpoints.rs # GET/POST/DELETE /items with auth levels
│ │ └── types.rs # API types with JsonSchema (re-exports core dto + adds schemars)
│ │
│ ├── app-adapters/ # Trait implementations
│ │ └── src/
│ │ └── in_memory.rs # InMemoryItemRepository (Mutex<HashMap>)
│ │
│ ├── app-service/ # Backend binary
│ │ └── src/
│ │ ├── main.rs # DI wiring, CORS, graceful shutdown
│ │ └── handlers.rs # Implements ItemServiceTrait, error conversion, tests
│ │
│ ├── app-frontend/ # dwind WASM app (cdylib)
│ │ └── src/
│ │ ├── lib.rs # wasm_bindgen entry, dwind stylesheet init
│ │ └── components/
│ │ ├── app.rs # Root layout
│ │ └── items.rs # Item list + create form (web_sys fetch, shared types from core)
│ │
│ └── app-testutils/ # Test support crate
│ └── src/
│ ├── fakes.rs # FakeItemRepository (configurable failure), FakeAuthProvider (Clone + shared state)
│ └── builders.rs # ItemBuilder with an_item() convenience
│
└── frontend/ # Frontend build config
├── index.html # Minimal HTML shell with gradient background
├── package.json # rollup + @wasm-tool/rollup-plugin-rust
└── rollup.config.js # Rust → WASM → JS bundle pipeline
How to Use This Scaffold
- Read the template directory to understand the complete file structure
- Copy the template into the user's target directory
- Rename all `app-` prefixes to the user's project name (e.g., `app-core` → `myapp-core`)
- Replace the domain — swap `Item`/`ItemId`/`ItemRepository` with the user's domain types
- Update API endpoints in `endpoints.rs` to match the new domain
- Update the frontend components to display the new domain
- Run `just ci` to verify everything compiles and tests pass
Key Patterns Demonstrated
- Trait-as-Interface DI — domain traits in core, implementations in adapters, wiring in service binary (see rust-architecture skill)
- Workspace-first layout — all crates under `crates/`, shared deps in `[workspace.dependencies]` (see rust-project-setup skill)
- RAS macro-driven API — `rest_service!` generates trait, builder, client, and OpenAPI spec (see ras-api-design skill)
- Hosted API explorer — `serve_docs: true` exposes `/api/v1/docs` and `/api/v1/docs/openapi.json`
- Hand-written fakes — `FakeItemRepository` and `FakeAuthProvider` with `Mutex` for `Send + Sync` (see rust-testing skill)
- TestApp pattern — full Axum router in-process via `axum-test` (see ras-best-practices skill)
- Shared types — request/response DTOs in `app-core::dto`, used by both frontend and backend
- Type-safe frontend — dwind WASM app shares domain types with the backend via `app-core`
- thiserror/anyhow split — `thiserror` for domain errors, `anyhow` only in the binary crate
- Clippy pedantic — workspace-level lints, `unwrap_used` warning (see rust-ci-tooling skill)
Build & Test Commands
# Backend
just ci # Full CI: fmt + clippy + test (all feature combos)
just test # Run backend tests
just test-wasm # Run frontend wasm-bindgen-test (needs wasm-pack)
cargo run -p app-service # Start the backend on :3000
# Frontend
rustup target add wasm32-unknown-unknown
cd frontend && npm install && npm start # Dev server on :8080 (proxies /api/* to :3000)
# Or just build the WASM
cargo build --target wasm32-unknown-unknown -p app-frontend
Authentication Flow
The scaffold uses real JWT authentication via RAS identity crates:
- Login — `POST /api/auth/login` with `{"username":"demo","password":"demo"}` returns a JWT token
- Token storage — Frontend stores the JWT in reactive state (`Mutable<Option<String>>`, sketched below)
- Authenticated requests — Frontend passes `Authorization: Bearer <token>` on POST/DELETE
- Validation — `JwtAuthProvider` validates the JWT on every protected endpoint
The API explorer at /api/v1/docs can also call protected endpoints with a bearer token. It stores the entered token in browser sessionStorage, not persistent localStorage.
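A minimal sketch of the token-storage pattern above, assuming the frontend keeps auth state in futures-signals reactive types; the struct and helper names are illustrative, not scaffold code:

```rust
use futures_signals::signal::Mutable;

pub struct AuthState {
    // None until POST /api/auth/login succeeds.
    pub token: Mutable<Option<String>>,
}

impl AuthState {
    // Value for the Authorization header on POST/DELETE requests.
    pub fn bearer_header(&self) -> Option<String> {
        self.token.get_cloned().map(|t| format!("Bearer {t}"))
    }
}
```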
Endpoint Auth Levels
| Endpoint | Auth | Why |
|---|---|---|
| `GET /api/v1/items` | Public (`UNAUTHORIZED`) | Read access is open |
| `GET /api/v1/items/{id}` | Public (`UNAUTHORIZED`) | Read access is open |
| `POST /api/v1/items` | Protected (`items:write`) | Mutations require auth |
| `DELETE /api/v1/items/{id}` | Protected (`items:write`) | Mutations require auth |
| `POST /api/auth/login` | Public (custom handler) | Login endpoint |
To make all endpoints require auth, change `UNAUTHORIZED` to `WITH_PERMISSIONS(["items:read"])` in `endpoints.rs`.
Default Credentials
A demo user is created on startup in main.rs:
- Username: `demo`
- Password: `demo`
- Permissions: `items:write`
Deployment
Docker Compose
docker compose up --build
# Frontend: http://localhost:8080
# Backend: http://localhost:3000
The frontend nginx proxies /api/* to the backend, so the WASM app uses relative URLs.
Local Development
# Terminal 1 — backend
cargo run -p app-service
# Terminal 2 — frontend (proxies /api/* to :3000)
cd frontend && npm install && npm start
Notes
- The frontend uses `web_sys::fetch` to call the backend API, sharing types from `app-core::dto`
- For the RAS-generated native Rust client (useful for service-to-service calls), depend on `app-api` with `features = ["client"]`
- The `app-api` crate has `server` and `client` features — use only what you need
- Frontend component tests use `wasm-bindgen-test` — run with `wasm-pack test --headless --chrome crates/app-frontend`
Customization Guide
Renaming the Project
Replace all app- prefixes with your project name. For a project named acme:
- Rename directories: `app-core` → `acme-core`, `app-api` → `acme-api`, etc.
- Update `Cargo.toml` names and path references in all crates
- Update `use` statements: `app_core` → `acme_core`, etc.
- Update workspace `Cargo.toml` members if you changed the directory structure
Replacing the Domain
The scaffold uses an Item/Inventory domain. To replace it:
- `app-core/src/domain/mod.rs` — Replace `Item`, `ItemId` with your domain types
- `app-core/src/error.rs` — Replace `ItemError` variants with your domain errors
- `app-core/src/ports/mod.rs` — Replace `ItemRepository` with your domain trait(s)
- `app-core/src/dto.rs` — Replace request/response types
- `app-api/src/types.rs` — Update API types (add `JsonSchema` derive)
- `app-api/src/endpoints.rs` — Rewrite the `rest_service!` macro with your endpoints
- `app-service/src/handlers.rs` — Implement the new generated trait
- `app-adapters/src/in_memory.rs` — Implement the new trait (or replace with a real adapter)
- `app-testutils/src/fakes.rs` — Write fakes for your new traits
- `app-testutils/src/builders.rs` — Write builders for your new domain types
Adding a Real Database
Replace InMemoryItemRepository with a Diesel + SQLite adapter:
- Add `diesel` and `diesel_migrations` to workspace deps
- Create a `SqliteItemRepository` in `app-adapters` (see rust-architecture skill's trait-patterns reference)
- Wrap `SqliteConnection` in `Mutex` for `Send + Sync`
- Update `app-service/main.rs` to create the connection and wire the new adapter
Customizing Authentication
The scaffold already includes real JWT auth via SessionService + LocalUserProvider + JwtAuthProvider. To customize:
- Add users — call `local_provider.add_user()` in `main.rs`, or replace `LocalUserProvider` with a database-backed provider
- Per-user permissions — replace `StaticPermissions` with a custom `UserPermissions` impl that looks up permissions per user
- OAuth2 — add `ras-identity-oauth2` and register an `OAuth2Provider` alongside (or instead of) the local provider
- Session config — adjust `SessionConfig` in `main.rs` (JWT secret, TTL, algorithm)
Adding Observability
- Add `ras-observability-otel` to workspace deps
- Use `standard_setup()` and wire `with_usage_tracker`/`with_method_duration_tracker` on the builder (see ras-best-practices skill)
- Add a `/metrics` endpoint
Adding More API Crates
For JSON-RPC or WebSocket APIs alongside REST:
- Create a new `app-rpc-api` crate with `jsonrpc_service!` or `jsonrpc_bidirectional_service!`
- Merge its router into the main app (see ras-api-design skill)
Splitting the Frontend
For a Tauri desktop app instead of (or alongside) the web app:
- See the dwind-tauri skill for the separate frontend/backend workspace pattern
- Use Trunk instead of Rollup for the WASM build
- Add the IPC bridge via `window.__TAURI__` bindings
name: enforce-robustness description: Use when the user asks to make code more reliable, add tests, raise coverage, protect against regressions, verify an AI-generated change, build confidence before shipping, create UAT or acceptance tests, add mutation/property/contract tests, or enforce "aggressive trust building" through unit, integration, end-to-end, feature regression, and verification evidence.
Enforce Robustness
Build evidence that the code behaves correctly under realistic use, edge cases, and future edits. Treat tests as a trust-building artifact that should usually be committed with the production change.
Operating Standard
Default to a high bar unless the user sets a narrower scope:
- Protect critical behavior with executable tests before considering work complete.
- Prefer behavior and invariant coverage over line coverage alone.
- Push coverage toward the practical maximum for changed code; target 100% branch coverage for critical decision logic when feasible.
- Use mutation testing when a mature tool exists for the stack, especially around business rules, parsers, authorization, money, state machines, migrations, and recovery paths.
- Add regression tests for every confirmed bug and every risky edge case discovered while reviewing the change.
- Keep generated tests deterministic, maintainable, and aligned with existing test style.
Workflow
- Map the trust boundary. Identify the behavior being changed, public interfaces, persistence effects, external calls, concurrency boundaries, and user-visible workflows.
- Inventory current evidence. Locate existing unit, integration, contract, snapshot, browser, UAT, property, fuzz, and regression tests. Run the smallest relevant subset to establish the current state.
- Find blind spots. Compare changed behavior against existing tests. Look for untested branches, failure paths, boundary values, permission states, compatibility cases, migrations, and UI workflows.
- Write the missing tests. Add focused tests in the closest existing test layer. Prefer small unit tests for pure logic, integration tests for boundaries, and UAT/end-to-end tests for user promises.
- Add regression protection. When a bug, edge case, or near-miss is found, create a test that fails for the vulnerable implementation and passes after the fix.
- Escalate evidence for high-risk code. Add property tests, mutation testing, model-based tests, golden fixtures, contract tests, or race/concurrency tests where ordinary examples are too weak.
- Run and tighten. Execute tests, coverage, mutation checks, and lint/build commands that are reasonable for the repo. Fix weak assertions, flaky timing, excessive mocks, and tests that only exercise implementation details.
- Report the evidence. Summarize what was added, what commands passed, what risk remains, and any tool unavailable in the environment.
Test Selection
Use this table to choose the next test layer:
| Risk | Strong evidence |
|---|---|
| Pure business rule, parser, serializer, validator | Unit tests with boundary cases, table tests, property tests |
| Stateful workflow, lifecycle, cache, retry, transaction | Integration tests with real or close test doubles |
| Public API or SDK contract | Contract tests, schema validation, compatibility fixtures |
| UI feature or user promise | UAT/end-to-end tests that follow the user workflow |
| Past bug or production incident | Minimal regression test plus the broader scenario that allowed it |
| Complex branching or critical invariants | Coverage report plus mutation testing |
| Concurrency, async, scheduling, idempotency | Stress tests, deterministic schedulers if available, repeated runs |
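As one concrete instance of the property-test row above, a minimal proptest sketch of a serialize/parse round-trip. The `Item` struct is illustrative, and the sketch assumes `proptest`, `serde` (with derive), and `serde_json` as dev-dependencies:

```rust
use proptest::prelude::*;
use serde::{Deserialize, Serialize};

#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
struct Item {
    name: String,
    quantity: u32,
}

proptest! {
    // Any item built from the generated inputs must survive a JSON round-trip.
    #[test]
    fn item_json_roundtrips(name in "[a-z]{1,16}", quantity in 0u32..10_000) {
        let item = Item { name, quantity };
        let json = serde_json::to_string(&item).unwrap();
        let back: Item = serde_json::from_str(&json).unwrap();
        prop_assert_eq!(item, back);
    }
}
```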
Tooling Heuristics
Prefer the repo's configured commands, then common defaults:
- Rust:
cargo test,cargo nextest run,cargo llvm-cov,cargo mutants,proptest,quickcheck,loom. - TypeScript/JavaScript:
npm test,vitest,jest,playwright,nycor built-in coverage,stryker. - Python:
pytest,pytest-cov,hypothesis,mutmut,cosmic-ray. - Go:
go test ./...,go test -race,go test -cover, fuzz tests withgo test -fuzz. - JVM: JUnit, Gradle/Maven test tasks, JaCoCo, PIT mutation testing.
If a tool is not installed or would require network access, state the exact command that would be used and continue with locally available evidence.
UAT And Feature Regression
For user-facing changes, create at least one test that reads like the user's actual workflow:
- Start from a realistic user state.
- Perform the same action sequence a user or API client would perform.
- Assert the observable result, not just internal calls.
- Include the original bug or requirement wording in the test name only when it improves traceability.
Avoid brittle selectors, sleeps, over-mocked dependencies, and snapshots that are too broad to diagnose.
Quality Gate
Before finalizing:
- Run the relevant test suite and any new test in isolation.
- Check coverage or mutation score when tooling exists.
- Inspect the final diff for tests that can pass without proving behavior.
- Verify each new test would fail against the old bug or missing behavior when practical.
- Call out residual risk honestly, including untested paths and unavailable tools.
For detailed coverage targets and mutation-testing triage, read references/evidence-standards.md.
Evidence Standards
Use these targets as pressure, not bureaucracy. Raise or lower them based on criticality, repo maturity, runtime cost, and user constraints.
Coverage
- Changed critical decision logic: aim for 100% branch coverage.
- Changed ordinary application code: aim for at least 90% line and branch coverage on touched modules when practical.
- Generated, declarative, UI styling, and framework glue code can use lower coverage if behavior is exercised elsewhere.
- Do not accept coverage that only executes code without assertions tied to behavior.
Mutation Testing
Prioritize mutation testing for:
- Authorization and tenancy logic.
- Financial calculations, billing, quotas, limits, and permissions.
- Parsers, validators, serializers, migrations, and compatibility code.
- Retry, idempotency, reconciliation, and recovery paths.
Triage surviving mutants:
- Add a test when the mutant changes observable behavior.
- Mark equivalent mutants only when the code is genuinely indistinguishable.
- Consider simplifying code when many equivalent or hard-to-kill mutants appear.
Regression Tests
A regression test should:
- Fail against the broken behavior.
- Assert the externally meaningful result.
- Be named around behavior, not implementation details.
- Live close to the layer where the bug escaped.
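A minimal illustration of that shape, with a hypothetical parser and bug: the test is named for the behavior, asserts the external result, and fails if the original bug returns.

```rust
#[derive(Debug, PartialEq)]
enum QuantityError {
    MustBePositive,
    NotANumber,
}

fn parse_quantity(raw: &str) -> Result<u32, QuantityError> {
    let n: u32 = raw.trim().parse().map_err(|_| QuantityError::NotANumber)?;
    if n == 0 {
        // The original (hypothetical) bug accepted zero-quantity orders.
        return Err(QuantityError::MustBePositive);
    }
    Ok(n)
}

#[test]
fn zero_quantity_is_rejected_not_accepted() {
    assert_eq!(parse_quantity("0"), Err(QuantityError::MustBePositive));
}
```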
UAT Coverage
For user-facing features, cover:
- The happy path.
- At least one validation or permission failure.
- The most likely recovery path.
- The workflow state after reload, retry, or navigation when applicable.
name: security-audit description: Use when the user asks for a security review, vulnerability audit, threat modeling, secure-code analysis, dependency audit, fuzzing, sanitizer checks, API verification, SAST/DAST guidance, security tests, exploit regression tests, auth/authz validation, input sanitization checks, secret scanning, or aggressive trust building for security-sensitive code.
Security Audit
Find security problems and turn the important ones into durable evidence. Work on two tracks: analyze the code directly, and add committed tests or verification tooling that prevents regressions.
Security Stance
Default to adversarial scrutiny for code that handles identity, permissions, money, secrets, network input, file paths, serialization, command execution, cryptography, plugins, migrations, or multi-tenant data.
Do not stop at a checklist. Trace real data and control flow from untrusted input to sensitive sinks. When a vulnerability is confirmed or plausible enough to protect, add a regression test, fuzz target, sanitizer run, or static-analysis rule where the repo can support it.
Workflow
- Scope the assets. Identify trust boundaries, attacker-controlled inputs, sensitive data, privileged operations, external services, and deployment assumptions.
- Map attack paths. Follow data from entry points to sinks: database queries, shell commands, filesystem paths, SSRF targets, template rendering, deserialization, redirects, logs, and authorization decisions.
- Review controls. Check authentication, authorization, tenancy isolation, validation, encoding, rate limiting, replay protection, session lifecycle, error handling, secrets handling, and audit logging.
- Run available tools. Prefer configured repo tooling first, then language-standard scanners, dependency audit, secret scanning, fuzzing, sanitizer, and type/lint checks.
- Write security evidence. Add tests that fail on the vulnerable behavior: permission bypass, injection payload, malformed input, path traversal, replay, cross-tenant access, unsafe redirect, panic/DoS, or secret leakage.
- Fix and verify. Patch confirmed issues when in scope. Re-run tests and security tools. Add focused regression coverage near the vulnerable boundary.
- Report clearly. Lead with confirmed findings and severity. Distinguish confirmed vulnerabilities, plausible risks, hardening suggestions, and tools that could not be run.
Evidence To Add
Choose the strongest practical evidence:
| Risk | Evidence |
|---|---|
| Authn/authz bypass | Negative permission tests for every role, tenant, and ownership boundary |
| Injection | Payload tests against SQL, NoSQL, shell, LDAP, template, and expression sinks |
| Path traversal or file exposure | Canonicalization tests with encoded, relative, symlink, and absolute paths |
| SSRF or unsafe outbound calls | URL parser allowlist tests and blocked private-network targets |
| Parser/decoder bugs | Fuzz target, corpus fixtures, malformed input regression tests |
| Memory safety or UB | Sanitizer runs, Miri, fuzzing, bounds tests |
| Crypto/session weakness | Token expiry, replay, rotation, algorithm, nonce, and constant-time comparison tests |
| Secret handling | Secret scan plus tests that logs/errors/responses redact sensitive values |
| API contract drift | Schema validation, OpenAPI checks, consumer contract tests |
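A hedged sketch of the first row as a horizontal-access regression test. The service, fakes, `owned_by` builder method, and `Forbidden` variant are illustrative stand-ins for your own types, and `tokio` is assumed as a dev-dependency:

```rust
use std::sync::Arc;

#[tokio::test]
async fn user_cannot_delete_an_item_they_do_not_own() {
    let repo = Arc::new(FakeItemRepository::with_items(vec![
        an_item().with_id("doc-1").owned_by("alice").build(),
    ]));
    let service = ItemService::new(Arc::clone(&repo) as Arc<dyn ItemRepository>);

    // Bob attacks Alice's resource.
    let result = service
        .delete_item(&UserId("bob".into()), &ItemId("doc-1".into()))
        .await;

    assert!(matches!(result, Err(ItemError::Forbidden)));
    // The failed delete must leave no side effects.
    assert!(repo.get_item(&ItemId("doc-1".into())).is_some());
}
```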
Tooling Heuristics
Use local configuration before introducing new commands:
- Cross-language:
semgrep,codeql,gitleaks,trufflehog,osv-scanner. - Rust:
cargo audit,cargo deny,cargo clippy,cargo miri,cargo fuzz, sanitizer builds where configured. - TypeScript/JavaScript:
npm audit,pnpm audit,yarn npm audit,eslintsecurity rules,tsc,playwrightsecurity regressions. - Python:
pip-audit,bandit,ruff,pytest,hypothesis. - Go:
govulncheck,gosec,go test -race,go test -fuzz. - Containers/IaC:
trivy,grype,checkov,tfsec, Kubernetes policy linters.
If a tool is missing or requires network access, do not invent results. State that it was unavailable and name the exact command the user can run.
Audit Depth
For security-sensitive changes, include at least one direct code-review pass and one executable evidence pass:
- Direct pass: inspect the code path manually and reason about attacker control, preconditions, and impact.
- Evidence pass: add or run a test, fuzz target, sanitizer, scanner, or dependency audit that would catch the issue class.
For ambiguous findings, create a small proof-oriented test or reproduction before labeling it a vulnerability.
Reporting Format
When reporting findings, use:
- Severity: Critical, High, Medium, Low, or Hardening.
- Location: file and line when available.
- Attack path: input, missing control, sink, and impact.
- Evidence: test/tool/manual reasoning that supports the finding.
- Fix: specific mitigation and regression coverage added or recommended.
If no issues are found, say what was examined and what evidence supports the conclusion. Avoid claiming the system is secure; say what risk remains.
For detailed attack categories and test ideas, read references/security-test-catalog.md.
Security Test Catalog
Use this catalog when designing security regression tests. Select cases relevant to the stack and threat model.
Authentication And Sessions
- Expired, malformed, missing, and wrong-audience tokens.
- Session fixation, replay, logout invalidation, refresh-token rotation.
- Password reset and email verification token reuse.
- Timing and enumeration differences for login and recovery flows.
Authorization And Tenancy
- Horizontal access: user A reading, updating, deleting, or listing user B resources.
- Vertical access: low-privilege user calling admin or service-only paths.
- Multi-tenant scoping: query filters, cache keys, background jobs, exports, webhooks.
- Object ownership checked on every mutation, not only on read.
Input And Injection
- SQL/NoSQL/LDAP/template/expression payloads.
- Command arguments containing spaces, separators, substitutions, and encoded characters.
- HTML, Markdown, CSV, and rich-text payloads that cross rendering contexts.
- Header injection, request smuggling edge cases, and unsafe redirects.
Files And URLs
- `../`, encoded traversal, absolute paths, symlink traversal, mixed separators.
- MIME confusion, extension spoofing, archive bombs, zip slip.
- SSRF to localhost, metadata services, private ranges, IPv6, DNS rebinding, redirects.
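A minimal sketch combining several of these traversal cases into one regression test; `resolve_upload_path` is an illustrative validator written for this example, not a library function:

```rust
use std::path::{Component, Path, PathBuf};

// Reject any path that is percent-encoded, absolute, or escapes the root via `..`.
fn resolve_upload_path(raw: &str) -> Result<PathBuf, String> {
    if raw.contains('%') {
        return Err("encoded characters are not allowed".into());
    }
    let path = Path::new(raw);
    if path.is_absolute() {
        return Err("absolute paths are not allowed".into());
    }
    let mut depth: i32 = 0;
    for component in path.components() {
        match component {
            Component::Normal(_) => depth += 1,
            Component::ParentDir => {
                depth -= 1;
                if depth < 0 {
                    return Err("path escapes the upload root".into());
                }
            }
            _ => return Err("unsupported path component".into()),
        }
    }
    Ok(PathBuf::from("uploads").join(path))
}

#[test]
fn rejects_relative_encoded_and_absolute_traversal() {
    for candidate in ["../etc/passwd", "..%2F..%2Fetc%2Fpasswd", "/etc/passwd", "a/../../b"] {
        assert!(
            resolve_upload_path(candidate).is_err(),
            "must reject {candidate}"
        );
    }
}
```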
Reliability As Security
- Oversized payloads, deep nesting, decompression bombs, parser panics.
- Race conditions around authorization, payment, inventory, quotas, and idempotency keys.
- Partial failure that exposes data, repeats side effects, or skips audit logs.
Secret Handling
- Logs, errors, traces, telemetry, snapshots, and client responses redact secrets.
- Test fixtures do not contain real credentials.
- Config loaders fail closed when required secrets are missing or weak.
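A hedged sketch of a redaction test for the first bullet: the `Display` output of a storage error must not echo credentials from a connection string. `StorageError` is an illustrative type built for this example, not part of any skill in this book:

```rust
use std::fmt;

struct StorageError {
    redacted_url: String,
}

impl StorageError {
    fn connection_failed(url: &str) -> Self {
        // Drop the userinfo (user:password) portion entirely before storing.
        let redacted_url = url
            .split_once('@')
            .map(|(_, host)| format!("postgres://<redacted>@{host}"))
            .unwrap_or_else(|| url.to_string());
        Self { redacted_url }
    }
}

impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "connection failed: {}", self.redacted_url)
    }
}

#[test]
fn connection_errors_redact_the_password() {
    let err = StorageError::connection_failed("postgres://app:s3cret@db:5432/app");
    assert!(!err.to_string().contains("s3cret"));
}
```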