name: plan-agentic-system description: Use when the user wants to plan, scope, or design a new agentic system from scratch through an interactive discovery process. Triggers on "plan an agent system", "help me design my agents", "I want to build a multi-agent system", "plan agentic system", or when the user needs guided discovery of what their agent architecture should look like. version: 1.0.0

Plan Agentic System — Interactive Architecture Discovery

An interactive, question-driven process for designing an agentic system. Your job is to be the architect interviewing the client. The user knows their domain but may not see all the architectural possibilities. You ask the questions, surface options they haven't considered, and progressively build a complete system design.

Do not rush to a design. The discovery phase is the most valuable part. A mediocre design built on thorough understanding beats an elegant design built on assumptions.

How This Works

This is a multi-phase conversation, not a one-shot generation. Each phase ends with questions to the user. Do not proceed to the next phase until you have answers. Use the AskUserQuestion tool to ask structured questions with options where appropriate — this helps the user think through choices they might not have considered.

At the end, you produce a complete architecture document using patterns from the agentic-systems skills: agent-decomposition, agent-communication, tool-design, agent-state, and agent-observability.

Phase 1: Problem Space Discovery

Goal: understand what the user is trying to build and why. Do not discuss agents yet — understand the problem first.

Ask about:

1.1 The Mission

What is this system supposed to accomplish? What's the core job?
Who are the users? (Developers? End users? Internal teams? Automated pipelines?)
What does success look like? How will they know it's working?

1.2 Current State

How is this problem solved today? (Manually? Existing software? Not at all?)
What's painful about the current approach? What breaks, what's slow, what's expensive?
Is there existing infrastructure this needs to integrate with?

1.3 Constraints

Are there budget/cost constraints? (Token spend matters in agent systems)
Latency requirements? (Real-time user-facing vs batch processing vs async)
Security/compliance? (What data can agents access? Are there audit requirements?)
Scale? (10 requests/day vs 10,000/hour changes the architecture dramatically)

1.4 Compliance and Audit

Are there regulatory requirements? (HIPAA, GDPR, SOC2, industry-specific?)
What data can agents access? What data must they NOT access?
Do you need an audit trail of agent decisions? For how long?
Who should be able to review what agents did? (Just engineering, or compliance/legal too?)
Are there retention requirements for agent outputs or traces?

Surface hidden constraints:

"If agents will handle customer data, GDPR gives users the right to explanation — you may need decision tracing not just for debugging but for compliance."
"Audit requirements often mean you need immutable logs of every agent action, not just errors. This affects your observability design from day one."

Surface possibilities the user may not see:

"You mentioned X is done manually today — have you considered that an agent could handle the Y part while a human reviews the Z part?"
"This workflow has a natural split between research and execution — that maps well to separate agents with different capabilities."
"Given your latency requirements, we might want a fast cheap model for triage and a capable model for the hard cases."

Phase 2: Capability and Integration Discovery

Goal: map out every capability the system needs and every external system it touches. This is where you discover tools and services the user may not have thought to integrate.

Ask about:

2.1 Data Sources

What data does the system need to access? (Databases, APIs, files, documents, web?)
Where does this data live? (Internal services, SaaS products, public web, local files?)
How frequently does the data change? (Real-time, daily, static?)
Are there APIs already available, or would tools need to be built?

Probe deeper — users often forget sources:

"You mentioned using Jira for project tracking — do you also have Confluence or a wiki with documentation that agents could search?"
"If the agent needs customer context, is there a CRM? Support ticket history?"
"Are there monitoring dashboards or logs the agent could query instead of asking a human?"

2.2 Actions and Side Effects

What actions should the system be able to take? (Create, update, delete, send, deploy?)
Which of these are reversible? Which are permanent?
Which actions need human approval before execution?
Are there existing APIs or CLIs for these actions, or would they need to be built?

Probe for automation potential:

"You said the output is a report — does it need to be reviewed, or could it be sent directly?"
"This involves creating tickets — is there an API for that, or is someone copying from chat to the ticketing system today?"
"Are there actions a human does routinely that are low-risk enough to automate fully?"

2.3 Existing Tools and Services

Walk through what's already available:

Code repositories and CI/CD pipelines
Communication tools (Slack, email, Teams)
Project management (Jira, Linear, GitHub Issues)
Documentation systems (Confluence, Notion, wikis)
Monitoring and observability (Grafana, Datadog, CloudWatch)
Databases and data warehouses
Custom internal services and APIs
MCP servers already deployed or available

For each: ask about API availability, authentication requirements, and rate limits.

2.4 Knowledge and Context

Is there domain knowledge the agents need that isn't in a database? (Tribal knowledge, conventions, unwritten rules?)
Are there reference documents, style guides, runbooks, or playbooks?
Does the system need to learn from feedback over time, or is it stateless?

2.5 Operations and Deployment

How will changes to agents be deployed? (All at once? Gradual rollout? Can you canary a new prompt?)
How will you know if a change degraded performance? What metrics define "working well"?
What's the rollback plan if a new agent version misbehaves?
How frequently do you expect to update agent prompts or capabilities?
What's your cost tolerance? (Per-request budget? Monthly ceiling?)

Probe for model strategy:

"Do all agents need the same model, or can some use cheaper models for simpler tasks?"
"What's your latency tolerance? Opus thinks deeper but slower. Haiku is fast but shallower. The right mix depends on your tasks."
"How do you want to handle model deprecation? When Claude's next version ships, what's your migration plan?"

Phase 3: Workflow Mapping

Goal: map the end-to-end workflows the system must support. This is where the agent structure starts to emerge.

3.1 Walk Through Concrete Scenarios

Ask the user to describe 2-3 concrete examples of the system being used:

"Walk me through a typical request from start to finish. What happens at each step?"
"Now walk me through a hard case — one where things get complicated or require judgment."
"What's a failure case? When does the current process break?"

For each scenario, identify:

Decision points — where does the workflow branch?
Handoffs — where does responsibility shift from one person/system to another?
Bottlenecks — what step takes the longest or fails the most?
Quality gates — where does someone review before proceeding?

3.2 Identify Natural Agent Boundaries

Based on the workflows, surface potential decomposition to the user:

"Steps 1-3 are all about research and gathering information. Steps 4-5 are about generating output. These could be separate agents with different tool sets."
"This decision point looks like a coordinator's job — route to the right specialist based on the request type."
"The review step is a natural validator agent — it checks the output before it goes to the user."

Present the emerging topology and ask: "Does this mapping feel right? Is there a step I'm oversimplifying?"

3.3 Volume and Patterns

How often is each workflow triggered? (Per hour? Per day? On demand?)
Are there peak times? Batch processing windows?
Can workflows run concurrently, or are there serialization constraints?
What's the typical vs worst-case complexity of a request?

3.4 Failure Mode Analysis

Walk through failure scenarios with the user:

"What's the worst thing an agent could do in this system? What's the blast radius?"
"Walk me through a failure case — a tool is down, an agent hallucinates, a handoff loses context. What should happen?"
"Which operations are reversible? Which are permanent? For permanent ones, what's the human approval flow?"
"What's the cost of an error? Is it hours of lost work? Money? Customer trust? Safety?"
"What does 'partial success' look like? If 3 of 4 steps succeed, is that useful or dangerous?"

Surface failure modes the user hasn't considered:

"If the research agent returns confidently wrong information, the downstream agents will act on it. How do you want to catch this?"
"What happens during a model outage? Does the system queue work, degrade to a simpler flow, or fail entirely?"
"If two agents produce conflicting results, who arbitrates — a third agent, the coordinator, or a human?"

Phase 4: Architecture Proposal

Goal: present a concrete architecture based on everything discovered. Use patterns from the agentic-systems skills.

4.1 Agent Topology

Apply agent-decomposition patterns:

Draw an ASCII topology diagram showing all agents and their relationships
For each agent: name, role, model tier, tool set, what it delegates and to whom
Justify the decomposition — why these agents, why these boundaries?
Call out where you chose NOT to split and why

4.2 Communication Design

Apply agent-communication patterns:

Orchestration vs choreography decision with rationale
Delegation patterns for each agent-to-agent communication
Handoff schemas — what gets passed between agents
Trust boundaries — which agents have access to which capabilities
Human-in-the-loop checkpoints — where and why

4.3 Tool Inventory

Apply tool-design patterns:

Complete list of tools needed, organized by agent
For each tool: name, description, input/output contract, side effects
Flag which tools already exist (discovered in Phase 2) vs which need to be built
Prioritize: which tools are essential for v1, which can wait?

4.4 State Strategy

Apply agent-state patterns:

Where state lives for each agent (context, external store, shared workspace)
System prompt architecture for each agent role
Context budget estimates — will the workflows fit in context windows?
What persists across sessions vs what is ephemeral

4.5 Observability Plan

Apply agent-observability patterns:

What to trace and how
Error handling strategy per agent
Cost estimates and budget alerts
How to debug when things go wrong

4.6 Phased Rollout

Propose an incremental build plan:

Phase 1: minimal viable system — fewest agents, core workflow only
Phase 2: add specialist agents as complexity demands
Phase 3: add observability, resilience, and optimization
Call out decision points: "after Phase 1, you'll know whether X warrants splitting into its own agent"

Phase 5: Write the Design Document

Once the user approves the architecture, write a design document to a file. Ask the user where they want it (default: agentic-system-design.md in the project root).

The document should include:

Problem statement — what this system solves (from Phase 1)
System context — integrations, data sources, constraints (from Phase 2)
Workflows — the concrete scenarios mapped (from Phase 3)
Architecture — topology, communication, tools, state, observability (from Phase 4)
Phased rollout plan (from Phase 4.6)
Open questions — things that need validation or user decisions before implementation

Principles

Ask, don't assume. When in doubt, ask the user. A wrong assumption early compounds into a wrong architecture.
Surface hidden possibilities. The user knows their domain but may not see which existing tools, APIs, or services could be leveraged by agents. Your job is to discover these.
Challenge gently. If the user proposes something that seems overengineered or underengineered, say so with reasoning. "You could do that, but here's a simpler approach that achieves the same thing" or "That sounds simple but will hit X problem at scale."
Start simple, earn complexity. Always propose the simplest viable architecture first. Add agents and patterns only when justified by concrete needs discovered in the conversation.
Make the implicit explicit. Users often have unspoken assumptions about latency, cost, reliability, or quality. Surface these. "You haven't mentioned error handling — what should happen when the research step fails? Is partial results acceptable or do you need retries?"
Know where the analogy breaks. This skill treats agents like microservices, and the analogy is productive — but it has limits:
- Nondeterminism: microservices are deterministic (same input → same output). Agents are not. The same prompt can produce different results. This makes caching harder, testing harder, and debugging harder. Design for variance, not consistency.
- Ephemeral state: microservices have durable state (databases). Agent state is ephemeral by default (context window). If you don't explicitly persist it, it's gone. Recovery requires checkpointing, not just restarting.
- Composition depth: microservices can compose to arbitrary depth. Agents lose context at each hop and degrade after 2-3 levels. Flat is better than deep. If you need depth, use structured handoffs aggressively.
- Fragile interfaces: microservice APIs are formally specified (schemas, types). Agent "interfaces" are prompts — informal, brittle, and subtly version-dependent. A small prompt change can alter output format in ways that break consumers silently.

This skill is a comprehensive interactive process that draws from all five agentic-systems architecture skills:

For agent boundary and topology decisions, see agent-decomposition
For delegation and trust patterns, see agent-communication
For tool interface design, see tool-design
For state management and prompt architecture, see agent-state
For tracing, resilience, and cost tracking, see agent-observability

Each skill can also be used independently for targeted guidance on a specific concern.

Mathias Marketplace — Plugin Reference