AI Prompt for Agent Architectures (ReAct, Plan-Execute, Multi-agent)
Multi-agent loop-until-done with critic system in Mastra tackling daily research briefing in an HR context. Roles, handoffs, shared state, and supervisor logic.
More prompts for Agent Architectures (ReAct, Plan-Execute, Multi-agent).
End-to-end CodeAct (code as action) agent implemented in Vercel AI SDK for SEO keyword research. Includes graph/state design, tool wiring, loop termination, observability via Braintrust, and evals.
Multi-agent loop-until-done with critic system in AutoGen tackling SQL report writing in an e-commerce context. Roles, handoffs, shared state, and supervisor logic.
Multi-agent loop-until-done with critic system in Inngest AgentKit tackling onboarding coordination in an HR context. Roles, handoffs, shared state, and supervisor logic.
Agent loop that critiques and revises its own output for customer support triage. Full trace capture via LangSmith, retry budget, and ship criteria.
Agent loop that critiques and revises its own output for incident postmortem drafting. Full trace capture via OpenTelemetry + Honeycomb, retry budget, and ship criteria.
Agent loop that critiques and revises its own output for content calendar planning. Full trace capture via Weights & Biases Weave, retry budget, and ship criteria.
Design a multi-agent system using the **loop-until-done with critic** topology in Mastra, applied to daily research briefing inside an HR organization.

**Model:** Claude 3.5 Sonnet
**Runtime:** TypeScript + Bun

## Part 1 — Role cast

Define 4–6 specialized agents. For each:

- **Name** and one-sentence charter
- **Seniority level** (junior/senior) and why
- **Tools** it has exclusive or shared access to
- **It does NOT do X** — explicit non-goals prevent role overlap
- **Inputs / Outputs** it expects

For loop-until-done with critic specifically, include each agent's structural role (e.g. supervisor / worker / critic / router / handoff target).

## Part 2 — Communication protocol

- Shared state schema (what every agent can read)
- Private scratchpads (what stays within one agent)
- Message format between agents (structured JSON, not free text)
- How the next agent is chosen (loop-until-done with critic rules)
- Handoff semantics: does state transfer fully, or just a summary?

## Part 3 — Supervisor / orchestrator logic

Write the orchestrator prompt (if applicable to loop-until-done with critic). It must:

- Decide who acts next based on current state
- Detect when the team is stuck (repeating, circling, off-task)
- Escalate to a human when confidence is low
- Terminate when the daily research briefing is done

## Part 4 — Framework implementation (Mastra)

Write the code:

- Agent definitions (idiomatic for Mastra)
- The topology wiring
- Shared state setup
- Entry point that takes a daily research briefing input and returns a final result

## Part 5 — Conflict resolution

When two agents disagree (e.g. the researcher says X, the critic says not-X), how is it resolved? Options: deferred to the supervisor, debated under a token budget, deferred to a human, or majority vote of N agents. Pick one and justify it for daily research briefing.
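The control flow the prompt asks for in Parts 3–4 can be sketched before committing to framework APIs. The snippet below is a minimal, framework-agnostic TypeScript sketch of the loop-until-done-with-critic topology, not Mastra's actual API: `runWorker`, `runCritic`, and the `SharedState` shape are placeholders for real LLM-backed agents.

```typescript
// Framework-agnostic sketch of loop-until-done with critic.
// The agent functions are deterministic stand-ins for LLM-backed agents.

type SharedState = {
  task: string;        // e.g. "daily research briefing"
  draft: string;       // worker's current output
  critiques: string[]; // critic feedback, one entry per round
  done: boolean;
};

type Critique = { approved: boolean; feedback: string };

// Placeholder worker: revises the draft using the latest critique.
function runWorker(state: SharedState): string {
  const latest = state.critiques[state.critiques.length - 1] ?? "initial";
  return `${state.draft} [revised per: ${latest}]`;
}

// Placeholder critic: approves after two rounds of feedback.
function runCritic(state: SharedState): Critique {
  const approved = state.critiques.length >= 2;
  return { approved, feedback: approved ? "ship it" : "needs more sourcing" };
}

// Supervisor loop: alternate worker and critic until approval or budget.
function loopUntilDone(task: string, maxRounds = 5): SharedState {
  const state: SharedState = { task, draft: "first draft", critiques: [], done: false };
  for (let round = 0; round < maxRounds && !state.done; round++) {
    state.draft = runWorker(state);
    const verdict = runCritic(state);
    state.critiques.push(verdict.feedback);
    state.done = verdict.approved; // supervisor check: terminate on approval
  }
  return state; // if !state.done here, escalate to a human
}

const result = loopUntilDone("daily research briefing");
console.log(result.done, result.critiques.length);
```

The `maxRounds` cap doubles as the stuck-detection budget: if the loop exits without approval, that is the escalate-to-human branch from Part 3.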
## Part 6 — Cost and latency budget

- Per-agent token budget
- Max parallel agents running
- Short-circuit: if an early result is high-confidence, skip the rest of the pipeline
- Cost attribution per agent (so you know which one is expensive)

## Part 7 — HR context

What HR-specific knowledge do agents need baked into their prompts (regulations, jargon, SLAs, stakeholders)? Where does that knowledge live — prompt, RAG, or tool?

## Part 8 — Evaluation

- End-to-end success on daily research briefing (trajectories on 30 real examples)
- Per-agent evals: does the researcher retrieve the right docs? Does the critic catch seeded errors?
- Ablation: remove each agent in turn and see which ones actually add value

## Part 9 — Anti-patterns to avoid

Warn about: agents talking forever, agents flattering each other, the supervisor becoming a single point of failure, state bloat across handoffs, and tool permission creep.

Output should be a concrete, runnable design. Include the code, the prompts, and the eval harness.
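The per-agent budget and cost attribution in Part 6 can be handled by a small ledger the supervisor consults before each turn. This is a hypothetical sketch: the class name, budget figure, and token counts are made up for illustration.

```typescript
// Hypothetical per-agent cost ledger: tracks token spend per agent,
// enforces a budget, and produces an attribution report.

class CostLedger {
  private spend = new Map<string, number>(); // agent name -> tokens used
  constructor(private perAgentBudget: number) {}

  // Record usage; returns false once the agent exceeds its budget,
  // which the supervisor can use to short-circuit that agent.
  charge(agent: string, tokens: number): boolean {
    const total = (this.spend.get(agent) ?? 0) + tokens;
    this.spend.set(agent, total);
    return total <= this.perAgentBudget;
  }

  // Attribution report, most expensive agent first.
  report(): Array<[string, number]> {
    return Array.from(this.spend.entries()).sort((a, b) => b[1] - a[1]);
  }
}

const ledger = new CostLedger(10_000); // assumed 10k-token budget per agent
ledger.charge("researcher", 6_000);
ledger.charge("critic", 2_500);
const withinBudget = ledger.charge("researcher", 5_000); // 11k > 10k
console.log(withinBudget, ledger.report());
```

Wiring `charge` into every agent call is what makes the Part 6 attribution question ("which one is expensive?") answerable from the report alone.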