Multi-agent research: Anthropic

2 min read · Jun 14, 2025

https://www.anthropic.com/engineering/built-multi-agent-research-system

How Anthropic Built a Multi-Agent Research System for Claude

Anthropic recently introduced Research capabilities in Claude, enabling it to explore complex topics by coordinating multiple AI agents. In a blog post, the engineering team shared key insights from developing this system. Here’s a summary of their approach and lessons learned.

Why Multi-Agent Systems?

Research tasks are inherently dynamic — unlike structured workflows, they require adapting strategies based on new findings. A single AI agent struggles with:

  • Token limits (context windows constrain deep exploration).
  • Sequential bottlenecks (slow, one-step-at-a-time searches).
  • Lack of parallelization (difficulty exploring multiple angles simultaneously).

A multi-agent system solves these issues by:

  • Delegating tasks (a lead agent spawns specialized subagents).
  • Running parallel searches (subagents explore different aspects at once).
  • Compressing insights (each subagent summarizes findings for the lead agent).
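The post doesn't include code, but the delegate/parallelize/compress loop above can be sketched roughly. Everything here is a hypothetical stand-in: `run_subagent` would really call a model with search tools, and the lead agent would synthesize rather than concatenate.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(subtask: str) -> str:
    # A real subagent would search, read sources, and summarize;
    # here we just tag the subtask to show the data flow.
    return f"summary of: {subtask}"

def lead_agent(query: str, subtasks: list[str]) -> str:
    # The lead agent delegates subtasks, runs them in parallel,
    # then compresses each subagent's summary into one answer.
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        summaries = list(pool.map(run_subagent, subtasks))
    return " | ".join(summaries)

print(lead_agent("history of X", ["background", "recent work"]))
```

The key structural point is that the lead agent only ever sees compressed summaries, not each subagent's full context, which is how the system works around per-agent token limits.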

Results:

  • 90.2% improvement over single-agent Claude Opus 4 in research tasks.
  • Faster execution (parallel tool calls cut research time by up to 90% on complex queries).

Key Engineering Challenges

1. Prompt Engineering for Coordination

  • Teach delegation: Lead agents must clearly define subagent tasks to avoid duplication.
  • Scale effort to complexity: Simple queries use 1 agent; complex ones use 10+.
  • Guide search strategy: Start broad, then narrow down (like human researchers).
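Two of these guidelines lend themselves to a small sketch, assuming a hypothetical helper API (none of these names come from the post): a budget function that scales subagent count to query complexity, and a delegation prompt that spells out objective, scope, and output format so subagents don't overlap.

```python
def subagent_budget(complexity: str) -> int:
    # Scale effort to complexity: one agent for simple lookups,
    # ten or more for open-ended research (the post's rough guideline).
    budgets = {"simple": 1, "moderate": 4, "complex": 10}
    return budgets.get(complexity, 1)

def delegation_prompt(objective: str, scope: str, output_format: str) -> str:
    # Give each subagent an explicit objective, a scope boundary, and
    # an output format so parallel subagents don't duplicate work.
    return (
        f"Objective: {objective}\n"
        f"Stay within scope: {scope}\n"
        f"Return: {output_format}"
    )
```

In practice the lead agent would generate these delegation prompts itself; the point is that vague task descriptions are the main cause of duplicated or missing coverage.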

2. Tool Design & Reliability

  • Bad tool descriptions mislead agents — clear documentation is critical.
  • Agents can improve their own tools: Claude 4 models helped refine prompts and tool descriptions, reducing errors by 40%.
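As an illustration of what a clear tool description might look like, here is a hypothetical tool definition loosely following the shape of Anthropic's tool-use schema (name, description, JSON input schema); the specific tool and wording are invented, not taken from the post.

```python
# Hypothetical search tool definition. The description states what the
# tool returns, when to use it, and what each parameter means, since
# vague descriptions are what send agents down the wrong path.
web_search_tool = {
    "name": "web_search",
    "description": (
        "Search the web and return up to `max_results` result snippets. "
        "Use for broad discovery; prefer authoritative sources."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text",
            },
            "max_results": {
                "type": "integer",
                "description": "Cap on returned results (default 5)",
            },
        },
        "required": ["query"],
    },
}
```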

3. Evaluation & Debugging

  • LLMs as judges: Grading outputs on accuracy, citations, and completeness.
  • Human testing catches edge cases (e.g., agents favoring SEO-optimized junk over authoritative sources).
  • Emergent behaviors require monitoring — small prompt changes can drastically alter agent interactions.
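The LLM-as-judge setup can be reduced to a rubric of scored dimensions combined into one grade. The sketch below assumes a judge that returns per-dimension scores in [0, 1]; the dimension names match the post, but the aggregation (a plain average) is an assumption.

```python
def judge_score(grades: dict[str, float]) -> float:
    # Combine the judge's per-dimension scores (accuracy, citations,
    # completeness) into a single grade by averaging. Refuse partial
    # rubrics so a missing dimension can't silently inflate the score.
    required = {"accuracy", "citations", "completeness"}
    missing = required - grades.keys()
    if missing:
        raise ValueError(f"missing rubric dimensions: {sorted(missing)}")
    return sum(grades[k] for k in required) / len(required)
```

A real judge would be a model prompted to emit these scores as structured output; human spot checks still matter for failure modes a rubric misses, like the SEO-junk example above.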

4. Production Challenges

  • Stateful execution: Errors compound, so checkpoints and retries are essential.
  • Asynchronous bottlenecks: Current synchronous execution slows research; future versions may allow dynamic subagent spawning.
  • Rainbow deployments prevent disruptions when updating live agents.
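The checkpoint-and-retry idea is the most code-shaped of these. A minimal sketch, with all names invented for illustration: completed steps are recorded in a state dict, so a resumed run skips them instead of redoing the whole research task, and each step gets a bounded number of retries.

```python
def run_with_checkpoints(steps, state=None, max_retries=2):
    # `steps` is a list of (name, callable) pairs. Each completed step
    # is checkpointed into `state`; on resume, checkpointed steps are
    # skipped so transient errors don't restart the whole run.
    state = dict(state or {})
    for name, step in steps:
        if name in state:
            continue  # already checkpointed from a previous attempt
        for attempt in range(max_retries + 1):
            try:
                state[name] = step()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # give up, but `state` preserves prior progress
    return state
```

In a production system the state would be persisted externally (so a crashed process can resume), and retries would distinguish transient failures from permanent ones, but the compounding-error problem is the same.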

Conclusion

Multi-agent systems unlock new capabilities for AI research but require careful engineering to handle coordination, reliability, and efficiency. Anthropic’s approach — combining parallel agents, smart tool use, and iterative prompting — demonstrates how AI can tackle open-ended problems at scale.

What’s next?

  • Asynchronous agent coordination for faster workflows.
  • Better memory management for long-horizon tasks.
  • Expanding use cases beyond research (e.g., coding, business strategy).

Written by noailabs

Tech/biz consulting, analytics, research for founders, startups, corps and govs.
