cognition
links#
How we built our multi-agent research system speaks to Anthropic’s multi-agent build experiences.
We found that a multi-agent system with Claude Opus 4 as the lead agent and Claude Sonnet 4 subagents outperformed single-agent Claude Opus 4 by 90.2% on our internal research eval
However these architectures burn through tokens fast:
In our data, agents typically use about 4× more tokens than chat interactions, and multi-agent systems use about 15× more tokens than chats.
Cognition goes further in Don’t Build Multi-Agents
In some cases, libraries such as swarm by OpenAI and autogen by Microsoft actively push concepts which I believe to be the wrong way of building agents.