patterns
links
How we built our multi-agent research system describes Anthropic’s experience building multi-agent systems.
We found that a multi-agent system with Claude Opus 4 as the lead agent and Claude Sonnet 4 subagents outperformed single-agent Claude Opus 4 by 90.2% on our internal research eval.
However, these architectures burn through tokens fast:
In our data, agents typically use about 4× more tokens than chat interactions, and multi-agent systems use about 15× more tokens than chats.
Cognition goes further in Don’t Build Multi-Agents:
In some cases, libraries such as swarm by OpenAI and autogen by Microsoft actively push concepts which I believe to be the wrong way of building agents.
A rules-based pattern is emerging for helping agentic workflows produce better results. Examples include GreatScottyMac’s RooFlow and Geoff Huntley’s specs and stdlib approaches.
Emerging Patterns in Building GenAI Products looks at a number of GenAI patterns across evals, embeddings, RAG, guardrails, and fine-tuning.
Mark tests that test overall behaviour other tests expand on in more detail proposes improving Jest’s test output by allowing tests to be marked as dependent on others, helping focus on root causes when failures occur in test suites with overlapping assertions. Ultimately this was closed as not planned, and --bail was suggested as an alternative.
Why Most Unit Testing is Waste argues that excessive unit testing can be counterproductive and suggests focusing on integration tests that verify valuable business logic. (reddit)
Write tests. Not too many. Mostly integration argues that integration tests provide the best balance between confidence and speed, suggesting that teams should focus more on integration testing than unit testing, while being mindful not to over-test implementation details.
Unit testing vs BDD explains how BDD is essentially unit testing done right - focusing on verifying behavior rather than implementation details. Discusses the practical value of Gherkin syntax and argues that regular code can achieve similar readability.
UnitTest explains how unit tests are low-level tests focusing on a small part of the software system, written by programmers using testing frameworks, and designed to run quickly. Discusses the distinction between solitary unit tests using test doubles and sociable tests that allow real collaborators.
Practical Test Pyramid provides a comprehensive guide to structuring automated tests, explaining how to balance different types of tests from unit to end-to-end, and how to effectively implement them in a continuous delivery pipeline.
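The solitary versus sociable distinction from the UnitTest entry above can be easier to see in code. The following is a minimal sketch, not taken from any of the linked articles: `TaxPolicy`, `FlatTaxPolicy`, `PriceCalculator` and the `expectClose` helper are all hypothetical names invented for illustration, and the tax rates are made up.

```typescript
// Hypothetical domain code, invented purely for illustration.
interface TaxPolicy {
  rateFor(region: string): number;
}

// A "real" collaborator with its own (made-up) logic.
class FlatTaxPolicy implements TaxPolicy {
  rateFor(region: string): number {
    return region === "EU" ? 0.2 : 0.1;
  }
}

// The unit under test: depends on a TaxPolicy collaborator.
class PriceCalculator {
  private readonly policy: TaxPolicy;

  constructor(policy: TaxPolicy) {
    this.policy = policy;
  }

  total(net: number, region: string): number {
    return net * (1 + this.policy.rateFor(region));
  }
}

// Minimal assertion helper so the sketch needs no test framework.
function expectClose(actual: number, expected: number): void {
  if (Math.abs(actual - expected) > 1e-9) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

// Solitary test: the collaborator is replaced by a test double (a stub),
// so only PriceCalculator's own arithmetic is exercised.
function solitaryTest(): void {
  const stubPolicy: TaxPolicy = { rateFor: () => 0.5 };
  expectClose(new PriceCalculator(stubPolicy).total(100, "anywhere"), 150);
}

// Sociable test: the real collaborator takes part, so a bug in
// FlatTaxPolicy would also make this test fail.
function sociableTest(): void {
  expectClose(new PriceCalculator(new FlatTaxPolicy()).total(100, "EU"), 120);
}

solitaryTest();
sociableTest();
```

Both tests are small and fast. The difference is only in whether a failure can be caused by the collaborator as well as by the unit itself.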
Ordering microservice, part of the eShopOnContainers repo
The business value of using DDD, Vaughn Vernon