evals

links

Evalite - a Vitest-based eval runner by Matt Pocock.

ai, llm, evals, vitest • 2025-03-07 • 3:11pm

Emerging Patterns in Building GenAI Products - a look at several GenAI patterns, including evals, embeddings, RAG, guardrails, and fine-tuning.

ai, evals, patterns • 2025-02-28 • 9:14am

Building a SNAP LLM eval - the first write-up in a series about our process of building an “eval” (evaluation) to assess how well AI models perform on prompts.

ai, llm, evals • 2025-02-16 • 9:02am

Your AI product needs evals - how to construct domain-specific LLM evaluation systems that improve AI products through quick iteration.

ai, llm, evals • 2025-02-16 • 9:01am

Getting AI-powered features past the post-MVP slump - the non-negotiable first step in systematically improving your AI systems is establishing a solid feedback loop.

ai, llm, evals • 2025-02-05 • 8:12am