Research
Every report here came out of a client engagement. We ship it, measure it, then publish what worked — including the numbers.
cables · Field notes & installable recipes
Shorter than a research article, more actionable than a blog post. Installable artifacts for Claude Code, Codex, and agent workflows — straight from active client projects.
Browse the cables library
Technical Report
DungBeetle had the right idea five years early. Here is the 2026 rebuild: Apache Iceberg as the durable result store, a per-query router across DuckDB, StarRocks, Trino, and RisingWave, and a durable workflow layer underneath. Keep the design intent, drop the constraints.
Field Note
The end-to-end walkthrough for publishing your own Claude Code stack to the Cables community, from repo layout to the automated security review to a one-command install by anyone.
Field Note
Why markdown test cases beat brittle browser scripts for fast-moving products. A practical pattern for persona-aware QA, release sweeps, and evidence-rich test runs.
Benchmark
Our conclusion from production usage, not generic benchmark hype: Gemini 3.1 Flash Lite hit the best speed-cost-reliability balance for browser QA agents.
Technical Report
An in-depth comparison of LangChain and LlamaIndex for production AI applications. We cover architecture, RAG capabilities, agent support, ecosystem maturity, and when to use each framework.
Technical Report
Practical strategies to cut your LLM inference costs dramatically. Covers dynamic model routing, prompt optimization, semantic caching, and batching, with real numbers from production deployments.
Technical Report
Everything you need to know about building RAG pipelines that work in production. Covers chunking strategies, embedding selection, vector stores, retrieval optimization, evaluation, and common failure modes.
Technical Report
How to design, build, and deploy multi-agent AI systems that work reliably in production. Covers architecture patterns, agent boundaries, state management, error handling, and lessons learned from deploying 8+ production agents.
Technical Report
A practical guide to building production-grade multi-agent AI systems with LangGraph. Learn architecture patterns, cost optimization strategies, and lessons from deploying 8 specialized agents at scale.
Benchmark
Learn how dynamic model routing can cut LLM costs by 10x. Route simple tasks to cheap models and complex tasks to powerful ones using LangGraph middleware, with a full Python implementation.
Field Note
A practical guide to migrating legacy LLM pipelines to LangChain v1 in production. Covers architecture decisions, caching strategies for 90%+ hit rates, multi-model routing, and how to achieve 5-10x higher throughput with a 50-70% reduction in token cost.