Research

Benchmarks and field research from live production work.

Every report here came out of a client engagement. We ship it, measure it, then publish what worked — including the numbers.

Technical Reports

Installable Cables

480

Production Specs Benchmarked

289

Archived QA Runs

cables · Field notes & installable recipes

The working knowledge behind our engagements

Shorter than a research article, more actionable than a blog post. Installable artifacts for Claude Code, Codex, and agent workflows — straight from active client projects.

Technical Report

May 17, 2026

15 min read

Re-Architecting DungBeetle on Apache Iceberg, With a Per-Query Engine Router

DungBeetle had the right idea five years early. Here is the 2026 rebuild: Apache Iceberg as the durable result store, a per-query router across DuckDB, StarRocks, Trino, and RisingWave, and a durable workflow layer underneath. Keep the design intent, drop the constraints.

Data Engineering · Apache Iceberg · Query Engines · Lakehouse Architecture · DuckDB · Trino

Field Note

Apr 18, 2026

7 min read

Ship Your Stack: A Contributor Guide to Publishing on Cables

The end-to-end walkthrough for publishing your own Claude Code stack to the Cables community, from repo layout to the automated security review to a one-command install by anyone.

Community · Claude Code · Publishing · Developer Tools

Field Note

Apr 16, 2026

6 min read

Markdown Specs Are the Missing Interface for Browser QA Agents

Why markdown test cases beat brittle browser scripts for fast-moving products. A practical pattern for persona-aware QA, release sweeps, and evidence-rich test runs.

QA Automation · Browser Agents · Testing Strategy · Developer Tools

Benchmark

Apr 16, 2026

7 min read

Why Gemini 3.1 Flash Lite Was Our Best Fit for Browser QA Agents

Our conclusion from production usage, not generic benchmark hype: Gemini 3.1 Flash Lite hit the best speed-cost-reliability balance for browser QA agents.

Gemini · Browser Agents · QA Automation · Model Selection

Technical Report

Apr 10, 2026

10 min read

LangChain vs LlamaIndex: Which Framework Should You Use in 2026?

An in-depth comparison of LangChain and LlamaIndex for production AI applications. We cover architecture, RAG capabilities, agent support, ecosystem maturity, and when to use each framework.

LangChain · LlamaIndex · RAG · Framework Comparison · AI Architecture

Technical Report

Apr 8, 2026

9 min read

How to Reduce LLM Costs by 50-70% Without Sacrificing Quality

Practical strategies to cut your LLM inference costs dramatically. Covers dynamic model routing, prompt optimization, semantic caching, and batching, with real numbers from production deployments.

LLM Optimization · Cost Reduction · Model Routing · Production AI

Technical Report

Apr 5, 2026

12 min read

Building Production-Grade RAG Pipelines: A Practical Guide

Everything you need to know about building RAG pipelines that work in production. Covers chunking strategies, embedding selection, vector stores, retrieval optimization, evaluation, and common failure modes.

RAG · Production AI · Vector Database · LangChain · AI Architecture

Technical Report

Apr 1, 2026

11 min read

Multi-Agent System Architecture: A Practical Guide for Production

How to design, build, and deploy multi-agent AI systems that work reliably in production. Covers architecture patterns, agent boundaries, state management, error handling, and lessons learned from deploying 8+ production agents.

Multi-Agent Systems · AI Architecture · LangGraph · Production AI · Agent Design

Technical Report

Mar 5, 2026

8 min read

How to Build Production Multi-Agent Systems with LangGraph

A practical guide to building production-grade multi-agent AI systems with LangGraph. Learn architecture patterns, cost optimization strategies, and lessons from deploying 8 specialized agents at scale.

LangGraph · Multi-Agent Systems · AI Architecture · Production AI

Benchmark

Mar 1, 2026

7 min read

10x LLM Cost Savings with Dynamic Model Routing

Learn how dynamic model routing can cut LLM costs by 10x. Route simple tasks to cheap models and complex tasks to powerful ones using LangGraph middleware, with full Python implementation.

LLM Optimization · Cost Reduction · LangGraph · Dynamic Routing

Field Note

Feb 20, 2026

9 min read

Migrating LangChain to Production: Lessons from the Field

A practical guide to migrating legacy LLM pipelines to LangChain v1 in production. Covers architecture decisions, caching strategies for 90%+ hit rates, multi-model routing, and how to achieve 5-10x throughput with 50-70% token cost reduction.

LangChain · Migration · Production AI · LLM Optimization
