What is QA Agent?
Code changes land fast. Your QA process shouldn't be the bottleneck.
What you'll learn
- What QA Agent does and why markdown test specs beat code-based tests
- How it fits into a real product workflow
- When to reach for it versus Playwright or Cypress
The core idea
QA Agent is an autonomous browser testing runner. You write test cases in plain markdown. QA Agent reads them, opens a real browser, executes the steps using an AI model (Gemini 3.1 Flash Lite by default), captures screenshots and screen recordings, and produces an HTML report with pass/fail results and visual evidence.
The key design choice: test specs are written in product language, not code. A spec looks like this:
# Checkout flow completes successfully
- **persona**: free-user
- **priority**: high
- **tags**: checkout, critical
## Steps
1. Navigate to /cart
2. Add the first product to the cart
3. Click "Proceed to checkout"
4. Verify the order summary is visible
## Expected
- Cart shows correct item count
- Order summary displays correct totals
- No console errors during the flow
This means a non-engineer can read and review a test case. More importantly, the spec survives a UI refactor. If a button's label changes from "Proceed to checkout" to "Review order", you update one line of markdown, not a selector chain spread across multiple test files.
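To make the structure of a spec concrete, here is a minimal sketch of how a file like the one above could be read into structured data. This is not QA Agent's actual parser. The `ParsedSpec` class, the `parse_spec` function, and the regular expressions are hypothetical names invented for illustration, and the sketch assumes specs follow exactly the layout shown above (one `#` title, bold metadata bullets, a numbered `## Steps` list, and an `## Expected` bullet list).

```python
import re
from dataclasses import dataclass, field

# Hypothetical container; QA Agent's internal representation may differ.
@dataclass
class ParsedSpec:
    title: str = ""
    meta: dict = field(default_factory=dict)
    steps: list = field(default_factory=list)
    expected: list = field(default_factory=list)

# "- **persona**: free-user" -> ("persona", "free-user")
META_RE = re.compile(r"^- \*\*(\w+)\*\*:\s*(.+)$")
# "1. Navigate to /cart" -> "Navigate to /cart"
STEP_RE = re.compile(r"^\d+\.\s+(.+)$")

def parse_spec(text: str) -> ParsedSpec:
    """Parse one markdown test spec in the layout assumed above."""
    spec = ParsedSpec()
    section = None  # None until the first "## " heading is seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            section = line[3:].strip().lower()
        elif line.startswith("# "):
            spec.title = line[2:].strip()
        elif section is None and (m := META_RE.match(line)):
            spec.meta[m.group(1)] = m.group(2).strip()
        elif section == "steps" and (m := STEP_RE.match(line)):
            spec.steps.append(m.group(1))
        elif section == "expected" and line.startswith("- "):
            spec.expected.append(line[2:].strip())
    return spec

SAMPLE = """\
# Checkout flow completes successfully
- **persona**: free-user
- **priority**: high
- **tags**: checkout, critical
## Steps
1. Navigate to /cart
2. Add the first product to the cart
3. Click "Proceed to checkout"
4. Verify the order summary is visible
## Expected
- Cart shows correct item count
- Order summary displays correct totals
- No console errors during the flow
"""

if __name__ == "__main__":
    spec = parse_spec(SAMPLE)
    print(spec.title, spec.meta, len(spec.steps), len(spec.expected))
```

Because each step is a plain sentence, a label change such as "Proceed to checkout" becoming "Review order" touches a single string in `steps` and nothing else.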
How it runs
QA Agent uses browser-use under the hood, which drives a real Chromium instance. The AI model interprets each step in the spec and translates it into browser actions: clicks, navigation, form fills, and assertions. At the end of every run it generates:
- An HTML report with per-test pass/fail, timing, and failure notes
- Screenshots at each step
- GIFs and videos of the full browser session
- Structured failure notes with the exact step, reason, and a screenshot path
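The outputs listed above can be pictured as a small data model. The sketch below is an assumption about shape only, not QA Agent's real report format: `FailureNote`, `RunResult`, and `render_report` are hypothetical names, and the fields simply mirror what the list describes (the exact step, the reason, a screenshot path, plus per-test pass/fail and timing).

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional

# Hypothetical record for one failed step (exact step, reason, screenshot path).
@dataclass
class FailureNote:
    step_number: int
    step_text: str
    reason: str
    screenshot_path: str

# Hypothetical per-test result; a test passes when it has no failure note.
@dataclass
class RunResult:
    title: str
    duration_s: float
    failure: Optional[FailureNote] = None

    @property
    def passed(self) -> bool:
        return self.failure is None

def render_report(results: List[RunResult]) -> str:
    """Render a bare-bones HTML table with pass/fail, timing, and failure notes."""
    rows = ["<tr><th>Test</th><th>Status</th><th>Time</th><th>Notes</th></tr>"]
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        note = ""
        if r.failure is not None:
            f = r.failure
            link = f"<a href='{escape(f.screenshot_path)}'>screenshot</a>"
            note = f"Step {f.step_number} ({escape(f.step_text)}): {escape(f.reason)} [{link}]"
        rows.append(
            f"<tr><td>{escape(r.title)}</td><td>{status}</td>"
            f"<td>{r.duration_s:.1f}s</td><td>{note}</td></tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"

if __name__ == "__main__":
    demo = [
        RunResult(
            "Checkout flow completes successfully",
            42.5,
            FailureNote(3, 'Click "Proceed to checkout"', "Button not found", "shots/step-3.png"),
        ),
        RunResult("Login works", 12.0),
    ]
    print(render_report(demo))
```

The test titles, durations, and file paths in the demo are made-up values. The point is that a failure row links straight to its visual evidence instead of reporting only a red/green status.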
When to use QA Agent
Use it when:
- You want to verify user-facing flows before a release without writing and maintaining Playwright suites
- Your team wants QA coverage that non-engineers can read and expand
- You need visual evidence of failures, not just a red/green result
Use Playwright or Cypress instead when:
- You need sub-second CI performance (QA Agent runs at human browser speed)
- You need to test deeply technical internal APIs or WebSocket behaviour
- You already have a mature, maintained selector-based suite
Quick answers
What do I get from this cable?
You get a dated field note that explains how we handle this onboarding workflow in real QA Agent projects.
How much time should I budget?
Typical effort is 5 min. The cable is marked beginner.
How do I install the artifact?
This cable is guidance-only and does not ship an installable artifact.
How fresh is the guidance?
This cable was last verified on 2026-04-17 and includes source links for traceability.