What is QA Agent?


Verified 1 month ago · 5 min


Code changes land fast. Your QA process shouldn't be the bottleneck.

What you'll learn

What QA Agent does and why markdown test specs beat code-based tests
How it fits into a real product workflow
When to reach for it versus Playwright or Cypress

The core idea

QA Agent is an autonomous browser test runner. You write test cases in plain markdown. QA Agent reads them, opens a real browser, and executes the steps using an AI model (Gemini 3.1 Flash Lite by default). It captures screenshots and screen recordings along the way, then produces an HTML report with pass/fail results and visual evidence.

The key design choice: test specs are written in product language, not code. A spec looks like this:

# Checkout flow completes successfully

- **persona**: free-user
- **priority**: high
- **tags**: checkout, critical

## Steps

1. Navigate to /cart
2. Add the first product to the cart
3. Click "Proceed to checkout"
4. Verify the order summary is visible

## Expected

- Cart shows correct item count
- Order summary displays correct totals
- No console errors during the flow

This means a non-engineer can read and review a test case. More importantly, because each step describes intent rather than a selector, the spec survives most UI refactors. If a button's label changes from "Proceed to checkout" to "Review order", you update one line of markdown, not a selector chain spread across multiple test files.
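To make the format concrete, here is a minimal sketch of how a spec like the one above could be parsed into structured data. It is an illustration only, not QA Agent's actual parser: the `Spec` class, the `parse_spec` function, and the regular expressions are all assumptions made for this example.

```python
import re
from dataclasses import dataclass, field

# Hypothetical container for a parsed spec (not QA Agent's real data model).
@dataclass
class Spec:
    title: str = ""
    meta: dict = field(default_factory=dict)
    steps: list = field(default_factory=list)
    expected: list = field(default_factory=list)

META_RE = re.compile(r"^- \*\*(\w+)\*\*:\s*(.+)$")  # e.g. - **priority**: high
STEP_RE = re.compile(r"^\d+\.\s+(.+)$")             # e.g. 1. Navigate to /cart

def parse_spec(text: str) -> Spec:
    """Parse a markdown test spec into title, metadata, steps, and expectations."""
    spec = Spec()
    section = None  # name of the current "## ..." section, lowercased
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            section = line[3:].strip().lower()
        elif line.startswith("# "):
            spec.title = line[2:].strip()
        elif section is None and (m := META_RE.match(line)):
            # Metadata bullets appear before the first "##" heading.
            spec.meta[m.group(1)] = m.group(2).strip()
        elif section == "steps" and (m := STEP_RE.match(line)):
            spec.steps.append(m.group(1))
        elif section == "expected" and line.startswith("- "):
            spec.expected.append(line[2:].strip())
    return spec

SAMPLE = """\
# Checkout flow completes successfully

- **persona**: free-user
- **priority**: high
- **tags**: checkout, critical

## Steps

1. Navigate to /cart
2. Add the first product to the cart
3. Click "Proceed to checkout"
4. Verify the order summary is visible

## Expected

- Cart shows correct item count
- Order summary displays correct totals
- No console errors during the flow
"""

spec = parse_spec(SAMPLE)
print(spec.title, spec.meta, len(spec.steps), len(spec.expected))
```

The point of the sketch is that nothing in the parsed result refers to a selector or DOM structure: each step is a plain sentence that is handed to the model at run time.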

How it runs

QA Agent uses browser-use under the hood, which drives a real Chromium instance. The AI model interprets each step in the spec and translates it into browser actions: clicks, navigation, form fills, and assertions. At the end of every run it generates:

An HTML report with per-test pass/fail, timing, and failure notes
Screenshots at each step
GIFs and videos of the full browser session
Structured failure notes with the exact step, reason, and a screenshot path
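As a rough illustration of what "structured failure notes" and per-test results could look like as data, here is a minimal sketch. The class names, field names, file paths, and sample values are hypothetical and chosen for this example; they are not QA Agent's actual report schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical shapes for run output (assumed, not QA Agent's real schema).
@dataclass
class FailureNote:
    step_index: int        # 1-based step number from the spec
    step_text: str         # the step as written in the markdown
    reason: str            # why the step was judged to have failed
    screenshot_path: str   # where the evidence for this step was saved

@dataclass
class RunResult:
    title: str
    passed: bool
    duration_s: float
    failure: Optional[FailureNote] = None

def summarize(results):
    """Aggregate per-test results into the totals a report header would show."""
    passed = sum(1 for r in results if r.passed)
    return {
        "total": len(results),
        "passed": passed,
        "failed": len(results) - passed,
        "failures": [asdict(r.failure) for r in results if r.failure],
    }

# Illustrative sample data only.
results = [
    RunResult("Login succeeds", passed=True, duration_s=12.4),
    RunResult(
        "Checkout flow completes successfully",
        passed=False,
        duration_s=31.0,
        failure=FailureNote(
            step_index=4,
            step_text="Verify the order summary is visible",
            reason="Order summary not found on the page",
            screenshot_path="reports/checkout/step-04.png",
        ),
    ),
]

summary = summarize(results)
print(json.dumps(summary, indent=2))
```

Keeping the failing step, the reason, and the screenshot path together in one record is what lets a reviewer go from a red result to the visual evidence without re-running the test.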

When to use QA Agent

Use it when:

You want to verify user-facing flows before a release without writing and maintaining Playwright suites
Your team wants QA coverage that non-engineers can read and expand
You need visual evidence of failures, not just a red/green result

Use Playwright or Cypress instead when:

You need sub-second CI performance (QA Agent runs at human browser speed)
You need to test deeply technical internal APIs or WebSocket behaviour
You already have a mature, maintained selector-based suite

What's next

Cable 2: Setting Up QA Agent →

Quick answers

What do I get from this cable?

You get a dated field note that explains how we handle this onboarding workflow in real QA Agent projects.

How much time should I budget?

Typical effort is 5 min. The cable is marked beginner.

How do I install the artifact?

This cable is guidance-only and does not ship an installable artifact.

How fresh is the guidance?

The cable was last verified on 2026-04-17 and includes source links for traceability.
