Writing Your First Test Case

Read the field note below to see how we apply this pattern in real QA Agent projects.

FRE|Nxt Labs

verified 2 months ago · 20 min

A good test case reads like a conversation between a QA engineer and a developer, not like code.

What you'll learn

The anatomy of a QA Agent test case file
How to write steps that survive UI changes
How to run a single test and read its output

The test case format

Every test case is a markdown file inside a suite folder. Create your first one:

mkdir -p tests/my-suite
touch tests/my-suite/homepage-loads.md

A test case has a title followed by three required sections (metadata, Steps, and Expected):

# Homepage loads correctly

- **persona**: unauthenticated
- **priority**: high
- **tags**: smoke, homepage

## Steps

1. Navigate to /
2. Verify the main heading is visible
3. Verify no console errors appear

## Expected

- Page renders without a blank screen
- Primary CTA button is visible and clickable

Title

The first line (# Title) is both the test name shown in reports and what the AI uses to understand the test's intent. Be specific: "Homepage loads correctly" is better than "Test 1".

Metadata

persona: The user context for this test. Options: unauthenticated, free-user, pro-user. QA Agent loads the corresponding credentials from personas/. Start with unauthenticated for public flows.
priority: One of critical, high, medium, or low. Used to filter runs and prioritise triage.
tags: Comma-separated. Used to run targeted subsets (python cli.py run --tag smoke).
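The allowed metadata values listed above are easy to check before a run. The following is a minimal sketch, assuming the metadata has already been read into a plain dictionary; `validate_metadata`, `split_tags`, and `matches_tag` are hypothetical helper names, not part of QA Agent.

```python
# Hypothetical pre-flight check for test case metadata (not QA Agent's own code).
# The allowed values are the ones listed in this cable.
ALLOWED_PERSONAS = {"unauthenticated", "free-user", "pro-user"}
ALLOWED_PRIORITIES = {"critical", "high", "medium", "low"}


def split_tags(raw):
    """Turn a comma-separated tag string into a clean list."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


def validate_metadata(meta):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if meta.get("persona") not in ALLOWED_PERSONAS:
        problems.append(f"unknown persona: {meta.get('persona')!r}")
    if meta.get("priority") not in ALLOWED_PRIORITIES:
        problems.append(f"unknown priority: {meta.get('priority')!r}")
    return problems


def matches_tag(meta, tag):
    """True if the test case carries the given tag."""
    return tag in split_tags(meta.get("tags", ""))


meta = {"persona": "unauthenticated", "priority": "high", "tags": "smoke, homepage"}
print(validate_metadata(meta))
print(matches_tag(meta, "smoke"))
```

A check like this catches a mistyped persona or priority before the agent starts a browser session.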

Steps

Write steps in plain imperative language. The AI model interprets them:

Good: Click "Sign in" (references visible UI text)
Good: Verify the dashboard heading shows "Welcome back"
Avoid: Click button[data-testid="login-btn"] (selectors break on refactors, and the AI doesn't need them)
Avoid: Wait 2 seconds (the agent handles timing automatically)
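The two "Avoid" patterns above can be caught with a small lint pass before a test is committed. This is a rough sketch under stated assumptions: `lint_step` is a hypothetical helper, and the two regular expressions are simple heuristics for attribute selectors and fixed waits, not rules that QA Agent itself enforces.

```python
import re

# Heuristic: a CSS attribute selector such as button[data-testid="login-btn"].
ATTRIBUTE_SELECTOR = re.compile(r"\[[\w-]+[~|^$*]?=")
# Heuristic: a fixed wait such as "Wait 2 seconds" or "wait 500 ms".
FIXED_WAIT = re.compile(
    r"\bwait\s+\d+(?:\.\d+)?\s*(?:ms|milliseconds?|s|secs?|seconds?)\b",
    re.IGNORECASE,
)


def lint_step(step):
    """Return a list of warnings for one prose step (hypothetical helper)."""
    warnings = []
    if ATTRIBUTE_SELECTOR.search(step):
        warnings.append("uses a selector; reference visible UI text instead")
    if FIXED_WAIT.search(step):
        warnings.append("uses a fixed wait; the agent handles timing")
    return warnings


steps = [
    'Click "Sign in"',
    'Verify the dashboard heading shows "Welcome back"',
    'Click button[data-testid="login-btn"]',
    "Wait 2 seconds",
]
for step in steps:
    print(step, "->", lint_step(step) or "ok")
```

The heuristics are deliberately narrow, so a step such as "Wait for the page to fully load" is not flagged.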

Expected

List what success looks like. These become the pass/fail criteria the agent checks at the end of the run.
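To make the file layout concrete, here is a minimal sketch of how the title, metadata, Steps, and Expected parts could be read from a test case file. It is an illustration only: `parse_test_case` is a hypothetical function, and QA Agent's real parser is not shown in this cable and may behave differently.

```python
import re

SAMPLE = """\
# Homepage loads correctly

- **persona**: unauthenticated
- **priority**: high
- **tags**: smoke, homepage

## Steps

1. Navigate to /
2. Verify the main heading is visible
3. Verify no console errors appear

## Expected

- Page renders without a blank screen
- Primary CTA button is visible and clickable
"""


def parse_test_case(text):
    """Split a test case into title, metadata, steps, and expected results."""
    case = {"title": None, "metadata": {}, "steps": [], "expected": []}
    section = "metadata"  # everything before the first "## " heading
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            section = line[3:].strip().lower()
        elif line.startswith("# "):
            case["title"] = line[2:].strip()
        elif section == "metadata":
            match = re.match(r"-\s+\*\*(\w+)\*\*:\s*(.+)", line)
            if match:
                case["metadata"][match.group(1)] = match.group(2).strip()
        elif section == "steps":
            match = re.match(r"\d+\.\s+(.+)", line)
            if match:
                case["steps"].append(match.group(1))
        elif section == "expected" and line.startswith("- "):
            case["expected"].append(line[2:].strip())
    return case


case = parse_test_case(SAMPLE)
print(case["title"])
print(case["metadata"])
print(len(case["steps"]), "steps,", len(case["expected"]), "expected results")
```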

Writing steps that survive refactors

The biggest advantage of prose steps is resilience. When a button's label changes from "Get started" to "Start free trial", update one line. When a nav item moves, update the step description. You don't hunt for selectors.

Practical rules:

Reference what the user sees, not what's in the DOM
Describe intent: "Complete the sign-in flow" rather than "Click the submit button on line 3 of the form"
Keep each step to one action
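The "one action per step" rule can be approximated with a simple heuristic that splits a step wherever "and" or "then" is followed by an action verb. This is only a sketch: `split_compound_step` and the verb list are assumptions made for illustration, and natural-language steps will always have cases a regular expression misses.

```python
import re

# Assumed list of common action verbs; extend it to match your own steps.
ACTION_VERBS = ("click", "navigate", "type", "enter", "select",
                "verify", "dismiss", "open", "scroll")
JOINER = re.compile(
    r"\b(?:and then|then|and)\s+(" + "|".join(ACTION_VERBS) + r")\b",
    re.IGNORECASE,
)


def split_compound_step(step):
    """Split a step into single-action steps at 'and/then <verb>' joiners."""
    # Replace each joiner with a newline, keeping the verb that follows it.
    marked = JOINER.sub(lambda m: "\n" + m.group(1), step)
    parts = [part.strip(" ,.;") for part in marked.split("\n")]
    return [part[0].upper() + part[1:] for part in parts if part]


print(split_compound_step('Click "Sign in" and verify the dashboard heading is visible'))
print(split_compound_step("Verify the primary CTA is visible and clickable"))
```

The second call returns the step unchanged, because "and clickable" describes the same check rather than a second action.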

Run your test

python cli.py run --file tests/my-suite/homepage-loads.md --headed

--headed opens a visible browser window so you can watch what the agent does on the first run. Once you're confident the steps are interpreted correctly, drop --headed for faster headless runs.
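If you launch runs from a script, the command can be assembled from its parts. The sketch below only builds and prints the command line; it uses the `--file`, `--tag`, and `--headed` flags shown in this cable and nothing else. `build_run_command` is a hypothetical helper, and combining `--tag` with `--headed` is an assumption that this cable does not confirm.

```python
import shlex


def build_run_command(test_file=None, tag=None, headed=False):
    """Build the argument list for a single-file or tag-filtered run."""
    if (test_file is None) == (tag is None):
        raise ValueError("pass exactly one of test_file or tag")
    cmd = ["python", "cli.py", "run"]
    if test_file is not None:
        cmd += ["--file", test_file]
    else:
        cmd += ["--tag", tag]
    if headed:
        # Visible browser window, useful on the first run.
        cmd.append("--headed")
    return cmd


first_run = build_run_command(test_file="tests/my-suite/homepage-loads.md", headed=True)
print(shlex.join(first_run))
print(shlex.join(build_run_command(tag="smoke")))
```

The returned list can be passed to `subprocess.run` from the project root once you are ready to execute it.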

Output:

Running: tests/my-suite/homepage-loads.md
[homepage-loads] PASS (15.2s)

Report saved: reports/2026-04-17-my-suite-01/index.html

Open the HTML report. You'll see a timeline of browser actions with screenshots at each step.
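For scripted checks, the console output shown above can be read with a small parser. This is a minimal sketch that assumes every result line follows the `[name] STATUS (seconds)` shape of the sample; only `PASS` appears in this cable, so other status words are an assumption, and `summarize` is a hypothetical helper.

```python
import re

# Matches lines shaped like "[homepage-loads] PASS (15.2s)".
RESULT_LINE = re.compile(
    r"^\[(?P<name>[^\]]+)\]\s+(?P<status>[A-Z]+)\s+\((?P<seconds>\d+(?:\.\d+)?)s\)$"
)

OUTPUT = """\
Running: tests/my-suite/homepage-loads.md
[homepage-loads] PASS (15.2s)

Report saved: reports/2026-04-17-my-suite-01/index.html
"""


def summarize(output):
    """Return (results, report_path) extracted from the console output."""
    results, report = [], None
    for raw in output.splitlines():
        line = raw.strip()
        match = RESULT_LINE.match(line)
        if match:
            results.append(
                (match["name"], match["status"], float(match["seconds"]))
            )
        elif line.startswith("Report saved:"):
            report = line.split(":", 1)[1].strip()
    return results, report


results, report = summarize(OUTPUT)
print(results)
print(report)
```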

Common first-run issues

The agent gets stuck on a cookie banner. Add a step: "Dismiss any cookie consent banner if present". The AI handles conditional UI well.

A step fails because an element loads slowly. Add "Wait for the page to fully load" as the first step after navigation. The agent will wait for the visible state before proceeding.

The test passes, but the report shows that the agent interpreted a step differently than you expected. Read the agent's reasoning in the report, and rewrite the step to be more explicit if the interpretation was wrong.

What's next

Cable 4: Running Tests & Reading Reports →

Quick answers

What do I get from this cable?

You get a dated field note that explains how we handle this test-authoring workflow in real QA Agent projects.

How much time should I budget?

Typical effort is 20 min. The cable is marked beginner.

How do I install the artifact?

This cable is guidance-only and does not ship an installable artifact.

How fresh is the guidance?

The cable was last verified on 2026-04-17 and includes source links for traceability.
