Running Tests & Reading Reports

Read the field note below to see how we apply this pattern in real QA Agent projects.

Verified 1 month ago · 20 min

A green result is only useful if you trust it. A red result is only useful if you can debug it fast.

What you'll learn

  • The full set of run command options
  • What the HTML report contains and how to read it
  • How to share reports with your team

The run command

Run a full suite

python cli.py run --suite smoke

Runs every .md file inside tests/smoke/. Test cases run in parallel, with 3 concurrent browser sessions by default (configurable in config.yaml under parallel).
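
To make the concurrency limit concrete, here is a minimal sketch of bounded parallelism over a suite directory. It is not QA Agent's actual implementation: collect_suite, run_one, and run_suite are hypothetical names, run_one is a stand-in for a real browser session, and the sketch assumes that only .md files directly inside the directory count as test cases.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_SESSIONS = 3  # mirrors the documented default of 3 concurrent sessions


def collect_suite(suite_dir):
    """Return every .md file directly inside suite_dir, sorted by name."""
    return sorted(Path(suite_dir).glob("*.md"))


def run_one(test_file):
    """Hypothetical stand-in for driving one browser session."""
    return (test_file.name, "PASS")


def run_suite(suite_dir, max_sessions=MAX_SESSIONS):
    """Run all test files with at most max_sessions in flight at once."""
    files = collect_suite(suite_dir)
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        # pool.map yields results in the same order as the input files.
        return list(pool.map(run_one, files))
```

Raising the limit shortens the run only while the machine and the site under test can keep up with the extra sessions.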

Run a single file

python cli.py run --file tests/smoke/homepage-loads.md

Useful when writing or debugging a new test case.

Run by tag

python cli.py run --tag critical

Runs all test cases across all suites that have the matching tag in their metadata.
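
The selection logic can be pictured as a filter over already-parsed test cases. This is a sketch under assumptions: the cable does not show the metadata format, so each case is modeled here as a plain dict with a hypothetical "tags" list, and select_by_tag is an illustrative name rather than part of the CLI.

```python
def select_by_tag(cases, tag):
    """Keep the cases whose metadata lists the tag, regardless of suite."""
    return [case for case in cases if tag in case.get("tags", [])]


# Hypothetical, already-parsed test cases from two different suites.
cases = [
    {"path": "tests/smoke/homepage-loads.md", "tags": ["critical"]},
    {"path": "tests/smoke/footer-links.md", "tags": []},
    {"path": "tests/checkout/pay-with-card.md", "tags": ["critical", "payments"]},
]

selected = [case["path"] for case in select_by_tag(cases, "critical")]
```

Here selected holds one path from each suite, which is the point of tagging: a tag cuts across the directory layout.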

Run in headed mode

python cli.py run --suite smoke --headed

Opens a visible browser window. Use this when developing test cases or investigating failures.

View the latest report

python cli.py report --latest

Opens the most recently generated HTML report in your default browser.
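
If you need the same lookup in a script, one way to find the newest run is by modification time. This sketch assumes each run writes its own subdirectory under reports/ (the failure-note path later in this cable suggests that layout); latest_report_dir is a hypothetical helper, not something the CLI exposes.

```python
from pathlib import Path


def latest_report_dir(reports_root="reports"):
    """Return the most recently modified run directory, or None if there is none."""
    root = Path(reports_root)
    if not root.is_dir():
        return None
    run_dirs = [p for p in root.iterdir() if p.is_dir()]
    if not run_dirs:
        return None
    # Assumption: the newest modification time marks the latest run.
    return max(run_dirs, key=lambda p: p.stat().st_mtime)
```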

Reading the HTML report

The report has three sections:

Summary bar

Shows total tests, pass count, fail count, skip count, total duration, and the timestamp. This is what you screenshot for a release sign-off.

Per-test timeline

Each test gets its own card with:

  • Status badge. PASS (green), FAIL (red), or SKIP (grey)
  • Duration. How long the test took
  • Step log. Each step the agent executed, in order, with its interpretation
  • Screenshots. One per step, shown inline
  • Failure note (on FAIL). Structured text: which step failed, the agent's reasoning, and the screenshot path

A typical failure note:

Step failed: "Click 'Proceed to checkout'"
Reason: Element matched selector but returned pointer-events: none
Screenshot: reports/2026-04-17-checkout-01/step-4.png
Reproduction: Load /cart with free-user persona, add item, proceed to checkout
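
Because the note is plain "Key: value" text, it is easy to pull into a script, for example to build a bug-tracker payload. A minimal sketch, assuming every field sits on its own line as in the example above; parse_failure_note is a hypothetical helper, not part of QA Agent.

```python
def parse_failure_note(note):
    """Split a failure note into a dict, using the first colon on each line."""
    fields = {}
    for line in note.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


note = """Step failed: "Click 'Proceed to checkout'"
Reason: Element matched selector but returned pointer-events: none
Screenshot: reports/2026-04-17-checkout-01/step-4.png
Reproduction: Load /cart with free-user persona, add item, proceed to checkout"""

fields = parse_failure_note(note)
```

Splitting on the first colon only keeps values such as "pointer-events: none" intact.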

GIF / video evidence

At the bottom of each test card is a GIF of the full browser session. Share it directly in a Slack thread or Linear issue so nobody has to reproduce the failure locally to understand what happened.

Sharing reports

Reports are static HTML files in the reports/ directory. Options for sharing:

For team review: Commit the report directory to a qa-reports branch and share the GitHub URL, or upload the folder to any static host.

For release pipelines: Use the built-in Supabase uploader (python cli.py report --upload) if you have SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY set in .env. This uploads the report and returns a public URL.
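
In a pipeline it can help to fail fast when the credentials are missing, before the upload step runs. A minimal preflight sketch: missing_upload_vars is a hypothetical helper, and it checks the process environment, so it assumes your pipeline has already exported the values from .env.

```python
import os

REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")


def missing_upload_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
missing = missing_upload_vars({"SUPABASE_URL": "https://example.supabase.co"})
```

An empty list means both variables are present and it is reasonable to go on to python cli.py report --upload.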

For Linear integration: Set linear.create_issues: true in config.yaml. QA Agent will automatically open a Linear issue for each failing test, attaching the failure note and GIF.

What's next

Cable 5: Organizing Suites & Personas →

Quick answers

What do I get from this cable?

You get a dated field note that explains how we handle this running-tests workflow in real QA Agent projects.

How much time should I budget?

Typical effort is 20 min. The cable is marked intermediate.

How do I install the artifact?

This cable is guidance-only and does not ship an installable artifact.

How fresh is the guidance?

This cable was last verified on 2026-04-17 and includes source links for traceability.
