How QA Agent Works (Architecture)

Series: QA Agent — Complete Guide, part 6 of 7 · Difficulty: intermediate · Time: 20 min · Category: architecture

Before contributing, understand what runs when you type python cli.py run --suite smoke.

What you'll learn

  • The execution pipeline from CLI command to HTML report
  • What each module in qa_agent/ does
  • How the agent makes pass/fail decisions
  • Which integration points are extension-friendly

Execution pipeline

CLI command
    │
    ▼
cli.py (entry point)
    │  parses --suite / --file / --tag flags
    │
    ▼
runner.py (RunnerService)
    │  resolves test files, manages parallelism (asyncio + semaphore)
    │
    ▼
parser.py (parse_test_case)
    │  reads markdown, extracts title, persona, priority, tags, steps, expected
    │
    ▼
agent_factory.py (build_agent)
    │  creates browser-use Agent with the configured LLM
    │  loads persona session cookies if persona != unauthenticated
    │
    ▼
browser-use Agent (executes steps)
    │  each step → LLM call → browser action
    │  screenshots captured at every step
    │
    ▼
hooks.py (on_step_end, on_done)
    │  captures screenshots, GIFs, video via Playwright
    │
    ▼
reporter.py (generate_report)
    │  builds HTML report from run results
    │
    ▼
linear_reporter.py / supabase_uploader.py (optional)
    │  creates Linear issues for failures, uploads report
    │
    ▼
HTML report on disk
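
The fan-out step in runner.py can be sketched with the asyncio + semaphore pattern named above. This is a minimal illustration, not the real implementation: the function names, the concurrency limit, and the placeholder body are all assumptions.

```python
# Minimal sketch of the parallelism pattern described for runner.py:
# asyncio.gather fans out, a semaphore caps concurrent browser sessions.
import asyncio

async def run_test(name: str, sem: asyncio.Semaphore) -> str:
    async with sem:                  # at most N tests hold the semaphore
        await asyncio.sleep(0)       # placeholder for agent execution
        return f"{name}: passed"

async def main() -> list[str]:
    sem = asyncio.Semaphore(3)       # illustrative limit of 3 in flight
    tests = ["login", "search", "checkout", "profile"]
    return await asyncio.gather(*(run_test(t, sem) for t in tests))

results = asyncio.run(main())
print(results)
```

gather preserves input order, so results line up with the resolved test files regardless of which test finishes first.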

Module responsibilities

| File | Responsibility |
|------|----------------|
| cli.py | Argument parsing, command dispatch (run, auth, report, release) |
| runner.py | Parallel test execution, result aggregation |
| parser.py | Markdown test case parsing → structured TestCase dataclass |
| agent_factory.py | Constructs browser-use Agent with LLM config and persona session |
| hooks.py | Step-level callbacks: screenshots, GIF/video recording |
| reporter.py | HTML report generation from RunResult objects |
| auth.py | Interactive persona session capture and persistence |
| linear_reporter.py | Linear issue creation for failing tests |
| sentry_reporter.py | Sentry event correlation (links test failures to error events) |
| supabase_uploader.py | Report upload to Supabase Storage |
| release_runner.py | CSV-driven release pipeline: reads a test manifest, runs specified suites, generates a consolidated report |
| csv_parser.py | Parses release CSV format into runnable test batches |
| custom_tools.py | Custom browser-use tools (extended actions available to the agent) |
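
To make the parser.py row concrete, here is a deliberately simplified sketch of what parse_test_case might produce. The TestCase fields come from the pipeline description above; the parsing logic and the reduced field set are assumptions for illustration only.

```python
# Illustrative sketch of markdown test-case parsing into a dataclass.
# Real parser.py also extracts persona, priority, and tags.
from dataclasses import dataclass, field

@dataclass
class TestCase:
    title: str
    steps: list = field(default_factory=list)
    expected: list = field(default_factory=list)

def parse_test_case(markdown: str) -> TestCase:
    case, section = TestCase(title=""), None
    for line in markdown.splitlines():
        if line.startswith("# "):               # top-level heading → title
            case.title = line[2:].strip()
        elif line.strip() == "## Steps":
            section = case.steps
        elif line.strip() == "## Expected":
            section = case.expected
        elif line.strip().startswith("-") and section is not None:
            section.append(line.strip()[1:].strip())
    return case

case = parse_test_case("# Login\n## Steps\n- Open /login\n## Expected\n- Dashboard loads")
print(case)
```
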

How pass/fail works

QA Agent does not use assertions in the traditional sense. Instead:

  1. The agent executes every step in the ## Steps section
  2. At the end of the run, the agent evaluates the ## Expected section against what it observed
  3. If all expected outcomes are satisfied, the test passes. If any are not, the test fails with a structured failure note explaining which expectation was not met and why

This means the AI model is both the executor and the judge. For most UI flows this works well. For precise numeric or time-based assertions ("price must be exactly $29.00"), be explicit in the Expected section to reduce ambiguity.
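
For instance, an Expected section written to reduce ambiguity might look like this (a hypothetical fragment, following the ## Steps / ## Expected layout described above):

```markdown
## Expected
- The cart total reads exactly $29.00 (not $29 or $29.0)
- The "Order confirmed" banner is visible without a page reload
- The confirmation email address matches the one entered in step 2
```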

LLM configuration

The model is set in config.yaml:

llm:
  model: google/gemini-3.1-flash-lite-preview
  base_url: https://openrouter.ai/api/v1

agent_factory.py constructs the ChatOpenAI-compatible client from these values. To swap models, change model in config.yaml. Any OpenRouter-compatible model ID works.
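
A sketch of how agent_factory.py might map those config.yaml values onto client keyword arguments. The function name, the environment variable, and the dict-based config are illustrative assumptions; only the model and base_url values come from the config above.

```python
# Hypothetical sketch: turning config.yaml's llm block into the keyword
# arguments for a ChatOpenAI-compatible client. OPENROUTER_API_KEY is an
# assumed environment variable, not documented project behavior.
import os

config = {
    "llm": {
        "model": "google/gemini-3.1-flash-lite-preview",
        "base_url": "https://openrouter.ai/api/v1",
    },
}

def build_llm_kwargs(config: dict) -> dict:
    llm = config["llm"]
    return {
        "model": llm["model"],
        "base_url": llm["base_url"],
        "api_key": os.environ.get("OPENROUTER_API_KEY", ""),
    }

kwargs = build_llm_kwargs(config)
print(kwargs["model"])
```

Because only the model string changes between OpenRouter-compatible models, swapping models is a one-line config edit with no code change.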

Integration points

The cleanest places to extend QA Agent:

  • New integrations (Jira, GitHub Issues, Slack) — add a module alongside linear_reporter.py, implement a report(run_result) function, and call it from runner.py after the run completes
  • Custom agent tools — add browser actions to custom_tools.py (e.g., drag-and-drop, file upload helpers)
  • Custom report formats — replace or extend reporter.py for JSON output, CSV summaries, or custom HTML templates
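
Following the first extension point, a new integration module could look like this sketch of a hypothetical slack_reporter.py. The RunResult fields shown are stand-ins (the real dataclass lives in the project); only the report(run_result) convention comes from the text above.

```python
# Hypothetical slack_reporter.py following the report(run_result)
# convention. A real module would POST the summary to a webhook;
# here it just formats the message. RunResult fields are assumed.
from dataclasses import dataclass, field

@dataclass
class RunResult:                       # stand-in for the project's RunResult
    passed: int
    failed: int
    failures: list = field(default_factory=list)

def report(run_result: RunResult) -> str:
    """Build the run summary runner.py would hand off after a run."""
    lines = [f"QA run: {run_result.passed} passed, {run_result.failed} failed"]
    for name in run_result.failures:
        lines.append(f"  FAIL {name}")
    return "\n".join(lines)

print(report(RunResult(passed=9, failed=1, failures=["checkout smoke"])))
```

Keeping every integration behind the same report(run_result) signature means runner.py can call new reporters without knowing anything about their transport.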

What's next

Cable 7: Contributing to QA Agent →

Quick answers

What do I get from this cable?

You get a walkthrough of QA Agent's execution pipeline, the responsibility of each module in qa_agent/, how pass/fail decisions are made, and where to extend the system.

How much time should I budget?

Typical effort is 20 min. The cable is marked intermediate.

Do I need to know Python?

Basic familiarity with running Python CLI commands is enough for the user guide cables (1–5). The contributor guide (cables 6–7) assumes you can read and write Python.

How fresh is the guidance?

The cable was last verified on 2026-04-17.

More from @frenxt