In my first blog in our series about how we’re using AI to address testing constraints in a fast development cycle, I talked about the problem that forced our hand. You can read it here if you missed it.
Here’s the short story: our team didn’t struggle with testing because our QA team wasn’t good enough; we struggled because success kept compounding our lack of time. Every new feature increased regression scope, validation slowed down, and quality risk went up. For an investing platform, “move fast and fix it later” isn’t an option.
Our answer was to create an AI-fueled system where our people defined intent, and automation handled repetitive tasks.
In this post, I’ll walk through how we actually built that system.
This is the practical, hands-on part of the story. Let’s dive in.

The Three Layers of Our Testing System Framework
The framework has three distinct layers, each built to do one thing well.
- Specification Layer: Human-readable test definitions
- Execution Layer: Auto-generated Playwright tests
- Page Object Layer: Reusable UI interactions with flexible selectors
Let’s look at each one.
Layer 1: Spec Files
Spec files are plain Markdown. Anyone can write them; no coding knowledge is required. They describe intent, not implementation.
A typical spec defines the user story, authentication context, and a sequence of steps written in plain English. Steps can include timing hints for slow operations, logical grouping, and clear success criteria.
What matters most is readability. A product manager, QA engineer, or developer should all be able to read the same spec and understand exactly what’s being tested.
Key characteristics of good specs:
- Written in plain English
- Authentication context defined at the top
- Tests grouped logically
- Notes included for slow or asynchronous operations
Specs define what should happen. They deliberately avoid how it happens.
```markdown
# Tax Loss Harvesting Specifications

## User Story
User can harvest tax losses from investment accounts

## Authentication
USER: DEFAULT

================================================================================
TEST GROUP TLH: Tax Loss Harvesting
================================================================================

--- TEST TLH.1: Harvest Account [HAPPY PATH] ---

Steps:
1. Navigate to harvest page
2. Look for Select Account and search for "Haley Fuller"
3. Confirm Account Summary shows positive values for Equities
4. Click on 'Harvest Account'. This might take up to 60 seconds
5. Confirm 'Selected Capital:' shows positive value
6. Click on 'Optimize Replacements'. This might take up to 120 seconds
7. Click on 'Preview Orders' to verify orders
8. Confirm there are some sells and buys in the orders preview
```
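One payoff of this rigid shape is that specs are trivially machine-readable. As a quick illustration (this helper is hypothetical, not part of our framework), pulling the numbered steps out of a spec takes only a few lines:

```typescript
// Hypothetical illustration -- not part of the framework.
// Extract the numbered step descriptions from a Markdown spec.
function extractSteps(spec: string): string[] {
  return spec
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^\d+\.\s/.test(line))
    .map((line) => line.replace(/^\d+\.\s*/, ""));
}

const spec = [
  "Steps:",
  "1. Navigate to harvest page",
  '2. Look for Select Account and search for "Haley Fuller"',
  "3. Confirm Account Summary shows positive values for Equities",
].join("\n");

console.log(extractSteps(spec)); // three step descriptions, in order
```

The same regularity is what lets Claude map each step to exactly one method call in the generated test.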
Layer 2: Generated Tests
Next, Claude Code reads the spec files and generates executable Playwright tests.
The generated tests mirror the structure of the spec. Each step in the spec maps directly to a method call. The test code stays readable and predictable, while the complexity is pushed down into reusable components.
Important characteristics of generated tests:
- Authentication is applied consistently
- Timeouts reflect real-world workflows
- Soft assertions are used where appropriate
- Valid “no-op” states are handled gracefully
The generated test code reads like a faithful translation of the spec rather than a hand-written automation script. This makes it easier to debug, review, and trust.
```typescript
// generated/tlh.spec.ts
import { test, expect } from '../fixtures';
import { TLHPage } from '../page-objects';

test.use({ userType: 'DEFAULT' });
test.setTimeout(300000); // 5 minutes

test.describe('TLH Tests', () => {
  let tlhPage: TLHPage;

  test.beforeEach(async ({ page }) => {
    tlhPage = new TLHPage(page);
  });

  test('TLH.1: Harvest Account', async ({ page }) => {
    // Step 1: Navigate to Harvest page
    await tlhPage.goto();

    // Step 2: Select account
    await tlhPage.selectAccount('Haley Fuller');

    // Step 3: Verify equities
    const equities = await tlhPage.getEquitiesValue();
    expect.soft(equities, 'Equities should be positive').toBeGreaterThan(0);

    // Step 4: Click Harvest Account (handles 60s timeout)
    await tlhPage.clickHarvestAccount();

    // Step 5: Check for opportunities
    const hasOpportunities = await tlhPage.hasHarvestingOpportunities();
    if (!hasOpportunities) {
      console.log('No harvesting opportunities found');
      return; // Valid state -- test passes
    }

    // Step 6: Optimize Replacements
    await tlhPage.clickOptimizeReplacements();

    // Steps 7-8: Preview and verify orders
    const previewEnabled = await tlhPage.isPreviewOrdersEnabled();
    if (previewEnabled) {
      const orderCounts = await tlhPage.clickPreviewOrders();
      expect.soft(orderCounts.buys).toBeGreaterThan(0);
      expect.soft(orderCounts.sells).toBeGreaterThan(0);
    }
  });
});
```
Layer 3: Page Objects
Page objects handle the messy reality of UI automation.
Selectors can change, elements may get intercepted, and loading states appear and disappear. Page objects exist to absorb that complexity so tests and specs stay clean.
Key responsibilities of page objects:
- Define flexible selectors with fallbacks
- Handle intercepted clicks and overlays
- Encapsulate waiting logic for async operations
- Provide stable, reusable interaction methods
UIs are constantly evolving, so selectors need to tolerate change. We prioritize semantic, role-based selectors with flexible matching, and only fall back to text or test IDs when necessary. That way, routine UI updates don’t cascade into multiple broken tests.
The objective is simple: small changes shouldn’t create big failures.
```typescript
// page-objects/TLHPage.ts
import { Page, Locator } from '@playwright/test';
import { BasePage } from './BasePage';

export class TLHPage extends BasePage {
  readonly url = '/harvest';

  // Declarative locators with fallbacks
  readonly harvestAccountButton: Locator;
  readonly previewOrdersButton: Locator;

  constructor(page: Page) {
    super(page);

    // Primary selector with fallback using .or()
    this.harvestAccountButton = page
      .getByRole('button', { name: /harvest.*account/i })
      .or(page.locator('button:has-text("Harvest Account")'));

    this.previewOrdersButton = page
      .getByRole('button', { name: /preview.*order/i })
      .or(page.locator('button:has-text("Preview Orders")'));
  }

  async selectAccount(accountName: string): Promise<void> {
    // React-select dropdown pattern
    const combobox = this.page.getByRole('combobox').first();
    await combobox.click();
    await this.page.keyboard.type(accountName);
    await this.page.waitForSelector('[role="option"]');
    const option = this.page
      .locator('[role="option"]')
      .filter({ hasText: new RegExp(accountName, 'i') })
      .first();
    await option.click();
  }

  async clickHarvestAccount(): Promise<void> {
    await this.harvestAccountButton.scrollIntoViewIfNeeded();
    // Handle potential element interception
    try {
      await this.harvestAccountButton.click({ timeout: 5000 });
    } catch {
      // Fallback to JavaScript click if normal click is intercepted
      await this.harvestAccountButton.evaluate(
        (btn) => (btn as HTMLButtonElement).click()
      );
    }
    await this.waitForHarvestComplete();
  }

  async waitForHarvestComplete(): Promise<void> {
    // Wait for loading indicators
    const generatingText = this.page.locator('text=/Generating.*replacement/i');
    try {
      await generatingText.waitFor({ state: 'visible', timeout: 5000 });
      await generatingText.waitFor({ state: 'hidden', timeout: 120000 });
    } catch {
      // Operation may complete before we can observe the loading state
    }
  }
}
```
AI-Powered Test Generation: The Script
A simple script orchestrates test generation.
```bash
#!/bin/bash
# scripts/generate-tests.sh

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TESTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"

# Spec file to generate from, e.g. ./generate-tests.sh my-feature
INPUT_FILE="$TESTS_DIR/specs/$1.spec.md"

# Read the AI prompt template
PROMPT_FILE="$TESTS_DIR/docs/GENERATE_TESTS_PROMPT.md"

# Build Claude prompt
CLAUDE_PROMPT="I need you to generate Playwright E2E tests based on:

## Instructions
$(cat "$PROMPT_FILE")

## Test Cases to Implement
$(cat "$INPUT_FILE")

## Task
1. Read existing page objects to understand patterns
2. Create NEW page objects if needed
3. Generate test spec file
4. Use flexible regex selectors
5. Use .or() fallbacks
6. Use expect.soft() for non-critical assertions"

# Run Claude Code
claude --print "$CLAUDE_PROMPT"
```
This keeps generation consistent and repeatable.
The Generation Prompt
This prompt tells Claude how to write resilient tests.
**Selector priority:**
```typescript
// BEST: Role-based with regex
page.getByRole('button', { name: /submit.*request/i });
page.getByRole('textbox', { name: /email/i });

// GOOD: Text-based with regex
page.getByText(/total.*requests|all.*requests/i);

// FALLBACK: Use .or() chains
const submitButton = page
  .getByRole('button', { name: /submit/i })
  .or(page.locator('button:has-text("Submit")'))
  .or(page.locator('[data-testid="submit-btn"]'));
```
**React-Select dropdowns:**
```typescript
async selectFromDropdown(labelText: string, optionText: string) {
  const dropdown = this.page
    .locator(`text=${labelText}`)
    .locator('xpath=following-sibling::*[1]//input[@role="combobox"]')
    .or(this.page.getByRole('combobox').first());
  await dropdown.click();
  await this.page.waitForSelector('[role="option"]');
  const option = this.page
    .locator('[role="option"]')
    .filter({ hasText: new RegExp(optionText, 'i') })
    .first();
  await option.click();
}
```
**Assertions:**
```typescript
// Soft assertions for non-critical checks (test continues if these fail)
expect.soft(value, 'Descriptive message').toBeGreaterThan(0);

// Hard assertions for critical path (test stops if these fail)
expect(isLoggedIn, 'User must be authenticated').toBeTruthy();
```
**Slow operations:**
```typescript
// Wait for loading indicators to appear and disappear
await page.waitForSelector('text=/Loading/i', { state: 'hidden', timeout: 60000 });

// Or wait for specific completion indicators
await page.waitForFunction(() => {
  return !document.querySelector('button')?.textContent?.includes('Loading');
}, { timeout: 120000 });
```
The prompt is where most of the leverage lives. It teaches Claude how to write tests that survive real-world UI changes.
Key principles baked into the prompt:
**Selector priority:**
- Prefer role-based selectors with regex
- Fall back to text-based selectors with regex
- Use explicit test IDs when available
- Avoid brittle CSS selectors unless necessary

**Assertions:**
- Use soft assertions for non-critical checks
- Use hard assertions for critical path validation

**Async behavior:**
- Wait for visible loading states when possible
- Handle operations that may complete too quickly to observe
- Account for long-running workflows explicitly
This ensures that generated tests follow the same patterns every time.
Core Insights on Building Resilient Page Objects
Flexible selectors are the single biggest contributor to test stability.
```typescript
// BRITTLE: Breaks when text changes
page.locator('button:has-text("Submit Request")');

// RESILIENT: Survives minor changes
page.getByRole('button', { name: /submit.*request/i })
  .or(page.locator('[data-testid="submit-btn"]'));
```
Brittle selectors break whenever text, spacing, or component structure changes. Flexible selectors, however, tolerate variation while preserving intent.
For example, matching “Submit,” “Submit Request,” or “Submit Now” with a single regex-based selector dramatically reduces breakage.
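Outside of Playwright, the idea is easy to see with a plain regex. The exact pattern below is illustrative, not the one our framework uses:

```typescript
// Illustrative only: one anchored, case-insensitive pattern covers
// "Submit", "Submit Request", and "Submit Now", while still
// rejecting unrelated labels.
const submitPattern = /^submit(\s+(request|now))?$/i;

const variants = ["Submit", "Submit Request", "Submit Now"];
console.log(variants.every((label) => submitPattern.test(label))); // true
console.log(submitPattern.test("Cancel")); // false
```

When a designer later renames the button, the selector keeps working as long as the intent-carrying word survives.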
Selector Priority
When writing selectors, use this order of preference:
- Role-based selectors with regex: Most resilient and use semantic meaning
- Text-based selectors with regex: A good fallback that handles variations
- Data-testid attributes: An explicit contract between dev and test
- CSS selectors: A last resort, because they break the most
This mirrors how users interact with the UI and how accessibility semantics are defined.
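The ordering can be expressed as a simple first-match lookup. This sketch is framework-agnostic, and the names are ours, purely for illustration:

```typescript
// Illustrative sketch: selector strategies, most resilient first.
type Strategy = "role" | "text" | "testId" | "css";

const PRIORITY: Strategy[] = ["role", "text", "testId", "css"];

// Given the strategies actually available for an element,
// pick the highest-priority one.
function pickStrategy(available: Set<Strategy>): Strategy | undefined {
  return PRIORITY.find((strategy) => available.has(strategy));
}

console.log(pickStrategy(new Set<Strategy>(["css", "text"]))); // -> "text"
console.log(pickStrategy(new Set<Strategy>(["css"]))); // -> "css"
```

In practice the `.or()` chains shown earlier encode the same ordering directly in each locator.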
Common Interaction Patterns
Once you start building page objects this way, tests need to accommodate these interaction patterns by default:
- Buttons should tolerate label variation.
- Form fields should tolerate label and placeholder differences.
- Dropdowns should account for delayed option rendering.
- Tables should allow row-level filtering based on visible text.
Encapsulating these patterns in page objects prevents duplication and keeps tests expressive.
**Form fields:**
```typescript
// Handle label variations
const emailField = page
  .getByRole('textbox', { name: /email/i })
  .or(page.getByPlaceholder(/email/i))
  .or(page.locator('input[type="email"]'));
```
**Dropdowns (React-Select):**
```typescript
async selectOption(optionText: string) {
  const combobox = this.page.getByRole('combobox').first();
  await combobox.click();
  // Wait for options to load
  await this.page.waitForSelector('[role="option"]');
  // Find option with flexible matching
  const option = this.page
    .locator('[role="option"]')
    .filter({ hasText: new RegExp(optionText, 'i') })
    .first();
  await option.click();
}
```
**Tables:**
```typescript
async getRowByText(text: string) {
  return this.page
    .locator('tr')
    .filter({ hasText: new RegExp(text, 'i') })
    .first();
}

async clickActionInRow(rowText: string, actionText: string) {
  const row = await this.getRowByText(rowText);
  await row
    .getByRole('button', { name: new RegExp(actionText, 'i') })
    .click();
}
```
Handling Intercepted Clicks
Modern UIs often introduce overlays, modals, and transitions that can intercept clicks. Page objects handle this defensively in three key ways:
- Scroll elements into view
- Attempt a normal click
- Fall back to a JavaScript-triggered click if needed
This prevents flaky failures caused by timing or layering issues.
```typescript
async clickButton(locator: Locator) {
  await locator.scrollIntoViewIfNeeded();
  try {
    await locator.click({ timeout: 5000 });
  } catch {
    // Fallback to JavaScript click if normal click is intercepted
    await locator.evaluate((btn) => (btn as HTMLButtonElement).click());
  }
}
```
Waiting for Async Operations
Most UI interactions trigger API calls or background processing.
Reliable waiting strategies include:
- Observing loading indicators when present
- Waiting for indicators to disappear
- Falling back to network idle states
- Gracefully handling operations that complete too quickly to observe
Waiting logic belongs in page objects, not in test cases.
```typescript
async waitForLoadingComplete() {
  const loadingIndicator = this.page.locator('text=/loading|processing/i');
  try {
    await loadingIndicator.waitFor({ state: 'visible', timeout: 2000 });
    await loadingIndicator.waitFor({ state: 'hidden', timeout: 60000 });
  } catch {
    // Loading may complete before we can observe it
  }
  // Also wait for network to settle
  await this.page.waitForLoadState('networkidle');
}
```
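Under the hood, the appear-then-disappear logic is just polling against a deadline. Here is a framework-agnostic sketch of that pattern; `waitUntil` is our own illustrative helper, not a Playwright API:

```typescript
// Framework-agnostic sketch of deadline-based polling.
// Checks `predicate` every `interval` ms until it returns true
// or `timeout` ms have elapsed.
async function waitUntil(
  predicate: () => boolean,
  timeout = 5000,
  interval = 50
): Promise<boolean> {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    if (predicate()) return true;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  return predicate(); // one last check at the deadline
}

// Usage: resolves true once the flag flips within the timeout.
let loadingDone = false;
setTimeout(() => { loadingDone = true; }, 100);
waitUntil(() => loadingDone, 1000).then((ok) => console.log(ok)); // true
```

Playwright's `waitFor` and `waitForLoadState` do this for you with better signals (DOM state, network activity), which is why the page objects lean on them instead of hand-rolled loops.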
The Workflow
Here’s how the system works from start to finish.
1. **Write the spec.** A human writes or updates a spec file describing the desired behavior in plain English.
2. **Generate tests.** Claude Code translates the spec into executable tests and supporting page objects.
3. **Run and fix.** Run the generated tests. If a test fails, inspect the error context, adjust selectors or waits, and ask Claude to refine the implementation.
4. **Add more tests.** Append new scenarios to the spec, regenerate only what’s needed, and scale the framework incrementally.
### Step 1: Write the Spec
```markdown
# My Feature Specifications

## Authentication
USER: DEFAULT

================================================================================
TEST GROUP A: Core Functionality
================================================================================

--- TEST A.1: Basic Flow [HAPPY PATH] ---

Steps:
1. Navigate to /my-feature
2. Click "Start" button
3. Fill in "Name" field with "Test User"
4. Click "Submit"
5. Verify success message appears
```
### Step 2: Generate Tests
```bash
./tests/scripts/generate-tests.sh my-feature
```
Claude will:
1. Read existing page objects for patterns
2. Create `page-objects/MyFeaturePage.ts` if needed
3. Generate `generated/my-feature.spec.ts`
4. Update `page-objects/index.ts` exports
### Step 3: Run and Fix
```bash
# Run the test
npx playwright test tests/generated/my-feature.spec.ts --project=chromium

# If it fails, check the error context
cat tests/results/test-artifacts/*/error-context.md

# Ask Claude to fix issues
claude "Fix the selector for the Submit button in MyFeaturePage.ts --
it's being intercepted by a modal overlay"
```
### Step 4: Add More Tests
```bash
# Add to existing spec
echo "
--- TEST A.2: Validation Error ---
Steps:
1. Navigate to /my-feature
2. Click Submit without filling fields
3. Verify error message appears
" >> tests/specs/my-feature.spec.md

# Generate just the new test
./tests/scripts/add-test.sh my-feature A.2
```
Next Up
At this point, you’ve seen how the framework is structured and how specs, generated tests, and page objects work together to remove QA as a bottleneck without removing human judgment.
In Part 3, the final entry in this series, we’ll move from framework mechanics to real-world impact.
Check back next Wednesday or subscribe to my weekly email for a reminder. See you then.
