In my first blog in our series about how we’re using AI to address testing constraints in a fast development cycle, I talked about the problem that forced our hand. You can read it here if you missed it.
Here’s the short story: our team didn’t struggle with testing because our QA team wasn’t good enough; we struggled because success kept compounding our lack of time. Every new feature increased regression scope, validation slowed down, and quality risk went up. For an investing platform, “move fast and fix it later” isn’t an option.
Our answer was to create an AI-fueled system where our people defined intent, and automation handled repetitive tasks.
In this post, I’ll walk through how we actually built that system.
This is the practical, hands-on part of the story. Let’s dive in.

The Three Layers of Our Testing System Framework
The framework has three distinct layers, each built to do one thing well.
- Specification Layer: Human-readable test definitions
- Execution Layer: Auto-generated Playwright tests
- Page Object Layer: Reusable UI interactions with flexible selectors
Let’s look at each one.
Layer 1: Spec Files
Spec files are plain Markdown. Anyone can write them; no coding knowledge is required. They describe intent, not implementation.
A typical spec defines the user story, authentication context, and a sequence of steps written in plain English. Steps can include timing hints for slow operations, logical grouping, and clear success criteria.
What matters most is readability. A product manager, QA engineer, or developer should all be able to read the same spec and understand exactly what’s being tested.
Key characteristics of good specs:
- Written in plain English
- Authentication context defined at the top
- Tests grouped logically
- Notes included for slow or asynchronous operations
Specs define what should happen. They deliberately avoid how it happens.
```markdown
# Tax Loss Harvesting Specifications

## User Story
User can harvest tax losses from investment accounts

## Authentication
USER: DEFAULT

================================================================================
TEST GROUP TLH: Tax Loss Harvesting
================================================================================

--- TEST TLH.1: Harvest Account [HAPPY PATH] ---

Steps:
1. Navigate to harvest page
2. Look for Select Account and search for "Haley Fuller"
3. Confirm Account Summary shows positive values for Equities
4. Click on 'Harvest Account'. This might take up to 60 seconds
5. Confirm 'Selected Capital:' shows positive value
6. Click on 'Optimize Replacements'. This might take up to 120 seconds
7. Click on 'Preview Orders' to verify orders
8. Confirm there are some sells and buys in the orders preview
```
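One payoff of this rigid shape is that specs are trivially machine-readable. As a quick illustration (this helper is hypothetical, not part of our framework), pulling the numbered steps out of a spec takes only a few lines:

```typescript
// Hypothetical illustration -- not part of the framework.
// Extract the numbered step descriptions from a Markdown spec.
function extractSteps(spec: string): string[] {
  return spec
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^\d+\.\s/.test(line))
    .map((line) => line.replace(/^\d+\.\s*/, ""));
}

const spec = [
  "Steps:",
  "1. Navigate to harvest page",
  '2. Look for Select Account and search for "Haley Fuller"',
  "3. Confirm Account Summary shows positive values for Equities",
].join("\n");

console.log(extractSteps(spec)); // three step descriptions, in order
```

The same regularity is what lets Claude map each step to exactly one method call in the generated test.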
Layer 2: Generated Tests
Next, Claude Code reads the spec files and generates executable Playwright tests.
The generated tests mirror the structure of the spec. Each step in the spec maps directly to a method call. The test code stays readable and predictable, while the complexity is pushed down into reusable components.
Important characteristics of generated tests:
- Authentication is applied consistently
- Timeouts reflect real-world workflows
- Soft assertions are used where appropriate
- Valid “no-op” states are handled gracefully
The generated test code reads like a faithful translation of the spec rather than a hand-written automation script. This makes it easier to debug, review, and trust.
```typescript
// generated/tlh.spec.ts
import { test, expect } from '../fixtures';
import { TLHPage } from '../page-objects';

test.use({ userType: 'DEFAULT' });
test.setTimeout(300000); // 5 minutes

test.describe('TLH Tests', () => {
  let tlhPage: TLHPage;

  test.beforeEach(async ({ page }) => {
    tlhPage = new TLHPage(page);
  });

  test('TLH.1: Harvest Account', async ({ page }) => {
    // Step 1: Navigate to Harvest page
    await tlhPage.goto();

    // Step 2: Select account
    await tlhPage.selectAccount('Haley Fuller');

    // Step 3: Verify equities
    const equities = await tlhPage.getEquitiesValue();
    expect.soft(equities, 'Equities should be positive').toBeGreaterThan(0);

    // Step 4: Click Harvest Account (handles 60s timeout)
    await tlhPage.clickHarvestAccount();

    // Step 5: Check for opportunities
    const hasOpportunities = await tlhPage.hasHarvestingOpportunities();
    if (!hasOpportunities) {
      console.log('No harvesting opportunities found');
      return; // Valid state -- test passes
    }

    // Step 6: Optimize Replacements
    await tlhPage.clickOptimizeReplacements();

    // Steps 7-8: Preview and verify orders
    const previewEnabled = await tlhPage.isPreviewOrdersEnabled();
    if (previewEnabled) {
      const orderCounts = await tlhPage.clickPreviewOrders();
      expect.soft(orderCounts.buys).toBeGreaterThan(0);
      expect.soft(orderCounts.sells).toBeGreaterThan(0);
    }
  });
});
```
Layer 3: Page Objects
Page objects handle the messy reality of UI automation.
Selectors can change, elements may get intercepted, and loading states appear and disappear. Page objects exist to absorb that complexity so tests and specs stay clean.
Key responsibilities of page objects:
- Define flexible selectors with fallbacks
- Handle intercepted clicks and overlays
- Encapsulate waiting logic for async operations
- Provide stable, reusable interaction methods
UIs are constantly evolving, so selectors need to tolerate change. We prioritize semantic, role-based selectors with flexible matching, and only fall back to text or test IDs when necessary. That way, routine UI updates don’t cascade into multiple broken tests.
The objective is simple: small changes shouldn’t create big failures.
```typescript
// page-objects/TLHPage.ts
import { Page, Locator } from '@playwright/test';
import { BasePage } from './BasePage';

export class TLHPage extends BasePage {
  readonly url = '/harvest';

  // Declarative locators with fallbacks
  readonly harvestAccountButton: Locator;
  readonly previewOrdersButton: Locator;

  constructor(page: Page) {
    super(page);

    // Primary selector with fallback using .or()
    this.harvestAccountButton = page
      .getByRole('button', { name: /harvest.*account/i })
      .or(page.locator('button:has-text("Harvest Account")'));

    this.previewOrdersButton = page
      .getByRole('button', { name: /preview.*order/i })
      .or(page.locator('button:has-text("Preview Orders")'));
  }

  async selectAccount(accountName: string): Promise<void> {
    // React-select dropdown pattern
    const combobox = this.page.getByRole('combobox').first();
    await combobox.click();
    await this.page.keyboard.type(accountName);
    await this.page.waitForSelector('[role="option"]');
    const option = this.page
      .locator('[role="option"]')
      .filter({ hasText: new RegExp(accountName, 'i') })
      .first();
    await option.click();
  }

  async clickHarvestAccount(): Promise<void> {
    await this.harvestAccountButton.scrollIntoViewIfNeeded();
    // Handle potential element interception
    try {
      await this.harvestAccountButton.click({ timeout: 5000 });
    } catch {
      // Fallback to JavaScript click if normal click is intercepted
      await this.harvestAccountButton.evaluate(
        (btn) => (btn as HTMLButtonElement).click()
      );
    }
    await this.waitForHarvestComplete();
  }

  async waitForHarvestComplete(): Promise<void> {
    // Wait for loading indicators
    const generatingText = this.page.locator('text=/Generating.*replacement/i');
    try {
      await generatingText.waitFor({ state: 'visible', timeout: 5000 });
      await generatingText.waitFor({ state: 'hidden', timeout: 120000 });
    } catch {
      // Operation may complete before we can observe the loading state
    }
  }
}
```
AI-Powered Test Generation: The Script
A simple script orchestrates test generation.
```bash
#!/bin/bash
# scripts/generate-tests.sh

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TESTS_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"

# Spec file to generate from, e.g. ./generate-tests.sh my-feature
INPUT_FILE="$TESTS_DIR/specs/$1.spec.md"

# Read the AI prompt template
PROMPT_FILE="$TESTS_DIR/docs/GENERATE_TESTS_PROMPT.md"

# Build Claude prompt
CLAUDE_PROMPT="I need you to generate Playwright E2E tests based on:

## Instructions
$(cat "$PROMPT_FILE")

## Test Cases to Implement
$(cat "$INPUT_FILE")

## Task
1. Read existing page objects to understand patterns
2. Create NEW page objects if needed
3. Generate test spec file
4. Use flexible regex selectors
5. Use .or() fallbacks
6. Use expect.soft() for non-critical assertions"

# Run Claude Code
claude --print "$CLAUDE_PROMPT"
```
This keeps generation consistent and repeatable.
The Generation Prompt
This prompt tells Claude how to write resilient tests.
**Selector priority:**
```typescript
// BEST: Role-based with regex
page.getByRole('button', { name: /submit.*request/i });
page.getByRole('textbox', { name: /email/i });

// GOOD: Text-based with regex
page.getByText(/total.*requests|all.*requests/i);

// FALLBACK: Use .or() chains
const submitButton = page
  .getByRole('button', { name: /submit/i })
  .or(page.locator('button:has-text("Submit")'))
  .or(page.locator('[data-testid="submit-btn"]'));
```
**React-Select dropdowns:**
```typescript
async selectFromDropdown(labelText: string, optionText: string) {
  const dropdown = this.page
    .locator(`text=${labelText}`)
    .locator('xpath=following-sibling::*[1]//input[@role="combobox"]')
    .or(this.page.getByRole('combobox').first());
  await dropdown.click();
  await this.page.waitForSelector('[role="option"]');
  const option = this.page
    .locator('[role="option"]')
    .filter({ hasText: new RegExp(optionText, 'i') })
    .first();
  await option.click();
}
```
**Assertions:**
```typescript
// Soft assertions for non-critical checks (test continues if these fail)
expect.soft(value, 'Descriptive message').toBeGreaterThan(0);

// Hard assertions for critical path (test stops if these fail)
expect(isLoggedIn, 'User must be authenticated').toBeTruthy();
```
**Slow operations:**
```typescript
// Wait for loading indicators to appear and disappear
await page.waitForSelector('text=/Loading/i', { state: 'hidden', timeout: 60000 });

// Or wait for specific completion indicators
await page.waitForFunction(() => {
  return !document.querySelector('button')?.textContent?.includes('Loading');
}, { timeout: 120000 });
```
The prompt is where most of the leverage lives. It teaches Claude how to write tests that survive real-world UI changes.
Key principles baked into the prompt:
**Selector priority:**
- Prefer role-based selectors with regex
- Fall back to text-based selectors with regex
- Use explicit test IDs when available
- Avoid brittle CSS selectors unless necessary

**Assertions:**
- Use soft assertions for non-critical checks
- Use hard assertions for critical path validation

**Async behavior:**
- Wait for visible loading states when possible
- Handle operations that may complete too quickly to observe
- Account for long-running workflows explicitly
This ensures that generated tests follow the same patterns every time.
Core Insights on Building Resilient Page Objects
Flexible selectors are the single biggest contributor to test stability.
```typescript
// BRITTLE: Breaks when text changes
page.locator('button:has-text("Submit Request")');

// RESILIENT: Survives minor changes
page.getByRole('button', { name: /submit.*request/i })
  .or(page.locator('[data-testid="submit-btn"]'));
```
Brittle selectors break whenever text, spacing, or component structure changes. Flexible selectors, however, tolerate variation while preserving intent.
For example, matching “Submit,” “Submit Request,” or “Submit Now” with a single regex-based selector dramatically reduces breakage.
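Outside of Playwright, the idea is easy to see with a plain regex. The exact pattern below is illustrative, not the one our framework uses:

```typescript
// Illustrative only: one anchored, case-insensitive pattern covers
// "Submit", "Submit Request", and "Submit Now", while still
// rejecting unrelated labels.
const submitPattern = /^submit(\s+(request|now))?$/i;

const variants = ["Submit", "Submit Request", "Submit Now"];
console.log(variants.every((label) => submitPattern.test(label))); // true
console.log(submitPattern.test("Cancel")); // false
```

When a designer later renames the button, the selector keeps working as long as the intent-carrying word survives.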
Selector Priority
When writing selectors, use this order of preference:
- Role-based selectors with regex: Most resilient and use semantic meaning
- Text-based selectors with regex: A good fallback that handles variations
- Data-testid attributes: An explicit contract between dev and test
- CSS selectors: A last resort, because they break the most
This mirrors how users interact with the UI and how accessibility semantics are defined.
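The ordering can be expressed as a simple first-match lookup. This sketch is framework-agnostic, and the names are ours, purely for illustration:

```typescript
// Illustrative sketch: selector strategies, most resilient first.
type Strategy = "role" | "text" | "testId" | "css";

const PRIORITY: Strategy[] = ["role", "text", "testId", "css"];

// Given the strategies actually available for an element,
// pick the highest-priority one.
function pickStrategy(available: Set<Strategy>): Strategy | undefined {
  return PRIORITY.find((strategy) => available.has(strategy));
}

console.log(pickStrategy(new Set<Strategy>(["css", "text"]))); // -> "text"
console.log(pickStrategy(new Set<Strategy>(["css"]))); // -> "css"
```

In practice the `.or()` chains shown earlier encode the same ordering directly in each locator.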
Common Interaction Patterns
Once you start building page objects this way, tests need to accommodate these interaction patterns by default:
- Buttons should tolerate label variation.
- Form fields should tolerate label and placeholder differences.
- Dropdowns should account for delayed option rendering.
- Tables should allow row-level filtering based on visible text.
Encapsulating these patterns in page objects prevents duplication and keeps tests expressive.
**Form fields:**
```typescript
// Handle label variations
const emailField = page
  .getByRole('textbox', { name: /email/i })
  .or(page.getByPlaceholder(/email/i))
  .or(page.locator('input[type="email"]'));
```
**Dropdowns (React-Select):**
```typescript
async selectOption(optionText: string) {
  const combobox = this.page.getByRole('combobox').first();
  await combobox.click();
  // Wait for options to load
  await this.page.waitForSelector('[role="option"]');
  // Find option with flexible matching
  const option = this.page
    .locator('[role="option"]')
    .filter({ hasText: new RegExp(optionText, 'i') })
    .first();
  await option.click();
}
```
**Tables:**
```typescript
async getRowByText(text: string) {
  return this.page
    .locator('tr')
    .filter({ hasText: new RegExp(text, 'i') })
    .first();
}

async clickActionInRow(rowText: string, actionText: string) {
  const row = await this.getRowByText(rowText);
  await row
    .getByRole('button', { name: new RegExp(actionText, 'i') })
    .click();
}
```
Handling Intercepted Clicks
Modern UIs often introduce overlays, modals, and transitions that can intercept clicks. Page objects handle this defensively in three key ways:
- Scroll elements into view
- Attempt a normal click
- Fall back to a JavaScript-triggered click if needed
This prevents flaky failures caused by timing or layering issues.
```typescript
async clickButton(locator: Locator) {
  await locator.scrollIntoViewIfNeeded();
  try {
    await locator.click({ timeout: 5000 });
  } catch {
    // Fallback to JavaScript click if normal click is intercepted
    await locator.evaluate((btn) => (btn as HTMLButtonElement).click());
  }
}
```
Waiting for Async Operations
Most UI interactions trigger API calls or background processing.
Reliable waiting strategies include:
- Observing loading indicators when present
- Waiting for indicators to disappear
- Falling back to network idle states
- Gracefully handling operations that complete too quickly to observe
Waiting logic belongs in page objects, not in test cases.
```typescript
async waitForLoadingComplete() {
  const loadingIndicator = this.page.locator('text=/loading|processing/i');
  try {
    await loadingIndicator.waitFor({ state: 'visible', timeout: 2000 });
    await loadingIndicator.waitFor({ state: 'hidden', timeout: 60000 });
  } catch {
    // Loading may complete before we can observe it
  }
  // Also wait for network to settle
  await this.page.waitForLoadState('networkidle');
}
```
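Under the hood, the appear-then-disappear logic is just polling against a deadline. Here is a framework-agnostic sketch of that pattern; `waitUntil` is our own illustrative helper, not a Playwright API:

```typescript
// Framework-agnostic sketch of deadline-based polling.
// Checks `predicate` every `interval` ms until it returns true
// or `timeout` ms have elapsed.
async function waitUntil(
  predicate: () => boolean,
  timeout = 5000,
  interval = 50
): Promise<boolean> {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    if (predicate()) return true;
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
  return predicate(); // one last check at the deadline
}

// Usage: resolves true once the flag flips within the timeout.
let loadingDone = false;
setTimeout(() => { loadingDone = true; }, 100);
waitUntil(() => loadingDone, 1000).then((ok) => console.log(ok)); // true
```

Playwright's `waitFor` and `waitForLoadState` do this for you with better signals (DOM state, network activity), which is why the page objects lean on them instead of hand-rolled loops.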
The Workflow
Here’s how the system works from start to finish.
1. **Write the spec.** A human writes or updates a spec file describing the desired behavior in plain English.
2. **Generate tests.** Claude Code translates the spec into executable tests and supporting page objects.
3. **Run and fix.** Run the generated tests. If a test fails, inspect the error context, adjust selectors or waits, and ask Claude to refine the implementation.
4. **Add more tests.** Append new scenarios to the spec, regenerate only what’s needed, and scale the framework incrementally.
### Step 1: Write the Spec
```markdown
# My Feature Specifications

## Authentication
USER: DEFAULT

================================================================================
TEST GROUP A: Core Functionality
================================================================================

--- TEST A.1: Basic Flow [HAPPY PATH] ---

Steps:
1. Navigate to /my-feature
2. Click "Start" button
3. Fill in "Name" field with "Test User"
4. Click "Submit"
5. Verify success message appears
```
### Step 2: Generate Tests
```bash
./tests/scripts/generate-tests.sh my-feature
```
Claude will:
1. Read existing page objects for patterns
2. Create `page-objects/MyFeaturePage.ts` if needed
3. Generate `generated/my-feature.spec.ts`
4. Update `page-objects/index.ts` exports
### Step 3: Run and Fix
```bash
# Run the test
npx playwright test tests/generated/my-feature.spec.ts --project=chromium

# If it fails, check the error context
cat tests/results/test-artifacts/*/error-context.md

# Ask Claude to fix issues
claude "Fix the selector for the Submit button in MyFeaturePage.ts --
it's being intercepted by a modal overlay"
```
### Step 4: Add More Tests
```bash
# Add to existing spec
echo "
--- TEST A.2: Validation Error ---
Steps:
1. Navigate to /my-feature
2. Click Submit without filling fields
3. Verify error message appears
" >> tests/specs/my-feature.spec.md

# Generate just the new test
./tests/scripts/add-test.sh my-feature A.2
```
Next Up
At this point, you’ve seen how the framework is structured and how specs, generated tests, and page objects work together to remove QA as a bottleneck without removing human judgment.
In Part 3, the final entry in this series, we’ll move from framework mechanics to real-world impact.
Check back next Wednesday or subscribe to my weekly email for a reminder. See you then.
