
Inside a Production-Grade Playwright Framework: What Most Teams Get Wrong

Most Playwright frameworks look impressive at first.

Clean folders.

A shiny config file.

A few passing tests.

Then real life happens.

More features. More contributors. CI pipelines get slower. Failures become harder to explain. Someone adds a workaround. Someone else adds another. Six months later, the framework technically still runs — but no one fully trusts it.

At QAstra, this is usually when teams call us.

Not because Playwright failed them — but because the framework around it did.

This article walks through the most common anti-patterns we see in Playwright frameworks, why they cause instability, and how QAstra designs architectures that actually survive production usage.

What “Production-Grade” Really Means

A production-grade Playwright framework is not one that:

  • Has the most tests
  • Uses the newest APIs
  • Looks impressive in a demo

It’s one that:

  • Scales without becoming fragile
  • Is understandable by someone new to the team
  • Fails loudly and clearly
  • Can be trusted in CI without constant reruns

Most teams don’t struggle because Playwright is immature.

They struggle because framework design was treated as setup work instead of engineering work.

The most common mistake is rebuilding Selenium habits on top of Playwright.

Teams recreate:

  • Deep Page Object inheritance
  • Generic “click” helpers
  • Custom wait wrappers
  • Forced interactions to “stabilize” tests

Here’s a real example we see often:

❌ Anti-pattern: Generic click helpers

import type { Page } from '@playwright/test';

// Hides Playwright behavior and encourages force-clicking
export async function clickElement(page: Page, selector: string) {
  await page.waitForSelector(selector);
  await page.click(selector, { force: true });
}

This removes:

  • Auto-waiting
  • Actionability checks
  • Strictness
  • Clear failure signals

✅ QAstra approach: Let Playwright do the work

await page.getByRole('button', { name: 'Submit' }).click();
await expect(page.getByRole('status')).toHaveText(/saved/i);

We avoid generic wrappers unless there’s a clear, repeatable reason.

Abstraction is earned — not automatic.
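
When a wrapper is earned, it stays thin and preserves Playwright's semantics. A minimal sketch, assuming a hypothetical toast assertion repeated across many specs:

// asserts/toasts.ts: hypothetical helper, justified because dozens of specs assert toasts
import { expect, type Page } from '@playwright/test';

// Thin wrapper: scopes to the status region and keeps web-first assertions intact
export async function expectToast(page: Page, message: string | RegExp) {
  await expect(page.getByRole('status')).toHaveText(message);
}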

Flakiness often starts with time-based assumptions.

❌ Anti-pattern: Hard waits

await page.click('#submit');
await page.waitForTimeout(3000);
await expect(page.locator('.toast-success')).toBeVisible();

This works until:

  • CI is slower
  • The backend hiccups
  • The UI animation changes

✅ QAstra approach: State-based waiting

await page.getByRole('button', { name: 'Submit' }).click();
await expect(page.getByRole('status')).toHaveText(/success/i);

We wait for conditions, not guesses.

This single principle eliminates more flakiness than any AI tool.
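
The same principle applies when the condition lives in the network rather than the DOM. A sketch, assuming a hypothetical /api/save endpoint:

// Register the wait before clicking so the response is never missed
const saveResponse = page.waitForResponse(
  (resp) => resp.url().includes('/api/save') && resp.ok(),
);
await page.getByRole('button', { name: 'Submit' }).click();
await saveResponse;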

Many frameworks tightly couple tests to DOM structure.

❌ Anti-pattern: DOM-dependent selectors

await page.locator('div.card > div:nth-child(2) > button.btn-primary').click();

This fails the moment:

  • CSS classes change
  • Layout is refactored
  • A design system is updated

✅ QAstra approach: Intent-driven locators

await page.getByRole('button', { name: 'Create account' }).click();

Users don’t click CSS paths.

They click intent.
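
The same intent-first thinking covers inputs and navigation, not just buttons. A sketch of the rough locator priority we follow; the names here are illustrative:

// Prefer accessible roles and labels; fall back to test IDs only for
// elements with no user-facing handle
await page.getByLabel('Email').fill('user@example.com');
await page.getByPlaceholder('Search invoices').fill('INV-1042');
await page.getByTestId('sidebar-billing').click(); // last resort: a stable hook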

Ambiguous locators are silent killers.

❌ Anti-pattern: Ambiguous text match

// Teams add .first() to silence strict-mode errors, which hides the real problem
await page.locator('text=Save').first().click();

This passes until:

  • Another “Save” button appears
  • The wrong button is clicked silently

✅ QAstra approach: Scoped + strict locators

const modal = page.getByRole('dialog', { name: 'Edit profile' });
await modal.getByRole('button', { name: 'Save' }).click();

We want failures early — not incorrect passes.

Shared state between tests is the next trap. Parallel execution exposes it instantly.

❌ Anti-pattern: Shared page & login

let page: Page;

test.beforeAll(async ({ browser }) => {
  page = await browser.newPage();
  await page.goto('/');
  await page.fill('#user', 'admin');
  await page.fill('#pass', 'admin');
  await page.click('#login');
});

Tests now depend on:

  • Execution order
  • Hidden state
  • Cleanup discipline

✅ QAstra approach: Isolation + storageState

test.use({ storageState: 'storage/auth.json' });

test('Settings', async ({ page }) => {
  await page.goto('/settings');
});

test('Billing', async ({ page }) => {
  await page.goto('/billing');
});

Isolation is non-negotiable.

State sharing happens intentionally — not accidentally.
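
Where does storage/auth.json come from? A minimal sketch of a setup test that produces it once per run; the field labels and the post-login route are assumptions about the app:

// auth.setup.ts: runs once, before the projects that depend on it
import { test as setup } from '@playwright/test';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Username').fill(process.env.TEST_USER ?? 'admin');
  await page.getByLabel('Password').fill(process.env.TEST_PASS ?? 'admin');
  await page.getByRole('button', { name: 'Log in' }).click();
  await page.waitForURL('**/dashboard'); // assumed post-login route
  await page.context().storageState({ path: 'storage/auth.json' });
});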

Retries feel comforting. They’re dangerous.

❌ Anti-pattern: Blanket retries

import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: 3,
});

This hides:

  • Product bugs
  • Synchronization issues
  • Test design flaws

✅ QAstra approach: Controlled retries + visibility

import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 1 : 0,
  use: {
    trace: 'retain-on-failure',
    video: 'retain-on-failure',
  },
});

Retries are a signal, not a solution.
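
When one area genuinely needs retries, say a flow behind a third-party iframe, scope them to that suite instead of the whole config. A sketch using per-suite configuration:

// checkout.spec.ts: retries scoped to one known-unstable suite
import { test } from '@playwright/test';

test.describe('Checkout via third-party payment iframe', () => {
  test.describe.configure({ retries: 2 }); // a documented exception, not a default

  // ...tests for this flow only
});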

Every struggling framework has one.

❌ Anti-pattern: God-utility file

helpers.ts

  • waitForLoader
  • randomData
  • apiCalls
  • assertions
  • retries

Eventually:

  • No one knows what’s safe to reuse
  • Everything depends on everything else

✅ QAstra approach: Clear responsibility boundaries

// pages/LoginPage.ts
export class LoginPage { /* UI logic only */ }

// infra/apiClient.ts
export async function createUser() { /* API only */ }

// asserts/uiAsserts.ts
export async function expectToast() { /* assertions only */ }

If a file can’t explain its role in one sentence, it’s doing too much.
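
One way to keep those boundaries enforceable is to expose each piece through typed fixtures, so a test can only reach what it declares. A sketch, assuming the LoginPage class above takes a page in its constructor:

// fixtures.ts: hypothetical wiring; specs import { test, expect } from here
import { test as base } from '@playwright/test';
import { LoginPage } from './pages/LoginPage';

export const test = base.extend<{ loginPage: LoginPage }>({
  loginPage: async ({ page }, use) => {
    await use(new LoginPage(page));
  },
});

export { expect } from '@playwright/test';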

When tests fail in CI, engineers should see what happened.

❌ Anti-pattern: No diagnostics

use: {
  trace: 'off',
  video: 'off',
}

✅ QAstra approach: Debuggability by default

use: {
  trace: 'retain-on-failure',
  screenshot: 'only-on-failure',
  video: 'retain-on-failure',
}

Failures should tell a story — not start a guessing game.
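
Artifacts only help if CI surfaces them. One sketch of a reporter split between local runs and CI, using Playwright's built-in reporters:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  // CI: keep an HTML report to upload as a build artifact, plus inline annotations
  reporter: process.env.CI
    ? [['html', { open: 'never' }], ['github']]
    : [['list']],
});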

Page Objects rot the same way when one class tries to own the entire app.

❌ Anti-pattern: God Page Object

class AppPage {
  async login() {}
  async createUser() {}
  async manageBilling() {}
  async verifyEmail() {}
}

✅ QAstra approach: Feature-focused design

class LoginPage { /* login only */ }
class UsersPage { /* user flows only */ }
class BillingApi { /* non-UI logic */ }

Smaller objects scale.

Monoliths rot.
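
What "login only" looks like in practice, as a minimal sketch; the locators and routes are illustrative:

// pages/LoginPage.ts: small enough to read in one sitting
import type { Page } from '@playwright/test';

export class LoginPage {
  constructor(private readonly page: Page) {}

  async goto() {
    await this.page.goto('/login');
  }

  async signIn(user: string, pass: string) {
    await this.page.getByLabel('Username').fill(user);
    await this.page.getByLabel('Password').fill(pass);
    await this.page.getByRole('button', { name: 'Log in' }).click();
  }
}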

What QAstra’s Production-Grade Architecture Optimizes For

  • Deterministic behavior over “smart” retries
  • Minimal abstraction over Playwright primitives
  • CI visibility built in
  • Isolation by default
  • AI used as an assistant — never a silent fixer

Most importantly, the framework is boring.

Boring frameworks last.

Clever ones break.

The Litmus Test We Use

When reviewing a Playwright framework, we ask:

  • Can a new engineer understand a failure without tribal knowledge?
  • Would this still work with 5× more tests?
  • Do failures fail loudly and early?
  • Can CI failures be trusted?

If the answer is “not really,” the framework isn’t production-grade yet.

Final Thought

Playwright is powerful.

But power alone doesn’t create reliability.

Most teams don’t fail because Playwright is new.

They fail because framework design wasn’t treated as engineering work.

At QAstra, we design Playwright frameworks like products — something that must earn trust, scale predictably, and remain readable long after the first demo passes.

That’s how Playwright becomes a competitive advantage — not just another tool.

Ready to Build a Framework That Actually Lasts?

At QAstra Technologies, we help teams design and stabilize Playwright frameworks that work in real CI/CD pipelines — not just in demos.

If your current setup feels fragile, confusing, or hard to scale, we’d be happy to take a look.
