Watch AI agents autonomously test your e-commerce site and generate reviewable browser replays
An intelligent quality assurance system that uses AI agents to automatically test e-commerce websites, powered by Kernel and Browser Use.
Traditional QA testing is manual, time-consuming, and error-prone. This demo shows how AI agents can:
- **Automate visual testing** - Agents "see" and evaluate product images like a human QA tester would
- **Execute complex workflows** - Navigate multi-step user journeys autonomously
- **Provide reviewable evidence** - Every action is recorded as a browser replay
- **Scale effortlessly** - Test hundreds of products in the time it takes to test one manually
This is the future of QA: AI agents that think, see, and validate like your best QA engineer.
This application showcases dual-phase AI-powered QA testing on e-commerce sites:

**Phase 1 - Product Page QA**
- Navigate to product URLs autonomously
- Evaluate whether the product image accurately matches the title description
- Interact with the page to add items to the cart
- Report a PASS/FAIL verdict with a detailed assessment

**Phase 2 - Cart Page QA**
- Navigate to the cart using the site's natural UI flow
- Search for promotional banners and special offers
- Validate specific promotion text (e.g., "$20 off when you spend $100")
- Report a PASS/FAIL verdict with findings
Each testing phase is automatically recorded as a browser replay, allowing you to:
- See exactly what the AI agent saw
- Review the agent's decision-making process
- Debug failures by watching agent interactions
- Share results with your team
| Feature | Description |
|---|---|
| Autonomous AI Agents | Claude-powered agents that understand context and make intelligent decisions |
| Browser Replays | Every test generates a reviewable recording via Kernel's replay system |
| Visual Understanding | AI agents can "see" and evaluate images, layouts, and visual elements |
| Structured Results | Clean PASS/FAIL verdicts with detailed assessment narratives |
| Secure Credentials | Built-in sensitive data handling for store passwords and API keys |
| Async Execution | Non-blocking API calls with polling for efficient testing workflows |
| Multi-Phase Testing | Sequential test phases with independent replay recordings |
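The "Secure Credentials" row relies on Browser Use's sensitive-data handling: secrets are passed to the agent under placeholder keys so the real values never appear in the task prompt. A minimal sketch, assuming a recent Browser Use release (the placeholder name and import path are illustrative, not taken from this repo):

```python
import os

from browser_use import Agent
from browser_use.llm import ChatAnthropic  # adjust the import to your Browser Use version

# The task text only mentions the placeholder key; the real password is supplied separately.
agent = Agent(
    task="Enter x_store_password on the store's password gate, then open the product page.",
    llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    sensitive_data={"x_store_password": os.environ["STORE_PASSWORD"]},
)
```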
This repository showcases the approach and architecture for building AI-powered QA systems. To adapt it for your use case, you'll need to:
- **Kernel Platform Access** - Active account with API access (sign up)
- **Anthropic API Key** - For Claude AI functionality (get key)
- **Your Test Environment** - E-commerce site or web application to test
- **Custom Agent Tasks** - Modify the AI instructions for your specific QA scenarios

In particular, plan to customize:

- **Agent Task Instructions** - Current tasks are specific to our demo store
- **Product URLs** - Update to match your test environment
- **Validation Logic** - Adapt pass/fail criteria to your requirements
- **Expected Behaviors** - Modify based on what you're testing
Think of this as a blueprint, not a finished product. The value is in seeing how AI agents, browser automation, and replay recording work together to create autonomous QA workflows.
Before you can run this demo, ensure you have:
- Python 3.11+ installed
- Kernel CLI installed and configured (installation guide)
- Anthropic API key with Claude access
- Development Shopify store (or similar e-commerce platform)
- UV package manager (recommended) or pip
```bash
git clone <this-repository>
cd replays-demo
uv sync  # or pip install -r requirements.txt
```
Create a `.env` file in the project root:

```bash
# For running the demo client
KERNEL_API_KEY=your_kernel_api_key_here

# For deployment (can also be set via -e flag)
ANTHROPIC_API_KEY=your_anthropic_api_key_here
STORE_PASSWORD=your_shopify_store_password
```
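The demo client reads `KERNEL_API_KEY` at runtime. If you are adapting the script, one straightforward way to load it is python-dotenv (an assumption for illustration, not necessarily how `demo-qa-agent.py` does it):

```python
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # pulls .env from the project root into the process environment
kernel_api_key = os.environ["KERNEL_API_KEY"]  # raises KeyError early if the key is missing
```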
Update the store configuration in `main.py` and `demo-qa-agent.py`:

- Replace `kernel-test-store-1.myshopify.com` with your store URL
- Modify product URLs in the demo script
- Customize agent task instructions for your specific QA scenarios (see the example sketch below)
- Update expected promotional text and validation criteria
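Agent task instructions are plain natural-language prompts handed to the Browser Use agent. As an illustration only (the real prompts live in `main.py`, and the URL below is a placeholder), a customized product-page task might read:

```python
# Hypothetical example of a customized task prompt; adapt the wording,
# URL, and pass/fail criteria to your own store.
PRODUCT_URL = "https://your-store.myshopify.com/products/example-product"

product_task = f"""
Go to {PRODUCT_URL}.
Compare the main product image against the product title.
Decide whether the image accurately matches the title description.
Then add the item to the cart.
Reply with 'VERDICT: PASS' or 'VERDICT: FAIL' followed by a short explanation.
"""
```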
```bash
# Deploy the Kernel app
kernel deploy main.py --env-file .env

# Run the demo client
uv run demo-qa-agent.py
```
```
Demo Client Script (demo-qa-agent.py)
  • Invokes Kernel app via API
  • Polls for completion
  • Displays formatted results
        │
        │ API Call
        ▼
Kernel Platform (Managed Browser)
└── Kernel App (main.py)
    ├── Phase 1: Product Page QA
    │     • Start replay recording
    │     • AI Agent navigates to product URL
    │     • Evaluate image vs. title match
    │     • Add item to cart
    │     • Stop replay → Generate replay link
    ├── Phase 2: Cart Page QA
    │     • Start new replay recording
    │     • AI Agent navigates to cart
    │     • Search for promotional banners
    │     • Validate specific promotion text
    │     • Stop replay → Generate replay link
    └── Return Results
          • Product assessment + replay link
          • Cart assessment + replay link
```
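In outline, `main.py` runs two Browser Use agents back to back and wraps each run in a replay recording. The sketch below is an approximation rather than the actual implementation: the `start_replay`/`stop_replay` callables stand in for Kernel's replay API, and the Browser Use import path may differ between SDK versions.

```python
from browser_use import Agent
from browser_use.llm import ChatAnthropic  # adjust the import to your Browser Use version

llm = ChatAnthropic(model="claude-sonnet-4-20250514")

async def run_phase(task: str, start_replay, stop_replay) -> tuple[str, str]:
    """Run one QA phase inside a replay recording (approximate shape only).

    start_replay/stop_replay are stand-ins for Kernel's replay calls; they are
    assumed to return and accept whatever handle identifies the recording.
    """
    recording = await start_replay()
    agent = Agent(task=task, llm=llm)
    history = await agent.run()                # the agent browses and acts autonomously
    replay_url = await stop_replay(recording)  # finalize the recording, get a shareable link
    return history.final_result(), replay_url  # agent's verdict text + replay link
```

Phase 1 and Phase 2 would each call something like `run_phase` with their own task string, and the two verdicts plus replay links are what the client ultimately prints.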
| Component | Purpose | Key Technology |
|---|---|---|
| `main.py` | Kernel app orchestrating the dual-agent QA workflow | Browser Use SDK, Claude Sonnet 4 |
| `session.py` | Custom browser session with viewport handling for CDP connections | Browser Use + Playwright |
| `demo-qa-agent.py` | Client script that invokes the app and displays results | Kernel API, async polling |
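The "async polling" in the client follows a common pattern: start the invocation, then check its status on an interval until it reaches a terminal state. A minimal sketch of that loop, with `fetch_status` standing in for the actual Kernel API call and the status names assumed for illustration:

```python
import asyncio
from typing import Awaitable, Callable

POLL_INTERVAL_SECONDS = 5                   # assumption: check every few seconds
TERMINAL_STATES = {"succeeded", "failed"}   # assumption: illustrative status names

async def wait_for_invocation(fetch_status: Callable[[], Awaitable[dict]]) -> dict:
    """Poll until the invocation finishes, then return its final payload."""
    while True:
        invocation = await fetch_status()   # stand-in for the real Kernel status call
        if invocation.get("status") in TERMINAL_STATES:
            return invocation
        await asyncio.sleep(POLL_INTERVAL_SECONDS)
```

Polling keeps the client simple: no callback endpoint is needed, and the assessments plus replay links arrive together in the final payload.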
- **AI Engine**: Claude Sonnet 4 (`claude-sonnet-4-20250514`)
- **Browser Automation**: Browser Use SDK (~0.5.3)
- **Orchestration**: Kernel Platform (>=0.8.1)
- **Browser Control**: Playwright (via Browser Use)
- **Language**: Python 3.11+
The current agents are configured for specific scenarios. To customize:
1. **Update Product Assessment** (`main.py:72-84`):
   - Modify the product evaluation criteria
   - Change expected product types or attributes
   - Adjust cart addition logic

2. **Update Cart Assessment** (`main.py:112-124`):
   - Replace promotional text with your offers
   - Modify banner detection logic
   - Add additional cart validation steps

3. **Add New Assessment Phases**:
   - Create additional agent instances
   - Add new replay recording phases
   - Update the `QAResult` structure (sketched below)
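The real `QAResult` lives in `main.py`; purely as an illustration of what adding a phase entails, a per-phase result plus an extra field might look like this (class and field names are assumptions):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PhaseResult:
    # Hypothetical per-phase shape; the actual fields are defined in main.py.
    verdict: str        # "PASS" or "FAIL"
    assessment: str     # the agent's narrative explanation
    replay_url: str     # link to the recorded browser replay

@dataclass
class QAResult:
    product_page: PhaseResult
    cart_page: PhaseResult
    # A new assessment phase means a new agent run, a new replay recording,
    # and a new field returned here.
    checkout_page: Optional[PhaseResult] = None
```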
When you run `uv run demo-qa-agent.py`, you'll see formatted results like this:
```
============================================================
E-commerce QA Agent Demo
============================================================
Starting AI-powered quality assurance testing...
Testing URL: https://kernel-test-store-1.myshopify.com/products/short-sleeved-red-t-shirt
Invoking QA agents... (this may take 2-3 minutes)
Invocation started (ID: inv_abc123)... polling for completion

============================================================
QA Results
============================================================
Completed in 127.3 seconds
Timestamp: 2025-09-30 14:23:45

Product Page Assessment
----------------------------------------
VERDICT: PASS - The product image clearly shows a red short-sleeved
t-shirt which accurately matches the product title description. The item
was successfully added to the cart.

Cart Page Assessment
----------------------------------------
VERDICT: FAIL - The cart page displays a '$15 off when you spend $75'
promotional banner instead of the expected '$20 off when you spend $100'
promotion.

Browser Replay Links
----------------------------------------
Review the AI agent's actions by clicking these replay links:

Product Page Inspection:
https://replays.onkernel.com/replay/rpl_xyz789

Cart Page Inspection:
https://replays.onkernel.com/replay/rpl_abc456

============================================================
Demo Complete!
============================================================

Key Features Demonstrated:
• AI agents can autonomously navigate websites
• Automated quality assurance testing
• Visual replay links for review and debugging
• Structured assessment reporting
```
This project is licensed under the MIT License - see the LICENSE file for details.
Built with:
- Kernel - Browser orchestration and replay infrastructure
- Browser Use - AI-powered browser automation
- Anthropic Claude - Advanced AI agent intelligence
Ready to build autonomous QA agents? Start by exploring the code in `main.py` to see how the dual-phase agent workflow is structured.