E2E Testing
E2E tests are Playwright specs in tests/e2e/*.spec.ts. The dev server starts automatically: Playwright launches it before running tests. Tests run against both ChatGPT and Claude hosts via Playwright projects.
Writing E2E Tests
Import test and expect from sunpeak/test. The mcp fixture handles inspector navigation, double-iframe traversal, and host selection. It can be used in two kinds of projects:
- Standalone (any MCP server)
- sunpeak framework
For non-sunpeak projects, pass a server option to defineConfig.
URL Parameters
The mcp.callTool() method accepts options for theme, displayMode, and prodResources. For advanced URL parameters, see the Inspector API Reference.
Testing Backend-Only Tools
If your resource calls backend tools via useCallServerTool, define mock responses using the serverTools field in the simulation JSON. The inspector resolves these mocks based on the tool call arguments.
The serverTools field supports both simple (single result) and conditional (when/result array) forms. See the Simulation API Reference for details.
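The difference between the simple and conditional forms can be illustrated with a small resolver. This is a minimal sketch, not the inspector's implementation: the field names when and result come from the text above, while the type names, the resolveMock helper, the first-match-wins rule, and the treatment of an empty when as a fallback are all assumptions made for illustration.

```typescript
// Hypothetical types: only the `when` and `result` field names come from the docs.
type ToolArgs = Record<string, unknown>;

interface ConditionalMock {
  when: ToolArgs;  // arguments that must match the tool call
  result: unknown; // response returned when they match
}

// Returns true for a non-empty array whose entries all look like when/result pairs.
function isConditional(mock: unknown): mock is ConditionalMock[] {
  return (
    Array.isArray(mock) &&
    mock.length > 0 &&
    mock.every(function (entry) {
      return typeof entry === "object" && entry !== null && "when" in entry && "result" in entry;
    })
  );
}

// Assumed matching rule: the first entry whose `when` keys all equal the call arguments wins.
function resolveMock(mock: unknown, args: ToolArgs): unknown {
  if (!isConditional(mock)) {
    return mock; // simple form: the same result for every call
  }
  for (let i = 0; i < mock.length; i++) {
    const entry = mock[i];
    const matches = Object.keys(entry.when).every(function (key) {
      return JSON.stringify(entry.when[key]) === JSON.stringify(args[key]);
    });
    if (matches) {
      return entry.result;
    }
  }
  return undefined; // no condition matched
}

// Conditional form: an empty `when` has no keys to check, so it acts as a fallback here.
const searchMock: ConditionalMock[] = [
  { when: { query: "pizza" }, result: { items: ["Margherita"] } },
  { when: {}, result: { items: [] } },
];

const pizzaResult = resolveMock(searchMock, { query: "pizza" });
const fallbackResult = resolveMock(searchMock, { query: "sushi" });
const simpleResult = resolveMock({ ok: true }, { anything: 1 });
```

Ordering matters under this assumed rule: a fallback entry placed first would shadow every more specific entry after it. Check the Simulation API Reference for the matching behavior the inspector actually uses.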
Example E2E Test Structure
A typical E2E test file tests a resource across different modes. Each test runs automatically against both ChatGPT and Claude hosts.
Best Practices
Keep tests simple
Test one thing per test case. Clear tests are maintainable tests.
Use meaningful test descriptions
Test user-facing behavior
Test what users see and interact with, not implementation details.
Test across hosts, themes, and display modes
Use mcp.callTool() options to test your resources in different configurations. Tests run across ChatGPT and Claude hosts automatically via Playwright projects. Pass theme and displayMode as options. See MCP Apps Display Modes for how hosts handle inline, fullscreen, and PiP views.
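One way to cover the theme and display mode combinations without writing each case by hand is to generate them. The sketch below only builds the list of configurations. The helper names buildConfigMatrix and describeConfig are hypothetical, and the exact string values (in particular "pip") are assumptions that should be checked against the API reference.

```typescript
// Themes and display modes named in the docs; the string spellings are assumed.
const themes = ["light", "dark"] as const;
const displayModes = ["inline", "fullscreen", "pip"] as const;

type Theme = (typeof themes)[number];
type DisplayMode = (typeof displayModes)[number];

interface RenderConfig {
  theme: Theme;
  displayMode: DisplayMode;
}

// Builds every theme/displayMode pair (2 themes x 3 modes = 6 configurations).
function buildConfigMatrix(): RenderConfig[] {
  const matrix: RenderConfig[] = [];
  themes.forEach(function (theme) {
    displayModes.forEach(function (displayMode) {
      matrix.push({ theme: theme, displayMode: displayMode });
    });
  });
  return matrix;
}

// Produces a readable test title for one configuration.
function describeConfig(config: RenderConfig): string {
  return "renders in " + config.theme + " theme, " + config.displayMode + " mode";
}

const configMatrix = buildConfigMatrix();
const testTitles = configMatrix.map(describeConfig);
```

In a spec file, each entry could be looped over to declare one test per configuration, with the entry passed as the options to mcp.callTool(). Because hosts are handled by Playwright projects, the matrix does not need a host dimension.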
Add visual regression tests for key states
Use mcp.screenshot() in tests that cover important visual states (light/dark theme, fullscreen, empty states). Visual tests catch CSS regressions that functional assertions miss.
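To see why a screenshot comparison catches what a functional assertion misses, consider a color change that leaves the text and DOM structure untouched. The sketch below is a conceptual illustration only, not the comparison logic used by mcp.screenshot() or Playwright. The diffPixelRatio helper and its tolerance parameter are hypothetical, and images are assumed to be flat RGBA arrays with four values per pixel.

```typescript
// Conceptual sketch: fraction of pixels that differ between a baseline and a new capture.
function diffPixelRatio(baseline: number[], actual: number[], tolerance: number = 0): number {
  if (baseline.length !== actual.length || baseline.length % 4 !== 0) {
    throw new Error("Images must have the same size and use 4 values per pixel");
  }
  const pixelCount = baseline.length / 4;
  if (pixelCount === 0) {
    return 0;
  }
  let changed = 0;
  for (let i = 0; i < baseline.length; i += 4) {
    // A pixel counts as changed if any channel differs by more than `tolerance`.
    for (let c = 0; c < 4; c++) {
      if (Math.abs(baseline[i + c] - actual[i + c]) > tolerance) {
        changed++;
        break;
      }
    }
  }
  return changed / pixelCount;
}

// Two 2x1 images: the second pixel shifts from white to light gray.
const baselineImage = [0, 0, 0, 255, 255, 255, 255, 255];
const currentImage = [0, 0, 0, 255, 240, 240, 240, 255];

const strictRatio = diffPixelRatio(baselineImage, currentImage);      // 0.5: one of two pixels changed
const tolerantRatio = diffPixelRatio(baselineImage, currentImage, 20); // 0: the shift is within tolerance
```

A text or visibility assertion would pass for both images, while the strict comparison reports that half the pixels changed. A tolerance trades sensitivity for stability against small rendering differences such as anti-aliasing.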
Clean up after tests
Use afterEach to reset state between tests.
Learn More
Visual Regression Testing
Screenshot comparison and baseline management.
Inspector
The runtime that powers E2E tests.
Simulations
JSON schema, conventions, and auto-discovery.