E2E tests are Playwright specs in tests/e2e/*.spec.ts. Playwright launches the dev server automatically before running tests. Tests run against both ChatGPT and Claude hosts via Playwright projects.
pnpm

    pnpm test                                      # Run unit + e2e
    pnpm test:e2e                                  # E2E only
    pnpm test:e2e -- --ui                          # Playwright UI mode
    pnpm test:e2e -- tests/e2e/albums.spec.ts      # Single file

npm

    npm run test                                   # Run unit + e2e
    npm run test:e2e                               # E2E only
    npm run test:e2e -- --ui                       # Playwright UI mode
    npm run test:e2e -- tests/e2e/albums.spec.ts   # Single file

yarn

    yarn test                                      # Run unit + e2e
    yarn test:e2e                                  # E2E only
    yarn test:e2e --ui                             # Playwright UI mode
    yarn test:e2e tests/e2e/albums.spec.ts         # Single file
Import test and expect from sunpeak/test. The mcp fixture provides protocol-level methods, and the inspector fixture handles rendering, double-iframe traversal, and host selection. Configure Playwright with defineConfig from sunpeak/test/config:
    // playwright.config.ts
    import { defineConfig } from 'sunpeak/test/config';

    export default defineConfig();
This auto-detects sunpeak projects and creates per-host Playwright projects (chatgpt, claude). Each test runs once per host automatically, so no host loops are needed.
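To make the per-host behavior concrete, the sketch below shows roughly what "one Playwright project per host" amounts to. It is not sunpeak's implementation: buildHostProjects and the metadata.host field are hypothetical names used for illustration, and only the project names chatgpt and claude come from the text above.

```typescript
// Hypothetical sketch: one Playwright-style project entry per host.
// `HostProject` mirrors only the `name` and `metadata` fields of a Playwright project.
type Host = 'chatgpt' | 'claude';

interface HostProject {
  name: Host;
  metadata: { host: Host };
}

// Build one project entry for each host, preserving the given order.
function buildHostProjects(hosts: readonly Host[]): HostProject[] {
  return hosts.map((host) => ({ name: host, metadata: { host } }));
}

const projects = buildHostProjects(['chatgpt', 'claude']);
console.log(projects.map((p) => p.name).join(', '));
```

Because each host is an ordinary Playwright project, Playwright's standard --project flag should be able to narrow a run to one host (for example, pnpm test:e2e -- --project=chatgpt), assuming the test:e2e script forwards its arguments to Playwright as the commands above suggest.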
For non-sunpeak projects, pass a server option to defineConfig:
    import { defineConfig } from 'sunpeak/test/config';

    export default defineConfig({
      server: 'http://localhost:8000/mcp',
    });
For stdio servers, pass a command and optional configuration:
    import { defineConfig } from 'sunpeak/test/config';

    export default defineConfig({
      server: {
        command: 'python',
        args: ['server.py'],
        env: { DATABASE_URL: 'sqlite:///test.db' },
        cwd: './my-server',
      },
      timeout: 90_000, // Server startup timeout in ms (default: 60000)
    });
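The two server forms above (a URL string and a stdio command object) can be modeled as a union type. The following is a minimal sketch under that assumption: ServerOption, StdioServer, and describeServer are hypothetical names, not exports of sunpeak, and the field list simply mirrors the example config.

```typescript
// Hypothetical types mirroring the two `server` shapes shown above.
interface StdioServer {
  command: string;
  args?: string[];
  env?: Record<string, string>;
  cwd?: string;
}

type ServerOption = string | StdioServer;

// Return a short human-readable description of how the server would be reached.
function describeServer(server: ServerOption): string {
  if (typeof server === 'string') {
    // A plain string is treated as the URL of an already-running HTTP server.
    return `http ${server}`;
  }
  // An object is treated as a process to spawn and talk to over stdio.
  const args = server.args ?? [];
  return ['stdio', server.command, ...args].join(' ');
}

console.log(describeServer('http://localhost:8000/mcp'));
console.log(describeServer({ command: 'python', args: ['server.py'] }));
```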
For sunpeak projects, the dev server is auto-detected and started:
    import { defineConfig } from 'sunpeak/test/config';

    export default defineConfig();
inspector.renderTool renders the tool result in the inspector and returns an InspectorResult with both the MCP data and a UI locator. With input, the tool is called on the real server. Without input, simulation fixture data is used when available; otherwise the server is called with empty arguments. The returned InspectorResult includes a source field ('fixture' or 'server') indicating where the data came from, and a screenshot() method for visual regression testing.

Inspector sidebars are hidden by default in this fixture so that app E2E and visual tests do not depend on inspector layout. Pass { sidebar: true } when a test needs the inspector controls.
    // Call the real server with specific arguments
    const result = await inspector.renderTool('search', { query: 'test', limit: 10 });
    expect(result).not.toBeError();

    const app = result.app();
    await expect(app.getByText('test')).toBeVisible();

    // Use simulation fixture data, or call the server with empty args.
    // A distinct name avoids redeclaring `result` in the same scope.
    const darkResult = await inspector.renderTool('show-albums', undefined, { theme: 'dark' });
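The rule for where the data comes from can be written as a small decision function. This is a sketch of the behavior described above, not sunpeak's code: resolveSource, Resolution, and the hasFixture flag are hypothetical, while the 'fixture' and 'server' values match the documented source field.

```typescript
type Source = 'fixture' | 'server';

interface Resolution {
  source: Source;
  args: Record<string, unknown>;
}

// Decide where tool data comes from:
// 1. explicit input          -> real server, called with that input
// 2. no input, fixture found -> simulation fixture data
// 3. no input, no fixture    -> real server, called with empty args
function resolveSource(
  input: Record<string, unknown> | undefined,
  hasFixture: boolean,
): Resolution {
  if (input !== undefined) {
    return { source: 'server', args: input };
  }
  if (hasFixture) {
    return { source: 'fixture', args: {} };
  }
  return { source: 'server', args: {} };
}

console.log(resolveSource({ query: 'test', limit: 10 }, true).source);
console.log(resolveSource(undefined, true).source);
console.log(resolveSource(undefined, false).source);
```

In a real spec, asserting on result.source is one way to confirm that a test exercised the path you intended.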
The options object accepts theme, displayMode, sidebar, and timeout. Per-call timeout overrides the config default.
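A sketch of how these options might resolve against defaults follows. The option names come from the list above; the value types, the resolveOptions helper, and the 30-second figure are assumptions for illustration. The sidebar default of false reflects the statement that sidebars are hidden by default.

```typescript
// Hypothetical option shape; field names follow the list above,
// value types are assumptions.
interface RenderOptions {
  theme?: 'light' | 'dark';
  displayMode?: string;
  sidebar?: boolean;
  timeout?: number;
}

interface ResolvedOptions extends RenderOptions {
  sidebar: boolean;
  timeout: number;
}

// Fill in defaults: the sidebar stays hidden unless requested, and a
// per-call timeout takes precedence over the configured default.
function resolveOptions(configTimeout: number, options: RenderOptions = {}): ResolvedOptions {
  return {
    ...options,
    sidebar: options.sidebar ?? false,
    timeout: options.timeout ?? configTimeout,
  };
}

const defaults = resolveOptions(30_000); // arbitrary example default
const slowCall = resolveOptions(30_000, { theme: 'dark', timeout: 120_000 });
console.log(defaults.timeout, slowCall.timeout);
```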
If your resource calls backend tools via useCallServerTool, define mock responses using the serverTools field in the simulation JSON. The inspector resolves these mocks based on the tool call arguments.
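Argument-based mock resolution can be illustrated with a small lookup. Everything in this sketch is an assumption: the when/result entry shape, the first-match-wins rule, and the resolveMock helper are hypothetical and may not match sunpeak's actual serverTools schema, so check the simulation file reference before relying on it.

```typescript
// Hypothetical mock entry: `when` lists argument values that must match,
// `result` is the canned response. An entry without `when` matches any call.
interface ToolMock {
  when?: Record<string, unknown>;
  result: unknown;
}

type ServerToolMocks = Record<string, ToolMock[]>;

// Return the result of the first mock whose `when` values all equal the
// corresponding call arguments (shallow comparison), or undefined if none match.
function resolveMock(
  mocks: ServerToolMocks,
  tool: string,
  args: Record<string, unknown>,
): unknown {
  const candidates = mocks[tool] ?? [];
  const match = candidates.find((mock) =>
    Object.entries(mock.when ?? {}).every(([key, value]) => args[key] === value),
  );
  return match?.result;
}

// Illustrative data only; 'review' and its arguments are made up.
const mocks: ServerToolMocks = {
  review: [
    { when: { approved: true }, result: { status: 'published' } },
    { result: { status: 'cancelled' } }, // fallback for any other arguments
  ],
};

console.log(resolveMock(mocks, 'review', { approved: true }));
console.log(resolveMock(mocks, 'review', { approved: false }));
```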
Use the options argument of inspector.renderTool() to test your resources in different configurations, for example by passing theme and displayMode. Host coverage needs no extra work, since tests already run across ChatGPT and Claude via Playwright projects.
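One way to cover several configurations is to generate every theme and display mode pair and loop over the result. The helper below is a hypothetical sketch: allConfigurations is not part of sunpeak, and the 'inline' display mode is an assumed value (only fullscreen and the light and dark themes are mentioned in this document).

```typescript
type Theme = 'light' | 'dark';
type DisplayMode = 'inline' | 'fullscreen'; // 'inline' is an assumed value

interface RenderConfig {
  theme: Theme;
  displayMode: DisplayMode;
}

// Build the cartesian product of themes and display modes.
function allConfigurations(
  themes: readonly Theme[],
  modes: readonly DisplayMode[],
): RenderConfig[] {
  const configs: RenderConfig[] = [];
  for (const theme of themes) {
    for (const displayMode of modes) {
      configs.push({ theme, displayMode });
    }
  }
  return configs;
}

const configs = allConfigurations(['light', 'dark'], ['inline', 'fullscreen']);
for (const config of configs) {
  console.log(`${config.theme} / ${config.displayMode}`);
}
```

In a spec, each generated object could be passed as the third argument to inspector.renderTool, with one test registered per configuration so that failures name the exact combination.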
Use result.screenshot() in tests that cover important visual states (light/dark theme, fullscreen, empty states). Visual tests catch CSS regressions that functional assertions miss.