Prerequisites

  • Node.js 20+ is required, even if your MCP server is written in Python, Go, or another language. The testing framework runs on Node.js and Playwright.
  • Your MCP server must be running locally and reachable over HTTP or stdio.

1. Try the inspector (optional)

Before writing tests, try the inspector to verify sunpeak can connect to your server:
npx sunpeak inspect --server http://localhost:8000/mcp
This opens the inspector at http://localhost:3000, where you can call your tools and see them rendered in simulated ChatGPT and Claude runtimes. Browse your tools, switch hosts and themes, and verify everything connects.

2. Scaffold test infrastructure

Once the inspector works, scaffold automated tests:
npx sunpeak test init --server http://localhost:8000/mcp
Or with a stdio command:
npx sunpeak test init --server "python server.py"
This creates test files for all four testing levels. For non-JS projects, everything goes into a self-contained tests/sunpeak/ directory with its own package.json. Install dependencies:
cd tests/sunpeak
npm install
npx playwright install chromium

3. Run the smoke test

npx sunpeak test
For non-JS projects, sunpeak test auto-discovers tests/sunpeak/playwright.config.ts when no root-level config exists. You can run it from your project root without cd-ing into the test directory. The scaffolded smoke test verifies that the inspector can connect to your server and load. You should see one passing test.
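The lookup order described above can be sketched as a small pure function. This is only an illustration of the fallback rule, not sunpeak's actual implementation. The root-level file name playwright.config.ts, the resolveConfig name, and the exists callback are assumptions introduced for the example.

```typescript
// Hypothetical sketch of the config lookup order; not sunpeak's real code.
// `exists` is injected so the rule can be exercised without touching the disk.
type Exists = (path: string) => boolean;

// Assumed root-level config name (the source only says "root-level config").
const ROOT_CONFIG = 'playwright.config.ts';
// Path named in the text for non-JS projects.
const NESTED_CONFIG = 'tests/sunpeak/playwright.config.ts';

function resolveConfig(exists: Exists): string | undefined {
  // A root-level config takes priority when it exists.
  if (exists(ROOT_CONFIG)) return ROOT_CONFIG;
  // Otherwise fall back to the self-contained test directory.
  if (exists(NESTED_CONFIG)) return NESTED_CONFIG;
  // Neither file was found.
  return undefined;
}
```

With only the nested file present, the function returns the tests/sunpeak path, which matches running the command from the project root of a non-JS project.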

4. Write your first real test

Open the scaffolded smoke test (smoke.test.ts) and add a test for one of your tools. Replace your-tool with an actual tool name from your server:
import { test, expect } from 'sunpeak/test';

test('server is reachable and inspector loads', async ({ inspector }) => {
  await expect(inspector.page.locator('#root')).not.toBeEmpty();
});

test('my tool returns a result', async ({ mcp }) => {
  const result = await mcp.callTool('your-tool', { key: 'value' });
  expect(result.isError).toBeFalsy();
});

// If your tool renders a UI, you can interact with it:
test('my tool renders a UI', async ({ inspector }) => {
  const result = await inspector.renderTool('your-tool', { key: 'value' });
  const app = result.app();
  await expect(app.getByText('Expected text')).toBeVisible();
});
There are two fixtures. mcp is for protocol-level testing: callTool, listTools, and similar calls return raw MCP data. inspector is for UI testing: renderTool renders a tool's result in the inspector. Together they handle the plumbing: starting the inspector, connecting to your server, navigating to the tool, and traversing the double-iframe sandbox. Each test runs automatically against both ChatGPT and Claude hosts.
When you pass input to renderTool, the tool is called on your real server and the result is rendered. Without input, the tool uses pre-baked simulation fixture data (if available) for fast, deterministic tests. See Simulations for more on when to use each approach.
To find the right tool names and arguments for your tests, run npx sunpeak inspect --server <url> and browse your tools interactively.
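Because the mcp fixture returns raw MCP data, assertions often need to pull text out of the result's content blocks. The helper below is a minimal sketch, assuming the result follows the MCP tool-result shape (a content array of typed blocks plus an optional isError flag). The ToolResult and ContentBlock types and the textOf name are hypothetical and model only the fields used here.

```typescript
// Hypothetical minimal model of an MCP tool result; only the fields used here.
interface ContentBlock {
  type: string;
  text?: string;
}

interface ToolResult {
  content?: ContentBlock[];
  isError?: boolean;
}

// Join the text of all text blocks, ignoring images and other block types.
function textOf(result: ToolResult): string {
  return (result.content ?? [])
    .filter((block) => block.type === 'text' && typeof block.text === 'string')
    .map((block) => block.text as string)
    .join('\n');
}
```

In a test, something like expect(textOf(result)).toContain('Expected text') could then follow the isError check, provided the value returned by callTool has this shape.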

5. Add more test levels

The scaffolded files include templates for all four testing levels:
| Level | File | Command | Cost |
| --- | --- | --- | --- |
| E2E | smoke.test.ts | sunpeak test | Free |
| Visual | visual.test.ts | sunpeak test --visual | Free |
| Live | live/example.test.ts | sunpeak test --live | Host credits |
| Evals | evals/example.eval.ts | sunpeak test --eval | API keys |
Start with E2E tests (free, fast, local). Add visual regression when you want to catch CSS regressions. Add live tests and evals when you need production host validation and multi-model reliability testing.
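One way to apply this progression, for example in a CI script, is to choose commands based on what credentials are available. The sketch below encodes the four levels from the table. The runnableCommands helper and the hostCredits and apiKeys flags are hypothetical names for illustration, not part of sunpeak.

```typescript
// What each level needs before it can run, per the table of test levels.
type Requirement = 'none' | 'host credits' | 'API keys';

interface TestLevel {
  name: string;
  command: string;
  requires: Requirement;
}

const LEVELS: TestLevel[] = [
  { name: 'E2E', command: 'sunpeak test', requires: 'none' },
  { name: 'Visual', command: 'sunpeak test --visual', requires: 'none' },
  { name: 'Live', command: 'sunpeak test --live', requires: 'host credits' },
  { name: 'Evals', command: 'sunpeak test --eval', requires: 'API keys' },
];

// Hypothetical helper: keep only the levels whose requirement is satisfied.
function runnableCommands(have: { hostCredits: boolean; apiKeys: boolean }): string[] {
  return LEVELS.filter((level) => {
    if (level.requires === 'host credits') return have.hostCredits;
    if (level.requires === 'API keys') return have.apiKeys;
    return true; // free levels always run
  }).map((level) => level.command);
}
```

With neither credential available, only the two free commands remain, which is the suggested starting point.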

Language-specific tips

For Python stdio servers, pass the full command, including any virtual environment activation:
// playwright.config.ts
import { defineConfig } from 'sunpeak/test/config';
export default defineConfig({
  server: {
    // Option 1: uv (recommended)
    command: 'uv', args: ['run', 'python', 'server.py'],

    // Option 2: path to the interpreter inside the virtual environment
    // command: '.venv/bin/python', args: ['server.py'],

    // Option 3: HTTP server (no shell needed)
    // url: 'http://localhost:8000/mcp',

    // Pass environment variables to the server process
    env: { PYTHONPATH: './src', DATABASE_URL: 'sqlite:///test.db' },

    // Set the working directory
    cwd: './my-python-server',
  },
});
HTTP servers (FastAPI, Flask) are the simplest option because you start them separately and sunpeak just connects to the URL.
For a Go server, launch it with go run or connect to a running HTTP server:
import { defineConfig } from 'sunpeak/test/config';
export default defineConfig({
  server: {
    command: 'go', args: ['run', './cmd/server'],
    env: { GO_ENV: 'test' },

    // Or connect to a running HTTP server:
    // url: 'http://localhost:8000/mcp',
  },
});
For a Rust server, launch it with cargo run:
import { defineConfig } from 'sunpeak/test/config';
export default defineConfig({
  server: {
    command: 'cargo', args: ['run', '--release'],
    // url: 'http://localhost:8000/mcp',
  },
});
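All of the configs above describe the server in one of two ways: a command to spawn over stdio, or a URL to connect to over HTTP. The sketch below models that choice as a union type. The field names (command, args, env, cwd, url) come from the examples, but the StdioServer, HttpServer, and describeServer names are hypothetical, and sunpeak's real types may differ. Treating env and cwd as stdio-only is also an assumption.

```typescript
// Hypothetical types mirroring the fields used in the configs above.
interface StdioServer {
  command: string;
  args?: string[];
  env?: Record<string, string>; // assumed to apply to the spawned process
  cwd?: string; // assumed to apply to the spawned process
}

interface HttpServer {
  url: string;
}

type ServerConfig = StdioServer | HttpServer;

// Summarize how a given config would reach the server.
function describeServer(server: ServerConfig): string {
  if ('url' in server) {
    // HTTP: the server is started separately; only the URL is needed.
    return 'http ' + server.url;
  }
  // stdio: the command and its arguments are spawned as a child process.
  return ['stdio', server.command].concat(server.args ?? []).join(' ');
}
```

For example, the uv option from the Python config would be described as a stdio server running uv run python server.py, while the commented-out url option would be described as an HTTP server at that address.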

Next steps

E2E Testing

Write Playwright tests against simulated hosts.

Visual Regression

Screenshot comparison across themes and hosts.

Live Testing

Test against real ChatGPT and Claude.

Evals

Multi-model tool calling reliability.