Overview

The sunpeak test command runs automated tests for your MCP server. It supports unit tests, E2E tests against the inspector, visual regression, live tests against real hosts, and multi-model evals.
sunpeak test
With no flags, sunpeak test runs unit tests (Vitest) and E2E tests (Playwright). Live tests and evals are never included in the default run because they require API keys and cost money.
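The split between the free default run and the paid suites can be mirrored in a CI entry point. The sketch below is only an illustration: the function name ci_test_commands and the RUN_PAID_TESTS variable are assumptions made for this example and are not read by sunpeak itself. It prints the commands it would run, so it can be inspected without invoking anything.

```shell
# Sketch of a CI entry point. RUN_PAID_TESTS is a hypothetical variable
# chosen for this example; sunpeak does not read it.
ci_test_commands() {
  # Default run: unit (Vitest) + E2E (Playwright); needs no API keys.
  echo "sunpeak test"
  # Opt in to the suites that require API keys and cost money.
  if [ "${RUN_PAID_TESTS:-0}" = "1" ]; then
    echo "sunpeak test --live"
    echo "sunpeak test --eval"
  fi
}

# Print the plan. To execute it instead, pipe the output to sh:
#   ci_test_commands | sh
ci_test_commands
```

Keeping the paid suites behind an explicit opt-in follows the same reasoning as the default run: nothing that needs an API key executes unless it is asked for.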

Options

Flag               Description
--unit             Run unit tests only (Vitest + happy-dom)
--e2e              Run E2E tests only (Playwright + inspector)
--visual           Run E2E tests with visual regression comparison
--visual --update  Update visual regression baselines
--live             Run live tests against real hosts (ChatGPT)
--eval             Run evals against multiple LLM models
Flags are additive: --unit --e2e --live --eval runs all four suites. The --visual flag implies --e2e, and --update implies --visual. Any extra arguments are passed through to the underlying test runner (Vitest or Playwright):
sunpeak test --e2e --ui                    # Playwright UI mode
sunpeak test --e2e tests/e2e/albums.spec.ts  # Single file
sunpeak test --eval albums                 # Filter evals by name
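
The flag rules above (additive flags, --visual implying --e2e, --update implying --visual, and the unit + E2E default) can be modeled with a small shell function. This is a hypothetical model for reasoning about combinations, not sunpeak's actual implementation: the name resolve_suites is invented here, and the choice to ignore unrecognized arguments when selecting suites is an assumption based on the pass-through behavior described above.

```shell
# Hypothetical model of the documented flag rules; not sunpeak's own code.
# The function body runs in a subshell so its variables do not leak.
resolve_suites() (
  unit=0; e2e=0; visual=0; update=0; live=0; evals=0; selected=0
  for arg in "$@"; do
    case "$arg" in
      --unit)   unit=1; selected=1 ;;
      --e2e)    e2e=1; selected=1 ;;
      --visual) visual=1; e2e=1; selected=1 ;;            # implies --e2e
      --update) update=1; visual=1; e2e=1; selected=1 ;;  # implies --visual
      --live)   live=1; selected=1 ;;
      --eval)   evals=1; selected=1 ;;
      *)        ;;  # assumed: anything else is passed to the test runner
    esac
  done
  # With no suite flags, the default run is unit + E2E.
  if [ "$selected" -eq 0 ]; then unit=1; e2e=1; fi
  out=""
  if [ "$unit" -eq 1 ]; then out="$out unit"; fi
  if [ "$e2e" -eq 1 ]; then out="$out e2e"; fi
  if [ "$visual" -eq 1 ]; then out="$out visual"; fi
  if [ "$update" -eq 1 ]; then out="$out update"; fi
  if [ "$live" -eq 1 ]; then out="$out live"; fi
  if [ "$evals" -eq 1 ]; then out="$out eval"; fi
  echo "${out# }"
)

resolve_suites --update   # prints "e2e visual update"
```

Under this model, live and eval appear in the output only when their flags are passed explicitly, which matches the default-run behavior described in the overview.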

Subcommands

sunpeak test init

Scaffold test infrastructure for an existing MCP server (not built with sunpeak):
sunpeak test init
This generates:
  • tests/e2e/ with example Playwright specs and config
  • tests/simulations/ with example simulation JSON fixtures
  • tests/evals/ with eval config, .env.example, and example eval specs
  • tests/live/ with live test config and example specs
For sunpeak framework projects, sunpeak new scaffolds all of this automatically.
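
To confirm that the scaffold produced the directories listed above, a small check such as the following could be run from the project root. The directory names come from the list above; the helper name missing_test_dirs is illustrative and is not part of sunpeak.

```shell
# Print each expected test directory that is missing under a project root.
# Directory names are taken from the documented output of `sunpeak test init`;
# the function itself is a hypothetical helper.
missing_test_dirs() {
  root="${1:-.}"
  for dir in tests/e2e tests/simulations tests/evals tests/live; do
    if [ ! -d "$root/$dir" ]; then
      echo "$dir"
    fi
  done
}

# Usage: run from the project root after `sunpeak test init`.
# No output means all four directories exist.
missing_test_dirs .
```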

Examples

# Default: unit + e2e
sunpeak test

# Unit tests only
sunpeak test --unit

# E2E tests only
sunpeak test --e2e

# Visual regression (compare against baselines)
sunpeak test --visual

# Update visual baselines
sunpeak test --visual --update

# Live tests against real ChatGPT
sunpeak test --live

# Evals against multiple LLM models
sunpeak test --eval

# Run everything
sunpeak test --unit --e2e --live --eval

See Also

  • Unit Testing: Fast component and hook tests with Vitest.
  • E2E Testing: Write Playwright tests against simulated hosts.
  • Visual Regression: Screenshot comparison and baseline management.
  • Live Testing: Test against real ChatGPT.
  • Evals: Multi-model tool calling tests.