sunpeak test

Overview

The sunpeak test command runs automated tests for your MCP server. It supports E2E tests against the inspector, visual regression, live tests against real hosts, and multi-model evals.

sunpeak test

With no flags, `sunpeak test` runs unit tests (Vitest, if configured) and E2E tests (Playwright). In standalone testing framework projects, only E2E tests run because unit tests aren't scaffolded. In sunpeak app framework projects, both unit and E2E tests run. Live tests and evals are never part of the default run because they require API keys and incur usage costs.
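Because live tests and evals are excluded from the default run, a CI script has to opt into them explicitly. The sketch below shows one way to do that: it always runs the default suites and adds `--eval` only when an API key is present. The function name `run_ci_tests` and the `OPENAI_API_KEY` variable are assumptions made for illustration; substitute whichever key names your eval config actually reads.

```shell
# Hypothetical CI helper. run_ci_tests and OPENAI_API_KEY are illustrative
# names, not part of sunpeak.
run_ci_tests() {
  # Default run: unit tests (if configured) plus E2E tests. No API keys needed.
  sunpeak test || return 1

  # Evals need credentials and cost money, so opt in only when a key is set.
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    sunpeak test --eval || return 1
  else
    echo "Skipping evals: OPENAI_API_KEY is not set"
  fi
}
```

Calling `run_ci_tests` from a pipeline step keeps the free suites mandatory while the paid suite stays conditional on the environment.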

Options

Flag	Description
`--e2e`	Run E2E tests only (Playwright + inspector)
`--visual`	Run E2E tests with visual regression comparison
`--visual --update`	Update visual regression baselines
`--live`	Run live tests against real hosts (ChatGPT)
`--eval`	Run evals against multiple LLM models
`--unit`	Run unit tests (sunpeak app framework only, Vitest + happy-dom)

Flags are additive: `--e2e --live --eval` runs all three suites. `--visual` implies `--e2e`, and `--update` implies `--visual`.

Extra arguments are passed through to the underlying test runner (Playwright or Vitest):

sunpeak test --e2e --ui                    # Playwright UI mode
sunpeak test --e2e tests/e2e/albums.spec.ts  # Single file
sunpeak test --eval albums                 # Filter evals by name
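The flag rules can be modeled in a few lines of shell. The sketch below is only a model of the behavior described on this page, not sunpeak's implementation: `resolve_suites` is a hypothetical helper, it treats the default run as unit plus E2E without checking whether unit tests are configured, and it ignores pass-through arguments such as `--ui` or file paths.

```shell
# Illustrative model of the documented flag rules (hypothetical helper).
# The function body is a subshell so its variables do not leak to the caller.
resolve_suites() (
  unit=0; e2e=0; visual=0; update=0; live=0; evals=0
  for arg in "$@"; do
    case "$arg" in
      --unit)   unit=1 ;;
      --e2e)    e2e=1 ;;
      --visual) visual=1 ;;
      --update) update=1 ;;
      --live)   live=1 ;;
      --eval)   evals=1 ;;
      *)        ;;  # anything else would be passed through to the runner
    esac
  done

  # --update implies --visual, and --visual implies --e2e.
  if [ "$update" -eq 1 ]; then visual=1; fi
  if [ "$visual" -eq 1 ]; then e2e=1; fi

  # With no suite selected, fall back to the default run: unit + E2E.
  if [ "$unit" -eq 0 ] && [ "$e2e" -eq 0 ] && [ "$live" -eq 0 ] && [ "$evals" -eq 0 ]; then
    unit=1; e2e=1
  fi

  out=""
  if [ "$unit" -eq 1 ];   then out="$out unit"; fi
  if [ "$e2e" -eq 1 ];    then out="$out e2e"; fi
  if [ "$visual" -eq 1 ]; then out="$out visual"; fi
  if [ "$update" -eq 1 ]; then out="$out update"; fi
  if [ "$live" -eq 1 ];   then out="$out live"; fi
  if [ "$evals" -eq 1 ];  then out="$out eval"; fi
  echo "${out# }"
)
```

In this model, `resolve_suites --update` prints `e2e visual update`, and `resolve_suites --e2e --live --eval` prints `e2e live eval`.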

Subcommands

`sunpeak test init`

Scaffold test infrastructure for an existing MCP server (not built with sunpeak):

sunpeak test init

This generates:

tests/e2e/ with example Playwright specs and config
tests/simulations/ with example simulation JSON fixtures
tests/evals/ with eval config, .env.example, and example eval specs
tests/live/ with live test config and example specs

For sunpeak framework projects, `sunpeak new` scaffolds all of this automatically.
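After scaffolding, a quick check that the four directories exist can catch a run from the wrong working directory. The helper below is a minimal sketch: `check_scaffold` is a hypothetical name, and it verifies only the top-level directories listed above, not the files inside them.

```shell
# Hypothetical helper: reports which scaffolded test directories are missing
# under the given project root (defaults to the current directory).
check_scaffold() {
  root="${1:-.}"
  missing=0
  for dir in tests/e2e tests/simulations tests/evals tests/live; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $dir"
      missing=1
    fi
  done
  # Exit status is 0 when every directory exists, 1 otherwise.
  return "$missing"
}
```

Running `check_scaffold .` from the project root right after `sunpeak test init` prints nothing and returns success when every directory is in place.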

Examples

# Default: unit (if configured) + e2e
sunpeak test

# E2E tests only
sunpeak test --e2e

# Visual regression (compare against baselines)
sunpeak test --visual

# Update visual baselines
sunpeak test --visual --update

# Live tests against real ChatGPT
sunpeak test --live

# Evals against multiple LLM models
sunpeak test --eval

# Run E2E tests, live tests, and evals together
sunpeak test --e2e --live --eval

See Also

E2E Testing

Write Playwright tests against simulated hosts.

Visual Regression

Screenshot comparison and baseline management.

Live Testing

Test against real ChatGPT.

Evals

Multi-model tool calling tests.
