Overview

The sunpeak test command runs automated tests for your MCP server. It supports unit tests (sunpeak app framework projects only), E2E tests against the inspector, visual regression tests, live tests against real hosts, and multi-model evals.
sunpeak test
With no flags, sunpeak test runs unit tests (Vitest, if configured) and E2E tests (Playwright). For standalone testing framework projects, only E2E tests run because unit tests aren’t scaffolded. For sunpeak app framework projects, both unit and E2E tests run by default. Live tests and evals are never included in the default run because they require API keys and cost money.
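Because live tests and evals are opt-in, a CI job might always run the default suite and add the paid suites only when explicitly enabled. The following is a minimal sketch of that idea, not part of sunpeak: it only prints the commands such a job would run, and RUN_PAID_TESTS is a hypothetical variable invented for this example that sunpeak itself does not read.

```shell
# Sketch: print the sunpeak commands a CI job would run.
# RUN_PAID_TESTS is a hypothetical opt-in switch for this example only.
ci_test_commands() {
  # Default run: unit tests (if configured) and E2E tests.
  echo "sunpeak test"

  # Live tests and evals need API keys and cost money, so they are opt-in.
  if [ "${RUN_PAID_TESTS:-0}" = "1" ]; then
    echo "sunpeak test --live --eval"
  fi
}
```

With RUN_PAID_TESTS unset, ci_test_commands prints only sunpeak test. With RUN_PAID_TESTS=1, it also prints sunpeak test --live --eval. Piping the output to sh -e would run the commands in order and stop at the first failure.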

Options

Flag               Description
--e2e              Run E2E tests only (Playwright + inspector)
--visual           Run E2E tests with visual regression comparison
--visual --update  Update visual regression baselines
--live             Run live tests against real hosts (ChatGPT)
--eval             Run evals against multiple LLM models
--unit             Run unit tests (sunpeak app framework only, Vitest + happy-dom)
Flags are additive: --e2e --live --eval runs all three suites. --visual implies --e2e, and --update implies --visual.

Extra arguments are passed through to the underlying test runner (Playwright or Vitest):
sunpeak test --e2e --ui                       # Playwright UI mode
sunpeak test --e2e tests/e2e/albums.spec.ts   # Single file
sunpeak test --eval albums                    # Filter evals by name
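The flag rules above (additive flags, --visual implying --e2e, --update implying --visual) can be modeled as a small function. This is an illustrative sketch of the documented behavior, not sunpeak's actual implementation. The function name resolve_suites and the suite labels it prints are made up for this example, and it assumes that arguments it does not recognize are simply ignored (the real CLI forwards them to the test runner).

```shell
# Sketch: model which suites a given set of flags selects.
# resolve_suites and its output labels are hypothetical, for illustration only.
resolve_suites() {
  unit=0; e2e=0; visual=0; update=0; live=0; evals=0

  for flag in "$@"; do
    case "$flag" in
      --unit)   unit=1 ;;
      --e2e)    e2e=1 ;;
      --visual) visual=1 ;;
      --update) update=1 ;;
      --live)   live=1 ;;
      --eval)   evals=1 ;;
      *)        ;;  # Assumption: other arguments go to the runner; ignored here.
    esac
  done

  # Implications: --update implies --visual, and --visual implies --e2e.
  if [ "$update" = 1 ]; then visual=1; fi
  if [ "$visual" = 1 ]; then e2e=1; fi

  out=""
  if [ "$unit" = 1 ];   then out="$out unit"; fi
  if [ "$e2e" = 1 ];    then out="$out e2e"; fi
  if [ "$visual" = 1 ]; then out="$out visual"; fi
  if [ "$update" = 1 ]; then out="$out update-baselines"; fi
  if [ "$live" = 1 ];   then out="$out live"; fi
  if [ "$evals" = 1 ];  then out="$out eval"; fi

  # No suite flags: the default run (unit tests only if configured, plus E2E).
  if [ -z "$out" ]; then out=" unit e2e"; fi

  echo "${out# }"
}
```

For example, resolve_suites --update prints e2e visual update-baselines, and resolve_suites --e2e --live --eval prints e2e live eval.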

Subcommands

sunpeak test init

Scaffold test infrastructure for an existing MCP server (not built with sunpeak):
sunpeak test init
This generates:
  • tests/e2e/ with example Playwright specs and config
  • tests/simulations/ with example simulation JSON fixtures
  • tests/evals/ with eval config, .env.example, and example eval specs
  • tests/live/ with live test config and example specs
For sunpeak framework projects, sunpeak new scaffolds all of this automatically.
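After scaffolding, it can be useful to confirm that the four directories listed above exist. The helper below is a hypothetical sketch (missing_test_dirs is not a sunpeak command): it takes a project root and prints any of the expected directories that are missing.

```shell
# Sketch: list expected test directories that are missing under a project root.
# missing_test_dirs is a hypothetical helper, not part of the sunpeak CLI.
missing_test_dirs() {
  root="${1:-.}"  # Project root; defaults to the current directory.
  for dir in tests/e2e tests/simulations tests/evals tests/live; do
    if [ ! -d "$root/$dir" ]; then
      echo "$dir"
    fi
  done
}
```

Running missing_test_dirs in a freshly scaffolded project should print nothing. Any line it prints names a directory that was not created or has since been removed.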

Examples

# Default: unit (if configured) + e2e
sunpeak test

# E2E tests only
sunpeak test --e2e

# Visual regression (compare against baselines)
sunpeak test --visual

# Update visual baselines
sunpeak test --visual --update

# Live tests against real ChatGPT
sunpeak test --live

# Evals against multiple LLM models
sunpeak test --eval

# Run E2E tests, live tests, and evals together
sunpeak test --e2e --live --eval

See Also

  • E2E Testing: Write Playwright tests against simulated hosts.
  • Visual Regression: Screenshot comparison and baseline management.
  • Live Testing: Test against real ChatGPT.
  • Evals: Multi-model tool calling tests.