How to Unit Test MCP Apps, ChatGPT Apps, and Claude Connectors (April 2026)
Unit testing MCP App resource components and tool handlers with Vitest.
MCP Apps, ChatGPT Apps, and Claude Connectors have a layered architecture that makes them awkward to test by hand. Your resource component renders inside a sandboxed iframe, receives data through the MCP protocol, and needs to work across two different host runtimes with different themes and display modes. Unit tests strip all of that away and let you test your actual code: does the component render the right thing for a given data shape? Does the tool handler return the right output for a given input?
TL;DR: Mock sunpeak hooks with vi.mock('sunpeak'), render resource components with @testing-library/react, and call tool handlers directly. Run with pnpm test:unit. No browser, no paid accounts, no inspector needed. Unit tests cover component rendering, tool handler logic, and state transitions (loading, error, cancelled, success) in milliseconds.
What to Unit Test in an MCP App
An MCP App has two main pieces of code you write: the tool handler (server-side) and the resource component (client-side). Unit tests target each one separately.
For resource components, test that:
- The component renders the expected output when useToolData returns valid data
- Loading state shows a spinner or skeleton (when isLoading is true)
- Error state shows an error message (when isError is true)
- Cancelled state shows a neutral message (when isCancelled is true)
- Edge cases like empty arrays, missing optional fields, and long strings render correctly
- Different display modes (inline, pip, fullscreen) change layout when your component adapts to them
For tool handlers, test that:
- Valid input returns the expected structuredContent
- Invalid or missing input is handled gracefully
- External API responses are processed correctly
- Error cases return appropriate error content
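To make the "invalid or missing input" and "error content" cases concrete, here is a minimal sketch of input validation that a handler could run before calling any external API. The names SearchInput, ToolErrorResult, parseSearchInput, and errorResult are hypothetical, and the result shape is only the small subset of an MCP tool result (text content plus the isError flag) that the example needs.

```typescript
// Hypothetical input shape: a search tool that takes a query and an optional language.
interface SearchInput {
  query: string;
  language?: string;
}

// Minimal subset of an MCP tool result: text content plus the isError flag.
interface ToolErrorResult {
  content: { type: 'text'; text: string }[];
  isError: true;
}

type ParseResult =
  | { ok: true; value: SearchInput }
  | { ok: false; error: ToolErrorResult };

// Build an error result with a single text message.
function errorResult(message: string): ToolErrorResult {
  return { content: [{ type: 'text', text: message }], isError: true };
}

// Validate untrusted arguments and return either the typed input or error content.
function parseSearchInput(raw: unknown): ParseResult {
  if (typeof raw !== 'object' || raw === null) {
    return { ok: false, error: errorResult('Input must be an object.') };
  }
  const { query, language } = raw as Record<string, unknown>;
  if (typeof query !== 'string' || query.trim() === '') {
    return { ok: false, error: errorResult('"query" must be a non-empty string.') };
  }
  if (language !== undefined && typeof language !== 'string') {
    return { ok: false, error: errorResult('"language" must be a string when provided.') };
  }
  return { ok: true, value: language === undefined ? { query } : { query, language } };
}
```

Because the function is synchronous and has no dependencies, a unit test can call it with a handful of bad inputs (null, a missing query, a numeric language) and assert that each one comes back with ok set to false and isError set to true.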
Skip testing things the framework handles for you: MCP protocol serialization, iframe sandboxing, host communication, and postMessage wiring. Those are sunpeak’s job, not yours.
Running Unit Tests
sunpeak projects come with Vitest configured. Unit tests live in tests/unit/ and run with:
pnpm test:unit
This runs Vitest with happy-dom as the DOM environment, which means your React components render in a simulated browser without launching a real one. Tests execute in milliseconds.
For development, watch mode re-runs tests as you edit:
pnpm test:unit -- --watch
To run a single test file:
pnpm test:unit tests/unit/dashboard.test.tsx
Unit Testing Resource Components
Resource components are React components that read data from sunpeak hooks like useToolData, useAppState, useDisplayMode, and useHostInfo. To test them in isolation, you mock those hooks and render the component with @testing-library/react.
Here’s a resource component that displays a list of pull requests:
// src/resources/pr-list/pr-list.tsx
import { useToolData, SafeArea } from 'sunpeak';

interface PullRequest {
  id: number;
  title: string;
  author: string;
  status: 'open' | 'merged' | 'closed';
}

interface PrListOutput {
  repo: string;
  pullRequests: PullRequest[];
}

export function PrListResource() {
  const { output, isError, isLoading, isCancelled } = useToolData<unknown, PrListOutput>(
    undefined,
    undefined,
  );

  // State flags are checked before output so each state gets its own message.
  if (isLoading) return <SafeArea><p>Loading pull requests...</p></SafeArea>;
  if (isError) return <SafeArea><p>Failed to load pull requests.</p></SafeArea>;
  if (isCancelled) return <SafeArea><p>Request stopped.</p></SafeArea>;
  if (!output) return null;

  return (
    <SafeArea>
      <h2>{output.repo}</h2>
      <ul>
        {output.pullRequests.map((pr) => (
          <li key={pr.id}>
            <span className={`status-${pr.status}`}>{pr.status}</span>
            {' '}{pr.title} by {pr.author}
          </li>
        ))}
      </ul>
    </SafeArea>
  );
}
And here’s the unit test file:
// tests/unit/pr-list.test.tsx
import { render, screen } from '@testing-library/react';
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { PrListResource } from '../../src/resources/pr-list/pr-list';

// Module-level variable so each test can change what useToolData returns.
let mockToolData: Record<string, unknown> = {};

vi.mock('sunpeak', () => ({
  useToolData: () => mockToolData,
  useAppState: () => [{}, vi.fn()],
  useDisplayMode: () => 'inline',
  useHostInfo: () => ({ hostVersion: undefined, hostCapabilities: { serverTools: true } }),
  SafeArea: ({ children }: { children: React.ReactNode }) => <div>{children}</div>,
}));

describe('PrListResource', () => {
  beforeEach(() => {
    // Reset to the idle state before every test.
    mockToolData = {
      output: null,
      input: null,
      inputPartial: null,
      isError: false,
      isLoading: false,
      isCancelled: false,
      cancelReason: null,
    };
  });

  it('renders pull request list', () => {
    mockToolData = {
      ...mockToolData,
      output: {
        repo: 'acme/widgets',
        pullRequests: [
          { id: 1, title: 'Add search', author: 'alice', status: 'open' },
          { id: 2, title: 'Fix pagination', author: 'bob', status: 'merged' },
        ],
      },
    };
    render(<PrListResource />);
    expect(screen.getByText('acme/widgets')).toBeDefined();
    expect(screen.getByText(/Add search/)).toBeDefined();
    expect(screen.getByText(/Fix pagination/)).toBeDefined();
    expect(screen.getByText(/alice/)).toBeDefined();
  });

  it('shows loading state', () => {
    mockToolData = { ...mockToolData, isLoading: true };
    render(<PrListResource />);
    expect(screen.getByText('Loading pull requests...')).toBeDefined();
  });

  it('shows error state', () => {
    mockToolData = { ...mockToolData, isError: true };
    render(<PrListResource />);
    expect(screen.getByText('Failed to load pull requests.')).toBeDefined();
  });

  it('shows cancelled state', () => {
    mockToolData = { ...mockToolData, isCancelled: true };
    render(<PrListResource />);
    expect(screen.getByText('Request stopped.')).toBeDefined();
  });

  it('renders empty list', () => {
    mockToolData = {
      ...mockToolData,
      output: { repo: 'acme/widgets', pullRequests: [] },
    };
    render(<PrListResource />);
    expect(screen.getByText('acme/widgets')).toBeDefined();
    expect(screen.queryByRole('listitem')).toBeNull();
  });
});
The pattern is consistent across every resource component you write:
- Mock sunpeak at the module level with vi.mock()
- Set mockToolData before each test to control what useToolData returns
- Render the component with render(<YourResource />)
- Assert against the DOM with screen queries
Mocking sunpeak Hooks
The vi.mock('sunpeak') block replaces every export from the sunpeak package, so the factory has to provide a replacement for everything your component imports: each hook it calls and any components it renders, such as SafeArea. Here are the hooks you’ll mock most often:
useToolData returns the tool’s output, input, and state flags. This is the primary data source for resource components.
useToolData: () => ({
  output: null,        // structuredContent from the tool result
  input: null,         // complete tool call arguments
  inputPartial: null,  // streaming partial arguments
  isError: false,      // true when the tool failed
  isLoading: false,    // true until tool-result or tool-cancelled arrives
  isCancelled: false,  // true when the user stopped the model
  cancelReason: null,  // reason string from tool-cancelled
})
useAppState returns a state tuple, similar to useState. Use this when your component has interactive UI that syncs state back to the host.
useAppState: () => [currentState, setStateFn]
useDisplayMode returns the current display mode string: 'inline', 'pip', or 'fullscreen'.
useDisplayMode: () => 'inline'
useHostInfo returns host metadata, including capabilities.
useHostInfo: () => ({
  hostVersion: undefined,
  hostCapabilities: { serverTools: true },
})
SafeArea is a layout component that adds padding for host-specific chrome. In tests, replace it with a passthrough div.
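The useToolData defaults above tend to get copied into every test file. One way to avoid that, sketched here under assumptions, is a small factory that starts from the idle defaults and overrides only what a test cares about. MockToolData and makeToolData are hypothetical names written for the test suite; they are not imported from sunpeak, and the field types are inferred from the comments in the mock above.

```typescript
// Shape of the object the useToolData mock returns in this article.
// Declared locally for tests; not a sunpeak export.
interface MockToolData<TInput, TOutput> {
  output: TOutput | null;
  input: TInput | null;
  inputPartial: Partial<TInput> | null;
  isError: boolean;
  isLoading: boolean;
  isCancelled: boolean;
  cancelReason: string | null;
}

// Hypothetical helper: idle defaults first, then per-test overrides.
function makeToolData<TInput = unknown, TOutput = unknown>(
  overrides: Partial<MockToolData<TInput, TOutput>> = {},
): MockToolData<TInput, TOutput> {
  return {
    output: null,
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: false,
    isCancelled: false,
    cancelReason: null,
    ...overrides,
  };
}

// Example states a test might assign to the variable behind `useToolData: () => mockToolData`.
const loading = makeToolData({ isLoading: true });
const cancelled = makeToolData({ isCancelled: true, cancelReason: 'user stopped' });
const success = makeToolData<unknown, { repo: string }>({ output: { repo: 'acme/widgets' } });
```

With a helper like this, a test body shrinks to one assignment (for example mockToolData = makeToolData({ isError: true })) followed by render and the assertion.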
Unit Testing Tool Handlers
Tool handlers are server-side functions that receive input, process it, and return structured content. They don’t depend on React or browser APIs, which makes them straightforward to test.
Here’s a tool handler:
// src/tools/search-repos/handler.ts
import { searchGitHub } from '../../lib/github';

interface SearchInput {
  query: string;
  language?: string;
}

export async function handler(input: SearchInput) {
  const repos = await searchGitHub(input.query, input.language);

  return {
    // Plain-text summary for the model.
    content: [{ type: 'text' as const, text: `Found ${repos.length} repositories` }],
    // Structured data consumed by the resource component.
    structuredContent: {
      query: input.query,
      results: repos.map((r) => ({
        name: r.full_name,
        stars: r.stargazers_count,
        description: r.description,
      })),
    },
  };
}
And the unit test:
// tests/unit/search-repos.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { handler } from '../../src/tools/search-repos/handler';

// vi.mock paths resolve relative to this test file, so point at src/lib/github,
// the same module the handler imports as '../../lib/github'.
vi.mock('../../src/lib/github', () => ({
  searchGitHub: vi.fn(),
}));

import { searchGitHub } from '../../src/lib/github';

const mockSearchGitHub = vi.mocked(searchGitHub);

describe('search-repos handler', () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });

  it('returns formatted results', async () => {
    mockSearchGitHub.mockResolvedValue([
      { full_name: 'facebook/react', stargazers_count: 220000, description: 'A JS library for building UIs' },
      { full_name: 'vuejs/vue', stargazers_count: 207000, description: 'Progressive JS framework' },
    ]);
    const result = await handler({ query: 'frontend framework' });
    expect(result.structuredContent.results).toHaveLength(2);
    expect(result.structuredContent.results[0].name).toBe('facebook/react');
    expect(result.structuredContent.query).toBe('frontend framework');
    expect(result.content[0].text).toBe('Found 2 repositories');
  });

  it('passes language filter to API', async () => {
    mockSearchGitHub.mockResolvedValue([]);
    await handler({ query: 'orm', language: 'rust' });
    expect(mockSearchGitHub).toHaveBeenCalledWith('orm', 'rust');
  });

  it('handles empty results', async () => {
    mockSearchGitHub.mockResolvedValue([]);
    const result = await handler({ query: 'nonexistent-thing' });
    expect(result.structuredContent.results).toHaveLength(0);
    expect(result.content[0].text).toBe('Found 0 repositories');
  });

  it('propagates API errors', async () => {
    mockSearchGitHub.mockRejectedValue(new Error('rate limited'));
    await expect(handler({ query: 'test' })).rejects.toThrow('rate limited');
  });
});
Tool handler tests are plain async function tests. Mock external dependencies with vi.mock(), call the handler with test input, and assert on the returned structuredContent and content. The structuredContent shape is especially important to test because your resource component depends on it. If the handler renames a field and the component still reads the old name, an integration test would catch it, but unit tests catch it faster as long as both sides are checked against the same output type or fixture. A component test that mocks the old shape will keep passing on its own.
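For unit tests on both sides to notice a renamed field, they need to agree on the shape of structuredContent. One possible way to do that is a shared runtime guard: the handler test runs it against the real return value, and the component test runs it against its fixture. This is a sketch under assumptions; SearchOutput, SearchResultItem, and isSearchOutput are hypothetical names, and the field types are guessed from the handler above.

```typescript
// Hypothetical shared contract for the search tool's structuredContent.
interface SearchResultItem {
  name: string;
  stars: number;
  description: string | null;
}

interface SearchOutput {
  query: string;
  results: SearchResultItem[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null;
}

function isSearchResultItem(value: unknown): value is SearchResultItem {
  return (
    isRecord(value) &&
    typeof value.name === 'string' &&
    typeof value.stars === 'number' &&
    (typeof value.description === 'string' || value.description === null)
  );
}

// True only when every field listed in the contract is present and well-typed.
function isSearchOutput(value: unknown): value is SearchOutput {
  return (
    isRecord(value) &&
    typeof value.query === 'string' &&
    Array.isArray(value.results) &&
    value.results.every(isSearchResultItem)
  );
}
```

In the handler test this becomes an assertion such as expect(isSearchOutput(result.structuredContent)).toBe(true), and the component test can run the same check on the object it passes through the useToolData mock. If either side drifts from the contract, its own unit test fails.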
Testing Display Mode Behavior
If your resource component adapts its layout for different display modes, you need to test each one. Change the useDisplayMode mock return value between tests:
let mockDisplayMode = 'inline';

vi.mock('sunpeak', () => ({
  useToolData: () => mockToolData,
  useDisplayMode: () => mockDisplayMode,
  useHostInfo: () => ({ hostVersion: undefined, hostCapabilities: { serverTools: true } }),
  SafeArea: ({ children }: { children: React.ReactNode }) => <div>{children}</div>,
}));

it('shows compact layout in inline mode', () => {
  mockDisplayMode = 'inline';
  render(<DashboardResource />);
  expect(screen.queryByTestId('sidebar')).toBeNull();
});

it('shows full layout in fullscreen mode', () => {
  mockDisplayMode = 'fullscreen';
  render(<DashboardResource />);
  expect(screen.getByTestId('sidebar')).toBeDefined();
});
The same approach works for testing host-specific CSS variables or theme differences. Mock useHostInfo to return different host metadata and verify that your component applies the right classes or styles.
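If the display-mode logic grows beyond one or two conditionals, it can help to pull the mode-to-layout decision into a pure function and test that exhaustively, leaving the component tests to confirm the wiring. The sketch below assumes a hypothetical layoutFor helper and Layout shape; the specific values are made up for illustration.

```typescript
// The three display modes named in this article.
type DisplayMode = 'inline' | 'pip' | 'fullscreen';

// Hypothetical layout description a dashboard component could derive from the mode.
interface Layout {
  showSidebar: boolean;
  maxItems: number;
}

// Record<DisplayMode, Layout> makes the compiler flag any mode left without a layout.
const LAYOUTS: Record<DisplayMode, Layout> = {
  inline: { showSidebar: false, maxItems: 3 },
  pip: { showSidebar: false, maxItems: 5 },
  fullscreen: { showSidebar: true, maxItems: 20 },
};

// Pure mapping from display mode to layout; no hooks, so it needs no mocks to test.
function layoutFor(mode: DisplayMode): Layout {
  return LAYOUTS[mode];
}

// Table of expectations a test can loop over (for example with Vitest's it.each).
const layoutCases: [DisplayMode, boolean][] = [
  ['inline', false],
  ['pip', false],
  ['fullscreen', true],
];
```

The component would then call layoutFor(useDisplayMode()) and render from the result, so the mocked-hook tests above only need one case per branch that changes the DOM.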
Testing useAppState Interactions
Interactive MCP Apps use useAppState to sync state between the resource component and the host. Test that user interactions call the state setter correctly:
import { render, screen, fireEvent } from '@testing-library/react';

const mockSetState = vi.fn();
let mockAppState: Record<string, unknown> = {};

vi.mock('sunpeak', () => ({
  useToolData: () => mockToolData,
  useAppState: () => [mockAppState, mockSetState],
  useDisplayMode: () => 'inline',
  useHostInfo: () => ({ hostVersion: undefined, hostCapabilities: { serverTools: true } }),
  SafeArea: ({ children }: { children: React.ReactNode }) => <div>{children}</div>,
}));

it('updates state when user selects a tab', () => {
  mockAppState = { activeTab: 'overview' };
  render(<DashboardResource />);
  fireEvent.click(screen.getByText('Details'));
  expect(mockSetState).toHaveBeenCalledWith({ activeTab: 'details' });
});

it('renders the active tab content', () => {
  mockAppState = { activeTab: 'details' };
  render(<DashboardResource />);
  expect(screen.getByText('Detail view content')).toBeDefined();
});
Test both directions: that user actions trigger setState with the right value, and that the component renders correctly for a given state.
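When a test needs both directions at once (click, then check what the state became), a small stateful stand-in can replace the separate variable and spy. This is only a sketch: createAppStateMock is a hypothetical helper, and it assumes the setter is called with a plain value, as in the test above, rather than an updater function.

```typescript
// Hypothetical test helper: a stand-in for useAppState that stores state
// and records every value passed to the setter.
function createAppStateMock<S>(initial: S) {
  let state = initial;
  const calls: S[] = [];

  const setState = (next: S): void => {
    calls.push(next);
    state = next;
  };

  return {
    // Drop-in for the mocked hook: returns the current [state, setter] tuple.
    useAppState: (): [S, (next: S) => void] => [state, setState],
    // Every value passed to the setter, in call order.
    calls,
    // Restore the initial state and clear the call log between tests.
    reset: (): void => {
      state = initial;
      calls.length = 0;
    },
  };
}

// Usage: simulate what a tab click would do, then read the state back.
const tabs = createAppStateMock({ activeTab: 'overview' });
const [, setTabs] = tabs.useAppState();
setTabs({ activeTab: 'details' });
```

In a test file, the factory passed to vi.mock would return useAppState: () => tabs.useAppState(), and a beforeEach would call tabs.reset(). The mock does not trigger a React re-render by itself, so a test that checks the updated UI still needs to re-render the component after the click.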
Organizing Unit Test Files
sunpeak auto-discovers tests from the tests/ directory. A typical structure looks like:
tests/
  unit/
    pr-list.test.tsx       # resource component tests
    search-repos.test.ts   # tool handler tests
    format-date.test.ts    # utility function tests
  e2e/
    pr-list.spec.ts        # e2e tests with inspector fixture
  simulations/
    pr-list-success.json   # simulation files for e2e tests
    pr-list-empty.json
Keep unit tests in tests/unit/ and e2e tests in tests/e2e/. This separation lets you run them independently: pnpm test:unit for fast feedback during development, pnpm test:e2e for host-level confidence before merging.
Name test files to match the resource or tool they test. If src/resources/pr-list/pr-list.tsx is your component, name the test tests/unit/pr-list.test.tsx. For tool handlers in src/tools/search-repos/handler.ts, name the test tests/unit/search-repos.test.ts.
When Unit Tests Are Enough (and When They’re Not)
Unit tests are the fastest feedback loop for MCP App development. They run in milliseconds, don’t need a browser, and catch most logic and rendering bugs. But they have blind spots.
Unit tests catch:
- Rendering bugs in resource components (wrong text, missing elements, broken conditional logic)
- Tool handler logic errors (wrong calculations, missing fields, bad error handling)
- State management bugs (wrong state transitions, missing state handlers)
- Regressions when you change component or handler code
Unit tests miss:
- MCP protocol issues (your handler returns data but the protocol layer serializes it wrong)
- Host rendering differences (your component works in happy-dom but breaks in the real ChatGPT iframe)
- Cross-host CSS bugs (styles look right in isolation but conflict with host CSS variables)
- Display mode transitions (switching from inline to fullscreen triggers a bug your unit test doesn’t exercise)
For full confidence, combine unit tests with e2e tests (which render in a real browser against simulated hosts), integration tests (which exercise the MCP protocol layer), and visual regression tests (which catch CSS and layout bugs). Run them all with pnpm test or separately with pnpm test:unit, pnpm test:e2e, and pnpm test:visual.
Get Started
sunpeak projects come with Vitest configured and a tests/unit/ directory ready for your tests. If you’re starting a new MCP App, ChatGPT App, or Claude Connector:
npx sunpeak@latest create my-app
cd my-app
pnpm test:unit
The scaffolded project includes example unit tests you can use as a starting point. For existing projects, check the testing framework page or the complete testing guide for setup instructions.
Further Reading
- Complete guide to testing ChatGPT Apps and MCP Apps
- Mocking and stubbing in MCP App tests - simulation files and fixture patterns
- Snapshot testing MCP Apps - toMatchSnapshot and toMatchInlineSnapshot
- Integration testing MCP Apps - the mcp fixture for protocol-level tests
- MCP App error handling - loading, error, and cancelled states
- MCP App CI/CD - run your tests in GitHub Actions
- MCP App framework
- ChatGPT App framework
- Claude Connector framework
- Testing framework
Frequently Asked Questions
How do I unit test an MCP App resource component?
Mock the sunpeak hooks (useToolData, useAppState, useDisplayMode, useHostInfo) with vi.mock("sunpeak"), then render your component with @testing-library/react. Assert against the rendered output using screen queries like getByText, getByRole, and queryByText. Unit tests run with pnpm test:unit using Vitest and happy-dom, so no browser or paid account is needed.
What is the difference between unit tests and e2e tests for MCP Apps?
Unit tests import your resource component or tool handler directly, mock all dependencies, and run in happy-dom without a browser. They execute in milliseconds. E2e tests use the inspector fixture from sunpeak/test to render your full MCP App in a real browser against simulated ChatGPT and Claude runtimes. They are slower but catch rendering, iframe, and cross-host bugs that unit tests miss. Use both: unit tests for fast feedback on logic and structure, e2e tests for host-level confidence.
How do I mock useToolData in MCP App unit tests?
Call vi.mock("sunpeak", () => ({ useToolData: () => ({ output: yourMockData, input: null, inputPartial: null, isError: false, isLoading: false, isCancelled: false, cancelReason: null }) })) at the top of your test file. Use a module-level variable for the mock data so you can change it between tests to cover success, loading, error, and cancelled states.
How do I unit test an MCP App tool handler?
Import your tool handler function directly and call it with test arguments. Assert on the returned structuredContent, content array, and isError flag. Mock any external API calls with vi.mock() so tests stay fast and deterministic. Tool handler unit tests verify your server-side logic without running the MCP server or rendering any UI.
What should I unit test in an MCP App?
Unit test four things: resource components (do they render the right output for a given tool data shape?), tool handlers (does the server logic return correct structuredContent?), state transitions (does the component handle loading, error, cancelled, and success states?), and utility functions (input parsing, data formatting, validation). Skip unit testing things the framework handles, like MCP protocol serialization or iframe sandboxing.
Do I need a ChatGPT or Claude account to unit test MCP Apps?
No. Unit tests run entirely locally with Vitest and happy-dom. They do not connect to ChatGPT, Claude, or any external service. You mock the sunpeak hooks to provide controlled data and render components in a simulated DOM. Unit tests run the same way on your machine and in CI/CD with zero external dependencies.
How do I test MCP App error and loading states in unit tests?
Change your mock useToolData return value between tests. Set isLoading: true with output: null to test loading state. Set isError: true with output: null to test error state. Set isCancelled: true to test cancelled state. Each test verifies that your component renders the correct UI for that state, like a spinner for loading or an error message for errors.
How do I run only unit tests for my MCP App?
Run pnpm test:unit. This executes Vitest with happy-dom against your tests/unit/ directory. To run a specific test file, use pnpm test:unit tests/unit/dashboard.test.tsx. To run in watch mode during development, use pnpm test:unit -- --watch. Unit tests are separate from e2e tests (pnpm test:e2e), visual regression tests (pnpm test:visual), and live tests (pnpm test:live).