Snapshot Testing MCP Apps, ChatGPT Apps, and Claude Connectors

April 10, 2026 Abe Wheeler


Snapshot testing MCP App resource components and tool output.

MCP App resource components are React components. If you’ve used React, you’ve probably used snapshot testing: render a component, serialize the output, save it to a file, and fail the test if the output changes on the next run. The same pattern works for MCP Apps, ChatGPT Apps, and Claude Connectors, with a few MCP-specific twists.

TL;DR: Use toMatchSnapshot() on rendered resource components to catch structural regressions. Use toMatchInlineSnapshot() on tool handler structuredContent to catch data shape changes. Mock sunpeak hooks to snapshot different display modes, themes, and states. Run with pnpm test:unit and update with pnpm test:unit -- -u.

Why Snapshot Testing Works Well for MCP Apps

MCP App resource components receive data through useToolData() and render it as HTML inside an iframe. The rendered output depends on the tool data, display mode, theme, and host. That’s a lot of combinations, and writing manual assertions for every element in every state gets tedious fast.

Snapshot tests solve this by capturing the full rendered output once and alerting you when it changes. You don’t write expect(screen.getByText('...')).toBeInTheDocument() for every element. You write one toMatchSnapshot() call and Vitest tracks the whole thing.

This is especially useful for MCP Apps because:

Resource components often render complex data structures (tables, charts, nested lists) where manual assertions are verbose
Tool handler output shapes need to stay stable because the resource component depends on them
Display mode and theme combinations multiply your test surface, and snapshots cover each one cheaply
Host differences (ChatGPT vs Claude) that reach your component through hook values, such as host capabilities, can change the rendered markup in subtle ways that snapshots catch immediately

Snapshot Testing Resource Components

Start with a basic component snapshot. Mock the sunpeak hooks (see mocking and stubbing patterns for the full setup) and render your component:

import { render } from '@testing-library/react';
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { DashboardResource } from './dashboard';

let mockToolOutput: Record<string, unknown> = {};

vi.mock('sunpeak', () => ({
  useToolData: () => ({
    output: mockToolOutput,
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: false,
    isCancelled: false,
    cancelReason: null,
  }),
  useAppState: () => [{}, vi.fn()],
  useDisplayMode: () => 'inline',
  useHostInfo: () => ({ hostVersion: undefined, hostCapabilities: { serverTools: true } }),
  SafeArea: ({ children }: { children: React.ReactNode }) => <div>{children}</div>,
}));

describe('DashboardResource snapshots', () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });

  it('renders happy path', () => {
    mockToolOutput = {
      quarter: 'Q1',
      year: 2026,
      revenue: 142000,
      deals: 47,
      topProduct: 'Enterprise Plan',
    };
    const { container } = render(<DashboardResource />);
    expect(container).toMatchSnapshot();
  });

  it('renders empty state', () => {
    mockToolOutput = {
      quarter: 'Q4',
      year: 2027,
      revenue: 0,
      deals: 0,
      topProduct: null,
    };
    const { container } = render(<DashboardResource />);
    expect(container).toMatchSnapshot();
  });
});

The first time you run this, Vitest creates a .snap file in a __snapshots__ directory next to your test:

src/resources/dashboard/
  dashboard.tsx
  dashboard.test.tsx
  __snapshots__/
    dashboard.test.tsx.snap

The .snap file contains the serialized HTML for each test case. On future runs, Vitest compares the current output to the stored snapshot and fails with a diff if anything changed.
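To make the write-then-compare cycle concrete, here is a minimal sketch in plain TypeScript. It is not Vitest's implementation: the Map stands in for the .snap file, and the names matchSnapshot, SnapshotStore, and the update parameter (playing the role of -u) are made up for illustration.

```typescript
// Minimal sketch of snapshot matching. Illustrative only, not Vitest's code.
type SnapshotResult =
  | { status: 'written' }
  | { status: 'passed' }
  | { status: 'failed'; expected: string; received: string };

// Stands in for the .snap file: test name -> serialized output.
type SnapshotStore = Map<string, string>;

function matchSnapshot(
  store: SnapshotStore,
  testName: string,
  received: string,
  update = false, // plays the role of the -u flag
): SnapshotResult {
  const expected = store.get(testName);

  // First run, or an explicit update: record the output and move on.
  if (expected === undefined || update) {
    store.set(testName, received);
    return { status: 'written' };
  }

  // Later runs: any difference from the stored text is a failure.
  if (expected === received) {
    return { status: 'passed' };
  }
  return { status: 'failed', expected, received };
}

const snapStore: SnapshotStore = new Map();
const firstRun = matchSnapshot(snapStore, 'renders happy path', '<h2>Q1 2026</h2>');
const secondRun = matchSnapshot(snapStore, 'renders happy path', '<h2>Q1 2026</h2>');
const afterChange = matchSnapshot(snapStore, 'renders happy path', '<h2>Q2 2026</h2>');
console.log(firstRun.status, secondRun.status, afterChange.status);
// prints "written passed failed"
```

The important property is that the first run never fails. A snapshot only protects you from the second run onward, which is why the stored file has to be committed.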

What a Snapshot Diff Looks Like

When your component output changes, the test fails with a clear diff:

- Snapshot  - 1
+ Received  + 1

  <div>
    <h2>Q1 2026</h2>
-   <span class="revenue">$142,000</span>
+   <span class="revenue-amount">$142,000</span>
    <p>47 deals</p>
  </div>

This tells you the CSS class on the revenue element changed from revenue to revenue-amount. If that was intentional (you renamed the class), update the snapshot. If it was accidental (a stale merge, a typo), fix it.
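If you are curious how a diff like this can be produced, the sketch below computes one with a longest-common-subsequence pass over the lines. It is a simplified illustration, not the algorithm Vitest uses, and diffLines is a name chosen for this example.

```typescript
// Line diff via longest common subsequence (LCS). A sketch, not Vitest's diff.
function diffLines(expectedText: string, receivedText: string): string[] {
  const a = expectedText.split('\n');
  const b = receivedText.split('\n');

  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0),
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j] ? lcs[i + 1][j + 1] + 1 : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both inputs, emitting unchanged, removed (-), and added (+) lines.
  const out: string[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push('  ' + a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push('- ' + a[i]);
      i++;
    } else {
      out.push('+ ' + b[j]);
      j++;
    }
  }
  while (i < a.length) out.push('- ' + a[i++]);
  while (j < b.length) out.push('+ ' + b[j++]);
  return out;
}

const storedSnapshot = [
  '<h2>Q1 2026</h2>',
  '<span class="revenue">$142,000</span>',
  '<p>47 deals</p>',
].join('\n');
const currentOutput = [
  '<h2>Q1 2026</h2>',
  '<span class="revenue-amount">$142,000</span>',
  '<p>47 deals</p>',
].join('\n');

const diffOutput = diffLines(storedSnapshot, currentOutput);
console.log(diffOutput.join('\n'));
```

Run against the two versions of the dashboard markup, this reports one removed line and one added line, with the unchanged heading and paragraph left as context.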

Snapshot Testing Tool Handler Output

Resource components depend on the structuredContent shape your tool handler returns. If a field gets renamed or removed, the component breaks. Snapshot the tool handler output to catch these changes early:

import { describe, it, expect, vi } from 'vitest';
import handler from '../../src/tools/show-dashboard';

vi.mock('../../src/lib/api', () => ({
  getDashboardData: vi.fn().mockResolvedValue({
    revenue: 142000,
    deals: 47,
    topProduct: 'Enterprise Plan',
  }),
}));

describe('show-dashboard tool handler', () => {
  it('returns expected structuredContent shape', async () => {
    const result = await handler(
      { quarter: 'Q1', year: 2026 },
      {} as any
    );

    expect(result.structuredContent).toMatchInlineSnapshot(`
      {
        "deals": 47,
        "quarter": "Q1",
        "revenue": 142000,
        "topProduct": "Enterprise Plan",
        "year": 2026,
      }
    `);
  });
});

toMatchInlineSnapshot() writes the expected value right in your test file instead of a separate .snap file. This works well for tool handler output because the payloads are usually small enough to read inline, and reviewers can see the expected shape without opening another file.

If the handler adds a new field, removes one, or changes a type, the inline snapshot fails and shows the diff. This is a fast way to enforce the contract between your tool handler and your resource component.
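Notice that the keys in the inline snapshot above are listed alphabetically. Sorting keys during serialization is what keeps a snapshot stable when the handler builds the same object in a different order. The sketch below shows the idea in plain TypeScript; serialize and the Json type are illustrative names, and the output format only approximates what Vitest prints.

```typescript
// Sketch of order-independent serialization. Illustrative, not Vitest's serializer.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function serialize(value: Json, indent = ''): string {
  // Primitives and null are printed as JSON literals.
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  const inner = indent + '  ';
  if (Array.isArray(value)) {
    if (value.length === 0) return '[]';
    const items = value.map((item) => inner + serialize(item, inner) + ',');
    return '[\n' + items.join('\n') + '\n' + indent + ']';
  }
  // Sort keys so insertion order never affects the output.
  const keys = Object.keys(value).sort();
  if (keys.length === 0) return '{}';
  const entries = keys.map(
    (key) => inner + JSON.stringify(key) + ': ' + serialize(value[key], inner) + ',',
  );
  return '{\n' + entries.join('\n') + '\n' + indent + '}';
}

// Same data, different key order: the serialized text is identical.
const builtOneWay: Json = { year: 2026, quarter: 'Q1', revenue: 142000 };
const builtAnotherWay: Json = { revenue: 142000, quarter: 'Q1', year: 2026 };
console.log(serialize(builtOneWay) === serialize(builtAnotherWay)); // prints true

// A renamed field changes the text, which is what makes the snapshot fail.
const renamedField: Json = { year: 2026, quarter: 'Q1', totalRevenue: 142000 };
console.log(serialize(builtOneWay) === serialize(renamedField)); // prints false
```

This is why a snapshot on structuredContent reacts to contract changes (renamed, added, or removed fields) but not to harmless refactors that only reorder how the object is assembled.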

Snapshotting Display Mode Variations

MCP Apps render differently depending on the display mode. A component might show a compact table in inline mode and a full chart in fullscreen. Snapshot each variation:

import { render } from '@testing-library/react';
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { DashboardResource } from './dashboard';

let mockDisplayMode = 'inline';
let mockToolOutput: Record<string, unknown> = {};

vi.mock('sunpeak', () => ({
  useToolData: () => ({
    output: mockToolOutput,
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: false,
    isCancelled: false,
    cancelReason: null,
  }),
  useDisplayMode: () => mockDisplayMode,
  useAppState: () => [{}, vi.fn()],
  useHostInfo: () => ({ hostVersion: undefined, hostCapabilities: { serverTools: true } }),
  SafeArea: ({ children }: { children: React.ReactNode }) => <div>{children}</div>,
}));

describe('DashboardResource display mode snapshots', () => {
  beforeEach(() => {
    vi.clearAllMocks();
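    // Same fields as the happy-path fixture so the component gets a complete payload
    mockToolOutput = { quarter: 'Q1', year: 2026, revenue: 142000, deals: 47, topProduct: 'Enterprise Plan' };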
  });

  it('inline mode', () => {
    mockDisplayMode = 'inline';
    const { container } = render(<DashboardResource />);
    expect(container).toMatchSnapshot();
  });

  it('fullscreen mode', () => {
    mockDisplayMode = 'fullscreen';
    const { container } = render(<DashboardResource />);
    expect(container).toMatchSnapshot();
  });

  it('pip mode', () => {
    mockDisplayMode = 'pip';
    const { container } = render(<DashboardResource />);
    expect(container).toMatchSnapshot();
  });
});

This generates three snapshots. If someone changes the fullscreen layout and accidentally breaks the inline layout, the inline snapshot catches it.
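Once themes enter the picture, writing each combination by hand gets repetitive. One option is to generate the cases up front and loop over them. The sketch below builds the display mode and theme matrix in plain TypeScript; the theme names light and dark and the helper buildSnapshotCases are assumptions for illustration, not part of sunpeak.

```typescript
// Sketch: enumerate display mode x theme combinations, one snapshot case each.
const displayModes = ['inline', 'fullscreen', 'pip'] as const;
const themes = ['light', 'dark'] as const; // assumed theme names

type DisplayMode = (typeof displayModes)[number];
type Theme = (typeof themes)[number];

interface SnapshotCase {
  label: string; // becomes the test name, so each snapshot gets a distinct key
  displayMode: DisplayMode;
  theme: Theme;
}

function buildSnapshotCases(): SnapshotCase[] {
  const result: SnapshotCase[] = [];
  for (const displayMode of displayModes) {
    for (const theme of themes) {
      result.push({ label: displayMode + ' / ' + theme, displayMode, theme });
    }
  }
  return result;
}

const snapshotCases = buildSnapshotCases();
console.log(snapshotCases.map((c) => c.label).join(', '));
// prints "inline / light, inline / dark, fullscreen / light, fullscreen / dark, pip / light, pip / dark"
```

In a Vitest file you could pass an array like this to it.each, set the mock variables from each case, and call toMatchSnapshot() once in the shared test body. Each label produces its own snapshot entry, so a failure still points at one specific combination.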

Snapshotting Loading and Error States

Don’t forget to snapshot non-happy-path states. These are easy to miss in manual testing and are exactly the kind of thing that regresses silently:

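// vi.mocked(...).mockReturnValue only works if the sunpeak mock defines
// useToolData as vi.fn() instead of a plain arrow function.
import { useToolData } from 'sunpeak';

it('loading state', () => {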
  vi.mocked(useToolData).mockReturnValue({
    output: null,
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: true,
    isCancelled: false,
    cancelReason: null,
  });
  const { container } = render(<DashboardResource />);
  expect(container).toMatchSnapshot();
});

it('error state', () => {
  vi.mocked(useToolData).mockReturnValue({
    output: null,
    input: null,
    inputPartial: null,
    isError: true,
    isLoading: false,
    isCancelled: false,
    cancelReason: null,
  });
  const { container } = render(<DashboardResource />);
  expect(container).toMatchSnapshot();
});

it('cancelled state', () => {
  vi.mocked(useToolData).mockReturnValue({
    output: null,
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: false,
    isCancelled: true,
    cancelReason: 'User cancelled the request',
  });
  const { container } = render(<DashboardResource />);
  expect(container).toMatchSnapshot();
});

If your error component changes its message or your loading spinner gets replaced, the snapshot shows you exactly what changed.
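The three tests above repeat the same seven fields and change one or two. A small factory keeps each test down to the fields that matter. This is a sketch: ToolDataState mirrors the mocked return value used in this article rather than sunpeak's actual type, and makeToolData is a hypothetical helper.

```typescript
// Shape mirrors the mocked useToolData() return value in this article.
// The real sunpeak type may differ.
interface ToolDataState {
  output: Record<string, unknown> | null;
  input: Record<string, unknown> | null;
  inputPartial: Record<string, unknown> | null;
  isError: boolean;
  isLoading: boolean;
  isCancelled: boolean;
  cancelReason: string | null;
}

// Hypothetical helper: start from a settled, empty state and override
// only the fields a given test cares about.
function makeToolData(overrides: Partial<ToolDataState> = {}): ToolDataState {
  return {
    output: null,
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: false,
    isCancelled: false,
    cancelReason: null,
    ...overrides,
  };
}

const loadingState = makeToolData({ isLoading: true });
const errorState = makeToolData({ isError: true });
const cancelledState = makeToolData({
  isCancelled: true,
  cancelReason: 'User cancelled the request',
});
console.log(loadingState.isLoading, errorState.isError, cancelledState.cancelReason);
// prints "true true User cancelled the request"
```

Each test can then pass makeToolData({ isLoading: true }) to the mock, which makes the difference between states visible at a glance when reviewing the test file.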

Snapshot Testing Claude Connector Resources

Claude Connector resource components work the same way. The data comes through useToolData() just like MCP Apps, so the snapshot pattern is identical:

import { render } from '@testing-library/react';
import { describe, it, expect, vi } from 'vitest';
import { TicketListResource } from './ticket-list';

vi.mock('sunpeak', () => ({
  useToolData: () => ({
    output: {
      tickets: [
        { id: 'TICK-1', title: 'Login broken', status: 'open', priority: 'high' },
        { id: 'TICK-2', title: 'Slow dashboard', status: 'in_progress', priority: 'medium' },
      ],
      total: 2,
    },
    input: null,
    inputPartial: null,
    isError: false,
    isLoading: false,
    isCancelled: false,
    cancelReason: null,
  }),
  useAppState: () => [{}, vi.fn()],
  useDisplayMode: () => 'inline',
  SafeArea: ({ children }: { children: React.ReactNode }) => <div>{children}</div>,
}));

describe('TicketListResource', () => {
  it('renders ticket list', () => {
    const { container } = render(<TicketListResource />);
    expect(container).toMatchSnapshot();
  });
});

For Claude Connectors specifically, also snapshot the tool handler’s annotations. If you’re submitting to the Claude Connector Directory, every tool needs correct readOnlyHint or destructiveHint annotations. A snapshot makes sure nobody accidentally removes them:

// getToolDefinition stands in for however your project looks up a tool's definition
it('has correct annotations', () => {
  const toolDef = getToolDefinition('search-tickets');
  expect(toolDef.annotations).toMatchInlineSnapshot(`
    {
      "openWorldHint": false,
      "readOnlyHint": true,
    }
  `);
});
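A per-tool snapshot protects the tools you remembered to test. To catch a new tool that ships without hints at all, you can pair it with a check across every definition. The sketch below is plain TypeScript under a few assumptions: ToolDefinition is a simplified shape, findToolsMissingHints is a made-up helper, and the delete-ticket and update-ticket tools are hypothetical examples.

```typescript
// Hint names follow the MCP tool annotation fields used above.
// ToolDefinition is a simplified, hypothetical shape.
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  openWorldHint?: boolean;
}

interface ToolDefinition {
  name: string;
  annotations?: ToolAnnotations;
}

// Returns the names of tools that declare neither readOnlyHint nor destructiveHint.
function findToolsMissingHints(toolDefs: ToolDefinition[]): string[] {
  return toolDefs
    .filter(
      (tool) =>
        tool.annotations?.readOnlyHint === undefined &&
        tool.annotations?.destructiveHint === undefined,
    )
    .map((tool) => tool.name);
}

const exampleTools: ToolDefinition[] = [
  { name: 'search-tickets', annotations: { readOnlyHint: true, openWorldHint: false } },
  { name: 'delete-ticket', annotations: { destructiveHint: true } }, // hypothetical tool
  { name: 'update-ticket' }, // hypothetical tool with no annotations
];

const missingHints = findToolsMissingHints(exampleTools);
console.log(missingHints.join(', ')); // prints "update-ticket"
```

In a test, asserting that the returned list is empty gives you a single failure that names every unannotated tool.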

When Snapshots Help vs When They Don’t

Snapshots work best for:

Catching accidental regressions. You refactor a utility function and a component’s output changes unexpectedly. The snapshot catches it before it ships.
Documenting component output. New team members can read .snap files to understand what each component renders for given inputs.
Covering many states cheaply. Writing manual assertions for every element in every display mode is tedious. Snapshots cover the full output with one line.
Enforcing data contracts. Inline snapshots on structuredContent make the contract between your tool handler and resource component explicit and version-controlled.

Snapshots don’t help with:

Testing user interactions. Snapshots capture static output. If you need to test that clicking a button updates the UI, use the inspector fixture from sunpeak/test for e2e tests.
Testing visual appearance. Snapshots compare HTML structure, not pixels. A CSS change that breaks your layout without touching the markup won’t show up in a snapshot. Use pnpm test:visual for visual regression testing.
Testing cross-host rendering. Snapshots run in happy-dom, not a real browser. Host-specific rendering differences (ChatGPT vs Claude iframe behavior, CSS variable values) need e2e tests.

The best MCP App test suites use snapshots as a fast first layer. They run in milliseconds, catch structural regressions early, and keep your data contracts stable. E2e tests and visual regression tests handle the rest.

Managing Snapshots in Practice

A few things that keep snapshots useful over time instead of becoming noise:

Update intentionally. When a snapshot fails, read the diff before running -u. Failing snapshots are only useful if someone actually looks at why they failed. If your team’s habit is to run pnpm test:unit -- -u without reading the diff, you’re just generating files that nobody checks.

Commit snapshots with the code change. If you rename a CSS class, the snapshot update should be in the same commit. Reviewers can see the code change and the resulting output change together.

Keep snapshots focused. Snapshot the smallest meaningful output. If you only care about the table structure, snapshot just the table element instead of the entire container. Large snapshots are harder to review and more likely to change for irrelevant reasons:

// Focused: just the part you care about
it('renders the data table', () => {
  const { container } = render(<DashboardResource />);
  const table = container.querySelector('table');
  expect(table).toMatchSnapshot();
});

Delete stale snapshots. When you remove a test, its snapshot sticks around in the .snap file. Run pnpm test:unit -- --clearInvalidSnapshots periodically to clean them up.

Don’t snapshot everything. A component that renders “Hello, World” doesn’t need a snapshot. Snapshots add the most value for components with complex output that varies based on data, display mode, or theme. Simple components are better served by a targeted assertion like expect(screen.getByText('Hello, World')).toBeInTheDocument().

Running Snapshot Tests

Snapshot tests run as part of your unit test suite:

# Run all unit tests including snapshots
pnpm test:unit

# Update all snapshots after intentional changes
pnpm test:unit -- -u

# Clean up orphaned snapshots
pnpm test:unit -- --clearInvalidSnapshots

In CI/CD, pnpm test runs both unit and e2e tests. Snapshot tests are included automatically. If a snapshot is out of date, the CI build fails and the diff shows exactly what changed. See MCP App CI/CD with GitHub Actions for the full pipeline setup.

Snapshot testing fits naturally into an MCP App testing workflow. Use it alongside simulation files, mocks, and the inspector fixture to cover your app from every angle: fast structural checks with snapshots, behavioral checks with e2e tests, and pixel-level checks with visual regression.

Get Started

Documentation →


npx sunpeak new

Frequently Asked Questions

What is snapshot testing for MCP Apps?

Snapshot testing serializes the rendered output of your MCP App resource components or tool handler responses to a .snap file. On future test runs, Vitest compares the current output to the saved snapshot and fails if anything changed. This catches unintended changes to your component markup, structured content shape, or tool result format without writing manual assertions for every element.

How do I snapshot test an MCP App resource component?

Render your component in a unit test using @testing-library/react, then call expect(container).toMatchSnapshot(). Vitest saves the rendered HTML to a .snap file in a __snapshots__ directory next to your test. When the component output changes, the test fails and shows you a diff. Update snapshots with pnpm test:unit -- -u after confirming the change is intentional.

Should I use toMatchSnapshot or toMatchInlineSnapshot for MCP Apps?

Use toMatchInlineSnapshot for small, focused outputs like tool handler structured content or single-element assertions. Use toMatchSnapshot for larger outputs like full component renders. Inline snapshots live right in your test file so you see the expected value without opening a separate .snap file, which makes code review easier for small values.

How do I snapshot test MCP App tool handler output?

Call your tool handler with test arguments and snapshot the structuredContent field. Use toMatchInlineSnapshot for small payloads or toMatchSnapshot for larger ones. This catches changes to the data shape your resource component depends on, like renamed fields, missing values, or changed types.

What is the difference between snapshot testing and visual regression testing for MCP Apps?

Snapshot testing compares serialized HTML or JSON output as text diffs. Visual regression testing compares actual screenshots pixel by pixel. Snapshots are fast (milliseconds, no browser needed) and catch structural changes like missing elements or changed attributes. Visual regression is slower (needs a real browser) but catches CSS and layout bugs that snapshots miss. Most MCP App projects benefit from both.

How do I update MCP App snapshots after intentional changes?

Run pnpm test:unit -- -u to update all snapshots. Vitest rewrites the .snap files and inline snapshots to match the current output. Always review the snapshot diff in your commit to make sure every change was intentional. Commit updated snapshots alongside the code change that caused them.

Can I snapshot test MCP Apps across different display modes and themes?

Yes. Mock useDisplayMode and useHostContext in your unit tests to return different values, then snapshot the output for each combination. This catches cases where your component renders different markup in fullscreen vs inline mode, or light vs dark theme. Create separate snapshot tests for each important state combination.

Do snapshot tests replace other MCP App tests?

No. Snapshot tests complement unit tests, e2e tests, and visual regression tests. Snapshots catch structural regressions quickly but do not test user interactions, cross-host rendering, or visual appearance. Use snapshot tests as a fast first line of defense, then e2e tests with the inspector fixture for behavioral testing, and visual regression tests for pixel-level accuracy.