Overview
Live tests validate your MCP Apps inside real ChatGPT, not the inspector. They open your browser, navigate to ChatGPT, send messages that trigger tool calls against your MCP server, and verify the rendered app using Playwright assertions. This catches issues that inspector tests can’t: real MCP connection behavior, actual LLM tool invocation, host-specific iframe rendering, and production resource loading.
Prerequisites
- ChatGPT account — You need a ChatGPT account with MCP/Apps support
- Tunnel tool — ngrok, Cloudflare Tunnel, or similar
- Browser session — Logged into chatgpt.com in Chrome, Arc, Brave, or Edge
One-Time Setup
Add your MCP server in ChatGPT settings:
- Go to Settings > Apps > Create in ChatGPT
- Enter your tunnel URL with the /mcp path (e.g., https://abc123.ngrok.io/mcp)
- Save the connection
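The URL entered in ChatGPT settings is the tunnel's public address plus the /mcp path. A minimal sketch of that composition follows; toMcpUrl is a hypothetical helper (not part of sunpeak), and the ngrok hostname is the placeholder from the example above.

```typescript
// Hypothetical helper, not part of sunpeak: builds the URL to paste into
// ChatGPT settings from a tunnel's public base URL.
function toMcpUrl(tunnelBaseUrl: string): string {
  // Resolving an absolute path against the base tolerates a trailing slash
  // on the base and replaces any path it already had.
  return new URL('/mcp', tunnelBaseUrl).toString();
}

console.log(toMcpUrl('https://abc123.ngrok.io/')); // https://abc123.ngrok.io/mcp
```

Tunnel tools often issue a new hostname on each restart, so the saved connection in ChatGPT may need the updated URL after the tunnel is restarted.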
Running Live Tests
Run pnpm test:live. The command:
- Imports your ChatGPT session from your browser (Chrome, Arc, Brave, or Edge). Falls back to a manual login window if no session is found. Sessions typically last a few hours: Cloudflare’s HttpOnly cf_clearance cookie cannot be persisted, so re-authentication is needed when it expires.
- Starts sunpeak dev --prod-resources automatically
- Refreshes the MCP server connection in ChatGPT settings (once in globalSetup, before all workers)
- Runs tests/live/*.spec.ts files fully in parallel, with each test getting its own chat window
Live tests always run with a visible browser window. chatgpt.com uses bot detection that blocks headless browsers, so a visible browser is required for reliable results.
Running via Validate
You can also run live tests as part of the full validation pipeline.
Writing Live Tests
Live test specs live in tests/live/, one file per resource, just like e2e tests. Import test and expect from sunpeak/test/live to get a live fixture that handles login, MCP server refresh, and host-specific message formatting automatically.
The live Fixture
The live fixture provides:
- invoke(prompt): one-liner that starts a new chat, sends the prompt (with host-specific formatting like /{appName} for ChatGPT), waits for the app iframe, and returns a FrameLocator
- startNewChat(): opens a fresh conversation (for multi-step flows)
- sendMessage(text): sends a message with host-appropriate formatting (read from your package.json)
- waitForAppIframe(): waits for the MCP app iframe to render and returns a FrameLocator
- sendRawMessage(text): sends a message without any prefix
- setColorScheme(scheme, appFrame?): switches the host to 'light' or 'dark' theme; optionally pass an app FrameLocator to wait for it to update
- page: raw Playwright Page object for advanced assertions
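For multi-step flows, the lower-level methods compose in the order they are listed. The sketch below is written against a small structural interface so that it stands alone: LiveLike and AppFrameLike are assumptions modeled on the method list above, not sunpeak's actual type definitions, and checkBothThemes is a hypothetical helper.

```typescript
// Assumed shapes, modeled on the fixture methods listed above.
// They are NOT sunpeak's real type definitions.
type ColorScheme = 'light' | 'dark';

interface AppFrameLike {
  // Stand-in for the one Playwright FrameLocator method this sketch mentions.
  locator(selector: string): unknown;
}

interface LiveLike {
  startNewChat(): Promise<void>;
  sendMessage(text: string): Promise<void>;
  waitForAppIframe(): Promise<AppFrameLike>;
  setColorScheme(scheme: ColorScheme, appFrame?: AppFrameLike): Promise<void>;
}

// Hypothetical helper: opens a fresh chat, triggers the tool, then switches
// the host through both themes. Passing the app frame lets each switch wait
// for the app to update.
async function checkBothThemes(live: LiveLike, prompt: string): Promise<AppFrameLike> {
  await live.startNewChat();
  await live.sendMessage(prompt);
  const app = await live.waitForAppIframe();
  for (const scheme of ['light', 'dark'] as const) {
    await live.setColorScheme(scheme, app);
    // Real assertions on `app` would go here, for example Playwright's
    // expect(app.locator('img').first()).toBeVisible().
  }
  return app;
}
```

Inside a spec, the live fixture would be passed where live appears here. For the single-prompt case, invoke(prompt) covers the first three calls in one step.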
Configuration
The Playwright config is a one-liner that targets the supported hosts (currently only chatgpt). Tests switch themes internally using live.setColorScheme(). When new hosts are supported, add them with a one-line change.
Troubleshooting
'Not logged into ChatGPT' error
On first run, a browser window opens for you to log in to ChatGPT. The session is saved to .auth/chatgpt.json but typically only lasts a few hours because Cloudflare’s cf_clearance cookie is HttpOnly and cannot be persisted across runs. When you see this error, re-authenticate in the browser window that opens. If it keeps failing, delete the .auth/ directory and run pnpm test:live again.
Tunnel not reachable
Verify your tunnel is running and the URL is correct. The test checks the tunnel’s /health endpoint before proceeding.
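To run a similar check by hand before starting a test run, something along these lines should work with the built-in fetch in Node 18 or later. healthUrl and isTunnelReachable are hypothetical helpers, and treating any 2xx response as healthy is an assumption, since the exact check sunpeak performs is not described here.

```typescript
// Hypothetical helpers, not part of sunpeak.
function healthUrl(tunnelBaseUrl: string): string {
  return new URL('/health', tunnelBaseUrl).toString();
}

// Minimal fetch-like signature so a stub can be injected when testing.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

async function isTunnelReachable(
  tunnelBaseUrl: string,
  fetchImpl: FetchLike = (url) => fetch(url),
): Promise<boolean> {
  try {
    const res = await fetchImpl(healthUrl(tunnelBaseUrl));
    return res.ok; // assumption: any 2xx status counts as healthy
  } catch {
    return false; // DNS failure, refused connection, tunnel offline, etc.
  }
}

// Example call (performs a real network request):
// isTunnelReachable('https://abc123.ngrok.io').then((ok) => console.log(ok));
```

A false result usually means the tunnel process has stopped or the hostname has changed since the connection was saved.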
'ChatGPT DOM may have changed' warning
ChatGPT occasionally updates its UI. The ChatGPTPage class checks selector health at startup. If selectors are stale, update the SELECTORS constant in chatgpt-page.mjs.
Tool not called by ChatGPT
Live tests use specific prompts like “Use the show-albums tool to…” to reliably trigger tool calls. If a tool isn’t called, the test retries once. Persistent failures may indicate the tool isn’t properly connected — check ChatGPT settings.
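The prompt pattern and the single retry can be sketched as follows. toolPrompt and withOneRetry are hypothetical helpers for illustration only; they show the idea rather than sunpeak's implementation, and the task text is made up.

```typescript
// Hypothetical helper: names the tool explicitly so the model is more
// likely to call it than to answer in plain text.
function toolPrompt(toolName: string, task: string): string {
  return `Use the ${toolName} tool to ${task}`;
}

// Hypothetical helper: runs an async action and retries exactly once if
// the first attempt throws.
async function withOneRetry<T>(action: () => Promise<T>): Promise<T> {
  try {
    return await action();
  } catch {
    return await action(); // a second failure propagates to the caller
  }
}

console.log(toolPrompt('show-albums', 'show my photo albums'));
// Use the show-albums tool to show my photo albums
```

Keeping the tool name verbatim in the prompt makes a missed call easier to attribute to the connection rather than to the wording.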