How Claude Connectors Work: Architecture, Lifecycle, and Limits

April 3, 2026 Abe Wheeler


How Claude Connectors work under the hood: architecture, lifecycle, and limits.

TL;DR: Claude Connectors are remote MCP servers that Claude calls over Streamable HTTP. Your server exposes tools; Claude decides when to call them. Tool responses are capped at 25,000 tokens with a 5-minute timeout. Interactive connectors render React components in sandboxed iframes via structuredContent. Auth uses OAuth 2.1 with PKCE. This post walks through the full architecture so you know exactly what happens at each stage.

You know what Claude Connectors are. Maybe you have built one using the tutorial. But when something breaks, or when you need to optimize for production, you need to understand what actually happens between the moment a user sends a message and the moment your connector’s response appears in the chat.

This post covers the full technical architecture: the request lifecycle, transport protocol, authentication flow, tool response formatting, interactive UI rendering, and the limits you need to design around.

The Request Lifecycle

Here is what happens when a user interacts with a Claude Connector, step by step:

1. User sends a message. The user types something in Claude that relates to a connected service. For example: “Show me the latest support tickets.”
2. Claude selects a tool. Claude examines the user’s message and the available tools from all enabled connectors. Each tool has a name, description, and input schema. Claude picks the tool that best matches the intent and generates arguments that conform to the schema.
3. Claude sends a JSON-RPC request. Claude sends a tools/call request over Streamable HTTP to your MCP server’s endpoint (typically /mcp). The request body includes the tool name and the arguments Claude generated.
4. Your handler runs. Your tool handler receives the arguments, validates them, calls your backend (database, REST API, third-party service), and builds a response.
5. Your server returns the result. The handler returns either text content (for standard connectors) or structuredContent (for interactive connectors with UI). The response travels back over HTTP.
6. Claude processes the response. For text content, Claude reads the data and writes a natural language response that incorporates it. For structuredContent, Claude renders the linked resource component in the conversation.
7. The user sees the result. Text shows up as part of Claude’s message. Interactive UI shows up as an embedded card, dashboard, or form within the conversation.
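
To make the tools/call exchange concrete, here is a minimal sketch of the request and response envelopes and a dispatcher that routes a call to a handler. The envelope shapes follow JSON-RPC 2.0 and the MCP tools/call method; the list_tickets tool, its arguments, and the in-memory registry are hypothetical and stand in for whatever your framework provides.

```typescript
// Envelope shapes for an MCP tools/call request and its JSON-RPC response.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: string | number;
  method: "tools/call";
  params: { name: string; arguments?: Record<string, unknown> };
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: string | number;
  result?: ToolResult;
  error?: { code: number; message: string };
}

type ToolHandler = (args: Record<string, unknown>) => ToolResult;

// Hypothetical registry with one tool; a real handler would call a backend.
const toolRegistry = new Map<string, ToolHandler>([
  [
    "list_tickets",
    (args) => {
      const status = typeof args.status === "string" ? args.status : "open";
      const tickets = [{ id: "TK-1234", title: "Login broken on Safari", status }];
      return { content: [{ type: "text", text: JSON.stringify({ tickets }) }] };
    },
  ],
]);

// Look up the requested tool, run it, and wrap the result in a response
// that echoes the request id so the caller can match it.
function handleToolsCall(req: ToolsCallRequest): JsonRpcResponse {
  const handler = toolRegistry.get(req.params.name);
  if (!handler) {
    // -32602 is the JSON-RPC "Invalid params" code.
    return {
      jsonrpc: "2.0",
      id: req.id,
      error: { code: -32602, message: `Unknown tool: ${req.params.name}` },
    };
  }
  return { jsonrpc: "2.0", id: req.id, result: handler(req.params.arguments ?? {}) };
}

const exampleResponse = handleToolsCall({
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "list_tickets", arguments: { status: "open" } },
});
console.log(JSON.stringify(exampleResponse, null, 2));
```

In practice an MCP SDK or framework handles this envelope for you. The sketch only shows what travels over the wire.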

The whole round trip typically takes 1-3 seconds, depending on how fast your handler responds.

Transport: Streamable HTTP

Claude Connectors communicate over Streamable HTTP, the standard remote transport in the MCP specification since version 2025-03-26.

Your server exposes a single endpoint (like /mcp) that handles three HTTP methods:

POST receives JSON-RPC messages from Claude (tool calls, initialization, resource reads). The server can respond with a single JSON response or open an SSE stream for longer operations.
GET (optional) opens an SSE stream for server-initiated notifications, like notifications/resources/list_changed.
DELETE terminates the session when Claude disconnects.

Sessions are tracked with an Mcp-Session-Id header. Claude sends this header with every request after initialization, and your server uses it to maintain per-session state if needed.
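
The method handling and session header described above can be modeled as a small pure function. This is a sketch of a stateful server, not a real HTTP handler: the IncomingRequest shape and action names are invented for illustration, and the status codes (400 for a missing session ID, 404 for an unknown or terminated session) reflect one reading of the Streamable HTTP transport spec that you should verify against the version you target.

```typescript
type HttpMethod = "GET" | "POST" | "DELETE";

// Simplified view of a request to the single /mcp endpoint (hypothetical shape).
interface IncomingRequest {
  method: HttpMethod;
  sessionId?: string; // value of the Mcp-Session-Id header, if present
  isInitialize?: boolean; // true when the POST body is an initialize request
}

interface Routed {
  status: number;
  action: string;
  sessionId?: string;
}

const activeSessions = new Set<string>();
let sessionCounter = 0;

function routeMcpRequest(req: IncomingRequest): Routed {
  // Initialization creates the session; the ID goes back in a response header.
  if (req.method === "POST" && req.isInitialize) {
    const id = `session-${++sessionCounter}`; // use a random, unguessable ID in practice
    activeSessions.add(id);
    return { status: 200, action: "initialize", sessionId: id };
  }
  // Every later request is expected to carry the session header.
  if (!req.sessionId) return { status: 400, action: "missing-session-id" };
  if (!activeSessions.has(req.sessionId)) return { status: 404, action: "unknown-session" };

  switch (req.method) {
    case "POST":
      return { status: 200, action: "handle-json-rpc" };
    case "GET":
      return { status: 200, action: "open-sse-stream" };
    case "DELETE":
      activeSessions.delete(req.sessionId);
      return { status: 200, action: "session-terminated" };
  }
}
```

A stateless deployment would skip the session set entirely and treat every POST as self-contained.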

The older HTTP+SSE transport (two separate endpoints for a long-lived SSE connection and message posting) still works in Claude today, but Anthropic has signaled it will be removed. The Connectors Directory already requires Streamable HTTP for new submissions. If your connector still uses SSE, see the migration guide.

Streamable HTTP has a practical advantage: it works on serverless platforms like Cloudflare Workers and Vercel Edge Functions because it does not require persistent connections. You can also run it in stateless mode for horizontal scaling without sticky sessions.

Authentication: OAuth 2.1 with PKCE

Not every connector needs auth. If your connector serves public data or runs inside a private network, you can skip it. But if your connector accesses user-specific data, OAuth is how Claude handles identity.

The flow works like this:

1. User enables the connector. In Claude Settings, the user adds your connector URL. If your server declares that it requires auth, Claude initiates the OAuth flow.
2. Claude redirects to your OAuth provider. The user sees your login page (Google, GitHub, your own auth server, etc.) and authorizes access. Claude uses PKCE (Proof Key for Code Exchange) as required by OAuth 2.1.
3. Claude stores tokens. After the user authorizes, Claude receives an access token and refresh token. Both are encrypted and stored by Anthropic’s backend. Claude never stores your user’s password.
4. Tool calls include the token. On each tool call, Claude passes the access token to your server. Your handler validates the token and uses it to fetch data for that specific user.
5. Token refresh happens automatically. When the access token expires, Claude uses the refresh token to get a new one without user interaction.
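
If you run your own authorization server rather than delegating to a hosted provider, the PKCE part of this flow comes down to one hash comparison. The sketch below shows the S256 transform from RFC 7636 using Node's built-in crypto module. The function names are illustrative and do not come from any particular library.

```typescript
import { createHash, randomBytes } from "node:crypto";

// S256 transform from RFC 7636: BASE64URL(SHA256(code_verifier)), unpadded.
function codeChallengeS256(codeVerifier: string): string {
  return createHash("sha256").update(codeVerifier, "ascii").digest("base64url");
}

// Client side (Claude, in this flow): a fresh random verifier per authorization.
// 32 random bytes encode to 43 URL-safe characters.
function newCodeVerifier(): string {
  return randomBytes(32).toString("base64url");
}

// Server side: at the token endpoint, recompute the challenge from the
// verifier the client sends and compare it to the one stored at /authorize.
function verifyPkce(codeVerifier: string, storedChallenge: string): boolean {
  return codeChallengeS256(codeVerifier) === storedChallenge;
}

const verifier = newCodeVerifier();
const challenge = codeChallengeS256(verifier);
console.log(verifyPkce(verifier, challenge)); // true
console.log(verifyPkce("some-other-verifier", challenge)); // false
```

The client sends the challenge with the authorization request and the verifier with the token request, so an intercepted authorization code cannot be redeemed without the verifier.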

If you are submitting to the Connectors Directory, OAuth is required for any connector that accesses private data. Pure client credentials flow (machine-to-machine without user interaction) is not supported. You also need to allowlist both https://claude.ai/api/mcp/auth_callback and https://claude.com/api/mcp/auth_callback as redirect URIs. See the OAuth guide for implementation details.

Without OAuth, Claude sends no user identity to your server. No user IDs, no session tokens, no IP addresses. If you need per-user data, OAuth is the only mechanism.
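
On the server side, identifying the user starts with reading the access token from the request. The sketch below assumes the token arrives in the Authorization header using the Bearer scheme, which is how the MCP authorization spec describes it. The helper name is illustrative, and validating the token (signature, expiry, audience) is left to your auth library.

```typescript
// Pull the token out of an "Authorization: Bearer <token>" header value.
// Returns null when the header is missing or uses a different scheme.
function extractBearerToken(authorization: string | undefined): string | null {
  if (!authorization) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match ? match[1] : null;
}

console.log(extractBearerToken("Bearer abc123")); // "abc123"
console.log(extractBearerToken(undefined)); // null
```

When this returns null for a connector that requires auth, responding with 401 rather than running the tool keeps unauthenticated calls away from user data.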

Tool Response Formatting

How you structure your tool’s response determines what Claude does with it. There are two formats.

Text Content (Standard Connectors)

Return a content array with text items. Claude reads the text and incorporates it into a conversational response.

return {
  content: [
    {
      type: 'text',
      text: JSON.stringify({
        tickets: [
          { id: 'TK-1234', title: 'Login broken on Safari', status: 'open' },
          { id: 'TK-1235', title: 'Billing page 500 error', status: 'resolved' },
        ],
      }),
    },
  ],
};

Claude gets this JSON, understands the structure, and writes something like: “There are two recent tickets. TK-1234 is an open issue about Safari login, and TK-1235 was a billing page error that’s been resolved.”

This format works well for data that Claude should reason over, summarize, or combine with other information.

Structured Content (Interactive Connectors)

Return structuredContent to render a React component inside the conversation.

return {
  structuredContent: {
    type: 'resource',
    resource: {
      uri: 'ui://ticket-card',
      mimeType: 'text/html',
    },
    content: {
      id: 'TK-1234',
      title: 'Login broken on Safari',
      status: 'open',
      assignee: 'Jamie',
      priority: 'high',
    },
  },
};

The resource.uri points to a registered MCP App resource. Claude loads that resource’s React component in a sandboxed iframe and passes the content object to it. Your component receives this data through the useToolData() hook and renders it however you want: a card with status badges, a chart, a form with action buttons.

You can mix both formats across different tools in the same connector. A search tool might return text content so Claude can summarize results, while a detail tool returns structuredContent to show a rich card.

How Interactive Rendering Works

When Claude receives structuredContent from a tool call, it renders your UI inside the conversation. Here is what happens:

1. Claude matches the resource URI. The uri field in your structuredContent (like ui://ticket-card) maps to a resource your server registered during initialization.
2. Claude fetches the resource. Claude sends a resources/read request to your server for that URI. Your server responds with the HTML/JS bundle for the component.
3. Claude creates a sandboxed iframe. The resource loads in an iframe with a restrictive Content Security Policy. Scripts cannot load from CDNs or external URLs. Your JavaScript must be bundled inline or served from your own server.
4. Your component receives data. Inside the iframe, your React component calls useToolData() to get the content object from the tool response. It renders based on that data.
5. The iframe adapts to its display mode. The resource starts inline in the conversation. Your component can read the current display mode with useDisplayMode() and request transitions (fullscreen, picture-in-picture) with useApp().requestDisplayMode().
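
The resource fetch is an ordinary MCP resources/read exchange, and a server-side handler for it can be sketched as below. The result shape (a contents array of uri, mimeType, and text) follows the MCP spec; the in-memory registry and the placeholder HTML are assumptions for illustration, and a framework would normally fill in the built bundle.

```typescript
interface ResourceContents {
  uri: string;
  mimeType: string;
  text: string; // the HTML document, with the component bundle inlined
}

// Hypothetical registry: URI -> pre-built bundle produced by your build step.
const uiResources = new Map<string, ResourceContents>([
  [
    "ui://ticket-card",
    {
      uri: "ui://ticket-card",
      mimeType: "text/html",
      text: '<!doctype html><div id="root"></div><script>/* bundled component code */</script>',
    },
  ],
]);

// Result payload for a resources/read request on the given URI.
function readResource(uri: string): { contents: ResourceContents[] } {
  const found = uiResources.get(uri);
  if (!found) throw new Error(`Resource not found: ${uri}`);
  return { contents: [found] };
}

console.log(readResource("ui://ticket-card").contents[0].mimeType); // "text/html"
```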

The sandbox restrictions matter for development. Because Claude’s iframe blocks external script sources, your framework needs to serve a pre-built production bundle rather than rely on a Vite dev server. When you change code, the server sends a notifications/resources/list_changed notification, and Claude re-fetches the resource. This differs from ChatGPT, which supports Vite HMR natively. For more on display modes, see the display mode reference and the request display mode guide.

If you use sunpeak for development, it handles the bundle detection and rebuild notifications automatically. pnpm dev starts a local inspector that replicates the Claude runtime, so you can iterate on your UI without connecting to a real Claude session.

Tool Annotations

Every tool in your connector should have annotations. These are hints that tell Claude how to treat the tool.

export const tool: AppToolConfig = {
  title: 'Delete Ticket',
  description: 'Permanently delete a support ticket by ID.',
  annotations: {
    destructiveHint: true,
  },
};

The two annotations that matter most:

readOnlyHint: true tells Claude the tool only reads data. Claude may call it without asking the user for confirmation.
destructiveHint: true tells Claude the tool modifies or deletes data. Claude will ask the user for confirmation before calling it.

If you set neither, Claude treats the tool as potentially destructive and may still prompt for confirmation.

Missing annotations cause about 30% of Connectors Directory rejections. Annotate every tool, even the read-only ones. It takes 10 seconds per tool and saves you a rejection cycle.
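
One way to catch a missing annotation before you submit is a small check in your test suite. The sketch below assumes you can collect your tool definitions into an array. The ToolDef shape is a simplified stand-in and is not the AppToolConfig type shown above.

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
}

// Simplified tool definition for illustration.
interface ToolDef {
  name: string;
  annotations?: ToolAnnotations;
}

// Return the names of tools that declare neither readOnlyHint nor destructiveHint.
function findUnannotatedTools(tools: ToolDef[]): string[] {
  return tools
    .filter(
      (t) =>
        t.annotations?.readOnlyHint === undefined &&
        t.annotations?.destructiveHint === undefined,
    )
    .map((t) => t.name);
}

const missing = findUnannotatedTools([
  { name: "list_tickets", annotations: { readOnlyHint: true } },
  { name: "delete_ticket", annotations: { destructiveHint: true } },
  { name: "update_ticket" },
]);
console.log(missing); // ["update_ticket"]
```

Failing the build when the returned list is non-empty turns the advice above into something CI enforces.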

For more on tool schemas and descriptions, see the tool design guide.

Limits You Need to Design Around

Claude Connectors have hard limits that affect how you build your tools.

25,000 Token Response Cap

Tool results cannot exceed 25,000 tokens. This includes the full content array or structuredContent object. If your handler returns more, Claude truncates or drops the response entirely.

For most tools, 25,000 tokens is plenty. A JSON array of 100 records with 5 fields each is typically around 3,000-5,000 tokens. But if your tool returns raw documents, log files, or large datasets, you’ll hit this limit.

Strategies for staying under:

Paginate. Accept a page or offset parameter in your tool schema and return a fixed number of results per call.
Summarize. Instead of returning raw data, aggregate or filter it in your handler before returning.
Split tools. Use a search tool that returns IDs and summaries, and a separate detail tool that returns the full record for one item.
Truncate with a warning. If the data might exceed the limit, truncate it and include a message like “Showing first 50 of 312 results.”
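
The pagination and truncation strategies combine naturally into one helper. The sketch below slices a result set by offset and limit and attaches a note Claude can relay to the user. The token estimate uses a rough rule of thumb of about four characters per token, which is an assumption and not how Claude actually counts tokens, so leave yourself headroom.

```typescript
interface Page<T> {
  items: T[];
  total: number;
  nextOffset: number | null; // null when there are no more results
  note: string;
}

// Return one page of results plus a human-readable note about what was omitted.
function paginate<T>(all: T[], offset = 0, limit = 50): Page<T> {
  const start = Math.max(0, Math.min(offset, all.length));
  const items = all.slice(start, start + Math.max(1, limit));
  const end = start + items.length;
  const note =
    items.length === 0
      ? `No results at offset ${start} (total ${all.length}).`
      : `Showing ${start + 1}-${end} of ${all.length} results.`;
  return { items, total: all.length, nextOffset: end < all.length ? end : null, note };
}

// Rough heuristic only: assumes about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const allTickets = Array.from({ length: 312 }, (_, i) => ({ id: `TK-${1000 + i}` }));
const page = paginate(allTickets, 0, 50);
console.log(page.note); // "Showing 1-50 of 312 results."
console.log(estimateTokens(JSON.stringify(page)) < 25000); // true
```

Exposing offset and limit in the tool's input schema lets Claude ask for the next page using nextOffset.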

5-Minute Timeout

Tool handlers must complete within 5 minutes. If your handler calls a slow external API, processes a large file, or runs a complex database query, it needs to finish within this window.

Add timeouts to all external requests in your handler. Many HTTP clients default to no timeout or a very long one, so a hanging upstream service will hang your tool until the 5-minute limit kills it. Cache repeated calls when possible.
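
A generic way to bound any awaited call is to race it against a timer. This is a minimal sketch with illustrative names (withTimeout, TimeoutError). Note that it stops waiting but does not cancel the underlying work, so for HTTP requests you will usually also want your client's own abort or timeout option.

```typescript
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Operation timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Reject with TimeoutError if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage: a stand-in for a slow upstream call that never answers.
const neverAnswers = new Promise<string>(() => {});
withTimeout(neverAnswers, 50).catch((err: Error) => {
  console.log(err.name); // "TimeoutError"
});
```

Picking a per-request budget well under 5 minutes leaves room to return a useful error message instead of being cut off by the platform.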

For operations that genuinely take longer than 5 minutes, use an async pattern: one tool starts the job and returns a job ID, and another tool checks the status and returns results when ready.
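
The start-then-poll pattern can be sketched with an in-memory job table. Everything here is hypothetical: the function names, the job ID format, and the Map used for storage. A production connector would persist jobs somewhere durable, since serverless instances do not share memory.

```typescript
type JobState =
  | { status: "running" }
  | { status: "done"; result: string }
  | { status: "unknown" };

const jobTable = new Map<string, JobState>();
let nextJobNumber = 1;

// Tool 1: register the job and return its ID immediately.
function startJob(): { jobId: string } {
  const jobId = `job-${nextJobNumber++}`;
  jobTable.set(jobId, { status: "running" });
  return { jobId };
}

// Called by your background worker when the long-running work finishes.
function completeJob(jobId: string, result: string): void {
  if (jobTable.has(jobId)) jobTable.set(jobId, { status: "done", result });
}

// Tool 2: report status, including the result once it is ready.
function checkJob(jobId: string): JobState {
  return jobTable.get(jobId) ?? { status: "unknown" };
}

const { jobId } = startJob();
console.log(checkJob(jobId).status); // "running"
completeJob(jobId, "export finished: 10432 rows");
console.log(checkJob(jobId).status); // "done"
```

Describing the relationship between the two tools in their descriptions helps Claude know to call the status tool with the returned job ID.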

Rate Limiting

Claude does not publish specific rate limits for connector tool calls, but your server should implement its own rate limiting. A user could trigger dozens of tool calls in a single conversation, and your backend may not handle that volume for every user simultaneously.

Standard approaches work: rate limit per user (based on the OAuth token), per IP, or per session. Return an error with a clear message when limits are hit so Claude can tell the user to wait.
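
As one example of a standard approach, here is a minimal fixed-window limiter keyed by user, plus an error result with a message Claude can pass along. It is a sketch under stated assumptions: the names are illustrative, state lives in process memory (so it will not coordinate across multiple instances), and the isError flag with a text content item follows the MCP tool result shape.

```typescript
interface WindowState {
  start: number;
  count: number;
}

// Allow at most `limit` calls per `windowMs` for each key.
function createLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, WindowState>();
  // Returns 0 when the call is allowed, else milliseconds until the window resets.
  return function check(key: string, now: number = Date.now()): number {
    const w = windows.get(key);
    if (!w || now - w.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return 0;
    }
    if (w.count < limit) {
      w.count += 1;
      return 0;
    }
    return w.start + windowMs - now;
  };
}

// Tool result that tells Claude, and through it the user, how long to wait.
function rateLimitedResult(retryAfterMs: number) {
  const seconds = Math.ceil(retryAfterMs / 1000);
  return {
    isError: true,
    content: [{ type: "text" as const, text: `Rate limit reached. Try again in ${seconds} seconds.` }],
  };
}

const checkLimit = createLimiter(2, 60_000);
checkLimit("user-1", 0);
checkLimit("user-1", 1_000);
const waitMs = checkLimit("user-1", 2_000); // third call inside the window
console.log(rateLimitedResult(waitMs).content[0].text); // "Rate limit reached. Try again in 58 seconds."
```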

Connection Limits by Plan

Free Claude users can add one custom connector. Pro, Max, Team, and Enterprise plans support more. On Team and Enterprise plans, org admins must approve custom connectors before individual users can add them. These limits apply to custom connectors only; Connectors Directory integrations have their own limits per plan.

Notifications: Server-to-Claude Communication

Claude Connectors are primarily request-response: Claude calls your tools, and your server responds. But there is one mechanism for your server to communicate back to Claude outside of a tool call.

Your server can send a notifications/resources/list_changed notification over the transport. This tells Claude that the set of available resources has changed. Claude may re-fetch the resource list and update any rendered resources.

This is how file-watching works during development. When you change a resource component’s code, the server sends this notification, and Claude re-fetches and re-renders the updated bundle.

For production connectors, this notification is useful when your data changes and you want any open resources to refresh. But it only works while the transport connection is active. If the user closes the conversation, the connection ends.
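
On the wire, this notification is a JSON-RPC message with a method and no id, since no reply is expected. The sketch below builds it and frames it as a server-sent event for an open stream. The helper names are illustrative, and an MCP SDK would normally do the framing for you.

```typescript
interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
}

// Notifications carry no "id" field; that is what distinguishes them from requests.
function resourceListChanged(): JsonRpcNotification {
  return { jsonrpc: "2.0", method: "notifications/resources/list_changed" };
}

// Frame a message as a server-sent event: a "data:" line ended by a blank line.
function toSseEvent(message: JsonRpcNotification): string {
  return `data: ${JSON.stringify(message)}\n\n`;
}

console.log(toSseEvent(resourceListChanged()));
```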

True push-based workflows (where your server initiates a message to the user) are not supported in Claude Connectors. For that pattern, Claude Code supports channels, which are MCP servers that forward external events like webhooks into a session.

Testing the Full Lifecycle Locally

You can test every stage of this lifecycle on your own machine, without a Claude account, using a local MCP inspector.

sunpeak includes an inspector that replicates the Claude runtime at localhost. It sends the same JSON-RPC requests, enforces the same iframe sandbox, and renders your resources in the same way Claude does.

pnpm dev

This starts your MCP server and opens the inspector at localhost:3000. Select “Claude” from the host dropdown to test Claude-specific behavior (the pre-built bundle requirement, notifications, display modes). You can also switch to “ChatGPT” to verify cross-host compatibility.

The inspector loads your simulation files (JSON fixtures that define tool input/output states), so you can test every tool and resource combination without hitting your real backend.

For automated testing, sunpeak’s test runner executes simulations headlessly, which means you can add Claude Connector tests to your CI pipeline with pnpm test (runs both unit and e2e tests).

Get Started


npx sunpeak new

Frequently Asked Questions

How do Claude Connectors work?

Claude Connectors are remote MCP servers that Claude calls over Streamable HTTP. When a user asks a question, Claude decides whether a connector tool would help, sends a JSON-RPC request to your server, and your tool handler runs and returns data. If the tool returns text content, Claude weaves it into a text response. If the tool returns structuredContent, Claude renders your React component in an iframe inside the conversation. The whole round trip, from user message to rendered response, typically takes 1-3 seconds.

What protocol do Claude Connectors use?

Claude Connectors use Streamable HTTP, the standard remote transport in the MCP specification (version 2025-03-26). Your server exposes a single endpoint (typically /mcp) that accepts POST requests with JSON-RPC messages and optionally GET requests for server-initiated notifications. SSE transport still works but is deprecated and no longer accepted for Connectors Directory submissions.

What is the token limit for Claude Connector tool responses?

Claude Connector tool results are capped at 25,000 tokens. If your tool handler returns more than that, Claude truncates or drops the response. For large datasets, paginate your results, summarize data before returning it, or split the operation across multiple tools. The 25,000 token limit applies to the full result, whether it carries a text content array or a structuredContent object.

How long can a Claude Connector tool take to respond?

Tool handlers must complete within 5 minutes. If your handler hits an external API, processes large files, or runs a complex query, add timeouts to external requests and cache repeated calls. For operations that genuinely take longer than 5 minutes, consider an async pattern where one tool starts the job and another tool checks the result.

How does authentication work for Claude Connectors?

Claude Connectors use OAuth 2.1 with PKCE for authentication. When a user enables an authenticated connector, Claude redirects them to your OAuth provider. After the user authorizes, Claude stores encrypted access and refresh tokens. On each tool call, Claude includes the access token in the request. Your server validates the token and uses it to fetch user-specific data. Without OAuth, Claude does not send any user identity information to your server.

How do interactive Claude Connectors render UI?

When a tool handler returns structuredContent instead of text, Claude renders the linked MCP App resource inside a sandboxed iframe in the conversation. Your React component receives the structuredContent data via the useToolData() hook and renders it as a card, chart, form, or any UI. The iframe is sandboxed with a restrictive Content Security Policy, so scripts must be bundled (no CDN loads) and external API calls require connectDomains in your resource metadata.

Can Claude Connectors push data to Claude without being called?

Not directly. Claude Connectors respond to tool calls; they cannot initiate messages. However, your server can send a notifications/resources/list_changed notification over the transport to tell Claude that available resources have changed. Claude may then re-fetch resources. For true push-based workflows, Claude Code supports channels, which are MCP servers that forward external events into a session.

What are Claude Connector tool annotations and why do they matter?

Tool annotations are hints you add to your tool definition that tell Claude how to handle the tool. readOnlyHint: true means the tool only reads data. destructiveHint: true means it modifies or deletes data. Claude uses these hints to decide whether to ask the user for confirmation before calling the tool. Missing annotations cause about 30% of Connectors Directory rejections. Always annotate every tool.