playwright-mcp works in local testing and silently destroys sessions in cloud deployment

Published: 2026-04-19 | Last verified: 2026-04-19 | Confidence: empirical

7 claims, 7 tested | finding

From Theory Delta | Methodology | Published 2026-04-19

You connected playwright-mcp to your agent and tested it locally with Claude Code. Multi-step workflows — navigate, fill, click, submit — run reliably. You deploy to a cloud-hosted or containerized environment. The same workflow breaks on step 2 with "Session not found".

The README presents stdio and HTTP/SSE as equivalent transport options. They are not.

What you expect

Browser sessions persist across tool calls regardless of transport. HTTP/SSE and stdio are configuration choices with the same semantics. Local test results predict cloud deployment behavior. Auth persistence configured in playwright.config (or equivalent) works as documented.

What actually happens

HTTP transport loses the browser session after the first tool call in containerized environments. The server’s session handler deletes a session from its session map when the network connection is interrupted. In a container, only the first request in a workflow succeeds; subsequent calls fail with "Session not found", so multi-step workflows (navigate → fill → submit → assert) break at step 2. (Issue #1045, Issue #1140) Both issues are now closed, but the resolution is unclear from the public issue pages. Verify that the fix holds against your current version before relying on HTTP transport for multi-step workflows.

The critical asymmetry that makes this a production trap: stdio transport (the default for Claude Code, Cursor, and Windsurf) maintains a persistent process — sessions survive across tool calls. HTTP transport does not. Teams testing locally via Claude Code pass their multi-step tests. The same workflow deployed via HTTP to a cloud client silently loses session state on every step past the first. The README presents this as a transport preference. It is an architectural decision with production consequences.
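The asymmetry can be illustrated with a toy model. This is a minimal sketch, not playwright-mcp's actual code: the class, method names, and session shape are all hypothetical, and it only assumes the behavior reported in the issues (a session entry is removed from a map when the connection is interrupted).

```python
class SessionMapServer:
    """Toy model of a server that keeps sessions in a map (hypothetical names)."""

    def __init__(self):
        self.sessions = {}

    def open_session(self, session_id):
        self.sessions[session_id] = {"url": None}

    def on_connection_closed(self, session_id):
        # Behavior described in the article: an interruption deletes the entry.
        self.sessions.pop(session_id, None)

    def call_tool(self, session_id, tool, **args):
        if session_id not in self.sessions:
            raise LookupError("Session not found")
        state = self.sessions[session_id]
        if tool == "navigate":
            state["url"] = args["url"]
        return dict(state)


def demo():
    """Step 1 succeeds; an interruption between calls makes step 2 fail."""
    server = SessionMapServer()
    server.open_session("s1")
    first = server.call_tool("s1", "navigate", url="https://example.com")
    server.on_connection_closed("s1")  # simulated network interruption
    try:
        server.call_tool("s1", "fill")
    except LookupError as exc:
        return first["url"], str(exc)
    return first["url"], None
```

A stdio server corresponds to the same model with `on_connection_closed` never firing between calls, which is why the local run passes.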

Three more failure modes compound the session problem:

userDataDir is silently ignored in config files. Teams configuring auth persistence via the documented config file get ephemeral in-memory sessions. The --profile CLI flag works; the config file path does not. (Issue #1446, open as of March 2026)
5-second ping timeout terminates slow operations — not configurable. Operations slower than 5 seconds trigger silent session termination mid-workflow. Claude Code maintains heartbeats and is unaffected. Go MCP clients and other non-heartbeat clients hit this. (Issue #982, closed — resolution unconfirmed, verify against your version)
Verify tools failed inside iframes until v0.0.69 (March 30, 2026). browser_verify_text_visible and browser_verify_element_visible previously stopped at the main document boundary, while action tools (browser_click, browser_fill) reached inside iframes. (Issue #1394) Versions below v0.0.69 remain affected.

Token cost is a separate structural problem. playwright-mcp streams the full accessibility snapshot into the LLM context after every action — no buffering, no selective suppression. Measured February 2026 against the same task: 114K tokens for playwright-mcp vs 27K for Playwright CLI. A version bump from 0.0.30 to 0.0.32 added console message logging that caused a 6x token spike on pages with console errors (Issue #889). Microsoft released Playwright CLI in February 2025 as a token-efficient alternative for filesystem-accessible agents. No migration guide exists.
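The arithmetic behind the comparison is simple enough to sketch. The two constants are the February 2026 measurements cited above; the function names and the run count in the usage note are illustrative only.

```python
MCP_TOKENS_PER_TASK = 114_000  # playwright-mcp, measurement cited in the article
CLI_TOKENS_PER_TASK = 27_000   # Playwright CLI, same task


def token_ratio(mcp=MCP_TOKENS_PER_TASK, cli=CLI_TOKENS_PER_TASK):
    """How many times more tokens playwright-mcp used on the measured task."""
    return mcp / cli


def extra_tokens(runs, mcp=MCP_TOKENS_PER_TASK, cli=CLI_TOKENS_PER_TASK):
    """Additional tokens consumed over `runs` executions of the same task."""
    return runs * (mcp - cli)
```

For example, `extra_tokens(1000)` gives the overhead for a thousand runs of the measured task. Other tasks will have different snapshot sizes, so treat the ratio as a single data point.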

What this means for you

If you tested locally on Claude Code and are now deploying to a cloud-hosted agent or container: Your local tests ran on stdio transport. Your cloud deployment uses HTTP transport. The session destruction is transport-specific — your local passing tests do not predict cloud behavior. Your multi-step workflows will fail at step 2.
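One way to check this before deploying is to run the same multi-step workflow against the transport you will actually ship. The sketch below is transport-agnostic: `call_tool` is a hypothetical callable you wire to your own MCP client. browser_click and browser_fill are the tool names used in this article; browser_navigate and all argument shapes are assumptions for illustration.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

Step = Tuple[str, dict]

# Illustrative workflow; adjust tool names and arguments to your server version.
WORKFLOW: Sequence[Step] = (
    ("browser_navigate", {"url": "https://example.com/login"}),
    ("browser_fill", {"selector": "#user", "value": "demo"}),
    ("browser_click", {"selector": "#submit"}),
)


def first_failing_step(
    call_tool: Callable[[str, dict], Any],
    steps: Sequence[Step] = WORKFLOW,
) -> Optional[Tuple[int, str]]:
    """Run steps in order; return (1-based index, error text) of the first failure."""
    for index, (tool, args) in enumerate(steps, start=1):
        try:
            call_tool(tool, args)
        except Exception as exc:  # any client error counts as a failed step
            return index, str(exc)
    return None
```

A result of `(2, "Session not found")` on the HTTP deployment, alongside `None` on stdio, would match the failure described here.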

If you configured auth persistence via userDataDir in a config file: Your auth state is not persisting. The browser launches with an ephemeral in-memory profile, discards auth state on session close, and you never see an error. The --profile CLI flag works. The config file path does not.
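Because the failure is silent, a cheap check is to look at the profile directory after a session closes. This is a rough sketch: it only tests whether anything was written under the path you configured, and the file name in the demo is a placeholder, not a claim about what the browser writes.

```python
import tempfile
from pathlib import Path


def profile_has_state(profile_dir) -> bool:
    """True if the directory exists and contains at least one file, at any depth."""
    root = Path(profile_dir)
    return root.is_dir() and any(p.is_file() for p in root.rglob("*"))


def demo():
    """An empty directory reports False; one with any file reports True."""
    with tempfile.TemporaryDirectory() as tmp:
        before = profile_has_state(tmp)
        (Path(tmp) / "placeholder-state").write_text("x")
        after = profile_has_state(tmp)
    return before, after
```

An empty or missing directory after a logged-in session suggests the browser ran with an ephemeral profile. A non-empty one is necessary but not sufficient, so still confirm the login survives a restart.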

If you are running high-frequency CI against enterprise apps: At 114K tokens per test vs 27K for Playwright CLI, your token costs are roughly 4.2x what they need to be on the same task. For apps with console errors, the 0.0.30→0.0.32 console logging change may have caused a further 6x spike on specific pages.

If your workflows verify elements inside iframes: On versions below v0.0.69, browser_verify_text_visible and browser_verify_element_visible produce false negatives on iframe elements while action tools succeed on the same elements. This produces misleading test failures. Upgrade to v0.0.69 or later.
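A version gate can make this explicit in a test harness. The sketch below assumes plain `major.minor.patch` version strings with an optional leading `v`; it does not handle pre-release suffixes, and the function names are illustrative.

```python
IFRAME_VERIFY_FIXED_IN = (0, 0, 69)  # fix version reported for Issue #1394


def parse_version(text):
    """Turn '0.0.69' or 'v0.0.69' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def iframe_verify_affected(version):
    """True if this playwright-mcp version predates the iframe verify fix."""
    return parse_version(version) < IFRAME_VERIFY_FIXED_IN
```

When this returns True, a harness could skip the verify tools for iframe content and use a fallback check instead of reporting a failure.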

What to do

Local coding agents (Claude Code, Cursor, Codex): Use Playwright CLI instead of playwright-mcp. It saves snapshots to disk rather than streaming them into context — 27K tokens vs 114K on the same task.
Cloud-hosted agents without filesystem access (ChatGPT plugins, Claude.ai workflows): Use --cdp-endpoint to attach playwright-mcp to an externally-managed persistent Chrome process, bypassing the HTTP session lifecycle. The --shared flag (merged PR) is an alternative — verify it resolves the session bug for your specific client configuration.
Auth persistence: Use the --profile CLI flag, not userDataDir in a config file. Test that auth state survives session close before deploying.
Iframe verification: Upgrade to v0.0.69 or later. If pinned to an earlier version, confirm success via the next action’s outcome or JavaScript evaluation as a fallback.
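The flag-based recommendations above can be captured in a small launcher helper. This is a sketch under assumptions: the flag names are taken from this article, the space-separated value syntax is assumed, and `build_mcp_command` is a hypothetical helper. Confirm the flags against the `--help` output of your installed version before relying on it.

```python
from typing import List, Optional


def build_mcp_command(
    cdp_endpoint: Optional[str] = None,
    profile_dir: Optional[str] = None,
) -> List[str]:
    """Build an argv list for launching playwright-mcp with the suggested flags."""
    command = ["npx", "@playwright/mcp@latest"]
    if cdp_endpoint:
        # Attach to an externally managed Chrome instead of a per-session browser.
        command += ["--cdp-endpoint", cdp_endpoint]
    if profile_dir:
        # Flag name as given in this article; the config-file route is reported broken.
        command += ["--profile", profile_dir]
    return command
```

Returning a list rather than a shell string keeps the result safe to pass to `subprocess.run` without quoting issues.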

Evidence

Tool	Version	Method	Notes
microsoft/playwright-mcp	current (March 2026)	source-verified	Session destruction on HTTP — #1045 (closed), #1140 (closed) — resolution unclear, verify against current version
microsoft/playwright-mcp	Feb 2026 benchmark	source-verified	114K tokens/test vs 27K for CLI (ScrollTest, same task)
microsoft/playwright-mcp	0.0.30–0.0.32	source-verified	6x token spike from console logging — #889
microsoft/playwright-mcp	current (March 2026)	source-verified	userDataDir silently ignored — #1446, open
microsoft/playwright-mcp	current (March 2026)	source-verified	5-second ping timeout not configurable — #982, closed (resolution unconfirmed)
microsoft/playwright-mcp	current (March 2026)	source-verified	Verify tools failing in iframes — fixed in v0.0.69 (March 30, 2026) — #1394
Claude Code (stdio)	current	source-reviewed	Unaffected — maintains heartbeats, uses stdio transport

Confidence: empirical — all failure modes traced to public GitHub issues with reproductions. Session destruction (#1045, #1140 — both closed, resolution unclear), token benchmark (ScrollTest, Feb 2026), console logging spike (#889), userDataDir bug (#1446), ping timeout (#982 — closed, resolution unconfirmed), iframe verify failure (#1394 — fixed in v0.0.69, March 30, 2026). Fact-checked 2026-04-19 against live GitHub issue state.

Falsification criterion: The session destruction claim would be disproved by demonstrating that HTTP transport maintains session state across calls in a containerized environment without --cdp-endpoint. The token cost claim would be disproved by a measurement showing playwright-mcp and Playwright CLI produce equivalent token counts on the same task.

Open questions: Does the --shared flag fully resolve the HTTP session bug across all client configurations, or only specific ones? Were the closed issues (#1045, #1140, #982) actually fixed, and if so in which versions?
