The Playwright MCP server exposes Playwright's browser automation through the Model Context Protocol (MCP). Instead of relying on screenshots or vision models, it works from the page's accessibility tree, so an LLM can interact with web pages deterministically using structured data.

When to use it

Use the Playwright MCP server when you want an agent to:
  • Automate browser actions such as navigating, clicking, typing, and filling forms.
  • Extract structured page context via snapshots, avoiding the ambiguity of pixel-based methods.
  • Test and verify UI elements, text, or values without vision models.
  • Capture console logs, network requests, PDFs, or traces during automated workflows.
  • Manage tabs, dialogs, and file uploads in real-time web automation.

Setup

  • Requirements:
    • Node.js 18 or newer
    • An MCP-compatible client (VS Code, Cursor, Windsurf, Claude Desktop, Goose, etc.)
  • Installation:
    Add the Playwright MCP server to your MCP client's server configuration.
  • Optional capabilities (enabled via --caps):
    • vision → coordinate-based mouse actions
    • pdf → save pages as PDF
    • verify → element/text/value verification
    • tracing → start/stop browser tracing
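
As an illustration of the installation and --caps options above, many MCP clients accept a JSON entry that launches the server as a subprocess. The sketch below builds such an entry in Python. The "mcpServers" key and the `npx @playwright/mcp@latest` launch command follow the common client convention and are assumptions here, so check them against your client's documentation. The function name build_client_config is hypothetical.

```python
import json

# Capabilities listed in this document as opt-in via --caps.
KNOWN_CAPS = {"vision", "pdf", "verify", "tracing"}


def build_client_config(caps=()):
    """Return an MCP client config entry that launches the Playwright MCP server.

    Assumes the client reads an "mcpServers" mapping and that the server is
    started with `npx @playwright/mcp@latest`; verify both for your client.
    """
    unknown = set(caps) - KNOWN_CAPS
    if unknown:
        raise ValueError(f"unknown capabilities: {sorted(unknown)}")
    args = ["@playwright/mcp@latest"]
    if caps:
        # Several capabilities are passed as one comma-separated value.
        args.append("--caps=" + ",".join(caps))
    return {"mcpServers": {"playwright": {"command": "npx", "args": args}}}


if __name__ == "__main__":
    # Print a config entry with the pdf and verify capabilities enabled.
    print(json.dumps(build_client_config(["pdf", "verify"]), indent=2))
```

Rejecting unknown capability names up front is a design choice of this sketch: a typo fails when the config is built instead of surfacing later as a missing tool.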

Tools

Core interaction

  • browser_click — Click (or double click) on an element.
  • browser_hover — Hover over an element.
  • browser_type — Type text into an editable element, with optional submit/slow typing.
  • browser_fill_form — Fill multiple form fields at once.
  • browser_select_option — Select one or more dropdown values.
  • browser_press_key — Press a keyboard key.
  • browser_drag — Perform drag-and-drop between elements.
  • browser_file_upload — Upload one or multiple files.
  • browser_navigate — Navigate to a specific URL.
  • browser_navigate_back — Go back to the previous page.
  • browser_tabs — List, create, close, or select tabs.
  • browser_close — Close the current page.
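
An MCP client invokes each of these tools by sending a JSON-RPC 2.0 "tools/call" request to the server. The sketch below builds such requests in Python. The helper name tool_call is hypothetical, and the argument names (url, element, ref) and the ref value "e12" are illustrative assumptions; the authoritative argument schema is whatever the server reports in its tool listing.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session gets a distinct id


def tool_call(name, **arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Argument names and the ref value are assumptions for illustration.
    navigate = tool_call("browser_navigate", url="https://example.com")
    click = tool_call("browser_click", element="Submit button", ref="e12")
    print(json.dumps([navigate, click], indent=2))
```

In practice an MCP client library handles this framing; the payload is spelled out here only to show what travels over the wire.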

Page context & capture

  • browser_snapshot — Capture structured accessibility snapshot (preferred for automation).
  • browser_take_screenshot — Take a screenshot of viewport, full page, or element.
  • browser_pdf_save (opt-in via --caps=pdf) — Save page as PDF.
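
To show why a structured snapshot is easier to automate against than a screenshot, the sketch below extracts element refs from snapshot text with a regular expression. It assumes each snapshot line looks like `- button "Submit" [ref=e12]`. That line shape and the sample text are assumptions made for illustration, so inspect real browser_snapshot output before relying on this pattern. The name find_refs is hypothetical.

```python
import re

# Assumed line shape: - <role> "<accessible name>" ... [ref=<id>]
LINE_RE = re.compile(
    r'-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)".*?\[ref=(?P<ref>[^\]]+)\]'
)


def find_refs(snapshot, role=None):
    """Return (role, name, ref) tuples from snapshot text, optionally filtered by role."""
    found = []
    for line in snapshot.splitlines():
        m = LINE_RE.search(line)
        if m and (role is None or m.group("role") == role):
            found.append((m.group("role"), m.group("name"), m.group("ref")))
    return found


# Hypothetical snapshot text, not captured from a real page.
SAMPLE = """
- heading "Sign in" [level=1] [ref=e2]
- textbox "Email" [ref=e5]
- button "Submit" [ref=e12]
"""

if __name__ == "__main__":
    print(find_refs(SAMPLE, role="button"))
```

A ref found this way can then be passed, together with a human-readable description, to an interaction tool such as browser_click.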

Evaluation & debugging

  • browser_evaluate — Run JavaScript in page context.
  • browser_console_messages — Return console messages.
  • browser_network_requests — Return all network requests since load.
  • browser_resize — Resize the browser window.
  • browser_handle_dialog — Accept/decline modal dialogs or prompts.

Verification (opt-in via --caps=verify)

  • browser_verify_element_visible — Verify an element is visible by role + accessible name.
  • browser_verify_text_visible — Verify a text string is visible.
  • browser_verify_list_visible — Verify a list with expected items is visible.
  • browser_verify_value — Verify element values (e.g., checkbox state, input value).

Coordinate-based (opt-in via --caps=vision)

  • browser_mouse_click_xy — Click at a coordinate.
  • browser_mouse_drag_xy — Drag mouse between coordinates.
  • browser_mouse_move_xy — Move mouse to a coordinate.

Tracing (opt-in via --caps=tracing)

  • browser_start_tracing — Start trace recording.
  • browser_stop_tracing — Stop trace recording.

Installation

  • browser_install — Install the required browser binaries if not already present.

Notes

  • Prefer browser_snapshot over screenshots for interaction: it is structured, fast, and deterministic.
  • Many tools require both a human-readable element description and the element's exact ref from the page snapshot, for safety and determinism.
  • Optional capabilities (vision, pdf, verify, tracing) must be explicitly enabled when starting the server.
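
Because opt-in tools are unavailable unless their capability is enabled, it can help to check a planned workflow against the enabled capabilities before starting the server. The sketch below does this with a lookup table built from the tool lists in this document. The function name missing_caps is hypothetical.

```python
# Tool -> required capability, as listed in this document.
TOOL_CAPS = {
    "browser_pdf_save": "pdf",
    "browser_verify_element_visible": "verify",
    "browser_verify_text_visible": "verify",
    "browser_verify_list_visible": "verify",
    "browser_verify_value": "verify",
    "browser_mouse_click_xy": "vision",
    "browser_mouse_drag_xy": "vision",
    "browser_mouse_move_xy": "vision",
    "browser_start_tracing": "tracing",
    "browser_stop_tracing": "tracing",
}


def missing_caps(tools, enabled):
    """Return the sorted capabilities that `tools` need but `enabled` lacks.

    Tools absent from TOOL_CAPS are treated as core tools needing no capability.
    """
    needed = {TOOL_CAPS[t] for t in tools if t in TOOL_CAPS}
    return sorted(needed - set(enabled))


if __name__ == "__main__":
    plan = ["browser_navigate", "browser_pdf_save", "browser_stop_tracing"]
    print(missing_caps(plan, enabled=["pdf"]))
```

Any capability this reports as missing would need to be added to the --caps value used when starting the server.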