|
1 | 1 | # Playwright Screenshot Example
|
2 | 2 |
|
3 |
| -This example demonstrates how to use the OpenAI Agents SDK with Playwright MCP (Machine Control Protocol) to automate browser interactions and capture screenshots. |
| 3 | +This example uses the [Playwright MCP server](https://github.com/modelcontextprotocol/servers/tree/main/src/playwright), running locally via `npx`. |
4 | 4 |
|
5 |
| -## Features |
6 |
| - |
7 |
| -- Navigates to websites |
8 |
| -- Captures full-page screenshots |
9 |
| -- Saves screenshots with descriptive filenames |
10 |
| -- Reports file information |
11 |
| - |
12 |
| -## Requirements |
13 |
| - |
14 |
| -- Node.js and npm |
15 |
| -- Python 3.9+ with the OpenAI Agents SDK installed |
16 |
| - |
17 |
| -## Installation |
18 |
| - |
19 |
| -Before running this example, make sure you have the Playwright MCP package available: |
| 5 | +Run it via: |
20 | 6 |
|
21 | 7 | ```bash
|
22 |
| -# This will be installed automatically when you run the example |
23 |
| -npm install -g @playwright/mcp |
| 8 | +uv run python examples/mcp/playwright_example/main.py |
24 | 9 | ```
|
25 | 10 |
|
26 |
| -## Usage |
| 11 | +## Details |
27 | 12 |
|
28 |
| -Run the example: |
| 13 | +The example uses the `MCPServerStdio` class from `agents.mcp`, with the command: |
29 | 14 |
|
30 | 15 | ```bash
|
31 |
| -python main.py |
| 16 | +npx -y "@playwright/mcp@latest" --headless |
32 | 17 | ```
|
33 | 18 |
|
34 |
| -The script will: |
35 |
| -1. Launch a headless Playwright browser via the MCP server |
36 |
| -2. Navigate to OpenAI's website |
37 |
| -3. Capture screenshots |
38 |
| -4. Save them to the `screenshots` directory |
| 19 | +The script demonstrates browser automation capabilities by: |
| 20 | +1. Navigating to example.com and httpbin.org |
| 21 | +2. Taking full-page screenshots |
| 22 | +3. Saving them to a local `screenshots` directory with descriptive filenames |
| 23 | + |
| 24 | +Under the hood: |
| 25 | + |
| 26 | +1. The server is spun up in a subprocess, and exposes Playwright tools for browser automation |
| 27 | +2. We add the server instance to the Agent via `mcp_servers` |
| 28 | +3. Each time the agent runs, we call out to the MCP server to fetch the list of tools via `server.list_tools()` |
| 29 | +4. If the LLM chooses to use an MCP tool, we call the MCP server to run the tool via `server.call_tool()`
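
The "under the hood" steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of `main.py`: the `npx` command comes from the README, while the agent name, instructions, prompt, and the `PLAYWRIGHT_PARAMS` constant are assumptions made for the example. It presumes the OpenAI Agents SDK is installed and Node.js is on the `PATH`.

```python
import asyncio

# Command and flags as given in the README; the constant name is illustrative.
PLAYWRIGHT_PARAMS = {
    "command": "npx",
    "args": ["-y", "@playwright/mcp@latest", "--headless"],
}


async def main() -> None:
    # Imported lazily so this module can be loaded without the SDK installed.
    from agents import Agent, Runner
    from agents.mcp import MCPServerStdio

    # Step 1: the server is started as a subprocess for the duration of the block.
    async with MCPServerStdio(name="Playwright", params=PLAYWRIGHT_PARAMS) as server:
        # Step 2: attach the server instance to the agent via `mcp_servers`.
        agent = Agent(
            name="Assistant",  # hypothetical name
            instructions="Use the browser tools to capture screenshots.",  # hypothetical
            mcp_servers=[server],
        )
        # Steps 3-4: during the run, the SDK lists the server's tools and
        # invokes whichever ones the model selects.
        result = await Runner.run(
            agent, "Take a full-page screenshot of https://example.com"  # hypothetical prompt
        )
        print(result.final_output)
```

To try it, call `asyncio.run(main())`. Using the server as an async context manager keeps the subprocess lifetime tied to the block, so the browser is shut down even if the run raises.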
39 | 30 |
|
40 | 31 | ## Customization
|
41 | 32 |
|
|