Every browser automation project I've worked on has broken within three months. Not because the code was wrong — because the target site changed its layout. OpenClaw's browser skill eliminates that problem. The agent reads the page content and reasons about what to click or extract, rather than following a brittle script of DOM selectors.
Setting Up the OpenClaw Browser Skill
The browser skill requires two things: the skill declared in your config and Playwright installed in your environment. That's the full prerequisite list. No external service, no API key, no separate browser farm.
# Install Playwright Chromium browser
npx playwright install chromium
# Verify installation
npx playwright --version
Once Playwright is installed, add the browser skill to your CLAUDE.md:
# CLAUDE.md
skills:
  - browser      # enables navigate, click, type, screenshot, extract
  - file_write   # for saving extracted data
browser_config:
  headless: true        # set false for debugging
  timeout: 30000        # 30 second page load timeout
  viewport: "1280x800"  # standard desktop viewport
That's all the configuration needed to start. The agent now has access to browser_navigate, browser_click, browser_type, browser_screenshot, and browser_extract tool calls.
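To make the browser_config values concrete, here is a minimal Python sketch of how they could be validated and normalized before a browser is launched. The function name and the shape of the returned dictionary are illustrative assumptions, not OpenClaw's internal code; only the three keys (headless, timeout, viewport) come from the config above.

```python
def parse_browser_config(config: dict) -> dict:
    """Validate a browser_config mapping and normalize its values.

    Hypothetical helper: OpenClaw's real parsing logic is not shown in this article.
    """
    headless = bool(config.get("headless", True))

    timeout_ms = int(config.get("timeout", 30000))
    if timeout_ms <= 0:
        raise ValueError("timeout must be a positive number of milliseconds")

    # The viewport is written as "WIDTHxHEIGHT", for example "1280x800".
    raw_viewport = str(config.get("viewport", "1280x800")).lower()
    width_text, sep, height_text = raw_viewport.partition("x")
    if not sep or not width_text.isdigit() or not height_text.isdigit():
        raise ValueError("viewport must look like '1280x800'")

    return {
        "headless": headless,
        "timeout_ms": timeout_ms,
        # Playwright takes the viewport as a width/height mapping.
        "viewport": {"width": int(width_text), "height": int(height_text)},
    }
```

Failing early on a malformed viewport or timeout is cheaper than discovering the problem halfway through a run.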
headless: false opens a visible browser window during development. You can watch exactly what the agent is doing: which clicks it attempts, where it gets confused, what the page actually looks like. Switch to headless only after the workflow runs correctly in visible mode.
Common Browser Automation Task Patterns
Form Filling and Submission
The agent identifies form fields by reading their labels, placeholder text, and surrounding context — not by CSS selectors. This means form automation survives site redesigns that would break a traditional Playwright script.
A typical form filling instruction in your system prompt:
system: |
  Navigate to {{FORM_URL}}.
  Fill in the contact form with these values:
  - Name: {{NAME}}
  - Email: {{EMAIL}}
  - Message: {{MESSAGE}}
  Submit the form and confirm success by checking for a confirmation message.
  If submission fails, screenshot the error and stop.
The agent reads the page, identifies the correct fields by label matching, fills them, submits, and verifies. No selector maintenance required.
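A small Python sketch shows why label matching holds up when markup changes. It is not how the agent works internally (the agent reasons over the page content); it only demonstrates that looking a field up by its visible label keeps working when ids, classes, and nesting change. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LabelIndex(HTMLParser):
    """Collect a mapping from visible label text to the input id it points at."""

    def __init__(self):
        super().__init__()
        self.fields = {}          # normalized label text -> the label's "for" target
        self._current_for = None  # "for" target of the <label> being read, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "label":
            self._current_for = dict(attrs).get("for")
            self._buffer = []

    def handle_data(self, data):
        if self._current_for is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "label" and self._current_for is not None:
            text = " ".join("".join(self._buffer).split()).lower()
            self.fields[text] = self._current_for
            self._current_for = None


def find_field(html, label):
    """Return the id of the input whose label matches, ignoring case and spacing."""
    index = LabelIndex()
    index.feed(html)
    return index.fields.get(" ".join(label.split()).lower())
```

A selector such as #f1 breaks as soon as the id is renamed, while find_field(html, "Name") returns whichever id the redesigned page uses for that label.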
Multi-Page Navigation and Data Extraction
For pagination workflows — extracting data from a listing that spans multiple pages — the agent navigates page by page, extracts the data structure from each page, and accumulates results. The pattern works for search results, product listings, directory pages, and any paginated content.
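The navigate, extract, accumulate loop can be sketched in a few lines of Python. Here fetch_page is a hypothetical callable standing in for "load this page and return its items plus the next page URL". The cap of 20 pages mirrors the navigation limit used in the production config later in this article.

```python
from typing import Callable, Optional


def collect_all_pages(
    fetch_page: Callable[[str], tuple],
    start_url: str,
    max_pages: int = 20,
) -> list:
    """Follow "next" links from start_url, gathering items from each page.

    fetch_page is a hypothetical callable returning (items, next_url_or_None).
    """
    results = []
    seen = set()  # URLs already visited, to guard against pagination loops
    url: Optional[str] = start_url
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        items, url = fetch_page(url)
        results.extend(items)
    return results
```

The seen set matters in practice: some sites link the last page back to the first, and without it the run would loop until the page cap is hit.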
We'll get to the anti-bot considerations in a moment — but first understand that this pattern assumes the target site permits automated access. Always check robots.txt and terms of service before building any scraping workflow.
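The robots.txt check can be automated with Python's standard library. The sketch below takes the robots.txt body as a string so it can be exercised without network access; the is_allowed name and the example rules are illustrative.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Against a live site, RobotFileParser can download the file itself through set_url() and read(). A passing check covers robots.txt only; the terms of service still need to be read by a person.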
Login-Gated Content
The browser skill handles authenticated sessions. The agent navigates to the login page, enters credentials from environment variables (never hardcoded), and maintains the session cookie for subsequent page visits. This enables automation on internal dashboards, member-only content, and account-specific data.
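As an illustration of keeping secrets out of the config file, the following Python sketch resolves {{NAME}} placeholders from environment variables and fails loudly when one is missing. Whether OpenClaw performs the substitution exactly this way is an assumption; the function name and the variable names in the usage note are hypothetical.

```python
import os
import re

# Matches placeholders such as {{DASHBOARD_PASSWORD}}.
_PLACEHOLDER = re.compile(r"\{\{([A-Z_][A-Z0-9_]*)\}\}")


def resolve_placeholders(text: str, env=None) -> str:
    """Replace each {{NAME}} with the value of environment variable NAME."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            # Refuse to continue rather than submit an empty or literal placeholder.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, text)
```

With DASHBOARD_USER exported in the shell, resolve_placeholders("user={{DASHBOARD_USER}}") returns the filled-in string, and the committed file contains only the placeholder.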
Reference credentials through environment variables using the {{ENV_VAR_NAME}} syntax. Config files with hardcoded passwords get committed to version control. This is the most common security mistake in browser automation setups.
Playwright Integration Details
OpenClaw's browser skill is built on Playwright, which means you have access to its full capability set when you need lower-level control. For most use cases, the high-level skill interface is sufficient. For edge cases, you can pass raw Playwright instructions through the code execution skill.
The key Playwright behaviors that affect OpenClaw browser automation:
- Network idle waiting: Playwright can wait for network activity to settle (the networkidle load state) before treating a page as loaded. This helps with SPAs and lazy-loaded content.
- Auto-scroll: the agent can instruct Playwright to scroll to trigger lazy-loaded images or infinite scroll pagination.
- Frame handling: iframes are accessible, including embedded forms and third-party widgets.
- File downloads: Playwright can capture file download events and save files to a specified path.
- PDF generation: pages can be printed to PDF (Playwright supports this in Chromium only), useful for report capture workflows.
For advanced scenarios — intercepting network requests, modifying response headers, injecting JavaScript — use the code execution skill alongside browser to access the raw Playwright API:
# Advanced: intercept API responses during navigation
system: |
  Use the browser skill to navigate to {{URL}}.
  Then use code_exec to intercept the /api/data endpoint response
  and extract the JSON payload before it renders to the page.
  Save the raw JSON to output/api-response.json.
skills:
  - browser
  - code_exec
  - file_write
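For a sense of what the code_exec step might run, here is a sketch using Playwright's Python sync API. It assumes the playwright Python package and its Chromium build are installed, which is separate from the npx install shown earlier. The /api/data path and the output path come from the prompt above; the function names are hypothetical, and the code the agent actually generates may look different.

```python
import json
from pathlib import Path
from urllib.parse import urlparse


def is_api_data_response(url: str) -> bool:
    """True when the URL path is the /api/data endpoint named in the prompt."""
    return urlparse(url).path.rstrip("/") == "/api/data"


def capture_api_response(page_url: str, out_path: str = "output/api-response.json") -> Path:
    """Load page_url and save the JSON body of the /api/data response."""
    # Imported here so the helper above stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Start listening before navigating so the response is not missed.
            with page.expect_response(lambda r: is_api_data_response(r.url)) as info:
                page.goto(page_url, timeout=30000)
            payload = info.value.json()
        finally:
            browser.close()

    target = Path(out_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return target
```

Matching on the parsed URL path rather than a substring avoids accidentally capturing look-alike endpoints such as /api/database.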
Anti-Bot Considerations for Production Use
Standard headless Playwright is detectable. Cloudflare, DataDome, and similar services identify it through browser fingerprint signals — missing plugins, inconsistent timing, headless-specific navigator properties. The detection rate on aggressively protected sites is high.
OpenClaw mitigates this through several built-in behaviors: realistic timing delays between actions, human-like mouse movement simulation, and standard user-agent strings. These measures handle basic bot detection on most sites.
For sites with aggressive protection, additional measures are needed:
- Playwright-stealth plugin: patches the headless browser fingerprint to match a real Chrome instance. Install via playwright-extra and configure it in the browser skill options.
- Residential proxy routing: routes browser traffic through residential IPs. Required for IP-rate-limited sites. Configure in browser_config.proxy.
- Request rate limiting: add explicit delays between page requests in your system prompt. "Wait 2–4 seconds between each page navigation" reduces detection risk significantly.
- Session persistence: reuse browser contexts across runs to maintain cookies and avoid repeated login detection events.
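When navigation is driven from code rather than from the prompt, the 2–4 second wait from the rate-limiting item can be enforced directly. The sketch below is one possible helper, not part of OpenClaw. The sleep parameter exists only so the function can be exercised without actually waiting.

```python
import random
import time


def polite_pause(min_seconds=2.0, max_seconds=4.0, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    if not 0 <= min_seconds <= max_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    # A randomized delay avoids the perfectly regular cadence of a fixed sleep.
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay
```

Call polite_pause() between page navigations. The randomization is the point: a fixed two-second interval is itself a recognizable pattern.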
Be clear-eyed about what OpenClaw alone achieves. It handles the majority of real-world browser automation tasks without issue. For the subset of high-security targets, plan for additional infrastructure.
Production-Ready Browser Automation Patterns
A browser automation workflow that runs well in testing will fail in production without explicit reliability patterns. Here is the configuration structure we've found works consistently across different target sites and task types:
# Production browser automation config
system: |
  You are a browser automation agent. Follow these rules on every run:
  RATE LIMITING: Wait 2-3 seconds between page navigations. Never make
  more than 20 navigation requests per run.
  TIMEOUTS: If a page doesn't load within 30 seconds, log the URL and skip it.
  Continue with remaining URLs.
  ERRORS: If an extraction fails, log: "FAILED: [url] - [reason]"
  then continue. Do not stop the run on single-page failures.
  SCREENSHOTS: Take a screenshot on any unexpected state (login redirect,
  captcha page, error page). Save to output/screenshots/{{TIMESTAMP}}.png
  OUTPUT: Write all extracted data to output/results-{{DATE}}.json
  Append, never overwrite, if the file already exists.
skills:
  - browser
  - file_write
browser_config:
  headless: true
  timeout: 30000
  viewport: "1280x800"
This configuration handles the four failure modes that break most production browser automation: timeouts, bot detection redirects, single-page failures, and output overwrites. Each is addressed explicitly before the first production run.
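The ERRORS and OUTPUT rules are easy to state in a prompt and easy to get subtly wrong in code, so here is a hedged Python sketch of both. The function names are hypothetical, and it assumes the results file holds a single JSON array; the failure message format is the one given in the config above.

```python
import json
from pathlib import Path


def format_failure(url: str, reason: str) -> str:
    """Build the log line format used in the ERRORS rule."""
    return f"FAILED: {url} - {reason}"


def append_results(path, new_records: list) -> int:
    """Append records to a JSON array file, creating it if needed.

    Returns the total number of records now stored in the file.
    """
    target = Path(path)
    existing = []
    if target.exists():
        existing = json.loads(target.read_text(encoding="utf-8"))
        if not isinstance(existing, list):
            raise ValueError(f"{target} does not contain a JSON array")

    combined = existing + list(new_records)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file first so a crash cannot leave a half-written results file.
    tmp = target.with_suffix(target.suffix + ".tmp")
    tmp.write_text(json.dumps(combined, indent=2), encoding="utf-8")
    tmp.replace(target)
    return len(combined)
```

Rewriting the whole array on each append is fine for a run capped at 20 pages. For much larger outputs, a line-per-record format would avoid rereading the file every time.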
Common Mistakes in Browser Automation Setups
- Hardcoding credentials in the config file. Use environment variables. Always.
- No timeout configuration. Without timeouts, a slow-loading page stalls the entire run indefinitely. Set explicit page load timeouts and extraction timeouts.
- Testing only on the target site in isolation. Browser automation behavior changes when the agent is running multiple tasks. Test the full pipeline under load before declaring it production-ready.
- Skipping robots.txt review. Before building any browser automation, verify the target site permits automated access. Terms of service violations can result in IP bans and legal issues.
- No logging or screenshot capture. When browser automation fails, you need to know why. Screenshots on unexpected states are the fastest debugging tool available.
- Using browser skill when firecrawl would suffice. Browser automation is slower and more resource-intensive than firecrawl. For read-only content extraction from accessible pages, firecrawl is faster and more reliable.
Frequently Asked Questions
How do I set up the OpenClaw browser skill?
Add 'browser' to your skills list in CLAUDE.md and ensure Playwright is installed in your environment (npx playwright install chromium). The browser skill launches a headless Chromium instance and gives the agent navigate, click, type, and extract capabilities automatically.
Can OpenClaw browser automation handle JavaScript-heavy sites?
Yes. OpenClaw uses Playwright under the hood, which executes JavaScript fully before extracting content. Sites that load data via fetch or XHR are handled correctly — the agent waits for network activity to settle before reading the page, unlike static scrapers that miss dynamic content.
Does OpenClaw browser automation get blocked by anti-bot systems?
Standard headless Playwright gets detected by aggressive bot protection. OpenClaw mitigates this through standard user-agent strings, human-like mouse movement, and realistic timing delays between actions. For high-stakes scraping against Cloudflare-protected sites, you need additional stealth plugins or a residential proxy. OpenClaw alone is not sufficient.
What is the difference between the browser skill and firecrawl in OpenClaw?
The browser skill gives the agent full interactive control — it can click, fill forms, navigate, and scrape. Firecrawl is read-only content extraction optimized for speed and structured output. Use browser when you need interaction; use firecrawl when you only need to extract content from accessible pages.
How do I make OpenClaw browser automation production-ready?
Add explicit rate limiting instructions to your system prompt, define maximum pages per run, specify retry logic for timeouts, and run in a container with a fixed Playwright version. Log every navigation action for debugging. Test failure behavior with intentional 404s and timeouts before deploying.
Can OpenClaw fill out forms automatically?
Yes. The browser skill can identify input fields by label, placeholder, or position, type values, select dropdown options, and click submit buttons. It handles multi-page forms by following the submission flow. Always test form automation in a staging environment before running against production forms.
You now have the full browser automation setup from installation through production hardening. The semantic approach to element identification is OpenClaw's biggest advantage over traditional Playwright scripts — no selector maintenance, no breakage on site updates. Get Playwright installed, set headless: false, and run your first navigation task today. The feedback loop is immediate and the results speak for themselves.