Browser-CLI: Let Your AI Agent Control the Browser from the Command Line


Ever wanted your AI coding assistant to actually use a browser? Not just read web pages, but click buttons, fill forms, take screenshots, and extract data — all from the terminal?

That's exactly why I built Browser-CLI.

What is it?

Browser-CLI is a Go-based command-line tool that wraps Playwright to give AI agents full browser control through simple shell commands. No API keys, no browser extensions, no complex setup — just run a command and you're off.

👉 GitHub: https://github.com/zmysysz/browser-cli

⭐ Stars and feedback are appreciated!

# Install
git clone https://github.com/zmysysz/browser-cli
cd browser-cli && make build && make install
make setup-browsers  # first time only

# Use
browser-cli navigate https://example.com
browser-cli fill "#search" "hello world"
browser-cli click "button[type=submit]"
browser-cli text

Why not just use Playwright directly?

Playwright is great, but it's a library — you need to write code to use it. Browser-CLI turns it into a universal CLI interface that any AI agent can call without writing a single line of automation code.

This means:

  • Claude Code can browse the web
  • OpenAI Codex can fill forms and extract data
  • Cursor can take screenshots and interact with pages
  • Any AI agent can automate browser tasks through shell commands

Key Features

  • 🤖 AI-First Design — Structured JSON output, auto-managed server, clear command semantics
  • 🔒 Session Isolation — Each agent gets its own browser instance via --session
  • 🍪 Cookie Persistence — Auto save/load, login states preserved across sessions
  • 🌐 Proxy Support: --proxy http://host:port for restricted networks
  • 🎯 Web Components: smart-click and pick for custom elements and Shadow DOM
  • ⌨️ Full Keyboard — Shortcuts, combos, Tab/Enter/Escape, Ctrl+A/C/V
  • 📄 PDF & Screenshot — Export pages as PDF or PNG
  • 📁 File Upload — Upload files to any <input type="file">
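
To make session isolation and proxy support concrete, here is a minimal sketch of a wrapper an agent harness might define. The `bcli` function and the `BROWSER_CLI` and `BROWSER_PROXY` variables are hypothetical names for this example, not part of Browser-CLI, and placing `--session` and `--proxy` before the subcommand is an assumption; check the project's README for the exact syntax.

```shell
# Hypothetical wrapper, not shipped with Browser-CLI.
# Assumption: --session and --proxy are accepted before the subcommand.
BROWSER_CLI="${BROWSER_CLI:-browser-cli}"

# Usage: bcli <session-name> <command> [args...]
bcli() {
  session="$1"
  shift
  if [ -n "${BROWSER_PROXY:-}" ]; then
    # Route this session's traffic through the proxy when one is configured.
    "$BROWSER_CLI" --session "$session" --proxy "$BROWSER_PROXY" "$@"
  else
    "$BROWSER_CLI" --session "$session" "$@"
  fi
}

# Two agents, each working in its own browser instance:
#   bcli agent-a navigate https://example.com
#   bcli agent-b navigate https://example.org
```

Keeping the session name as the first argument makes it hard for an agent to forget it and end up sharing a browser with another agent.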

30+ Commands at a Glance

| Category | Commands                                            |
|----------|-----------------------------------------------------|
| Navigate | navigate, back, forward, reload                     |
| Click    | click, click-js, smart-click, right-click, dblclick |
| Input    | fill, type, select, keyboard, upload                |
| Extract  | text, screenshot, elements, eval, pdf               |
| Utility  | wait, scroll, pick                                  |
| Tabs     | tab-new, tab-list, tab-switch, tab-close            |
| Dialogs  | dialog-status, dialog-accept, dialog-dismiss        |
| Session  | status, stop, session-list, cookie                  |

Integration with AI Tools

Browser-CLI ships with ready-to-use integration files:

| File                              | Tool             | How to Use               |
|-----------------------------------|------------------|--------------------------|
| integrations/claude/browser.md    | Claude Code      | Copy to .claude/commands/ |
| integrations/codex/browser-cli.md | OpenAI Codex     | Copy to ~/.codex/skills/ |
| AGENTS.md                         | Cursor, Windsurf | Already in project root  |
| skills/browser-cli/SKILL.md       | GAL              | Copy to ~/.gal/skills/   |
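
The copy steps listed above can be sketched as a small helper. `install_integration` is a hypothetical function written for this example; the file paths come from the list above, and the sketch assumes you run it from the root of a browser-cli checkout and that `.claude/commands` is relative to the project where you use Claude Code.

```shell
# Hypothetical helper: copy one integration file into the directory
# its tool reads from, creating that directory if needed.
install_integration() {
  src="$1"
  dest_dir="$2"
  # Fail early with a message if the source file is not where we expect it.
  [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/"
}

# Claude Code:
#   install_integration integrations/claude/browser.md .claude/commands
# OpenAI Codex:
#   install_integration integrations/codex/browser-cli.md "$HOME/.codex/skills"
# GAL:
#   install_integration skills/browser-cli/SKILL.md "$HOME/.gal/skills"
```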

Real-World Example

Here's how an AI agent can search GitHub and extract results:

# Navigate to GitHub
browser-cli navigate "https://github.com/search?q=browser+automation"

# Extract search results (the selector depends on GitHub's page markup and may need updating)
browser-cli eval "JSON.stringify(
  Array.from(document.querySelectorAll('.repo-list-item a.v-align-middle'))
  .map(a => ({name: a.textContent.trim(), url: a.href}))
)"

# Take a screenshot
browser-cli screenshot github-results.png
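
An agent usually wants to keep what it extracted rather than just print it. Below is a minimal sketch of a helper that saves a command's output to a file and reports failure if nothing came back. `capture` is a hypothetical function for this example, and it assumes the command writes its result to stdout and exits non-zero on failure, which the source does not spell out for Browser-CLI.

```shell
# Hypothetical helper: run a command, save its stdout to a file,
# and succeed only if the command succeeded AND produced output.
# Usage: capture <output-file> <command> [args...]
capture() {
  out_file="$1"
  shift
  "$@" > "$out_file" || return 1
  # -s is true only when the file exists and is not empty.
  [ -s "$out_file" ]
}

# Example (assumes browser-cli is installed and a page is already loaded):
#   capture page.txt browser-cli text || echo "extraction failed" >&2
```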

Architecture

Browser-CLI uses a client-server architecture over Unix sockets:

AI Agent → shell command → browser-cli (client) → Unix socket → server → Playwright → Browser

The server auto-starts on first command and stays running. Multiple agents can connect simultaneously with isolated sessions.
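
Because the server keeps running after the first command, a script may want to shut it down explicitly once its work is done. Here is a minimal sketch using the stop command from the command list. `with_browser` and `BROWSER_CLI` are hypothetical names for this example, and it assumes `stop` takes no required arguments and shuts down what the first command started.

```shell
# Hypothetical cleanup wrapper, not part of Browser-CLI.
BROWSER_CLI="${BROWSER_CLI:-browser-cli}"

# Run a task (any command or shell function), then stop the server
# even if the task failed, and return the task's own exit status.
# Usage: with_browser <command> [args...]
with_browser() {
  "$@"
  rc=$?
  # Assumption: `stop` needs no arguments. Its own failure is ignored here.
  "$BROWSER_CLI" stop >/dev/null 2>&1 || true
  return "$rc"
}

# Example: my_task is a function that issues several browser-cli commands.
#   with_browser my_task
```

Call it in a conditional (`if with_browser my_task; then ...`) or with `||` when the script runs under `set -e`, so a failing task does not abort the script before the cleanup runs.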

No CGO Required

Pure Go binary, compiles with CGO_ENABLED=0:

# Static Linux build
make build-static

# Cross-compile for Windows
CGO_ENABLED=0 GOOS=windows GOARCH=amd64 go build -o browser-cli.exe .

# Cross-compile for macOS
CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 go build -o browser-cli-mac .

Get Started

git clone https://github.com/zmysysz/browser-cli
cd browser-cli && make build && make install
make setup-browsers
browser-cli navigate https://example.com

Star ⭐ the repo if you find it useful! Feedback and contributions welcome.


This article was drafted with the help of an AI agent — but the tool itself was built by hand. 😉

Source: dev.to
