Browser Automation + /improve: AI Agents That Browse the Web and Fix Themselves

This week I shipped 5 versions of pydantic-deepagents — the modular agent runtime for Python. Today: the two features that close the loop — browser automation and session-based self-improvement.

Part 1: BrowserCapability — 9 Playwright Tools

pip install 'pydantic-deep[browser]'
playwright install chromium

from pydantic_deep import create_deep_agent
from pydantic_deep.capabilities import BrowserCapability

agent = create_deep_agent(
    model="anthropic:claude-opus-4-6",
    extra_capabilities=[BrowserCapability(
        allowed_domains=["github.com", "docs.python.org"],
        auto_screenshot=True,
    )]
)

The 9 tools: navigate, click, type_text, get_text, screenshot, scroll, go_back, go_forward, execute_js.

Safety design: a single tab (predictable state), a domain allowlist (the agent can't navigate outside the allowed domains), automatic popup interception, and content truncation to prevent context overflow.
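
To make the allowlist and truncation ideas concrete, here is a minimal sketch of how such checks could work. It is not the library's implementation: is_allowed, truncate, the subdomain-matching rule, and the 4000-character default are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one.

    Hypothetical helper: matching subdomains is an assumption, not documented
    pydantic-deep behavior.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


def truncate(text: str, max_chars: int = 4000) -> str:
    """Cap page text so a single tool result cannot flood the model context."""
    if len(text) <= max_chars:
        return text
    dropped = len(text) - max_chars
    return text[:max_chars] + f"\n[truncated {dropped} characters]"
```

Comparing the parsed hostname, rather than searching the raw URL string, is what keeps lookalike hosts such as github.com.evil.example from passing the check.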

Browser lifecycle: Chromium starts before the agent run, stops after — whether the run succeeds, fails, or is cancelled. No orphaned processes.
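
The "stops after, no matter what" guarantee is the classic try/finally pattern. A rough sketch under stated assumptions: FakeBrowser and browser_session are hypothetical stand-ins so the example runs without Playwright, and the real capability may manage the process differently.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeBrowser:
    """Stand-in for a real Chromium handle (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.running = False

    async def start(self) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False


@asynccontextmanager
async def browser_session(browser: FakeBrowser):
    """Start the browser before the run and always stop it afterwards."""
    await browser.start()
    try:
        yield browser
    finally:
        # Runs on success, on exceptions, and on task cancellation.
        await browser.stop()


async def demo() -> bool:
    """Simulate a failing agent run and report whether the browser is still up."""
    browser = FakeBrowser()
    try:
        async with browser_session(browser):
            raise RuntimeError("agent run failed")
    except RuntimeError:
        pass
    return browser.running
```

Calling asyncio.run(demo()) shows the browser stopped even though the simulated run raised.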

CLI:

pydantic-deep tui --browser --browser-headed   # visible window
pydantic-deep run "research X on GitHub" --browser --sandbox docker

Bug fix: Browser tools now force kind='function', so they never trigger approval dialogs mid-task.

Part 2: /improve — Session-Based Self-Improvement

After each session, /improve analyzes the full run and extracts:

UserFactInsight — what the agent learned about you and your preferences
AgentLearningInsight — strategies that worked, failure modes encountered

Both write to MEMORY.md. Next session loads MEMORY.md automatically.
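
As an illustration of the write step, the sketch below appends insights to a MEMORY.md file as markdown bullets and skips ones already recorded. The dataclasses reuse the two insight names from above, but their single text field, the bullet format, and append_insights itself are assumptions rather than the library's actual schema.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class UserFactInsight:
    text: str  # hypothetical field; the real model may carry more


@dataclass
class AgentLearningInsight:
    text: str  # hypothetical field; the real model may carry more


def append_insights(memory_path: Path, insights: list) -> int:
    """Append insights not already in the file; return how many were written."""
    existing = memory_path.read_text(encoding="utf-8") if memory_path.exists() else ""
    new_lines: list[str] = []
    for insight in insights:
        line = f"- [{type(insight).__name__}] {insight.text}"
        if line not in existing and line not in new_lines:
            new_lines.append(line)
    if new_lines:
        # Keep bullets on their own lines even if the file lacks a trailing newline.
        prefix = "\n" if existing and not existing.endswith("\n") else ""
        with memory_path.open("a", encoding="utf-8") as fh:
            fh.write(prefix + "\n".join(new_lines) + "\n")
    return len(new_lines)
```

Deduplicating on write keeps the file from accumulating the same fact across sessions, which matters because the whole file is loaded at the start of the next run.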

Key finding: We tested summaries vs raw tool traces as input to the synthesis step. Raw traces performed significantly better — summaries compress away the signal that matters. /improve reads from tool_log.jsonl (written per session), not from a summary.
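
Reading raw traces instead of a summary can be as simple as parsing the JSONL file one line at a time. This is a sketch only: the post does not describe the record schema of tool_log.jsonl, so load_tool_trace is a hypothetical helper that treats each line as an opaque JSON object.

```python
import json
from pathlib import Path


def load_tool_trace(log_path: Path) -> list[dict]:
    """Load one JSON object per line, skipping blank or malformed lines."""
    records: list[dict] = []
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                # A partially written last line should not abort the analysis.
                continue
            if isinstance(record, dict):
                records.append(record)
    return records
```

The resulting list keeps every call in order, so a synthesis step would see the exact arguments and results rather than a paraphrase of them.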

The loop: agent runs → /improve extracts insights → MEMORY.md grows → next run starts smarter.

This Week's Full Stack

Monday: StuckLoopDetection | Tuesday: LimitWarnerCapability | Wednesday: curl install | Thursday: Docker sandbox | Today: browser + /improve

An agent that detects loops, knows its context limits, installs in 30s, runs in Docker, browses the web, and learns from every session.

Full breakdown: https://oss.vstorm.co/blog/browser-automation-improve-ai-agents-pydantic-deep/

GitHub: https://github.com/vstorm-co/pydantic-deep