I built an AI agent that does OSINT investigations from your terminal


Most OSINT tools are great at one thing. You run holehe for emails, sherlock for usernames, sublist3r for domains. But you're the one deciding the workflow, switching between tools, copy-pasting results.

I wanted to remove that middle layer. So I built OpenOSINT — you describe a target in plain English, the AI figures out what to investigate and how, runs the tools, and hands you a report.

How it works

The core idea is simple: instead of hardcoding a fixed pipeline, I use Claude's native tool use API to let the model decide at each step what to do next based on what it found so far.

you ❯ investigate john.doe@gmail.com

→ search_email(john.doe@gmail.com)
  Found: spotify, wordpress, office365, gravatar

→ search_breach(john.doe@gmail.com)
  Found: 2 breaches (LinkedIn 2016, Adobe 2013)

→ search_paste(john.doe@gmail.com)
  No results.

✓ Report saved to reports/2025-05-08_john-doe.md

No hardcoded sequence. The model sees the holehe results and decides whether to check breaches next, look up the domain, or go straight to the report. It's a genuine reasoning loop, not a fixed script.
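The model knows which tools exist because each one is described to the API as a JSON schema. Here's what a definition for search_email might look like in the Anthropic tools format (the name matches the article's tool; the description wording and schema details are my own sketch, not the project's actual code):

```python
# Hypothetical tool definition in the Anthropic tools format.
# The "name" mirrors the search_email tool above; the description
# and schema are illustrative, not copied from the project.
SEARCH_EMAIL_TOOL = {
    "name": "search_email",
    "description": (
        "Check which online services have an account registered "
        "to the given email address (wraps holehe)."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "Email address to investigate",
            }
        },
        "required": ["email"],
    },
}
```

A good description is what lets the model decide when the tool is worth calling; the schema is what keeps its arguments well-formed.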

Why native tool use matters

The first version I built used a manual ReAct loop — I was parsing JSON from the model, extracting tool calls, running them, feeding results back. It worked but it was fragile. Models hallucinate tool results when they're bored.

With the Anthropic tool use API, the model returns stop_reason: "tool_use" when it wants to call something. You execute it, return the result, and the model continues. The loop is clean:

def run(self, prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]

    while True:
        response = self.provider.chat(
            messages=messages,
            system=SYSTEM_PROMPT,
            tools=self.tool_registry.get_definitions()
        )

        # Model is done reasoning — return the final report.
        if response.stop_reason == "end_turn":
            return response.content

        # Model wants to call one or more tools.
        if response.stop_reason == "tool_use":
            # Keep the assistant turn (with its tool_use blocks) in history.
            messages.append({"role": "assistant", "content": response.raw_content})

            results = []
            for call in response.tool_calls:
                result = self.tool_registry.execute(call.name, call.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": call.id,
                    "content": result
                })

            # Tool results go back as a user turn; the loop continues.
            messages.append({"role": "user", "content": results})

The model never gets a chance to invent results because it always receives the actual tool output before continuing.
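The loop above leans on a registry with two jobs: hand the API the tool schemas, and dispatch a call by name. Here's a minimal sketch matching the get_definitions / execute calls used in the loop (the real project's class may be structured differently):

```python
# Minimal tool registry sketch. Matches the interface the agent loop
# uses (get_definitions / execute); not the project's actual code.
class ToolRegistry:
    def __init__(self):
        self._tools = {}  # name -> (definition dict, callable)

    def register(self, definition, fn):
        self._tools[definition["name"]] = (definition, fn)

    def get_definitions(self):
        # Schemas passed to the model so it knows what it can call.
        return [definition for definition, _ in self._tools.values()]

    def execute(self, name, arguments):
        if name not in self._tools:
            # Return the error as a tool result so the model can recover.
            return f"Error: unknown tool '{name}'"
        _, fn = self._tools[name]
        try:
            return fn(**arguments)
        except Exception as exc:
            # Surface failures to the model instead of crashing the loop.
            return f"Error: {exc}"
```

Returning errors as strings (rather than raising) matters: the model sees the failure as a tool result and can route around it, e.g. by trying a different tool.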

Tools included

| Tool | What it wraps | What it finds |
| --- | --- | --- |
| search_email | holehe | social accounts linked to an email |
| search_username | sherlock | accounts across 300+ platforms |
| search_domain | sublist3r | subdomains |
| search_breach | HaveIBeenPwned API | data breach exposure |
| search_whois | python-whois | domain registrant info |
| search_ip | ipinfo.io | geolocation, ASN, hostname |
| generate_dorks | built-in | Google dork URLs for any target |
| search_paste | psbdmp API | Pastebin dump mentions |
| search_phone | phoneinfoga | carrier, country, line type |

Each tool handles missing dependencies gracefully — if sherlock isn't installed it tells you the install command instead of crashing.
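A wrapper with that behavior can be as simple as a shutil.which check before shelling out. This is an illustrative sketch, not the project's actual wrapper, and the install hint shown is an assumption:

```python
import shutil
import subprocess

def run_sherlock(username: str) -> str:
    """Wrap the sherlock CLI, degrading gracefully when it's missing.

    Illustrative sketch — the project's real wrapper and its
    suggested install command may differ.
    """
    if shutil.which("sherlock") is None:
        # Return the hint as the tool result so the agent (and user)
        # sees it instead of a stack trace.
        return ("sherlock is not installed. "
                "Install it with: pip install sherlock-project")
    proc = subprocess.run(
        ["sherlock", "--print-found", username],
        capture_output=True, text=True, timeout=300,
    )
    return proc.stdout or proc.stderr
```

Because the message comes back as an ordinary tool result, the model can note the gap in its report and move on to the tools that are available.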

Multi-provider

The AI layer is completely swappable. On first run you pick your provider:

Select provider:
  [1] Anthropic (Claude) — Recommended
  [2] OpenAI (GPT-4o)
  [3] Ollama (Local) — Experimental

The same agentic loop runs regardless. Anthropic is noticeably better at following structured tool-use instructions, but all three work. Local models via Ollama are marked experimental because they're inconsistent with JSON-structured responses.
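Keeping the loop provider-agnostic just means every backend returns the same response shape. A hypothetical interface behind self.provider.chat (class and field names here are my assumptions, chosen to match the loop shown earlier, not the project's actual code):

```python
# Hypothetical provider abstraction behind self.provider.chat(...).
# Field names mirror the agent loop shown earlier; the project's
# real classes may differ.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass
class ChatResponse:
    stop_reason: str              # "end_turn" or "tool_use"
    content: str = ""             # final text when stop_reason == "end_turn"
    raw_content: object = None    # provider-native content blocks
    tool_calls: list = field(default_factory=list)

class Provider(ABC):
    @abstractmethod
    def chat(self, messages, system, tools) -> ChatResponse: ...

class FakeProvider(Provider):
    """Deterministic stand-in — handy for testing the loop offline."""
    def chat(self, messages, system, tools):
        return ChatResponse(stop_reason="end_turn", content="done")
```

Each real backend (Anthropic, OpenAI, Ollama) then only has to translate its native response into a ChatResponse; the agent loop never changes.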

The terminal UI

Built with Rich. Tool calls log inline as they happen so you can see the investigation unfold in real time rather than waiting for a final dump.
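Inline logging like that takes very little Rich code. A sketch of what the arrow/check lines might look like (the styling and helper names are my guesses, not the project's UI code):

```python
# Sketch of inline tool-call logging with Rich. Helper names and
# styling are illustrative, not the project's actual UI code.
from rich.console import Console

def log_tool_call(console: Console, name: str, target: str) -> None:
    # Arrow line printed the moment a tool is invoked; padded for alignment.
    console.print(f"  [cyan]→[/cyan] {name:<22} {target}")

def log_tool_result(console: Console, summary: str) -> None:
    # Check line printed when the tool's result comes back.
    console.print(f"  [green]✓[/green] {summary}")
```

Printing these from inside the tool-execution loop is what makes the investigation feel live: each call appears the moment the model decides on it.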

openosint ❯ investigate john.doe@example.com

  ⠸ Investigating...

  → search_email          john.doe@example.com
  ✓ Found: spotify, wordpress, gravatar, office365

  → search_breach         john.doe@example.com
  ✓ Found in 2 breaches

  ╭──────────────────── Report ─────────────────────╮
  │ ## Ambiguity Check                              │
  │ Single target identified — high confidence.     │
  │                                                 │
  │ ## Online Presence                              │
  │ Confirmed: Spotify, WordPress, Gravatar,        │
  │ Office365                                       │
  │                                                 │
  │ ## Data Breaches                                │
  │ LinkedIn (2016), Adobe (2013)                   │
  ╰─────────────────────────────────────────────────╯

  Report saved → reports/2025-05-08_john-doe.md

Install

pip install openosint
openosint config    # runs the setup wizard
openosint investigate "john.doe@example.com"

Or from source:

git clone https://github.com/OpenOSINT/OpenOSINT
cd OpenOSINT
pip install -e .
openosint config

What's next

  • Web UI (optional, for non-terminal users)
  • Export to PDF
  • Graph visualization of connections between identifiers
  • More tools: LinkedIn scraping, GitHub profile analysis, image metadata

Reminder: OpenOSINT is for authorized use only. Read DISCLAIMER.md before using.

Source: github.com/OpenOSINT/OpenOSINT

