I built an AI agent that does OSINT investigations from your terminal


Most OSINT tools are great at one thing. You run holehe for emails, sherlock for usernames, sublist3r for domains. But you're the one deciding the workflow, switching between tools, copy-pasting results.

I wanted to remove that middle layer. So I built OpenOSINT — you describe a target in plain English, the AI figures out what to investigate and how, runs the tools, and hands you a report.

How it works

The core idea is simple: instead of hardcoding a fixed pipeline, I use Claude's native tool use API to let the model decide at each step what to do next based on what it found so far.

you ❯ investigate john.doe@gmail.com

→ search_email(john.doe@gmail.com)
  Found: spotify, wordpress, office365, gravatar

→ search_breach(john.doe@gmail.com)
  Found: 2 breaches (LinkedIn 2016, Adobe 2013)

→ search_paste(john.doe@gmail.com)
  No results.

✓ Report saved to reports/2025-05-08_john-doe.md

No hardcoded sequence. The model sees the holehe results and decides whether to check breaches next, look up the domain, or go straight to the report. It's a genuine reasoning loop, not a fixed script.
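The model knows which tools exist because each one is described to the API as a JSON schema. Here's what a definition for search_email might look like in the Anthropic tools format (the name matches the article's tool; the description wording and schema details are my own sketch, not the project's actual code):

```python
# Hypothetical tool definition in the Anthropic tools format.
# The "name" mirrors the search_email tool above; the description
# and schema are illustrative, not copied from the project.
SEARCH_EMAIL_TOOL = {
    "name": "search_email",
    "description": (
        "Check which online services have an account registered "
        "to the given email address (wraps holehe)."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "Email address to investigate",
            }
        },
        "required": ["email"],
    },
}
```

A good description is what lets the model decide when the tool is worth calling; the schema is what keeps its arguments well-formed.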

Why native tool use matters

The first version I built used a manual ReAct loop — I was parsing JSON from the model, extracting tool calls, running them, feeding results back. It worked but it was fragile. Models hallucinate tool results when they're bored.

With the Anthropic tool use API, the model returns stop_reason: "tool_use" when it wants to call something. You execute it, return the result, and the model continues. The loop is clean:

def run(self, prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]

    while True:
        response = self.provider.chat(
            messages=messages,
            system=SYSTEM_PROMPT,
            tools=self.tool_registry.get_definitions()
        )

        # Model is done reasoning — return the final report.
        if response.stop_reason == "end_turn":
            return response.content

        # Model wants to call one or more tools.
        if response.stop_reason == "tool_use":
            # Keep the assistant turn (with its tool_use blocks) in history.
            messages.append({"role": "assistant", "content": response.raw_content})

            results = []
            for call in response.tool_calls:
                result = self.tool_registry.execute(call.name, call.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": call.id,
                    "content": result
                })

            # Tool results go back as a user turn; the loop continues.
            messages.append({"role": "user", "content": results})

The model never gets a chance to invent results because it always receives the actual tool output before continuing.
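The loop above leans on a registry with two jobs: hand the API the tool schemas, and dispatch a call by name. Here's a minimal sketch matching the get_definitions / execute calls used in the loop (the real project's class may be structured differently):

```python
# Minimal tool registry sketch. Matches the interface the agent loop
# uses (get_definitions / execute); not the project's actual code.
class ToolRegistry:
    def __init__(self):
        self._tools = {}  # name -> (definition dict, callable)

    def register(self, definition, fn):
        self._tools[definition["name"]] = (definition, fn)

    def get_definitions(self):
        # Schemas passed to the model so it knows what it can call.
        return [definition for definition, _ in self._tools.values()]

    def execute(self, name, arguments):
        if name not in self._tools:
            # Return the error as a tool result so the model can recover.
            return f"Error: unknown tool '{name}'"
        _, fn = self._tools[name]
        try:
            return fn(**arguments)
        except Exception as exc:
            # Surface failures to the model instead of crashing the loop.
            return f"Error: {exc}"
```

Returning errors as strings (rather than raising) matters: the model sees the failure as a tool result and can route around it, e.g. by trying a different tool.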

Tools included

| Tool | What it wraps | What it finds |
| --- | --- | --- |
| search_email | holehe | social accounts linked to an email |
| search_username | sherlock | accounts across 300+ platforms |
| search_domain | sublist3r | subdomains |
| search_breach | HaveIBeenPwned API | data breach exposure |
| search_whois | python-whois | domain registrant info |
| search_ip | ipinfo.io | geolocation, ASN, hostname |
| generate_dorks | built-in | Google dork URLs for any target |
| search_paste | psbdmp API | Pastebin dump mentions |
| search_phone | phoneinfoga | carrier, country, line type |

Each tool handles missing dependencies gracefully — if sherlock isn't installed it tells you the install command instead of crashing.
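A wrapper with that behavior can be as simple as a shutil.which check before shelling out. This is an illustrative sketch, not the project's actual wrapper, and the install hint shown is an assumption:

```python
import shutil
import subprocess

def run_sherlock(username: str) -> str:
    """Wrap the sherlock CLI, degrading gracefully when it's missing.

    Illustrative sketch — the project's real wrapper and its
    suggested install command may differ.
    """
    if shutil.which("sherlock") is None:
        # Return the hint as the tool result so the agent (and user)
        # sees it instead of a stack trace.
        return ("sherlock is not installed. "
                "Install it with: pip install sherlock-project")
    proc = subprocess.run(
        ["sherlock", "--print-found", username],
        capture_output=True, text=True, timeout=300,
    )
    return proc.stdout or proc.stderr
```

Because the message comes back as an ordinary tool result, the model can note the gap in its report and move on to the tools that are available.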

Multi-provider

The AI layer is completely swappable. On first run you pick your provider:

Select provider:
  [1] Anthropic (Claude) — Recommended
  [2] OpenAI (GPT-4o)
  [3] Ollama (Local) — Experimental

The same agentic loop runs regardless. Anthropic is noticeably better at following structured tool-use instructions, but all three work. Local models via Ollama are marked experimental because they're inconsistent with JSON-structured responses.
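Keeping the loop provider-agnostic just means every backend returns the same response shape. A hypothetical interface behind self.provider.chat (class and field names here are my assumptions, chosen to match the loop shown earlier, not the project's actual code):

```python
# Hypothetical provider abstraction behind self.provider.chat(...).
# Field names mirror the agent loop shown earlier; the project's
# real classes may differ.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

@dataclass
class ChatResponse:
    stop_reason: str              # "end_turn" or "tool_use"
    content: str = ""             # final text when stop_reason == "end_turn"
    raw_content: object = None    # provider-native content blocks
    tool_calls: list = field(default_factory=list)

class Provider(ABC):
    @abstractmethod
    def chat(self, messages, system, tools) -> ChatResponse: ...

class FakeProvider(Provider):
    """Deterministic stand-in — handy for testing the loop offline."""
    def chat(self, messages, system, tools):
        return ChatResponse(stop_reason="end_turn", content="done")
```

Each real backend (Anthropic, OpenAI, Ollama) then only has to translate its native response into a ChatResponse; the agent loop never changes.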

The terminal UI

Built with Rich. Tool calls log inline as they happen so you can see the investigation unfold in real time rather than waiting for a final dump.
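Inline logging like that takes very little Rich code. A sketch of what the arrow/check lines might look like (the styling and helper names are my guesses, not the project's UI code):

```python
# Sketch of inline tool-call logging with Rich. Helper names and
# styling are illustrative, not the project's actual UI code.
from rich.console import Console

def log_tool_call(console: Console, name: str, target: str) -> None:
    # Arrow line printed the moment a tool is invoked; padded for alignment.
    console.print(f"  [cyan]→[/cyan] {name:<22} {target}")

def log_tool_result(console: Console, summary: str) -> None:
    # Check line printed when the tool's result comes back.
    console.print(f"  [green]✓[/green] {summary}")
```

Printing these from inside the tool-execution loop is what makes the investigation feel live: each call appears the moment the model decides on it.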

openosint ❯ investigate john.doe@example.com

  ⠸ Investigating...

  → search_email          john.doe@example.com
  ✓ Found: spotify, wordpress, gravatar, office365

  → search_breach         john.doe@example.com
  ✓ Found in 2 breaches

  ╭──────────────────── Report ─────────────────────╮
  │ ## Ambiguity Check                              │
  │ Single target identified — high confidence.     │
  │                                                 │
  │ ## Online Presence                              │
  │ Confirmed: Spotify, WordPress, Gravatar,        │
  │ Office365                                       │
  │                                                 │
  │ ## Data Breaches                                │
  │ LinkedIn (2016), Adobe (2013)                   │
  ╰─────────────────────────────────────────────────╯

  Report saved → reports/2025-05-08_john-doe.md

Install

pip install openosint
openosint config    # runs the setup wizard
openosint investigate "john.doe@example.com"

Or from source:

git clone https://github.com/OpenOSINT/OpenOSINT
cd OpenOSINT
pip install -e .
openosint config

What's next

  • Web UI (optional, for non-terminal users)
  • Export to PDF
  • Graph visualization of connections between identifiers
  • More tools: LinkedIn scraping, GitHub profile analysis, image metadata

Reminder: OpenOSINT is for authorized use only. Read DISCLAIMER.md before using.

Source: github.com/OpenOSINT/OpenOSINT

