I Built "seoextract": A Python CLI Tool That Audits Website SEO from the Terminal

SEO tools are useful, but many of them are too heavy, too expensive, or too dependent on dashboards.

So I built seoextract, a simple Python CLI package that audits a website directly from the terminal and produces a clear SEO report with scores, grades, and actionable issues.

The goal was simple:

«Enter a website URL. Get an SEO audit. Understand what needs to be fixed.»

What is "seoextract"?

"seoextract" is a Python-based SEO audit tool that crawls a website and checks for common SEO problems such as:

Missing or weak title tags
Missing meta descriptions
Thin content
Missing canonical tags
Poor internal linking
Missing schema markup
Missing viewport meta tag
Basic page-level SEO quality

It then calculates a score and grade for the website.

Example command:

seoextract audit https://www.python.org --max-pages 1

Output:

SEO Audit Complete

URL: https://www.python.org
Pages crawled: 1
Site score: 75.0
Grade: B
Total issues: 3
Critical: 0
Warnings: 2
Info: 1

Why I Built It

Most beginners learn SEO as a checklist:

Add a title
Add a meta description
Use headings properly
Add internal links
Add schema markup
Improve content length

But when building real websites, manually checking every page becomes boring and repetitive.

I wanted to build something that could automate the basic audit process.

At the same time, I wanted this project to help me improve my Python skills, especially in:

Web scraping
CLI development
Package structuring
SEO rule design
Publishing Python packages to PyPI

That is how "seoextract" started.

How It Works

The workflow is straightforward.

First, the user runs a command from the terminal:

seoextract audit https://example.com

Then "seoextract" performs these steps:

Fetches the page HTML
Parses the page content
Extracts SEO-related elements
Applies built-in SEO rules
Calculates a score
Displays a structured audit report

The tool checks both technical SEO signals and content-level signals.
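As a rough illustration of the "parse" and "extract" steps, here is a minimal sketch that uses only the standard library's html.parser. It is not seoextract's actual code: the class name, function name, and returned keys are made up for this example, and a real tool might well use BeautifulSoup instead.

```python
from html.parser import HTMLParser


class SeoElementParser(HTMLParser):
    """Collects a few SEO-related elements from one HTML page (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.canonical = None
        self.has_viewport = False
        self.word_count = 0
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name == "description":
                self.meta_description = attrs.get("content") or ""
            elif name == "viewport":
                self.has_viewport = True
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth:
            # Count visible text only; script and style bodies are skipped.
            self.word_count += len(data.split())


def extract_seo_elements(html):
    """Return a dict of the SEO signals found in an HTML string."""
    parser = SeoElementParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "canonical": parser.canonical,
        "has_viewport": parser.has_viewport,
        "word_count": parser.word_count,
    }
```

A dictionary like this is a convenient hand-off point: every rule can read from it without touching the raw HTML again.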

For example, if a page has no meta description, the tool reports it as an issue and gives a fix suggestion:

[WARNING] Missing Meta Description
fix: Add a meta description between 50–160 characters summarising the page.
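
A rule like this one can be sketched as a small function that takes the extracted page data and returns zero or more issues. The Issue class, the function names, and the shape of the page dictionary below are assumptions made for illustration; only the severity label, issue name, and fix text are taken from the output above.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    severity: str  # "CRITICAL", "WARNING", or "INFO"
    name: str
    fix: str


def check_meta_description(page):
    """Hypothetical rule: flag a page whose meta description is missing or empty."""
    description = (page.get("meta_description") or "").strip()
    if not description:
        return [
            Issue(
                "WARNING",
                "Missing Meta Description",
                "Add a meta description between 50–160 characters summarising the page.",
            )
        ]
    return []


def format_issue(issue):
    """Render one issue in the two-line style used by the report."""
    return f"[{issue.severity}] {issue.name}\nfix: {issue.fix}"
```

Returning a list, even for a single check, keeps every rule uniform: the audit can run all rules in a loop and concatenate the results.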

Example Audit

Running:

seoextract audit https://example.com

May return something like:

Audit Summary

site_score : 59.0
grade : D
pages_crawled : 1
total_issues : 7
safe_browsing : True

Detected issues:

[WARNING] Title Too Short
fix: Title is 14 chars. Expand to at least 50 characters.

[WARNING] Missing Meta Description
fix: Add a meta description between 50–160 characters summarising the page.

[WARNING] Thin Content
fix: Page has only 21 words. Aim for at least 300 words of meaningful content.

[INFO] Missing Canonical Tag
fix: Add a canonical tag to prevent duplicate content issues.

[INFO] Poor Internal Linking
fix: Add at least 2 internal links to help search engines discover related pages.

[INFO] No Schema Markup
fix: Add Schema.org structured data to improve search result appearance.

This makes the report beginner-friendly because it does not just say what is wrong. It also says how to fix it.
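
The score and grade in a report like this could be derived from the issue list with a simple penalty scheme. The sketch below is an assumption, not seoextract's real formula, which this post does not show: the penalty weights are invented, and the grade cut-offs were picked only so that the sample scores in this article (75.0 and 59.0) land on the same grades (B and D).

```python
# Assumed penalty per issue severity (not the tool's real weights).
PENALTIES = {"CRITICAL": 20.0, "WARNING": 10.0, "INFO": 5.0}

# Assumed grade cut-offs, highest first.
GRADE_BANDS = [(90.0, "A"), (75.0, "B"), (60.0, "C"), (40.0, "D")]


def page_score(severities):
    """Start from 100 and subtract a penalty per issue, never going below 0."""
    penalty = sum(PENALTIES[s] for s in severities)
    return max(0.0, 100.0 - penalty)


def site_score(page_scores):
    """Average the per-page scores; an empty crawl scores 0."""
    if not page_scores:
        return 0.0
    return round(sum(page_scores) / len(page_scores), 1)


def grade_for(score):
    """Map a numeric score to a letter grade using the assumed bands."""
    for minimum, letter in GRADE_BANDS:
        if score >= minimum:
            return letter
    return "F"
```

Keeping the weights and bands in plain data structures makes the scoring easy to tune later without touching the logic.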

Features

The current version includes:

Website SEO auditing from the terminal
Page crawling with max-page control
SEO issue detection
Score and grade calculation
Human-readable fix suggestions
CLI interface
PyPI package support
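
To make the max-page control concrete, here is one way a bounded crawl could work. This is a sketch under assumptions, not the tool's crawler: the function name is hypothetical, and fetch_links is a caller-supplied function that returns the href values found on a page, so the example runs without network access.

```python
from collections import deque
from urllib.parse import urljoin, urlparse


def crawl(start_url, fetch_links, max_pages=10):
    """Breadth-first crawl of same-host pages, stopping after max_pages.

    fetch_links(url) must return an iterable of href strings for that page.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in fetch_links(url):
            # Resolve relative links and drop the #fragment part.
            link = urljoin(url, href).split("#", 1)[0]
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

With max_pages=1 only the start URL is visited, which matches the single-page audits shown in this article.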

The command format is simple:

seoextract audit <url> --max-pages <n>

Example:

seoextract audit https://www.python.org --max-pages 1
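
A command of this shape can be parsed with the standard library's argparse. The sketch below is an assumption about how such a CLI might be wired, not a copy of seoextract's code: the real package may use a different library, and the default of 1 for --max-pages is invented here.

```python
import argparse


def build_parser():
    """Build a parser for: seoextract audit <url> [--max-pages N]."""
    parser = argparse.ArgumentParser(prog="seoextract")
    subcommands = parser.add_subparsers(dest="command", required=True)

    audit = subcommands.add_parser("audit", help="Audit a website")
    audit.add_argument("url", help="Start URL, e.g. https://example.com")
    audit.add_argument(
        "--max-pages",
        type=int,
        default=1,  # assumed default, not taken from the real tool
        help="Maximum number of pages to crawl",
    )
    return parser


def main(argv=None):
    """Entry point: parse the arguments and hand off to the audit (stubbed here)."""
    args = build_parser().parse_args(argv)
    print(f"Auditing {args.url} (max pages: {args.max_pages})")
    return 0
```

Calling main(["audit", "https://www.python.org", "--max-pages", "1"]) exercises the same path as the terminal command. Accepting argv as a parameter keeps the entry point easy to test.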

What I Learned While Building It

This project taught me that even a simple CLI tool needs proper structure.

At first, web scraping looks like it can be done in just a few lines of Python using BeautifulSoup.

But a real tool needs more than that.

It needs:

Input validation
Error handling
HTML parsing
URL normalization
Rule-based checks
Scoring logic
Clean terminal output
Package configuration
CLI command registration

That is why a proper project structure matters.
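
Input validation and URL normalization are good examples of the work a 5-line script skips. Here is a minimal sketch using urllib.parse. The function name and the exact rules (default to https, lowercase the host, drop the fragment and any user info, use "/" for an empty path) are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(raw):
    """Validate and normalize a user-supplied URL.

    Raises ValueError for anything that is not an http(s) URL with a host.
    """
    raw = raw.strip()
    if "://" not in raw:
        raw = "https://" + raw  # assume https when the scheme is omitted
    parts = urlsplit(raw)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"Not a valid http(s) URL: {raw!r}")
    netloc = parts.hostname  # urlsplit already lowercases the hostname
    if parts.port:
        netloc += f":{parts.port}"
    path = parts.path or "/"
    # Drop the fragment: it never reaches the server.
    return urlunsplit((parts.scheme, netloc, path, parts.query, ""))
```

Normalizing early means that Example.COM and https://example.com/ are treated as the same page for the rest of the audit.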

A 5-line script can scrape a title.

But a package should be reliable, reusable, and understandable.

Why This Project Matters

"seoextract" is not trying to replace advanced SEO platforms.

Instead, it is useful for:

Beginners learning SEO
Developers checking their websites
Students building portfolio projects
Freelancers auditing small websites
Python learners practicing real-world CLI tools

It is a practical project because it combines programming with a real business use case.

SEO is not just a technical topic. It connects directly to traffic, visibility, leads, and marketing.

That makes this project more useful than a basic toy script.

Future Improvements

Future versions could include:

LLM-based SEO suggestions
Better scoring rules
Export to JSON, CSV, or PDF
More detailed page reports
Broken link detection
Keyword density analysis
Image alt text checking
Sitemap detection
Robots.txt analysis
Better multi-page crawling
AI-generated recommendations

One important improvement I am planning is to add optional LLM support so the tool can generate more detailed recommendations based on the page content.

For example, instead of only saying:

Missing meta description

It could suggest:

Suggested meta description:
Learn Python programming with tutorials, documentation, downloads, and community resources from Python.org.

That would make the tool much more useful for real users.

Final Thoughts

Building "seoextract" helped me understand how real developer tools are structured.

It is not just about writing scraping code.

It is about turning code into a usable product:

A CLI command
A package
A report system
A scoring engine
A tool that someone else can install and use

This project started as a simple SEO checker, but it became a strong learning experience in Python packaging, CLI development, and practical automation.

If you are learning Python, I highly recommend building small CLI tools like this.

They force you to think beyond code and start thinking like a product builder.