Building a Smart Job Application Tracker with FastAPI, TF-IDF Matching, and Analytics


Job hunting is a numbers game, and keeping track of dozens of applications across LinkedIn, Indeed, company sites, and cold emails quickly becomes chaotic. I built AppTrack — a full-stack job application tracker with resume-JD matching, pipeline analytics, and smart follow-up reminders. Here's how.

The Problem

When you're actively job hunting, you need to track:

  • Where you applied and when
  • Current status of each application
  • Which sources (LinkedIn, referral, etc.) actually get responses
  • When to follow up
  • How well your resume matches each role

Spreadsheets work initially, but they don't scale. You need filtering, analytics, and automation.

Architecture

┌─────────────────────────────────────┐
│           Frontend (SPA)            │
│ Tailwind CSS + Alpine.js + Chart.js │
└──────────────┬──────────────────────┘
               │ REST API
┌──────────────▼──────────────────────┐
│          FastAPI Backend            │
│  ┌─────────┐ ┌─────────┐ ┌──────┐   │
│  │  CRUD   │ │Analytics│ │Match │   │
│  │ Router  │ │ Router  │ │Router│   │
│  └────┬────┘ └────┬────┘ └──┬───┘   │
│       │           │         │       │
│  ┌────▼───────────▼─────────▼───┐   │
│  │      Service Layer           │   │
│  │  ┌──────┐ ┌─────┐ ┌──────┐   │   │
│  │  │App   │ │Stats│ │TF-IDF│   │   │
│  │  │Svc   │ │ Svc │ │Match │   │   │
│  │  └──┬───┘ └──┬──┘ └──┬───┘   │   │
│  └─────┼────────┼───────┼───────┘   │
│        │        │       │           │
│  ┌─────▼────────▼───────▼───────┐   │
│  │     SQLite (aiosqlite)       │   │
│  │  applications | events |     │   │
│  │  contacts | reminders        │   │
│  └──────────────────────────────┘   │
└─────────────────────────────────────┘

Tech Stack

Component      Technology            Why
API Framework  FastAPI               Auto-generated OpenAPI docs, async, type-safe
Database       SQLite + aiosqlite    Zero config, async, perfect for personal tools
Matching       scikit-learn TF-IDF   No external APIs needed, fast, interpretable
Frontend       Tailwind + Alpine.js  Lightweight, no build step needed
Charts         Chart.js              Beautiful charts with minimal code
CLI            Click + Rich          Terminal-first workflow
CI             GitHub Actions        Automated testing on push

Key Feature: Resume-JD Matching

The most interesting feature is the TF-IDF-based resume matcher. It scores how well your resume matches a job description — completely offline, no API costs.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_match(resume_text: str, job_description: str) -> dict:
    vectorizer = TfidfVectorizer(
        stop_words="english",
        ngram_range=(1, 2),
        max_features=5000,
        sublinear_tf=True,
    )
    tfidf_matrix = vectorizer.fit_transform([resume_text, job_description])
    similarity = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:2])
    score = round(float(similarity[0][0]) * 100, 1)

    # Extract matching and missing keywords (extract_keywords and
    # generate_suggestion are project helpers not shown in this excerpt)
    jd_keywords = extract_keywords(job_description)
    resume_keywords = extract_keywords(resume_text)
    matching = jd_keywords & resume_keywords
    missing = jd_keywords - resume_keywords

    return {
        "score": score,
        "matching_keywords": sorted(matching),
        "missing_keywords": sorted(missing),
        "suggestion": generate_suggestion(score, missing),
    }

The key decisions:

  • ngram_range=(1, 2) captures both single words ("python") and two-word phrases ("data engineering")
  • sublinear_tf=True replaces each raw term count tf with 1 + log(tf), so a term repeated many times in one document doesn't dominate the score
  • Keyword extraction uses a curated tech vocabulary plus regex for acronyms/proper nouns
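To make the first two settings concrete, here is a small pure-Python sketch of what they do. It does not call scikit-learn: word_ngrams and sublinear_tf are hypothetical helpers that imitate the documented behavior (word n-grams over tokens of two or more characters, and 1 + ln(tf) scaling), and stop-word removal is left out for brevity.

```python
import math
import re
from collections import Counter


def word_ngrams(text: str, ngram_range: tuple[int, int] = (1, 2)) -> list[str]:
    """Lowercase, split into word tokens, and emit every n-gram in the range."""
    tokens = re.findall(r"\b\w\w+\b", text.lower())
    low, high = ngram_range
    return [
        " ".join(tokens[i : i + n])
        for n in range(low, high + 1)
        for i in range(len(tokens) - n + 1)
    ]


def sublinear_tf(count: int) -> float:
    """Dampen a raw term count: 1 -> 1.0, 2 -> ~1.69, 10 -> ~3.30."""
    return 1.0 + math.log(count)


counts = Counter(word_ngrams("Python data engineering with Python"))
print(counts["python"], counts["data engineering"])  # 2 1
print(round(sublinear_tf(counts["python"]), 2))      # 1.69
```

With this scaling, a keyword stuffed into a resume ten times weighs roughly 3.3 times a single mention rather than ten times. Note also that the vectorizer is fitted on just the two texts being compared, so the IDF part can only distinguish "in one document" from "in both"; most of the score comes from the overlap of these dampened counts.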

This gives you a practical score plus actionable feedback: which keywords you match and which are missing.
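The article does not show extract_keywords or generate_suggestion, so the following is only a guess at their shape based on the description above. The vocabulary, the acronym regex, and the score threshold are all made-up placeholders, not AppTrack's actual values.

```python
import re

# Hypothetical vocabulary; AppTrack's curated list is not shown in the article.
TECH_VOCAB = {"python", "fastapi", "sql", "docker", "kubernetes", "data engineering"}

# Assumed acronym rule: 2-6 uppercase letters/digits starting with a letter (AWS, ETL, S3).
ACRONYM_RE = re.compile(r"\b[A-Z][A-Z0-9]{1,5}\b")


def extract_keywords(text: str) -> set[str]:
    """Return vocabulary terms and acronyms found in the text, lowercased."""
    lowered = text.lower()
    found = {
        term for term in TECH_VOCAB
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    }
    found |= {match.lower() for match in ACRONYM_RE.findall(text)}
    return found


def generate_suggestion(score: float, missing: set[str]) -> str:
    """Turn the score and missing keywords into one line of advice."""
    if not missing:
        return "Your resume covers every keyword detected in the job description."
    top = ", ".join(sorted(missing)[:5])
    if score >= 60:  # placeholder threshold
        return f"Strong match. Consider mentioning: {top}."
    return f"Weak match. If you have the experience, add: {top}."


jd = "We need Python, Docker and AWS experience for ETL pipelines."
resume = "Built ETL jobs in Python; deployed with Docker."
missing = extract_keywords(jd) - extract_keywords(resume)
print(sorted(missing))  # ['aws']
```

A regex this simple also picks up ordinary words written in capitals, so a real implementation would probably need a small exclusion list.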

Smart Reminders

When you create an application, AppTrack automatically sets a 7-day follow-up reminder. When you move an application to an interview stage, it creates:

  • An interview prep reminder (immediate)
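  • A thank-you note reminder (1 day after)

async def update_status(app_id: str, new_status: str, note: str | None = None):
    # Excerpt: `db` (the aiosqlite connection), `now`, `old_status`, and
    # `event_id` are set up in code omitted from this snippet.
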
    # Update the status
    await db.execute(
        "UPDATE applications SET status = ?, updated_at = ? WHERE id = ?",
        (new_status, now, app_id),
    )

    # Log the event
    await db.execute(
        "INSERT INTO events (...) VALUES (...)",
        (event_id, app_id, 'status_change', old_status, new_status, now),
    )

    # Auto-create interview reminders
    if new_status in {"phone_screen", "technical", "onsite"}:
        await create_reminder(app_id, "interview_prep", "Prepare for interview")
        await create_reminder(app_id, "thank_you", "Send thank-you note", days=1)
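The reminder rules described in this section can be sketched as a pair of pure functions. This is an illustration, not AppTrack's code: build_reminder, reminders_for_status, the "follow_up" kind, and the dictionary fields are all hypothetical names, and the "applied" status is assumed to be what a new application starts with.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from uuid import uuid4

INTERVIEW_STATUSES = {"phone_screen", "technical", "onsite"}


def build_reminder(app_id: str, kind: str, message: str,
                   days: int = 0, now: Optional[datetime] = None) -> dict:
    """Return a reminder row due `days` days from `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return {
        "id": uuid4().hex,
        "application_id": app_id,
        "kind": kind,
        "message": message,
        "due_at": (now + timedelta(days=days)).isoformat(),
    }


def reminders_for_status(app_id: str, status: str,
                         now: Optional[datetime] = None) -> list[dict]:
    """Map a new status to the reminders described in the article."""
    if status == "applied":
        # New application: one follow-up a week later.
        return [build_reminder(app_id, "follow_up", "Follow up on application",
                               days=7, now=now)]
    if status in INTERVIEW_STATUSES:
        return [
            build_reminder(app_id, "interview_prep", "Prepare for interview", now=now),
            build_reminder(app_id, "thank_you", "Send thank-you note", days=1, now=now),
        ]
    return []
```

Passing `now` explicitly keeps the date arithmetic easy to test without patching the clock.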

The Dashboard

The frontend is a single HTML file using CDN-loaded Tailwind CSS, Alpine.js, and Chart.js. Four tabs:

  1. Applications — Sortable, filterable table with inline status updates
  2. Analytics — Pipeline funnel, weekly trends, source breakdown charts
  3. Match Scorer — Paste a JD, get instant match analysis
  4. Reminders — Pending follow-ups with dismiss functionality

No build step needed. Just serve the HTML.

Pipeline Analytics

The analytics module queries SQLite to calculate:

  • Response rate: % of applications that moved past "applied"
  • Source effectiveness: Which sources (LinkedIn vs referral vs cold email) convert best
  • Pipeline funnel: Visual breakdown of where applications are in the process
  • Weekly trends: Application velocity over time

async def get_sources():
    rows = await db.execute_fetchall("""
        SELECT
            COALESCE(source, 'unknown') as source,
            COUNT(*) as cnt,
            SUM(CASE WHEN status IN ('phone_screen', 'technical', 'onsite', 'offer', 'accepted')
                THEN 1 ELSE 0 END) as interview_cnt
        FROM applications
        GROUP BY COALESCE(source, 'unknown')  -- same expression as the SELECT, so NULL sources form one group
        ORDER BY cnt DESC
    """)
    return [{
        "source": r["source"],
        "count": r["cnt"],
        "conversion_rate": round(r["interview_cnt"] / r["cnt"] * 100, 1)
    } for r in rows]

This is the data that actually helps you optimize your job search strategy.
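The response rate and funnel queries are not shown in the article, so here is a rough sketch of how they might look. It uses the standard library's synchronous sqlite3 module as a stand-in for aiosqlite, a minimal three-column table that is only an assumption about the real schema, and it treats any status other than "applied" (including "rejected") as a response.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access by column name
conn.execute("CREATE TABLE applications (id TEXT PRIMARY KEY, source TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO applications VALUES (?, ?, ?)",
    [
        ("1", "linkedin", "applied"),
        ("2", "linkedin", "phone_screen"),
        ("3", "referral", "onsite"),
        ("4", None, "rejected"),
    ],
)


def response_rate(conn: sqlite3.Connection) -> float:
    """Percentage of applications whose status is no longer 'applied'."""
    row = conn.execute(
        "SELECT COUNT(*) AS total, "
        "SUM(CASE WHEN status != 'applied' THEN 1 ELSE 0 END) AS responded "
        "FROM applications"
    ).fetchone()
    if not row["total"]:
        return 0.0  # avoid dividing by zero on an empty table
    return round(row["responded"] / row["total"] * 100, 1)


def pipeline_funnel(conn: sqlite3.Connection) -> dict[str, int]:
    """Number of applications currently sitting in each status."""
    rows = conn.execute(
        "SELECT status, COUNT(*) AS cnt FROM applications GROUP BY status"
    ).fetchall()
    return {r["status"]: r["cnt"] for r in rows}


print(response_rate(conn))  # 75.0
```

Both functions count by current status only. An application that reached an interview and was later rejected shows up as "rejected", so a true funnel would need the events table rather than the latest status.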

Full REST API

The API covers everything:

POST   /api/applications          Create application
GET    /api/applications          List with filters/pagination
GET    /api/applications/{id}     Get details + timeline
PUT    /api/applications/{id}     Update fields
PATCH  /api/applications/{id}/status  Update status
DELETE /api/applications/{id}     Delete

GET    /api/analytics/overview    Summary stats
GET    /api/analytics/pipeline    Funnel data
GET    /api/analytics/trends      Weekly trends
GET    /api/analytics/sources     Source effectiveness

POST   /api/match/score           Score resume vs JD
POST   /api/import/csv            Import from CSV
GET    /api/export/csv            Export to CSV
GET    /api/reminders             Pending reminders
PATCH  /api/reminders/{id}        Dismiss/snooze

FastAPI auto-generates interactive Swagger docs at /docs — great for recruiter demos.
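For scripting against these endpoints, a standard-library client could look something like the sketch below. The paths come from the list above, but the JSON field names (company, role, source, status) and the some-id placeholder are guesses; the real request schema is whatever /docs reports. build_request and call are hypothetical helpers.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # local uvicorn address used in this article


def build_request(method: str, path: str, payload: Optional[dict] = None) -> Request:
    """Build a JSON request for the given HTTP method and API path."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(f"{BASE_URL}{path}", data=data, headers=headers, method=method)


def call(request: Request) -> dict:
    """Send the request and decode the JSON response (needs the server running)."""
    with urlopen(request) as response:
        return json.load(response)


# Field names below are assumptions; check /docs for the actual schema.
create = build_request(
    "POST", "/api/applications",
    {"company": "Acme", "role": "Data Engineer", "source": "referral"},
)
advance = build_request(
    "PATCH", "/api/applications/some-id/status", {"status": "phone_screen"}
)
sources = build_request("GET", "/api/analytics/sources")
```

With the server running, call(create) would send the first request; nothing above touches the network on its own.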

Testing

34 tests covering CRUD, analytics, matching, reminders, and integration scenarios:

$ pytest tests/ -v
========================= test session starts =========================
tests/test_analytics.py::test_overview_empty PASSED
tests/test_analytics.py::test_overview_with_data PASSED
tests/test_analytics.py::test_pipeline PASSED
tests/test_api.py::test_full_application_lifecycle PASSED
tests/test_api.py::test_csv_export PASSED
tests/test_applications.py::test_create_application PASSED
tests/test_applications.py::test_status_change_creates_event PASSED
tests/test_matcher.py::test_score_match_basic PASSED
tests/test_matcher.py::test_score_match_keywords PASSED
tests/test_reminders.py::test_reminders_created_on_apply PASSED
... (34 total)
========================= 34 passed in 0.30s =========================

Tests use an in-memory SQLite database and async HTTP client — fast and isolated.
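To show why an in-memory database keeps tests isolated, here is a simplified sketch covering only the database half of that setup. It uses the synchronous sqlite3 module rather than aiosqlite, skips the HTTP client entirely, and invents a two-table schema plus a change_status helper purely for illustration. The first test borrows its name from the output above, but its body is a guess.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Fresh in-memory database; nothing persists between tests."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE applications (id TEXT PRIMARY KEY, status TEXT);
        CREATE TABLE events (app_id TEXT, old_status TEXT, new_status TEXT);
        """
    )
    return conn


def change_status(conn: sqlite3.Connection, app_id: str, new_status: str) -> None:
    """Update the status and record the transition as an event."""
    (old_status,) = conn.execute(
        "SELECT status FROM applications WHERE id = ?", (app_id,)
    ).fetchone()
    conn.execute("UPDATE applications SET status = ? WHERE id = ?", (new_status, app_id))
    conn.execute("INSERT INTO events VALUES (?, ?, ?)", (app_id, old_status, new_status))


def test_status_change_creates_event():
    conn = make_db()
    conn.execute("INSERT INTO applications VALUES ('a1', 'applied')")
    change_status(conn, "a1", "phone_screen")
    events = conn.execute("SELECT old_status, new_status FROM events").fetchall()
    assert events == [("applied", "phone_screen")]


def test_databases_are_isolated():
    first, second = make_db(), make_db()
    first.execute("INSERT INTO applications VALUES ('a1', 'applied')")
    count = second.execute("SELECT COUNT(*) FROM applications").fetchone()[0]
    assert count == 0
```

Each ":memory:" connection gets its own private database, which is what lets tests run in any order without cleanup.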

Running It

# Clone and install
git clone https://github.com/hajirufai/apptrack.git
cd apptrack
pip install -r requirements.txt

# Run
python -m uvicorn app.main:app --reload

# Or with Docker
docker compose up -d

Visit http://localhost:8000 for the dashboard and http://localhost:8000/docs for the interactive API docs.

What I'd Add Next

  • Email parsing: Auto-extract application data from confirmation emails
  • Browser extension: Quick-add from job listing pages
  • Salary tracking: Compare offers with market data
  • AI cover letter drafts: Generate tailored cover letters from the match analysis

Key Takeaways

  1. SQLite is underrated for personal tools — zero config, fast, and aiosqlite makes it async-compatible
  2. TF-IDF matching gives surprisingly useful results for resume-JD comparison without any API costs
  3. Auto-generated reminders guard against an easy job search mistake: forgetting to follow up
  4. CDN-loaded frontend (Tailwind + Alpine.js) means zero build complexity for dashboard UIs
  5. Build what you need — the best portfolio projects solve your own problems

Check out the full source on GitHub (https://github.com/hajirufai/apptrack). If you're job hunting, feel free to fork it and track your own applications!

Source: dev.to
