Building a Real-Time eSIM Price Index with AI

When we started building eSIMDB AI, one of the first hard questions was: how do you maintain accurate, current pricing across 120+ eSIM providers when each has its own pricing structure, API format, and update frequency?

This post covers the architecture we settled on, the problems we ran into, and what we'd do differently.

The Problem Space

Travel eSIM pricing is surprisingly volatile. Providers run flash sales, change regional plan structures, add or remove country coverage, and update pricing with little notice. A comparison that was accurate yesterday may show stale prices today.

We needed to handle 120+ providers, each with a different API structure, authentication method, rate limit policy, and data schema.

Architecture Overview

Our pipeline has three main layers:

1. Data Ingestion Layer

Each provider gets an adapter — a normalized interface that outputs a standardized plan object:

{
  "provider": "airalo",
  "plan_id": "airalo_eu_5gb_30d",
  "countries": ["FR", "DE", "ES"],
  "data_gb": 5,
  "validity_days": 30,
  "price_usd": 13.50,
  "hotspot_allowed": true,
  "speed_5g": false,
  "last_updated": "2026-06-01T14:32:00Z"
}
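To make the adapter idea concrete, here is a minimal sketch of what such an interface could look like. The `Plan` fields mirror the JSON example above; the class names `ProviderAdapter` and `ExampleAdapter`, the method names, and the raw payload shape are assumptions made for illustration, not the actual eSIMDB code.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Plan:
    """Standardized plan object; fields mirror the JSON example above."""
    provider: str
    plan_id: str
    countries: tuple[str, ...]
    data_gb: float
    validity_days: int
    price_usd: float
    hotspot_allowed: bool
    speed_5g: bool
    last_updated: datetime


class ProviderAdapter(ABC):
    """Hypothetical base class: one subclass per provider."""

    name: str

    @abstractmethod
    def fetch_raw(self) -> list[dict[str, Any]]:
        """Call the provider's API and return its native payloads."""

    @abstractmethod
    def normalize(self, raw: dict[str, Any]) -> Plan:
        """Map one native payload to the standardized Plan."""

    def plans(self) -> list[Plan]:
        return [self.normalize(raw) for raw in self.fetch_raw()]


class ExampleAdapter(ProviderAdapter):
    """Toy adapter with a made-up payload shape, for illustration only."""

    name = "example"

    def fetch_raw(self) -> list[dict[str, Any]]:
        # A real adapter would make an authenticated, rate-limited HTTP call here.
        return [{"sku": "eu_5gb_30d", "regions": "FR,DE,ES", "mb": 5120,
                 "days": 30, "price_cents": 1350, "tethering": True, "five_g": False}]

    def normalize(self, raw: dict[str, Any]) -> Plan:
        return Plan(
            provider=self.name,
            plan_id=f"{self.name}_{raw['sku']}",
            countries=tuple(raw["regions"].split(",")),
            data_gb=raw["mb"] / 1024,           # provider reports megabytes
            validity_days=raw["days"],
            price_usd=raw["price_cents"] / 100,  # provider reports cents
            hotspot_allowed=raw["tethering"],
            speed_5g=raw["five_g"],
            last_updated=datetime.now(timezone.utc),
        )
```

Keeping `fetch_raw` and `normalize` separate means the unit conversions can be tested against recorded payloads without touching the network.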

2. Normalization & Scoring Layer

Once normalized, each plan gets a composite value score:

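def score_plan(plan, query):
    # Each component score is assumed to be normalized to [0, 1]; the weights sum to 1.0.
    components = [
        (score_price(plan, query), 0.25),
        (provider_reliability[plan.provider], 0.20),  # lookup table, updated weekly
        (score_coverage(plan, query), 0.20),
        (score_flexibility(plan), 0.15),
        (score_features(plan, query), 0.10),
        (score_activation(plan), 0.10),
    ]
    return sum(score * weight for score, weight in components)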

Provider reliability scores come from a rolling average of user reviews and activation success rates, updated weekly.
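One way to compute such a rolling reliability score is sketched below. The eight-week window, the 1-5 review scale, the equal weighting of reviews and activation success, and the names `WeeklySnapshot` and `ReliabilityTracker` are all assumptions for illustration; the post does not specify them.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class WeeklySnapshot:
    avg_review: float          # mean user rating on a 1-5 scale (assumed scale)
    activation_success: float  # fraction of successful activations, 0-1


class ReliabilityTracker:
    """Rolling mean over the most recent `window` weekly snapshots."""

    def __init__(self, window: int = 8, review_weight: float = 0.5):
        self._snapshots = deque(maxlen=window)
        self._review_weight = review_weight

    def add_week(self, snapshot: WeeklySnapshot) -> None:
        # With maxlen set, the oldest week drops off automatically.
        self._snapshots.append(snapshot)

    def score(self) -> float:
        if not self._snapshots:
            return 0.0  # placeholder for "no data yet"; a neutral prior may be preferable
        weekly = [
            # Rescale the 1-5 rating to 0-1, then blend with the success rate.
            self._review_weight * (s.avg_review - 1) / 4
            + (1 - self._review_weight) * s.activation_success
            for s in self._snapshots
        ]
        return sum(weekly) / len(weekly)
```

The result lands in [0, 1], so it can be used directly as one component of a weighted score.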

3. Query & Retrieval Layer

When a user types "Paris 10 days 5GB under €20", the AI layer:

Extracts intent (destination: France, duration: 10 days, data: ~5GB, budget: €20)
Fetches candidate plans whose coverage includes the destination country
Scores each candidate against the user's specific context
Returns the top 3–5 plans with honest tradeoff explanations

We use a fine-tuned small LLM for query understanding — it handles ambiguous queries ("Paris and then maybe Amsterdam") much better than regex.
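The retrieval steps after intent extraction can be sketched roughly as follows. `ParsedQuery` stands in for whatever structure the LLM parser emits, and `top_plans` and `toy_score` are hypothetical names; the toy scorer is a deliberately simple placeholder for the composite score, not the production logic.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ParsedQuery:
    """Structured intent; in the pipeline this would come from the LLM parser."""
    country: str                        # ISO 3166-1 alpha-2 code, e.g. "FR"
    duration_days: int
    data_gb: float
    budget: Optional[float] = None
    budget_currency: Optional[str] = None


def top_plans(plans: list[dict], query: ParsedQuery,
              score: Callable[[dict, ParsedQuery], float], k: int = 5) -> list[dict]:
    # Keep only plans whose coverage includes the destination country.
    candidates = [p for p in plans if query.country in p["countries"]]
    # Score each candidate in the context of this query and keep the k best.
    return heapq.nlargest(k, candidates, key=lambda p: score(p, query))


def toy_score(plan: dict, query: ParsedQuery) -> float:
    """Stand-in scorer: zero if data or validity falls short, otherwise prefer cheaper."""
    if plan["data_gb"] < query.data_gb or plan["validity_days"] < query.duration_days:
        return 0.0
    return 1.0 / (1.0 + plan["price_usd"])


# "Paris 10 days 5GB under €20" as structured intent.
query = ParsedQuery(country="FR", duration_days=10, data_gb=5.0,
                    budget=20.0, budget_currency="EUR")
```

Passing the scorer in as a function keeps the retrieval step independent of how the query was parsed or how plans are ranked.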

Refresh Strategy

Top 20 providers by volume: refresh every 6 hours
Mid-tier providers: every 12 hours
Long-tail providers: every 24 hours
Anomaly-triggered: when a plan's price shifts by more than 15% between cycles, force an immediate refresh
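The tiers and the anomaly rule above translate into a small amount of scheduling logic. This is a sketch under assumptions: the tier keys and the function names `is_price_anomaly` and `is_due` are illustrative, and the handling of a zero previous price is a choice made here, not something the post describes.

```python
from datetime import datetime, timedelta

# Intervals from the tiers listed above; the tier keys are illustrative names.
REFRESH_INTERVAL = {
    "top": timedelta(hours=6),
    "mid": timedelta(hours=12),
    "long_tail": timedelta(hours=24),
}
ANOMALY_THRESHOLD = 0.15  # relative price change that triggers a forced refresh


def is_price_anomaly(previous_usd: float, current_usd: float,
                     threshold: float = ANOMALY_THRESHOLD) -> bool:
    """True when the price moved by more than `threshold` relative to the previous cycle."""
    if previous_usd <= 0:
        # No meaningful ratio; treat any change as anomalous.
        return current_usd != previous_usd
    return abs(current_usd - previous_usd) / previous_usd > threshold


def is_due(tier: str, last_refresh: datetime, now: datetime,
           anomaly_flagged: bool = False) -> bool:
    """A provider is due when its tier interval has elapsed or an anomaly was flagged."""
    return anomaly_flagged or now - last_refresh >= REFRESH_INTERVAL[tier]
```

Measuring the change relative to the previous price means a drop from $10 to $8 and a rise from $10 to $12 are both flagged as 20% shifts.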

Challenges We Didn't Anticipate

Multi-country plan complexity. A "Europe" plan from one provider might cover 26 countries; another covers 42. We now explicitly show which specific countries each plan covers.
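Comparing coverage between two regional plans is essentially set arithmetic on their country lists. A minimal sketch, assuming plans are dicts with a `countries` list of ISO codes as in the plan object shown earlier; `coverage_diff` and `covers_itinerary` are hypothetical helper names.

```python
def coverage_diff(plan_a: dict, plan_b: dict) -> dict:
    """Split two plans' coverage into shared countries and those unique to each."""
    a, b = set(plan_a["countries"]), set(plan_b["countries"])
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


def covers_itinerary(plan: dict, itinerary: list) -> bool:
    """True only if every country on the trip is included in the plan."""
    return set(itinerary) <= set(plan["countries"])
```

For a multi-stop trip, `covers_itinerary` catches the case where a plan labeled "Europe" silently omits one of the destinations.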

Currency volatility. We refresh FX rates every hour and display prices in the user's inferred currency.
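An hourly FX refresh can be modeled as a cache with a one-hour time-to-live. The sketch below is an assumption about how this might look, not the production code: `FxCache`, `display_price`, and the injected `fetch_rates` loader are hypothetical, and rounding to two decimals does not suit currencies without minor units (such as JPY).

```python
import time
from decimal import Decimal, ROUND_HALF_UP
from typing import Callable, Optional

FX_TTL_SECONDS = 3600  # hourly refresh, as described above


class FxCache:
    """Caches USD-based rates and reloads them once they are older than the TTL."""

    def __init__(self, fetch_rates: Callable[[], dict],
                 ttl: float = FX_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic):
        # fetch_rates is a hypothetical loader returning e.g. {"EUR": Decimal("0.90")}.
        self._fetch_rates = fetch_rates
        self._ttl = ttl
        self._clock = clock
        self._rates: dict = {}
        self._loaded_at: Optional[float] = None

    def rate(self, currency: str) -> Decimal:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._rates = self._fetch_rates()
            self._loaded_at = now
        return self._rates[currency]


def display_price(price_usd: Decimal, currency: str, fx: FxCache) -> Decimal:
    """Convert a USD price to the display currency, rounded to 2 decimal places."""
    converted = price_usd * fx.rate(currency)
    return converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Using `Decimal` instead of floats avoids binary rounding surprises in displayed prices, and injecting the clock makes the expiry logic easy to test.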

Provider reliability signal. A cheap plan from a provider with a 15% activation failure rate is worse value than a slightly pricier plan from a reliable provider. We aggregate review data from multiple sources.

What We'd Do Differently

Build provider adapters as plugins from day one
Invest in anomaly detection earlier
Separate the AI query layer from the comparison engine

Current State

The index covers 15,000+ plans across 195 countries, refreshes every 6–24 hours, and serves comparisons in under 2 seconds. Try it at esimdb.ai — free, no sign-up.

Built with: Python, PostgreSQL, Redis, and a small fine-tuned LLM for query parsing.