I parsed 18,000 emission factors from DEFRA, EPA, and ADEME into one free API


The problem

I needed emission factors for a carbon reporting project.

If you've tried this, you know what happens next: you open the DEFRA 2025 flat-format spreadsheet and it's 3,997 rows across 40-something sheets, with the same activity described three different ways. You download the EPA Hub workbook and it's 12 separate tables, mixed units (kg/MMBtu, g/MMBtu, lb/MWh), and different GWP conventions depending on year. You grab the ADEME Base Carbone CSV — 11,000 rows in French, with semicolon delimiters, Windows-1252 encoding, and commas as decimal separators.

Three weeks into the project I realised I wasn't building a carbon calculator. I was building a parser pipeline.

This post is what I wish I'd read on day one.

The units that lie

EPA's CH₄ and N₂O factors look like they're in kg/MMBtu, same as CO₂. They're not.

EPA natural gas (stationary combustion):
  CO₂ = 53.06 kg/MMBtu
  CH₄ = 1.0 g/MMBtu    ← grams, not kg
  N₂O = 0.1 g/MMBtu    ← grams, not kg

If you miss the unit mismatch, you treat 1.0 g of CH₄ as 1.0 kg and overstate methane by a factor of 1,000. The GWP multiplier carries that error straight into CO₂e: with a GWP of 28, each MMBtu shows 28 kg CO₂e of methane instead of 0.028 kg.

The fix is trivial once you know:

const co2 = qty_mmbtu * co2_factor_mmbtu;                    // already kg
const ch4 = (qty_mmbtu * ch4_factor_mmbtu * GWP_CH4) / 1000; // g → kg CO₂e
const n2o = (qty_mmbtu * n2o_factor_mmbtu * GWP_N2O) / 1000; // g → kg CO₂e

EPA's electricity factors (eGRID) are in lb/MWh. You multiply by 0.453592 to get kilograms. EPA's waste factors are in metric tons CO₂e per short ton material, which requires a ×1000 to get to kg. Every source has its own version of this trap.
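
One way to defuse these traps is to keep the mass unit next to every factor and convert in a single place. The sketch below is a minimal illustration, not EPA tooling: `TO_KG` and `toKg` are hypothetical names, the unit labels are my own, and the GWP of 28 is the AR5 value for CH₄.

```javascript
// Multipliers that convert a mass unit to kilograms.
// Hypothetical helper; the unit labels are illustrative, not EPA's.
const TO_KG = new Map([
  ["kg", 1],
  ["g", 1 / 1000],
  ["lb", 0.453592],
  ["t", 1000], // metric ton
]);

// Convert a factor expressed in `massUnit` per activity unit
// to kilograms per activity unit. Unknown units fail loudly.
function toKg(value, massUnit) {
  const multiplier = TO_KG.get(massUnit);
  if (multiplier === undefined) {
    throw new Error(`Unknown mass unit: ${massUnit}`);
  }
  return value * multiplier;
}

// EPA natural gas figures quoted earlier, for 100 MMBtu.
const qtyMmbtu = 100;
const co2Kg = qtyMmbtu * toKg(53.06, "kg");
const ch4Co2eKg = qtyMmbtu * toKg(1.0, "g") * 28; // AR5 GWP for CH4
console.log({ co2Kg, ch4Co2eKg });
```

Throwing on an unknown unit is deliberate: a silent fallback to 1 is how the grams-as-kilograms mistake gets in.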

DEFRA is kinder: almost every factor is already in kg CO₂e per activity unit. When a source hands you a factor in that form, store it as published.

The GWP versions that don't match

Global Warming Potential values change as IPCC publishes new assessment reports. For 100-year horizons:

Gas    AR4    AR5    AR6
CH₄    25     28     27.9
N₂O    298    265    273

The GHG Protocol default is AR5. EPA 2023 factors were still on AR4 (CH₄=25). EPA 2024+ moved to AR5 (CH₄=28). DEFRA has been on AR5 since 2020.

If your calculator silently uses a single GWP table, you're either inflating 2023 methane or deflating 2024 methane. I keep a per-year GWP object and pull the matching one when calculating.
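
A rough sketch of what a per-year GWP lookup can look like. The GWP values come from the table above and the source-to-report mapping follows the EPA and DEFRA notes in this section; the names `GWP`, `GWP_VERSION`, and `gwpFor` are hypothetical, and the mapping lists only a few example years.

```javascript
// 100-year GWP values by IPCC assessment report.
const GWP = {
  AR4: { CO2: 1, CH4: 25, N2O: 298 },
  AR5: { CO2: 1, CH4: 28, N2O: 265 },
  AR6: { CO2: 1, CH4: 27.9, N2O: 273 },
};

// Which report each dataset uses, keyed by "source:year".
// Hypothetical structure; extend it as you add datasets.
const GWP_VERSION = {
  "epa:2023": "AR4",
  "epa:2024": "AR5",
  "defra:2024": "AR5",
};

// Return the GWP set that matches a dataset, or throw if none is recorded.
function gwpFor(source, year) {
  const version = GWP_VERSION[`${source}:${year}`];
  if (!version) {
    throw new Error(`No GWP version recorded for ${source} ${year}`);
  }
  return GWP[version];
}

console.log(gwpFor("epa", 2023).CH4, gwpFor("epa", 2024).CH4);
```

There is no default on purpose: an unmapped dataset throws instead of quietly borrowing another year's values.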

Scope 2 has two answers

GHG Protocol Scope 2 Guidance (2015) requires dual reporting:

  • Location-based: grid average emission factor (e.g. DEFRA UK electricity: 0.177 kg CO₂e/kWh in 2025)
  • Market-based: contractually-driven. Backed by REC / I-REC / GO / PPA certificates → often zero or very low.

These aren't interchangeable. A company with a 100% renewable PPA can legitimately report 0 for Scope 2 market-based and 3,200 kg CO₂e location-based for the same activity. Every credible framework (CDP, SBTi, EU CSRD) wants both.

I show both by default on every electricity entry. The REC toggle zeroes out the market-based side.
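
As a sketch, the dual calculation can be one function that always returns both numbers. Everything here is illustrative: `scope2` and its parameter names are hypothetical, `hasCertificates` stands in for the REC toggle, and falling back to the grid factor when no contractual factor exists is a simplification of the GHG Protocol data hierarchy.

```javascript
// Return both Scope 2 figures (kg CO2e) for one electricity entry.
// gridFactor:      location-based factor in kg CO2e/kWh
// marketFactor:    supplier- or contract-specific factor, if known
// hasCertificates: true when certificates cover the full consumption
function scope2({ kwh, gridFactor, marketFactor, hasCertificates = false }) {
  const locationBased = kwh * gridFactor;
  // Simplification: with no contractual factor, reuse the grid factor.
  const marketBased = hasCertificates ? 0 : kwh * (marketFactor ?? gridFactor);
  return { locationBased, marketBased };
}

// 10,000 kWh on the 2025 DEFRA UK factor, fully covered by certificates.
const covered = scope2({ kwh: 10000, gridFactor: 0.177, hasCertificates: true });
console.log(covered);
```

Returning an object with both fields makes it hard for a caller to report only one of them by accident.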

T&D losses are Scope 3, not Scope 2

This one surprised me.

GHG Protocol Scope 3 Standard, Category 3 (Fuel- and Energy-Related Activities):
Transmission & distribution losses for purchased electricity are Scope 3.

If you pull the standard DEFRA UK electricity factor (0.177 in 2025), it doesn't include T&D. There's a separate T&D factor (0.01853 kg CO₂e/kWh for UK) that DEFRA labels "scope": "Scope 3".

Plenty of calculators fold T&D into Scope 2. That's wrong under GHG Protocol, DEFRA guidance, EPA, and ISO 14064-1. I added an auto-suggest banner: if a user adds UK DEFRA electricity to their inventory, we prompt them to also add the matching Scope 3 T&D entry.
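
The prompt logic is roughly the following. This is a sketch under assumptions: the entry shape, the `electricity` and `electricity_td` category names, and `suggestTdEntries` are all hypothetical, and only the UK 2025 T&D factor quoted above is included.

```javascript
// T&D factors known to the calculator. Only the UK value quoted above is listed.
const TD_FACTORS = [
  { source: "defra", year: 2025, region: "UK", scope: "Scope 3",
    category: "electricity_td", unit: "kWh", co2e_factor: 0.01853 },
];

const sameGrid = (a, b) =>
  a.source === b.source && a.year === b.year && a.region === b.region;

// For every Scope 2 electricity entry without a matching T&D entry,
// propose a Scope 3 entry with the same quantity.
function suggestTdEntries(inventory, tdFactors = TD_FACTORS) {
  return inventory
    .filter((e) => e.category === "electricity" && e.scope === "Scope 2")
    .filter((e) => !inventory.some(
      (other) => other.category === "electricity_td" && sameGrid(other, e)))
    .flatMap((e) => {
      const factor = tdFactors.find((f) => sameGrid(f, e));
      if (!factor) return []; // no T&D factor known for this grid
      return [{ ...factor, quantity: e.quantity,
                kgCo2e: e.quantity * factor.co2e_factor }];
    });
}

const inventory = [
  { source: "defra", year: 2025, region: "UK", scope: "Scope 2",
    category: "electricity", quantity: 10000 },
];
console.log(suggestTdEntries(inventory));
```

The suggestion keeps its own `scope` value, so accepting it never adds to the Scope 2 total.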

Putting it in an API

After three weeks of parsing, I had a single normalized index:

DEFRA 2023-2025: 3,997 factors
EPA 2023-2025:   1,260 factors
ADEME 2024:     11,442 factors
Ember 2024:        193 factors (193 countries grid intensity)
------
Total: ~17,000 distinct activity-factor records

Every record carries its source, year, scope, license, and original unit.

The REST API is live at:

curl "https://www.sustainmetrics.net/api/v1/factors?source=defra&year=2025&category=fuels" \
  -H "X-API-Key: sm_your_key"

Example response (trimmed):

{
  "ok": true,
  "count": 127,
  "data": [
    {
      "source": "defra",
      "year": 2025,
      "scope": "Scope 1",
      "category": "fuels",
      "activity": "Natural gas",
      "unit": "kWh (Net CV)",
      "co2e_factor": 0.18290,
      "license": "Open Government Licence v3.0"
    },
    ...
  ]
}

The free tier is 50 calls/day; signup just needs an email. The calculator itself needs no signup at all.
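
From JavaScript, the same request could look something like this. It is a sketch, not an official client: `buildFactorsUrl` and `fetchFactors` are hypothetical helpers, the error handling assumes only the `ok` and `data` fields seen in the sample response, and it relies on the global `fetch` available in Node 18+ and browsers.

```javascript
const BASE_URL = "https://www.sustainmetrics.net/api/v1/factors";

// Build the query URL from a plain object, skipping unset filters.
// URLSearchParams encodes spaces as "+", which is fine in a query string.
function buildFactorsUrl(params) {
  const url = new URL(BASE_URL);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return url;
}

// Fetch matching factors; rejects on HTTP errors or an ok:false body.
async function fetchFactors(params, apiKey) {
  const res = await fetch(buildFactorsUrl(params), {
    headers: { "X-API-Key": apiKey },
  });
  if (!res.ok) throw new Error(`Request failed with HTTP ${res.status}`);
  const body = await res.json();
  if (!body.ok) throw new Error("API returned ok: false");
  return body.data;
}

console.log(buildFactorsUrl({ source: "defra", year: 2025, category: "fuels" }).toString());
```

Calling `fetchFactors({ source: "defra", year: 2025, category: "fuels" }, "sm_your_key")` should then resolve to the `data` array.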

Where I went wrong

A few things I had to rewrite:

  1. First pass used a single GWP table. Had to redo it once I saw EPA 2023 vs 2024 diverge. Lesson: version your GWP as data, not as a constant.
  2. Treated DEFRA's scope field as a clean single-line string. It's sometimes a multi-line string with embedded newlines. Normalize before querying.
  3. Didn't handle ADEME's French decimal commas. parseFloat stops at the comma, so values below 1 came out as zero and larger ones lost their fraction. Added a .replace(',', '.') pass before parseFloat.
  4. Stored cloud projects as JSON blobs without a schema. When I wanted to run cross-project trend analysis, I couldn't query them efficiently. Now the blob has a minimum required schema and indexed fields for year, scope totals, source.
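
Items 2 and 3 come down to two small normalizers. A minimal sketch, with hypothetical function names; it assumes spaces are the only thousands separator in the input, and it uses `Number` instead of `parseFloat` so that trailing junk is rejected, not silently truncated.

```javascript
// Collapse newlines and repeated whitespace in a scope label.
function normalizeScope(raw) {
  return String(raw).replace(/\s+/g, " ").trim();
}

// Parse a number written with a decimal comma, e.g. "0,0231".
// Returns null for empty or non-numeric input.
// Assumes spaces (including non-breaking ones) are the only
// thousands separator.
function parseDecimalComma(text) {
  const cleaned = String(text).replace(/\s/g, "").replace(",", ".");
  if (cleaned === "") return null;
  const value = Number(cleaned);
  return Number.isFinite(value) ? value : null;
}

console.log(parseFloat("0,0231"));        // 0, the original bug
console.log(parseDecimalComma("0,0231")); // 0.0231
console.log(normalizeScope("Scope\n3"));  // "Scope 3"
```

Returning `null` for bad input keeps unparsed cells distinguishable from a genuine zero factor.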

What I'd like feedback on

If you've built similar tooling or you do carbon accounting for real:

  • The cross-source comparison page at sustainmetrics.net/compare shows factor differences for the same activity across standards. Does it surface anything useful to you?
  • API shape: /api/v1/factors?source=defra&year=2025&scope=Scope%202&search=electricity. Is this the shape you'd want, or would you prefer versioned JSON Schema + cursor-based pagination?
  • Any obvious scope/unit/GWP mistakes in the index. I did the parsing mostly by hand from official flat files. Second pair of eyes very welcome.

The calculator is free, the comparison page is free, and 50 API calls a day is free (email signup only). Pro ($29/mo) exists for people who need PDF/Excel exports and cloud project storage.

If any of this is wrong or could be better, tell me.

Source: dev.to
