A Kanji Numeral Converter With 小字, 大字, and 壱弐参 Legal Forms

Japanese has three number systems: everyday kanji (一二三), legal kanji (壱弐参, used on contracts to prevent forgery), and fullwidth digits (１２３). Converting between Arabic numerals and any of these requires handling place values up to 京 (10^16), grouping by 4 digits instead of 3, and knowing when to omit or include the leading 一.

1234 in English is "one thousand two hundred thirty-four". In Japanese it's 千二百三十四 — the 一 before 千 is omitted. But on a legal contract it's 壱阡弐佰参拾肆 — the 壱 is required, and even the unit characters change to their formal equivalents.

🔗 Live demo: https://sen.ltd/portfolio/kansuji-converter/
📦 GitHub: https://github.com/sen-ltd/kansuji-converter

Features:

  • 3 systems: 小字 (everyday), 大字 (legal), fullwidth (全角)
  • Range: 0 to 10^16 (京)
  • Both directions (num → kan, kan → num)
  • Legal yen formatting for contracts
  • Examples for common amounts
  • BigInt-backed math (no float loss)
  • Japanese / English UI
  • Zero dependencies, 74 tests

Groups of 4, not 3

English groups thousands: 1,234,567,890. Japanese groups ten thousands: 12億3456万7890. Every 4 digits gets a new large unit:

Large unit | Power of ten | Value
万 (man) | 10^4 | 10,000
億 (oku) | 10^8 | 100,000,000
兆 (chou) | 10^12 | 1 trillion
京 (kei) | 10^16 | 10 quadrillion

So conversion splits the number into 4-digit groups from the right, formats each group independently, and joins them with large units:

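// One large unit per 4-digit group, lowest group first
const LARGE_UNITS = ['', '万', '億', '兆', '京'];

export function toKansuji(num, system = 'shouji') {
  let n = BigInt(num);
  if (n === 0n) return '〇'; // zero has no groups to format; 〇 is assumed as its output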

  const groups = [];
  while (n > 0n) {
    groups.push(n % 10000n);
    n = n / 10000n;
  }
  // groups[0] is 一-千, groups[1] is 万, groups[2] is 億, groups[3] is 兆, groups[4] is 京

  const parts = [];
  for (let i = groups.length - 1; i >= 0; i--) {
    if (groups[i] === 0n) continue;
    parts.push(formatGroup(groups[i], system) + LARGE_UNITS[i]);
  }
  return parts.join('');
}

Empty groups (0000) are skipped so that their large unit is not emitted on its own: 100,000,003 becomes 一億三 rather than 一億万三.
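To see the grouping on its own, here is a minimal sketch that keeps Arabic digits inside each group and only adds the large units. The helper names splitIntoMyriads and toMixedNotation are hypothetical and not taken from the project.

```javascript
// Large units by group index (lowest group first), as in the table above.
const LARGE_UNITS = ['', '万', '億', '兆', '京'];

// Split a non-negative integer into 4-digit groups, least significant first.
function splitIntoMyriads(num) {
  let n = BigInt(num);
  const groups = [];
  while (n > 0n) {
    groups.push(n % 10000n);
    n /= 10000n; // BigInt division truncates
  }
  return groups;
}

// Join the groups from most significant to least, skipping empty ones.
function toMixedNotation(num) {
  const groups = splitIntoMyriads(num);
  let out = '';
  for (let i = groups.length - 1; i >= 0; i--) {
    if (groups[i] === 0n) continue; // an empty group contributes no unit
    out += groups[i].toString() + LARGE_UNITS[i];
  }
  return out || '0';
}

console.log(toMixedNotation(1234567890)); // 12億3456万7890
console.log(toMixedNotation(100000003));  // 1億3
```

The second call shows the empty 万 group being dropped rather than printed as a bare unit.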

BigInt for precision

At 10^16, regular JavaScript numbers have already lost integer precision: 10^16 is greater than Number.MAX_SAFE_INTEGER (2^53 − 1 ≈ 9 × 10^15). BigInt has no such limit:

let n = typeof num === 'bigint' ? num : BigInt(num);
// All arithmetic in BigInt
while (n > 0n) {
  groups.push(n % 10000n);
  n = n / 10000n;
}

The n variable is a bigint, not a number. Operations use 0n and 10000n (BigInt literals) to avoid mixing types. The result is exact even at 9,999,999,999,999,999, which is above Number.MAX_SAFE_INTEGER and just below 1京.
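The difference is easy to check directly. This short sketch, using only built-in Number and BigInt behavior, compares the same value held as a number and as a bigint; the variable names are illustrative.

```javascript
// The same 16-digit value, once as a double and once as a BigInt.
const asNumber = Number('9999999999999999'); // rounded to the nearest double
const asBigInt = BigInt('9999999999999999'); // exact

console.log(Number.MAX_SAFE_INTEGER);          // 9007199254740991
console.log(Number.isSafeInteger(asNumber));   // false
console.log(asNumber === 10000000000000000);   // true: the value was rounded up
console.log(asBigInt % 10000n);                // 9999n: lowest group is intact
console.log(BigInt(asNumber) % 10000n);        // 0n: lowest group was lost
```

Passing large amounts in as strings or bigint values, rather than as number literals, avoids the rounding before the converter ever sees the input.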

Implicit 一 omission

In 小字 (everyday) notation, a leading 1 before 十/百/千 is usually omitted:

  • 11 = 十一 (not 一十一)
  • 100 = 百 (not 一百)
  • 1000 = 千 (not 一千)
  • 10000 = 一万 (not 万; the 一 before 万 is required)

function formatGroup(n, system) {
  const v = Number(n); // a group is 0..9999, so Number is safe here
  // Thousands, hundreds, tens, ones
  const digits = [v / 1000 | 0, (v / 100 | 0) % 10, (v / 10 | 0) % 10, v % 10];
  let result = '';
  const units = ['千', '百', '十', '']; // shown for 小字; 大字 uses its formal unit characters instead
  for (let i = 0; i < 4; i++) {
    if (digits[i] === 0) continue;
    // Omit leading 一 before 千/百/十 (but not before standalone 一)
    if (digits[i] === 1 && units[i] !== '' && system === 'shouji') {
      result += units[i];
    } else {
      // DIGIT_CHARS maps 1..9 to the digit characters of the chosen system
      result += DIGIT_CHARS[digits[i]] + units[i];
    }
  }
  return result;
}

In 大字 (legal) notation, the 壱 is NEVER omitted. This is intentional — legal documents use 大字 precisely to prevent someone from adding a stroke to a 一 and turning 一万円 into 十万円. The 壱 is visually too different to forge.
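Putting the pieces together, the following is a self-contained sketch of both notations, not the project's actual code. The SYSTEMS table, the omitOne flag, and the name toKansujiSketch are assumptions made for illustration. The 大字 characters beyond those shown in this article (伍, 陸, 漆, 捌, 玖, 萬) and the zero characters are also assumed.

```javascript
// Per-system character tables (assumed; the real project may organize these differently).
const SYSTEMS = {
  shouji: {
    digits: ['', '一', '二', '三', '四', '五', '六', '七', '八', '九'],
    units: ['千', '百', '十', ''],
    large: ['', '万', '億', '兆', '京'],
    zero: '〇',
    omitOne: true, // 千, not 一千
  },
  daiji: {
    digits: ['', '壱', '弐', '参', '肆', '伍', '陸', '漆', '捌', '玖'],
    units: ['阡', '佰', '拾', ''],
    large: ['', '萬', '億', '兆', '京'],
    zero: '零',
    omitOne: false, // 壱 is always written
  },
};

// Format one 4-digit group (0..9999) without its large unit.
function formatGroup(group, table) {
  const v = Number(group);
  const digits = [
    Math.floor(v / 1000),
    Math.floor(v / 100) % 10,
    Math.floor(v / 10) % 10,
    v % 10,
  ];
  let out = '';
  for (let i = 0; i < 4; i++) {
    if (digits[i] === 0) continue;
    // Drop the digit only for a 1 in front of 千/百/十, and only if the system allows it.
    const skipOne = table.omitOne && digits[i] === 1 && table.units[i] !== '';
    out += (skipOne ? '' : table.digits[digits[i]]) + table.units[i];
  }
  return out;
}

function toKansujiSketch(num, system = 'shouji') {
  const table = SYSTEMS[system];
  if (!table) throw new Error(`unknown system: ${system}`);
  let n = BigInt(num);
  if (n < 0n || n >= 10n ** 20n) throw new RangeError('out of supported range');
  if (n === 0n) return table.zero;

  const parts = [];
  for (let i = 0; n > 0n; i++, n /= 10000n) {
    const group = n % 10000n;
    if (group === 0n) continue; // skip empty groups and their large unit
    parts.unshift(formatGroup(group, table) + table.large[i]);
  }
  return parts.join('');
}

console.log(toKansujiSketch(1234));          // 千二百三十四
console.log(toKansujiSketch(1234, 'daiji')); // 壱阡弐佰参拾肆
console.log(toKansujiSketch(10000));         // 一万
console.log(toKansujiSketch(100000003n));    // 一億三
```

In this sketch the two systems share one code path. The only behavioral switch is omitOne, which keeps the "壱 is never omitted" rule in the data rather than in a branch on the system name.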

大字: 壱弐参 for contracts

The legal kanji system uses visually distinct characters:

Value | 小字 | 大字
1 | 一 | 壱
2 | 二 | 弐
3 | 三 | 参
10 | 十 | 拾
100 | 百 | 佰
1000 | 千 | 阡
10000 | 万 | 萬

Real legal use: Japanese checks, contracts, and official documents conventionally write monetary amounts in 大字, and some official filings require it. The purpose is forgery prevention: 一 can be turned into 二, 十, or 千 with added strokes, but 壱 cannot be turned into 弐 without obvious tampering.

Series

This is entry #86 in my 100+ public portfolio series.

Source: dev.to
