Most "privacy-focused" PDF Tools Make One Quiet Compromise. Mine Doesn't. [Devlog #3]


Most "privacy-focused" PDF tools make one quiet compromise.
They still phone home.

Hiyoko doesn't. Not once.

Here's the architecture decision behind that —
and what breaks if you get the encryption wrong.


The rule I set on day one

The app only touches the network when the user explicitly asks it to.

PDFs carry contracts, medical records, tax documents. A lot of tools still run background telemetry or license checks while you work. I didn't want any of that.

Everything in Hiyoko PDF Vault runs locally. No exceptions.


Why AES-256-GCM — and what breaks with CBC

This is where many implementations quietly fail.

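| Mode        | Integrity check | What happens if someone tampers with the file |
|-------------|-----------------|-----------------------------------------------|
| AES-256-CBC | None            | You decrypt corrupted garbage. Silently.      |
| AES-256-GCM | Built-in (AEAD) | Decryption fails immediately. You know.       |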

CBC encrypts fine, but it has no authentication. A tampered ciphertext can decrypt into altered or garbage plaintext without raising any alarm (at best you sometimes get a padding error). You'd need to bolt on an HMAC separately, and that's exactly where implementation mistakes happen.

GCM gives you encryption + integrity in one primitive. Wrong password or modified file → the tag check fails and you get an error instead of plaintext. No ambiguity about whether the output can be trusted.
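
To make the "bolt on HMAC" risk concrete: one classic slip in a hand-rolled CBC + HMAC scheme is comparing the received tag with the computed one using `==`, which may return at the first mismatching byte and leak timing information. Below is a minimal sketch of a comparison that looks at every byte. The name `tags_match` is hypothetical and not part of Hiyoko. Real code would normally use the MAC library's own verify method or the `subtle` crate, since a compiler is free to optimize a hand-written loop like this.

```rust
/// Compare two authentication tags without exiting early on the first
/// mismatching byte. Hypothetical helper, for illustration only.
pub fn tags_match(expected: &[u8], received: &[u8]) -> bool {
    // Tag length is public information, so an early return here is fine.
    if expected.len() != received.len() {
        return false;
    }
    // OR together the XOR of every byte pair; any difference sets a bit.
    let mut diff = 0u8;
    for (a, b) in expected.iter().zip(received.iter()) {
        diff |= a ^ b;
    }
    diff == 0
}
```

With GCM none of this is your problem: the tag check happens inside `decrypt`, which is the point of picking an AEAD mode.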


Rust implementation

Using the aes-gcm crate:

use aes_gcm::{
    aead::{Aead, KeyInit, OsRng},
    Aes256Gcm, Nonce,
};
use aes_gcm::aead::rand_core::RngCore;

// Encrypts `data` and returns nonce (12 bytes) || ciphertext || GCM tag (16 bytes).
pub fn encrypt_pdf(data: &[u8], key: &[u8; 32]) -> Result<Vec<u8>, String> {
    let cipher = Aes256Gcm::new(key.into());

    let mut nonce_bytes = [0u8; 12];
    OsRng.fill_bytes(&mut nonce_bytes);
    let nonce = Nonce::from_slice(&nonce_bytes);

    let ciphertext = cipher
        .encrypt(nonce, data)
        .map_err(|e| e.to_string())?;

    // prepend nonce so we can recover it on decrypt
    let mut result = nonce_bytes.to_vec();
    result.extend_from_slice(&ciphertext);
    Ok(result)
}

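// Expects the layout produced by encrypt_pdf; fails if the key is wrong or the data was modified.
pub fn decrypt_pdf(data: &[u8], key: &[u8; 32]) -> Result<Vec<u8>, String> {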
    if data.len() < 12 {
        return Err("data too short".to_string());
    }
    let (nonce_bytes, ciphertext) = data.split_at(12);
    let cipher = Aes256Gcm::new(key.into());
    let nonce = Nonce::from_slice(nonce_bytes);

    cipher
        .decrypt(nonce, ciphertext)
        .map_err(|_| "decryption failed — wrong password or corrupted file".to_string())
}

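The nonce is randomly generated on every encrypt call, so the same file encrypted twice gives completely different output and pattern analysis has nothing to work with. It matters for safety too: reusing a nonce under the same key breaks GCM's guarantees.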
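
For reference, the byte layout that `encrypt_pdf` produces can be sketched as follows. The `aes-gcm` crate's `encrypt` appends the 16-byte authentication tag to the ciphertext, so with the 12-byte nonce prepended the output is always 28 bytes longer than the input. The helper names `encrypted_len` and `split_envelope` are hypothetical; they only illustrate the layout and do no cryptography.

```rust
const NONCE_LEN: usize = 12; // 96-bit GCM nonce, prepended by encrypt_pdf
const TAG_LEN: usize = 16; // 128-bit GCM tag, appended by the aes-gcm crate

/// Size of the encrypted blob for a plaintext of `plain_len` bytes.
pub fn encrypted_len(plain_len: usize) -> usize {
    NONCE_LEN + plain_len + TAG_LEN
}

/// Split a blob into (nonce, ciphertext, tag) without decrypting it.
/// Anything shorter than nonce + tag cannot be a valid blob.
pub fn split_envelope(data: &[u8]) -> Result<(&[u8], &[u8], &[u8]), String> {
    let min_len = NONCE_LEN + TAG_LEN;
    if data.len() < min_len {
        return Err(format!(
            "blob too short: {} bytes, need at least {}",
            data.len(),
            min_len
        ));
    }
    let (nonce, rest) = data.split_at(NONCE_LEN);
    let (ciphertext, tag) = rest.split_at(rest.len() - TAG_LEN);
    Ok((nonce, ciphertext, tag))
}
```

One consequence: the `data.len() < 12` guard in `decrypt_pdf` is looser than it could be. An input of 12 to 27 bytes passes the guard and is then rejected by the decrypt call itself. That is still safe; a stricter guard would just give a more specific error.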


Key derivation: why Argon2id over bcrypt

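User passwords are variable length and low entropy, but AES-256 needs a fixed 32-byte key, so the password goes through a key derivation function (KDF) together with a salt. The choice of KDF matters more than most people realize.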

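use argon2::Argon2;

// Derives a 32-byte AES key from the password.
// Argon2::default() selects the Argon2id variant with the crate's default cost parameters.
// The salt should be random and must be stored with the file so the same key can be derived again.
pub fn derive_key(password: &str, salt: &[u8]) -> [u8; 32] {
    let mut key = [0u8; 32];
    Argon2::default()
        .hash_password_into(password.as_bytes(), salt, &mut key)
        .expect("key derivation failed"); // panics on invalid input, e.g. a salt that is too short
    key
}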

Argon2id is memory-hard. Brute-forcing it requires not just compute but RAM for every guess, which makes GPU-based attacks expensive. PBKDF2 needs almost no memory and bcrypt only a few kilobytes, so neither has that property. For new implementations, Argon2id is the default I'd reach for.
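
A back-of-the-envelope sketch of why memory-hardness hurts GPU attacks: every in-flight guess needs its own working memory, so device memory caps how many guesses can run in parallel. The numbers here are assumptions for illustration: 19 MiB per Argon2id guess (the default in recent versions of the `argon2` crate, which may not match what Hiyoko ships with), roughly 4 KiB for bcrypt, and a hypothetical 24 GiB GPU. The function name is made up for this example.

```rust
pub const KIB_PER_GIB: u64 = 1024 * 1024;

/// Upper bound on password guesses that fit in device memory at once.
/// Ignores memory bandwidth and other overhead, so real numbers are lower.
/// Returns None if `mem_per_guess_kib` is zero.
pub fn max_parallel_guesses(device_mem_kib: u64, mem_per_guess_kib: u64) -> Option<u64> {
    device_mem_kib.checked_div(mem_per_guess_kib)
}
```

Under those assumptions, `max_parallel_guesses(24 * KIB_PER_GIB, 19 * 1024)` gives 1293 concurrent Argon2id guesses, while `max_parallel_guesses(24 * KIB_PER_GIB, 4)` gives 6,291,456 for a 4 KiB state. It is a crude bound, but the gap of several orders of magnitude is the whole argument.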


What "zero leak" actually means in practice

Beyond no network calls, I built a few extra guarantees:

  • Temp files go under NSTemporaryDirectory(), deleted immediately after processing — not dumped in /tmp and forgotten
  • Clipboard auto-clears 30 seconds after sensitive text is copied (in progress)
  • Log files record file paths and operation types only — never file contents
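
The temp-file rule in the first bullet can be sketched as a guard type that deletes its file when it goes out of scope, so cleanup also happens on early returns. This is a minimal sketch, not Hiyoko's actual code: `TempFileGuard` is a hypothetical name, and `std::env::temp_dir()` stands in for `NSTemporaryDirectory()`. A real implementation would also want an unpredictable file name (for example via the `tempfile` crate), and deleting a file does not securely erase its blocks on disk.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Owns a temporary file and deletes it when dropped.
/// Hypothetical type, for illustration only.
pub struct TempFileGuard {
    path: PathBuf,
}

impl TempFileGuard {
    /// Create `name` inside the platform temp directory and write `bytes` to it.
    pub fn create(name: &str, bytes: &[u8]) -> io::Result<Self> {
        let path = std::env::temp_dir().join(name);
        let mut file = File::create(&path)?;
        // Build the guard before writing, so a failed write still cleans up.
        let guard = Self { path };
        file.write_all(bytes)?;
        Ok(guard)
    }

    /// Location of the temporary file while the guard is alive.
    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TempFileGuard {
    fn drop(&mut self) {
        // Best effort: drop cannot return an error, so a failure is ignored.
        let _ = fs::remove_file(&self.path);
    }
}
```

The file exists for exactly as long as the guard value does, which makes "deleted immediately after processing" a property of the type rather than something each call site has to remember.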

Current state (dev build)

Wrong password → GCM auth tag verification fails → clear error message instead of garbage output. GCM cannot tell a wrong password from a tampered file, so the message names both possibilities.
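
One way to keep those messages consistent is a small typed error instead of ad hoc strings. This is only a sketch under my own naming: `DecryptError` and its variants are hypothetical, and the `decrypt_pdf` shown earlier returns plain `String` errors.

```rust
use std::fmt;

/// Hypothetical error type for the decrypt path.
#[derive(Debug, PartialEq)]
pub enum DecryptError {
    /// Input is shorter than the 12-byte nonce, so it cannot be a vault file.
    TooShort { got: usize },
    /// GCM tag check failed: wrong password or modified file.
    /// AES-GCM cannot tell these two cases apart.
    AuthFailed,
}

impl fmt::Display for DecryptError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecryptError::TooShort { got } => {
                write!(f, "file is only {} bytes; not an encrypted PDF", got)
            }
            DecryptError::AuthFailed => {
                write!(f, "wrong password, or the file was modified or corrupted")
            }
        }
    }
}

impl std::error::Error for DecryptError {}
```

The UI can then match on the variant to decide what to show, and the wording lives in one place.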


Next

1000-page PDFs rendered without freezing — virtual scroll + Ghost Batch architecture.


Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
X → @hiyoyok

Source: dev.to
