Protobuf vs JSON

Modern applications depend on fast, efficient communication. JSON has long been the default format for data exchange, but many large-scale tech companies are rethinking that choice. Google created Protocol Buffers (Protobuf), and companies like Atlassian and Netflix have adopted it to improve performance and reduce payload sizes across services.

Their motivation? Faster API responses, smaller data transfers, safer schema evolution as systems grow, and built-in schema enforcement.

In this guide, we'll break down what JSON and Protobuf are, compare their strengths and weaknesses, and walk you through practical examples in Ruby. Whether you're building microservices or optimizing existing APIs, this guide will help you decide.

1. What is JSON?

JSON (JavaScript Object Notation) is a lightweight, human-readable data format commonly used in web APIs and configuration files. Its syntax is derived from JavaScript object literals, and virtually every programming language can read and write it.

Example:

{"name":"Alice","age":28,"isActive":true}

2. Why is JSON Preferred?

JSON is often the default choice because:

Readable: Easily understood and edited by humans
Simple: No extra libraries needed in JavaScript/Node.js
Interoperable: Works across all platforms and languages
Debug-friendly: Can be logged and inspected directly
Universal: Every language has built-in or one-line library support

But JSON has limitations in performance, file size, and type safety, especially in large systems.
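The type-safety limitation is easy to reproduce in Ruby. The sketch below uses only the standard library; `valid_user?` is a hypothetical helper written for this illustration, standing in for the checks a schema would otherwise provide.

```ruby
require "json"

# JSON.parse accepts any well-formed document; it knows nothing about
# the types the application expects.
good = JSON.parse('{"name":"Alice","age":28,"isActive":true}')
bad  = JSON.parse('{"name":"Alice","age":"28","isActive":"yes"}')

# Hand-written validation (hypothetical helper, not part of any library).
def valid_user?(user)
  user["name"].is_a?(String) &&
    user["age"].is_a?(Integer) &&
    [true, false].include?(user["isActive"])
end

puts valid_user?(good) # prints true
puts valid_user?(bad)  # prints false: "28" and "yes" parsed fine but are Strings
```

Both documents parse without error, so a mistyped field only surfaces when some later code tries to use it.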

3. What are Protocol Buffers?

Protocol Buffers (Protobuf) is a binary serialization format developed by Google. It encodes structured data into compact bytes that are fast to parse, using schemas defined in .proto files.

Key features:

Compact binary format (often 3-10x smaller than JSON, depending on the data)
Faster parsing and serialization
Schema enforcement via .proto files
Built-in backward and forward compatibility (as long as field numbers stay stable)
Code generation for multiple languages

Example .proto file:

syntax = "proto3";

message User {
  string name = 1;
  int32 age = 2;
  bool is_active = 3;
}
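To see where the size savings come from, here is a minimal sketch that hand-encodes the User message above using the Protobuf wire format (varint keys and values, length-prefixed strings). It is for illustration only: in real code the google-protobuf gem does this for you, and the helper names `encode_varint`, `field_key`, and `encode_user` are made up for this example.

```ruby
require "json"

# Encode a non-negative integer as a base-128 varint (7 bits per byte,
# high bit set on every byte except the last).
def encode_varint(value)
  raise ArgumentError, "negative values are not handled in this sketch" if value.negative?
  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7
    if value.zero?
      bytes << byte
      break
    end
    bytes << (byte | 0x80)
  end
  bytes.pack("C*")
end

# A field key is (field_number << 3) | wire_type, written as a varint.
# Wire type 0 = varint, wire type 2 = length-delimited.
def field_key(field_number, wire_type)
  encode_varint((field_number << 3) | wire_type)
end

def encode_user(name:, age:, is_active:)
  out = "".b
  out << field_key(1, 2) << encode_varint(name.bytesize) << name.b
  out << field_key(2, 0) << encode_varint(age)
  out << field_key(3, 0) << encode_varint(is_active ? 1 : 0)
  out
end

binary = encode_user(name: "Alice", age: 28, is_active: true)
json   = JSON.generate({ "name" => "Alice", "age" => 28, "isActive" => true })

puts binary.unpack1("H*") # prints 0a05416c696365101c1801
puts binary.bytesize      # prints 11
puts json.bytesize        # prints 41
```

The field names never appear in the binary output; only the field numbers do, which is most of the difference between 11 and 41 bytes here.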

4. Protobuf vs JSON: Key Differences

Feature	JSON	Protobuf
Format	Text (UTF-8)	Binary
Human Readable	Yes	No (bytes)
Size	Larger	Smaller (often 3-10x)
Speed	Slower parsing	Faster parsing
Schema	Optional/loose	Required (`.proto`)
Type Safety	Weak (errors surface at runtime)	Strong (typed generated code; compile-time checks in statically typed languages)
Versioning	Manual (URL versioning)	Built-in (field numbers)
Browser-friendly	Native	Needs library

Illustrative example: a list of 1,000 products (rough figures; actual sizes depend on the data)

JSON: ~150KB payload, ~300ms download on 3G
Protobuf: ~45KB payload, ~90ms download on 3G
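The download times follow directly from the payload sizes once a throughput is fixed. The figures above imply roughly 500 KB/s, which is an assumption read back from the numbers rather than a measured 3G speed. A small sketch of the arithmetic:

```ruby
# Throughput implied by a payload size (KB) and a download time (ms).
def throughput_kb_per_s(payload_kb, download_ms)
  payload_kb * 1000.0 / download_ms
end

# Download time in milliseconds for a payload at a given throughput.
def transfer_ms(payload_kb, throughput_kb_per_s)
  (payload_kb * 1000.0 / throughput_kb_per_s).round
end

rate = throughput_kb_per_s(150, 300)
puts rate                   # prints 500.0
puts transfer_ms(150, rate) # prints 300
puts transfer_ms(45, rate)  # prints 90
```

Transfer time scales linearly with size, so a payload about 3.3x smaller downloads about 3.3x faster on the same link; latency and connection setup are ignored here.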

5. How to Use Protobuf in Ruby

In Ruby, Protobuf is typically used through classes generated ahead of time by protoc, with the google-protobuf gem providing the runtime.

Step 1: Define the `.proto` File

syntax = "proto3";

package my_app;

message Customer {
  string name = 1;
  string email = 2;
  int32 seats = 3;
}

message CreateCustomerRequest {
  string name = 1;
  string email = 2;
}

message CreateCustomerResponse {
  Customer customer = 1;
}

Step 2: Generate Ruby Code

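# Install the protoc compiler (platform-specific)
# macOS: brew install protobuf
# Debian/Ubuntu: apt-get install protobuf-compiler

# Install the Ruby runtime used by the generated code
gem install google-protobuf

# Generate Ruby classes (the output directory must already exist)
protoc --ruby_out=app/contracts customer.proto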

This generates customer_pb.rb:

# Generated by the protocol buffer compiler. DO NOT EDIT!
require 'google/protobuf'

descriptor_data = "..." # Binary schema data

Google::Protobuf::DescriptorPool.generated_pool.add_serialized_file(descriptor_data)

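# The MyApp module comes from a `package my_app;` declaration in the .proto file.
# Without a package, these constants are defined at the top level instead.
module MyApp
  Customer = Google::Protobuf::DescriptorPool.generated_pool.lookup("my_app.Customer").msgclass
  CreateCustomerRequest = Google::Protobuf::DescriptorPool.generated_pool.lookup("my_app.CreateCustomerRequest").msgclass
  CreateCustomerResponse = Google::Protobuf::DescriptorPool.generated_pool.lookup("my_app.CreateCustomerResponse").msgclass
end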

Step 3: Use in Your Application

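# Load the generated classes (path assumes the --ruby_out directory from Step 2)
require_relative "app/contracts/customer_pb"

# Creating a message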
request = MyApp::CreateCustomerRequest.new(
  name: "Acme Corp",
  email: "billing@acme.com"
)

# Accessing fields
request.name   # => "Acme Corp"
request.email  # => "billing@acme.com"

# Serialize to JSON (for HTTP APIs)
json_payload = request.to_json
# => '{"name":"Acme Corp","email":"billing@acme.com"}'

# Parse incoming JSON
raw_json = '{"name":"Acme Corp","email":"billing@acme.com"}'
parsed = MyApp::CreateCustomerRequest.decode_json(raw_json)
parsed.name  # => "Acme Corp"

# Serialize to binary (for internal services)
binary = request.to_proto
# => <Binary data>

# Parse from binary
decoded = MyApp::CreateCustomerRequest.decode(binary)

6. Real-World API Patterns

Pattern 1: Twirp (JSON over HTTP with Protobuf Schema)

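Twirp is a simple RPC framework from Twitch that uses Protobuf for API contracts and communicates over plain HTTP. It accepts both JSON and binary Protobuf request bodies; this pattern uses JSON.

Rules:

All endpoints use POST
Content-Type: application/json (application/protobuf selects the binary encoding)
URL pattern: /api/{service}/{method} (Twirp's default path prefix is /twirp; /api is the prefix used in this example)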

Example:

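# Controller
class UsersController < ApplicationController
  def create
    # Decode the JSON body into a Protobuf object.
    # The local is named create_request so it does not shadow the
    # controller's own `request` method.
    create_request = UsersPb::CreateUserRequest.decode_json(request.raw_post)

    # Business logic here...
    user = User.create!(name: create_request.name, email: create_request.email)

    # Build the response message.
    # UsersPb::User is assumed to be the message type of the `user` field.
    create_response = UsersPb::CreateUserResponse.new(
      user: UsersPb::User.new(
        name: user.name,
        email: user.email
      )
    )

    # Encode the response to JSON
    render json: create_response.to_json
  end
end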

This gives you:

Type safety at the API boundary
Human-readable JSON for debugging
Automatic schema documentation via .proto files
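On the client side, a call that follows the rules above is an ordinary HTTP POST. The sketch below builds such a request with Ruby's standard library and does not send it. The base URL, the `users` service, the `CreateUser` method name, and the `build_rpc_request` helper are all hypothetical and only illustrate the /api/{service}/{method} pattern.

```ruby
require "json"
require "net/http"
require "uri"

# Build (but do not send) a JSON RPC-style POST request.
def build_rpc_request(base_url, service, method_name, payload)
  uri = URI.join(base_url, "/api/#{service}/#{method_name}")
  req = Net::HTTP::Post.new(uri)
  req["Content-Type"] = "application/json"
  req.body = JSON.generate(payload)
  [uri, req]
end

uri, req = build_rpc_request(
  "https://example.test",
  "users",
  "CreateUser",
  { name: "Acme Corp", email: "billing@acme.com" }
)

puts req.method # prints POST
puts req.path   # prints /api/users/CreateUser
puts req.body   # prints {"name":"Acme Corp","email":"billing@acme.com"}

# Sending it would look like this (not executed here):
# Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
```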

Pattern 2: gRPC (Binary over HTTP/2)

For internal microservices, gRPC uses Protobuf's binary format over HTTP/2 for maximum performance.

Characteristics:

Binary payloads (smaller, faster)
HTTP/2 (multiplexing, streaming)
Generated, strongly typed client and server stubs for many languages
Bi-directional streaming support
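Inside an HTTP/2 stream, gRPC sends each serialized message with a 5-byte prefix: a 1-byte compression flag followed by a 4-byte big-endian length. The sketch below reproduces just that framing with the standard library; `grpc_frame` and `grpc_unframe` are names invented for this example, and a real client would use the grpc gem rather than framing by hand.

```ruby
# Wrap serialized message bytes in gRPC's length-prefixed framing:
# [1-byte compressed flag][4-byte big-endian length][message bytes]
def grpc_frame(message_bytes, compressed: false)
  [compressed ? 1 : 0, message_bytes.bytesize].pack("CN") + message_bytes.b
end

# Split one frame back into [compressed?, message bytes].
def grpc_unframe(frame)
  flag, length = frame.unpack("CN")
  [flag == 1, frame.byteslice(5, length)]
end

frame = grpc_frame("abc")
puts frame.unpack1("H*") # prints 0000000003616263
puts frame.bytesize      # prints 8
```

The fixed 5-byte overhead per message is small next to the HTTP headers a JSON request typically carries, which is part of why gRPC suits frequent small calls.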

7. When to Use Each

Use Protobuf when:

Building microservices that communicate frequently
Performance and bandwidth are critical (mobile, IoT)
You need strict schemas and versioning
Data payloads are large or frequent

Use JSON when:

Building browser-facing APIs
You want human-readable data for debugging
Quick prototyping and development
Working with third-party/public APIs
Team is unfamiliar with Protobuf toolchain

The Hybrid Approach (Best of Both)

Many teams use both strategically:

Protobuf binary for internal service-to-service communication (speed)
JSON for browser/client-facing APIs (readability)
The same .proto file generates both binary and JSON serializers

Example architecture:

Browser ──HTTP/JSON──▶ API Gateway ──gRPC/Protobuf──▶ User Service
                                      │
                                      └─gRPC/Protobuf──▶ Payment Service

8. Common Gotchas

1. Field Numbers Matter

In Protobuf, each field has a number (e.g., string name = 1;). These numbers:

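Must never change for existing fields
Should never be reused, even after a field is deleted; mark removed numbers as reserved so they cannot be reassigned
Enable backward and forward compatibility (old code skips fields it does not know; new code sees defaults for fields an old writer never sent)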
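The "old code ignores new fields" behaviour falls out of the wire format: every field carries its number and wire type, so a reader can skip anything it does not recognise. Here is a minimal sketch of such a reader, handling only varint and length-delimited fields. The helpers `read_varint` and `decode_known` are hypothetical and far simpler than a real Protobuf decoder.

```ruby
# Read one varint starting at pos; return [value, next_pos].
def read_varint(bytes, pos)
  value = 0
  shift = 0
  loop do
    byte = bytes.getbyte(pos)
    raise ArgumentError, "truncated varint" if byte.nil?
    pos += 1
    value |= (byte & 0x7F) << shift
    return [value, pos] if (byte & 0x80).zero?
    shift += 7
  end
end

# Decode only the fields listed in `known` (field number => name);
# fields with any other number are read past and dropped.
def decode_known(bytes, known)
  result = {}
  pos = 0
  while pos < bytes.bytesize
    key, pos = read_varint(bytes, pos)
    field_number = key >> 3
    wire_type = key & 0x07
    case wire_type
    when 0 # varint
      value, pos = read_varint(bytes, pos)
    when 2 # length-delimited (strings, bytes, nested messages)
      length, pos = read_varint(bytes, pos)
      value = bytes.byteslice(pos, length).force_encoding(Encoding::UTF_8)
      pos += length
    else
      raise ArgumentError, "wire type #{wire_type} is not handled in this sketch"
    end
    name = known[field_number]
    result[name] = value if name
  end
  result
end

# Bytes from a newer writer: name = 1, age = 2, is_active = 3
new_message = ["0a05416c696365101c1801"].pack("H*")

# An older reader that only knows fields 1 and 2
old_reader = { 1 => :name, 2 => :age }
decoded = decode_known(new_message, old_reader)

puts decoded[:name] # prints Alice
puts decoded[:age]  # prints 28
puts decoded.size   # prints 2 (field 3 was skipped)
```

This is also why a number must keep its meaning forever: if field 3 were later reassigned to a different type, old and new code would silently disagree about the same bytes.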

2. Default Values

Protobuf has implicit defaults:

string → ""
int32 → 0
bool → false

You can't distinguish between "field not set" and "field set to default" unless the field is declared with the optional keyword (available in proto3 by default since protoc 3.15).
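The ambiguity exists because a proto3 field without explicit presence is not written to the wire at all when it holds its default value. The sketch below imitates that rule for a single int32 field; `encode_int32_field` is a hypothetical helper that only covers non-negative values.

```ruby
# Encode a non-negative integer as a base-128 varint.
def encode_varint(value)
  raise ArgumentError, "negative values are not handled in this sketch" if value.negative?
  bytes = []
  loop do
    byte = value & 0x7F
    value >>= 7
    if value.zero?
      bytes << byte
      break
    end
    bytes << (byte | 0x80)
  end
  bytes.pack("C*")
end

# Mimic proto3 implicit presence: a default value (0) produces no bytes.
def encode_int32_field(field_number, value)
  return "".b if value.zero?
  encode_varint(field_number << 3) + encode_varint(value) # wire type 0 (varint)
end

puts encode_int32_field(2, 28).unpack1("H*") # prints 101c
puts encode_int32_field(2, 0).bytesize       # prints 0
```

A message whose age was never set and one whose age was explicitly set to 0 therefore serialize to identical bytes, and the reader has nothing to tell them apart.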

3. Required vs Optional

proto3 has no required fields: any field may be left unset, and an unset field reads back as its default value. If you need explicit presence tracking:

optional string name = 1;  // Now you can check if it was set

9. Conclusion

Use Case	Recommendation
Human readability	JSON
Performance (internal services)	Protobuf
Browser/frontend APIs	JSON
Microservices (service mesh)	Protobuf + gRPC
RPC with schema enforcement	Protobuf + Twirp
Debugging	JSON
Schema evolution	Protobuf

Both formats have their place. JSON is perfect for human-facing interfaces and simple APIs, while Protobuf excels in high-performance, structured systems.