Rails Health Check

Beyond `/up`: Production-Grade Health Checks for Rails and Rack Applications

Most applications start with a simple question:

Is the process alive?

Rails answers that with the built-in /up endpoint.

But production systems need to answer a much more important question:

Is the application actually healthy?

Databases, Redis, queue systems, SMTP servers, external APIs, disk space, and memory pressure all affect an application's ability to serve requests. A process can remain alive even when the application itself is degraded or completely unavailable.

That's the problem I set out to solve with rails_health_checks.

The Problem with Basic Health Endpoints

Simple liveness checks are easy:

get "/up" => proc { [200, {}, ["OK"]] }

But modern applications depend on much more than the web process itself.

Questions that matter in production include:

Can ActiveRecord reach the database?
Is Redis available?
Are background jobs piling up?
Is an external API responding?
Is the server running low on disk space?
Is memory usage approaching dangerous levels?

Teams often end up maintaining custom controllers, bespoke checks, or aging libraries that require significant configuration and don't scale well.

Introducing rails_health_checks

rails_health_checks provides production-grade health endpoints with:

Built-in checks for common dependencies
Parallel execution
Result caching
Prometheus metrics
Structured JSON responses
Authentication options
Check grouping
Custom checks

It began as a Rails engine, but now also includes a standalone Rack application that can be mounted into virtually any Rack-based framework.

Rails Integration

In Rails, setup is straightforward. Add the gem to your Gemfile:

gem "rails_health_checks"

Mount the engine in config/routes.rb:

mount RailsHealthChecks::Engine => "/health"

and you're done.

Beyond Rails: Rack Support

One of the newest additions to the project is RailsHealthChecks::Rack::App.

This makes the same endpoints available without requiring Rails routing or ActionDispatch.

That means the gem can now be used with:

Sinatra
Roda
Plain Rack applications
Internal services
Lightweight APIs

For example:

# config.ru

require "rails_health_checks"
require "rails_health_checks/rack/app"

map "/health" do
  run RailsHealthChecks::Rack::App
end

run MyApp

The same endpoints are exposed:

/health
/health/live
/health/metrics
/health/:group

bringing a consistent health-checking experience across different Ruby stacks.
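
Mounting works this way because a Rack application is simply an object that responds to call(env) and returns a status, headers, and body. The sketch below illustrates that contract with a hypothetical TinyHealthApp class; it is not the gem's implementation, and the check names and JSON shape are assumptions made for the example.

```ruby
require "json"

# A hypothetical stand-in for a health app, NOT the gem's implementation.
# Any object that responds to call(env) and returns [status, headers, body]
# is a valid Rack application.
class TinyHealthApp
  def initialize(checks)
    @checks = checks # { "name" => callable returning true/false }
  end

  def call(env)
    case env["PATH_INFO"]
    when "/live"
      # Liveness: the process answered; no dependency is consulted.
      respond(200, { status: "alive" })
    when "", "/"
      results = @checks.transform_values { |check| safe_call(check) }
      healthy = results.values.all?
      respond(healthy ? 200 : 503,
              { status: healthy ? "healthy" : "unhealthy", checks: results })
    else
      respond(404, { error: "not found" })
    end
  end

  private

  # A check that raises is treated the same as a check that returns false.
  def safe_call(check)
    check.call ? true : false
  rescue StandardError
    false
  end

  def respond(code, payload)
    [code, { "content-type" => "application/json" }, [JSON.generate(payload)]]
  end
end

app = TinyHealthApp.new({ "database" => -> { true } })
status, _headers, body = app.call({ "PATH_INFO" => "/live" })
puts status
puts body.first
```

When such an object is mounted under map "/health" in config.ru, Rack passes the remainder of the path in PATH_INFO, which is why the sketch matches on "" and "/live" rather than on the full URL.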

Parallel by Design

One common problem with health systems is latency.

If checks are performed sequentially:

Database
Redis
SMTP
Sidekiq
External APIs

then the response time is roughly the sum of every dependency's latency.

rails_health_checks executes checks in parallel using Concurrent::Future, making the total response time roughly equal to the slowest dependency rather than all of them combined.

Benchmarks show five 10 ms checks completing in roughly 13 ms instead of over 60 ms, a speedup of approximately 4.5×.
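
To make the difference concrete, here is a minimal sketch of sequential versus parallel execution. It uses plain Thread from the standard library instead of Concurrent::Future so that it runs without extra gems; the check names and the 10 ms sleep are stand-ins for real network calls, and the timings it prints will vary by machine.

```ruby
# Five simulated dependency checks that each take about 10 ms.
# The names and the sleep are stand-ins for real network calls.
CHECKS = %w[database redis smtp sidekiq external_api].to_h { |name|
  [name, -> { sleep 0.01; :ok }]
}

def monotonic_ms
  Process.clock_gettime(Process::CLOCK_MONOTONIC, :float_millisecond)
end

# Sequential: total time is roughly the sum of all check durations.
def run_sequential(checks)
  checks.transform_values(&:call)
end

# Parallel: start one thread per check, then collect each result.
# Thread#value waits for the thread and returns the block's return value.
def run_parallel(checks)
  threads = checks.transform_values { |check| Thread.new { check.call } }
  threads.transform_values(&:value)
end

# Run the block and return its result together with the elapsed milliseconds.
def timed
  started = monotonic_ms
  result = yield
  [result, monotonic_ms - started]
end

_, sequential_ms = timed { run_sequential(CHECKS) }
_, parallel_ms   = timed { run_parallel(CHECKS) }

puts format("sequential: %.1f ms, parallel: %.1f ms", sequential_ms, parallel_ms)
```

Because sleep (like most network I/O in Ruby) releases the global interpreter lock, the threads overlap their waiting, so the parallel run should finish in roughly the time of the slowest single check.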

Caching to Reduce Load

Monitoring systems often hit health endpoints every few seconds.

Without caching, every request may repeatedly:

Query the database
Ping Redis
Check queue systems
Contact external services

Enabling caching is as simple as:

RailsHealthChecks.configure do |config|
  config.cache_duration = 10
end

This absorbs probe traffic and prevents health checks themselves from becoming a source of load.
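
The idea behind a cache duration can be sketched in a few lines. The TtlCache class below is hypothetical and is not how the gem implements caching; it only shows how a time-to-live lets many probes share one set of check results.

```ruby
# A hypothetical time-based cache, not the gem's implementation.
class TtlCache
  # clock is injectable so the expiry logic can be exercised without sleeping.
  def initialize(ttl_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @mutex = Mutex.new
    @value = nil
    @expires_at = nil
  end

  # Return the cached value while it is fresh; otherwise run the block,
  # store its result, and restart the expiry timer.
  def fetch
    @mutex.synchronize do
      now = @clock.call
      if @expires_at.nil? || now >= @expires_at
        @value = yield
        @expires_at = now + @ttl
      end
      @value
    end
  end
end

runs = 0
cache = TtlCache.new(10)
3.times { cache.fetch { runs += 1; { database: "ok" } } }
puts runs # prints 1: only the first probe ran the checks
```

Holding the mutex while the block runs is a deliberate choice in this sketch: concurrent probes wait for the one in-flight run instead of each starting their own.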

Prometheus Metrics Included

The gem also exposes a Prometheus endpoint:

GET /health/metrics

allowing health status and latency to be scraped directly by Prometheus and visualized in Grafana.

No additional adapters or exporters are required.
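
For readers unfamiliar with what Prometheus scrapes, the sketch below renders check results in the Prometheus text exposition format. The metric names (health_check_status, health_check_duration_seconds) and the sample results are assumptions chosen for illustration; the gem's actual metric names may differ.

```ruby
# Hypothetical check results; the metric names below are illustrative and
# are not taken from the gem.
RESULTS = [
  { name: "database", healthy: true,  duration: 0.004 },
  { name: "redis",    healthy: false, duration: 0.250 },
].freeze

# Escape backslash, double quote, and newline, as the text format requires
# for label values.
def escape_label(value)
  value.to_s.gsub(/[\\"\n]/) { |ch| ch == "\n" ? "\\n" : "\\#{ch}" }
end

# Build one gauge for pass/fail and one for duration, each labelled by check.
def render_metrics(results)
  lines = []
  lines << "# HELP health_check_status 1 if the check passed, 0 otherwise."
  lines << "# TYPE health_check_status gauge"
  results.each do |r|
    lines << %(health_check_status{check="#{escape_label(r[:name])}"} #{r[:healthy] ? 1 : 0})
  end
  lines << "# HELP health_check_duration_seconds Time spent running the check."
  lines << "# TYPE health_check_duration_seconds gauge"
  results.each do |r|
    lines << %(health_check_duration_seconds{check="#{escape_label(r[:name])}"} #{r[:duration]})
  end
  lines.join("\n") + "\n"
end

puts render_metrics(RESULTS)
```

Each sample line is a metric name, an optional set of labels in braces, and a numeric value, which is what lets Grafana break status and latency down per check.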

Framework-Agnostic by Design

Not every check depends on Rails.

Checks such as:

Disk space
Memory usage
HTTP endpoints
Redis
SMTP

can run inside any Rack application.

Meanwhile, Rails-specific checks continue to work naturally inside Rails applications.

This allows the same monitoring strategy to be used across a range of services, rather than maintaining separate solutions for each framework.
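
As an illustration of a check with no Rails dependency, here is a minimal sketch of a TCP reachability probe built only on Ruby's standard library. Redis and SMTP are both TCP services, so a probe like this is the simplest possible version of such a check. The tcp_reachable? helper is hypothetical and is not part of the gem.

```ruby
require "socket"

# Hypothetical helper: reports whether a TCP connection to host:port can be
# opened within the timeout. This only proves reachability, not that the
# service behind the port is answering correctly.
def tcp_reachable?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, timed out, unresolvable host, and similar failures.
  false
end

# Demonstrate against a throwaway local server instead of a real dependency.
server = TCPServer.new("127.0.0.1", 0) # port 0 asks the OS for a free port
port = server.addr[1]
puts tcp_reachable?("127.0.0.1", port) # true while the server is listening
server.close
puts tcp_reachable?("127.0.0.1", port) # false once nothing is listening
```

A real check would usually go one step further and speak the protocol, for example sending PING to Redis, since an open port does not guarantee a working service.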

Why I Built It

I wanted something that:

Felt native to Rails.
Worked outside Rails when needed.
Scaled under heavy probe traffic.
Supported modern queue systems.
Produced structured responses and metrics.
Was easy to extend.
Didn't require every team to reinvent health checks.

The result is a library that aims to answer not just:

"Is the process running?"

but the more useful question:

"Is the application healthy?"

Because in production, those are rarely the same thing.