Welcome back to the Ruby for AI series. We've built chat interfaces, RAG systems, and AI agents. Now let's make our Rails app create images.
We'll integrate DALL-E and Stability AI, handle async generation with Active Job, and display results with Turbo Streams. Real code, real patterns.
Setting Up
Add the gems:
# Gemfile
gem "ruby-openai"
gem "httparty"
gem "image_processing"
Then install and generate the model:
bundle install
rails g model GeneratedImage prompt:text provider:string image_url:text status:integer user:references
rails db:migrate
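Both services below read their API keys from ENV. Assuming dotenv-rails (or your host's secret store) in development — the values here are placeholders:

```shell
# .env — development only; never commit real keys
OPENAI_API_KEY=your-openai-key
STABILITY_API_KEY=your-stability-key
```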
The model:
# app/models/generated_image.rb
class GeneratedImage < ApplicationRecord
belongs_to :user
has_one_attached :image
enum :status, { pending: 0, generating: 1, completed: 2, failed: 3 }
validates :prompt, presence: true
validates :provider, inclusion: { in: %w[dall_e stability_ai] }
end
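One assumption hiding here: the controller later calls `current_user.generated_images`, so the User model needs the inverse association. A minimal sketch:

```ruby
# app/models/user.rb — only the part this feature needs
class User < ApplicationRecord
  has_many :generated_images, dependent: :destroy
end
```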
DALL-E Integration
Wrap the OpenAI image API in a clean service:
# app/services/dall_e_service.rb
class DallEService
def initialize
@client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
end
def generate(prompt, size: "1024x1024", quality: "standard")
response = @client.images.generate(
parameters: {
model: "dall-e-3",
prompt: prompt,
size: size,
quality: quality,
n: 1
}
)
data = response.dig("data", 0)
{
url: data["url"],
revised_prompt: data["revised_prompt"]
}
rescue Faraday::Error => e
Rails.logger.error("DALL-E error: #{e.message}")
raise
end
end
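One guardrail worth adding around any service like this: the API enforces a prompt length limit (4,000 characters for dall-e-3 at the time of writing). A small, hypothetical helper — `normalize_prompt` is not part of the service above — that cleans input before sending it:

```ruby
# Hypothetical pre-flight helper: collapse whitespace and cap length
# before the prompt reaches the API.
MAX_PROMPT_LENGTH = 4_000

def normalize_prompt(raw)
  cleaned = raw.to_s.strip.gsub(/\s+/, " ") # collapse runs of whitespace
  cleaned[0, MAX_PROMPT_LENGTH]             # hard cap at the API's limit
end
```

You could call this in the model (a `before_validation` callback) or right before `service.generate`, depending on where you want the canonical prompt stored.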
Stability AI Integration
Stability AI gives you more control over the generation process:
# app/services/stability_ai_service.rb
class StabilityAiService
BASE_URL = "https://api.stability.ai/v2beta"
def initialize
@api_key = ENV["STABILITY_API_KEY"]
end
def generate(prompt, negative_prompt: "", aspect_ratio: "1:1")
response = HTTParty.post(
"#{BASE_URL}/stable-image/generate/sd3",
headers: {
"Authorization" => "Bearer #{@api_key}",
"Accept" => "application/json"
},
multipart: true,
body: {
prompt: prompt,
negative_prompt: negative_prompt,
aspect_ratio: aspect_ratio,
output_format: "png"
}
)
raise "Stability AI error: #{response.code} — #{response.body}" unless response.success?
{
base64: response.parsed_response["image"],
seed: response.parsed_response["seed"]
}
end
end
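The sd3 endpoint only accepts `aspect_ratio` as one of a fixed set of strings. If your UI collects pixel dimensions instead, a hypothetical helper like this can snap them to the nearest supported value (the ratio list below reflects Stability's v2beta docs at the time of writing — verify against the current reference):

```ruby
# Hypothetical helper: map arbitrary pixel dimensions to the closest
# aspect_ratio string the sd3 endpoint accepts.
RATIOS = %w[21:9 16:9 3:2 5:4 1:1 4:5 2:3 9:16 9:21].to_h do |s|
  w, h = s.split(":").map(&:to_f)
  [s, w / h] # e.g. "16:9" => 1.777...
end.freeze

def closest_aspect_ratio(width, height)
  target = width.to_f / height
  RATIOS.min_by { |_, ratio| (ratio - target).abs }.first
end
```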
Async Processing with Active Job
Image generation takes several seconds, so never block a web request waiting for it:
# app/jobs/generate_image_job.rb
require "open-uri" # URI.open (used below) is only defined once open-uri is loaded

class GenerateImageJob < ApplicationJob
queue_as :ai_tasks
retry_on Faraday::Error, wait: :polynomially_longer, attempts: 3
def perform(generated_image_id)
image = GeneratedImage.find(generated_image_id)
image.generating!
result = case image.provider
when "dall_e"
generate_with_dall_e(image)
when "stability_ai"
generate_with_stability(image)
end
attach_image(image, result)
image.completed!
broadcast_update(image)
rescue StandardError => e
image&.failed!
Rails.logger.error("Image generation failed: #{e.message}")
broadcast_update(image) if image
raise if e.is_a?(Faraday::Error) # re-raise so retry_on can still retry transient API errors
end
private
def generate_with_dall_e(image)
service = DallEService.new
result = service.generate(image.prompt)
{ url: result[:url], type: :url }
end
def generate_with_stability(image)
service = StabilityAiService.new
result = service.generate(image.prompt)
{ data: Base64.decode64(result[:base64]), type: :binary }
end
def attach_image(image, result)
case result[:type]
when :url
file = URI.open(result[:url]) # DALL-E URLs are temporary — download promptly
image.image.attach(io: file, filename: "generated_#{image.id}.png", content_type: "image/png")
when :binary
image.image.attach(
io: StringIO.new(result[:data]),
filename: "generated_#{image.id}.png",
content_type: "image/png"
)
end
end
def broadcast_update(image)
Turbo::StreamsChannel.broadcast_replace_to(
"user_#{image.user_id}_images",
target: image, # derives the same dom_id the partial uses
partial: "generated_images/image",
locals: { image: image }
)
end
end
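One thing the job quietly assumes: something is consuming the ai_tasks queue. The default async adapter runs it automatically, but a production backend needs to be told about the queue. Assuming Sidekiq, a minimal sketch:

```yaml
# config/sidekiq.yml — queues Sidekiq will poll
:queues:
  - default
  - ai_tasks
```

Then point Active Job at it with `config.active_job.queue_adapter = :sidekiq` in config/application.rb.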
The Controller
# app/controllers/generated_images_controller.rb
class GeneratedImagesController < ApplicationController
before_action :authenticate_user!
def index
@images = current_user.generated_images.order(created_at: :desc)
@image = GeneratedImage.new
end
def create
@image = current_user.generated_images.build(image_params)
@image.status = :pending
if @image.save
GenerateImageJob.perform_later(@image.id)
respond_to do |format|
format.turbo_stream
format.html { redirect_to generated_images_path }
end
else
@images = current_user.generated_images.order(created_at: :desc) # the index view needs this
render :index, status: :unprocessable_entity
end
end
private
def image_params
params.require(:generated_image).permit(:prompt, :provider)
end
end
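The controller assumes two routes that aren't shown above. A minimal sketch for config/routes.rb:

```ruby
# config/routes.rb — only the routes this feature needs
resources :generated_images, only: %i[index create]
```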
Real-Time Updates with Turbo
The create response streams the pending card immediately:
<%# app/views/generated_images/create.turbo_stream.erb %>
<%= turbo_stream.prepend "images" do %>
<%= render "generated_images/image", image: @image %>
<% end %>
<%# assumes the form partial's wrapper tag has id="new_image_form" %>
<%= turbo_stream.replace "new_image_form" do %>
<%= render "generated_images/form", image: GeneratedImage.new %>
<% end %>
The image partial handles all states:
<%# app/views/generated_images/_image.html.erb %>
<div id="<%= dom_id(image) %>" class="image-card">
<p class="prompt"><%= image.prompt %></p>
<% case image.status %>
<% when "pending", "generating" %>
<div class="spinner">Generating your image...</div>
<% when "completed" %>
<%= image_tag image.image, class: "generated" if image.image.attached? %>
<% when "failed" %>
<div class="error">Generation failed. Try again.</div>
<% end %>
</div>
Subscribe to updates on the index page:
<%# app/views/generated_images/index.html.erb %>
<%= turbo_stream_from "user_#{current_user.id}_images" %>
<%= render "form", image: @image %>
<div id="images">
<%= render @images %>
</div>
What's Happening
The flow: user submits a prompt → controller saves a pending record → Turbo Stream immediately shows a spinner → Active Job picks up the work → the API generates the image → the job attaches it and broadcasts a Turbo Stream replacement → the spinner becomes the finished image. No page refresh. No polling.
Both providers work the same way from the user's perspective. DALL-E returns a URL you download. Stability AI returns base64 you decode. The job handles both.
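To make the base64 branch concrete without hitting any API, here's the decode-and-wrap step in isolation, using only the Ruby standard library and a fabricated payload:

```ruby
require "base64"
require "stringio"

# Stand-in for Stability AI's JSON response: the 8-byte PNG file
# signature, base64-encoded (fabricated purely for this demo).
fake_payload = Base64.strict_encode64("\x89PNG\r\n\x1A\n".b)

# The job's :binary branch in miniature: decode, wrap in an IO object —
# that IO is exactly what Active Storage's attach(io:, ...) consumes.
io = StringIO.new(Base64.decode64(fake_payload))
signature = io.read(4) # first four bytes of the PNG signature
```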
Next up: voice and transcription. We'll add Whisper for speech-to-text and text-to-speech to our Rails AI toolkit.