Welcome back to the Ruby for AI series. We've built chat interfaces, RAG systems, and AI agents. Now let's make our Rails app create images.
We'll integrate DALL-E and Stability AI, handle async generation with Active Job, and display results with Turbo Streams. Real code, real patterns.
Setting Up
Add the gems:
# Gemfile
gem "ruby-openai"
gem "httparty"
gem "image_processing"
Then install and generate the model:
bundle install
rails g model GeneratedImage prompt:text provider:string image_url:text status:integer user:references
rails db:migrate
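Both services below read their API keys from ENV. Assuming dotenv-rails (or your host's secret store) in development — the values here are placeholders:

```shell
# .env — development only; never commit real keys
OPENAI_API_KEY=your-openai-key
STABILITY_API_KEY=your-stability-key
```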
The model:
# app/models/generated_image.rb
class GeneratedImage < ApplicationRecord
belongs_to :user
has_one_attached :image
enum :status, { pending: 0, generating: 1, completed: 2, failed: 3 }
validates :prompt, presence: true
validates :provider, inclusion: { in: %w[dall_e stability_ai] }
end
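One assumption hiding here: the controller later calls `current_user.generated_images`, so the User model needs the inverse association. A minimal sketch:

```ruby
# app/models/user.rb — only the part this feature needs
class User < ApplicationRecord
  has_many :generated_images, dependent: :destroy
end
```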
DALL-E Integration
Wrap the OpenAI image API in a clean service:
# app/services/dall_e_service.rb
class DallEService
def initialize
@client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])
end
def generate(prompt, size: "1024x1024", quality: "standard")
response = @client.images.generate(
parameters: {
model: "dall-e-3",
prompt: prompt,
size: size,
quality: quality,
n: 1
}
)
data = response.dig("data", 0)
{
url: data["url"],
revised_prompt: data["revised_prompt"]
}
rescue Faraday::Error => e
Rails.logger.error("DALL-E error: #{e.message}")
raise
end
end
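One guardrail worth adding around any service like this: the API enforces a prompt length limit (4,000 characters for dall-e-3 at the time of writing). A small, hypothetical helper — `normalize_prompt` is not part of the service above — that cleans input before sending it:

```ruby
# Hypothetical pre-flight helper: collapse whitespace and cap length
# before the prompt reaches the API.
MAX_PROMPT_LENGTH = 4_000

def normalize_prompt(raw)
  cleaned = raw.to_s.strip.gsub(/\s+/, " ") # collapse runs of whitespace
  cleaned[0, MAX_PROMPT_LENGTH]             # hard cap at the API's limit
end
```

You could call this in the model (a `before_validation` callback) or right before `service.generate`, depending on where you want the canonical prompt stored.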
Stability AI Integration
Stability AI gives you more control over the generation process:
# app/services/stability_ai_service.rb
class StabilityAiService
BASE_URL = "https://api.stability.ai/v2beta"
def initialize
@api_key = ENV["STABILITY_API_KEY"]
end
def generate(prompt, negative_prompt: "", aspect_ratio: "1:1")
response = HTTParty.post(
"#{BASE_URL}/stable-image/generate/sd3",
headers: {
"Authorization" => "Bearer #{@api_key}",
"Accept" => "application/json"
},
multipart: true,
body: {
prompt: prompt,
negative_prompt: negative_prompt,
aspect_ratio: aspect_ratio,
output_format: "png"
}
)
raise "Stability AI error: #{response.code} — #{response.body}" unless response.success?
{
base64: response.parsed_response["image"],
seed: response.parsed_response["seed"]
}
end
end
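The sd3 endpoint only accepts `aspect_ratio` as one of a fixed set of strings. If your UI collects pixel dimensions instead, a hypothetical helper like this can snap them to the nearest supported value (the ratio list below reflects Stability's v2beta docs at the time of writing — verify against the current reference):

```ruby
# Hypothetical helper: map arbitrary pixel dimensions to the closest
# aspect_ratio string the sd3 endpoint accepts.
RATIOS = %w[21:9 16:9 3:2 5:4 1:1 4:5 2:3 9:16 9:21].to_h do |s|
  w, h = s.split(":").map(&:to_f)
  [s, w / h] # e.g. "16:9" => 1.777...
end.freeze

def closest_aspect_ratio(width, height)
  target = width.to_f / height
  RATIOS.min_by { |_, ratio| (ratio - target).abs }.first
end
```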
Async Processing with Active Job
Image generation takes several seconds, so never block a web request waiting for it:
# app/jobs/generate_image_job.rb
require "open-uri" # URI.open (used below) is only defined once open-uri is loaded

class GenerateImageJob < ApplicationJob
queue_as :ai_tasks
retry_on Faraday::Error, wait: :polynomially_longer, attempts: 3
def perform(generated_image_id)
image = GeneratedImage.find(generated_image_id)
image.generating!
result = case image.provider
when "dall_e"
generate_with_dall_e(image)
when "stability_ai"
generate_with_stability(image)
end
attach_image(image, result)
image.completed!
broadcast_update(image)
rescue StandardError => e
image&.failed!
Rails.logger.error("Image generation failed: #{e.message}")
broadcast_update(image) if image
raise if e.is_a?(Faraday::Error) # re-raise so retry_on can still retry transient API errors
end
private
def generate_with_dall_e(image)
service = DallEService.new
result = service.generate(image.prompt)
{ url: result[:url], type: :url }
end
def generate_with_stability(image)
service = StabilityAiService.new
result = service.generate(image.prompt)
{ data: Base64.decode64(result[:base64]), type: :binary }
end
def attach_image(image, result)
case result[:type]
when :url
file = URI.open(result[:url]) # DALL-E URLs are temporary — download promptly
image.image.attach(io: file, filename: "generated_#{image.id}.png", content_type: "image/png")
when :binary
image.image.attach(
io: StringIO.new(result[:data]),
filename: "generated_#{image.id}.png",
content_type: "image/png"
)
end
end
def broadcast_update(image)
Turbo::StreamsChannel.broadcast_replace_to(
"user_#{image.user_id}_images",
target: image, # derives the same dom_id the partial uses
partial: "generated_images/image",
locals: { image: image }
)
end
end
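One thing the job quietly assumes: something is consuming the ai_tasks queue. The default async adapter runs it automatically, but a production backend needs to be told about the queue. Assuming Sidekiq, a minimal sketch:

```yaml
# config/sidekiq.yml — queues Sidekiq will poll
:queues:
  - default
  - ai_tasks
```

Then point Active Job at it with `config.active_job.queue_adapter = :sidekiq` in config/application.rb.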
The Controller
# app/controllers/generated_images_controller.rb
class GeneratedImagesController < ApplicationController
before_action :authenticate_user!
def index
@images = current_user.generated_images.order(created_at: :desc)
@image = GeneratedImage.new
end
def create
@image = current_user.generated_images.build(image_params)
@image.status = :pending
if @image.save
GenerateImageJob.perform_later(@image.id)
respond_to do |format|
format.turbo_stream
format.html { redirect_to generated_images_path }
end
else
@images = current_user.generated_images.order(created_at: :desc) # the index view needs this
render :index, status: :unprocessable_entity
end
end
private
def image_params
params.require(:generated_image).permit(:prompt, :provider)
end
end
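The controller assumes two routes that aren't shown above. A minimal sketch for config/routes.rb:

```ruby
# config/routes.rb — only the routes this feature needs
resources :generated_images, only: %i[index create]
```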
Real-Time Updates with Turbo
The create response streams the pending card immediately:
<%# app/views/generated_images/create.turbo_stream.erb %>
<%= turbo_stream.prepend "images" do %>
<%= render "generated_images/image", image: @image %>
<% end %>
<%# assumes the form partial's wrapper tag has id="new_image_form" %>
<%= turbo_stream.replace "new_image_form" do %>
<%= render "generated_images/form", image: GeneratedImage.new %>
<% end %>
The image partial handles all states:
<%# app/views/generated_images/_image.html.erb %>
<div id="<%= dom_id(image) %>" class="image-card">
<p class="prompt"><%= image.prompt %></p>
<% case image.status %>
<% when "pending", "generating" %>
<div class="spinner">Generating your image...</div>
<% when "completed" %>
<%= image_tag image.image, class: "generated" if image.image.attached? %>
<% when "failed" %>
<div class="error">Generation failed. Try again.</div>
<% end %>
</div>
Subscribe to updates on the index page:
<%# app/views/generated_images/index.html.erb %>
<%= turbo_stream_from "user_#{current_user.id}_images" %>
<%= render "form", image: @image %>
<div id="images">
<%= render @images %>
</div>
What's Happening
The flow: user submits a prompt → controller saves a pending record → Turbo Stream immediately shows a spinner → Active Job picks up the work → the API generates the image → the job attaches it and broadcasts a Turbo Stream replacement → the spinner becomes the finished image. No page refresh. No polling.
Both providers work the same way from the user's perspective. DALL-E returns a URL you download. Stability AI returns base64 you decode. The job handles both.
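To make the base64 branch concrete without hitting any API, here's the decode-and-wrap step in isolation, using only the Ruby standard library and a fabricated payload:

```ruby
require "base64"
require "stringio"

# Stand-in for Stability AI's JSON response: the 8-byte PNG file
# signature, base64-encoded (fabricated purely for this demo).
fake_payload = Base64.strict_encode64("\x89PNG\r\n\x1A\n".b)

# The job's :binary branch in miniature: decode, wrap in an IO object —
# that IO is exactly what Active Storage's attach(io:, ...) consumes.
io = StringIO.new(Base64.decode64(fake_payload))
signature = io.read(4) # first four bytes of the PNG signature
```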
Next up: voice and transcription. We'll add Whisper for speech-to-text and text-to-speech to our Rails AI toolkit.