I Built a YouTube Sleep Channel That Runs Itself — Here's the Entire Stack
Most YouTube automation advice is vague. "Post consistently." "Find your niche." "Use AI tools."
This is not that post.
This is the exact stack — code, scripts, launchd daemons, and Python libraries — that my AI agent uses to produce 10-hour sleep videos, convert them to MP3 audio, and stage them for YouTube upload every single night.
No camera. No microphone. No editing software. Zero manual steps after the initial setup.
Why Sleep Content
Sleep content is the perfect automation target for three reasons:
Length is an asset, not a liability. 8-10 hour videos generate 8-10 hours of watch time per view. YouTube's RPM (Revenue Per Mille) for sleep content runs $8-12 — 3x the average channel.
Viewers don't actually watch. They queue it and fall asleep. Retention doesn't crater after the first 30 seconds the way it does for tutorial content. Flat, steady retention curves are fine here.
Production is entirely algorithmic. You don't need footage. You need noise, frequency, and duration.
The Audio Pipeline
The foundation is a Python script that generates all audio programmatically using numpy. No samples, no licensed audio, no API calls. Just math.
import numpy as np

SAMPLE_RATE = 44100  # 44.1 kHz; the script's actual constant isn't shown, this is the usual default

def generate_brown_noise(duration_sec: float, amplitude: float = 0.3) -> np.ndarray:
    """Brown noise: integrated white noise, normalized to the target amplitude."""
    samples = int(duration_sec * SAMPLE_RATE)
    white = np.random.randn(samples)
    brown = np.cumsum(white)  # integrating white noise yields a 1/f^2 spectrum
    brown = brown / np.max(np.abs(brown)) * amplitude
    return brown.astype(np.float32)
def generate_binaural(duration_sec: float, target_freq: float = 6.0,
                      carrier_freq: float = 200.0, amplitude: float = 0.25) -> np.ndarray:
    """Stereo sine pair whose left/right frequencies differ by target_freq Hz."""
    samples = int(duration_sec * SAMPLE_RATE)
    t = np.linspace(0, duration_sec, samples, dtype=np.float32)
    left = np.sin(2 * np.pi * carrier_freq * t) * amplitude
    right = np.sin(2 * np.pi * (carrier_freq + target_freq) * t) * amplitude
    return np.column_stack([left, right])  # shape (samples, 2)
Brown noise approximates the texture of distant rain, ocean waves, or a running river. Binaural beats — where the left and right channels differ by a target frequency — entrain the brain toward specific states. Delta (1-4 Hz) for deep sleep. Theta (4-8 Hz) for meditation and lucid dreaming.
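Getting those float arrays onto disk is one detail the snippets above skip. Here is a minimal sketch using only the standard-library wave module; the production script may well use scipy.io.wavfile or soundfile instead, and write_wav is a hypothetical helper name:

```python
import wave

import numpy as np

def write_wav(path: str, audio: np.ndarray, sample_rate: int = 44100) -> None:
    """Write a float array in [-1, 1] (mono (N,) or stereo (N, 2)) as 16-bit PCM."""
    if audio.ndim == 1:
        audio = audio[:, None]                      # treat mono as a single-column array
    pcm = (np.clip(audio, -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(pcm.shape[1])
        wf.setsampwidth(2)                          # 2 bytes = 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(pcm.tobytes())
```

The clip-then-scale step matters: stacked layers can sum past 1.0, and without the clip the int16 cast would wrap around audibly.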
The agent can generate any combination via a recipe system:
RECIPES = {
    "ocean-theta": {
        "description": "Ocean waves + theta binaural beats (lucid dreaming)",
        "layers": [
            {"type": "brown", "amplitude": 0.2},
            {"type": "binaural", "freq": 6.0, "amplitude": 0.15},
        ],
    },
    "thunderstorm": {
        "description": "Thunderstorm simulation — heavy pink/white noise surge + delta",
        "layers": [
            {"type": "pink", "amplitude": 0.28},
            {"type": "white", "amplitude": 0.12},
            {"type": "brown", "amplitude": 0.18},
            {"type": "binaural", "freq": 2.5, "amplitude": 0.10},
        ],
    },
    # ... 12 more recipes
}
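Those layer dicts can be dispatched to the generators with a small lookup table. A self-contained sketch, with the brown and binaural generators reproduced from above; render_recipe and GENERATORS are hypothetical names, and the pink/white generators are omitted:

```python
import numpy as np

SAMPLE_RATE = 44100  # assumed; the script's actual constant isn't shown

def generate_brown_noise(duration_sec, amplitude=0.3):
    samples = int(duration_sec * SAMPLE_RATE)
    brown = np.cumsum(np.random.randn(samples))
    return (brown / np.max(np.abs(brown)) * amplitude).astype(np.float32)

def generate_binaural(duration_sec, target_freq=6.0, carrier_freq=200.0, amplitude=0.25):
    samples = int(duration_sec * SAMPLE_RATE)
    t = np.linspace(0, duration_sec, samples, dtype=np.float32)
    left = np.sin(2 * np.pi * carrier_freq * t) * amplitude
    right = np.sin(2 * np.pi * (carrier_freq + target_freq) * t) * amplitude
    return np.column_stack([left, right])

# Layer "type" -> generator call. Pink/white generators exist in the real
# script but aren't reproduced here.
GENERATORS = {
    "brown": lambda layer, dur: generate_brown_noise(dur, layer["amplitude"]),
    "binaural": lambda layer, dur: generate_binaural(
        dur, target_freq=layer["freq"], amplitude=layer["amplitude"]),
}

def render_recipe(recipe: dict, duration_sec: float) -> np.ndarray:
    """Render every layer in a recipe and sum them into one stereo array."""
    samples = int(duration_sec * SAMPLE_RATE)
    mix = np.zeros((samples, 2), dtype=np.float32)
    for layer in recipe["layers"]:
        audio = GENERATORS[layer["type"]](layer, duration_sec)
        if audio.ndim == 1:  # mono layers go to both channels
            audio = np.column_stack([audio, audio])
        mix += audio
    return np.clip(mix, -1.0, 1.0)  # keep stacked layers inside [-1, 1]
```

The dispatch-table shape is the point: adding a new layer type is one generator function plus one dict entry, which is what lets an agent compose new recipes without touching the rendering loop.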
A 1-hour WAV file generates in ~3 seconds. Then ffmpeg converts it to MP3:
import os
import subprocess

def convert_to_mp3(wav_path: str, mp3_path: str) -> None:
    """Encode the WAV as VBR MP3 (-q:a 2, roughly 190 kbps), then delete the WAV."""
    subprocess.run([
        "/opt/homebrew/bin/ffmpeg", "-y", "-i", wav_path,
        "-acodec", "libmp3lame", "-q:a", "2", mp3_path,
    ], check=True)  # check=True raises on ffmpeg failure, so the WAV isn't deleted on error
    os.unlink(wav_path)
The Video Pipeline
The video is a looping visual — a slow nature scene, a starfield, a fireplace — paired with the audio. The base render is 1 hour. Then ffmpeg loops it 10x to create the full video:
ffmpeg -stream_loop 9 -i "base-1hr.mp4" \
    -c copy "final-10hr.mp4" -y
-stream_loop 9 plays the input ten times (the original pass plus nine repeats), and -c copy means no re-encoding: the compressed stream is simply concatenated. A 10-hour video generates in under 10 seconds on an M-series Mac, and the final file is approximately 1.7 GB.
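Before the looping step, the generated MP3 has to be muxed under the base visual. One plausible invocation; the filenames and the explicit -map flags are illustrative, not taken from the production script:

```shell
# Video from input 0, audio from input 1; copy the video stream untouched,
# encode the audio to AAC, and stop at whichever stream ends first.
ffmpeg -y -i "base-1hr.mp4" -i "ocean-theta-1hr.mp3" \
    -map 0:v -map 1:a -c:v copy -c:a aac -shortest \
    "base-1hr-with-audio.mp4"
```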
The visual layer is produced separately with Remotion, a React-based video framework. The agent renders the 1-hour base composition once; ffmpeg handles the looping for the 10-hour extension.
The Automation Schedule
A launchd daemon on macOS fires the full pipeline every night at 3am:
<key>StartCalendarInterval</key>
<dict>
    <key>Hour</key>
    <integer>3</integer>
    <key>Minute</key>
    <integer>0</integer>
</dict>
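For context, that fragment sits inside an ordinary launchd job definition. A minimal sketch; the label, shell, and script path are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.sleep-pipeline</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/zsh</string>
        <string>/path/to/nightly-pipeline.sh</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Hour</key>
        <integer>3</integer>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
    <key>StandardOutPath</key>
    <string>/tmp/sleep-pipeline.log</string>
</dict>
</plist>
```

Saved under ~/Library/LaunchAgents/ and loaded once with launchctl, it fires nightly with no cron dependency.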
The shell script:
- Picks a scene and title based on day of week + week number (so output rotates through 28 variations before repeating)
- Generates the audio for that scene's recipe
- Muxes audio into the base video
- Loops to 10 hours
- Uploads to YouTube via the Data API v3
That last step requires OAuth credentials stored in a JSON file. One-time setup, then it's autonomous.
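The rotation logic in the first step can be as simple as indexing into a 28-slot table with the weekday and a four-week cycle. A sketch; SCENES and pick_scene are hypothetical names:

```python
import datetime

# 28 placeholder slots: 7 weekdays x a 4-week cycle
SCENES = [f"scene-{i:02d}" for i in range(28)]

def pick_scene(date: datetime.date) -> str:
    """Map a date to one of 28 slots so output repeats only every 4 weeks."""
    week = date.isocalendar()[1]            # ISO week number
    slot = (week % 4) * 7 + date.weekday()  # weekday() is 0 (Mon) .. 6 (Sun)
    return SCENES[slot]
```

Over any 28-day run that doesn't cross an ISO year boundary, every slot comes up exactly once, which is the "28 variations before repeating" behavior.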
Sleep Stories
Beyond ambient audio, the channel also produces narrated sleep stories — scripted, ~15-minute first-person guided narratives designed for maximum sleep induction. These are narrated with a TTS voice (Mistral's Voxtral API) and layered over the ambient audio bed.
The stories follow a consistent format: grounding language, second-person perspective, progressive physical heaviness cues, and a 5-second pause between major sections. Scripts are written as plain .txt files with inline [pause Xs] and [SFX:name] markers that the narration pipeline interprets.
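Interpreting those inline markers takes only a few lines of regex. A sketch of how the narration pipeline might split a script into speech, pause, and sound-effect segments; parse_script is a hypothetical name:

```python
import re

# Matches [pause 5s] / [pause 2.5s] and [SFX:thunder] style markers
MARKER_RE = re.compile(r"\[(?:pause\s+(\d+(?:\.\d+)?)s|SFX:(\w+))\]")

def parse_script(text: str) -> list:
    """Split a script into ("speech", str), ("pause", float), ("sfx", str) tuples."""
    segments = []
    pos = 0
    for m in MARKER_RE.finditer(text):
        if m.start() > pos:
            segments.append(("speech", text[pos:m.start()].strip()))
        if m.group(1) is not None:
            segments.append(("pause", float(m.group(1))))
        else:
            segments.append(("sfx", m.group(2)))
        pos = m.end()
    if pos < len(text):
        segments.append(("speech", text[pos:].strip()))
    return [s for s in segments if s != ("speech", "")]
```

Each speech segment goes to the TTS call, pauses become silence in the audio bed, and SFX names key into a small sample library.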
Revenue Model
At ~1,000 subscribers with normal sleep content watch time patterns, the math looks like this:
- Average 8hr watch time per view
- ~20 views/day early on
- 160 watch hours/day
- ~4,800 watch hours/month
- Monetized after 1,000 subs + 4,000 watch hours
- At a $10 RPM, revenue tracks views: once the channel scales to ~10,000 monthly views, that's a $100/month floor
The ceiling is significantly higher. Channels in this space with 50K+ subs report $3-8K/month. All generated by automation running while the owner sleeps — which is fitting.
What I'd Do Differently
Start with stories, not ambient. Ambient channels compete on view count. Story channels compete on engagement and subscriber loyalty. Stories convert viewers to subscribers faster.
Don't skip the title research. My best-performing titles are hyper-specific: "10 Hours of NYC Rain on a Window at 3am" outperforms "Rainy Night Sounds" by a wide margin. Specificity creates expectation and delivers on it.
Batch the backlog immediately. Before posting anything, generate 30 days of content. YouTube rewards consistent uploading with algorithmic distribution. If your automation fails for three days, you want buffer.
The full pipeline — audio generator, video builder, upload script, scheduler — is available at whoffagents.com for builders who want to run their own instance.
This system is run autonomously by Atlas, my AI agent. I reviewed this article and everything in it reflects actual production code.