How I Fixed Transparent Video Alpha in Playwright Using 1970s Film Math
I've been building a content engine that renders HTML/CSS graphics to transparent video overlays — the kind you drop on a timeline in CapCut or Premiere with a drop shadow, gradient text, and a glow. Clean, professional, automated.
It worked great in the browser. It worked fine as a WebM preview. And then the moment I imported it into any video editor, every single shadow and glow rendered as a solid colored blob.
Days of debugging later, I found the fix — and it's a technique from 1970s film compositing.
The Problem
The pipeline was straightforward:
- Generate HTML/CSS graphics with Playwright (headless Chrome)
- Screenshot each frame with omitBackground: true to get transparent PNGs
- Stitch frames into WebM or ProRes 4444 with FFmpeg
- Import into CapCut and composite over video
The shadows looked perfect in the browser. Correct in the WebM preview in Chrome. But inside CapCut — or any other NLE (non-linear editor) — they showed up as hard-edged solid blobs. A filter: drop-shadow(0 0 20px rgba(123,97,255,0.7)) that should be a soft purple halo was rendering as a solid purple rectangle.
What's Actually Happening
Premultiplied vs. Straight Alpha
There are two ways to store RGBA pixel data.
Straight alpha — RGB and alpha are independent:
pixel = (R, G, B, A)
Premultiplied (associated) alpha — RGB is already multiplied by alpha:
pixel = (R×A, G×A, B×A, A)
Chrome composites internally using premultiplied alpha. When you call page.screenshot({ omitBackground: true }), it has to convert from premultiplied back to straight alpha for the PNG export. That conversion is lossy at very low alpha values. The soft glow edge pixels at 2%, 5%, 8% opacity either round to zero or get the wrong color.
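To see how little precision survives, here's a quick sketch (illustrative, not the pipeline code) of an 8-bit premultiply/un-premultiply round trip on a low-opacity glow pixel:

```javascript
// Simulate Chrome's internal premultiplied storage followed by the
// straight-alpha PNG export, both quantized to 8 bits per channel.
function roundTrip(rgb, alpha255) {
  const a = alpha255 / 255;
  // Premultiply and quantize: this is where low-alpha precision dies.
  const premult = rgb.map((c) => Math.round(c * a));
  // Un-premultiply back to straight alpha for the PNG.
  return premult.map((c) =>
    alpha255 === 0 ? 0 : Math.min(255, Math.round(c / a))
  );
}

// An rgba(123, 97, 255) glow-edge pixel at ~2% opacity (alpha = 5/255):
console.log(roundTrip([123, 97, 255], 5)); // [ 102, 102, 255 ] (hue shifted)
```

The purple (123, 97, 255) comes back as (102, 102, 255): at 2% opacity there are only a handful of representable premultiplied values, so distinct source colors collapse together.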
ProRes 4444 is specced for premultiplied alpha — that's what NLEs expect. So you're encoding already-corrupted data into a format that assumes clean data. The NLE reads it, assumes premultiplied, and composites incorrectly.
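The blob itself falls out of the Porter-Duff over operator. A single-channel sketch (illustrative values) of what happens when a compositor assumes premultiplied data but receives straight alpha:

```javascript
// "over" compositing for one channel: foreground fg, alpha a, background bg.
const overStraight = (fg, a, bg) => fg * a + bg * (1 - a); // straight alpha
const overPremult = (fg, a, bg) => fg + bg * (1 - a); // fg assumed premultiplied

// A 2%-opacity glow pixel, red channel 123, composited over black:
console.log(overStraight(123, 0.02, 0)); // ~2.5: a faint haze, as intended
console.log(overPremult(123, 0.02, 0)); // 123: full-strength color, a solid blob
```

The premultiplied form skips the fg × a step because it expects the data to already carry it; feed it straight alpha and every soft edge pixel contributes its full color.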
The background-clip: text Problem
Separately, background-clip: text + -webkit-text-fill-color: transparent has a rendering bug in headless Playwright with omitBackground: true. Instead of gradient-colored text on a transparent background, Chrome renders a solid gradient rectangle with the letter shapes punched out as transparent holes — the exact inverse of what you want.
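For reference, this is the standard gradient-text pattern that triggers it (class name illustrative):

```css
/* Gradient-filled text: renders inverted under headless omitBackground */
.headline {
  background: linear-gradient(90deg, #7b61ff, #ff61d2);
  -webkit-background-clip: text;
  background-clip: text;
  -webkit-text-fill-color: transparent;
}
```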
Approaches That Don't Work
Banning the CSS — just ban filter:, rgba(), and text-shadow so every pixel is either fully opaque or fully transparent. This keeps the alpha clean but leaves the graphics flat and lifeless. Not usable.
Fixing the FFmpeg encode:
ffmpeg -i input_%06d.png \
-vf "premultiply=inplace=1" \
-c:v prores_ks -profile:v 4444 \
-pix_fmt yuva444p10le -alpha_bits 16 \
output.mov
This helps at the codec level. But the source PNGs from omitBackground: true are already broken before FFmpeg touches them. You're premultiplying corrupted data.
The Fix: Background Subtraction
The insight: you don't need the browser to give you alpha at all.
Render each frame twice — once on solid black, once on solid white. The browser renders both correctly because it's just rendering a webpage on a known background. No alpha pipeline involved.
Then derive the correct RGBA for every pixel using the Porter-Duff equations.
The Math
Given a foreground pixel with color fg and alpha α:
On black (0,0,0):
rendered_black = fg × α
On white (255,255,255):
rendered_white = fg × α + 255 × (1-α)
Subtract:
white - black = 255 × (1-α)
α = 1 - (white - black) / 255
α (0–255) = 255 - (white - black)
fg = black / α
This is exact for every pixel at every opacity level — including the 2% opacity glow edges that omitBackground was mangling. This technique is called background subtraction or difference matting. It's been in film VFX since the 1970s.
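A quick floating-point check of the derivation (illustrative values; the real pipeline works on quantized 8-bit renders, so recovered values land within a code value or so):

```javascript
// Recover fg and alpha from the two simulated renders, per the equations above.
const fg = 123;
const alpha = 0.02; // the opacity level omitBackground was mangling

const onBlack = fg * alpha; // rendered over black
const onWhite = fg * alpha + 255 * (1 - alpha); // rendered over white

const recoveredAlpha = 1 - (onWhite - onBlack) / 255;
const recoveredFg = onBlack / recoveredAlpha;

console.log(recoveredAlpha.toFixed(4), recoveredFg.toFixed(1)); // 0.0200 123.0
```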
Implementation
import sharp from 'sharp';
async function extractAlpha(blackBuf, whiteBuf) {
const [b, w] = await Promise.all([
sharp(blackBuf).removeAlpha().raw().toBuffer({ resolveWithObject: true }),
sharp(whiteBuf).removeAlpha().raw().toBuffer({ resolveWithObject: true }),
]);
const { data: bD, info } = b;
const { data: wD } = w;
const N = info.width * info.height;
const out = Buffer.alloc(N * 4);
for (let i = 0; i < N; i++) {
const s = i * 3, d = i * 4;
const bR = bD[s], bG = bD[s+1], bB = bD[s+2];
const wR = wD[s], wG = wD[s+1], wB = wD[s+2];
// Per-channel alpha — take max for best color recovery
const alpha = Math.max(0, Math.min(255, Math.max(
255 - (wR - bR),
255 - (wG - bG),
255 - (wB - bB)
)));
if (alpha < 4) {
out[d] = out[d+1] = out[d+2] = out[d+3] = 0;
} else {
const a = alpha / 255;
out[d] = Math.min(255, Math.round(bR / a));
out[d+1] = Math.min(255, Math.round(bG / a));
out[d+2] = Math.min(255, Math.round(bB / a));
out[d+3] = alpha;
}
}
return sharp(out, { raw: { width: info.width, height: info.height, channels: 4 } })
.png().toBuffer();
}
In the Capture Loop
// OLD
await page.screenshot({ path: framePath, type: 'png', omitBackground: true });
// NEW
await page.evaluate(() => {
document.documentElement.style.background = '#000000';
document.body.style.background = '#000000';
});
const blackBuf = await page.screenshot({ type: 'png' });
await page.evaluate(() => {
document.documentElement.style.background = '#ffffff';
document.body.style.background = '#ffffff';
});
const whiteBuf = await page.screenshot({ type: 'png' });
const rgbaBuf = await extractAlpha(blackBuf, whiteBuf);
await fs.writeFile(framePath, rgbaBuf); // fs from 'node:fs/promises'
FFmpeg Encode
await execFileAsync('ffmpeg', [
'-y', '-framerate', '30', '-i', 'frame%06d.png',
'-vf', 'premultiply=inplace=1',
'-c:v', 'prores_ks', '-profile:v', '4444',
'-pix_fmt', 'yuva444p10le', '-alpha_bits', '16',
'output.mov',
]);
One Remaining Gotcha
Don't put overflow: hidden on a parent that directly wraps a filter: element. Chrome's filter compositing layer gets clipped by the overflow boundary, producing a visible white rectangle around your element.
/* WRONG */
.row { overflow: hidden; }
.word { filter: drop-shadow(...); }
/* CORRECT — let body/html handle the viewport clip */
.row { /* overflow: visible by default */ }
body { overflow: hidden; }
Result
filter: drop-shadow, text-shadow, background-clip: text, rgba() at any opacity — all render cleanly as transparent video with correct alpha in CapCut, Premiere, and DaVinci.
The tradeoff is 2× the frame captures. For a 6-second clip at 30fps that's 360 screenshots instead of 180. About 90 extra seconds of render time. Acceptable for production output.
The browser was always rendering correctly. The only problem was how we were asking for the pixel data.
The fix isn't in the encoding. It's in the capture.