How We Solved iOS Audio Analysis for Breathwork Biofeedback


Real-time breath detection sounds straightforward until you try to ship it on iOS. Our app, shii·haa, guides users through breathwork and provides live biofeedback — it needs to hear you breathe and respond, frame by frame. In the browser, this is about twenty lines of Web Audio API code. On iOS inside a Capacitor app, it turned into a three-month rabbit hole that ended with us writing a native Swift plugin and open-sourcing it.

This is the story of that journey.

Plugin links:


The Challenge

shii·haa is a breathwork and biofeedback app built with Ionic, Capacitor, and Vue 3, running as a progressive web app, an iOS app, and an Android app from a single codebase. One of its core features is real-time audio analysis: the app listens to the user's breathing through the microphone, computes the energy envelope of the signal, and uses that to detect inhale and exhale phases.

On the web, this works beautifully. The Web Audio API's AnalyserNode gives you a frequency-domain or time-domain buffer every animation frame. You compute the RMS (root mean square) of the time-domain signal, apply a short rolling average, and you have a breath curve. Clinical-grade signal processing in a weekend.
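The envelope step described above can be sketched as a small pure function (the function name is ours; in the browser you would fill the buffer with `analyser.getByteTimeDomainData(data)` each animation frame):

```typescript
// RMS of an AnalyserNode-style unsigned 8-bit time-domain buffer.
// 128 is the zero/silence level; samples are mapped to [-1, 1] first.
function rmsFromByteTimeDomain(data: Uint8Array): number {
  let sumOfSquares = 0;
  for (let i = 0; i < data.length; i++) {
    const centered = (data[i] - 128) / 128;
    sumOfSquares += centered * centered;
  }
  return Math.sqrt(sumOfSquares / data.length);
}
```

Feed this one value per frame into a short rolling average and you have the breath curve.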

Then we packaged the app for iOS. That's when things stopped working.


The Bug

Inside a Capacitor iOS app, the web content runs in a WKWebView. Apple's WKWebView has supported getUserMedia — the API that gives you access to the microphone — since iOS 14.5. So microphone access works. The stream arrives. The AudioContext is created without complaint. No errors are thrown.

But the AnalyserNode returns garbage.

Specifically: getByteTimeDomainData() fills its array entirely with 128 — the silence value in an unsigned 8-bit PCM encoding. getByteFrequencyData() returns all zeros. No matter how loudly you breathe into the phone, the data never changes.

This is a known WebKit bug. The audio stream from getUserMedia inside WKWebView is not actually routed into the Web Audio graph the way it is in Safari. The MediaStreamAudioSourceNode receives the stream and no exception is raised, but the audio samples delivered to the AnalyserNode are silent placeholders. The signal is there at the OS level — iOS is capturing your microphone — but the bridge between the native audio session and the WebView's audio rendering process is broken for analysis purposes.
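A quick way to confirm you are hitting this symptom (helper name is ours; in practice you would fill `data` via `analyser.getByteTimeDomainData(data)` over several frames):

```typescript
// Returns true when an AnalyserNode time-domain buffer is pinned at the
// unsigned 8-bit silence value (128) — the WKWebView symptom described above.
// A real microphone signal always jitters at least a little around 128.
function looksLikePinnedSilence(data: Uint8Array): boolean {
  return data.length > 0 && data.every((sample) => sample === 128);
}
```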

Related reports:

The fundamental problem is architectural. WKWebView runs the web content in a separate process (WebKit's WebContent process) for security and stability. When you call getUserMedia, the microphone permission is granted to the host app process, not the renderer. Audio gets bridged across the process boundary, but in a way that works for playback and WebRTC — not for AnalyserNode tap-style analysis. The audio just doesn't arrive at the right place.


What We Tried (and Failed)

Adjusting getUserMedia constraints

Our first hypothesis was that the audio processing pipeline needed specific hints. We tried every combination of constraints imaginable:

const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
    sampleRate: 44100,
    channelCount: 1,
  }
});

Result: same silent data. The constraints affect what the OS attempts to configure, but the silent-sample bug happens downstream of that.

ScriptProcessorNode

ScriptProcessorNode is deprecated but still works in most browsers. It fires an onaudioprocess callback every buffer, letting you inspect raw samples in JavaScript. We wired it in as an alternative to AnalyserNode:

const processor = audioContext.createScriptProcessor(2048, 1, 1);
processor.onaudioprocess = (e) => {
  const inputData = e.inputBuffer.getChannelData(0);
  // inputData was all zeros on iOS WKWebView
};

Same problem. The audio samples reaching the callback were flat. This confirmed the issue isn't specific to AnalyserNode — the entire getUserMedia-to-WebAudio pipeline is broken for analysis in WKWebView.

cordova-plugin-audioinput

This Cordova plugin routes audio through a native capture path and delivers PCM data to JavaScript via Cordova events. Some community members reported it working as a workaround. We tried bridging it into our Capacitor project.

It partially worked — we could receive audio data — but the integration was fragile. The plugin's timing model didn't align cleanly with our animation-frame-based rendering loop, latency was unpredictable, and the plugin is effectively unmaintained. More critically, we couldn't control the capture format cleanly enough for our RMS computation to be reliable.

GainNode amplification

One forum suggestion was that the signal was present but too quiet. We added a GainNode with a gain of 50:

const gainNode = audioContext.createGain();
gainNode.gain.value = 50;
source.connect(gainNode);
gainNode.connect(analyser);

No change. When the input buffer is all zeros, amplifying it by 50 still gives you zeros.

We were stuck.


The Solution: A Native Capacitor Plugin with AVAudioEngine

The insight we needed: the microphone is working in the iOS app. The OS is happily capturing audio. The problem is purely about getting that data into JavaScript. We didn't need the Web Audio API at all — we needed to bypass WKWebView's broken bridge entirely.

The solution was to write a native Capacitor plugin in Swift that:

  1. Opens the microphone using AVAudioEngine — Apple's native, fully-featured audio graph framework
  2. Installs a tap on the input node that fires on every audio buffer
  3. Computes the RMS of each buffer in Swift (fast, low-overhead)
  4. Emits the result back to JavaScript via Capacitor's event system

The architecture looks like this:

[iOS Microphone]
       ↓
[AVAudioEngine InputNode]
       ↓  (installTap callback, runs on audio thread)
[RMS computation in Swift]
       ↓
[Capacitor notifyListeners()]
       ↓
[JavaScript event handler]
       ↓
[Breath detection logic / UI update]

No getUserMedia. No AnalyserNode. No broken bridge. The audio never touches the WebView's audio graph.

The Swift core

private var engine: AVAudioEngine?

@objc func startListening(_ call: CAPPluginCall) {
    let engine = AVAudioEngine()
    self.engine = engine  // retain the engine; a local-only instance would be deallocated when this method returns
    let inputNode = engine.inputNode
    let format = inputNode.outputFormat(forBus: 0)

    inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] buffer, _ in
        guard let channelData = buffer.floatChannelData?[0] else { return }
        let frameCount = Int(buffer.frameLength)

        // Compute RMS
        var sum: Float = 0
        for i in 0..<frameCount {
            let sample = channelData[i]
            sum += sample * sample
        }
        let rms = sqrt(sum / Float(frameCount))

        self?.notifyListeners("audioLevel", data: ["rms": rms])
    }

    do {
        try engine.start()
        call.resolve()
    } catch {
        call.reject("Failed to start audio engine: \(error.localizedDescription)")
    }
}

AVAudioEngine handles all the session management, permission checking, and hardware configuration. The installTap callback runs on Apple's internal audio thread and fires every ~23ms at 44.1 kHz with a buffer size of 1024 frames — low enough latency for real-time biofeedback.
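The ~23 ms figure is just the buffer arithmetic (note that installTap's bufferSize is a request — the system may deliver a different size, so treat this as the nominal cadence):

```typescript
// Nominal tap callback cadence: buffer duration = frames per buffer / sample rate.
const framesPerBuffer = 1024;
const sampleRate = 44100;
const bufferMs = (framesPerBuffer / sampleRate) * 1000; // ≈ 23.2 ms per callback
```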

The TypeScript API

Install the plugin:

npm install @shiihaa/capacitor-audio-analysis
npx cap sync

Then use it:

import { AudioAnalysis } from '@shiihaa/capacitor-audio-analysis';

// Request microphone permission
await AudioAnalysis.requestPermission();

// Listen for audio level events
await AudioAnalysis.addListener('audioLevel', (data: { rms: number }) => {
  const level = data.rms; // 0.0 (silence) to ~0.5 (loud breathing)
  updateBreathCurve(level);
});

// Start capture
await AudioAnalysis.startListening();

// Stop when done
await AudioAnalysis.stopListening();
await AudioAnalysis.removeAllListeners();

The rms value is a floating-point number between 0.0 (complete silence) and roughly 0.3–0.5 for a loud inhale. We apply a short rolling window average in JavaScript and threshold-detect peaks for inhale/exhale phase segmentation.
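The JS-side smoothing and thresholding can be sketched like this (the class name, window size, and threshold are illustrative defaults, not the app's actual tuning):

```typescript
// Rolling-mean smoothing of incoming RMS events, plus the amplitude
// threshold used for phase segmentation.
class BreathEnvelope {
  private window: number[] = [];

  constructor(
    private windowSize = 10,   // ~300 ms of events at ~33 events/s (assumption)
    private threshold = 0.05,  // user-calibratable amplitude threshold
  ) {}

  // Feed one RMS sample; returns the smoothed level and whether it
  // exceeds the threshold (i.e. an audible breath phase is under way).
  update(rms: number): { smoothed: number; active: boolean } {
    this.window.push(rms);
    if (this.window.length > this.windowSize) this.window.shift();
    const smoothed =
      this.window.reduce((a, b) => a + b, 0) / this.window.length;
    return { smoothed, active: smoothed > this.threshold };
  }
}
```

Each `audioLevel` event feeds `update()`, and transitions of `active` mark candidate inhale/exhale boundaries.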

On Android and web, the plugin gracefully falls back to the standard getUserMedia + AnalyserNode path — those platforms don't have the WKWebView limitation, so the native layer isn't needed.
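The split amounts to a simple platform switch. This helper only illustrates the rule — the real plugin handles the fallback internally:

```typescript
// Illustrative only: which capture path a cross-platform app ends up on.
// Only iOS needs the native AVAudioEngine path; Android and web can use
// the standard getUserMedia + AnalyserNode pipeline.
type AudioPath = 'native-avaudioengine' | 'webaudio-analyser';

function selectAudioPath(platform: 'ios' | 'android' | 'web'): AudioPath {
  return platform === 'ios' ? 'native-avaudioengine' : 'webaudio-analyser';
}
```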


The Result

After integrating the plugin, real-time breath detection on iOS worked immediately. The RMS signal from AVAudioEngine is clean, low-latency, and reliable across device generations from iPhone 8 to iPhone 16 Pro.

But we didn't stop at audio. shii·haa uses a three-signal approach for breath phase detection:

Signal 1 — Microphone energy (via this plugin): The RMS envelope of breathing sounds, filtered with a 300ms rolling average to smooth out noise transients.

Signal 2 — Heart rate variability via BLE: We connect to Garmin, Polar, and Wahoo chest straps over Bluetooth Low Energy and stream real-time R-R intervals. During inhalation, heart rate naturally accelerates; during exhalation, it decelerates. This is Respiratory Sinus Arrhythmia (RSA) — a well-characterized physiological coupling that can serve as an independent breath-phase signal, confirmed in published cardiorespiratory research.
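The RSA coupling gives a very simple directional cue: shrinking R-R intervals mean the heart is accelerating (inhalation), growing intervals mean deceleration (exhalation). A hedged sketch — a real implementation would smooth and debounce the interval series, and the function name is ours:

```typescript
// Infer breath phase from the trend of the two most recent R-R intervals (ms).
// Shorter interval = heart rate accelerating = inhale; longer = exhale.
function rsaPhase(rrIntervalsMs: number[]): 'inhale' | 'exhale' | 'unknown' {
  if (rrIntervalsMs.length < 2) return 'unknown';
  const last = rrIntervalsMs[rrIntervalsMs.length - 1];
  const prev = rrIntervalsMs[rrIntervalsMs.length - 2];
  if (last < prev) return 'inhale';
  if (last > prev) return 'exhale';
  return 'unknown';
}
```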

Signal 3 — Threshold analysis: A configurable amplitude threshold that the user can calibrate to their breathing pattern and environment.

When all three signals agree on a phase transition, the biofeedback is accurate enough for clinical-grade guidance. This matters because shii·haa is designed by Dr. Felix Zeller — an intensive care physician and clinical psychologist — not just as a wellness app, but as a precision tool.

The biofeedback loop is now live: breathe in, watch the indicator climb, reach the target zone, shift to exhale. The app guides users through resonance frequency breathing (around 5–6 breath cycles per minute), the rhythm that maximally amplifies RSA and activates the parasympathetic nervous system. All of this running on a phone, with a chest strap, with sub-100ms feedback latency.


Open Source

We've open-sourced the plugin because this is a problem every Capacitor developer building audio analysis features on iOS will hit, and the existing workarounds are insufficient.

The plugin is MIT-licensed. Get it here:

Contributions welcome — especially:

  • Android native path: Currently falls back to Web Audio on Android, but a native AudioRecord-based path could improve latency
  • Additional metrics: Zero-crossing rate, spectral centroid, or other features for more sophisticated breath classification
  • Background audio support: Keeping the session alive when the app is backgrounded (requires UIBackgroundModes: audio in Info.plist)
  • Testing: Real-device testing across iOS versions — the WKWebView audio stack changes between releases

If you're building a meditation app, a voice analysis tool, a pitch detector, or anything else that needs real microphone data on iOS inside a Capacitor app, this plugin removes the biggest obstacle.


Dr. Felix Zeller is an intensive care physician, emergency doctor, and clinical psychologist based in Zürich. He built shii·haa — a breathwork and biofeedback app — with Perplexity Computer. Try it at shiihaa.app.

