By the Outspoken Team · March 22, 2026

Wake Word Detection on Raspberry Pi

A Raspberry Pi with a USB microphone and a custom wake word model is a capable voice trigger for almost anything: a DIY smart speaker, a Home Assistant satellite, a voice-controlled robot, or a hands-free kiosk. This guide walks through the full setup — hardware selection, software installation, writing a detection script, triggering actions, and running everything as a background service.

Hardware Requirements

Raspberry Pi model: Pi 3B+ is the minimum. Pi 4 or Pi 5 is recommended if you want lower detection latency and headroom for other processes. The Pi Zero 2W will work but is tight on CPU time.

Microphone: Any USB microphone works. For a basic build, a cheap $5–10 USB mic is fine. For better pickup in noisy environments, the ReSpeaker 2-Mic HAT (which attaches via the GPIO header) or the MATRIX Voice gives you multiple microphones and on-board audio processing.

Memory and storage: The wake word model file from Outspoken is 50–400 KB depending on the layer size you chose at training time. The shared embedding and mel spectrogram models add another ~3 MB combined. RAM usage at runtime is around 50–100 MB including the Python process.

No GPU required

ONNX Runtime runs entirely on the Pi CPU. The openWakeWord inference loop is lightweight — on a Pi 4, it processes each 80ms audio chunk in under 5ms, leaving plenty of CPU headroom.
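You can verify that figure on your own hardware. The helper below is a minimal timing sketch (the `time_per_chunk` name is ours, not part of openWakeWord); pass it the `model.predict` method from the detection script later in this guide:

```python
import time

import numpy as np

def time_per_chunk(predict, n_chunks=100, chunk_size=1280):
    """Average wall-clock time in ms that `predict` takes per 80ms audio chunk."""
    chunk = np.zeros(chunk_size, dtype=np.int16)  # one chunk of silence
    start = time.perf_counter()
    for _ in range(n_chunks):
        predict(chunk)
    return (time.perf_counter() - start) / n_chunks * 1000.0
```

On a Pi 4, `time_per_chunk(model.predict)` should report well under 5ms; anything approaching 80ms means the loop can no longer keep up with real-time audio.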

Software Setup

Operating System

Use Raspberry Pi OS (64-bit) on a Pi 4 or Pi 5. The 64-bit image gives you access to arm64 Python wheels, which saves compilation time when installing ONNX Runtime.

On a Pi 3B+, the default OS is 32-bit (armv7). That works, but ONNX Runtime may not have a prebuilt wheel for your exact Python version -- see the note below.

Install Dependencies

Update the system and install audio libraries:

sudo apt update && sudo apt upgrade -y
sudo apt install -y python3-pip python3-venv portaudio19-dev

Create a virtual environment and install the Python packages:

python3 -m venv ~/wakeword-env
source ~/wakeword-env/bin/activate
pip install openwakeword sounddevice

openwakeword pulls in onnxruntime as a dependency. On 64-bit Pi OS this resolves to the standard onnxruntime wheel. On 32-bit systems, the install may fail or fall back to a CPU-only build -- in that case, install the arm32 wheel manually:

# Only needed on 32-bit Pi OS (armv7)
pip install https://github.com/nknytk/built-onnxruntime-for-raspberrypi-linux/raw/master/wheels/bullseye/onnxruntime-1.16.3-cp311-cp311-linux_armv7l.whl

Check your architecture first

Run uname -m to confirm your architecture. You'll see aarch64 (64-bit) or armv7l (32-bit). The standard pip install onnxruntime works on aarch64. For armv7l, you need a pre-built wheel from a community repo or you'll need to compile from source.

Download Your Model Files

The openWakeWord pipeline uses three ONNX models:

  1. melspectrogram.onnx -- converts raw audio to a mel spectrogram
  2. embedding_model.onnx -- maps the spectrogram to a feature embedding
  3. your_wake_word.onnx -- classifies the embedding as wake word or not

The first two are shared models that openWakeWord provides and downloads automatically on first run (depending on your openWakeWord version, you may need to fetch them once with openwakeword.utils.download_models()). Your custom model is the third one -- download it from the Outspoken dashboard and copy it to your Pi:

# From your local machine
scp ~/Downloads/hey_jarvis.onnx pi@raspberrypi.local:~/wakeword-models/

Or download directly on the Pi using the signed URL from the dashboard:

mkdir -p ~/wakeword-models
cd ~/wakeword-models
wget -O hey_jarvis.onnx "https://your-signed-download-url"

Detection Script

The following script listens continuously from the default audio input and prints a message whenever the wake word is detected. It uses openWakeWord's high-level API, which handles the mel spectrogram and embedding steps automatically.

#!/usr/bin/env python3
"""Wake word detection using openWakeWord."""
 
import numpy as np
import sounddevice as sd
from openwakeword.model import Model
 
# Path to your custom model
MODEL_PATH = "/home/pi/wakeword-models/hey_jarvis.onnx"
 
# Detection threshold: 0.0–1.0
# Higher = fewer false activations, more missed detections
THRESHOLD = 0.5
 
# Audio settings -- must match what openWakeWord expects
SAMPLE_RATE = 16000
CHUNK_SIZE = 1280  # 80ms at 16kHz (1280 samples / 16000 Hz)
 
def on_detection(score: float) -> None:
    """Called when the wake word is detected."""
    print(f"Wake word detected! Score: {score:.3f}")
    # Add your action here (see examples below)
 
def main() -> None:
    model = Model(wakeword_models=[MODEL_PATH], inference_framework="onnx")
 
    print(f"Listening for wake word (threshold={THRESHOLD})...")
 
    with sd.InputStream(
        samplerate=SAMPLE_RATE,
        channels=1,
        dtype="float32",
        blocksize=CHUNK_SIZE,
    ) as stream:
        while True:
            audio_chunk, _ = stream.read(CHUNK_SIZE)
            audio_flat = audio_chunk.flatten()
 
            # openWakeWord expects int16 audio; scale from float32 and clip
            # so a full-scale sample can't overflow the int16 range
            audio_int16 = np.clip(audio_flat * 32767, -32768, 32767).astype(np.int16)
 
            prediction = model.predict(audio_int16)
 
            for wake_word, score in prediction.items():
                if score >= THRESHOLD:
                    on_detection(score)
 
if __name__ == "__main__":
    main()

Test it from the command line:

source ~/wakeword-env/bin/activate
python3 ~/wakeword-detector.py

Say your wake word. You should see "Wake word detected!" printed in the terminal within about 100ms of finishing the phrase -- roughly one 80ms chunk plus inference time.

Picking a threshold

Start with 0.5. If you're getting false activations from background speech, raise it to 0.6–0.7. If the model is missing real activations, lower it to 0.3–0.4. This adjusts runtime sensitivity without retraining.
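To build intuition for what the threshold does, here is a hypothetical trace of per-chunk scores around one utterance (the numbers are made up for illustration):

```python
scores = [0.04, 0.11, 0.36, 0.72, 0.91, 0.55, 0.09]  # hypothetical per-chunk scores

for threshold in (0.3, 0.5, 0.7):
    fired = [s for s in scores if s >= threshold]
    print(f"threshold={threshold}: fires on {len(fired)} chunk(s): {fired}")
```

Raising the threshold from 0.3 to 0.7 here cuts the activations from four chunks to two. Note that a single utterance stays above threshold across several consecutive chunks, which is why the cooldown pattern shown later is worth adding.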

Triggering Actions

Replace the on_detection function body with whatever you want to happen when the wake word fires.

Run a shell command:

import subprocess
 
def on_detection(score: float) -> None:
    subprocess.Popen(["aplay", "/home/pi/sounds/chime.wav"])

Call a webhook (e.g., a Home Assistant automation):

import requests
 
HA_WEBHOOK_URL = "http://homeassistant.local:8123/api/webhook/my-wake-word"
 
def on_detection(score: float) -> None:
    try:
        requests.post(HA_WEBHOOK_URL, timeout=2)
    except requests.exceptions.RequestException:
        pass  # Don't crash the detection loop on network errors
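One caveat: requests.post blocks the detection loop for up to the timeout, during which audio chunks back up. A sketch that offloads the call to a background thread instead (the _post_webhook helper name is ours):

```python
import threading

import requests

HA_WEBHOOK_URL = "http://homeassistant.local:8123/api/webhook/my-wake-word"

def _post_webhook() -> None:
    try:
        requests.post(HA_WEBHOOK_URL, timeout=2)
    except requests.exceptions.RequestException:
        pass  # Don't crash the detection loop on network errors

def on_detection(score: float) -> None:
    # Fire-and-forget: the audio loop never waits on the network
    threading.Thread(target=_post_webhook, daemon=True).start()
```

The daemon flag means a hung request can never keep the process from exiting.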

Toggle a GPIO pin (e.g., to signal another microcontroller):

import RPi.GPIO as GPIO
 
GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.OUT, initial=GPIO.LOW)
 
def on_detection(score: float) -> None:
    GPIO.output(17, GPIO.HIGH)
    # Reset after 500ms in a separate thread if needed
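The comment above hints at resetting the pin without stalling the detection loop. A minimal sketch using threading.Timer (the pulse helper is our name; the dict below stands in for a GPIO pin so the sketch runs anywhere):

```python
import threading
import time

def pulse(set_high, set_low, duration: float = 0.5) -> None:
    """Drive a signal high, then schedule it low after `duration` seconds
    without blocking the caller."""
    set_high()
    threading.Timer(duration, set_low).start()

# Stand-in for a GPIO pin so the sketch is testable off-Pi
state = {"pin17": 0}
pulse(lambda: state.update(pin17=1), lambda: state.update(pin17=0), duration=0.1)
time.sleep(0.2)  # by now the timer has fired and reset the "pin"
```

On the Pi itself you would call pulse(lambda: GPIO.output(17, GPIO.HIGH), lambda: GPIO.output(17, GPIO.LOW)) from on_detection.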

Add a cooldown to prevent repeated triggers:

import time
 
last_trigger = 0.0
COOLDOWN_SECONDS = 3.0
 
def on_detection(score: float) -> None:
    global last_trigger
    now = time.monotonic()
    if now - last_trigger < COOLDOWN_SECONDS:
        return
    last_trigger = now
    print(f"Wake word detected! Score: {score:.3f}")
    # Your action here

Running as a systemd Service

To start wake word detection automatically on boot, create a systemd service unit.

Create the file /etc/systemd/system/wakeword.service:

[Unit]
Description=Wake Word Detector
After=network.target sound.target
 
[Service]
Type=simple
User=pi
WorkingDirectory=/home/pi
ExecStart=/home/pi/wakeword-env/bin/python3 /home/pi/wakeword-detector.py
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
 
[Install]
WantedBy=multi-user.target

Enable and start the service:

sudo systemctl daemon-reload
sudo systemctl enable wakeword.service
sudo systemctl start wakeword.service

Check that it's running:

sudo systemctl status wakeword.service

View live logs:

journalctl -u wakeword.service -f

Audio device access

The pi user must be in the audio group to access the microphone. Check with groups pi. If audio is missing, add it with sudo usermod -aG audio pi and reboot.

Choosing the Right Microphone Input

If your Pi has multiple audio devices (built-in headphone jack, USB mic, HAT), sounddevice may not pick the right one automatically. List available devices:

source ~/wakeword-env/bin/activate
python3 -c "import sounddevice; print(sounddevice.query_devices())"

Note the index of your USB microphone. Pass it explicitly to sd.InputStream:

with sd.InputStream(
    samplerate=SAMPLE_RATE,
    channels=1,
    dtype="float32",
    blocksize=CHUNK_SIZE,
    device=2,  # Replace with your mic's index
) as stream:

You can also set a system-wide default with ALSA, but explicit device selection in the script is more reliable across reboots.
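Rather than hard-coding an index, you can search the device list for your mic by name. A sketch (the find_input_device helper is ours); each entry returned by sd.query_devices() exposes name and max_input_channels fields:

```python
def find_input_device(devices, keyword: str = "usb"):
    """Return the index of the first input-capable device whose name
    contains `keyword` (case-insensitive), or None if nothing matches."""
    for index, device in enumerate(devices):
        if device["max_input_channels"] > 0 and keyword in device["name"].lower():
            return index
    return None
```

Then pass device=find_input_device(sd.query_devices()) to sd.InputStream, falling back to the default device if it returns None. This survives reboots and re-enumeration better than a fixed index.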

Home Assistant Integration

If you're running this Pi as a voice satellite for Home Assistant, the approach above (posting to a webhook from on_detection) is the simplest path. For full two-way voice interaction -- wake word, speech-to-text, response, text-to-speech -- the Wyoming protocol integrates directly with Home Assistant's voice assistant pipelines.

See the Home Assistant integration guide →

Next Steps

With detection running and a service keeping it alive across reboots, the Pi is a capable always-on voice trigger.

The ONNX model you downloaded from Outspoken is self-contained -- no cloud calls, no API keys, no recurring fees after training. Everything runs locally on the Pi.


For the full Python implementation with raw ONNX Runtime inference and more integration patterns, see the Python wake word detection guide.


Don't have a model yet? Sign up for Outspoken -- train a custom wake word for free, download the ONNX model, and have it running on your Pi in an afternoon.