By the Outspoken Team · March 22, 2026
Wake Word Detection on Raspberry Pi
A Raspberry Pi with a USB microphone and a custom wake word model is a capable voice trigger for almost anything: a DIY smart speaker, a Home Assistant satellite, a voice-controlled robot, or a hands-free kiosk. This guide walks through the full setup — hardware selection, software installation, writing a detection script, triggering actions, and running everything as a background service.
Hardware Requirements
Raspberry Pi model: Pi 3B+ is the minimum. Pi 4 or Pi 5 is recommended if you want lower detection latency and headroom for other processes. The Pi Zero 2W will work but is tight on CPU time.
Microphone: Any USB microphone works. For a basic build, a cheap $5–10 USB mic is fine. For better pickup quality in noisy environments, the ReSpeaker 2-Mic HAT (attaches via GPIO) or the MATRIX Voice give you hardware noise suppression.
Memory and storage: The wake word model file from Outspoken is 50–400 KB depending on the layer size you chose at training time. The shared embedding and mel spectrogram models add another ~3 MB combined. RAM usage at runtime is around 50–100 MB including the Python process.
No GPU required
ONNX Runtime runs entirely on the Pi CPU. The openWakeWord inference loop is lightweight — on a Pi 4, it processes each 80ms audio chunk in under 5ms, leaving plenty of CPU headroom.
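If you want to verify that figure on your own board, a generic timing helper (our own sketch, nothing openWakeWord-specific; the name time_per_call is made up) can wrap model.predict:

```python
import time

def time_per_call(fn, *args, repeats=100):
    """Return the average wall-clock seconds per call of fn(*args)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats
```

With the detection script from later in this guide loaded, time_per_call(model.predict, audio_int16) should come back well under 0.005 on a Pi 4 if the numbers above hold.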
Software Setup
Operating System
Use Raspberry Pi OS (64-bit) on a Pi 4 or Pi 5. The 64-bit image gives you access to arm64 Python wheels, which saves compilation time when installing ONNX Runtime.
On a Pi 3B+, the default OS is 32-bit (armv7). That works, but ONNX Runtime may not have a prebuilt wheel for your exact Python version -- see the note below.
Install Dependencies
Update the system and install audio libraries:
sudo apt update && sudo apt upgrade -y
sudo apt install -y python3-pip python3-venv portaudio19-dev
Create a virtual environment and install the Python packages:
python3 -m venv ~/wakeword-env
source ~/wakeword-env/bin/activate
pip install openwakeword sounddevice
openwakeword pulls in onnxruntime as a dependency. On 64-bit Pi OS this resolves to the standard onnxruntime wheel. On 32-bit systems, the install may fail or fall back to a CPU-only build -- in that case, install the arm32 wheel manually:
# Only needed on 32-bit Pi OS (armv7)
pip install https://github.com/nknytk/built-onnxruntime-for-raspberrypi-linux/raw/master/wheels/bullseye/onnxruntime-1.16.3-cp311-cp311-linux_armv7l.whl
Check your architecture first
Run uname -m to confirm your architecture. You'll see aarch64 (64-bit) or armv7l (32-bit). The standard pip install onnxruntime works on aarch64. For armv7l, you need a pre-built wheel from a community repo or you'll need to compile from source.
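The same decision can be scripted if you're setting up more than one Pi (a sketch; the function name onnxruntime_hint is our own):

```shell
# Print the right onnxruntime install approach for a given architecture
onnxruntime_hint() {
    case "$1" in
        aarch64) echo "pip install onnxruntime" ;;
        armv7l)  echo "install a community-built armv7l wheel" ;;
        *)       echo "unsupported: $1" ;;
    esac
}
onnxruntime_hint "$(uname -m)"
```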
Download Your Model Files
The openWakeWord pipeline uses three ONNX models:
- melspectrogram.onnx -- converts raw audio to a mel spectrogram
- embedding_model.onnx -- maps the spectrogram to a feature embedding
- your_wake_word.onnx -- classifies the embedding as wake word or not
The first two are shared models that openWakeWord provides and downloads automatically on first run. Your custom model is the third one -- download it from the Outspoken dashboard and copy it to your Pi:
# From your local machine
scp ~/Downloads/hey_jarvis.onnx pi@raspberrypi.local:~/wakeword-models/
Or download directly on the Pi using the signed URL from the dashboard:
mkdir -p ~/wakeword-models
cd ~/wakeword-models
wget -O hey_jarvis.onnx "https://your-signed-download-url"
Detection Script
The following script listens continuously from the default audio input and prints a message whenever the wake word is detected. It uses openWakeWord's high-level API, which handles the mel spectrogram and embedding steps automatically.
#!/usr/bin/env python3
"""Wake word detection using openWakeWord."""
import numpy as np
import sounddevice as sd
from openwakeword.model import Model
# Path to your custom model
MODEL_PATH = "/home/pi/wakeword-models/hey_jarvis.onnx"
# Detection threshold: 0.0–1.0
# Higher = fewer false activations, more missed detections
THRESHOLD = 0.5
# Audio settings -- must match what openWakeWord expects
SAMPLE_RATE = 16000
CHUNK_SIZE = 1280 # ~80ms at 16kHz
def on_detection(score: float) -> None:
    """Called when the wake word is detected."""
    print(f"Wake word detected! Score: {score:.3f}")
    # Add your action here (see examples below)

def main() -> None:
    model = Model(wakeword_models=[MODEL_PATH], inference_framework="onnx")
    print(f"Listening for wake word (threshold={THRESHOLD})...")
    with sd.InputStream(
        samplerate=SAMPLE_RATE,
        channels=1,
        dtype="float32",
        blocksize=CHUNK_SIZE,
    ) as stream:
        while True:
            audio_chunk, _ = stream.read(CHUNK_SIZE)
            audio_flat = audio_chunk.flatten()
            # openWakeWord expects int16 audio; scale from float32 in [-1, 1]
            # (32767, not 32768, so a full-scale sample can't overflow int16)
            audio_int16 = (audio_flat * 32767).astype(np.int16)
            prediction = model.predict(audio_int16)
            for wake_word, score in prediction.items():
                if score >= THRESHOLD:
                    on_detection(score)

if __name__ == "__main__":
    main()
Test it from the command line:
source ~/wakeword-env/bin/activate
python3 ~/wakeword-detector.py
Say your wake word. You should see "Wake word detected!" printed in the terminal within about 100ms.
Picking a threshold
Start with 0.5. If you're getting false activations from background speech, raise it to 0.6–0.7. If the model is missing real activations, lower it to 0.3–0.4. This adjusts runtime sensitivity without retraining.
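If a single-chunk threshold is too jittery, another option is to average the last few scores before comparing. This is a sketch of our own, not an openWakeWord feature; the three-chunk window is an arbitrary starting point:

```python
from collections import deque

class SmoothedTrigger:
    """Fire only when the mean of the last `window` scores crosses the threshold."""

    def __init__(self, threshold=0.5, window=3):
        self.threshold = threshold
        self.scores = deque(maxlen=window)

    def update(self, score):
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough history yet
        return sum(self.scores) / len(self.scores) >= self.threshold
```

In the detection loop, replace the score >= THRESHOLD check with trigger.update(score); a single-chunk spike then no longer fires on its own.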
Triggering Actions
Replace the on_detection function body with whatever you want to happen when the wake word fires.
Run a shell command:
import subprocess
def on_detection(score: float) -> None:
    subprocess.Popen(["aplay", "/home/pi/sounds/chime.wav"])
Call a webhook (e.g., a Home Assistant automation):
import requests
HA_WEBHOOK_URL = "http://homeassistant.local:8123/api/webhook/my-wake-word"
def on_detection(score: float) -> None:
    try:
        requests.post(HA_WEBHOOK_URL, timeout=2)
    except requests.exceptions.RequestException:
        pass  # Don't crash the detection loop on network errors
Toggle a GPIO pin (e.g., to signal another microcontroller):
import RPi.GPIO as GPIO
GPIO.setmode(GPIO.BCM)
GPIO.setup(17, GPIO.OUT, initial=GPIO.LOW)
def on_detection(score: float) -> None:
    GPIO.output(17, GPIO.HIGH)
    # Reset after 500ms in a separate thread if needed
Add a cooldown to prevent repeated triggers:
import time
last_trigger = 0.0
COOLDOWN_SECONDS = 3.0
def on_detection(score: float) -> None:
    global last_trigger
    now = time.monotonic()
    if now - last_trigger < COOLDOWN_SECONDS:
        return
    last_trigger = now
    print(f"Wake word detected! Score: {score:.3f}")
    # Your action here
Running as a systemd Service
To start wake word detection automatically on boot, create a systemd service unit.
Create the file /etc/systemd/system/wakeword.service:
[Unit]
Description=Wake Word Detector
After=network.target sound.target
[Service]
Type=simple
User=pi
WorkingDirectory=/home/pi
ExecStart=/home/pi/wakeword-env/bin/python3 /home/pi/wakeword-detector.py
Restart=on-failure
RestartSec=5
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Enable and start the service:
sudo systemctl daemon-reload
sudo systemctl enable wakeword.service
sudo systemctl start wakeword.service
Check that it's running:
sudo systemctl status wakeword.service
View live logs:
journalctl -u wakeword.service -f
Audio device access
The pi user must be in the audio group to access the microphone. Check with groups pi. If audio is missing, add it with sudo usermod -aG audio pi and reboot.
Choosing the Right Microphone Input
If your Pi has multiple audio devices (built-in headphone jack, USB mic, HAT), sounddevice may not pick the right one automatically. List available devices:
source ~/wakeword-env/bin/activate
python3 -c "import sounddevice; print(sounddevice.query_devices())"
Note the index of your USB microphone. Pass it explicitly to sd.InputStream:
with sd.InputStream(
    samplerate=SAMPLE_RATE,
    channels=1,
    dtype="float32",
    blocksize=CHUNK_SIZE,
    device=2,  # Replace with your mic's index
) as stream:
You can also set a system-wide default with ALSA, but explicit device selection in the script is more reliable across reboots.
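Because USB enumeration order isn't guaranteed, the numeric index can also change between reboots. A small helper of our own can select the device by name instead; it takes the sequence that sounddevice.query_devices() returns (each entry has "name" and "max_input_channels" keys):

```python
def find_input_device(devices, name_hint="USB"):
    """Return the index of the first input device whose name contains name_hint."""
    for index, dev in enumerate(devices):
        if dev["max_input_channels"] > 0 and name_hint.lower() in dev["name"].lower():
            return index
    return None
```

In the script, pass device=find_input_device(sd.query_devices()) instead of a hard-coded index.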
Home Assistant Integration
If you're running this Pi as a voice satellite for Home Assistant, the approach above (posting to a webhook from on_detection) is the simplest path. For full two-way voice interaction -- wake word, speech-to-text, response, text-to-speech -- the Wyoming protocol integrates directly with Home Assistant's voice assistant pipelines.
Next Steps
With detection running and a service keeping it alive across reboots, the Pi is a capable always-on voice trigger. From here you can:
- Pipe the post-wake-word audio to a speech-to-text service (Whisper, Vosk)
- Chain detection into a full voice assistant pipeline
- Use multiple wake words by loading several .onnx models at once (openWakeWord supports this natively)
- Add a visual indicator LED that pulses on detection
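With several models loaded, model.predict returns one score per wake word, so each name can map to its own action. A sketch (the wake word names and actions here are made up):

```python
# Hypothetical per-wake-word actions; keys must match the loaded model names
ACTIONS = {
    "hey_jarvis": lambda: print("chime"),
    "ok_robot": lambda: print("beep"),
}

def dispatch(prediction, threshold=0.5):
    """Run the action for every wake word whose score crosses the threshold."""
    fired = []
    for name, score in prediction.items():
        if score >= threshold and name in ACTIONS:
            ACTIONS[name]()
            fired.append(name)
    return fired
```

Calling dispatch(prediction) in place of the single-threshold check in the main loop keeps the per-word logic in one table.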
The ONNX model you downloaded from Outspoken is self-contained -- no cloud calls, no API keys, no recurring fees after training. Everything runs locally on the Pi.
For the full Python implementation with raw ONNX Runtime inference and more integration patterns, see the Python wake word detection guide.
Don't have a model yet? Sign up for Outspoken -- train a custom wake word for free, download the ONNX model, and have it running on your Pi in an afternoon.