Beehive Acoustic Monitor
Honey bees produce sounds that systematically change with colony state — queen presence, swarming readiness, fanning, hissing, disease. By pairing a small Raspberry Pi with a MEMS microphone inside the hive, you can record continuously, non-invasively, and feed the audio into a classifier that flags interesting events for the beekeeper. This tutorial walks the full path: wire the mic, sample correctly, preprocess the audio, and train a small model.
You'll need
- Raspberry Pi Zero 2 W (~$15, low-power, well suited to in-hive deployment) or Pi 4 (more headroom if you want to train or run inference on-device)
- I²S MEMS microphone — Adafruit SPH0645LM4H breakout ($7-8) for prototyping, or the bare ICS-43434 (~$3) if you're comfortable soldering. Both are flat from 50 Hz to 20 kHz with ~65 dB SNR.
- microSD card (32 GB+), short jumper wires, 3 mm GORE-TEX patch for an acoustic vent
- IP65 ABS project box, PG7 cable glands, silica desiccant packs
- Optional but recommended: solar panel + Li-Ion cell + TP4056 charger for autonomous operation
- Raspberry Pi OS Lite (64-bit)
1. Wire the mic to the Pi (I²S)
I²S sends digital audio over three wires plus power, so it's immune to the noise that plagues long analog mic runs inside a hive.
| SPH0645 / ICS-43434 | Pi GPIO | Pin # |
|---|---|---|
| 3V | 3.3 V | 1 |
| GND | GND | 6 |
| BCLK | GPIO 18 (PCM_CLK) | 12 |
| LRCL | GPIO 19 (PCM_FS) | 35 |
| DOUT | GPIO 20 (PCM_DIN) | 38 |
| SEL | GND (left channel) | any GND |
Enable the I²S kernel module by appending to /boot/firmware/config.txt (or /boot/config.txt on older OS images):
dtparam=i2s=on
dtoverlay=googlevoicehat-soundcard
Reboot, then confirm the card appears:
arecord -l
You should see a snd_rpi_googlevoicehat_soundcar device. That's your mic. If it is listed as card 0, address it as plughw:0; otherwise use the card number that arecord reports.
i2s-mmap work too if you prefer.
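The card index is not guaranteed to be 0, since other sound devices can register first. As a minimal sketch, the helper below parses `arecord -l` output and builds the matching `plughw:` address. The function names and the `needle` substring are illustrative assumptions, and the regular expression assumes the usual `card N: short [long], device M:` line format.

```python
import re
import subprocess

# Matches lines such as:
#   card 1: sndrpigooglevoi [snd_rpi_googlevoicehat_soundcar], device 0: ...
CARD_RE = re.compile(r"^card (\d+): \S+ \[([^\]]*)\], device (\d+):", re.MULTILINE)


def find_capture_device(listing, needle="googlevoicehat"):
    """Return 'plughw:<card>,<device>' for the first card whose name contains needle."""
    for card, long_name, device in CARD_RE.findall(listing):
        if needle in long_name:
            return f"plughw:{card},{device}"
    return None


def detect_mic():
    """Run `arecord -l` (from alsa-utils) and look for the voiceHAT card."""
    result = subprocess.run(["arecord", "-l"], capture_output=True,
                            text=True, check=True)
    return find_capture_device(result.stdout)
```

On the Pi, `detect_mic()` returns a string you can pass to `arecord -D`, or `None` if the overlay did not load.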
2. Set the right sampling parameters
Most beehive sound energy sits below 1 kHz, with useful information up to ~4 kHz. Anything beyond that wastes storage and power.
| Parameter | Edge / production | Research-grade |
|---|---|---|
| Sample rate | 16 kHz | 48 kHz |
| Bit depth | 16-bit | 24-bit |
| Clip length | 5 sec | 10 sec |
| Interval | every 15 min | every 5 min |
| Daily volume | ~15 MB (WAV) | ~415 MB (WAV) |
For a single hive on a Pi Zero 2 W with Wi-Fi upload, 16 kHz / 16-bit / 5 s every 15 min is the sweet spot. FLAC compression knocks ~50% off the daily volume if storage matters.
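To size storage for other settings, the daily volume is simple arithmetic: bytes per clip times clips per day. The sketch below assumes mono audio, packed samples (3 bytes for 24-bit) and ignores the small WAV header on each file; the function name is illustrative.

```python
def daily_wav_bytes(sample_rate_hz, bit_depth, clip_s, interval_min, channels=1):
    """Raw PCM bytes recorded per day (WAV headers not counted)."""
    bytes_per_clip = sample_rate_hz * (bit_depth // 8) * channels * clip_s
    clips_per_day = (24 * 60) // interval_min
    return bytes_per_clip * clips_per_day


edge = daily_wav_bytes(16_000, 16, clip_s=5, interval_min=15)
research = daily_wav_bytes(48_000, 24, clip_s=10, interval_min=5)
print(f"edge:     {edge / 1e6:.1f} MB/day")      # 15.4 MB/day
print(f"research: {research / 1e6:.1f} MB/day")  # 414.7 MB/day
```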
3. Capture clips on a schedule
A tiny shell loop is enough. Save as /home/pi/record_hive.sh:
#!/usr/bin/env bash
set -euo pipefail
OUT=/home/pi/recordings
mkdir -p "$OUT"
TS=$(date +%Y-%m-%d_%H-%M-%S)
arecord -D plughw:0 -f S16_LE -r 16000 -c 1 -d 5 \
"$OUT/$TS.wav"
# Optional: compress to FLAC and remove WAV
flac --silent --delete-input-file "$OUT/$TS.wav"
Drive it from cron — every 15 minutes:
*/15 * * * * /home/pi/record_hive.sh >> /home/pi/recordings/hive.log 2>&1
For continuous (research-grade) recording, see the Acoustic Sound Recorder tutorial for the systemd service pattern with --max-file-time.
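If you would rather drive capture from Python (for example, to run a classifier right after each clip), the same `arecord` call can be wrapped as below. This is a sketch that mirrors the shell script's flags; the function names and defaults are assumptions, and it still relies on `arecord` being installed.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def build_arecord_cmd(out_path, device="plughw:0", rate=16000, seconds=5):
    """Build the arecord argument list: mono, 16-bit, fixed duration."""
    return ["arecord", "-D", device, "-f", "S16_LE", "-r", str(rate),
            "-c", "1", "-d", str(seconds), str(out_path)]


def record_clip(out_dir="/home/pi/recordings", now=None):
    """Record one timestamped clip and return its path."""
    now = now or datetime.now()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    wav = out_dir / (now.strftime("%Y-%m-%d_%H-%M-%S") + ".wav")
    subprocess.run(build_arecord_cmd(wav), check=True)
    return wav
```

Call `record_clip()` from a script that cron runs on the same 15-minute schedule.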
4. Preprocess before features
Raw clips go through bandpass filtering, a noise-floor gate (important; see the comment in the code), normalisation, and windowing. This is the v1.1-fixed pipeline from the source guide:
import librosa
import numpy as np
from scipy import signal

def preprocess(path, sr=16000):
    y, _ = librosa.load(path, sr=sr, mono=True)
    y -= np.mean(y)                                  # DC removal
    sos = signal.butter(4, [20, 4000], btype='bandpass',
                        fs=sr, output='sos')
    y = signal.sosfilt(sos, y)                       # bandpass 20-4000 Hz
    # Critical: gate near-silence BEFORE normalisation.
    # Otherwise a dead mic or winter cluster silence gets amplified
    # to full scale and pollutes training data with pure noise.
    if np.max(np.abs(y)) < 0.01:
        return None
    y = y / np.max(np.abs(y))                        # peak normalise
    # 5-sec windows, 50% overlap
    win = sr * 5
    hop = win // 2
    windows = [y[s:s+win] for s in range(0, len(y) - win + 1, hop)]
    return {'audio': y, 'windows': windows, 'sr': sr}
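To see why the gate has to come before normalisation, here is a small NumPy-only sketch. The 250 Hz tone and the two amplitudes are made-up stand-ins for real recordings, and `gate_then_normalise` is an illustrative helper rather than part of the pipeline above.

```python
import numpy as np


def gate_then_normalise(y, floor=0.01):
    """Return the peak-normalised signal, or None if it never exceeds floor."""
    peak = float(np.max(np.abs(y))) if len(y) else 0.0
    if peak < floor:
        return None
    return y / peak


t = np.arange(16000) / 16000
tone = np.sin(2 * np.pi * 250 * t)
quiet = 0.001 * tone   # stand-in for a dead mic or a silent winter cluster
loud = 0.2 * tone      # stand-in for an active colony

ungated = quiet / np.max(np.abs(quiet))   # normalising without a gate
print(np.max(np.abs(ungated)))            # 1.0: near-silence blown up to full scale
print(gate_then_normalise(quiet))         # None: the clip is dropped instead
```

The 0.01 threshold is the one used in the pipeline; tune it against recordings from your own microphone.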
5. Extract features
Two standard choices for audio classification — pick based on where the model runs:
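- Mel spectrogram (64 × 157 for a 5 s clip at 16 kHz with librosa's default hop of 512 samples, or 64 × 313 with a hop of 256): primary feature for CNN classifiers. Heavier but expressive.
- 13 MFCCs + delta + delta-delta (39 coefficients per frame, summarised over time to a 156-d vector, i.e. four summary values per coefficient): compact, great for edge inference and classical ML.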
With librosa:
# `audio` is the 'audio' array returned by preprocess() above
mel = librosa.feature.melspectrogram(y=audio, sr=16000, n_mels=64, fmax=4000)
log_mel = librosa.power_to_db(mel)
mfcc = librosa.feature.mfcc(y=audio, sr=16000, n_mfcc=13)
delta = librosa.feature.delta(mfcc)
ddelt = librosa.feature.delta(mfcc, order=2)
features = np.concatenate([mfcc, delta, ddelt], axis=0) # (39, T)
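The text does not say which statistics turn the (39, T) matrix into a 156-d vector. One common choice that gives exactly 39 × 4 = 156 values is mean, standard deviation, minimum and maximum over time. The sketch below assumes that choice, and the random matrix only stands in for real MFCC features.

```python
import numpy as np


def summarise_features(features):
    """Collapse a (39, T) matrix into a 156-d vector: mean, std, min, max over time."""
    stats = [
        features.mean(axis=1),
        features.std(axis=1),
        features.min(axis=1),
        features.max(axis=1),
    ]
    return np.concatenate(stats)


demo = np.random.default_rng(0).standard_normal((39, 157))  # placeholder features
vec = summarise_features(demo)
print(vec.shape)  # (156,)
```

Because the output length does not depend on T, clips of different duration map to the same input size.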
6. Train a baseline classifier
Start with the simplest formulation: binary healthy vs. stressed. A CNN on log-Mel spectrograms can reach 90%+ accuracy on the Kaggle Beehive Audio Dataset given a few thousand clips per class.
- Optimiser: AdamW, weight decay 1e-4, cosine annealing
- Label smoothing: 0.1 (bee labels are inherently noisy)
- Augmentation: gain jitter ±20%, time shift ±1 s, SpecAugment (time mask ≤20 frames, freq mask ≤8 bins). Aim for 500+ clips per class after augmentation.
- Loss: weighted cross-entropy if classes are imbalanced (queenless and dead are always rare)
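The augmentation and class-weighting items above can be sketched in plain NumPy. Several details here are assumptions rather than anything the recipe specifies: the time shift is circular (`np.roll`), masked regions are filled with the spectrogram's minimum value, and class weights use the inverse-frequency formula n / (k × count).

```python
import numpy as np


def augment_waveform(y, sr, rng, gain_jitter=0.2, max_shift_s=1.0):
    """Apply random gain (±20% by default) and a circular time shift (±1 s)."""
    gain = 1.0 + rng.uniform(-gain_jitter, gain_jitter)
    max_shift = int(max_shift_s * sr)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(y * gain, shift)


def spec_augment(log_mel, rng, max_time=20, max_freq=8):
    """Mask one random time band (≤20 frames) and one frequency band (≤8 bins)."""
    out = log_mel.copy()
    fill = out.min()
    n_mels, n_frames = out.shape
    t = int(rng.integers(0, max_time + 1))
    t0 = int(rng.integers(0, max(1, n_frames - t + 1)))
    out[:, t0:t0 + t] = fill
    f = int(rng.integers(0, max_freq + 1))
    f0 = int(rng.integers(0, max(1, n_mels - f + 1)))
    out[f0:f0 + f, :] = fill
    return out


def class_weights(labels):
    """Inverse-frequency weights for a weighted cross-entropy loss."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))
```

With 90 healthy and 10 stressed clips, `class_weights` gives the rare class a weight of 5.0 and the common class about 0.56.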
7. Deploy back to the Pi (optional)
Once trained, convert to TFLite and run on-device so only labels (not audio) leave the hive:
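import tensorflow as tf

def representative_data():  # calib_features: placeholder for ~100 real float32 model inputs
    for x in calib_features[:100]:
        yield [x[None, ...]]

conv = tf.lite.TFLiteConverter.from_saved_model('beehive_model')
conv.optimizations = [tf.lite.Optimize.DEFAULT]
conv.representative_dataset = representative_data  # calibration data, needed for full INT8
conv.target_spec.supported_types = [tf.int8] # INT8 quantisation
with open('beehive_model_int8.tflite', 'wb') as f:
    f.write(conv.convert())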
A quantised 1D-CNN on MFCC summaries fits in ~50-100 KB and infers in milliseconds on a Pi Zero 2 W. Run it from the cron script after each capture and POST the label to a small endpoint instead of uploading audio.
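On the Pi, running the converted model could look like the sketch below. It assumes the `tflite_runtime` package is installed and that the model takes one feature array and returns one score per class; the helper names are illustrative. The input is quantised only if the model reports an int8 input, since a converted model may keep float32 inputs and outputs.

```python
import numpy as np


def to_model_dtype(x, input_detail):
    """Quantise float features if the model expects int8, else pass float32."""
    if input_detail["dtype"] == np.int8:
        scale, zero_point = input_detail["quantization"]
        q = np.round(x / scale) + zero_point
        return np.clip(q, -128, 127).astype(np.int8)
    return x.astype(np.float32)


def classify(model_path, features):
    """Run one inference and return the index of the highest-scoring class."""
    # Imported lazily so the helper above works without the runtime installed.
    from tflite_runtime.interpreter import Interpreter

    interp = Interpreter(model_path=model_path)
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    out = interp.get_output_details()[0]
    x = to_model_dtype(features[np.newaxis, ...], inp)  # add batch dimension
    interp.set_tensor(inp["index"], x)
    interp.invoke()
    scores = interp.get_tensor(out["index"])[0]
    return int(np.argmax(scores))
```

The returned index maps to whatever label order you used in training, so store that mapping next to the model file.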
Hive placement
- Centre of the brood box, between frames 4 and 5. Avoid top bars (vibration) and frame edges (dead acoustics).
- Mic membrane facing downward so wax and propolis don't accumulate.
- ~10 cm above the bottom board, ~20 cm from the entrance (cuts wind and outside insect noise).
- Seal the enclosure fully and use silica gel packs inside — the 35°C hive vs. ambient temp gradient will cause condensation on any vented box. Replace desiccant every 6 months.
Bill of materials (per hive)
| Item | Notes | ~AUD |
|---|---|---|
| Pi Zero 2 W | or ESP32-S3 for lower power | $25 |
| ICS-43434 / SPH0645 | I²S MEMS mic | $3–8 |
| microSD 32 GB | Class 10 | $5 |
| IP65 enclosure + cable glands | ABS, PG7 | $7 |
| GORE-TEX acoustic vent | 3 mm patch | $1 |
| Solar panel + Li-Ion + TP4056 | autonomous power | $22 |
| Desiccant packs | silica gel | $1 |
| Total | | ~$65 |
Where to go next
- Public datasets to bootstrap with: Kaggle Beehive Audio Dataset (8k clips, 5 classes), OSBH Sound Dataset (multi-year), MLBeeHive (University of Bologna, ground-truth inspection linked)
- Open-source platforms: Open Source Beehives, Edge Impulse for end-to-end TinyML
- Important caveat: Varroa-by-acoustics is not field-validated. The 225 Hz signature attributed to mites in older papers overlaps normal bee wing-beat noise and cannot be reliably separated outside controlled lab settings. Use alcohol wash or sugar roll counting as the diagnostic, and treat acoustic Varroa signals as a research pointer only.
- Remote access to recordings: pair with the Cloudflare Tunnel tutorial for a private endpoint without opening router ports.