Gathering Sensor Data using Keyword Spotting

With the growing adoption of neural networks, microcontrollers have become an exciting new area for their practical application.

The Development Board

My journey started with the course Fundamentals of TinyML and the Tiny Machine Learning Kit.

This kit includes an Arduino Nano 33 BLE Sense, which carries a range of onboard sensors:

  • 9-axis inertial measurement unit (IMU)

  • humidity and temperature sensor

  • barometric pressure sensor

  • microphone

  • gesture, proximity, light color, and light intensity sensor

The microcontroller features a considerably faster CPU (a 64 MHz Arm Cortex-M4) than traditional Arduino boards and has a small form factor. It also includes an integrated voltage regulator, which makes it easy to connect batteries with higher voltage levels.

Other Considerations

I also purchased the Seeed XIAO ESP32C3, which includes WiFi, as well as the XIAO nRF52840 Sense, which includes a microphone and an inertial sensor, and connected them to Seeed's Round Display. However, the Arduino library provided by Seeed required many configuration changes. The two Seeed microcontrollers also lacked an internal voltage regulator, which is why I ultimately went back to the Arduino Nano. Although the Arduino Nano does not come with an integrated battery management system, it still required fewer external modules than the Seeed microcontrollers.

The Components

I also purchased an OLED display, a battery management system, a Micro-SD reader, and a 3.7V lithium-polymer battery. All of the components were compatible with standard Arduino libraries and communicated over either SPI or I2C.

The Algorithm

The algorithm I wanted to run on the controller consisted of a keyword trigger, functionality to display the detected trigger word on a screen, and functionality to save both the trigger word and the sensor data to a Micro-SD card.

While there is some publicly available code for running tflite-micro (TensorFlow Lite for Microcontrollers) as a standalone library, I decided to use Edge Impulse. Edge Impulse also takes care of acquiring data, slicing it, preprocessing the audio data with its mel filterbank energy (MFE) block, selecting a suitable model architecture and hyperparameters, and exporting the result as either an Arduino or a C++ library.
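To give a rough sense of what this preprocessing step produces, here is a minimal log-mel feature sketch in Python. It is an illustration only: the sample rate, frame sizes, mel band count, and the file name keyword.wav are assumptions, not Edge Impulse's exact parameters.

import librosa

# Load one second of audio; 16 kHz mono is an assumed capture format.
audio, sr = librosa.load("keyword.wav", sr=16000, mono=True, duration=1.0)

# Mel filterbank energies over short frames, roughly what an
# MFE-style block computes before the samples reach the classifier.
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_fft=512, hop_length=256, n_mels=40)
log_mel = librosa.power_to_db(mel)  # log scale, shape: (n_mels, frames)
print(log_mel.shape)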

The final loop runs the inference, which detects keywords within a one-second timeframe, prints the detected keyword to the display, and saves the keyword audio, collected in a cyclical buffer, to the SD card.

void loop()
{
    // Record the next slice of audio from the microphone.
    microphone_inference_record();

    signal_t signal;
    signal.total_length = EI_CLASSIFIER_SLICE_SIZE;
    signal.get_data = &microphone_audio_signal_get_data;
    ei_impulse_result_t result = {0};

    // Classify the slice as part of the rolling one-second window.
    EI_IMPULSE_ERROR r = run_classifier_continuous(&signal, &result, debug_nn);
    if (r != EI_IMPULSE_OK)
    {
        displayAndPrint("ERR: Failed to run classifier.");
        return;
    }

    // Only evaluate the results once a full model window has been processed.
    if (++print_results >= (EI_CLASSIFIER_SLICES_PER_MODEL_WINDOW))
    {
        // Assume the cyclical buffer is fully valid and of length EI_CLASSIFIER_RAW_SAMPLE_COUNT
        signed short *classified_buffer = inference.cyclical_buffer;
        unsigned int buffer_length = EI_CLASSIFIER_RAW_SAMPLE_COUNT;

        for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++)
        {
            // Report and store any keyword detected with more than 50% confidence.
            if (result.classification[ix].value > 0.5)
            {
                displayAndPrint((String(result.classification[ix].label) + ": " + result.classification[ix].value).c_str());
                saveAudioToSD(result.classification[ix].label, classified_buffer, buffer_length);
            }
        }
        print_results = 0;
    }

    clearDisplayAfterDuration();
}

The saved audio data is then converted to the WAV file format and classified again using a more powerful model that is not resource-constrained; a short sketch of the conversion step follows.
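This is a minimal conversion sketch, assuming the SD card dump is raw 16-bit little-endian mono PCM at 16 kHz; the file names and the sample rate are assumptions rather than details from the project. It wraps the raw samples in a WAV header using Python's standard wave module.

import wave

# Read the raw dump written by saveAudioToSD (hypothetical file name).
with open("keyword_raw.pcm", "rb") as f:
    pcm = f.read()

# Wrap the samples in a WAV header so standard audio tooling can read them.
with wave.open("keyword.wav", "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(16000)  # assumed sample rate
    wav.writeframes(pcm)

The architectures considered for this second-stage classifier were a 1D CNN, a 2D CNN, a ResNet, an EfficientNet, and a Vision Transformer.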

| Model Name         | Total Parameters | Size (KB) | Test Loss | Test Accuracy |
| ------------------ | ---------------- | --------- | --------- | ------------- |
| 1D Neural Net      | 203,783          | 796       | 0.60      | 79%           |
| 2D Neural Net      | 109,959          | 429       | 0.32      | 91%           |
| ResNet             | 43,079           | 168       | 0.28      | 93%           |
| EfficientNet       | 26,311           | 103       | 0.22      | 94%           |
| Vision Transformer | 80,479           | 314       | 0.42      | 91%           |

As the table shows, the EfficientNet-like architecture performed best: it achieved the highest accuracy and the lowest test loss while also being the smallest model. This is the code for the simplified EfficientNet architecture:

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, DepthwiseConv2D, Dense, BatchNormalization, ReLU, GlobalAveragePooling2D, Reshape, Dropout, Input
from tensorflow.keras.optimizers import Adam
from typing import Tuple
from sklearn.utils.class_weight import compute_class_weight

def mb_conv_block(inputs, filters, kernel_size, strides):
    # Simplified MBConv-style block: convolution, depthwise convolution, then a linear projection.
    x = Conv2D(filters, kernel_size, strides=strides, padding='same', use_bias=False)(inputs)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    x = DepthwiseConv2D(kernel_size=(3, 3), strides=(1, 1), padding='same', use_bias=False)(x)
    x = BatchNormalization()(x)
    x = ReLU()(x)

    x = Conv2D(filters, kernel_size=(1, 1), strides=(1, 1), padding='same', use_bias=False)(x)
    x = BatchNormalization()(x)
    return x

def create_efficientnet_like_model(input_shape: Tuple[int, int, int], num_classes: int) -> Model:
    inputs = Input(shape=input_shape)
    x = mb_conv_block(inputs, filters=32, kernel_size=(3, 3), strides=(2, 2))
    x = mb_conv_block(x, filters=64, kernel_size=(3, 3), strides=(1, 1))
    x = mb_conv_block(x, filters=128, kernel_size=(3, 3), strides=(2, 2))
    x = mb_conv_block(x, filters=256, kernel_size=(3, 3), strides=(1, 1))

    x = GlobalAveragePooling2D()(x)
    x = Reshape((1, 1, 256))(x)
    x = Dropout(0.3)(x)
    x = Dense(num_classes, activation='softmax')(x)
    x = Reshape((num_classes,))(x)

    model = Model(inputs=inputs, outputs=x)
    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Calculate class weights to compensate for class imbalance
labels = np.argmax(y_train, axis=1)  # y_train is assumed to be one-hot encoded
class_weights = compute_class_weight('balanced', classes=np.unique(labels), y=labels)
class_weights_dict = dict(enumerate(class_weights))

# Model configuration (X_train / y_train hold the preprocessed spectrograms and labels)
num_classes = y_train.shape[1]
input_shape = (X_train.shape[1], X_train.shape[2], 1)

model = create_efficientnet_like_model(input_shape, num_classes)
model.summary()

# Add a channel dimension for the 2D convolutions, then fit and evaluate
X_train_reshaped = X_train[..., np.newaxis]
X_test_reshaped = X_test[..., np.newaxis]

history = model.fit(X_train_reshaped, y_train, validation_split=0.2, epochs=50, batch_size=32, verbose=2, class_weight=class_weights_dict)
loss, accuracy = model.evaluate(X_test_reshaped, y_test, verbose=2)
print(f'Test loss: {loss}')
print(f'Test accuracy: {accuracy}')

# save model as best_model.h5
model.save('best_model.h5')
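To close the loop, here is a hedged sketch of the second-stage classification on one of the converted WAV files. The feature extraction must match whatever preprocessing produced X_train, so the log-mel parameters, the label list, and the file names below are assumptions for illustration.

import numpy as np
import librosa
import tensorflow as tf

# Load the trained second-stage model.
model = tf.keras.models.load_model('best_model.h5')

# Hypothetical label order; it must match the training one-hot encoding.
labels = ['keyword_one', 'keyword_two', 'noise']

# Compute features the same way the training data was preprocessed
# (the 16 kHz / 40-mel parameters are assumptions).
audio, sr = librosa.load('keyword.wav', sr=16000, mono=True, duration=1.0)
mel = librosa.power_to_db(librosa.feature.melspectrogram(
    y=audio, sr=sr, n_fft=512, hop_length=256, n_mels=40))
x = mel[np.newaxis, ..., np.newaxis]  # add batch and channel dimensions

probs = model.predict(x)[0]
print(labels[int(np.argmax(probs))], float(np.max(probs)))
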
Arduino Nano BLE Sense, charger module, battery, SD-card reader, SD card, and OLED display on a breadboard.

The Case

I designed a basic case for the microcontroller and its components using a snap-fit mechanism. I am planning to create separate cases for the individual components to tidy up the build and to make the charging port, SD card slot, and Micro-USB port accessible from the outside.

A simple case for the microcontroller, designed in Fusion 360.

Referenced Links

pll.harvard.edu

Fundamentals of TinyML | Harvard University

Focusing on the basics of machine learning and embedded systems, such as smartphones, this course will introduce you to the “language” of TinyML.

store.arduino.cc

Arduino Tiny Machine Learning Kit

Discover the Arduino Tiny Machine Learning Kit – beginner-friendly kit to build and train ML models on microcontrollers. Start today!

www.seeedstudio.com

Seeed Studio XIAO ESP32-C3

Seeed Studio XIAO ESP32-C3 adopts the new RISC-V architecture, supporting both Wi-Fi and BLE wireless connectivity. For Internet of Things applications, you will find it flexible and suitable for all kinds of IoT scenarios.

www.seeedstudio.com

Seeed Studio XIAO nRF52840 Sense (XIAO BLE Sense)

Seeed Studio XIAO nRF52840 Sense by Nordic carries Bluetooth 5.0 wireless capability and is able to operate with low power consumption. Featuring an onboard IMU and PDM microphone, it can be your best tool for embedded machine learning projects.

www.seeedstudio.com

Round Display for Seeed Studio XIAO

Seeed Studio Round Display for XIAO is an expansion board compatible with all XIAO development boards. It features a fully covered touch screen on one side, designed as a 39mm disc. It contains an onboard RTC holder, charge chip, and TF card slot within its compact size, perfect for interactive displays in smart homes, wearables, and more.

www.amazon.de

MakerHawk 2 x I2C OLED display module, 0.91 inch, white, 128x32, DC 3.3 V to 5 V, compatible with Arduino and other microcontrollers

www.amazon.de

https://www.amazon.de/dp/B0BZSB3SBN?ref=ppx_yo2ov_dt_b_fed_asin_title

www.amazon.de

https://www.amazon.de/dp/B09YYG6BT3?ref=ppx_yo2ov_dt_b_fed_asin_title&th=1

www.amazon.de

https://www.amazon.de/dp/B08215N9R8?ref=ppx_yo2ov_dt_b_fed_asin_title

edgeimpulse.com

Edge Impulse - The Leading Edge AI Platform