Storing and labelling sensor data
My goal was to train a neural network for Human Activity Recognition. I wanted an easy way to get the sensor data onto a server without transferring it manually, and an equally easy way to annotate it. I decided to use my phone as a gateway: the microcontroller sends its data to the phone, which then forwards it to the server.
This article is part of a planned series in which each article describes one subset of the design decisions. The series is planned to cover:
The microcontroller design decisions
The mobile app design decisions
The server code and its design decisions
The neural network that is trained on the data
This post is a summary of the major decisions taken in each of the sections.
The Microcontroller Part
On the microcontroller side, I decided to go with Zephyr RTOS because it provides a hardware abstraction layer that lets me write code that is mostly independent of the microcontroller I am using.
The Requirements
The microcontroller essentially needs to take care of three things:
It needs to establish a connection to a phone
It needs to sample data
It needs to forward the sampled data to the phone
Establishing a connection
The microcontroller only searches for Bluetooth connections when movement exceeds a predefined threshold. If the threshold is not met, the MCU remains idle. If it is exceeded, the MCU looks for connections and, if possible, connects to the phone.
Data Sampling
Sampling was the easiest of the three tasks. The microcontroller board I used has an integrated 6-axis LSM6DS3TR-C IMU, which captures both gyroscope and accelerometer data at various sampling rates. I used 12.5 Hz for idle sampling with little movement and 52 Hz for active sampling with lots of movement.
Data Transmission
Data transmission was significantly harder to implement, because the default Bluetooth configuration for the XIAO has limited range and does not allow larger packet sizes. To avoid frequent disconnects, I used Coded PHY, which should increase the maximum possible transmission distance, and set the transmit power to +8 dBm, the highest value supported by the board I used.
In order to avoid frequent transmissions, which would ultimately decrease battery life, I decided to batch the accelerometer and gyroscope samples and make each Bluetooth packet as large as possible. Each sample is stored as six 16-bit integers (12 bytes): three for acceleration (0.01 m/s² resolution) and three for angular velocity (0.05 deg/s resolution). With a maximum packet size of 247 bytes and a 10-byte header containing the sample count, format version, and a shared timestamp for the batch, 237 bytes remain for samples, which fits at most 19 samples per Bluetooth packet. At a sampling rate of 52 Hz, this results in a new packet approximately every 365 milliseconds, which is still fast enough for my use case of real-time motion recognition and feedback to the user.
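The packet layout described above can be sketched in Python as follows. This is only an illustration, not the firmware code: the article states what the 10-byte header contains but not how the fields are sized, so the split into a 1-byte sample count, a 1-byte format version, and an 8-byte millisecond timestamp is an assumption, as are the little-endian byte order and all function names.

```python
import struct

# ASSUMED header layout (10 bytes, little-endian). The article only says the
# header holds a sample count, a format version, and a shared timestamp.
HEADER = struct.Struct("<BBQ")  # count: u8, version: u8, timestamp_ms: u64
SAMPLE = struct.Struct("<6h")   # ax, ay, az, gx, gy, gz as int16 (12 bytes)

MAX_PACKET_BYTES = 247
MAX_SAMPLES = (MAX_PACKET_BYTES - HEADER.size) // SAMPLE.size

ACCEL_RESOLUTION = 0.01  # m/s^2 per count
GYRO_RESOLUTION = 0.05   # deg/s per count


def _to_int16(value, resolution):
    """Quantize a physical value and saturate it to the int16 range."""
    counts = round(value / resolution)
    return max(-32768, min(32767, counts))


def pack_batch(timestamp_ms, samples, version=1):
    """Pack (ax, ay, az, gx, gy, gz) tuples into one packet."""
    if len(samples) > MAX_SAMPLES:
        raise ValueError(f"at most {MAX_SAMPLES} samples fit in one packet")
    packet = bytearray(HEADER.pack(len(samples), version, timestamp_ms))
    for ax, ay, az, gx, gy, gz in samples:
        packet += SAMPLE.pack(
            *(_to_int16(v, ACCEL_RESOLUTION) for v in (ax, ay, az)),
            *(_to_int16(v, GYRO_RESOLUTION) for v in (gx, gy, gz)),
        )
    return bytes(packet)


def unpack_batch(packet):
    """Inverse of pack_batch; returns (version, timestamp_ms, samples)."""
    count, version, timestamp_ms = HEADER.unpack_from(packet, 0)
    samples = []
    for i in range(count):
        raw = SAMPLE.unpack_from(packet, HEADER.size + i * SAMPLE.size)
        accel = tuple(c * ACCEL_RESOLUTION for c in raw[:3])
        gyro = tuple(c * GYRO_RESOLUTION for c in raw[3:])
        samples.append(accel + gyro)
    return version, timestamp_ms, samples
```

With these sizes, a full packet of 19 samples occupies 10 + 19 × 12 = 238 bytes, and at 52 Hz one packet is filled every 19 / 52 ≈ 0.365 seconds.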
Other considerations
The MCU runs on a 150 mAh battery. Using the PPK2, I measured an idle current of about 2 mA and an active current of about 3.2 mA. The active figure is significant and largely due to the Bluetooth radio transmitting at +8 dBm. The idle consumption is also quite high. The dev board I used supports hardware interrupts, so technically the MCU should be able to wake up only when movement occurs; however, I have not gotten this to work yet. Instead, I rely on polling at the idle sample rate of 12.5 Hz. This still gives me a battery life of about 75 hours in idle mode and approximately 47 hours in active mode.
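The battery-life figures follow from dividing the capacity by the average current. The sketch below reproduces that arithmetic. It is an idealized estimate that assumes the full rated capacity is usable. The `mixed_life_hours` helper and its `active_fraction` parameter are hypothetical additions for estimating a day with mixed idle and active use.

```python
def battery_life_hours(capacity_mah, current_ma):
    """Idealized runtime: rated capacity divided by a constant current."""
    return capacity_mah / current_ma


def mixed_life_hours(capacity_mah, idle_ma, active_ma, active_fraction):
    """Runtime when a given fraction of the time is spent in active mode."""
    average_ma = active_fraction * active_ma + (1.0 - active_fraction) * idle_ma
    return capacity_mah / average_ma


idle_hours = battery_life_hours(150, 2.0)    # 75 hours
active_hours = battery_life_hours(150, 3.2)  # just under 47 hours
```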
The Mobile App Part
For the mobile app I chose React Native, largely because I am very familiar with React. In retrospect, I think going with Kotlin only would have been the better approach: I am not planning to create an iOS version, and the React Native bridge adds overhead. Furthermore, to run the app in the background I had to use foreground services, which needed to be written in native Kotlin code anyway.
The mobile app had to take care of the following things:
Scan, connect and reconnect to the MCU
Receive and forward the data from the MCU to the server
Record Audio and text input and forward both to the server
Scanning and Connecting to the MCU
Scanning and connecting to the MCU was the easiest of these tasks. The app only scans for devices that match the nrf52 id. Once a connection is established, the app transfers configuration parameters to the MCU. If a connection is lost, which mostly happened because the MCU was too far away to maintain an active connection, the app tries to reestablish the connection every thirty seconds. This is done in a foreground service, so that the app also works while the phone is locked.
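The retry behavior can be illustrated with a small timing loop. This is a language-neutral sketch in Python, not the app's Kotlin foreground service: `try_connect` and `sleep` are hypothetical callables passed in by the caller, and the optional `max_attempts` limit is an assumption added so the loop can terminate in a test.

```python
def reconnect_loop(try_connect, sleep, retry_interval_s=30.0, max_attempts=None):
    """Call try_connect() until it succeeds, waiting between failed attempts.

    Returns the number of attempts needed, or None if max_attempts was
    reached without a successful connection.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        if try_connect():
            return attempts
        sleep(retry_interval_s)  # wait before the next attempt
    return None
```

In real code, `sleep` would be `time.sleep` or a scheduler callback, and `try_connect` would wrap the BLE scan and connect calls.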
Receive and forward the data from the MCU to the server
The app collects the batches it receives from the MCU into larger batches before uploading them, again to avoid frequent data transmission. This also runs in a foreground service, so that forwarding continues while the phone is locked.
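A minimal sketch of this second batching stage is shown below. The article does not say when the app flushes its buffer, so both flush conditions (`max_packets` and `max_age_s`) and their values are assumptions, and `send` is a hypothetical callback that stands in for the upload to the server.

```python
class UploadBuffer:
    """Collects BLE packets and hands them to `send` in larger batches."""

    def __init__(self, send, max_packets=50, max_age_s=10.0):
        self._send = send
        self.max_packets = max_packets  # assumed flush size
        self.max_age_s = max_age_s      # assumed maximum buffering time
        self._packets = []
        self._oldest_s = None

    def add(self, packet, now_s):
        """Buffer one packet; flush if the buffer is full or too old."""
        if not self._packets:
            self._oldest_s = now_s
        self._packets.append(packet)
        too_many = len(self._packets) >= self.max_packets
        too_old = now_s - self._oldest_s >= self.max_age_s
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Send all buffered packets as one batch, if there are any."""
        if self._packets:
            batch, self._packets = self._packets, []
            self._send(batch)
```

Passing the current time into `add` instead of reading a clock inside the class keeps the logic easy to test.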
Record Audio and Text Input and forward both to the server
In order to easily label the data that is being forwarded to the server, I wanted to use voice commands. Since I was already using a phone for data transmission, using its built-in microphone was the most natural decision. For this I used VadWebRTC, so that audio is only transmitted when speech is detected.
The Sensor Data Hub app provides a range of customizable settings for both user preferences and low-level BLE device behavior. Below is a breakdown of the key options available to optimize power consumption, responsiveness, and data collection fidelity.
App Preferences
One of the key design decisions for the mobile app was to provide extensive configurability without overwhelming the user. The Sensor Data Hub app includes three main categories of settings that balance ease of use with the flexibility needed for different use cases and hardware optimization.
User Experience Settings
The app preferences focus on accessibility and user comfort. Language selection supports both German and English for text-to-speech functionality, which is crucial for hands-free operation during data collection sessions. The light/dark theme toggle ensures the app remains usable in various lighting conditions, particularly important when collecting data in different environments.
Audio and Voice Interface
Since voice commands are central to the data labeling workflow, the audio settings deserve special attention. Voice enrollment allows the system to learn individual speech patterns for more accurate speaker identification, which is essential when multiple researchers might be using the same device. The auto-read messages feature provides audio feedback from the assistant, enabling truly hands-free operation. Sound notifications can be toggled based on whether discreet operation is needed during data collection.
BLE Device Configuration
The most sophisticated part of the settings system controls the microcontroller's behavior remotely. These settings are stored locally on the phone and synchronized with the MCU when a connection is established, allowing users to fine-tune performance without needing to reprogram the device.
Motion Detection Parameters
The motion threshold (0.16 to 6.0, default 0.5) determines the sensitivity for switching between idle and active sampling modes. Lower values make the system more responsive to subtle movements, while higher values preserve battery life by requiring more pronounced motion to activate high-frequency sampling.
The shake threshold (5 g to 20 g, default 20 g) triggers BLE advertising when the device is disconnected, allowing users to reconnect by simply shaking the sensor. This gesture-based reconnection eliminates the need to interact with the phone interface during active data collection.
Timing Controls
Motion timeout settings (1s to 30s, default 2s) control how long the system waits after motion stops before returning to idle mode. Shorter timeouts maximize battery life, while longer timeouts prevent rapid switching during intermittent activities.
The shake detection interval (50ms to 500ms, default 500ms) balances responsiveness with power consumption when the device is disconnected and polling for reconnection gestures.
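The interaction between the motion threshold and the motion timeout can be sketched as a small state machine. This is an illustration under assumptions, not the firmware logic: the article does not say how the motion magnitude is computed or in which unit the threshold is expressed, and the class and attribute names are hypothetical. The defaults mirror the values given above.

```python
class ModeController:
    """Switches between idle and active sampling based on recent motion."""

    def __init__(self, motion_threshold=0.5, motion_timeout_s=2.0,
                 idle_rate_hz=12.5, active_rate_hz=52.0):
        self.motion_threshold = motion_threshold
        self.motion_timeout_s = motion_timeout_s
        self.idle_rate_hz = idle_rate_hz
        self.active_rate_hz = active_rate_hz
        self.mode = "idle"
        self._last_motion_s = None

    def update(self, motion_magnitude, now_s):
        """Feed one motion reading and return the resulting mode."""
        if motion_magnitude > self.motion_threshold:
            # Motion above the threshold (re)starts the timeout window.
            self._last_motion_s = now_s
            self.mode = "active"
        elif (self.mode == "active"
              and now_s - self._last_motion_s >= self.motion_timeout_s):
            # No motion for the full timeout: fall back to idle sampling.
            self.mode = "idle"
        return self.mode

    @property
    def sample_rate_hz(self):
        return self.active_rate_hz if self.mode == "active" else self.idle_rate_hz
```

A longer `motion_timeout_s` keeps the controller in active mode across short pauses, which is the "prevent rapid switching" effect described above.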
Sampling Configuration
Both active and idle sample rates are user-configurable, with active rates ranging from 12Hz to 208Hz (default 52Hz) and idle rates from 12Hz to 52Hz (default 12Hz). This flexibility allows optimization for different types of activities: high-frequency sampling for rapid movements, lower frequencies for subtle activities or extended battery life.
The "Send Data in Idle Mode" toggle is particularly useful for applications where even minimal movement patterns are relevant, though it does impact battery consumption.
Communication Optimization
Bluetooth timeout (20ms to 200ms, default 200ms) and advertising timeout (10s to 60s, default 30s) settings fine-tune the communication reliability. The Bluetooth timeout ensures packet transmission completes before moving to the next operation, while advertising timeout balances discoverability with power consumption.
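The ranges and defaults listed in this section can be collected into one table and enforced before the settings are synchronized with the MCU. The sketch below is one possible way to do that, not the app's actual code: the key names are hypothetical, and clamping out-of-range values (rather than rejecting them) is an assumption. It also treats the sample rates as continuous values, whereas the sensor only supports discrete rates.

```python
# name: (minimum, maximum, default), taken from the ranges in this section.
# The key names are hypothetical.
SETTING_RANGES = {
    "motion_threshold": (0.16, 6.0, 0.5),
    "shake_threshold_g": (5, 20, 20),
    "motion_timeout_s": (1, 30, 2),
    "shake_interval_ms": (50, 500, 500),
    "active_rate_hz": (12, 208, 52),
    "idle_rate_hz": (12, 52, 12),
    "bluetooth_timeout_ms": (20, 200, 200),
    "advertising_timeout_s": (10, 60, 30),
}


def sanitize_settings(user_settings):
    """Fill in defaults and clamp every value to its allowed range."""
    unknown = set(user_settings) - set(SETTING_RANGES)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    result = {}
    for name, (low, high, default) in SETTING_RANGES.items():
        value = user_settings.get(name, default)
        result[name] = min(high, max(low, value))
    return result
```

Calling `sanitize_settings({})` returns the defaults, so the same function can produce the initial configuration that is sent when a connection is established.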