Performance evaluation of image classification models on resource-constrained STM32 microcontrollers
| Main Authors: | , , , , |
|---|---|
| Format: | Article |
| Language: | en |
| Published: | Research and Scientific Innovation Society, 2025 |
| Online Access: | http://eprints.utem.edu.my/id/eprint/29506/2/003291501202612022919.pdf http://eprints.utem.edu.my/id/eprint/29506/ https://rsisinternational.org/journals/ijriss/view/performance-evaluation-of-image-classification-models-on-resource-constrained-stm32-microcontrollers https://dx.doi.org/10.47772/IJRISS.2025.910000238 |
| Summary: | Deploying deep learning on microcontrollers offers real-time intelligence at the edge, but tight memory and compute budgets complicate design choices. This study evaluates image classification on the STM32H747IDISCO using a compact convolutional neural network trained on five board classes (Arduino Uno, NodeMCU, ESP8266-01, micro:bit V2.0, ESP32-CAM). A small, augmented dataset (50–100 images per class) was used with standard transformations; models were quantised to int8 and deployed via STM32CubeIDE and the STM32-AI CLI. The analysis examines how input resolution (1080p vs. 480p) interacts with accuracy, memory footprint, latency, and power. Four classes achieve ≥95% accuracy at both resolutions, while ESP8266-01 improves from 65.7% (1080p) to 92.3% (480p), suggesting that downsampling can suppress distracting fine-grained artefacts. Activation-buffer tuning and post-training quantisation reduce RAM from ~761 kB to ~610 kB and Flash from ~1.42 MB to ~1.20 MB without accuracy loss; 480p further lowers latency by up to 35% and power by ~20%. The findings provide a resolution-aware benchmark and practical guidance for balancing fidelity and efficiency on STM32-class MCUs, and they motivate future work with larger benchmarks, cross-platform comparisons, and pruning/distillation pipelines. |
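The summary quotes the memory savings as absolute figures only. As a rough illustration, the short sketch below converts them to relative reductions. The helper name `relative_reduction` is hypothetical, and the inputs are the approximate values from the summary, so the results are indicative rather than exact.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Approximate figures quoted in the summary (assumed, not re-measured).
ram_saving = relative_reduction(761.0, 610.0)   # RAM in kB
flash_saving = relative_reduction(1.42, 1.20)   # Flash in MB

print(f"RAM saving:   {ram_saving:.1%}")
print(f"Flash saving: {flash_saving:.1%}")
```

On these rounded inputs the RAM footprint shrinks by roughly a fifth and the Flash footprint by roughly 15%.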
