Performance evaluation of image classification models on resource-constrained STM32 microcontrollers

Bibliographic Details
Main Authors: Md Salim, Sani Irwan, Mohd Shaifullizan, Muhammad Aiman Akmal, Mohd Zin, Mohd Shahril Izuan, Samsudin, Sharatul Izah, Awang Md Isa, Azmi
Format: Article
Language: English
Published: Research and Scientific Innovation Society 2025
Online Access: http://eprints.utem.edu.my/id/eprint/29506/2/003291501202612022919.pdf
http://eprints.utem.edu.my/id/eprint/29506/
https://rsisinternational.org/journals/ijriss/view/performance-evaluation-of-image-classification-models-on-resource-constrained-stm32-microcontrollers
https://dx.doi.org/10.47772/IJRISS.2025.910000238
Description
Summary: Deploying deep learning on microcontrollers offers real-time intelligence at the edge, but tight memory and compute budgets complicate design choices. This study evaluates image classification on the STM32H747I-DISCO using a compact convolutional neural network trained on five board classes (Arduino Uno, NodeMCU, ESP8266-01, micro:bit V2.0, ESP32-CAM). A small, augmented dataset (50–100 images per class) was prepared with standard transformations; models were quantised to int8 and deployed via STM32CubeIDE and the STM32-AI CLI. The analysis examines how input resolution (1080p vs 480p) affects accuracy, memory footprint, latency, and power. Four classes achieve ≥95% accuracy at both resolutions, while ESP8266-01 improves from 65.7% (1080p) to 92.3% (480p), suggesting that downsampling can suppress distracting fine-grained artefacts. Activation-buffer tuning and post-training quantisation reduce RAM from ~761 kB to ~610 kB and Flash from ~1.42 MB to ~1.20 MB without accuracy loss; using 480p input further lowers latency by up to 35% and power by ~20%. The findings provide a resolution-aware benchmark and practical guidance for balancing fidelity and efficiency on STM32-class MCUs, and they motivate future work with larger benchmarks, cross-platform comparisons, and pruning/distillation pipelines.
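
The summary reports memory and accuracy figures as before/after pairs rather than as relative changes. As a rough aid to reading them, the following Python sketch converts those pairs into percentages. The helper name `relative_reduction` and the variable names are illustrative assumptions, not taken from the article; the input values are the approximate figures quoted in the summary, so the results are approximate as well.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Approximate figures quoted in the summary (hypothetical variable names).
ram_saving = relative_reduction(761.0, 610.0)   # RAM in kB
flash_saving = relative_reduction(1.42, 1.20)   # Flash in MB

# ESP8266-01 accuracy change between 1080p and 480p, in percentage points.
esp8266_gain = 92.3 - 65.7

print(f"RAM reduction:   {ram_saving:.1f}%")
print(f"Flash reduction: {flash_saving:.1f}%")
print(f"ESP8266-01 gain: {esp8266_gain:.1f} percentage points")
```

On these inputs the RAM and Flash savings come out at roughly a fifth and a sixth of the original footprint respectively, which puts the memory savings in the same range as the reported ~20% power reduction at 480p.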