tinyML Talks on August 29, 2023: “A hardware-aware neural architecture search algorithm targeting ultra-low-power microcontrollers” by Andrea Mattia Garavagno from the University of Genoa

We held our next tinyML Talks webcast. Andrea Mattia Garavagno from Sant’Anna School of Advanced Studies of Pisa and University of Genoa presented A hardware-aware neural architecture search algorithm targeting ultra-low-power microcontrollers on August 29, 2023.

Hardware-aware neural architecture search (HW NAS), the process of automating the design of neural architectures while taking hardware constraints into account, has already outperformed the best human designs on many tasks. However, it is known to be highly demanding in terms of hardware, which limits access for non-specialist neural network users. To foster its adoption in the design of next-generation IoT and wearable devices, we propose an HW NAS that can be run on a laptop, even one without a GPU. The proposed technique, designed for both low search cost and low resource usage, produces tiny convolutional neural networks (CNNs) targeting low-end microcontrollers. It achieves state-of-the-art results on the human-recognition task of the Visual Wake Words dataset, a standard TinyML benchmark, in just 3 hours and 37 minutes on a laptop with an 11th Gen Intel(R) Core™ i7-11370H CPU @ 3.30 GHz, 16 GB of RAM, and a 512 GB SSD, without using a GPU.

Andrea Mattia Garavagno was born in Rome (Italy) in 1996. He received his BSc in Electronic Engineering from the University of Genoa and his MSc in Embedded Computing Systems from the Scuola Superiore Sant’Anna and the University of Pisa, Italy. He is currently a PhD student at the Scuola Superiore Sant’Anna and the University of Genoa. Together with Giuliano Donzellini and Luca Oneto, he co-authored the Italian book "Introduzione al Progetto di Sistemi a Microprocessore" and the international book “Introduction to Microprocessor-Based Systems Design”, published by Springer in 2021 and 2022, respectively. He is currently working on hardware-aware neural architecture search targeting microcontrollers.

=========================

Watch on YouTube:
Andrea Mattia Garavagno

Download presentation slides:
Andrea Mattia Garavagno

Feel free to ask your questions on this thread and keep the conversation going!

=========================

Q: The MIT teams also generate a TinyEngine. Did you run your code on their tinyEngine instead of TfLite?
A: No, I didn’t try; I used the TfLite engine for my experiments. You are welcome to try it, though. The code is open source and can be found at https://github.com/AndreaMattiaGaravagno/NanoNAS. It should not be too difficult to adapt it to use their engine, and the performance gains should be nice. As a starting point, you may try executing the models presented during the talk, available at https://github.com/AndreaMattiaGaravagno/NanoNAS/tree/main/Models, with their engine. You can also look at other engines, such as the one provided in the X-CUBE-AI package from ST. Feel free to reach me at AndreaMattia.Garavagno@santannapisa.it if you need more information.

Q: You are downsampling in every cell. Are you not afraid that this is a too aggressive down-sampling approach as you are already starting with a very small size of 50 x 50?
A: Yes, it is aggressive, but this was done to keep the number of MACs low so that the networks fit the computational resources available on low-end MCUs. If you want to test other solutions, I invite you to do so by modifying my code, which can be found at https://github.com/AndreaMattiaGaravagno/NanoNAS. Feel free to reach me at AndreaMattia.Garavagno@santannapisa.it if you need more information.
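To see why halving the resolution in every cell matters for the MAC budget, here is a rough back-of-the-envelope sketch in Python. The channel widths (8, 16, 32) are illustrative assumptions, not the ones NanoNAS actually searches; the point is only how quickly the MAC count shrinks when each cell downsamples a 50 x 50 input:

```python
def conv_macs(h, w, c_in, c_out, k=3):
    """Multiply-accumulate count of a k x k convolution on an h x w map."""
    return h * w * c_in * c_out * k * k

def network_macs(input_side=50, channels=(8, 16, 32), downsample=True):
    """Total MACs of a toy stack of convolutional cells.

    With downsample=True each cell halves the spatial resolution
    (approximated with integer division), mimicking the aggressive
    strategy discussed above; channel widths are illustrative only.
    """
    side, c_in, total = input_side, 3, 0
    for c_out in channels:
        if downsample:
            side //= 2
        total += conv_macs(side, side, c_in, c_out)
        c_in = c_out
    return total

print(network_macs(downsample=True))   # far fewer MACs...
print(network_macs(downsample=False))  # ...than without down-sampling
```

With these toy numbers, downsampling in every cell cuts the total MAC count by more than an order of magnitude, which is what makes the resulting networks fit low-end MCU compute budgets.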

Q: Please resend all the links that were shared earlier in the chat. Thank you.
A: Sure. Here is the link for NanoNAS, the HW NAS presented during this talk: https://github.com/AndreaMattiaGaravagno/NanoNAS. At https://github.com/AndreaMattiaGaravagno/ColabNAS you can find ColabNAS, another HW NAS targeting low-end MCUs, which is meant to be executed on online GPU services such as Google Colaboratory and Kaggle Kernels, avoiding the need for a physical GPU. It provides similar results in similar execution times, and it is more repeatable and more precise than NanoNAS, at the expense of requiring a GPU to obtain results in an acceptable amount of time.

Q: Could you please provide the links related to the sources we need for this typical hardware implementation of machine learning structures?
A: At https://github.com/AndreaMattiaGaravagno/NanoNAS you should find everything you need to obtain CNNs in the TfLite format that fit the resource constraints of microcontrollers. The “Get started with microcontrollers” tutorial in the TensorFlow Lite documentation is a good guide to running inference with TfLite models on microcontrollers. I hope this answers your question. Feel free to reach me at AndreaMattia.Garavagno@santannapisa.it if you need more information.
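As a quick sanity check before deploying, you can estimate whether a model's activations fit a small RAM budget such as the 40 kiB mentioned in the repository description. The helper below is a hypothetical back-of-the-envelope sketch, not part of NanoNAS: it assumes int8 quantization, one resolution halving per cell, and illustrative channel widths, and it ignores the TfLite Micro interpreter's own arena overhead, so treat it as a lower bound:

```python
def peak_activation_bytes(input_side=50, in_ch=3,
                          channels=(8, 16, 32), bytes_per_el=1):
    """Largest input+output activation pair across a toy cell stack.

    bytes_per_el=1 assumes int8 quantization; channel widths are
    illustrative, and interpreter overhead is ignored (lower bound).
    """
    side, c_in, peak = input_side, in_ch, 0
    for c_out in channels:
        out_side = side // 2  # each cell halves the resolution
        need = (side * side * c_in
                + out_side * out_side * c_out) * bytes_per_el
        peak = max(peak, need)
        side, c_in = out_side, c_out
    return peak

print(peak_activation_bytes())              # first cell dominates here
print(peak_activation_bytes() < 40 * 1024)  # within a 40 kiB budget
```

For an accurate figure you would measure the actual tensor arena size reported by TfLite Micro on the target device, but a check like this is useful for ruling out architectures early.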

Thanks for this info!