Getting Started with TinyML - Pete Warden - March 31, 2020

We held our first tinyML Talk! Pete Warden presented Getting Started with TinyML on March 31, 2020 at 8 AM Pacific time.

If you’re interested in running machine learning on embedded devices but aren’t sure how to get started, Pete Warden from Google’s TensorFlow Micro team will run through how to build and run your own TinyML applications. This will include an overview of the different boards that are available, the software frameworks, and tutorials to get you up and running.

Watch on YouTube
Download presentation slides

Feel free to ask your questions on this thread and keep the conversation going!

Can you discuss the differences (if any) between tinyML and GPU-powered edge devices like Coral and Jetson Nano, in terms of use cases, technical effort, and future development? Thanks!

One more question: is TinyML a Google / TensorFlow initiative only, or do other machine learning libraries support it too? Thanks again.

Hey Dan! Didn’t get a chance to answer this during the session, so I’ll answer it here. There are a ton of organizations, researchers and companies involved with making TinyML possible—you can see a small snapshot of them in the companies that sponsored this year’s TinyML Summit:

https://www.tinyml.org/summit/#sponsors

Embedded ML involves a whole pipeline of tools, and inference libraries like TensorFlow Lite are just one part. There are generally multiple options for each component and we’re only just beginning!

Is it possible to use an Edge Impulse model on mobile, e.g. through ML Kit?

It is! Although we’re focused on embedded machine learning, you can output the trained TensorFlow model from Edge Impulse (and soon the converted .tflite file, too) and run that using whatever mobile ML framework you choose. You can also export WebAssembly that will run pretty much anywhere :slight_smile:

We had a lot of questions and couldn’t get to them all! I’ll take a shot at answering the remaining ones, below. Note that these are my answers, not Pete’s—though Pete is welcome to chime in with additions and corrections :slight_smile:

What are the current research challenges in TinyML?

There are many! Here are a few of the big topics:

  • Model compression; how do you get a model that has the same performance but is smaller in size or takes less time to compute? Includes quantization, sparsity, and binarization.
  • Runtime architectures. TensorFlow Lite uses an interpreter, but some approaches use code-generation.
  • Hardware architectures. Can we design custom silicon that is especially suited to running tiny models with low power?
  • Training techniques for tiny models. Reducing model size has only recently become a goal; are there approaches to training that give better results than the approaches used with larger models? Includes distillation, pruning.
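
As a concrete illustration of the quantization item above, here is a minimal sketch of affine 8-bit quantization in plain Python. It is not taken from the talk or from any particular library; the function names `quantization_params`, `quantize`, and `dequantize` are made up for this example, and real toolchains add refinements such as per-channel scales.

```python
def quantization_params(x_min, x_max, num_bits=8):
    """Pick a scale and zero point mapping [x_min, x_max] onto signed integers."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min = min(x_min, 0.0)
    x_max = max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range (all zeros); any scale works
    zero_point = round(qmin - x_min / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to integers: q = round(x / scale) + zero_point, then clamp."""
    qmin = -(2 ** (num_bits - 1))
    qmax = 2 ** (num_bits - 1) - 1
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats: x ~= (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in q_values]


weights = [-1.0, 0.0, 0.5, 1.0]  # made-up example weights
scale, zero_point = quantization_params(min(weights), max(weights))
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of roughly one quantization step at most.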

I have been using TinyML for a while. Are there any guidelines on how much memory should be allocated for the tensor arena in the inference engine?

It’s currently best tackled by trial-and-error; write some unit tests that run your model, start with a big tensor arena size, and just keep making it smaller until your code no longer runs. The size required for a given model remains static between inferences.
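
The trial-and-error loop can be automated. The sketch below uses a binary search rather than shrinking step by step, which is a substitution on my part and assumes that a model which runs with a given arena size also runs with any larger one. `runs_ok` is a hypothetical callback that you would implement yourself, for example by rebuilding and running your unit test with the given arena size.

```python
def find_min_arena_size(runs_ok, upper_bytes, step_bytes=1024):
    """Return the smallest multiple of step_bytes for which runs_ok(size) is True.

    runs_ok:     hypothetical callback, size_in_bytes -> bool
    upper_bytes: a starting size that is expected to be large enough
    """
    # Work in units of step_bytes; round the starting size up to a whole step.
    hi = -(-upper_bytes // step_bytes)
    if not runs_ok(hi * step_bytes):
        raise ValueError("model does not run even with the starting arena size")
    lo = 0  # a zero-byte arena is assumed to fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if runs_ok(mid * step_bytes):
            hi = mid  # still works: try smaller
        else:
            lo = mid  # too small: go bigger
    return hi * step_bytes


# Stand-in for a real build-and-run check; 13,500 bytes is a made-up requirement.
def fake_runs_ok(size_bytes):
    return size_bytes >= 13_500


best = find_min_arena_size(fake_runs_ok, upper_bytes=100 * 1024)
```

In practice you would probably add some headroom on top of the result, since the required size can change when the model or the library version changes.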

If we do research, which device is best for benchmarking?

We’ve found it useful to focus on devices with an Arm Cortex-M core, since these represent the typical range of specs for embedded devices. The M4 is probably the sweet spot right now in terms of balance between capability and power use.

Is there a danger of over-optimizing for tininess? For example, we spent years developing computationally efficient ML techniques, but GPUs came along and now you can do deep learning on desktops. Could the same thing happen in the next 5 years, where we have large compute in tiny form factors, or do we run up against the laws of physics with respect to power requirements?

There is some particularly impressive hardware in development that will make it easier to run larger models on-device, or run the same models using less power—for example, Arm’s Ethos-U55, or Eta Compute’s ECM3531. Devices are likely to become more capable, but the same is true of higher power accelerators, so TinyML will continue to exist as a concept.

What is the inference rate with the camera on the SparkFun Edge?

I believe it’s a few seconds per inference using the latest Arm CMSIS-NN optimizations. I haven’t tried this yet myself, though!

Is there an open source driver for the camera on the SparkFun Edge?

Yes, the camera module is the HM01B0 and you can find the driver here:

Are the K210 / Sipeed boards supported?

I haven’t used the K210, but I think it comes with a compiler that is able to generate C code from TensorFlow Lite models. This is a different approach from the one used by TensorFlow Lite for Microcontrollers, which is a runtime that interprets and executes the models directly. So you can use some of the TensorFlow Lite tooling, but not all of it.
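
To make the interpreter versus code-generation distinction concrete, here is a toy sketch. It does not reflect how the K210 compiler or TensorFlow Lite for Microcontrollers is actually implemented; the three-op "model" format and the names `interpret` and `generate_c` are invented for illustration.

```python
# A toy "model": a list of (op, argument) pairs applied to a single float.
MODEL = [("mul", 2.0), ("add", -1.0), ("relu", None)]


def interpret(model, x):
    """Interpreter style: walk the model at run time and dispatch on each op."""
    for op, arg in model:
        if op == "mul":
            x = x * arg
        elif op == "add":
            x = x + arg
        elif op == "relu":
            x = max(x, 0.0)
        else:
            raise ValueError(f"unsupported op: {op}")
    return x


def generate_c(model, name="run_model"):
    """Code-generation style: emit C source with the model baked in."""
    lines = [f"float {name}(float x) {{"]
    for op, arg in model:
        if op == "mul":
            lines.append(f"    x = x * {arg!r}f;")
        elif op == "add":
            lines.append(f"    x = x + {arg!r}f;")
        elif op == "relu":
            lines.append("    x = x > 0.0f ? x : 0.0f;")
        else:
            raise ValueError(f"unsupported op: {op}")
    lines.append("    return x;")
    lines.append("}")
    return "\n".join(lines)


result = interpret(MODEL, 3.0)
c_source = generate_c(MODEL)
```

The interpreter needs op-dispatch code on the device but can load a new model as data. The generated C has no dispatch overhead, but changing the model means regenerating and recompiling the code.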

Any HW support/suggestions for training on the edge?

There’s not much out there around training on the edge! It’s generally difficult due to the compute constraints and lack of labelled data, but I believe TensorFlow Lite will support training on mobile devices soon. It’s unlikely this will come to TensorFlow Lite for Microcontrollers.

Is there a benchmark program you recommend to measure whether a device can run TinyML efficiently?

You could start by modifying the TensorFlow Lite for Microcontrollers test suites, which run several different models:

https://www.tensorflow.org/lite/microcontrollers/library#run_the_tests

Hi. We’re looking to build an edge ML nature classifier, using a camera and audio, to run on an IoT network in a park. We’re trying to find the right balance on power: we currently use Coral, which needs a power line, and we’d like to run on battery power that lasts at least a few days. What would you recommend?

This depends a lot on what you need to do with the camera. If you’re looking for real-time inference, you’ll need something bigger than TinyML (at least for the time being). If you can get away with an inference every few seconds, something with the power of an Arm Cortex-M4 should work.
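
To judge whether "an inference every few seconds" fits a battery budget of a few days, a rough duty-cycle estimate helps. The sketch below is back-of-the-envelope arithmetic only. Every number in it is a hypothetical placeholder, not a measurement of any real board, and it ignores the camera, radio, and battery self-discharge.

```python
def average_current_ma(active_ma, sleep_ma, active_s, period_s):
    """Time-weighted average current for a device that wakes once per period."""
    if not 0 <= active_s <= period_s:
        raise ValueError("active_s must be between 0 and period_s")
    sleep_s = period_s - active_s
    return (active_ma * active_s + sleep_ma * sleep_s) / period_s


def battery_life_hours(capacity_mah, avg_ma):
    """Ideal battery life: capacity divided by average current draw."""
    return capacity_mah / avg_ma


# Hypothetical values: 10 mA while inferring for 1 s, 0.01 mA asleep,
# one inference every 10 s, 2000 mAh battery.
avg = average_current_ma(active_ma=10.0, sleep_ma=0.01, active_s=1.0, period_s=10.0)
hours = battery_life_hours(capacity_mah=2000.0, avg_ma=avg)
days = hours / 24.0
```

Plugging in measured currents for your own board, sensors, and radio shows how often you can afford to run inference on a given battery.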

Many thanks @dansitu for moderating and tackling the questions! Everyone can feel free to open a new thread here in the forum about anything related to ultra-low power machine learning.

YouTube video and slides are now linked in the original post of this thread.

Hi Pete. Thank you very much for this presentation.
Just a question: you talked about the SparkFun board. I started to read about it, but the reviews were not that promising:


Can I have your opinion on that?
Thanks a lot.
Fabrice

Thank you for the presentation. Around 50:00 I heard something about network finalization, but I may have misheard it. What is network finalization (or whatever the phrase actually was)?