We held our tenth tinyML Talks webcast on July 7, 2020, with two presentations: Zuzana Jelcicova from Demant presented "Benchmarking and Improving NN Execution on Digital Signal Processor vs. Custom Accelerator for Hearing Instruments", and Daniel Situnayake from Edge Impulse presented "How to train and deploy tiny ML models for three common sensor types". The talks started at 8:00 AM and 8:30 AM Pacific Time, respectively.
Zuzana Jelcicova (left) and Daniel Situnayake (right)
Hearing instruments are supported by multicore processor platforms that include several digital signal processors (DSPs). These resources can be used to implement neural networks (NNs); however, the execution time and energy consumption of doing so are prohibitive. In this presentation, we will talk about benchmarking neural network workloads relevant for hearing aids on Demant’s DSP-based platform. We will also introduce a custom NN processing engine (NNE) that was developed to achieve further power optimizations by exploiting a set of techniques (reduced wordlength, several MACs in parallel, two-step scaling, etc.).
A pretrained, fully connected feedforward NN (Hello Edge: Keyword Spotting on Microcontrollers) was used as a benchmark model to run a keyword spotting application using the Google Speech Commands dataset on both the DSP and the NNE. We will talk about the performance of the two implementations, where the NNE significantly outperforms the DSP solution.
Zuzana Jelcicova graduated from the Technical University of Denmark (DTU) in 2019 with an MSc in Computer Science and Engineering. Since then she has been pursuing a Ph.D. degree in collaboration with DTU and Demant A/S, an international hearing healthcare group that offers solutions and services to help people with hearing loss. The topic of Zuzana’s Ph.D. is neural networks in resource-constrained hearing instruments, with a focus on hardware and digital design.
TinyML is incredibly exciting, but if you’re hoping to train your own model it can be difficult to know where to start. In this talk, Dan walks through his workflow and best practices for training models for three very different types of data: time-series from sensors, audio, and vision. We’ll be using Edge Impulse, a free online studio for training embedded machine learning models.
Daniel Situnayake is the Founding TinyML engineer at Edge Impulse. He’s co-author of the O’Reilly book TinyML, alongside Pete Warden. He previously worked on the TensorFlow team at Google, and he co-founded Tiny Farms Inc., deploying machine learning on industrial-scale insect farms.
Feel free to ask your questions on this thread and keep the conversation going!
Here are the questions and answers from the tinyML talk "Benchmarking and Improving NN Execution on Digital Signal Processor for Hearing Instruments" by Zuzana Jelcicova at Demant.
- On Slide 14: If the network is not trained to the leading edge of state-of-the-art, how representative are the presented numbers? Could it be that a very well-trained network with higher accuracy would already lose significant accuracy at 8-bit?
[Answer] Of course, a bigger or smaller gap between the baseline model and its 8-bit quantized version could occur for a well-trained network. However, we would not expect a significant loss. Rather, we would expect the same trend, since (suitable) post-training quantization results in little degradation in model accuracy.
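For readers who want to see what post-training quantization to 8 bits does to a single layer, here is a minimal sketch in Python. It is not the scheme used in the talk: the symmetric per-tensor scaling, the layer shape, and the random weights are all assumptions chosen for illustration only.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: one shared scale maps floats to int8."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to approximate float values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
# Hypothetical weights and input for one fully connected layer (not from the talk).
weights = rng.normal(0.0, 0.1, size=(144, 250)).astype(np.float32)
inputs = rng.normal(0.0, 1.0, size=250).astype(np.float32)

q_weights, w_scale = quantize_int8(weights)
reference = weights @ inputs                       # float32 layer output
approx = dequantize(q_weights, w_scale) @ inputs   # output with 8-bit weights

rel_error = np.linalg.norm(reference - approx) / np.linalg.norm(reference)
print(f"relative output error after 8-bit weight quantization: {rel_error:.4f}")
```

The rounding error per weight is bounded by half the scale, which is why the layer output moves only slightly. Whether end-to-end accuracy follows the same trend still has to be measured on the actual model and dataset.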
- My question is on acquiring datasets. Where is the best place to get rainfall datasets?
[Answer] We do not work with rainfall datasets, so unfortunately I cannot give you an answer. (You could try searching on Kaggle, though.)
- Does the Q5.19 processing force your implementation into custom logic (FPGA/GA) or have you run proforma on general-purpose processing too (in simulation)? If general uP possible, are there libraries?
[Answer] The vector math supporting the 4x Q5.19 format is implemented directly in our DSP. We did not use external libraries; we implemented the processing of the Q format ourselves.
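As background on the Q5.19 format mentioned above, here is a small Python sketch of fixed-point arithmetic with 19 fractional bits. It is not Demant's implementation. It assumes a 24-bit two's-complement word in which the 5 integer bits include the sign, it reads "4x" as four lanes processed together, and all function names are made up for this example.

```python
FRAC_BITS = 19
WORD_BITS = 24  # assumption: 5 integer bits (sign included) + 19 fractional bits
Q_MIN = -(1 << (WORD_BITS - 1))
Q_MAX = (1 << (WORD_BITS - 1)) - 1
HALF = 1 << (FRAC_BITS - 1)  # rounding offset applied before shifting right

def saturate(value):
    """Clamp to the representable 24-bit range instead of wrapping around."""
    return max(Q_MIN, min(Q_MAX, value))

def to_q(x):
    """Convert a float to its Q5.19 integer code."""
    return saturate(round(x * (1 << FRAC_BITS)))

def to_float(q):
    """Convert a Q5.19 integer code back to a float."""
    return q / (1 << FRAC_BITS)

def q_mul(a, b):
    """Multiply two Q5.19 values; the raw product has 38 fractional bits."""
    return saturate((a * b + HALF) >> FRAC_BITS)

def q_dot4(xs, ws):
    """Four-lane multiply-accumulate: sum at full precision, rescale once."""
    acc = sum(x * w for x, w in zip(xs, ws))
    return saturate((acc + HALF) >> FRAC_BITS)

print(to_float(q_mul(to_q(1.5), to_q(-2.0))))            # product of 1.5 and -2.0
print(to_float(q_dot4([to_q(0.5)] * 4, [to_q(0.25)] * 4)))
print(to_float(to_q(100.0)))                             # saturates just below 16
```

Under these assumptions the representable range is roughly -16 to just under 16 with a resolution of 2^-19. Accumulating at full precision and rescaling once at the end avoids rounding after every multiply.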
- When comparing energy on slide 30, what is the clock speed?
[Answer] The energy numbers are estimated based on Horowitz’s paper (cited below) for a 45nm process, so the clock frequency is not needed. To do the calculations, only the total numbers of MACs and memory accesses are required.
Reference: M. Horowitz, “1.1 Computing’s energy problem (and what we can do about it),” in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), vol. 57, Feb 2014, pp. 10–14.
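To illustrate this kind of frequency-independent estimate, here is a rough Python sketch that counts MACs and memory accesses for a fully connected network and multiplies them by per-operation energies. Everything specific in it is an assumption: the layer sizes are hypothetical, the access-counting model is deliberately simplistic, and the picojoule values are placeholders in the range commonly quoted for 45nm, which should be checked against the paper before use.

```python
import math

# Placeholder per-operation energies in picojoules (assumed, verify against the paper).
ENERGY_PJ = {
    "mac_8bit": 0.23,     # assumed: 8-bit integer multiply plus 8-bit add
    "mac_32bit": 3.2,     # assumed: 32-bit integer multiply plus 32-bit add
    "sram_access": 10.0,  # assumed: one access to a small on-chip SRAM
}

def count_fc_network(layer_sizes):
    """Return (MACs, memory accesses) for a fully connected feedforward network.

    Simplistic model: every weight is read once, every layer input is read
    once, and every layer output is written once. Word packing is ignored.
    """
    macs = 0
    accesses = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        macs += n_in * n_out
        accesses += n_in * n_out + n_in + n_out
    return macs, accesses

def estimate_energy_pj(macs, accesses, mac_kind):
    """Total energy = MAC count * MAC energy + access count * access energy."""
    return macs * ENERGY_PJ[mac_kind] + accesses * ENERGY_PJ["sram_access"]

# Hypothetical layer sizes for a small keyword spotting network.
layers = [250, 144, 144, 144, 12]
total_macs, total_accesses = count_fc_network(layers)

for kind in ("mac_32bit", "mac_8bit"):
    energy_uj = estimate_energy_pj(total_macs, total_accesses, kind) / 1e6
    print(f"{kind}: {total_macs} MACs, {total_accesses} accesses, {energy_uj:.3f} uJ")
```

With these placeholder values the memory accesses dominate the total, so a more careful estimate would also model how many operands are fetched per access at each wordlength.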
- Can you rank the optimizations with respect to potential energy efficiency improvement?
[Answer] Each of the optimizations brings significant improvements and complements the other two well. Regarding the ranking, we could keep them in the given order, i.e.: 1) Reduced wordlength 2) Parallel MACs 3) Two-step scaling.
- Are there any results to compare such hardware for the Feature Extraction (cepstrum domain transformation) stage?
[Answer] We did not include the MFCC feature extraction in the results for this experiment. If you are interested in comparing the efficiency of different implementations in general, there are many works that report the power required to execute the feature extraction stage.
- Can you provide more context regarding your use case of the keyword spotting model in hearing aids?
[Answer] Keyword spotting could be used for many actions, e.g. adjusting settings or changing environments, without using a smartphone. This would be especially useful in situations that require the user’s full attention, or where interaction with a smartphone is difficult or not possible.
- What hardware is used in this practice?
[Answer] We have not deployed our accelerator on any device yet, if that is what you are asking.
- What kind of personalization do we need to learn on a hearing aid?
[Answer] Every user has their own needs and preferences when it comes to hearing instrument settings. Therefore, it is important to tweak the settings individually for every person. In order to do so, the user must go through the fitting and adjustment stages carried out by a hearing specialist. You can, therefore, imagine the advantages that a learning neural network can add to the process of fitting the hearing instrument to the preferences of the individual user. Ideally, you would like to have a hearing instrument that is capable of adjusting seamlessly to any environment without users even noticing, and that can perform e.g. noise reduction, speech separation, etc. (which the auditory system of people with normal hearing does without much effort).