Two tinyML Talks on July 21, 2020: 1) "AI/ML SoC for Ultra-Low-Power Mobile and IoT devices" by Tomer Malach (DSP Group); 2) "Pushing the Limits of Ultra-low Power Computer Vision for tinyML Applications" by Aravind Natarajan (Qualcomm Technologies, Inc.)

We held our eleventh tinyML Talks webcast with two presentations on July 21, 2020: Tomer Malach from DSP Group presented "AI/ML SoC for Ultra-Low-Power Mobile and IoT devices" at 8:00 AM Pacific Time, and Aravind Natarajan from Qualcomm Technologies presented "Pushing the Limits of Ultra-low Power Computer Vision for tinyML Applications" at 8:30 AM Pacific Time.


Tomer Malach (left) and Aravind Natarajan (right)

Deep neural networks (DNNs) are at the core of artificial intelligence (AI) research, with proven tinyML inference applications in areas such as voice, audio, proximity, gesture, and imaging, to mention just a few. With so many applications, DNN AI models are evolving rapidly and becoming more complex. This complexity is expressed mainly in the growing number of computations per second and the larger memory capacity required, both of which conflict with tinyML's low-power requirement. The challenge is exacerbated by the need to run DNN AI models in real time on tinyML hardware, which demands higher memory-core bandwidth and more multiply-accumulate (MAC) capability, again pressuring designers to increase power consumption, silicon area, and memory size despite the tight constraints on these parameters. To realize the true potential of tinyML more fully, all of these design issues must be addressed simultaneously. This tinyML Talk will present how AI models, run on a best-fit hardware and software (HW/SW) architecture and combined with compression and optimization methods, can relieve many AI-core bottlenecks and expand the device's capabilities. The AI compression approach helps scale megabyte (MB) models down to kilobyte (KB) models, increases memory utilization efficiency for reduced power consumption (and potentially smaller memory size), and allows very complex models to run on a very small AI system-on-chip (SoC), resulting in an ultra-low-power AI device consuming only microwatts (µW) of power.
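As a rough illustration of how compression shrinks model footprints (a generic sketch, not DSP Group's specific method), the snippet below quantizes a float32 weight tensor to int8 with a single per-tensor scale, cutting its storage by roughly 4x; pruning or weight clustering can push megabyte-scale models further toward the kilobyte range.

```python
# Illustrative sketch only: symmetric per-tensor int8 quantization of float32 weights.
# This is a generic example of model compression, not DSP Group's pipeline.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights to int8 plus a single float scale factor."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(256, 256)).astype(np.float32)  # a toy layer

q, scale = quantize_int8(w)
print(f"float32 size: {w.nbytes / 1024:.0f} KB")  # 256 KB
print(f"int8 size:    {q.nbytes / 1024:.0f} KB")  # 64 KB (~4x smaller)
print(f"max abs reconstruction error: {np.abs(w - dequantize(q, scale)).max():.5f}")
```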

Tomer Malach is an AI hardware architect at DSP Group and an M.Sc. student at Ben-Gurion University of the Negev, Israel. Tomer received his B.Sc. degree (with honors) in electrical and computer engineering from Ben-Gurion University.
His current research focuses on hardware-based compression of machine learning models for edge devices.

Achieving always-on computer vision in a battery-constrained device for tinyML applications is a challenging feat. To meet the requirements of computer vision at <1 mW, innovation and end-to-end optimization are necessary across the sensor, custom ASIC components, architecture, algorithms, software, and custom trainable models. Qualcomm Technologies developed an always-on computer vision module that comprises a low-power monochrome QVGA CMOS image sensor and an ultra-low-power custom SoC with dedicated hardware for computer vision algorithms. By challenging long-held assumptions in traditional computer vision, we are enabling new applications in mobile phones, wearables, and IoT. We also introduce the always-on computer vision system training tools, which facilitate rapid training, tuning, and deployment of custom object-detection models. This talk presents the Qualcomm QCC112 chip, the use cases enabled by this device, and an overview of the training tools.

Aravind Natarajan is a Staff Engineer at Qualcomm AI Research, working on ultra-low power computer vision applications. Prior to joining Qualcomm, Aravind received his PhD in Computer Engineering from the University of Texas at Dallas, working on concurrent data structures. He is the author of multiple research papers and has been granted 6 patents.

==========================

Watch on YouTube:
Tomer Malach
Aravind Natarajan

Download presentation slides:
Tomer Malach
Aravind Natarajan

Feel free to ask your questions on this thread and keep the conversation going!

Here are the answers to some of the questions we did not get to during the presentation.

Q: Where do I buy a kit to get started?
A: The kit is currently available for purchase on the Intrinsyc website. Please note that Intrinsyc has been acquired by Lantronix, so the link may change in the future.

Q: Are any TOPS/watt kind of numbers available for this hardware?
A: Because the design is hardware-optimized, not all applications consume the same amount of power. If you are interested, please reach out to us at cvm@qti.qualcomm.com for more information.

Q: How do you build custom detection models for the QCC112?
A: We provide a suite of easy-to-use training tools, the AOCVS Portal, that enables users to train custom models that can run on our hardware. Users can also use data collected from other sensors for training.

Q: What would the power budget look like for an RGB camera, rather than a monochrome camera?
A: One could put a Bayer filter over our image sensor, but we would lose about 80% of the light. Additionally, the ISP processing required to convert the Bayer image into a three-channel color image would consume significantly more power than our low-power sensor.

Q: Is there any specific operation that the hardware does not support?
A: All operations generated by our software tools (the AOCVS Portal) are supported. Any operations outside this set may not be supported.

Q: What are the memory limitations for the sensor? I’m assuming we wouldn’t be able to compress and run YOLO on this sensor?
A: Our software tools (the AOCVS Portal) are optimized for our hardware pipeline, and we cannot support external models directly. We provide several stock models and can detect several objects concurrently.

Q: How large is the current camera chip?
A: Our image sensor is 4.1 mm × 3.9 mm, while our processor is 4.0 mm × 5.2 mm.

Q: What models can be supported? Any constraints such as type of operations and model size?
A: All models trained using the AOCVS Portal can run on our hardware. Model sizes are limited to ~128 KB.
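For a rough sense of what a ~128 KB budget means (a back-of-envelope sketch only; the bit widths are our assumption, not a statement about the QCC112 weight format), the snippet below counts how many parameters fit at a few common quantization levels.

```python
# Back-of-envelope only: parameter counts that fit a ~128 KB model budget,
# assuming the weights dominate the model size. The bit widths shown are
# illustrative assumptions, not the QCC112's actual weight format.
BUDGET_BYTES = 128 * 1024

for bits in (8, 4, 2):
    params = BUDGET_BYTES * 8 // bits
    print(f"{bits}-bit weights: ~{params:,} parameters")
# 8-bit: ~131,072; 4-bit: ~262,144; 2-bit: ~524,288
```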

Q: Are there any papers published by Qualcomm on this available?
A: You may refer to our 2019 tinyML Summit poster and our 2020 tinyML Summit poster.

Q: Can we extract the raw images in order to provide retraining feedback and hence improve the accuracy of a fixed, in-situ sensor?
A: In certain modes the images can be extracted.