We held our tinyML Talks webcast with one presentation: Manu Rastogi presented "Tutorial on micro-kernel based hardware acceleration" on August 13, 2020 at 8:00 AM Pacific Time.
Energy and compute are both scarce for deep learning deployment at the edge. Rapid innovation in new layer types and network topologies makes deployment even more challenging, putting increased pressure on hardware designs and toolchain development for automated, efficient model deployment. Hardware and toolchains often lag behind in supporting new layers, and as deep learning becomes more ubiquitous, there is stiff competition among hardware vendors to provide the most energy-efficient solutions. The key piece of model deployment at the edge is the micro-kernels, or micro-code, that orchestrate the data movement and computation of these networks on hardware. In this talk, we will walk through matrix multiplication micro-code, examine the trade-offs between different optimization strategies, and extend these principles to neural networks.
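To give a flavor of the trade-offs the talk covers, here is a minimal sketch in C (not taken from the presentation) contrasting a naive matrix-multiplication loop with a loop-tiled version. The tiled loop performs the same arithmetic but reorders it so small blocks of the operands are reused while cache-resident, trading code complexity for better data movement; the dimension `N` and tile size `TILE` below are hypothetical values chosen for illustration.

```c
#include <assert.h>
#include <string.h>

#define N 64      /* matrix dimension -- hypothetical, for illustration */
#define TILE 8    /* tile edge; in practice tuned to the target's cache */

/* Naive triple loop: the inner loop walks B column-wise, so for large N
 * nearly every B access misses the cache. */
static void matmul_naive(const float A[N][N], const float B[N][N],
                         float C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            float acc = 0.0f;
            for (int k = 0; k < N; k++)
                acc += A[i][k] * B[k][j];
            C[i][j] = acc;
        }
}

/* Tiled version: same arithmetic, reordered so each TILE x TILE block
 * of A and B is reused many times while it stays in cache. */
static void matmul_tiled(const float A[N][N], const float B[N][N],
                         float C[N][N]) {
    memset(C, 0, sizeof(float) * N * N);
    for (int ii = 0; ii < N; ii += TILE)
        for (int kk = 0; kk < N; kk += TILE)
            for (int jj = 0; jj < N; jj += TILE)
                for (int i = ii; i < ii + TILE; i++)
                    for (int k = kk; k < kk + TILE; k++) {
                        float a = A[i][k];      /* hoisted scalar reuse */
                        for (int j = jj; j < jj + TILE; j++)
                            C[i][j] += a * B[k][j];
                    }
}
```

Both routines compute the same product; a hardware-specific micro-kernel would take this further by fixing the innermost tile shape to match the accelerator's registers and data paths, which is exactly the kind of trade-off the talk explores.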
Manu Rastogi received his B.Tech in India and his MS and Ph.D. from the University of Florida in 2012. Since graduation, he has worked at Qualcomm Research and HP Labs. As a member of the Qualcomm research team, he worked on the Qualcomm Zeroth processor in various capacities and later on the Qualcomm deep learning engine. His roles at Qualcomm spanned signal processing algorithm development, model development, and deep learning model optimization. At HP he led efforts around machine learning at the edge and self-supervised learning methods using mutual information for speaker identification.
Watch on YouTube:
Download presentation slides:
Feel free to ask your questions on this thread and keep the conversation going!