We held our third tinyML Talks webcast on April 28, 2020, with two presentations at 8:00 AM and 8:30 AM Pacific Time:
Song Han from MIT presented "Once-for-All: Train One Network and Specialize it for Efficient Deployment", and Alexander Eroma from Octonion presented "Unsupervised collaborative learning technology at the Edge for industrial machine vendors".
Alexander Eroma (left) and Song Han (right)
"Once-for-All: Train One Network and Specialize it for Efficient Deployment"
Song Han
Assistant Professor, Department of Electrical Engineering and Computer Science
Massachusetts Institute of Technology (MIT)
We address the challenging problem of efficient inference across many devices and resource constraints, especially on edge devices. We propose to train a once-for-all (OFA) network that supports diverse architectural settings by decoupling training from search. A specialized sub-network can then be obtained quickly by selecting it from the OFA network, without additional training. We also propose a novel progressive shrinking algorithm, a generalized pruning method that reduces the model size across many more dimensions than conventional pruning (depth, width, kernel size, and resolution) and yields a surprisingly large number of sub-networks (> 10^19) that can fit different latency constraints. On edge devices, OFA consistently outperforms SOTA NAS methods (up to 4.0% higher ImageNet top-1 accuracy than MobileNetV3, or the same accuracy while running 1.5x faster than MobileNetV3 and 2.6x faster than EfficientNet in measured latency) while reducing GPU hours and CO2 emissions by many orders of magnitude. In particular, OFA achieves a new SOTA 80.0% ImageNet top-1 accuracy under the mobile setting (<600M MACs). OFA is the winning solution of the 4th Low Power Computer Vision Challenge, in both the classification track and the detection track.
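To make the "train once, then select" idea concrete, here is a minimal Python sketch of the selection step only. It is not the authors' implementation. The design space values (5 units, depths 2 to 4, kernel sizes 3/5/7, width expansion ratios 3/4/6) are assumptions based on the configuration commonly described for OFA and are not stated in this abstract. The functions `toy_latency_ms` and `toy_accuracy` are hypothetical stand-ins for a measured latency table and a trained accuracy predictor, and plain random search is swapped in for whatever search strategy the talk uses.

```python
import random

# Assumed design space (illustrative values, not given in the abstract).
NUM_UNITS = 5
DEPTHS = (2, 3, 4)    # number of layers per unit
KERNELS = (3, 5, 7)   # kernel size per layer
WIDTHS = (3, 4, 6)    # width expansion ratio per layer


def count_subnets():
    """Count distinct (depth, kernel, width) configurations in the space."""
    per_layer = len(KERNELS) * len(WIDTHS)
    per_unit = sum(per_layer ** d for d in DEPTHS)
    return per_unit ** NUM_UNITS


def sample_subnet(rng):
    """Draw one configuration: a list of units, each a list of (kernel, width) layers."""
    return [
        [(rng.choice(KERNELS), rng.choice(WIDTHS)) for _ in range(rng.choice(DEPTHS))]
        for _ in range(NUM_UNITS)
    ]


def toy_latency_ms(subnet):
    """Hypothetical latency model: cost grows with kernel area and width."""
    return sum(0.02 * k * k * w for unit in subnet for k, w in unit)


def toy_accuracy(subnet):
    """Hypothetical accuracy proxy: more capacity scores higher, with diminishing returns."""
    capacity = sum(k * w for unit in subnet for k, w in unit)
    return capacity / (capacity + 100.0)


def select_subnet(latency_budget_ms, num_samples=2000, seed=0):
    """Random search: keep the best-scoring sample that fits the latency budget."""
    rng = random.Random(seed)
    best = None
    for _ in range(num_samples):
        candidate = sample_subnet(rng)
        if toy_latency_ms(candidate) > latency_budget_ms:
            continue  # violates the deployment constraint
        if best is None or toy_accuracy(candidate) > toy_accuracy(best):
            best = candidate
    return best


if __name__ == "__main__":
    print(f"sub-networks in the space: {count_subnets():.2e}")
    chosen = select_subnet(latency_budget_ms=30.0)
    print("layers per unit:", [len(unit) for unit in chosen])
    print(f"toy latency: {toy_latency_ms(chosen):.1f} ms")
```

Under these assumed values the count works out to (9^2 + 9^3 + 9^4)^5 = 7371^5, roughly 2 x 10^19, which is consistent with the "> 10^19" figure in the abstract. The point of the sketch is that no weights are trained inside `select_subnet`: specializing for a new device only means changing the budget and the latency model.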
Song Han is an assistant professor in MIT’s Department of Electrical Engineering and Computer Science. His research focuses on efficient deep learning computing. He proposed “deep compression”, which can reduce neural network size by an order of magnitude, and the hardware implementation “efficient inference engine”, which was the first to exploit model compression and weight sparsity in deep learning accelerators. He received best paper awards at ICLR’16 and FPGA’17. He is a recipient of the NSF CAREER Award and was named to MIT Technology Review's Innovators Under 35. Many of his pruning, compression, and acceleration techniques have been integrated into commercial AI chips. He was the co-founder and chief scientist of DeePhi Tech, which was acquired by Xilinx. He earned a PhD in electrical engineering from Stanford University.
"Unsupervised collaborative learning technology at the Edge for industrial machine vendors"
Alexander Eroma
Intelligence Team Lead, Octonion
An introduction to the unsupervised collaborative learning technology from Octonion SA, which allows industrial machine vendors and owners to get insights into machine efficiency. TinyML and TinyEdge approaches are the basic building blocks of Octonion’s technology. The presentation covers the implementation of Octonion's Edge pipeline, which is compatible with the ARM Cortex-M4 core.
Alexander graduated from Belarusian State University of Informatics and Radioelectronics as an engineer of computer systems and networks. He also holds a Master of Engineering degree in mathematical and software support of computers, complexes, and computer networks. Since 2015 he has been pursuing a Ph.D. in computer science. Alexander is the author of several scientific articles in the field of machine learning. At Octonion, Alexander is responsible for the development of complex algorithms and high-performance code, as well as solution architecture.
Feel free to ask your questions on this thread and keep the conversation going!