We held our next tinyML Talks webcast on January 26, 2023: Manuele Rusci from the Katholieke Universiteit Leuven presented "TinyDenoiser: RNN-based Speech Enhancement on a Multi-Core MCU with Mixed FP16-INT8 Post-Training Quantization."
This work targets a multi-core Microcontroller Unit (MCU) with 1+8 general-purpose RISC-V cores and support for vectorized 8-bit integer (INT8) and 16-bit floating-point (FP16) arithmetic. To achieve low-latency execution, we propose a software pipeline that interleaves the parallel computation of LSTM or GRU recurrent units with manually managed memory transfers of the model parameters. To ensure minimal accuracy degradation with respect to the full-precision models, we also propose a novel FP16-INT8 mixed-precision Post-Training Quantization (PTQ) scheme that compresses the recurrent layers to 8 bits while keeping the remaining layers in FP16. Experiments are conducted on multiple LSTM- and GRU-based SE models belonging to the TinyDenoiser family, featuring up to 1.24M parameters. Thanks to the proposed approach, we speed up the computation by up to 4× with respect to the lossless FP16 baselines while showing only a low degradation of the PESQ score. Our design is >10× more energy efficient than state-of-the-art SE solutions deployed on single-core MCUs that use smaller models and quantization-aware training.
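The mixed-precision idea can be illustrated with a small, self-contained sketch: the recurrent weight matrices are stored as INT8 with a per-tensor scale, while the remaining (more sensitive) layers stay in FP16. The layer shapes below are hypothetical, and per-tensor symmetric quantization is an assumption for illustration, not necessarily the exact calibration scheme used in the talk.

```python
import numpy as np

def quantize_int8(w):
    """Per-tensor symmetric INT8 quantization: returns int8 weights and a scale."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate FP32 tensor from INT8 weights and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
# Hypothetical model: one recurrent weight matrix (compressed to INT8)
# and one dense output layer (kept in FP16).
w_rnn = rng.standard_normal((256, 256)).astype(np.float32) * 0.1
w_out = rng.standard_normal((256, 1)).astype(np.float32) * 0.1

q_rnn, s_rnn = quantize_int8(w_rnn)   # 8-bit storage: 4x smaller than FP32
w_out_fp16 = w_out.astype(np.float16) # sensitive layer kept at FP16

# Rounding error of symmetric quantization is bounded by half the scale.
err = np.max(np.abs(dequantize(q_rnn, s_rnn) - w_rnn))
print(f"max INT8 reconstruction error: {err:.6f} (scale = {s_rnn:.6f})")
```

The worst-case weight error of this scheme is half the quantization step, which is why the abstract can report only a small PESQ degradation despite the 8-bit recurrent layers.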
Dr. Manuele Rusci received the Ph.D. degree in electronic engineering from the University of Bologna in 2018. He currently holds an MSCA Post-Doctoral Fellowship at the Katholieke Universiteit Leuven, after a postdoctoral position at the University of Bologna. His main research interests include low-power AI-powered smart sensors and on-device continual learning.
Watch on YouTube:
Download presentation slides:
Feel free to ask your questions on this thread and keep the conversation going!