We held our next tinyML Talks webcast. Foroozan Karimzadeh from Georgia Institute of Technology presented "Twofold Sparsity: Joint Bit- and Network-level Sparse Deep Neural Network for Energy-efficient RRAM Based CIM" on November 21, 2023.
The rising popularity of intelligent mobile devices and the computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. AI-powered edge devices require compressed deep learning algorithms and energy-efficient hardware. A traditional von Neumann architecture suffers from the latency and power dissipation caused by intra-chip data communication. Compute-in-memory (CIM) architectures have emerged to overcome these challenges by performing computations in the memory itself, which reduces the need for data movement between memory and processing units. However, current deep learning compression techniques are not designed to take advantage of CIM architectures. In this work, we propose Twofold Sparsity, a joint bit- and network-level sparsity method that highly sparsifies deep learning models by taking advantage of the CIM architecture for energy-efficient computation. The Twofold Sparsity method sparsifies the network during training by adding two regularizations: one sparsifies the weights using a Linear Feedback Shift Register (LFSR) mask, and the other sparsifies the values at the bit level by driving bits to zero. During inference, the same LFSR is used to select the correct sparsified weights, and a 2-bit/cell RRAM-based CIM performs the computation. The Twofold Sparsity method achieves 2.2x to 14x better energy efficiency at overall sparsity rates from 10% to 90% compared to the original 8-bit network, eventually enabling powerful deep learning models to run on power-constrained edge devices.
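To make the two sparsity levels in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation: the 16-bit LFSR tap positions, the thresholding rule used to turn LFSR states into a keep/drop mask, and all function names (`lfsr16_stream`, `lfsr_mask`, `to_2bit_cells`, `zero_bit_fraction`) are assumptions chosen for illustration. It only shows why a seeded LFSR lets the same mask be regenerated at inference without storing weight indices, and how an 8-bit weight maps onto four 2-bit cells.

```python
def lfsr16_stream(seed):
    """Yield successive states of a 16-bit Fibonacci LFSR.

    Taps 16, 14, 13, 11 are an assumption for this sketch; the talk
    does not specify the LFSR width or polynomial.
    """
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state


def lfsr_mask(n, seed, sparsity):
    """Return n keep (1) / drop (0) flags derived from the LFSR.

    Hypothetical rule: drop a weight when the LFSR state falls below
    a threshold proportional to the target sparsity.
    """
    threshold = int(sparsity * 65536)
    stream = lfsr16_stream(seed)
    return [0 if next(stream) < threshold else 1 for _ in range(n)]


def to_2bit_cells(w):
    """Split an unsigned 8-bit weight into four 2-bit values, MSB first."""
    return [(w >> shift) & 0b11 for shift in (6, 4, 2, 0)]


def zero_bit_fraction(weights):
    """Fraction of zero bits across a list of unsigned 8-bit weights."""
    total = 8 * len(weights)
    ones = sum(bin(w & 0xFF).count("1") for w in weights)
    return (total - ones) / total


if __name__ == "__main__":
    weights = [0b11010010, 0b00000100, 0b01000000, 0b00010001]
    seed = 0xACE1  # illustrative seed, shared by training and inference

    # Network-level sparsity: regenerate the same mask from the seed alone.
    mask = lfsr_mask(len(weights), seed, sparsity=0.5)
    sparse_weights = [w * m for w, m in zip(weights, mask)]

    # Bit-level view: each surviving weight occupies four 2-bit cells.
    for w in sparse_weights:
        print(format(w, "08b"), to_2bit_cells(w))
    print("zero-bit fraction:", zero_bit_fraction(sparse_weights))
```

Because the mask depends only on the seed and the target sparsity, calling `lfsr_mask` with the same arguments at inference reproduces the training-time mask exactly. Note that consecutive LFSR states are correlated, so the achieved sparsity in this sketch only approximates the target.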
Foroozan Karimzadeh is currently a postdoctoral fellow at Georgia Institute of Technology. She received her PhD from the Electrical and Computer Engineering department at Georgia Institute of Technology in 2022, under the supervision of Dr. Raychowdhury. Her research interests mainly include developing novel algorithms and hardware co-design for energy-efficient deep learning and large language models. She was selected as an MIT Rising Star in EECS in 2023. Foroozan was awarded a prestigious Semiconductor Research Corporation (SRC) Graduate Fellowship, which is awarded in partnership with Texas Instruments. She also received the DAC Young Fellow award at the Design Automation Conference in 2022.
Download presentation slides:
Feel free to ask your questions on this thread and keep the conversation going!