IEEE Signal Processing Magazine, Special Issue on Deep Learning for Image Understanding

By Y. Cheng et al.

Table of Contents

I. Introduction
II. Parameter Pruning and Quantization
III. Low-Rank Approximation and Sparsity

Summary

This paper surveys model compression and acceleration techniques for deep neural networks, with the goal of reducing their computational cost and memory footprint. It reviews methods including parameter pruning and quantization, low-rank factorization, and knowledge distillation, and examines the challenges and opportunities of deploying deep learning systems on resource-constrained devices.
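To make two of the surveyed ideas concrete, the sketch below shows magnitude-based parameter pruning (zeroing the smallest-magnitude weights) and symmetric uniform quantization (mapping weights to low-bit integers and back). This is a minimal illustration using NumPy, not code from the paper; the function names, the 50% sparsity target, and the 8-bit setting are illustrative assumptions.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

def quantize_uniform(weights, num_bits=8):
    """Symmetric uniform quantization to signed `num_bits` integers, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(weights).max() / qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax)
    return q * scale  # dequantized low-precision approximation

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4))        # stand-in for a layer's weight matrix
w_pruned = magnitude_prune(w, sparsity=0.5)
w_quant = quantize_uniform(w_pruned, num_bits=8)
```

In practice these steps are applied per layer and followed by fine-tuning to recover accuracy, since both pruning and quantization perturb the learned weights.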