Truly Sparse Neural Networks at Scale

By Selima Curci et al.

Table of Contents

1 Introduction
2 Results
2.1 Proposed contributions
2.1.1 WASAP-SGD method
2.1.2 All-ReLU
2.1.3 Importance Pruning
2.1.4 Large scale Sparse Neural Network framework
2.2 Performance on Sequential Trained Sparse MLPs

Summary

Recently, sparse training methods have started to become a de facto approach for efficient training and inference in artificial neural networks. In this paper, the authors introduce three novel contributions designed specifically for sparse neural networks: a parallel training algorithm (WASAP-SGD), an activation function (All-ReLU), and a hidden-neuron importance metric (Importance Pruning). Together, these contributions aim to train truly sparse neural networks and exploit their full potential. The paper discusses the importance of sparse connectivity in neural networks and the challenges posed by the dominance of dense matrix operations in current deep learning software and hardware. The authors propose sparse training methods and novel algorithms to improve the scalability and efficiency of neural networks. Experimental results on several datasets demonstrate that the proposed methods can reach high accuracy with reduced computational resources, showcasing the potential of sparse neural networks.
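To make the distinction between masked-dense and truly sparse computation concrete, the following is a minimal sketch, not the authors' framework: it contrasts a full weight matrix that merely contains zeros with a CSR-format matrix that stores and multiplies only the non-zero connections. The layer sizes and sparsity level are hypothetical, chosen purely for illustration.

```python
# Minimal illustration (not the authors' implementation): masked dense weights
# vs. truly sparse CSR weights, which store only the non-zero connections.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
n_in, n_out, density = 1024, 512, 0.05        # hypothetical layer sizes and sparsity

# Masked-dense representation: the full (n_out, n_in) buffer is allocated and
# multiplied, even though 95% of its entries are zero.
mask = rng.random((n_out, n_in)) < density
W_masked = rng.standard_normal((n_out, n_in)) * mask

# Truly sparse representation: only the ~5% non-zero weights are kept (CSR).
W_csr = sparse.random(n_out, n_in, density=density, format="csr",
                      random_state=0, data_rvs=rng.standard_normal)

x = rng.standard_normal((32, n_in))           # a batch of 32 input vectors

# Forward pass with a ReLU activation for each representation.
y_masked = np.maximum(x @ W_masked.T, 0.0)
y_sparse = np.maximum(np.asarray(W_csr @ x.T).T, 0.0)

# Memory actually stored for the weights: full buffer vs. non-zeros only.
print(f"masked dense: {W_masked.nbytes} bytes, CSR non-zeros: {W_csr.data.nbytes} bytes")
```

The memory gap printed at the end scales with the sparsity level, which is why a truly sparse implementation, rather than a dense matrix with a binary mask, is needed to realize the savings the paper targets.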