Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning

By Charles H. Martin and Michael W. Mahoney
Published on Oct. 2, 2018

Table of Contents

1 Introduction
2 Simple Capacity Metrics and Transitions during Backprop
3 Basic Random Matrix Theory (RMT)
4 Empirical Results: ESDs for Existing, Pretrained DNNs
5 5+1 Phases of Regularized Training
6 Empirical Results: Detailed Analysis on Smaller Models
7 Explaining the Generalization Gap by Exhibiting the Phases
8 Discussion and Conclusion

Summary

Random Matrix Theory (RMT) is applied to analyze the weight matrices of Deep Neural Networks (DNNs), providing evidence of Implicit Self-Regularization in the training process. The study identifies 5+1 Phases of Training that exhibit increasing amounts of Implicit Self-Regularization. The results suggest that well-trained, modern DNN architectures display Heavy-Tailed Self-Regularization, which helps explain the generalization gap phenomenon. The paper discusses the theoretical and practical implications of these findings, emphasizing the importance of understanding and controlling the Energy Landscape in DNN optimization.
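As a rough illustration of the kind of analysis described above, the sketch below computes the Empirical Spectral Density (ESD) of a layer weight matrix, i.e. the eigenvalues of the correlation matrix X = Wᵀ W / N, and estimates a power-law exponent for its tail. The helper names (`esd`, `power_law_alpha`), the synthetic heavy-tailed weight matrix, and the crude tail cutoff are illustrative assumptions for this sketch, not the authors' code or exact fitting procedure.

```python
import numpy as np

def esd(W):
    """Empirical spectral density of an N x M weight matrix W:
    eigenvalues of the correlation matrix X = W^T W / N."""
    N, M = W.shape
    X = W.T @ W / N                    # M x M correlation matrix
    return np.linalg.eigvalsh(X)       # real eigenvalues, ascending

def power_law_alpha(eigs, xmin=None):
    """Maximum-likelihood estimate of a power-law tail exponent:
    alpha = 1 + n / sum(log(lambda_i / xmin)) over eigenvalues >= xmin.
    The median cutoff below is a crude placeholder; a careful fit
    would scan xmin and check goodness of fit."""
    eigs = np.asarray(eigs)
    if xmin is None:
        xmin = np.quantile(eigs, 0.5)
    tail = eigs[eigs >= xmin]
    return 1.0 + len(tail) / np.sum(np.log(tail / xmin))

# Stand-in for a trained layer's weights: heavy-tailed random entries.
rng = np.random.default_rng(0)
W = rng.standard_t(df=3, size=(1000, 500))
eigs = esd(W)
print("largest eigenvalue:", eigs[-1])
print("estimated tail exponent alpha:", power_law_alpha(eigs))
```

In this framing, a sharply bounded, Marchenko-Pastur-like ESD would suggest little self-regularization, while a heavy, power-law tail with a small exponent would be the signature of the Heavy-Tailed Self-Regularization the paper associates with well-trained networks.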