Deep Linear Networks Dynamics: Low-Rank Biases Induced by Initialization Scale and L2 Regularization.

Published: Jun 30, 2021
Abstract
For deep linear networks (DLN), various hyperparameters alter the dynamics of training dramatically. We investigate how the rank of the linear map found by gradient descent is affected by (1) the initialization norm and (2) the addition of L2 regularization on the parameters. For (1), we study two regimes: (1a) the linear/lazy regime, for large norm initialization; (1b) a "saddle-to-saddle" regime for small...
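The two knobs named in the abstract, initialization norm and L2 regularization, can be explored on a toy problem. The following is a minimal sketch and not the authors' experimental setup: it assumes a square deep linear network trained by plain gradient descent to fit a fixed low-rank target matrix. The function names (`product`, `train_dln`, `numerical_rank`), the loss, and all hyperparameter values are illustrative assumptions.

```python
import numpy as np


def product(weights):
    """Return W_L @ ... @ W_1 for a non-empty list weights = [W_1, ..., W_L]."""
    out = weights[0]
    for W in weights[1:]:
        out = W @ out
    return out


def numerical_rank(M, rel_tol=1e-3):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rel_tol * s.max()))


def train_dln(target, depth=3, init_scale=1e-2, l2=0.0, lr=0.05,
              steps=2000, seed=0):
    """Gradient descent on 0.5*||W_L...W_1 - target||_F^2 + 0.5*l2*sum ||W_l||_F^2.

    Assumptions (not from the source): square d x d layers, Gaussian
    initialization scaled by init_scale, constant step size lr.
    """
    rng = np.random.default_rng(seed)
    d = target.shape[0]
    weights = [init_scale * rng.standard_normal((d, d)) for _ in range(depth)]
    eye = np.eye(d)
    for _ in range(steps):
        residual = product(weights) - target
        grads = []
        for l in range(depth):
            # Layers applied after W_l (left factor) and before W_l (right factor).
            left = product(weights[l + 1:]) if l + 1 < depth else eye
            right = product(weights[:l]) if l > 0 else eye
            grads.append(left.T @ residual @ right.T + l2 * weights[l])
        weights = [W - lr * g for W, g in zip(weights, grads)]
    return product(weights)


if __name__ == "__main__":
    # Hypothetical rank-2 target in dimension 5.
    target = np.diag([1.0, 0.5, 0.0, 0.0, 0.0])
    for l2 in (0.0, 1e-2):
        learned = train_dln(target, init_scale=1e-2, l2=l2)
        print("l2 =", l2, "numerical rank =", numerical_rank(learned))
```

Varying `init_scale` and `l2` and inspecting the singular values of the returned matrix gives a rough feel for the rank effects the abstract describes. Larger `init_scale` values may need a smaller `lr` to keep gradient descent stable.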