AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

Published: May 3, 2021
Abstract
Normalization techniques, such as batch normalization (BN), are a boon for modern deep learning. They let weights converge more quickly, often with better generalization performance. It has been argued that the normalization-induced scale invariance among the weights provides an advantageous ground for gradient descent (GD) optimizers: the effective step sizes are automatically reduced over time, stabilizing the overall training procedure. It is...
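
To make the scale-invariance claim in the abstract concrete, the following is a minimal sketch of the standard derivation (not quoted from the paper's visible text) of why the effective step size shrinks as the weight norm grows under plain GD.

% Minimal sketch, assuming L is a loss that is scale-invariant in the weight w
% (e.g. a weight directly followed by BN); standard identities, not quoted from the abstract.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
If $L(c\,w) = L(w)$ for all $c > 0$, differentiating in $c$ and in $w$ gives
\begin{align*}
  w^\top \nabla_w L(w) &= 0, &
  \nabla_w L(w)\big|_{c\,w} &= \tfrac{1}{c}\,\nabla_w L(w),
\end{align*}
so the gradient is orthogonal to $w$. A GD step
$w_{t+1} = w_t - \eta\,\nabla_w L(w_t)$ therefore satisfies
\begin{align*}
  \|w_{t+1}\|^2 &= \|w_t\|^2 + \eta^2\,\|\nabla_w L(w_t)\|^2 \;\ge\; \|w_t\|^2,
\end{align*}
while the normalized weight $\hat{w}_t = w_t / \|w_t\|$ moves with an effective
step size of roughly $\eta / \|w_t\|^2$: as the norm grows, the effective step
size is automatically reduced, which is the stabilizing mechanism the abstract
refers to.
\end{document}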