AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

Published: May 3, 2021
Abstract
Normalization techniques, such as batch normalization (BN), are a boon for modern deep learning. They let weights converge more quickly, often with better generalization performance. It has been argued that the normalization-induced scale invariance among the weights provides an advantageous ground for gradient descent (GD) optimizers: the effective step sizes are automatically reduced over time, stabilizing the overall training procedure. It is...
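
To make the scale-invariance claim in the abstract concrete, the following is a minimal sketch of the standard derivation (not quoted from the paper's visible text) of why the effective step size shrinks as the weight norm grows under plain GD.

% Minimal sketch, assuming L is a loss that is scale-invariant in the weight w
% (e.g. a weight directly followed by BN); standard identities, not quoted from the abstract.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
If $L(c\,w) = L(w)$ for all $c > 0$, differentiating in $c$ and in $w$ gives
\begin{align*}
  w^\top \nabla_w L(w) &= 0, &
  \nabla_w L(w)\big|_{c\,w} &= \tfrac{1}{c}\,\nabla_w L(w),
\end{align*}
so the gradient is orthogonal to $w$. A GD step
$w_{t+1} = w_t - \eta\,\nabla_w L(w_t)$ therefore satisfies
\begin{align*}
  \|w_{t+1}\|^2 &= \|w_t\|^2 + \eta^2\,\|\nabla_w L(w_t)\|^2 \;\ge\; \|w_t\|^2,
\end{align*}
while the normalized weight $\hat{w}_t = w_t / \|w_t\|$ moves with an effective
step size of roughly $\eta / \|w_t\|^2$: as the norm grows, the effective step
size is automatically reduced, which is the stabilizing mechanism the abstract
refers to.
\end{document}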