Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

Published: Feb 15, 2016
Abstract: Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce "deep compression", a three-stage pipeline of pruning, trained quantization, and Huffman coding, whose stages work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy. Our method first prunes the network...
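The abstract describes the pipeline only at a high level. Below is a minimal NumPy sketch of the three stages applied to a single weight matrix. The threshold, cluster count, and every function name here are illustrative assumptions, not the paper's implementation; in particular, the sketch omits the retraining the paper performs after pruning and the fine-tuning of the shared centroids during quantization.

```python
import numpy as np
from collections import Counter
from heapq import heappush, heappop, heapify

def prune(weights, threshold):
    """Stage 1 (assumed magnitude pruning): zero out small weights."""
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize(weights, mask, n_clusters=16):
    """Stage 2 (sketch): weight sharing via 1-D k-means over surviving weights."""
    vals = weights[mask]
    # Linear initialization of centroids over the weight range (one common choice).
    centroids = np.linspace(vals.min(), vals.max(), n_clusters)
    for _ in range(20):  # a few Lloyd iterations
        assign = np.argmin(np.abs(vals[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(assign == k):
                centroids[k] = vals[assign == k].mean()
    return assign, centroids

def huffman_code_lengths(symbols):
    """Stage 3: Huffman coding; returns each symbol's codeword length in bits."""
    counts = Counter(symbols)
    heap = [[c, [s, 0]] for s, c in counts.items()]
    heapify(heap)
    while len(heap) > 1:
        lo, hi = heappop(heap), heappop(heap)
        for pair in lo[1:]:
            pair[1] += 1  # one level deeper in the code tree
        for pair in hi[1:]:
            pair[1] += 1
        heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return {s: max(length, 1) for s, length in heap[0][1:]}

# Toy end-to-end run on a random matrix (hypothetical numbers, not from the paper).
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64)).astype(np.float32)
pruned, mask = prune(W, threshold=0.5)
assign, centroids = quantize(pruned, mask)
lengths = huffman_code_lengths(assign.tolist())
bits = sum(lengths[s] for s in assign.tolist())
print(f"kept {mask.mean():.0%} of weights; "
      f"{bits / 8 / 1024:.2f} KiB of Huffman-coded indices "
      f"+ a codebook of {len(centroids)} floats")
```

With 16 clusters, each surviving weight index fits in 4 bits even before entropy coding; Huffman coding then shortens the codes for the most frequent cluster indices further, which is where the final stage's savings come from.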