[DEEP COMPRESSION: Compressing deep neural networks with pruning, trained quantization and Huffman coding](https://ingenjoy.notion.site/DEEP-COMPRESSION-Compressing-deep-neural-networks-with-pruning-trained-quantization-and-huffman-co-165308e26924811eb1eaf3ab553a4961)
[A Survey of Model Compression and Acceleration for Deep Neural Networks](https://ingenjoy.notion.site/A-Survey-of-Model-Compression-and-Acceleration-for-Deep-Neural-Networks-165308e26924816898a6f19fc6fd0f88)
[Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference](https://ingenjoy.notion.site/Quantization-and-Training-of-Neural-Networks-for-Efficient-Integer-Arithmetic-Only-Inference-165308e2692481a29c4ede489a17d519)
[Quantizing deep convolutional networks for efficient inference: A whitepaper](https://ingenjoy.notion.site/Quantizing-deep-convolutional-networks-for-efficient-inference_A-whitepaper-165308e26924816fa0d0cec80b847b4e)
[Learned step size quantization](https://ingenjoy.notion.site/Learned-step-size-quantization-165308e2692481d7afd5c4ffdc0e6609)
[Rethinking floating point for deep learning](https://ingenjoy.notion.site/Rethinking-floating-point-for-deep-learning-165308e26924817ebdf0f00c0f737de5)
[LSQ+: Improving low-bit quantization through learnable offsets and better initialization](https://ingenjoy.notion.site/LSQ-Improving-low-bit-quantization-through-learnable-offsets-and-better-initialization-165308e2692481dc8faffe3b0ebc6cbb)
Differentiable Model Compression via Pseudo Quantization Noise
Characterising Bias in Compressed Models
Towards Accurate Post-training Network Quantization via Bit-Split and Stitching
Training with Quantization Noise for Extreme Model Compression
Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation