Papers

[DEEP COMPRESSION: Compressing deep neural networks with pruning, trained quantization and huffman coding](https://ingenjoy.notion.site/DEEP-COMPRESSION-Compressing-deep-neural-networks-with-pruning-trained-quantization-and-huffman-co-165308e26924811eb1eaf3ab553a4961)

[A Survey of Model Compression and Acceleration for Deep Neural Networks](https://ingenjoy.notion.site/A-Survey-of-Model-Compression-and-Acceleration-for-Deep-Neural-Networks-165308e26924816898a6f19fc6fd0f88)

[Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference](https://ingenjoy.notion.site/Quantization-and-Training-of-Neural-Networks-for-Efficient-Integer-Arithmetic-Only-Inference-165308e2692481a29c4ede489a17d519)

[Quantizing deep convolutional networks for efficient inference_A whitepaper](https://ingenjoy.notion.site/Quantizing-deep-convolutional-networks-for-efficient-inference_A-whitepaper-165308e26924816fa0d0cec80b847b4e)

[Learned step size quantization](https://ingenjoy.notion.site/Learned-step-size-quantization-165308e2692481d7afd5c4ffdc0e6609)

[Rethinking floating point for deep learning](https://ingenjoy.notion.site/Rethinking-floating-point-for-deep-learning-165308e26924817ebdf0f00c0f737de5)

[LSQ+: Improving low-bit quantization through learnable offsets and better initialization](https://ingenjoy.notion.site/LSQ-Improving-low-bit-quantization-through-learnable-offsets-and-better-initialization-165308e2692481dc8faffe3b0ebc6cbb)

Differentiable Model Compression via Pseudo Quantization Noise

Characterising Bias in Compressed Models

Towards Accurate Post-training Network Quantization via Bit-Split and Stitching

Training with Quantization Noise for Extreme Model Compression

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

Quantization Basics
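As a starting point for the basics, here is a minimal sketch of uniform affine quantization, the scheme underlying several of the papers listed above (e.g. the integer-arithmetic-only inference paper and the whitepaper). The function names, the 8-bit unsigned range, and the min/max calibration are illustrative assumptions, not taken from any single paper:

```python
import numpy as np

def quantize(x, num_bits=8):
    """Uniform affine quantization: map floats onto num_bits unsigned integers.

    Assumes min/max calibration over the tensor (an illustrative choice;
    real toolchains also use percentile or learned ranges, as in LSQ).
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)          # real-valued step size
    zero_point = int(round(qmin - x.min() / scale))      # integer mapped to 0.0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized representation."""
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, s, z = quantize(x)
x_hat = dequantize(q, s, z)  # reconstruction error is at most one step size
```

Note that the zero point guarantees the real value 0.0 is exactly representable, a property the integer-arithmetic-only inference paper relies on for zero-padding.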

Reference Materials