
A. Quantization and Binarization
- reducing the number of bits required to represent each weight : k-means scalar quantization of the parameter values
- significant speed-up with minimal loss of accuracy : 8-bit quantization of the parameters
- reduced memory usage and floating-point operations with little loss in classification accuracy : 16-bit fixed-point representation in stochastic-rounding-based CNN training
- pruning the unimportant connections, retraining the network to learn the final weights for the remaining sparse connections, then quantizing the weights via weight sharing and applying Huffman coding to the quantized weights as well as the codebook to further reduce the rate (see the pipeline sketch after this list)
- minimizing the average Hessian-weighted quantization error to cluster the network parameters
- the extreme case of a 1-bit representation of each weight, that is, binary weight neural networks (see the binarization sketch after this list)
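To make the pruning / weight-sharing / Huffman-coding pipeline above concrete, here is a minimal NumPy sketch. The helper names (prune_small_weights, fit_codebook, huffman_code_lengths), the magnitude-threshold pruning rule, the tiny 1-D k-means, and the 16-entry codebook are illustrative assumptions rather than details of the referenced implementation; the retraining step is omitted.

import heapq
from collections import Counter

import numpy as np


def prune_small_weights(w, sparsity=0.9):
    """Zero out the smallest-magnitude weights (magnitude pruning)."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) >= threshold, w, 0.0)


def fit_codebook(values, n_clusters=16, n_iters=20):
    """Tiny 1-D k-means for weight sharing: returns (indices, codebook)."""
    codebook = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iters):
        idx = np.argmin(np.abs(values[:, None] - codebook[None, :]), axis=1)
        for k in range(n_clusters):
            members = values[idx == k]
            if members.size:
                codebook[k] = members.mean()
    return idx, codebook


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} of a Huffman code over symbols."""
    freq = Counter(symbols)
    lengths = {s: 0 for s in freq}
    heap = [(f, [s]) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, group1 = heapq.heappop(heap)
        f2, group2 = heapq.heappop(heap)
        for s in group1 + group2:
            lengths[s] += 1          # every symbol in the merged subtree gains one bit
        heapq.heappush(heap, (f1 + f2, group1 + group2))
    return lengths


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=4096).astype(np.float32)

    pruned = prune_small_weights(w, sparsity=0.9)           # step 1: prune connections
    survivors = pruned[pruned != 0.0]                       # (retraining step omitted)
    idx, codebook = fit_codebook(survivors, n_clusters=16)  # step 2: weight sharing
    lengths = huffman_code_lengths(idx.tolist())            # step 3: Huffman coding

    total_bits = sum(lengths[s] for s in idx.tolist())
    print(f"avg bits per surviving weight: {total_bits / idx.size:.2f} (vs. 32-bit float)")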
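For the 1-bit case in the last item, a minimal sketch of binarizing a weight tensor is shown below. Scaling the signs by the mean absolute value is one common choice in binary-weight networks; the function name binarize_weights and that particular scaling rule are assumptions for illustration.

import numpy as np


def binarize_weights(w):
    """Replace each weight by its sign times a single per-tensor scale.

    Only 1 bit per weight (the sign) plus one float (the scale) needs to be
    stored; the scale limits the quantization error.
    """
    alpha = np.mean(np.abs(w))                      # per-tensor scaling factor
    return alpha * np.where(w >= 0, 1.0, -1.0)      # values in {-alpha, +alpha}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(128, 128)).astype(np.float32)
    wb = binarize_weights(w)
    print("unique values:", np.unique(wb))
    print("relative L2 error:", np.linalg.norm(w - wb) / np.linalg.norm(w))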
k-means scalar quantization
[6] Y. Gong, L. Liu, M. Yang, and L. D. Bourdev, “Compressing deep convolutional networks using vector quantization,” CoRR, vol. abs/1412.6115, 2014. (FAIR)
[7] J. Wu, C. Leng, Y. Wang, Q. Hu, and J. Cheng, “Quantized convolutional neural networks for mobile devices,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
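A minimal sketch of k-means scalar quantization in the spirit of [6], [7]: every weight is treated as a 1-D point, the values are clustered, and only a per-weight cluster index plus a small codebook are stored. The use of scipy.cluster.vq and the 8-entry codebook are illustrative assumptions.

import numpy as np
from scipy.cluster.vq import kmeans, vq


def kmeans_quantize(weights, n_clusters=8):
    """Scalar-quantize a weight tensor with a k-means codebook.

    Each weight is mapped to the nearest of n_clusters centroids, so only an
    integer index per weight and the small codebook need to be stored.
    """
    flat = weights.reshape(-1, 1).astype(np.float64)   # treat each weight as a 1-D point
    codebook, _ = kmeans(flat, n_clusters)             # learn the centroids
    indices, _ = vq(flat, codebook)                    # nearest-centroid index per weight
    return indices.astype(np.uint8), codebook.ravel()


def dequantize(indices, codebook, shape):
    """Reconstruct approximate weights from indices and codebook."""
    return codebook[indices].reshape(shape).astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(256, 256)).astype(np.float32)
    idx, cb = kmeans_quantize(w, n_clusters=8)          # 3 bits per weight instead of 32
    w_hat = dequantize(idx, cb, w.shape)
    print("relative L2 error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))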
8-bit quantization
[8] V. Vanhoucke, A. Senior, and M. Z. Mao, “Improving the speed of neural networks on CPUs,” in Deep Learning and Unsupervised Feature Learning Workshop, NIPS 2011, 2011.
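A minimal sketch of 8-bit quantization of a weight tensor, in the spirit of [8]; the symmetric per-tensor scale and the round-to-nearest mapping are illustrative assumptions.

import numpy as np


def quantize_int8(w):
    """Map float weights to int8 with a single symmetric per-tensor scale."""
    scale = np.max(np.abs(w)) / 127.0                 # largest magnitude maps to 127
    q = np.clip(np.round(w / scale), -127, 127)
    return q.astype(np.int8), scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return q.astype(np.float32) * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(512, 512)).astype(np.float32)
    q, scale = quantize_int8(w)
    w_hat = dequantize_int8(q, scale)
    print("bytes: float32 =", w.nbytes, " int8 =", q.nbytes)   # 4x smaller
    print("relative L2 error:", np.linalg.norm(w - w_hat) / np.linalg.norm(w))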
16-bit fixed-point representation
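The 16-bit fixed-point training above depends on stochastic rounding, which keeps rounding unbiased so that small updates are not systematically lost. Below is a minimal sketch of stochastic rounding onto a fixed-point grid; the 12-fractional-bit split of the 16-bit word and the helper name stochastic_round_fixed_point are illustrative assumptions.

import numpy as np


def stochastic_round_fixed_point(x, frac_bits=12, word_bits=16, rng=None):
    """Round x to a signed fixed-point grid with stochastic rounding.

    A value between two grid points is rounded up with probability equal to
    its fractional distance to the lower point, so the rounding is unbiased
    in expectation (up to saturation at the ends of the range).
    """
    rng = np.random.default_rng() if rng is None else rng
    scale = 2.0 ** frac_bits                      # grid spacing is 2**-frac_bits
    scaled = x * scale
    floor = np.floor(scaled)
    prob_up = scaled - floor                      # distance to the lower grid point
    rounded = floor + (rng.random(x.shape) < prob_up)
    limit = 2.0 ** (word_bits - 1) - 1            # saturate to the signed word range
    return np.clip(rounded, -limit - 1, limit) / scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(scale=0.01, size=100000)
    xq = stochastic_round_fixed_point(x, frac_bits=12, word_bits=16, rng=rng)
    print("mean rounding bias:", np.mean(xq - x))          # close to 0 in expectation
    print("max abs error:", np.max(np.abs(xq - x)))        # at most one grid step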