Abstract Submission

6693. K-Quant: a non-uniform post-training quantization algorithm

Contributed abstract in session FA-5: Optimization and Artificial Intelligence II, stream Optimization and Artificial Intelligence.

Friday, 9:00 - 10:40
Room: Pontryagin

Authors (first author is the speaker)

1. Enrico Civitelli
Department of Information Engineering, Università degli Studi di Firenze
2. Leonardo Taccari
Fleetmatics Research
3. Fabio Schoen
Dipartimento di Ingegneria dell'Informazione, Università degli Studi di Firenze


Quantization is a simple yet effective way to deploy deep neural networks on resource-limited hardware. Post-training quantization algorithms are particularly interesting because they do not require the full training dataset to run. In this work we explore a way to perform non-uniform post-training quantization using an optimization algorithm that minimizes the output difference between each compressed layer and the original one. The proposed method significantly reduces the memory required by the neural network without noticeably affecting its accuracy.
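The abstract does not spell out the algorithm, so the following is only a minimal sketch of what non-uniform post-training quantization of a single layer can look like. It assumes (as the name "K-Quant" suggests) that the non-uniform levels come from k-means clustering of the layer's weights, and it measures the output difference on a small random calibration batch; the function names and the choice of 3 bits are illustrative, not taken from the paper.

```python
import numpy as np

def kmeans_1d(values, k, iters=20, seed=0):
    """Lloyd's algorithm on the flattened weights: k non-uniform levels."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)
    for _ in range(iters):
        # Assign each weight to its nearest level, then recenter each level.
        idx = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            members = values[idx == j]
            if members.size:
                centers[j] = members.mean()
    return centers

def quantize_layer(W, bits=3):
    """Replace every weight by the nearest of 2**bits learned levels."""
    levels = kmeans_1d(W.ravel(), 2 ** bits)
    idx = np.abs(W.ravel()[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx].reshape(W.shape), levels

# Calibration: compare the layer's output before and after quantization
# on a small batch of (here synthetic) inputs.
rng = np.random.default_rng(1)
W = rng.normal(size=(64, 128))          # dense layer weights
X = rng.normal(size=(32, 128))          # calibration batch
Wq, levels = quantize_layer(W, bits=3)  # 8 distinct weight values
rel_err = np.linalg.norm(X @ W.T - X @ Wq.T) / np.linalg.norm(X @ W.T)
```

With 2^3 = 8 levels the weight matrix needs only a 3-bit index per entry plus a small codebook, which is where the memory reduction comes from; the relative output error `rel_err` quantifies how close the compressed layer stays to the original.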


Status: accepted
