THESIS
2019
xiii, 123 pages : illustrations ; 30 cm
Abstract
Deep neural network models, though very powerful and highly successful, are computationally expensive in terms of both space and time. Recently, there have been many attempts at compressing these networks. Such attempts greatly reduce the network size and make it possible to deploy deep models in resource-constrained environments.
In this thesis, we focus on two kinds of network compression methods: quantization and sparsification. We first propose to directly minimize the loss w.r.t. the quantized weights by using the proximal Newton algorithm. We provide a closed-form solution for binarization, as well as an efficient approximate solution for ternarization and m-bit (where m > 2) quantization. To speed up distributed training of weight-quantized networks, we then propose to use gradient quantization to reduce the communication cost, and theoretically study how the combination of weight and gradient quantization affects convergence. In addition, since previous quantization methods usually perform poorly on LSTMs, we study why training quantized LSTMs is difficult, and show that popular normalization schemes can help stabilize their training.
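As a concrete illustration of the closed-form binarization step, the sketch below (NumPy) scales sign(w) by a curvature-weighted average of the weight magnitudes. The function name, the diagonal curvature vector d, and its use as a proximal-Newton weighting are assumptions not spelled out in this abstract; this is a minimal sketch, not the thesis's exact algorithm.

    import numpy as np

    def loss_aware_binarize(w, d, eps=1e-8):
        # Binarize a weight vector w to alpha * sign(w), where the scaling
        # alpha is a curvature-weighted average of |w|:
        #     alpha = ||d * w||_1 / ||d||_1.
        # d is a per-weight diagonal curvature estimate (an assumption here;
        # e.g. the second-moment statistic kept by an adaptive optimizer).
        d = np.maximum(d, eps)              # keep the curvature estimate positive
        alpha = np.sum(d * np.abs(w)) / np.sum(d)
        b = np.sign(w)
        b[b == 0] = 1.0                     # break ties for exactly-zero weights
        return alpha * b

    # Toy usage: with uniform curvature the result is mean(|w|) * sign(w).
    w = np.array([0.7, -0.2, 0.05, -1.3])
    d = np.ones_like(w)
    print(loss_aware_binarize(w, d))        # [ 0.5625 -0.5625  0.5625 -0.5625]

With a non-uniform d, weights sitting in high-curvature directions contribute more to the scaling, which is what makes the step loss-aware rather than a plain magnitude average.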
While weight quantization reduces redundancy in the weight representation, network sparsification reduces redundancy in the number of weights. To achieve a higher compression rate, we extend the previous quantization-only formulation to a more general network compression framework that allows simultaneous quantization and sparsification.
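The sketch below shows what a joint step might look like in the simplest case: prune the smallest-magnitude weights of a layer and binarize the survivors with a single scaling factor. The thresholding rule, the scaling, and the function name are illustrative assumptions, not the exact formulation developed in the thesis.

    import numpy as np

    def quantize_and_sparsify(w, sparsity=0.5):
        # Jointly prune and binarize one layer's weights: weights whose
        # magnitude falls at or below the `sparsity` quantile are set to
        # zero, and the survivors are replaced by alpha * sign(w), with
        # alpha the mean magnitude of the survivors.
        thr = np.quantile(np.abs(w), sparsity)
        mask = (np.abs(w) > thr).astype(w.dtype)
        kept = np.abs(w)[mask == 1]
        alpha = kept.mean() if kept.size else 0.0
        return alpha * np.sign(w) * mask

    # Toy usage: half the weights are zeroed, the rest share one scale.
    w = np.array([0.9, -0.1, 0.3, -0.05, 1.2, 0.02])
    print(quantize_and_sparsify(w, sparsity=0.5))   # [ 0.8 -0.   0.8 -0.   0.8  0. ]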
Finally, we find that sparse deep neural networks obtained by pruning resemble biological networks in many ways. Inspired by the power-law distributions found in many biological networks, we show that these pruned deep networks also exhibit power-law properties, and that these properties can be exploited for faster learning and smaller networks in continual learning.
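As an illustration of the kind of power-law analysis referred to here, the sketch below estimates a power-law exponent for the degree distribution of a pruned layer using the standard maximum-likelihood estimator of Clauset et al.; the function name and the choice of estimator are assumptions for illustration, not necessarily the procedure used in the thesis.

    import numpy as np

    def degree_powerlaw_exponent(mask, k_min=1):
        # Degree of each input/output unit = number of surviving connections
        # touching it in the pruned 0/1 mask.  The exponent is estimated with
        # the discrete MLE of Clauset et al.:
        #     alpha = 1 + n / sum(log(k_i / (k_min - 0.5)))
        # over all degrees k_i >= k_min.
        degrees = np.concatenate([mask.sum(axis=0), mask.sum(axis=1)])
        degrees = degrees[degrees >= k_min]
        n = degrees.size
        return 1.0 + n / np.sum(np.log(degrees / (k_min - 0.5)))

    # Toy usage: a random 90%-sparse mask for a 256x256 layer.  (A random
    # mask does not actually follow a power law; this only exercises the code.)
    rng = np.random.default_rng(0)
    mask = (rng.random((256, 256)) < 0.1).astype(float)
    print(degree_powerlaw_exponent(mask, k_min=1))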