THESIS
2022
1 online resource (xiv, 102 pages) : illustrations (some color)
Abstract
Although deep learning (DL) methods have achieved tremendous successes in various
medical image analysis tasks, the high demand on computation resources impedes their
practicability in computer-aided healthcare products and clinical practices. To address the
efficiency concerns on deep-learning-based medical image analysis algorithms, this thesis
introduces several model quantization methods that can lower the requirements of DL
models on computational precision (bit-width) without compromising their performance.
First, a quantization-aware training (QAT) approach is designed for medical image
segmentation tasks. This method combines an adaptive quantization function with improved
derivative approximation, the radical use of residual connections, and a
knowledge distillation-aided training scheme. Together, these yield highly accurate
segmentation models with extremely low bit-width (e.g., binary or 2-bit models).
Experiments on two public medical image segmentation datasets have demonstrated the
efficacy of this method.
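
As a rough illustration of what training at a given bit-width involves, the sketch below shows a plain symmetric uniform quantizer together with a straight-through estimator (STE) for its gradient. This is a generic baseline and an assumption for illustration only: it is not the adaptive quantization function or the improved derivative approximation developed in the thesis, and the names `quantize_uniform`, `ste_grad`, and `x_max` are hypothetical.

```python
import numpy as np

def quantize_uniform(x, bits, x_max):
    """Map x onto 2**bits evenly spaced levels in [-x_max, x_max]."""
    levels = 2 ** bits - 1            # number of steps between the end points
    step = 2.0 * x_max / levels       # spacing between neighbouring levels
    x_clipped = np.clip(x, -x_max, x_max)
    # Shift to [0, 2*x_max], snap to the nearest level index, shift back.
    return np.round((x_clipped + x_max) / step) * step - x_max

def ste_grad(x, grad_out, x_max):
    """Straight-through estimator: treat rounding as the identity in the
    backward pass, and block the gradient where x was clipped."""
    return grad_out * (np.abs(x) <= x_max)
```

With `bits=1` the quantizer reduces to a binary model with the two levels `-x_max` and `+x_max`; with `bits=2` it has four levels.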
Second, to avoid the expensive finetuning phase of training-based quantization
methods, a post-training quantization (PTQ) algorithm is developed, which requires
neither a large-scale dataset nor a long training stage. The method is based on a layer-wise
optimization strategy and uses ADMM routines to solve each layer-wise problem
efficiently. Extensive experiments have been carried out on both segmentation and registration
tasks, and the results show its advantage over state-of-the-art alternatives.
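
The abstract does not state the layer-wise objective, so the following is only a sketch of one common way to set up such a problem, not the thesis's algorithm: choose quantized weights `G` so that the layer output `X @ G` stays close to the full-precision output `X @ W_fp` on a small calibration batch, and alternate the standard ADMM updates (a least-squares step, a projection onto the quantization grid, and a dual update). All names, the uniform grid, and the defaults for `rho` and `iters` are assumptions made for illustration.

```python
import numpy as np

def project_to_grid(w, bits, w_max):
    """Nearest point on a symmetric uniform grid with 2**bits levels."""
    levels = 2 ** bits - 1
    step = 2.0 * w_max / levels
    return np.round((np.clip(w, -w_max, w_max) + w_max) / step) * step - w_max

def admm_quantize_layer(X, W_fp, bits=2, rho=1.0, iters=50):
    """Heuristic ADMM for  min_G 0.5*||X W - X W_fp||^2  s.t.  W = G, G on grid.

    X    : (n_samples, n_in) calibration inputs to the layer
    W_fp : (n_in, n_out) full-precision weights
    """
    Y = X @ W_fp                                  # full-precision layer output
    w_max = np.abs(W_fp).max()
    G = project_to_grid(W_fp, bits, w_max)        # start from naive rounding
    U = np.zeros_like(W_fp)                       # scaled dual variable
    A = X.T @ X + rho * np.eye(X.shape[1])        # fixed across iterations
    XtY = X.T @ Y
    for _ in range(iters):
        W = np.linalg.solve(A, XtY + rho * (G - U))   # least-squares step
        G = project_to_grid(W + U, bits, w_max)       # projection onto the grid
        U = U + W - G                                 # dual update
    return G
```

Because the grid constraint is non-convex, ADMM acts here as a heuristic without a convergence guarantee.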
Third, the proposed ADMM-based PTQ framework is further augmented by two novel
components, i.e., a locally adaptive optimization heuristic to improve the convergence of
ADMM and a self-adaptive attention mechanism to combat the class imbalance issue in
lesion segmentation. Comparisons with existing methods and ablation studies are conducted
to verify the superiority of the novel design.
Finally, the GPU implementation of low-precision CNN models is introduced, based
on which an actual runtime analysis is presented on both individual convolutional layers
and practical medical image segmentation models. A further study on the detailed
time consumption of different model components is also performed, which verifies the
theoretical complexity analysis in this thesis, and also reveals several future directions to
improve the model acceleration performance.