THESIS
2020
1 online resource (xiii, 95 pages) : illustrations (some color)
Abstract
Large-scale deep learning tasks usually run on parallel and distributed frameworks such
as TensorFlow and PyTorch, and take hours to days to obtain training results. These
frameworks utilize hardware accelerators, especially GPUs, to speed up the computation.
However, data access and processing in these tasks take a significant amount of time.
Therefore, we propose to accelerate these tasks by improving their dataset storage and
processing. Firstly, we develop DIESEL, a scalable dataset storage and caching system
that runs between a training framework and the underlying distributed file system.
The main features of DIESEL include metadata snapshot, per-task distributed cache,
and chunk-based storage and shuffle. Secondly, we optimize a GPU-assisted image
decoding method for training tasks on image datasets. Furthermore, we introduce
an online region-of-interest (ROI) method to reduce the data movement cost between
computer nodes. Our experiments on real-world training tasks show that (1) DIESEL
halves the data access time and reduces the training time by around 15%-27%, (2) our
optimized image decoding method is 30%-50% faster than existing GPU-accelerated
image decoding libraries, and (3) our online ROI method reduces the data transfer
time from DIESEL’s caching layer to the deep learning framework by around 50%.
Overall, our system outperforms existing systems by a factor of two to three on
the end-to-end running time of deep learning tasks on image datasets.