THESIS
2022
1 online resource (xv, 130 pages) : illustrations (some color)
Abstract
Witnessing the soaring demand for computation over the past decade, tech companies have been amassing numerous commodity machines to serve requests from massive user bases. Such large-scale multi-tenant clusters, with optimized resource scheduling, have the potential to be highly efficient. In practice, however, it is challenging to achieve both high performance and low cost. Given heterogeneous hardware and diverse workloads, many schedulers either suffer from low resource utilization, which increases cost, or cause heavy workload contention, which degrades performance.
In this dissertation, starting with a characterization study of a production cluster, we present the challenges posed to resource scheduling: low resource utilization, hard-to-schedule tasks demanding high-end GPUs, imbalanced load across machines, and severe contention on CPU resources. Packing and balancing are the two major approaches to tackling these issues. Bin-packing consolidates workloads onto fewer servers, accommodating demanding tasks and improving resource utilization. Load-balancing scatters tasks over the cluster, mitigating contention and boosting workload performance.
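The two approaches can be contrasted as opposite placement policies. The following sketch (not from the dissertation; node capacities and task demands are illustrative) applies each to a toy CPU-only cluster:

```python
def pack(free, demand):
    """Bin-packing (best-fit): pick the feasible node with the LEAST
    remaining capacity after placement, consolidating load."""
    feasible = [n for n in free if free[n] >= demand]
    return min(feasible, key=lambda n: free[n] - demand) if feasible else None

def balance(free, demand):
    """Load-balancing (worst-fit): pick the feasible node with the MOST
    remaining capacity, spreading load to reduce contention."""
    feasible = [n for n in free if free[n] >= demand]
    return max(feasible, key=lambda n: free[n] - demand) if feasible else None

free = {"m1": 8, "m2": 4, "m3": 6}   # free CPU cores per machine (assumed)
print(pack(free, 3))     # m2: tightest fit, keeps m1/m3 open for large tasks
print(balance(free, 3))  # m1: most headroom, minimizes contention
```

The same task thus lands on different machines under the two policies, which is exactly the utilization-versus-contention trade-off described above.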
Following the packing method towards higher utilization, we find resource fragmentation to be a major obstacle, especially in GPU-sharing clusters, where conventional bin-packing is unviable: the scheduling of GPU-sharing tasks that request a partial GPU cannot be modeled as a classic bin-packing problem, due to the discrete and interchangeable nature of GPU resources. We therefore take a new approach towards high utilization by minimizing fragmentation. We quantify the degree of GPU fragmentation statistically and use this metric to guide scheduling. We propose a novel scheduling heuristic called Fragmentation Gradient Descent (FGD), which consistently outperforms a variety of packing-based schedulers and utilizes hundreds of additional GPUs in large-scale cluster emulations driven by production traces.
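The idea can be illustrated with a minimal sketch. The fragmentation measure and task-demand distribution below are assumptions for illustration, not the dissertation's exact formulation: fragmentation is quantified statistically against a demand distribution, and each task is placed where fragmentation grows least.

```python
# Popularity-weighted distribution of per-task GPU demands (assumed).
DEMANDS = [(0.3, 0.5), (0.6, 0.3), (1.0, 0.2)]  # (GPU fraction, probability)

def frag(gpus):
    """Expected free GPU capacity that a randomly sampled task cannot use:
    a per-GPU fragment smaller than the task's demand is wasted for it."""
    return sum(p * sum(f for f in gpus if f < d) for d, p in DEMANDS)

def fgd_place(nodes, demand):
    """Place `demand` on the GPU whose node sees the smallest fragmentation
    increase, i.e. follow the steepest fragmentation-descent direction."""
    best = None
    for name, gpus in nodes.items():
        for i, free in enumerate(gpus):
            if free >= demand:
                after = gpus[:i] + [free - demand] + gpus[i + 1:]
                delta = frag(after) - frag(gpus)
                if best is None or delta < best[0]:
                    best = (delta, name, i)
    return best and best[1:]

# Each node lists per-GPU free fractions (illustrative).
cluster = {"n1": [1.0, 0.4], "n2": [0.7, 0.7]}
# A 0.6-GPU task goes to n2: consuming n1's whole GPU would strand
# capacity needed by future full-GPU tasks.
print(fgd_place(cluster, 0.6))
```

Note how this differs from best-fit packing, which would greedily fill the fullest node regardless of which demand sizes the leftover fragments can still serve.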
Following the balancing method towards better performance, we study the placement of long-running application (LRA) containers. LRAs, with stringent performance requirements, are difficult to schedule due to their complex resource interference and I/O dependencies. Existing schedulers, which avoid contention by minimizing violations of placement constraints, fall short in performance, as manually expressed constraints provide only qualitative scheduling guidelines. We therefore design Metis, a data-driven scheduling system that learns to optimally place LRA containers using deep reinforcement learning (RL). Metis eliminates the complex manual specification of placement constraints and offers concrete, quantitative scheduling criteria. Enhanced by hierarchical learning techniques, Metis scales to large clusters and substantially increases workload throughput in real deployments on the public cloud.
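To make the learning-to-place idea concrete, here is a deliberately tiny sketch using tabular Q-learning on a toy two-node cluster; it stands in for, and greatly simplifies, the deep-RL formulation described above. The containers, the interfering pair, and the throughput proxy are all assumptions for illustration.

```python
import random

random.seed(0)
CONTAINERS = ["web", "db", "cache"]        # LRAs to place, in order
INTERFERE = {frozenset({"web", "db"})}     # assumed interfering pair
NODES = [0, 1]

def reward(placement):
    """Throughput proxy: 1 per placed container, minus 2 per
    interfering pair co-located on the same node."""
    r = len(placement)
    for a in placement:
        for b in placement:
            if a < b and placement[a] == placement[b] \
                    and frozenset({a, b}) in INTERFERE:
                r -= 2
    return r

Q = {}  # (placement step, node) -> learned value
for episode in range(500):
    placement = {}
    for step, c in enumerate(CONTAINERS):
        if random.random() < 0.2:  # explore a random node
            node = random.choice(NODES)
        else:                      # exploit the learned values
            node = max(NODES, key=lambda n: Q.get((step, n), 0.0))
        placement[c] = node
    r = reward(placement)
    for step, c in enumerate(CONTAINERS):  # update toward observed reward
        key = (step, placement[c])
        Q[key] = Q.get(key, 0.0) + 0.1 * (r - Q.get(key, 0.0))

# Greedy placement from the learned values.
best = {c: max(NODES, key=lambda n: Q.get((s, n), 0.0))
        for s, c in enumerate(CONTAINERS)}
print(best, reward(best))
```

The real system replaces the table with a neural network over rich cluster state and measures reward from actual workload throughput, but the feedback loop (place, observe performance, update the policy) is the same shape.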