THESIS
2016
xiii, 134 pages : illustrations ; 30 cm
Abstract
Recent years have witnessed a tremendous increase in the popularity of cloud computing
services to support business, communication, online customer service and help make life more
productive and efficient. Naturally, this has been accompanied by a constant expansion of
data center scale and global geographical outreach, resulting in a dramatic growth of the
energy consumed to power such data centers. Several studies have shown that the energy
consumed today by data centers in the US alone is roughly equivalent to the annual output
of 34 large (500-megawatt) coal-fired power plants. Moreover, this consumption is forecast to
double in less than 10 years. This not only costs data center providers billions of dollars
in energy bills, but also generates hundreds of millions of tonnes of carbon pollution per
year. Energy consumption in data centers comes from several sources: i) computing and
networking equipment, which takes the lion's share; ii) cooling equipment; and iii) power draw
and other ancillary equipment. Any reduction of this consumption is considered so valuable
that, to cut cooling costs, heavy data center users and providers such as Facebook and Google
have built data centers in areas as far-flung as the Arctic Circle, while others, like Microsoft,
are considering undersea data centers. In this thesis, we consider several important problems
of resource allocation in data centers while optimizing the energy consumed by computing and
networking equipment.
The thesis consists of three parts. The first falls within the area of the so-called platform-as-a-service (PaaS) cloud service model, and deals with job scheduling in the MapReduce
massive-data parallel-processing framework. In this part, we consider energy efficiency as a
by-product of minimizing the makespan of jobs. More specifically, we first propose a new
scheduling algorithm called Multiple Queue Scheduler (MQS) to improve the data locality
rate of map tasks as a means to curb the costly data migration delays.
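As a point of reference, the toy routine below shows the kind of data-locality check such a scheduler relies on: prefer a map task whose input block already resides on the requesting node, falling back to a remote task only when no local one is pending. It is a minimal Python sketch; the task structure, field names, and fallback rule are illustrative assumptions, not the MQS data structures.

# Illustrative sketch only: the dictionaries and field names below are assumed
# for illustration; MQS itself is implemented inside Hadoop.
def pick_map_task(node: str, pending_tasks: list) -> dict | None:
    """Prefer a map task whose input split is stored on this node, so the task
    reads its data locally instead of migrating it over the network."""
    for task in pending_tasks:
        if node in task["replica_nodes"]:        # data-local task found
            return task
    return pending_tasks[0] if pending_tasks else None   # fall back to a remote task

# Example: the input block of job j1 is replicated on nodes n1 and n3.
tasks = [{"job": "j1", "split": "block_07", "replica_nodes": {"n1", "n3"}}]
print(pick_map_task("n3", tasks)["job"])         # -> j1, scheduled data-locally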
Thereafter, to take into account the intricate details of the MapReduce framework, such as the
early shuffle problem, we propose the Dynamic Priority Multiple Queue Scheduler (DPMQS), which
dynamically increases the priority of jobs that are close to completing their Map phase to speed
up the start of the reduce phase, thus further reducing the expected job holding time and the makespan.
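For intuition, the following is a minimal Python sketch of a dynamic-priority, multi-queue selection step in the spirit of DPMQS: a job's priority is boosted once most of its map tasks have finished, so its reduce phase can start sooner. The threshold, boost value, and data structures are assumptions made for illustration, not the thesis's Hadoop implementation.

# Illustrative sketch only: threshold, boost, and classes below are assumptions.
from dataclasses import dataclass

NEAR_MAP_COMPLETION = 0.9    # assumed fraction of map tasks counted as "close to done"
PRIORITY_BOOST = 100.0       # assumed boost applied to nearly finished jobs

@dataclass
class Job:
    name: str
    base_priority: float
    maps_done: int
    maps_total: int

def effective_priority(job: Job) -> float:
    """Dynamically raise the priority of jobs whose Map phase is almost complete."""
    progress = job.maps_done / max(job.maps_total, 1)
    return job.base_priority + (PRIORITY_BOOST if progress >= NEAR_MAP_COMPLETION else 0.0)

def pick_next(queues: list) -> Job | None:
    """Select the job with the highest effective priority across all queues."""
    jobs = [job for queue in queues for job in queue]
    return max(jobs, default=None, key=effective_priority)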
We implemented both algorithms in Hadoop and compared their performance to that of other existing
algorithms. The second part falls within the realm of infrastructure-as-a-service (IaaS)
and deals with energy-efficient virtual machine (VM) scheduling in data centers. We notably
formulate the minimum energy VM scheduling problem as a non-convex optimization problem,
prove its NP-hardness, then propose two greedy approximation algorithms, the minimum
energy VM scheduling algorithm (MinES) and the minimum communication VM scheduling algorithm
(MinCS), to reduce energy consumption while satisfying the tenants' service level agreements.
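To illustrate the greedy placement idea, the sketch below assigns VMs to as few powered-on servers as possible, using the number of active servers as a crude proxy for energy; the precise objective, SLA constraints, and the MinES/MinCS algorithms are those defined in the thesis, and every name and heuristic here is an assumption made for illustration.

# Illustrative sketch only: a generic best-fit heuristic, not MinES or MinCS.
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    cpu_capacity: float
    cpu_used: float = 0.0
    vms: list = field(default_factory=list)

    def fits(self, demand: float) -> bool:
        return self.cpu_used + demand <= self.cpu_capacity

def greedy_min_energy_placement(vm_demands: dict, servers: list) -> dict:
    """Place each VM on an already-active server when possible, powering on an
    idle server only as a last resort (fewer active servers ~ less energy)."""
    placement = {}
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        active = [s for s in servers if s.vms and s.fits(demand)]
        idle = [s for s in servers if not s.vms and s.fits(demand)]
        # Best fit among active servers: the one with the least capacity left over.
        target = min(active, key=lambda s: s.cpu_capacity - s.cpu_used, default=None)
        if target is None and idle:
            target = idle[0]
        if target is None:
            raise ValueError(f"no server can host VM {vm}")
        target.cpu_used += demand
        target.vms.append(vm)
        placement[vm] = target.name
    return placement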
Finally, in the third part, under the IaaS service model, we explore the potential of cloud
providers supporting services with more intricate network topologies than
currently practised. In particular, we consider the problem of embedding virtual clusters
specified by the tenants into a data center in an energy-efficient manner. We carefully provide
a mathematical optimization model of this problem, prove its NP-hardness, then propose an
approximation algorithm, the so-called minimum energy virtual cluster embedding (MinE-VCE),
to solve the problem. We tested all the algorithms proposed in the latter two parts using
real data traces as well as synthetic workloads to demonstrate their performance.
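For intuition only, the toy sketch below embeds one virtual cluster by filling already powered-on racks first, so that both the number of active racks and the amount of cross-rack traffic stay small; the virtual-cluster request model and the MinE-VCE algorithm in the thesis are richer than this, and the rack-level heuristic and names here are assumptions made for illustration.

# Illustrative sketch only: a rack-packing heuristic, not the thesis's MinE-VCE.
from dataclasses import dataclass

@dataclass
class Rack:
    name: str
    free_slots: int          # VM slots still available in this rack
    powered_on: bool = False

def embed_virtual_cluster(n_vms: int, racks: list) -> list:
    """Assign the VMs of one virtual cluster rack by rack, preferring racks that
    are already powered on and filling each chosen rack before opening a new one."""
    order = sorted(racks, key=lambda r: (not r.powered_on, -r.free_slots))
    assignment, remaining = [], n_vms
    for rack in order:
        if remaining == 0:
            break
        take = min(rack.free_slots, remaining)
        if take > 0:
            rack.free_slots -= take
            rack.powered_on = True
            assignment.extend([rack.name] * take)
            remaining -= take
    if remaining > 0:
        raise ValueError("not enough capacity to embed the virtual cluster")
    return assignment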