THESIS
2017
xxiv, 280 pages : illustrations ; 30 cm
Abstract
By making computing available to users in the same way as utilities such as water, electricity, and gas, cloud computing has defined a new computational model that is revolutionizing the IT industry. As a natural consequence, and spurred by competition among cloud service providers (CSPs), data center deployments have increased dramatically. Today, most of the data circulating on the Internet is believed to originate from a data center somewhere in the world.
The archetypal public data center consists of tens of thousands of servers interconnected by a high-speed network built from thousands of commodity switches. These servers run many applications that serve huge numbers of users simultaneously. As a consequence, data-center-hosted applications often adopt a multi-tiered model in which several services running on distributed servers cooperate to satisfy a single client request. In such a model, performance depends greatly on the ability of the communication network to provide efficient and timely data transfers.
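This dependence on the network can be made concrete with a small, hedged sketch (not from the thesis; the function name, latencies, and probabilities are illustrative assumptions): in a partition/aggregate pattern, a request finishes only when the slowest of its fanned-out sub-requests returns, so even rare per-server network hiccups dominate end-to-end latency as the fan-out grows.

```python
import random

def fanout_latency(n_workers, fast_ms=0.1, p_slow=0.01, slow_ms=200.0):
    """One client request fans out to n_workers back-end servers and
    completes only when the slowest response arrives (partition/aggregate).
    Each worker independently suffers a rare slow response (e.g. a
    retransmission timeout) with probability p_slow."""
    return max(slow_ms if random.random() < p_slow else fast_ms
               for _ in range(n_workers))

random.seed(1)
one = sum(fanout_latency(1) > 1 for _ in range(1000))      # hiccups are rare
hundred = sum(fanout_latency(100) > 1 for _ in range(1000))  # hiccups dominate
```

With a 1% per-worker chance of a slow response, a 100-way fan-out hits at least one slow worker on roughly 1 - 0.99^100 ≈ 63% of requests, which is why tail behavior of the network, not its average, governs multi-tier application performance.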
After introducing the background of the architectural design of data center networks and examining how these new operating environments affect network application performance, we explore the causes of performance degradation in such high-throughput, low-latency data center networks (DCNs). We then focus on presenting a collection of mechanisms we have designed to deal with network congestion in DCNs. In particular, because in public data centers virtual machines and their protocol stacks are under the control of the cloud tenants, we focus primarily on new designs that avoid modifying the protocol stack and its underlying congestion control algorithm.
From this perspective, by conducting simulation experiments and analyzing real network traffic traces, we first identify the reasons behind the problems TCP faces in DCNs: namely, unfairness, incast congestion, short TCP loss cycles, and inflated flow completion times for short-lived flows due to TCP's inadequate retransmission timeout (RTO). We then propose a collection of traffic control schemes that address each of these problems without requiring any modification to the TCP stack.
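To see why the standard retransmission timeout is inadequate in DCNs, consider a sketch of the RFC 6298 RTO computation (the function and its sample values are illustrative, not the thesis's code). Intra-data-center RTTs are typically on the order of 100 microseconds, yet common stacks floor the RTO at around 200 ms (e.g., Linux's default minimum), so a single lost packet can stall a short flow for thousands of RTTs:

```python
def rto_rfc6298(rtt_samples, min_rto=0.2, k=4, alpha=1/8, beta=1/4):
    """Compute the retransmission timeout (seconds) from RTT samples,
    following the smoothing rules of RFC 6298, with a minimum-RTO floor."""
    srtt, rttvar = None, None
    for r in rtt_samples:
        if srtt is None:
            # First measurement initializes the estimators.
            srtt, rttvar = r, r / 2
        else:
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
            srtt = (1 - alpha) * srtt + alpha * r
    return max(min_rto, srtt + k * rttvar)

# Ten stable 100-microsecond RTT samples still yield the 200 ms floor:
print(rto_rfc6298([100e-6] * 10))  # -> 0.2 (floored at min_rto)
```

The computed srtt + 4 * rttvar term is under a millisecond here, so the floor dominates: a timeout costs roughly 2000 RTTs, which inflates the completion time of any short flow that loses its last packets (e.g., in an incast burst).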
In particular, we propose: i) switch-based mechanisms, namely the RWNDQ, IQM, and HSCC schemes, to address inter-TCP-flow unfairness, TCP incast congestion, and short TCP loss cycles, respectively; ii) hypervisor-based schemes, namely T-RACKs and HyGenICC, to handle the inadequate TCP RTO and inter-transport-protocol unfairness, respectively; and iii) SDN-based approaches, viz., SICC and SDN-GCC, which provide SDN alternatives to the switch-based IQM and the hypervisor-based HyGenICC, respectively.
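These designs share one principle: steering sender behavior from inside the network without touching the tenant's TCP stack. As a hedged illustration of that principle only (this is not the thesis's RWNDQ, IQM, or HSCC logic; the function name and cap value are assumptions), a switch or hypervisor on the ACK path can clamp the receive window a TCP header advertises, throttling the sender transparently, provided it patches the checksum incrementally per RFC 1624:

```python
import struct

def clamp_rwnd(tcp_header: bytes, max_rwnd: int) -> bytes:
    """Return a copy of a raw TCP header whose advertised receive window
    is clamped to max_rwnd, with the checksum patched incrementally
    (RFC 1624: HC' = ~(~HC + ~m + m')). Window scaling is ignored here."""
    old_win, = struct.unpack_from('!H', tcp_header, 14)  # window at bytes 14-15
    if old_win <= max_rwnd:
        return tcp_header
    hdr = bytearray(tcp_header)
    struct.pack_into('!H', hdr, 14, max_rwnd)
    old_csum, = struct.unpack_from('!H', hdr, 16)        # checksum at bytes 16-17
    csum = (~old_csum & 0xFFFF) + (~old_win & 0xFFFF) + max_rwnd
    csum = (csum & 0xFFFF) + (csum >> 16)                # fold carries
    csum = (csum & 0xFFFF) + (csum >> 16)
    struct.pack_into('!H', hdr, 16, ~csum & 0xFFFF)
    return bytes(hdr)
```

Because the sender's stack simply obeys the (rewritten) advertised window, the throttling works with any unmodified tenant TCP implementation, which is the stack-transparency property the schemes above are built around.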
The proposed schemes are shown to achieve considerable performance gains for cloud applications through mathematical modeling, empirical analysis, simulation, and real-testbed experiments over various network scenarios and topologies. The thesis is supplemented with prototypes of the switch-based schemes on the NetFPGA platform.