THESIS
2019
xv, 75 pages : illustrations ; 30 cm
Abstract
In recent years, the link speed of data center networks (DCNs) has increased significantly, from 1 Gbps to 10 Gbps, then to 40/100 Gbps, with 200 Gbps on the horizon. In the era of high-speed DCNs, it is increasingly clear that traditional kernel-based network transports can no longer meet the requirements of modern data center applications, for two main reasons. First, traditional network transports adopt reactive algorithms for congestion control, which are too slow and inefficient at high speed. Second, kernel-based transports incur very high CPU overhead at high speed and thus can hardly deliver low latency and high throughput to applications/services at low cost. Recognizing the drawbacks of traditional network transports, great effort has been made in recent years. However, existing solutions either fail to achieve the desired performance or are difficult to deploy in production environments.
Regarding congestion control for high-speed DCNs, proactive transport has drawn great attention in the community. With proactive transport, link capacities are proactively allocated as "credits" to each sender, which can then send "scheduled packets" at the right rate to ensure zero queueing and high link utilization. Despite being promising, proactive transport faces a fundamental challenge: it requires at least one RTT for the credits to be computed and delivered. In this thesis, we reveal that this one-RTT "pre-credit" phase is crucial, yet none of the prior solutions treats it properly.
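To make the credit mechanism and its one-RTT penalty concrete, the following is a minimal, illustrative sketch of a credit-based proactive sender; it is not the thesis's implementation, and request_credit() and send_scheduled() are hypothetical stand-ins for the receiver/fabric-side credit allocator and the NIC send path.

    # Illustrative sketch only: the helper callbacks below are assumptions,
    # not part of any proactive transport described in the thesis.

    def proactive_send(flow_bytes, mtu, request_credit, send_scheduled):
        """Send a flow using receiver-allocated credits ("scheduled packets")."""
        remaining, pkt_id = flow_bytes, 0
        while remaining > 0:
            # The sender must first ask for credits; during this ~1 RTT
            # "pre-credit" phase no scheduled packet may be sent -- the
            # extra delay that motivates Aeolus.
            credits = request_credit((remaining + mtu - 1) // mtu)
            for _ in range(credits):
                if remaining <= 0:
                    break
                send_scheduled(pkt_id)  # paced at the allocated rate: zero queueing
                pkt_id += 1
                remaining -= mtu

    # Tiny demo with stub callbacks: the allocator grants at most 4 credits per request.
    if __name__ == "__main__":
        proactive_send(
            flow_bytes=9000, mtu=1500,
            request_credit=lambda n: min(n, 4),
            send_scheduled=lambda i: print(f"scheduled packet {i}"),
        )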
Regarding providing desirable network performance at low CPU overhead, public cloud providers such as Microsoft and Google are deploying remote direct memory access (RDMA) over Ethernet (RoCE) in their data centers to enable low-latency, high-throughput data transfers with minimal CPU overhead. RoCE deployments, however, are vulnerable to deadlocks induced by Priority Flow Control (PFC). Once a deadlock forms, the throughput of the whole network, or part of it, drops to zero due to the backpressure effect of PFC pauses.
This thesis describes my research efforts to address the above two challenges. First, we present Aeolus, a solution focusing on "pre-credit" packet transmission that serves as a building block for all proactive transports. With Aeolus, two seemingly contradictory goals are achieved simultaneously: eliminating the one-RTT additional delay of the "pre-credit" phase while preserving all the good properties of proactive solutions. Second, we propose a practical deadlock prevention scheme for RDMA DCNs, called Tagger. By carrying tags in packets and installing pre-generated match-action rules in switches for tag manipulation and buffer management, Tagger guarantees deadlock freedom using only modest buffers, without any changes to routing protocols or switch hardware.
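As a rough intuition for the tag-based approach, the sketch below shows how pre-generated match-action rules might rewrite a packet's tag and select a buffer class at each hop; the rule format, port numbers, and buffer names are assumptions for illustration and do not reproduce Tagger's actual rules.

    # Illustrative sketch: (in_port, in_tag) -> (out_tag, buffer_class).
    # Moving a packet to the next tag/buffer on a hop that could close a
    # cycle breaks any circular buffer dependency, which is the intuition
    # behind tag-based deadlock freedom.
    RULES = {
        (1, 0): (0, "lossless_0"),   # expected hop: keep tag and buffer
        (2, 0): (1, "lossless_1"),   # potentially cyclic hop: bump tag, switch buffer
        (2, 1): (1, "lossless_1"),
    }

    def switch_process(in_port, tag):
        """Apply the pre-installed match-action rule for (in_port, tag)."""
        out_tag, buffer_class = RULES[(in_port, tag)]
        return out_tag, buffer_class

    # Example: a packet arriving on port 2 with tag 0 is re-tagged to 1 and
    # isolated in a separate lossless buffer class, so PFC pauses cannot
    # propagate around a cycle.
    print(switch_process(2, 0))   # -> (1, 'lossless_1')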