Large-scale peer-assisted content distribution systems within the “cloud” of the Internet provide valuable services to a large population of end users, ranging from file sharing and live video streaming to video-on-demand (VoD). This area has received great attention from both academia and industry. Because users not only download data but also upload it to one another, such peer-assisted systems are easy to deploy and scale well. However, owing to the highly dynamic nature of distributed peers with heterogeneous capacities and diverse behaviors, several fundamental challenges remain in large-scale peer-assisted content distribution and video streaming systems: the cost-performance trade-off in peer-assisted online hosting and distribution, the flash crowd problem in P2P live streaming, and the service quality of peer-assisted VoD. This thesis seeks to address these challenges through both mathematical modeling and analysis and practical system design and measurement, in order to bridge theory and practice.
First, to guarantee adequate levels of service quality while avoiding prohibitive server costs, we explore the design space of online hosting and distribution systems that integrate peer bandwidth contributions with strategic server resource provisioning in a complementary and transparent manner. Specifically, we first model and analyze new strategies to allocate scarce server resources (both storage space and bandwidth) in peer-assisted online hosting systems. The objective is to make the best use of limited server storage and bandwidth to guarantee adequate levels of file availability and downloading performance, while taking full advantage of peer assistance. We identify a number of unique challenges involved in such systems, and propose resource allocation protocols to address them. Based on the guidelines derived from our analysis, we design and measure FS2You, a large-scale, real-world online hosting system with peer bandwidth assistance and semi-persistent file availability. FS2You is designed to dramatically mitigate server bandwidth costs. We present our architectural and protocol design, as well as an extensive large-scale measurement study that demonstrates the effectiveness of our design, using real-world traces that we have collected. To our knowledge, our work represents the first attempt to design, implement, and evaluate a new peer-assisted semi-persistent online hosting system at a realistic scale. Since its launch, FS2You has quickly become one of the most popular online hosting systems in mainland China, and a favorite in many online forums across the country.
Second, it is evident from our experience with real-world P2P live streaming systems that it is not uncommon to have hundreds of thousands of users trying to join a program during the first few minutes of a live broadcast. This phenomenon, unique to live streaming systems and referred to as a flash crowd, poses significant challenges to system scalability and user experience. We have developed a mathematical model to capture and understand the inherent relationship between time and scale in P2P streaming systems experiencing a flash crowd. Specifically, we show that there is a fundamental upper bound on the system scale with respect to a time constraint. In addition, our analysis yields an in-depth understanding of the effects of the gossip protocol and peer churn. To our knowledge, our work represents the first attempt to provide an analytical characterization and understanding of the inherent scale-time relationship in P2P streaming systems, with a particular focus on flash crowds and various other critical factors.
Third, due to the lack of a theoretical foundation and of new storage and transmission mechanisms, the service quality provided by current peer-assisted VoD systems, including the video streaming bit rates and the startup and seek latencies, is still far from optimal. In practice, we design, implement, fine-tune, and measure Novasky, a real-world VoD system capable of delivering cinematic-quality video streams to end users. The foundation of the Novasky design is a P2P storage cloud that stores and refreshes media streams in a decentralized fashion using the local storage space of end users. Unlike existing peer-assisted VoD systems, it features a new peer storage and replacement algorithm using Reed-Solomon codes and an adaptive server push-to-peer strategy. Novasky has been deployed in the Tsinghua University campus network and operational since September 2009, attracting 10,000 users to date and providing over 1,000 DVD or 720p video streams with bit rates of 1–2 Mbps. Based on real-world traces collected over 6 months, we show that Novasky can achieve rapid startups within 4–9 seconds, and extremely short seek latencies within 3 seconds.
Furthermore, we develop a theoretical framework based on queueing models, in order to (1) justify the superiority of service prioritization based on a taxonomy of requests, and (2) understand the fundamental principles behind optimal caching and prefetching designs in peer-assisted VoD systems. The focus is on how limited uploading bandwidth and peer caching capacity can be utilized most efficiently to achieve better system performance. Specifically, we first use priority queueing analysis to show how service quality and user experience can be statistically guaranteed by prioritizing requests in order of significance: urgent playback (e.g., random seeks or initial startup), normal playback, and prefetching. We then construct a fine-grained stochastic supply-demand model to investigate peer caching and prefetching as a global optimization problem. This not only provides insights into the fundamental characterization of demand, but also offers guidelines towards optimal caching and prefetching strategies in peer-assisted VoD systems.
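As a rough illustration of the request prioritization described above, the following minimal Python sketch serves pending requests strictly in the order urgent playback, normal playback, prefetching, and first-come-first-served within each class. It is an illustrative sketch only, not the scheduler analyzed in the thesis: the names `RequestClass`, `UploadScheduler`, `submit`, and `serve_next`, as well as the chunk identifiers, are hypothetical, and a strict non-preemptive priority discipline is an assumption.

```python
import heapq
import itertools
from enum import IntEnum


class RequestClass(IntEnum):
    """Hypothetical request taxonomy; a lower value is served first."""
    URGENT = 0    # urgent playback: random seeks or initial startup
    NORMAL = 1    # normal playback
    PREFETCH = 2  # prefetching


class UploadScheduler:
    """Strict-priority queue of chunk requests (illustrative assumption)."""

    def __init__(self):
        self._heap = []
        # Arrival counter breaks ties so requests of one class stay FIFO.
        self._arrival = itertools.count()

    def submit(self, request_class, chunk_id):
        """Enqueue a request for a chunk under the given class."""
        heapq.heappush(self._heap, (request_class, next(self._arrival), chunk_id))

    def serve_next(self):
        """Return the (class, chunk_id) to serve next, or None if idle."""
        if not self._heap:
            return None
        request_class, _, chunk_id = heapq.heappop(self._heap)
        return request_class, chunk_id


scheduler = UploadScheduler()
scheduler.submit(RequestClass.PREFETCH, "chunk-40")
scheduler.submit(RequestClass.NORMAL, "chunk-12")
scheduler.submit(RequestClass.URGENT, "chunk-97")   # e.g. a user seeks
scheduler.submit(RequestClass.NORMAL, "chunk-13")

served = []
while (item := scheduler.serve_next()) is not None:
    served.append(item[1])
print(served)
```

With this discipline, the seek request submitted third is served first, and the prefetch request waits until no playback request is pending.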