Benjamin Cassell, PhD candidate
David R. Cheriton School of Computer Science
Many content delivery services use key components such as web servers, databases, and key-value stores to serve content over the Internet. This content includes web pages, streaming video and audio, pictures, games, personal data, social networking content, and software. Today's content delivery services face challenges distinct from those of the past. The first challenge is scale: content is consumed at an unprecedented and growing rate, and much of that content is increasing in size. For example, it is now common for videos and photos to be delivered in Ultra High Definition (UHD) or with High Dynamic Range (HDR), resulting in large amounts of data being transferred to a growing number of consumers. This scale drives the need for efficient content delivery, because the physical and virtual machines required to serve content are expensive.
Another challenge faced by modern content delivery systems is increased resource demand and contention. Services that run in cloud environments, for example, must share physical resources with co-located applications. Systems must also cope with the resource consumption that accompanies growing scale and content sizes. Furthermore, other modern features consume additional resources: content encryption, for example, is increasingly ubiquitous even for large content such as streaming video, and consumes large amounts of CPU time.
Existing systems have difficulty adapting to these challenges while still performing efficiently. For instance, many systems designed to work well with small data items struggle to service large numbers of concurrent requests for large data, as is the case for web servers that stream video. Our main goal is to demonstrate how software can be augmented or replaced to improve the performance and hardware efficiency of targeted components of modern content delivery services.
We first introduce Libception, a system designed to help improve disk throughput for applications that process numerous concurrent disk requests for large content. By using serialization and aggressive prefetching, Libception improves the throughput of the Apache and nginx web servers by a factor of 2 on FreeBSD and 2.5 on Linux when serving HTTP streaming video content. Notably, this improvement is achieved without changing the source code of either web server. We additionally show that Libception's benefits translate into performance gains for other workloads, reducing the runtime of a diff-based microbenchmark by 50% (again without modifying the application's source code).
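The core idea behind serialization with aggressive prefetching can be illustrated with a small sketch. The class below is a hypothetical simplification, not Libception's actual implementation: a single lock serializes disk access across concurrent readers, and each serialized pass reads several chunks ahead so that the disk sees large sequential requests instead of interleaved random ones. The names `SerializedPrefetchReader`, `CHUNK`, and `PREFETCH_DEPTH` are illustrative assumptions.

```python
import io
import threading

CHUNK = 64 * 1024          # application read size (assumed)
PREFETCH_DEPTH = 4         # chunks to read ahead per disk pass (assumed)

class SerializedPrefetchReader:
    """Serializes reads across threads and prefetches ahead of each
    reader, so concurrent large sequential streams do not degrade
    into random I/O at the disk."""

    def __init__(self, fileobj):
        self._f = fileobj
        self._lock = threading.Lock()   # one disk pass at a time
        self._cache = {}                # offset -> prefetched chunk

    def read_at(self, offset):
        """Return CHUNK bytes at `offset`, prefetching subsequent chunks."""
        with self._lock:
            if offset not in self._cache:
                # Fetch the requested chunk plus PREFETCH_DEPTH more in one
                # serialized pass, keeping the disk access pattern sequential.
                self._f.seek(offset)
                for i in range(PREFETCH_DEPTH + 1):
                    self._cache[offset + i * CHUNK] = self._f.read(CHUNK)
            return self._cache.pop(offset)
```

Because the prefetched chunks satisfy a reader's subsequent requests from memory, each stream only occasionally holds the lock for a disk pass, which is what allows many concurrent streams to share the disk efficiently.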
We next implement Nessie, a distributed, RDMA-based, in-memory key-value store whose unique protocol allows inter-server operations to complete without consuming any CPU resources besides those of the initiating server. Nessie's design is intended to improve performance for systems in environments where CPU resources are shared (such as cloud environments), for systems that perform in-memory distribution of large data, and for systems that experience frequent periods of non-peak load during which energy could be conserved. We find that Nessie improves throughput by 70% versus other approaches when storing large values in write-oriented workloads. Nessie also doubles throughput versus other approaches when CPU contention is introduced. Finally, Nessie provides 41% power savings (relative to idle power consumption) versus other approaches when system load is at 20% of peak throughput.
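The property that remote operations consume no remote CPU comes from relying on one-sided RDMA verbs, which read and write a remote node's memory directly over the network. The sketch below is a conceptual illustration under assumed names, not Nessie's actual protocol: remote memory is simulated with plain Python lists, a GET is two simulated one-sided reads (index entry, then data item), and hash collisions and concurrency control are omitted.

```python
# Hypothetical sketch of a one-sided GET over a decoupled index/data
# layout. In a real RDMA deployment, rdma_read_index and rdma_read_data
# would be one-sided RDMA READ verbs that bypass the remote node's CPU.

NUM_SLOTS = 64

class RemoteNode:
    """Simulated remote memory regions: an index table and a data extent."""
    def __init__(self):
        self.index = [None] * NUM_SLOTS   # slot -> data offset, or None
        self.data = []                    # append-only (key, value) items

def rdma_read_index(node, slot):
    # Stand-in for a one-sided RDMA READ of one index entry.
    return node.index[slot]

def rdma_read_data(node, offset):
    # Stand-in for a one-sided RDMA READ of one data item.
    return node.data[offset]

def put(node, key, value):
    # A real protocol would use one-sided RDMA WRITE plus atomic
    # compare-and-swap on the index entry; here we update memory directly.
    node.data.append((key, value))
    node.index[hash(key) % NUM_SLOTS] = len(node.data) - 1

def get(node, key):
    """GET = two one-sided reads: index entry, then data item."""
    offset = rdma_read_index(node, hash(key) % NUM_SLOTS)
    if offset is None:
        return None
    stored_key, value = rdma_read_data(node, offset)
    return value if stored_key == key else None
```

Because the initiating server performs both reads itself, the node that stores the data spends no CPU cycles serving the request, which is what makes this style of design attractive when CPUs are contended or idle nodes could sleep.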
Finally, we build and evaluate RocketStreams, a framework that facilitates the creation of applications that disseminate and deliver live streaming video. Our framework exposes an easy-to-use API which provides applications with access to high-performance live streaming video dissemination, eliminating the need to implement complicated data management and networking code. RocketStreams' TCP-based dissemination compares favourably to industry-grade alternatives, reducing CPU utilization on delivery nodes by 54% and increasing viewer throughput by 27% versus the Redis data store. Additionally, when RDMA-enabled hardware is available, RocketStreams provides RDMA-based dissemination which further increases overall performance, decreasing CPU utilization by 95% and increasing concurrent viewer throughput by 55% versus Redis.
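The shape of such a dissemination API can be sketched as a publish/subscribe channel. The code below is an illustrative assumption about the programming model (the class and method names are invented, not RocketStreams' API): a source publishes each encoded video segment once, and the framework fans it out to every subscribed delivery node, here simulated with local queues rather than TCP or RDMA transports.

```python
import queue

class StreamChannel:
    """Hypothetical dissemination channel: a publisher pushes encoded
    video segments once, and the channel fans them out to every
    subscribed delivery node."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self):
        """Register a delivery node; returns its segment queue."""
        q = queue.Queue()
        self._subscribers.append(q)
        return q

    def publish(self, segment):
        # Fan out one segment to all delivery nodes; a real framework
        # would move the bytes over TCP or RDMA instead of local queues.
        for q in self._subscribers:
            q.put(segment)
```

Hiding the fan-out and transport behind a small API like this is what lets applications gain high-performance dissemination without writing their own data management and networking code.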