Please note: This PhD defence will be given online.
Cong Guo, PhD candidate
David R. Cheriton School of Computer Science
Consolidating multiple workloads on the same physical machine is an effective way to use resources efficiently and reduce costs. The main objective is to execute multiple demanding workloads using no more resources than necessary while simultaneously maximizing performance. Conventional work-conserving resource managers are designed for this purpose. However, without adequate control, the performance of consolidated workloads may degrade dramatically or become unpredictable because of contention for shared resources. Hence, resource isolation should be enforced according to a sharing policy when there is resource contention among workloads, i.e., each workload should obtain its theoretical share of resources. In reality, it is challenging for state-of-the-art resource managers to achieve both resource isolation and work conservation simultaneously because workloads are complex and dynamic.
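To make the two goals concrete, the sketch below computes a weighted max-min fair split of CPU capacity by progressive filling, a textbook technique and not the mechanism developed in the thesis. It assumes each workload's demand is known in advance, which is exactly what is hard to obtain in practice. The function name, the weights and the demand figures are hypothetical.

```python
def work_conserving_shares(capacity, weights, demands):
    """Weighted max-min fair allocation (progressive filling).

    Isolation: under contention, each workload gets at least its weighted share.
    Work conservation: capacity a workload does not need is given to the others.
    All names and inputs here are illustrative assumptions.
    """
    alloc = {name: 0.0 for name in weights}
    active = {name for name in weights if demands[name] > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_weight = sum(weights[n] for n in active)
        # Offer every still-hungry workload its weighted slice of what is left.
        offers = {n: remaining * weights[n] / total_weight for n in active}
        # Workloads whose outstanding demand fits inside their offer are done.
        satisfied = {n for n in active if demands[n] - alloc[n] <= offers[n]}

        if not satisfied:
            # Everyone wants more than offered: hand out the offers and stop.
            for n in active:
                alloc[n] += offers[n]
            remaining = 0.0
            break

        for n in satisfied:
            remaining -= demands[n] - alloc[n]
            alloc[n] = float(demands[n])
        active -= satisfied

    return alloc


if __name__ == "__main__":
    # Equal weights on a 4-core machine; "a" only needs one core.
    print(work_conserving_shares(4, {"a": 1, "b": 1}, {"a": 1, "b": 10}))
```

In this example workload "a" is capped at its demand of one core and the unused core from its share goes to "b", so no capacity sits idle while "b" still receives at least its equal share.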
This thesis proposes adaptive resource allocation to address this sharing problem and studies CPU management as an example. A novel feedback-based resource manager is designed to perform adaptive allocation of CPU resources, taking into account each workload’s requirements. First, an application-agnostic metric is proposed as the feedback signal, which can be used to measure the performance change of various applications in a non-invasive and timely way. Second, two alternative feedback-based algorithms are designed to search for the optimal resource allocation for each workload. The adaptive allocation is modelled as a dynamic optimization problem. The algorithms solve this problem by assessing performance changes in response to a change in resource allocation. The algorithms are demonstrated to be capable of handling complex and dynamic workloads. The resource manager proposed in this thesis uses these algorithms to determine the CPU allocation for multiple tenants. A prototype is implemented with four different sharing policies. For three common policies, the experimental evaluation confirms that the resource manager can achieve resource isolation and work conservation simultaneously, while the existing best-practice mechanisms cannot. Moreover, the resource manager can support a novel efficiency policy, which determines CPU sharing based on the overall system efficiency. In addition, a preliminary study shows that the feedback-based methodology for CPU management can be extended to control I/O bandwidth.
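The feedback idea can be illustrated with a minimal sketch. The abstract does not specify the metric or either of the two algorithms, so the code below is not the thesis's method: it is a plain step-wise search that changes the CPU allocation, observes the resulting performance signal, and keeps a change only when the signal justifies it. The `measure` callback, the step size, the tolerance and the saturating toy workload are all hypothetical, and the signal is assumed to be positive.

```python
def adapt_allocation(measure, alloc, step, lo, hi, tolerance=0.02, rounds=100):
    """Search for the smallest allocation that does not hurt performance.

    measure(alloc) is an assumed callback returning a positive performance
    signal (higher is better) for a given CPU allocation.
    """
    perf = measure(alloc)
    for _ in range(rounds):
        # First try to give resources back: keep the cut if performance
        # stays within the tolerance of the current level.
        lower = max(lo, alloc - step)
        if lower < alloc:
            lower_perf = measure(lower)
            if lower_perf >= perf * (1 - tolerance):
                alloc, perf = lower, lower_perf
                continue

        # Otherwise try to add resources: keep the increase only if the
        # performance gain exceeds the tolerance.
        higher = min(hi, alloc + step)
        if higher > alloc:
            higher_perf = measure(higher)
            if higher_perf > perf * (1 + tolerance):
                alloc, perf = higher, higher_perf
                continue

        # Neither direction helps: treat the current allocation as settled.
        break
    return alloc


if __name__ == "__main__":
    # Toy workload (assumption): throughput grows with CPU up to 2 cores.
    def toy_workload(cores):
        return min(cores, 2.0)

    print(adapt_allocation(toy_workload, alloc=4.0, step=0.5, lo=0.5, hi=4.0))
    print(adapt_allocation(toy_workload, alloc=0.5, step=0.5, lo=0.5, hi=4.0))
```

With this toy workload both runs settle at 2.0 cores, whether the search starts over-provisioned or under-provisioned. A real manager would have to repeat the search as the workload changes and cope with a noisy signal, which this sketch ignores.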