Seminar • Networks and Distributed Systems: Scalable Replay-Based Replication for Fast OLTP Databases | Cheriton School of Computer Science

Ashvin Goel, Associate Professor
Electrical and Computer Engineering and Computer Science, University of Toronto

Databases commonly use primary-backup replication schemes for fault tolerance and disaster recovery. These schemes raise significant challenges for modern in-memory databases, which sustain high transaction rates and generate recovery logs at close to memory bandwidth. It is hard to replay the recovery log scalably on the backup, which makes the backup a bottleneck. Moreover, transferring the log can saturate the network. Both of these bottlenecks can significantly slow the primary database.

This work proposes addressing these problems by using record-replay to replicate fast databases. Our design enables replay to be performed scalably and concurrently, allowing backup performance to scale with the primary. At the same time, our approach requires only 15-20% of the network bandwidth required by traditional logging, significantly reducing network infrastructure costs.

Location Information

Location Address: DC - William G. Davis Computer Research Centre
200 University Avenue West
Room 1304
Waterloo, ON, CA N2L 3G1