
Principles of Distributed Database Systems, Fourth Edition

M. Tamer Özsu
Patrick Valduriez

Springer
ISBN 978-3-030-26252-5
2020


The fourth edition is finally out. It has been ten years since the release of the third edition; it took a while, but we are very happy with the results. Through this site, we will make available presentation slides, solutions to some of the exercises, and a list of (hopefully few) errors. These are accessible through the links on the left. The slides can be downloaded by anyone, and they are also available from the Springer site. However, access to the solutions to exercises is restricted to academics who have adopted the book for a course. We therefore ask you to register and provide some evidence of the course adoption. We also ask you not to put the solutions online in any format or to distribute them to anyone who may then post them online.

The book is available from Springer, Amazon, and Chapters-Indigo (in Canada).

The fourth edition of this classic textbook sees major updates. This edition has completely new chapters on Big Data Platforms (distributed storage systems, MapReduce, Spark, data stream processing, graph analytics) and on NoSQL, NewSQL and polystore systems. It also includes an updated web data management chapter that covers RDF and the semantic web, and a consolidated database integration chapter that focuses on both schema integration and querying over the integrated systems. The peer-to-peer computing chapter has been updated with a discussion of blockchains. The chapters that describe classical distributed and parallel database technology have all been updated.

The new edition covers the breadth and depth of the field from a modern viewpoint. Graduate students, as well as senior undergraduate students studying computer science and related fields, can use this book as their primary textbook. Researchers working in computer science will also find this book useful.

The major changes in the fourth edition are the following:

  1. Over the years, the motivations and the environment for this technology have shifted somewhat (Web, cloud, etc.). In light of this, the introductory chapter needed a serious refresh. The introduction is revised with the aim of giving a more contemporary look at the technology.
  2. A new chapter on big data processing is added to cover distributed storage systems, data stream processing, MapReduce and Spark platforms, graph analytics, and data lakes.
  3. Similarly, the growing influence of NoSQL systems is addressed by devoting a new chapter to them. This chapter covers the four types of NoSQL systems (key-value stores, document stores, wide column systems, and graph DBMSs), as well as NewSQL systems and polystores.
  4. The database integration and multidatabase query processing chapters from the third edition are combined into a single chapter on database integration.
  5. The web data management chapter has undergone a major revision, shifting its focus from XML to RDF technology, which is more appropriate at this time. The new chapter also includes a discussion of web data integration approaches.
  6. As part of the cleaning up of previous chapters, the peer-to-peer chapter is updated to include a discussion of blockchain; the query processing and transaction management chapters are updated by removing the fundamental centralized techniques. This material is now provided as online appendices (see the link on the left). New topics that have become important, such as dynamic query processing (eddies) and the Paxos consensus algorithm, have been added to these chapters.
  7. The parallel DBMS chapter has been updated by clarifying the objectives, in particular scale-up versus scale-out; parallel architectures such as UMA and NUMA are now discussed.
  8. The distributed design chapter has been refreshed by including a discussion of modern approaches that combine fragmentation and allocation.
  9. Although object technology continues to play a role in information systems, its importance in distributed/parallel data management has declined. Therefore, this chapter is removed from the print copy and is provided as an online appendix (see link on the left).

The resulting book now provides modern coverage of distributed and parallel database technology. We hope it will be useful for a number of years until a new edition is warranted.

As always, we would very much like to hear from you. Let us know what you think we did right and what we got wrong; what you would like to be included in the next edition and what no longer needs to be included. Of course, let us know if you discover any errors.

M. Tamer Özsu (tamer.ozsu@uwaterloo.ca)
Patrick Valduriez (Patrick.Valduriez@inria.fr)