CS848 Paper Review Form - Fall 2006

Paper Title: "Replication and consistency: being lazy helps sometimes"
Author(s): Yuri Breitbart and Henry F. Korth

1) Is the paper technically correct?
[X] Yes
[ ] Mostly (minor flaws, but mostly solid)
[ ] No

2) Originality
[ ] Very good (very novel, trailblazing work)
[X] Good
[ ] Marginal (very incremental)
[ ] Poor (little or nothing that is new)

3) Technical Depth
[X] Very good (comparable to best conference papers)
[ ] Good (comparable to typical conference papers)
[ ] Marginal depth
[ ] Little or no depth

4) Impact/Significance
[ ] Very significant
[X] Significant
[ ] Marginal significance.
[ ] Little or no significance.

5) Presentation
[X] Very well written
[ ] Generally well written
[ ] Readable
[ ] Needs considerable work
[ ] Unacceptably bad

6) Overall Rating
[ ] Strong accept (very high quality)
[X] Accept (high quality - would argue for acceptance)
[ ] Weak Accept (marginal, willing to accept but wouldn't argue for it)
[ ] Weak Reject (marginal, probably reject)
[ ] Reject (would argue for rejection)

7) Summary of the paper's main contribution and rationale for your recommendation. (1-2 paragraphs)

This paper presents a globally serializable transaction management protocol for distributed replicated data management systems. The protocol improves on previous work in this area in a few different ways. The first improvement is replacing the eager updates used by other systems with lazy updates; this delays propagation of changes to replicated data items, which removes the need to coordinate a global commit. Another interesting feature is a virtualization layer at the site level. This associates a "virtual site" with each transaction at each physical site where the transaction executes, and provides all of the data items that the transaction requires. Lastly, the protocol presented guarantees serializability, making it a legitimate candidate for real-world use.
The paper makes some strong assumptions about the nature of the transactions in the system (namely, which data items a transaction is allowed to update), but there seems to be potential for practical applications, as the approach is inspired by distributed data warehousing. It is because of this novel and practical approach that I recommend an overall "Accept" rating.

8) List 1-3 strengths of the paper. (1-2 sentences each, identified as S1, S2, S3.)

S1. The presented protocol allows read-only transactions to run without having to request global locks. This can lead to substantial performance gains in real-world systems.
S2. The presented protocol guarantees global serializability.
S3. Transactions that are forced to wait (to avoid a cycle in the replication graph) need only wait until any one transaction in the cycle is removed. This results in fewer deadlocks than in lock-based protocols, in which a transaction waits for a particular transaction to release a lock.

9) List 1-3 weaknesses of the paper (1-2 sentences each, identified as W1, W2, W3.)

W1. Only transactions originating at a data item's primary site may update that data item. This restricts use of the protocol to problem domains that behave in this manner.
W2. Detecting deadlock in the replication graph can be expensive, so a simpler "timeout" mechanism is used instead.
W3. There is additional overhead in maintaining the replication graph.

10) Detailed comments for authors.

The approach taken by the presented GS algorithm offers many advantages over traditional systems, but suffers from the restriction that a data item may be updated only by a transaction originating at the data item's primary site. The paper mentions that "a variety of applications" fit within these restrictions, but no examples are given. The restriction seems to preclude many traditional types of data management problems which may be distributed with replication.
It would be interesting to see whether, and how, the algorithm can be generalized to relax this constraint.
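To make the two mechanisms discussed in this review concrete, here is a minimal Python sketch of (a) the primary-site update restriction from W1 and (b) the wait-on-cycle behaviour from S3. It is an illustration under simplifying assumptions, not the paper's GS algorithm: the replication graph is modelled as a plain undirected graph between transaction names and site names, and the names ReplicationGraph, try_add_edge, remove_transaction, and may_update are hypothetical.

```python
from collections import deque


def may_update(origin_site, item, primary_site):
    """W1 restriction: only a transaction originating at the item's
    primary site may update it. primary_site maps item -> site name."""
    return primary_site.get(item) == origin_site


class ReplicationGraph:
    """Simplified stand-in for the replication graph (an assumption of
    this sketch): an edge is admitted only if the graph stays acyclic."""

    def __init__(self):
        self.adj = {}  # node -> set of neighbouring nodes

    def _connected(self, a, b):
        # Breadth-first search: is b reachable from a?
        if a not in self.adj or b not in self.adj:
            return False
        seen, queue = {a}, deque([a])
        while queue:
            node = queue.popleft()
            if node == b:
                return True
            for nxt in self.adj[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        return False

    def try_add_edge(self, txn, site):
        """Add the edge txn-site. Return False if it would close a
        cycle, in which case the caller has to wait and retry."""
        if site in self.adj.get(txn, set()):
            return True  # edge already present
        if self._connected(txn, site):
            return False  # a second path would form a cycle
        self.adj.setdefault(txn, set()).add(site)
        self.adj.setdefault(site, set()).add(txn)
        return True

    def remove_transaction(self, txn):
        """Drop a finished transaction together with all of its edges."""
        for nbr in self.adj.pop(txn, set()):
            self.adj[nbr].discard(txn)


# Usage: T2 is blocked by a would-be cycle through T1, and is
# unblocked as soon as T1 leaves the graph.
g = ReplicationGraph()
g.try_add_edge("T1", "A")
g.try_add_edge("T1", "B")
g.try_add_edge("T2", "A")
blocked = not g.try_add_edge("T2", "B")   # would close T2-A-T1-B-T2
g.remove_transaction("T1")
admitted = g.try_add_edge("T2", "B")      # no cycle remains
```

In this sketch the waiting transaction does not name the transaction it waits for; it simply retries once any node on the would-be cycle has been removed, which is the property S3 highlights. The per-edge reachability search also hints at the maintenance overhead noted in W3.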