CS848 Paper Review Form - Fall 2006

Paper Title: Ganymed: Scalable Replication for Transactional Web Applications
Author(s):

1) Is the paper technically correct?
[X] Yes
[ ] Mostly (minor flaws, but mostly solid)
[ ] No

2) Originality
[ ] Very good (very novel, trailblazing work)
[X] Good
[ ] Marginal (very incremental)
[ ] Poor (little or nothing that is new)

3) Technical Depth
[ ] Very good (comparable to best conference papers)
[X] Good (comparable to typical conference papers)
[ ] Marginal depth
[ ] Little or no depth

4) Impact/Significance
[ ] Very significant
[X] Significant
[ ] Marginal significance
[ ] Little or no significance

5) Presentation
[ ] Very well written
[X] Generally well written
[ ] Readable
[ ] Needs considerable work
[ ] Unacceptably bad

6) Overall Rating
[ ] Strong accept (very high quality)
[ ] Accept (high quality - would argue for acceptance)
[X] Weak Accept (marginal, willing to accept but wouldn't argue for it)
[ ] Weak Reject (marginal, probably reject)
[ ] Reject (would argue for rejection)

7) Summary of the paper's main contribution and rationale for your recommendation. (1-2 paragraphs)

This paper looks at the use of a proxy machine, called a scheduler, to implement lazy-master database replication. The scheduler coordinates the transactions of the system to ensure that snapshot isolation is maintained at all replicas. The authors present an algorithm, RSI-PC, that schedules transactions so that both the SERIALIZABLE and READ COMMITTED transaction isolation levels are supported. They implement their middleware system using a custom JDBC driver and build a prototype application. They test the prototype under failures and show that throughput remains predictable even when failures occur. This paper extends the idea presented by Gray et al. and implements an algorithm that provides snapshot isolation.

The experimental results use a benchmark and thus allow for easy comparisons. However, certain crucial experiments were not run, e.g., the failure of a scheduler. Also, the authors introduce an abstraction called the manager to increase fault tolerance but do not discuss the overhead associated with this addition. They completely ignore the problem of keeping two or more schedulers synchronized so that a scheduler failure does not lead to catastrophic aborts.

8) List 1-3 strengths of the paper. (1-2 sentences each, identified as S1, S2, S3.)

S1- Implements an elegant solution to the problem of replication consistency.
S2- Shows that the performance of the system conforms to predictable patterns.
S3- Clearly explains the abstraction behind the system.

9) List 1-3 weaknesses of the paper (1-2 sentences each, identified as W1, W2, W3.)

W1- Sidesteps fault tolerance by adding an extra layer of abstraction (the manager) without appropriately considering the consequences, such as its overhead.
W2- Fails to test the performance of the system under the failure of the scheduler, which, without replication, represents the single point of failure in the system.
W3- The authors contradict themselves when they say their system supports PostgreSQL and Oracle. They seem to implement only PostgreSQL support and merely mention the feasibility of Oracle.

10) Detailed comments for authors.

Overall, the paper was well written and the result is interesting. However, I think that more time should have been spent on testing the most vital part of the system, the scheduler.