[loch14] David Lo, Liqun Cheng, Rama Govindaraju, Luiz Barroso, and Christos Kozyrakis. Towards energy proportionality for large-scale latency-critical workloads. In Proc. Int'l Symp. on Computer Architecture, June 2014. [ bib | .pdf | .pdf ]
[mawe14] Nirmesh Malviya, Ariel Weisberg, Samuel Madden, and Michael Stonebraker. Rethinking main memory OLTP recovery. In Proc. IEEE Int'l Conf. on Data Engineering, pages 604-615, 2014. [ bib | .pdf ]
[volt14] VoltDB technical overview. VoltDB whitepaper, 2014. downloaded January 2014. [ bib | .pdf | .pdf ]
[stwe13] Michael Stonebraker and Ariel Weisberg. The VoltDB main memory DBMS. Bulletin of the IEEE Technical Committee on Data Engineering, 36(2):21-27, June 2013. [ bib | .pdf ]
[bafe13] Peter Bailis, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. HAT, not CAP: Towards highly available transactions. In Proc. Workshop on Hot Topics in Operating Systems, May 2013. [ bib | .pdf ]
[kalm13] David Kalmuk. Understanding the DB2 process model architecture in warehouse and PureScale environments. IDUG DB2 Tech Conference presentation, May 2013. [ bib | .pdf ]
[krpa13] Tim Kraska, Gene Pang, Michael Franklin, Samuel Madden, and Alan Fekete. MDCC: Multi-data center consistency. In Proc. EuroSys Conf., pages 113-126, April 2013. [ bib | .pdf | .pdf ]
[vade13] Tamas Vajk, Laszlo Deak, Krisztian Fekete, and Gergely Mezei. Automatic NoSQL schema development: A case study. In Proc. IASTED Int'l Conf. Parallel and Distributed Computing and Networks (PDCN 2013), pages 656-663, February 2013. [ bib | .pdf ]
[agda13] Divyakant Agrawal, Sudipto Das, and Amr El Abbadi. Data Management in the Cloud: Challenges and Opportunities. Number 32 in Synthesis Lectures on Data Management. Morgan & Claypool, 2013. [ bib | .pdf ]
[beda13] Philip A. Bernstein and Sudipto Das. Rethinking eventual consistency. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, 2013. [ bib | .pdf | .pdf ]
This is a short overview of a SIGMOD tutorial.
[difr13] Cristian Diaconu, Craig Freedman, Erik Ismert, Per-Ake Larson, Pravin Mittal, Ryan Stonecipher, Nitin Verma, and Mike Zwilling. Hekaton: SQL Server's memory-optimized OLTP engine. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pages 1243-1254, 2013. [ bib | DOI | .pdf | .pdf ]
[naag13] Faisal Nawab, Divyakant Agrawal, and Amr El Abbadi. Message futures: Fast commitment of transactions in multi-datacenter environments. In Proc. Conf. on Innovative Database Research, 2013. [ bib | .pdf | .pdf ]
[tepr13] Douglas B. Terry, Vijayan Prabhakaran, Ramakrishna Kotla, Mahesh Balakrishnan, Marcos K. Aguilera, and Hussam Abu-Libdeh. Consistency-based service level agreements for cloud storage. In Proc. Symp. on Operating Systems Principles, 2013. [ bib | DOI | .pdf ]
Describes a transactional key-value store called Pileus which allows applications to define an SLA for each Get operation. SLA describes the application's preferred performance/consistency tradeoff. SLA is a list of latency/consistency/utility triples, with earlier items preferred to later items. System tries to achieve SLA by controlling which and how many replicas to use for each Get request.
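The preference-ordered SLA can be illustrated with a minimal sketch. This is a simplified first-fit reading of the note above, not the selection procedure from the paper; the names `SubSLA`, `Replica`, and `choose`, and the idea of a per-replica latency estimate, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubSLA:
    consistency: str    # e.g. "strong" or "eventual" (illustrative labels)
    latency_ms: float   # latency target for this entry
    utility: float      # value to the application if this entry is met


@dataclass(frozen=True)
class Replica:
    name: str
    expected_latency_ms: float   # assumed estimate, e.g. from recent probes
    consistencies: frozenset     # consistency levels this replica can serve now


def choose(sla, replicas):
    """Return (sub_sla, replica) for the first SLA entry that some replica
    is expected to meet, or None if no entry can be met."""
    for sub in sla:  # earlier entries are preferred to later ones
        candidates = [
            r for r in replicas
            if sub.consistency in r.consistencies
            and r.expected_latency_ms <= sub.latency_ms
        ]
        if candidates:
            # Among replicas that qualify, pick the one expected to be fastest.
            return sub, min(candidates, key=lambda r: r.expected_latency_ms)
    return None
```

With a distant primary that can serve strong reads in 150 ms and a nearby secondary that can serve only eventual reads in 20 ms, an SLA of (strong, 50 ms) followed by (eventual, 200 ms) falls through to the second entry and the nearby secondary.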
[tosc13] A. Tomic, D. Sciascia, and F. Pedone. MoSQL: An elastic storage engine for MySQL. In ACM Symposium on Applied Computing, DADS Track, 2013. [ bib | .pdf ]
[tuzh13] Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. Speedy transactions in multicore in-memory databases. In Proc. Symp. on Operating Systems Principles, pages 18-32, 2013. [ bib | DOI | .pdf | .pdf ]
[zhpo13] Yang Zhang, Russell Power, Siyuan Zhou, Yair Sovran, Marcos K. Aguilera, and Jinyang Li. Transaction chains: achieving serializability with low latency in geo-distributed storage systems. In Proc. Symp. on Operating Systems Principles, 2013. [ bib | DOI | .pdf ]
Chops transactions into subtransactions, each of which executes at a single site. Uses static conflict analysis to determine whether subtransactions can be executed independently while ensuring the serializability of the whole outer transaction. User-initiated aborts can only occur in the first subtransaction, and the system acknowledges commit as soon as the first subtransaction succeeds. Implemented in a system called Lynx.
[atbu12] Paolo Atzeni, Francesca Bugiotti, and Luca Rossi. Uniform access to non-relational database systems: the SOS platform. In Proc. Int'l Conf. on Advanced Info. Systems Engineering, June 2012. [ bib | .pdf | .pdf ]
Describes a meta-system intended to abstract different types of NoSQL systems. Apps see the meta-system, which translates operations to underlying NoSQL systems. Uses an abstract schema consisting of structs, sets, and attributes.
[coli12] James Cowling and Barbara Liskov. Granola: Low-overhead distributed transaction coordination. In Proc. USENIX Annual Technical Conf., June 2012. [ bib | .pdf | .pdf ]
[bigd12] Challenges and opportunities with big data. white paper, February 2012. [ bib | .pdf | .pdf ]
[aubo12] A. Auradkar, C. Botev, S. Das, D. De Maagd, A. Feinberg, P. Ganti, L. Gao, B. Ghosh, K. Gopalakrishna, B. Harris, et al. Data infrastructure at LinkedIn. In Proc. IEEE Int'l Conf. on Data Engineering, pages 1370-1381, 2012. [ bib ]
Includes a description of Voldemort.
[bave12] Peter Bailis, Shivaram Venkataraman, Joseph M. Hellerstein, Michael Franklin, and Ion Stoica. Probabilistically bounded staleness for practical partial quorums. Technical Report UCB/EECS-2012-4, Dept. of EECS, University of California at Berkeley, January 2012. accepted to VLDB 2012. [ bib | .pdf ]
[cabl12] Ting Cao, Stephen M. Blackburn, Tiejun Gao, and Kathryn S. McKinley. The yin and yang of power and performance for asymmetric hardware and managed software. In Proc. Int'l Symp. on Computer Architecture, 2012. [ bib | .pdf | .pdf ]
[code12] James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. Spanner: Google's globally-distributed database. In Proc. USENIX Conf. on Operating Systems Design and Implementation, 2012. [ bib | .pdf ]
Stores versioned (timestamped) key to value mappings, replicated using Paxos. Paxos leader also implements a lock table and transaction manager. Transactions local to one group (Paxos instance) are handled by that leader, otherwise 2PC is used among group leaders. Groups span zones (datacenters). Underlying data are stored in Colossus (which is local to a data center?). Keys are grouped into common prefix directories, and directories are assigned to groups. (There are also tablets - difference between tablets and directories is not clear.) Responsibility for controlling replication is split between apps and admin. Spanner admins specify a set of replication options (number and placement of copies), apps choose which of these options to use for each directory. Quasi-relational data model, with versioned values, is implemented on top of the basic key to value mapping. Relations are organized hierarchically, with row linkage based on common key prefixes. Each unique key in the top level table corresponds to a directory.
[krpa12] Tim Kraska, Gene Pang, Michael J. Franklin, and Samuel Madden. MDCC: Multi-data center consistency. Computing Research Repository (CoRR), abs/1203.6049(arXiv:1203.6049v1), 2012. [ bib | .pdf | http ]
[lipo12] Cheng Li, Daniel Porto, Allen Clement, Johannes Gehrke, Nuno Preguica, and Rodrigo Rodrigues. Making geo-replicated systems fast as possible, consistent when necessary. In Proc. USENIX Conf. on Operating Systems Design and Implementation, pages 265-278, 2012. [ bib | .pdf | .pdf ]
[vaja12] Kenzo Van Craeynest, Aamer Jaleel, Lieven Eeckhout, Paolo Narvaez, and Joel Emer. Scheduling heterogeneous multi-cores through performance impact estimation (pie). In Proc. Int'l Symp. on Computer Architecture, 2012. [ bib | .pdf | .pdf ]
[vowa12] Hoang Tam Vo, Sheng Wang, Divyakant Agrawal, Gang Chen, and Beng Chin Ooi. LogBase: A scalable log-structured database system in the cloud. Proc. of the VLDB Endowment, 5(10):1004-1015, 2012. [ bib | .pdf | .pdf ]
Objectives include high write bandwidth and low read latency. Relational data abstraction. Data are vertically partitioned (using a workload), and then horizontally partitioned within each vertical partition, resulting in tablets. Data are versioned. Interface is record oriented, includes get, put, insert, delete, scan. Each server maintains a single log for all tablets it is responsible for. Also, a multi-version index to locate records on reads. Index is checkpointed to disk periodically to reduce recovery time. Multi-version optimistic concurrency control used to provide SI over multiple records.
[wepi11] Zhou Wei, Guillaume Pierre, and Chi-Hung Chi. CloudTPS: Scalable transactions for Web applications in the cloud. IEEE Transactions on Services Computing, 5(4):525-539, 2012. [ bib | .pdf | .pdf ]
[llfr11] Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen. Don't settle for eventual: scalable causal consistency for wide-area storage with COPS. In Proc. Symp. on Operating Systems Principles, October 2011. [ bib | DOI | .pdf ]
[pore11] Raluca Ada Popa, Catherine Redfield, Nickolai Zeldovich, and Hari Balakrishnan. CryptDB: Protecting confidentiality with encrypted query processing. In Proc. Symp. on Operating Systems Principles, October 2011. [ bib | .pdf | .pdf ]
[sopo11] Yair Sovran, Russell Power, Marcos K. Aguilera, and Jinyang Li. Transactional storage for geo-replicated systems. In Proc. Symp. on Operating Systems Principles, October 2011. [ bib | .pdf | .pdf ]
[kovi11] Ioannis Koltsidas and Stratis D. Viglas. Data management over flash memory (tutorial presentation). In Proc. ACM SIGMOD Int'l Conf. on Management of Data, June 2011. [ bib | .pdf ]
[mrys11] Michael Rys. Scalable SQL. Communications of the ACM, 54(6):48-53, June 2011. [ bib ]
Discusses data and functional partitioning in fairly generic terms. Also includes a case study of scaleout for MySpace, using SQL Server.
[shmi11] Mohammad Bilal Sheikh, Umar Farooq Minhas, Omar Zia Khan, Ashraf Aboulnaga, Pascal Poupart, and David J. Taylor. A bayesian approach to online performance modeling for database appliances using gaussian models. In Proc. Int'l Conf. on Autonomic Computing, June 2011. [ bib | .pdf ]
[bihu11] Kenneth P. Birman, Qi Huang, and Dan Freedman. Overcoming the D in CAP: Using Isis2 to build locally responsive cloud services. Technical report, Cornell University, April 2011. unnumbered technical report. [ bib | .pdf | .pdf ]
[lajo11] Horacio Lagar, Kaustubh Joshi, Matti Hiltunen, Roy Bryant, Eyal de Lara, Alexey Tumanov, Olga Irzak, and Adin Scannell. Kaleidoscope: Cloud micro-elasticity via vm state coloring. In Proc. EuroSys Conf., April 2011. [ bib | .pdf | .pdf ]
[babo11] Jason Baker, Chris Bond, James Corbett, J.J. Furman, Andrey Khorlin, James Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, and Vadim Yushprakh. Megastore: Providing scalable, highly available storage for interactive services. In Proc. Conf. on Innovative Database Research, January 2011. [ bib | .pdf | .pdf ]
Data is partitioned into entity groups, each independently and synchronously replicated over a wide area. Multiple data centers, each with a NoSQL data store (BigTable). Transactions allowed within entity groups, but not across. Transactions seem to be reads then writes. Uses Paxos to do wide area replication of a log that includes all transactions' updates, and get agreement on their order of execution. Concurrency control is effectively optimistic - if two transactions are writing in the same entity group at the same time, one may be aborted and retried.
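The optimistic behaviour described in this note can be sketched as a race for the next position in a per-entity-group log. This is a single-process illustration only: the Paxos agreement and wide-area replication are omitted entirely, and the class and method names (`EntityGroupLog`, `begin`, `commit`) are hypothetical.

```python
class EntityGroupLog:
    """One replicated log per entity group; here just a local list."""

    def __init__(self):
        self.entries = []

    def begin(self):
        # A transaction remembers the log position it read at.
        return len(self.entries)

    def commit(self, read_position, updates):
        # Commit succeeds only if no other transaction has appended since
        # this one began. In the real system this slot would be decided by
        # Paxos across data centers (omitted in this sketch).
        if read_position != len(self.entries):
            return False  # lost the race; the caller aborts and retries
        self.entries.append(updates)
        return True


log = EntityGroupLog()
t1 = log.begin()
t2 = log.begin()
first = log.commit(t1, {"x": 1})    # wins the next log position
second = log.commit(t2, {"x": 2})   # conflicts with t1, must retry
retry = log.commit(log.begin(), {"x": 2})
```

Under these assumptions the second writer fails its first attempt and succeeds on retry, which matches the "one may be aborted and retried" behaviour noted above.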
[becs11] Philip A. Bernstein, Istvan Cseri, Nishant Dani, Nigel Ellis, Ajay Kallan, Gopal Kakivaya, David B. Lomet, Ramesh Manne, Lev Novik, and Tomas Talius. Adapting Microsoft SQL Server for cloud computing. In Proc. IEEE Int'l Conf. on Data Engineering, 2011. [ bib | .pdf ]
Describes Cloud SQL Server, used by SQL Azure DBMS-as-a-service. Uses partitioning, transactions confined to a single partition. Each DBMS instance has private storage. Synchronous master-slave DBMS-level replication for HA.
[cujo11] Carlo Curino, Evan Jones, Raluca Ada Popa, Nirmesh Malviya, Eugene Wu, Samuel Madden, Hari Balakrishnan, and Nickolai Zeldovich. Relational Cloud: A database service for the cloud. In Proc. Conf. on Innovative Database Research, January 2011. [ bib | .pdf | .pdf ]
Multiple multi-tenant DBMS, each hosting one or more workloads. Large workloads can be scaled-out over multiple DBMS using workload-aware partitioning.
[goli11] Wojciech M. Golab, Xiaozhou Li, and Mehul A. Shah. Analyzing consistency properties for fun and profit. Technical Report HPL-2011-6, HP Laboratories, 2011. Technical report version of a PODC11 paper. [ bib | .pdf ]
[hewi11] Eben Hewitt. Cassandra: The Definitive Guide. O'Reilly, 2011. [ bib | .pdf ]
[nawi11] Mahdi Tayarani Najaran, Primal Wijesekera, Andrew Warfield, and Norman C. Hutchinson. Distributed indexing and locking: In search of scalable consistency. In Proc. Workshop on Large Scale Distributed Systems and Middleware, 2011. [ bib | .pdf ]
[onru11] Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum. Fast crash recovery in RAMCloud. In Proc. Symp. on Operating Systems Principles, pages 29-41, 2011. [ bib | .pdf | .pdf ]
RAMCloud appears to apps as a simple key-value storage system. Tries to provide low-latency (5-10 us) access times - needs Infiniband to support this. Tables broken into tablets (contiguous ranges of keys in a tablet), each tablet is assigned to a server. Data are organized in memory as a log. On write, insert new record into the log and update a hash table which indicates where the record can be found. Also push the change to several backup copies of the log on other servers. Periodic garbage collection reclaims log segments. Backups buffer their copies of the log in memory and gradually flush them to disk (not on every update). Suggests use of battery backup or capacitors to ensure that unflushed updates can make it to disk in case power is lost to a backup server. Recovery is highly parallelized. Failed server's keys are repartitioned among multiple recovery masters, each of which recovers its part of the key space and becomes the master for that part of the key space.
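The write path in this note (append to an in-memory log, update a hash table, push to backups, reclaim space later) can be sketched as follows. This is a toy single-process model: the backups are plain lists standing in for other servers, log segments and disk flushing are not modelled, and the name `LogStore` is made up for the example.

```python
class LogStore:
    def __init__(self, num_backups=2):
        self.log = []     # append-only list of (key, value) records
        self.index = {}   # hash table: key -> position of the live record
        # Stand-ins for backup copies of the log held on other servers.
        self.backups = [[] for _ in range(num_backups)]

    def write(self, key, value):
        self.log.append((key, value))        # insert new record into the log
        self.index[key] = len(self.log) - 1  # point the hash table at it
        for backup in self.backups:          # push the change to each backup
            backup.append((key, value))

    def read(self, key):
        return self.log[self.index[key]][1]

    def clean(self):
        """Garbage collection: keep only records the hash table still
        points at, then rebuild the hash table for the compacted log."""
        live = [
            (k, v) for pos, (k, v) in enumerate(self.log)
            if self.index.get(k) == pos
        ]
        self.log = live
        self.index = {k: i for i, (k, _) in enumerate(live)}
```

Overwriting a key leaves the old record in the log as garbage until `clean()` runs; reads always go through the hash table, so they never see the stale record.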
[pato11] Ippokratis Pandis, Pinar Tozun, Ryan Johnson, and Anastasia Ailamaki. Plp: Page latch-free shared-everything oltp. Proc. of the VLDB Endowment, 4(10):610-621, 2011. [ bib | .pdf | .pdf ]
[safr11] Ido Safruti. The great mobile slowdown. cotendo white paper, 2011. [ bib | .pdf ]
[sopr11] João Soares and Nuno Preguiça. Combining mobile and cloud storage for providing ubiquitous data access. In Proc. Int'l Euro-Par Conf. on Parallel Processing, 2011. [ bib | .pdf | .pdf ]
[trag11] Nguyen Tran, Marcos K. Aguilera, and Mahesh Balakrishnan. Online migration for geo-distributed storage systems. In Proc. USENIX Annual Technical Conf., 2011. [ bib | .pdf | .pdf ]
Describes an abstraction called overlays for data migration in distributed key-value storage systems.
[catt10] Rick Cattell. Scalable SQL and NoSQL data stores. SIGMOD Record, 39(4):12-27, December 2010. [ bib | .pdf | .pdf ]
[adbo10] Sarita V. Adve and Hans-J. Boehm. Memory models: A case for rethinking parallel languages and hardware. Communications of the ACM, 53(8):90-101, August 2010. [ bib ]
Excellent overview of hardware and high-level language memory models.
[stuh10] Julian Stuhler. IBM DB2 pureScale: The next big thing or a solution looking for a problem? Database Journal, July 2010. [ bib | http ]
[brho10] Erik Brynjolfsson, Paul Hofmann, and John Jordan. Cloud computing and electricity: Beyond the utility model. Communications of the ACM, 53(5):32-34, May 2010. [ bib | .pdf ]
Discussion of technical and business strengths and weaknesses of the utility computing model, including security, lock-in and interoperability.
[durk10] Dave Durkee. Why cloud computing will never be free. Communications of the ACM, 53(5):62-69, May 2010. [ bib ]
Discusses cloud service pricing, the cloud computing marketplace, strategies that vendors may use to keep costs low, and weaknesses of current cloud SLAs. Then discusses requirements for Cloud 2.0, meaning cloud services intended to support critical enterprise applications. Issues include storage system performance - argues that access randomness and working set size are proportional to the number of applications supported by a shared storage service. Also discusses administration, SLAs and automation.
[scne10] Daniel J. Scales, Mike Nelson, and Ganesh Venkitachalam. The design and evaluation of a practical system for fault-tolerant virtual machines. Technical Report VMware-TR-2010-001, VMWare, May 2010. [ bib | .pdf ]
[cami10] Mustafa Canim, George A. Mihaila, Bishwaranjan Bhattacharjee, Kenneth A. Ross, and Christian A. Lang. SSD bufferpool extensions for database systems. Proc. of the VLDB Endowment, 3(2):1435-1446, 2010. [ bib | .pdf | .pdf ]
[daag10a] Sudipto Das, Shashank Agarwal, Divyakant Agrawal, and Amr El Abbadi. Elastras: An elastic, scalable, and self managing transactional database for the cloud. Technical Report 2010-04, University of California, Santa Barbara, 2010. [ bib | .pdf ]
[dani10] Sudipto Das, Shoji Nishimura, Divyakant Agrawal, and Amr El Abbadi. Live database migration for elasticity in a multitenant database for cloud platforms. Technical Report 2010-09, Department of Computer Science, University of California Santa Barbara, 2010. [ bib | .pdf ]
[dese10] Biplob Debnath, Sudipta Sengupta, and Jin Li. Flashstore: High throughput persistent key-value store. Proc. of the VLDB Endowment, 3(2):1414-1425, 2010. [ bib | .pdf | .pdf ]
Writes collected in RAM and batched to SSD in chunks large enough to fill a flash page. Hash table in memory is used to index key,value pairs in the SSD. There is also a read cache in RAM. Berkeley DB is used to index key,value records on disk. Record read checks RAM read cache, then RAM write buffer, then SSD, then disk. All reads are added to the RAM read cache. Records are inserted into the SSD when they are written (after staging). SSD pages are organized as a ring buffer. When SSD fills, records on early pages are recycled - either by reinserting them into the SSD or by destaging them to the disk. A clock like algorithm (with recent-reference bit) is used to determine whether a record is reinserted into SSD or destaged to disk.
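The read path and the clock-like recycling step in this note can be sketched as below. The tiers are modelled as plain dictionaries and the SSD ring as a deque of pages; these structures and the function names (`read`, `recycle_oldest_page`) are assumptions for illustration, not the paper's data layout.

```python
from collections import deque


def read(key, read_cache, write_buffer, ssd, disk):
    """Check RAM read cache, then RAM write buffer, then SSD, then disk.
    Every successful read is added to the RAM read cache."""
    for tier in (read_cache, write_buffer, ssd, disk):
        if key in tier:
            value = tier[key]
            read_cache[key] = value
            return value
    raise KeyError(key)


def recycle_oldest_page(ssd_ring, referenced, disk):
    """Recycle the oldest SSD page when the ring is full.

    ssd_ring:   deque of pages, each a list of (key, value) records
    referenced: set of keys whose recent-reference bit is set
    disk:       dict standing in for the on-disk store
    """
    page = ssd_ring.popleft()
    reinserted = []
    for key, value in page:
        if key in referenced:
            referenced.discard(key)          # clear the bit: one more chance
            reinserted.append((key, value))  # keep the record on SSD
        else:
            disk[key] = value                # destage a cold record to disk
    if reinserted:
        ssd_ring.append(reinserted)          # rewrite at the head of the ring
```

Records read since they were last written or recycled stay on the SSD for another pass around the ring; the rest move to disk.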
[feze10] Ariel J. Feldman, William P. Zeller, Michael J. Freedman, and Edward W. Felten. SPORC: Group collaboration using untrusted cloud resources. In Proc. USENIX Conf. on Operating Systems Design and Implementation, 2010. [ bib | .pdf | .pdf ]
[guku10] Ajay Gulati, Chethan Kumar, Irfan Ahmad, and Karan Kumar. Basil: Automated io load balancing across storage devices. In USENIX Conference on File and Storage Technology (FAST'10), 2010. [ bib | .pdf ]
[jopa10] Ryan Johnson, Ippokratis Pandis, Radu Stoica, Manos Athanassoulis, and Anastasia Ailamaki. Aether: A scalable approach to logging. Proc. of the VLDB Endowment, 3(1):681-692, 2010. [ bib | .pdf | .pdf ]
Includes a performance evaluation of Early Lock Release (release locks before commit record goes to disk, but do not return results to client and ensure subsequent transactions' commits are dependent on this one). Also asynchronous log flushing (called flush pipelining) so that threads don't context-switch while waiting for log I/O. Also a technique for parallelizing log buffer insertion.
[joab10] Evan P.C. Jones, Daniel J. Abadi, and Samuel Madden. Low overhead concurrency control for partitioned main memory databases. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pages 603-614, 2010. [ bib | DOI | .pdf | .pdf ]
[jobo10] William K. Josephson, Lars A. Bongo, David Flynn, and Kai Li. Dfs: A file system for virtualized flash storage. In USENIX Conference on File and Storage Technology (FAST'10), 2010. [ bib | .pdf ]
[leig10] Tom Leighton. Akamai and cloud computing: A perspective from the edge of the cloud. Akamai white paper, 2010. [ bib | .pdf ]
[lizh10] Zhichun Li, Ming Zhang, Zhaosheng Zhu, Yan Chen, Albert Greenberg, and Yi-Min Wang. WebProphet: Automating performance prediction for web services. In Proc. USENIX Conf. on Networked Systems Design and Implementation, 2010. [ bib | .pdf | .pdf ]
[mase10] Prince Mahajan, Srinath Setty, Sangmin Lee, Allen Clement, Lorenzo Alvisi, Mike Dahlin, and Michael Walfish. Depot: Cloud storage with minimal trust. In Proc. USENIX Conf. on Operating Systems Design and Implementation, 2010. [ bib | .pdf | .pdf ]
[peda10] Daniel Peng and Frank Dabek. Large-scale incremental processing using distributed transactions and notifications. In Proc. USENIX Conf. on Operating Systems Design and Implementation, pages 1-15, 2010. [ bib | .pdf | .pdf ]
Describes Percolator, used to incrementally maintain Google's web search index. Provides multi-row transactions and snapshot isolation, using multi-versioning in BigTable. Some transactions may have high latency.
[ouag09] John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières, Subhasish Mitra, Aravind Narayanan, Guru Parulkar, Mendel Rosenblum, Stephen M. Rumble, Eric Stratmann, and Ryan Stutsman. The case for RAMClouds: Scalable high-performance storage entirely in DRAM. Operating Systems Review, 43(4):92-105, December 2009. [ bib | .pdf | .pdf ]
A whitepaper presenting motivation for the RAMCloud project.
[lama09] Avinash Lakshman and Prashant Malik. Cassandra - a decentralized structured storage system. In Proc. ACM SIGOPS Int'l Workshop on Large Scale Distributed Systems and Middleware (LADIS'09), October 2009. [ bib | .pdf | .pdf ]
[pure09] Transparent application scaling with IBM DB2 pureScale. IBM white paper, October 2009. [ bib | .pdf | .pdf ]
[wure09] Xiaojian Wu and A. L. Narasimha Reddy. Managing storage space in a flash and disk hybrid storage system. In Proc. IEEE/ACM Int'l Symp. on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), September 2009. [ bib | .pdf | .pdf ]
[coco09] Greenplum. Mad skills: New analysis practices for big data. Greenplum white paper, March 2009. [ bib | .pdf | .pdf ]
[chot09] Whei-Jen Chen, Masafumi Otsuki, Paul Descovich, Selvaprabhu Arumuggharaj, Toshihiko Kubo, and Yong Jun Bi. High Availability and Disaster Recovery Options for DB2 on Linux, UNIX, and Windows. IBM Redbook, February 2009. [ bib | .pdf | .pdf ]
[abba09] Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Alexander Rasin, and Avi Silberschatz. HadoopDB: An architectural hybrid of MapReduce and DBMS technologies for analytical workloads. Proc. of the VLDB Endowment, 2(1):922-933, 2009. [ bib | .pdf | .pdf ]
[auja09] Stefan Aulbach, Dean Jacobs, Alfons Kemper, and Michael Seibold. A comparison of flexible schemas for software as a service. In Proc. ACM SIGMOD Int'l Conference on Management of Data, pages 881-888, 2009. [ bib | DOI | .pdf ]
[cabh09] Mustafa Canim, Bishwaranjan Bhattacharjee, George Mihaila, Christian Lang, and Ken Ross. An object placement advisor for db2 using solid state storage. Proc. of the VLDB Endowment, 2(2):1318-1329, 2009. [ bib | .pdf | .pdf ]
[daag09] Sudipto Das, Divyakant Agrawal, and Amr El Abbadi. ElasTraS: An elastic transactional data store in the cloud. In Proc. USENIX Workshop on Hot Topics in Cloud Computing, 2009. [ bib | .pdf | .pdf ]
[frpa09] Eric Friedman, Peter M. Pawlowski, and John Cieslewicz. SQL/MapReduce: A practical approach to self-describing, polymorphic, and parallelizable user-defined functions. Proc. of the VLDB Endowment, 2(2):1402-1413, 2009. [ bib | .pdf | .pdf ]
[gare09] John Garrison and A. L. Narasimha Reddy. Umbrella file system: Storage management across heterogeneous devices. ACM Transactions on Storage, 5(1), 2009. [ bib | DOI | .pdf | .pdf ]
[gana09] Alan Gates, Olga Natkovich, Shubham Chopra, Pradeep Kamath, Shravan Narayanam, Christopher Olston, Benjamin Reed, Santhosh Srinivasan, and Utkarsh Srivastava. Building a high-level dataflow system on top of Map-Reduce: The Pig experience. Proc. of the VLDB Endowment, 2(2):1414-1425, 2009. [ bib | .pdf | .pdf ]
[isyu09] Michael Isard and Yuan Yu. Distributed data-parallel computing using a high-level programming language. In Proc. ACM SIGMOD Int'l Conf. on Management of Data (SIGMOD'09), pages 987-994, 2009. [ bib | DOI | .pdf ]
[nuri09] Lucas Nussbaum and Olivier Richard. A comparative study of network link emulators. In Proceedings of the 2009 Spring Simulation Multiconference, pages 85:1-85:8, 2009. [ bib | .pdf ]
[papa09] Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, and Michael Stonebraker. A comparison of approaches to large-scale data analysis. In Proc. ACM SIGMOD Int'l Conf. on Management of Data (SIGMOD'09), pages 165-178, 2009. [ bib | DOI | .pdf ]
[thsa09] Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Suresh Anthony, Hao Liu, Pete Wyckoff, and Raghotham Murthy. Hive - a warehousing solution over a map-reduce framework. Proc. of the VLDB Endowment, 2(2):1626-1629, 2009. [ bib | .pdf | .pdf ]
[webo09] Craig D. Weissman and Steve Bobrowski. The design of the Force.com multitenant internet application development platform. In Proc. ACM SIGMOD Int'l Conference on Management of Data (SIGMOD), pages 889-896, 2009. [ bib | DOI | .pdf ]
[coha08] Graham Cormode and Marios Hadjieleftheriou. Finding frequent items in data streams. In Proc. Int'l Conference on Very Large Data Bases (VLDB'08), August 2008. [ bib | .pdf ]
[selt08] Margo Seltzer. Beyond relational databases. Communications of the ACM, 51(7):52-58, July 2008. [ bib | DOI | .pdf ]
Argues for modular and configurable DBMS to address new applications: warehousing, directory services, web search, mobile device caching, XML, streams.
[prit08] Dan Pritchett. BASE: An acid alternative. ACM Queue, 6(3):48-55, May 2008. [ bib | DOI | .pdf ]
[abma08] Daniel J. Abadi, Samuel Madden, and Nabil Hachem. Column-stores vs. row-stores: How different are they really? In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pages 967-980, 2008. [ bib | .pdf ]
[aggo08] Marcos K. Aguilera, Wojciech M. Golab, and Mehul A. Shah. A practical scalable distributed b-tree. Proc. of the VLDB Endowment, 1(1):598-609, 2008. [ bib | .pdf | .pdf ]
[augr08] Stefan Aulbach, Torsten Grust, Dean Jacobs, Alfons Kemper, and Jan Rittinger. Multi-tenant databases for software as a service: schema-mapping techniques. In Proc. ACM SIGMOD Int'l Conference on Management of Data, pages 1195-1206, 2008. [ bib | DOI | .pdf ]
[brfl08] Matthias Brantner, Daniela Florescu, David Graf, Donald Kossmann, and Tim Kraska. Building a database on S3. In Proc. ACM SIGMOD Int'l Conference on Management of Data (SIGMOD), pages 251-264, 2008. [ bib | DOI | .pdf ]
[caro08] Michael J. Cahill, Uwe Röhm, and Alan D. Fekete. Serializable isolation for snapshot databases. In Proc. ACM SIGMOD Int'l Conference on Management of Data (SIGMOD), pages 729-738, 2008. [ bib | DOI | .pdf ]
[depa08] David J. DeWitt, Erik Paulson, Eric Robinson, Jeffrey F. Naughton, Joshua Royalty, Srinath Shankar, and Andrew Krioukov. Clustera: an integrated computation and data management system. Proc. of the VLDB Endowment, 1(1):28-41, 2008. [ bib | .pdf | .pdf ]
[kaki08] Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alexander Rasin, Stanley Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. H-store: A high-performance, distributed main memory transaction processing system. In Proc. Int'l Conf. on Very Large Data Bases, volume 1, pages 1496-1499, 2008. [ bib | .pdf | .pdf ]
[kovi08] Ioannis Koltsidas and Stratis D. Viglas. Flashing up the storage layer. Proc. of the VLDB Endowment, 1(1):514-525, 2008. [ bib | DOI | .pdf | .pdf ]
Considers architecture with both flash and magnetic disk available for persistent storage. Each block lives persistently either on disk or on flash, not both. Assumes there is a demand-paged in-memory block cache that makes a placement decision on eviction of a dirty page. Proposed placement algorithms count page reads and writes and use the counts, together with the costs of read and write operations on disk and flash, to decide where to place an evicted page. Placement decisions are made independently for each page. In particular, there are no capacity constraints and thus the algorithms may choose to place all blocks on the same device. Proposed cache replacement algorithm keeps some number of least-recently-used pages in four queues corresponding to whether the page is clean or dirty and whether the page is located on flash or disk. Always evicts the page with the lowest eviction cost from among these least-recently-used pages.
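The per-page placement decision in this note can be written as a small cost comparison. This is only a sketch of the idea as summarized above (weight each page's read and write counts by per-device operation costs and pick the cheaper device); the function name `place`, the tie-breaking rule, and the cost values in the usage note are assumptions, not figures from the paper.

```python
def place(reads, writes, flash_read, flash_write, disk_read, disk_write):
    """Choose a device for an evicted page from its access counts.

    reads, writes: counted accesses to this page
    *_read, *_write: cost of one read or write on each device
    """
    flash_cost = reads * flash_read + writes * flash_write
    disk_cost = reads * disk_read + writes * disk_write
    # Each page is decided independently; no capacity limit is modelled,
    # so every page could end up on the same device.
    return "flash" if flash_cost < disk_cost else "disk"
```

With illustrative costs where flash reads are cheap and flash writes are expensive, a read-mostly page such as `place(100, 1, 0.1, 2.0, 5.0, 5.0)` goes to flash, while a write-mostly page such as `place(1, 100, 0.1, 10.0, 5.0, 5.0)` goes to disk.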
[drep07] Ulrich Drepper. What every programmer should know about memory. November 2007. [ bib | .pdf ]
[kesh07] S. Keshav. How to read a paper. ACM SIGCOMM Computer Communication Review, 37(3):83-84, July 2007. [ bib | http | .pdf ]
[laju07] Pepijn de Langen and Ben H. H. Juurlink. Trade-offs between voltage scaling and processor shutdown for low-energy embedded multiprocessors. In Int'l Workshop on Embedded Computer Systems: Architectures, Modeling, and Simulation, number 4599 in Lecture Notes in Computer Science. Springer-Verlag, July 2007. [ bib | .pdf ]
[stke07] Christopher Stewart, Terence Kelly, and Alex Zhang. Exploiting nonstationarity for performance prediction. In Proc. EuroSys 2007, pages 31-46, March 2007. [ bib | .pdf ]
[agme07] Marcos K. Aguilera, Arif Merchant, Mehul Shah, Alistair Veitch, and Christos Karamanolis. Sinfonia: a new paradigm for building scalable distributed systems. In Proc. ACM SIGOPS Symposium on Operating Systems Principles (SOSP), pages 159-174, 2007. [ bib | DOI | .pdf | .pdf ]
[grae07] Goetz Graefe. The five-minute rule twenty years later, and how flash memory changes the rules. In Proc. Int'l Workshop on Data Management on New Hardware, pages 1-9, 2007. [ bib | DOI | .pdf ]
[hest07] Joseph Hellerstein, Michael Stonebraker, and James Hamilton. Architecture of a database system. Foundations and Trends in Databases, 1(2):141-259, 2007. [ bib | .pdf | .pdf ]
[orac07] Oracle. Scalability and performance with Oracle 11g database. Oracle white paper, 2007. [ bib | .pdf ]
[stma07] Michael Stonebraker, Samuel Madden, Daniel J. Abadi, Stavros Harizopoulos, Nabil Hachem, and Pat Helland. The end of an architectural era (it's time for a complete rewrite). In Proc. Int'l Conf. on Very Large Data Bases, pages 1150-1160, 2007. [ bib | .pdf ]
[beda06] Philip A. Bernstein, Nishant Dani, Badriddine Khessib, Ramesh Manne, and David Shutt. Data management issues in supporting large-scale web services. Bulletin of the IEEE Technical Committee on Data Engineering, 29(4):3-9, December 2006. [ bib | .ps | .ps ]
[rale06] Parthasarathy Ranganathan, Phil Leech, David E. Irwin, and Jeffrey S. Chase. Ensemble-level power management for dense blade servers. In Proc. International Symposium on Computer Architecture (ISCA'06), pages 66-77, June 2006. [ bib | .pdf | .pdf ]
Power management for groups (ensembles) of servers, under the assumption that the servers in a group are likely to require peak power at different times. Goal is to reduce the amount of power overprovisioning required for the group.
[arba06] Arvind Arasu, Shivnath Babu, and Jennifer Widom. The CQL continuous query language: Semantic foundations and query execution. VLDB Journal, 15:121-142, February 2006. [ bib ]
CQL is the query language implemented by the Stanford STREAM database system.
[burr06] Michael Burrows. The chubby lock service for loosely-coupled distributed systems. In Proc. of the Symp. on Operating System Design and Implementation (OSDI'06), pages 335-350, 2006. [ bib | .pdf | .pdf ]
Chubby cell has a primary and secondaries. Primary handles all reads and writes. Writes are replicated to secondaries and acked when a majority have acked. Primary holds master lease, which it will renew unless it fails. Primary failure causes election of new primary via distributed consensus protocol. Chubby exports a simple Unix-like file system interface. Files can act as reader/writer locks. Chubby also provides notifications of various events, such as modification of file contents or master failover. Clients maintain sessions, which are terminated if the client dies or the session becomes idle. Server guarantees a minimum idle lease time before it will determine session is idle and terminate it. Client also maintains a (conservative) session timeout and will eventually decide its session is expired. Sessions can potentially be preserved across master failovers.
[grla06] Jim Gray and Leslie Lamport. Consensus on transaction commit. ACM Transactions on Database Systems, 31(1):133-160, 2006. [ bib | DOI | .pdf ]
Defines a fault-tolerant commit protocol called Paxos Commit, which makes progress as long as a majority of participants are available. Unlike standard 2PC, it does not block on failure of a coordinator.
[crwu06] Sailesh Krishnamurthy, Chung Wu, and Michael Franklin. On-the-fly sharing for streamed aggregation. In Proc. ACM SIGMOD International Conference on Management of Data (SIGMOD'06), pages 623-634, 2006. [ bib | DOI | .pdf ]
[lova06] David Lomet, Zografoula Vagena, and Roger Barga. Recovery from "bad" user transactions. In Proc. ACM SIGMOD Int'l Conference on Management of Data (SIGMOD'06), pages 337-346, 2006. [ bib | http | .pdf ]
[nive06] Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn. Rethink the sync. In USENIX Symposium on Operating Systems Design and Implementation (OSDI'06), 2006. [ bib | .pdf | .pdf ]
[paju06] Seon-yeong Park, Dawoon Jung, Jeong-uk Kang, Jin-soo Kim, and Joonwon Lee. CFLRU: A replacement algorithm for flash memory. In Proc. Int'l Conf. on Compilers, Architecture and Synthesis for Embedded Systems, pages 234-241, 2006. [ bib | DOI | .pdf ]
[wudi06] Eugene Wu, Yanlei Diao, and Shariq Rizvi. High-performance complex event processing over streams. In Proc. ACM SIGMOD International Conference on Management of Data (SIGMOD'06), pages 407-418, 2006. [ bib | DOI | .pdf ]
[xigo05] Man Xiong, Brian Goldstein, and Chris Auger. Scaling out SQL Server with data-dependent routing. Dell Power Solutions, August 2005. [ bib | .pdf | .pdf ]
[waro05] Andrew Warfield, Russ Ross, Keir Fraser, Christian Limpach, and Steven Hand. Parallax: managing storage for a million machines. In Proc. USENIX Hot Topics in Operating Systems (HOTOS'05), June 2005. [ bib | .pdf | .pdf ]
Block level storage virtualization targeted at virtual machines. Uses copy-on-write and trie-based block indexing to support versioned device images. Virtualization is implemented in dedicated virtual machines, one for each node in a cluster.
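A minimal sketch of how copy-on-write plus a trie-shaped block index can give versioned images. This is an illustration of the general technique, not Parallax's actual on-disk layout; the fan-out, depth, and function names are assumptions chosen to keep the example small.

```python
BITS = 2              # assumed: 2 address bits per trie level
FANOUT = 1 << BITS
LEVELS = 3            # assumed: 3 levels, so block addresses 0..63


def lookup(root, addr):
    """Return the block stored at addr in the image rooted at root, or None."""
    node = root
    for level in reversed(range(LEVELS)):
        if node is None:
            return None
        node = node[(addr >> (level * BITS)) & (FANOUT - 1)]
    return node


def write(root, addr, block):
    """Return the root of a new image version with addr set to block.

    Only the nodes on the path to addr are copied; all other subtrees
    are shared with the old version, which stays readable.
    """
    def copy_path(node, level):
        children = list(node) if node is not None else [None] * FANOUT
        index = (addr >> (level * BITS)) & (FANOUT - 1)
        if level == 0:
            children[index] = block
        else:
            children[index] = copy_path(children[index], level - 1)
        return tuple(children)

    return copy_path(root, LEVELS - 1)


v1 = write(None, 5, "A")
v2 = write(v1, 5, "B")    # new version; v1 still maps block 5 to "A"
v3 = write(v1, 60, "C")   # touches a different subtree than block 5
```

Each root identifies one version of the device image, so taking a snapshot amounts to remembering a root.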
[moch05] Justin D. Moore, Jeffrey S. Chase, Parthasarathy Ranganathan, and Ratnesh K. Sharma. Making scheduling "cool": Temperature-aware workload placement in data centers. In Proc. USENIX Annual Technical Conference, pages 61-75, April 2005. [ bib | .pdf | .pdf ]
[hedi05] Taliver Heath, Bruno Diniz, Enrique V. Carrera, Wagner Meira Jr., and Ricardo Bianchini. Energy conservation in heterogeneous server clusters. In Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'05), pages 186-195, 2005. [ bib | DOI | .pdf ]
How to distribute work in a cluster given that different nodes may have different performance and power characteristics. Objective is to minimize power consumption per unit of throughput. Test implementation is in a cluster web server, and control is achieved by re-distributing the workload among the cluster nodes. Two distribution mechanisms are used: a simple front-end load balancer, and a peer-to-peer mechanism for redistributing requests among servers.
[meag05] Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi. Efficient computation of frequent and top-k elements in data streams. In Proc. International Conference on Database Theory (ICDT), January 2005. [ bib | .pdf | http ]
[stab05] Michael Stonebraker, Daniel J. Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Samuel Madden, Elizabeth J. O'Neil, Patrick E. O'Neil, Alex Rasin, Nga Tran, and Stanley B. Zdonik. C-store: A column-oriented DBMS. In Proc. Int'l Conf. on Very Large Data Bases, pages 553-564, 2005. [ bib | .pdf | .pdf ]
[zhha05] Ning Zhang, Peter J. Haas, Vanja Josifovski, Guy M. Lohman, and Chun Zhang. Statistical learning techniques for costing XML queries. In Proc. International Conference on Very Large Data Bases (VLDB'05), pages 289-300, 2005. [ bib | .pdf | .pdf ]
[zhko05] Rui Zhang, Nick Koudas, Beng Chin Ooi, and Divesh Srivastava. Multiple aggregations over data streams. In Proc. ACM SIGMOD International Conference on Management of Data (SIGMOD'05), pages 299-310, 2005. [ bib | DOI | .pdf ]
[bhtr04] Suparna Bhattacharya, John Tran, Mike Sullivan, and Chris Mason. Linux AIO performance and robustness for enterprise workloads. In Linux Symposium, pages 63-78, 2004. [ bib | .pdf ]
[dihe04] Yixin Diao, Joseph L. Hellerstein, Adam J. Storm, Maheswaran Surendra, Sam Lightstone, Sujay S. Parekh, and Christian Garcia-Arellano. Incorporating cost of control into the design of a load balancing controller. In IEEE Real-Time and Embedded Technology and Applications Symposium, 2004. [ bib | .pdf ]
[lech04] Byung Suk Lee, Li Chen, Jeff Buzas, and Vinod Kannoth. Regression-based self-tuning modeling of smooth user-defined function costs for an object-relational database management system query optimizer. The Computer Journal, 47(6):673-693, 2004. [ bib | .pdf ]
Builds a cost model by tracking recent UDF invocations, including their costs and the values of cost-related parameters, and then fitting a model to these data. Includes discussion of statistical issues like collinearity and removal of outliers.
[likr04] Jinyuan Li, Maxwell Krohn, David Mazières, and Dennis Shasha. Secure untrusted data repository (SUNDR). In Proc. USENIX Conf. on Operating Systems Design and Implementation, pages 121-136, 2004. [ bib | .pdf | .pdf ]
[mamu04] John MacCormick, Nick Murphy, Marc Najork, Chandramohan A. Thekkath, and Lidong Zhou. Boxwood: abstractions as the foundation for storage infrastructure. In Proc. of the Symp. on Operating System Design and Implementation (OSDI'04), 2004. [ bib | .pdf ]
[razh04] Amira Rahal, Qiang Zhu, and Per-Ake Larson. Evolutionary techniques for updating query cost models in a dynamic multidatabase environment. VLDB Journal, 13(2):162-176, 2004. [ bib | .pdf ]
Considers cost models as linear functions of a set of explanatory variables for each query class. Initial model is constructed by regression over an initial set of labeled cost samples. Proposes two methods to incrementally maintain such models by folding in new samples and removing the effects of old samples. Assumes that queries from the application workload are labeled and used to train the model.
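The idea of folding in new samples and removing the effects of old ones can be sketched for the simplest case, a single explanatory variable. This is not either of the paper's two methods; it is an assumed sliding-window variant that keeps only running sums, so both adding and removing a sample are constant-time updates.

```python
class SlidingLinearCostModel:
    """Least-squares fit of cost = a + b * x, maintained incrementally."""

    def __init__(self):
        self.n = 0
        self.sum_x = self.sum_y = self.sum_xx = self.sum_xy = 0.0

    def _update(self, x, y, sign):
        self.n += sign
        self.sum_x += sign * x
        self.sum_y += sign * y
        self.sum_xx += sign * x * x
        self.sum_xy += sign * x * y

    def add(self, x, y):
        """Fold a new (explanatory value, observed cost) sample into the model."""
        self._update(x, y, +1)

    def remove(self, x, y):
        """Remove the effect of a sample that was added earlier."""
        self._update(x, y, -1)

    def coefficients(self):
        """Return (a, b) for the samples currently in the model."""
        denom = self.n * self.sum_xx - self.sum_x ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two samples with distinct x")
        b = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        a = (self.sum_y - b * self.sum_x) / self.n
        return a, b


model = SlidingLinearCostModel()
old_samples = [(1, 5), (2, 8), (3, 11)]      # hypothetical: cost = 2 + 3x
new_samples = [(4, 18), (5, 22), (6, 26)]    # hypothetical: cost = 2 + 4x
for x, y in old_samples:
    model.add(x, y)
before = model.coefficients()
for x, y in new_samples:
    model.add(x, y)
for x, y in old_samples:
    model.remove(x, y)
after = model.coefficients()
```

With several explanatory variables the same approach applies to the sums that make up the normal equations, at the cost of solving a small linear system on each refit.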
[akam04] A developers guide to on-demand distributed computing. Akamai white paper, 2004. [ bib | .pdf ]
[pobe03] Rachel Pottinger and Philip A. Bernstein. Merging models based on given correspondences. In Proceedings of the 29th International Conference on Very Large Data Bases, pages 862-873, September 2003. [ bib | .pdf | .pdf ]
[arha03] Walid G. Aref, Moustafa A. Hammad, Ann Christine Catlin, Ihab F. Ilyas, Thanaa M. Ghanem, Ahmed K. Elmagarmid, and Mirette S. Marzouk. Video query processing in the VDBMS testbed for video database research. In ACM International Workshop on Multimedia Databases (MMDB'03), pages 25-32, 2003. [ bib | DOI | .pdf ]
[crjo03] Charles D. Cranor, Theodore Johnson, Oliver Spatscheck, and Vladislav Shkapenyuk. Gigascope: A stream database for network applications. In Proc. ACM SIGMOD International Conference on Management of Data (SIGMOD'03), pages 647-651, 2003. [ bib | .pdf ]
[gada03] Lei Gao, Mike Dahlin, Amol Nayate, Jiandan Zheng, and Arun Iyengar. Application specific data replication for edge services. In Proc. Int'l Conf. on World Wide Web (WWW'03), pages 449-460, 2003. [ bib | DOI | .pdf ]
[ghgo03] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The Google file system. In Proc. Symposium on Operating System Principles (SOSP'03), pages 29-43, 2003. [ bib | .pdf ]
Discusses file system optimized for relatively small number of large files. Workload is large sequential reads and appends. Node failures are normal. Throughput is more important than latency. Architecture has a single master and many chunk servers. Master stores metadata (namespace, access controls). Chunks are replicated. Local storage on each chunk node is via a Linux file system. Clients do metadata operations through master, then go directly to chunk servers for data retrieval. Implements a weak consistency model. Metadata operations are atomic and serialized. Concurrent writes of the same file range may get mixed, not serialized. Concurrent appends may lead to duplication. To update a chunk, client first determines (from the master or its cache) the locations of all replicas of the chunk. It first sends the update to all replicas. It then sends a write request to the primary replica, which serializes all such requests. The primary forwards the request serialization order to the other replicas, which apply the updates (they already have) in primary-chosen order.
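The chunk update path described above separates data transfer from ordering. A minimal in-memory sketch, assuming hypothetical class and method names (no leases, failures, or networking), with the serializing replica called the primary:

```python
class Replica:
    """A chunk replica: buffers pushed data and applies it when told to."""

    def __init__(self):
        self.pending = {}  # data_id -> data pushed by a client, not yet applied
        self.chunk = []    # mutations applied so far, in order

    def push_data(self, data_id, data):
        self.pending[data_id] = data

    def apply(self, data_id):
        self.chunk.append(self.pending.pop(data_id))


class Primary(Replica):
    """The replica that picks the serialization order for a chunk."""

    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries

    def write(self, data_id):
        # Arrival order of write requests at the primary is the serial order.
        self.apply(data_id)
        for replica in self.secondaries:
            replica.apply(data_id)


secondaries = [Replica(), Replica()]
primary = Primary(secondaries)
replicas = [primary] + secondaries

# Two clients push their data to every replica, in different orders.
for replica in replicas:
    replica.push_data("x", "data from client X")
for replica in reversed(replicas):
    replica.push_data("y", "data from client Y")

# The write requests happen to reach the primary as y, then x.
primary.write("y")
primary.write("x")
```

Every replica ends up with the same mutation order even though the data arrived in different orders, because only the primary's order is ever applied.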
[guel03] Isabelle Guyon and Andre Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157-1182, 2003. [ bib | .pdf ]
[ilar03] Ihab F. Ilyas, Walid G. Aref, and Ahmed K. Elmagarmid. Supporting top-k join queries in relational databases. In Proceedings of 29th International Conference on Very Large Data Bases (VLDB'03), pages 754-765, 2003. [ bib | .pdf | .pdf ]
Assumes joined tuples are ranked according to a monotone function of tuple ranks of join inputs. Defines physical join operators that can produce join results in rank order. Operator needs to queue up join results until it can be certain that it will produce them in the proper order.
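A rough sketch of such an operator for two inputs, assuming the combining function is the sum of the two input scores and that each input arrives sorted by descending score. It follows the general threshold idea (buffer results until no unseen tuple could beat them) and is not claimed to match the paper's operators in detail; the function name is hypothetical.

```python
import heapq
from collections import defaultdict


def rank_join(left, right):
    """Yield (key, combined_score) in descending score order.

    left, right: iterables of (key, score), each sorted by descending score.
    """
    inputs = [iter(left), iter(right)]
    seen = [defaultdict(list), defaultdict(list)]  # key -> scores read so far
    top = [None, None]    # first (highest) score read from each input
    last = [None, None]   # most recent (lowest) score read from each input
    done = [False, False]
    buffer = []           # max-heap of results, stored as (-score, key)

    def bound(i):
        """Upper bound on any result that uses an unread tuple of input i."""
        j = 1 - i
        if done[i] or (done[j] and top[j] is None):
            return float("-inf")
        if last[i] is None or top[j] is None:
            return float("inf")
        return last[i] + top[j]

    side = 0
    while not all(done):
        if done[side]:
            side = 1 - side
        try:
            key, score = next(inputs[side])
        except StopIteration:
            done[side] = True
        else:
            if top[side] is None:
                top[side] = score
            last[side] = score
            seen[side][key].append(score)
            for other in seen[1 - side][key]:
                heapq.heappush(buffer, (-(score + other), key))
        threshold = max(bound(0), bound(1))
        # A buffered result is safe to emit once nothing unseen can exceed it.
        while buffer and -buffer[0][0] >= threshold:
            neg_score, result_key = heapq.heappop(buffer)
            yield result_key, -neg_score
        side = 1 - side


left = [("a", 9), ("b", 7), ("c", 2)]
right = [("b", 8), ("c", 6), ("a", 1)]
results = list(rank_join(left, right))
```

Because the operator is a generator, a top-k consumer can stop after k results without the inputs being read to the end.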
[shba03] Ratnesh K. Sharma, Cullen E. Bash, Chandrakant D. Patel, Richard J. Friedrich, and Jeffrey S. Chase. Balance of power: Dynamic thermal management for internet data centers. Technical Report HPL-2003-5, HP Laboratories, Palo Alto, California, 2003. [ bib | .pdf | .pdf ]
Describes a methodology for thermal load balancing in server rooms. Thermal imbalances can be caused by imbalanced distribution of server workload and by peculiarities of the airflow in the server room, e.g., racks at the end of a row may be hotter than racks in the middle. Input includes server exhaust temperature readings and cold air temperature. Local thermal imbalances can be corrected by adjusting the allocation of work to the various servers.
[brko02] Nicolas Bruno, Nick Koudas, and Divesh Srivastava. Holistic twig joins: optimal XML pattern matching. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, pages 310-321, 2002. [ bib | .pdf ]
A technique for finding twig query matches without first matching individual binary subrelationships in the twig, i.e., this is an N-way structural join. Description of related work is a succinct classification of previous work on twig query processing.
[gily02] Seth Gilbert and Nancy Lynch. Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News, 33(2):51-59, 2002. [ bib | DOI | .pdf ]
[mena02] Daniel A. Menascé. TPC-W: A benchmark for E-Commerce. IEEE Internet Computing, 6(3):83-87, 2002. [ bib | .pdf ]
[pibi01] Eduardo Pinheiro, Ricardo Bianchini, Enrique Carrera, and Taliver Heath. Load balancing and unbalancing for power and performance in cluster-based systems. In Proc. Workshop on Compilers and Operating Systems for Low Power, September 2001. [ bib | .ps.gz | .ps.gz ]
Automatic power management in a cluster of servers by concentrating load on as few machines as possible and turning others off. Implemented in a web server and in a cluster operating system.
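Load concentration can be viewed as a packing problem. The sketch below uses a first-fit-decreasing heuristic, which is an assumption made for illustration and not the control policy from the paper; loads and capacity are hypothetical percentages of one machine.

```python
def concentrate(loads, capacity):
    """Pack loads onto few nodes using first-fit decreasing.

    Returns a list of nodes, each a list of the loads placed on it.
    Nodes not in the returned list can be turned off.
    """
    nodes = []
    for load in sorted(loads, reverse=True):
        for node in nodes:
            if sum(node) + load <= capacity:
                node.append(load)
                break
        else:
            # No powered-on node has room, so bring up another one.
            nodes.append([load])
    return nodes


# Hypothetical per-task loads, as a percentage of one machine's capacity.
active_nodes = concentrate([50, 20, 40, 30, 10, 50], capacity=100)
print(len(active_nodes), active_nodes)
```

A real controller would also leave headroom for load spikes and account for the time it takes to power a node back on.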
[boco01] P. Bohrer, D. Cohn, E.N. Elnozahy, T. Keller, M. Kistler, C. Lefurgy, R. Rajamony, F. Rawson, and E. V. Hensbergen. Energy conservation for servers. In Proc. IEEE Workshop on Power Management for Real-Time and Embedded Systems, May 2001. [ bib | .pdf | .pdf ]
A brief general overview of the problem of energy conservation in data centers.
[horn01] Paul Horn. Autonomic computing: IBM's perspective on the state of information technology. Technical report, International Business Machines Corporation, Armonk, NY, USA, 2001. [ bib | .pdf ]
[poha01] Rachel Pottinger and Alon Y. Halevy. MiniCon: A scalable algorithm for answering queries using views. VLDB Journal, 10(2-3):182-198, 2001. [ bib | .pdf | .pdf ]
[rodr01] Antony I. T. Rowstron and Peter Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Proc. IFIP/ACM International Conference on Distributed Systems Platforms (Middleware'01), pages 329-350, 2001. [ bib | .pdf ]
[stmo01] Ion Stoica, Robert Morris, David R. Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In Proc. ACM SIGCOMM Conference, pages 149-160, 2001. [ bib | DOI | .pdf | .pdf ]
[brew00] Eric A. Brewer. Towards robust distributed systems. Keynote presentation, ACM Symposium on Principles of Distributed Computing (PODC), July 2000. [ bib | .pdf | .pdf ]
Presentation of the CAP conjecture.
[beha00] Philip A. Bernstein, Alon Y. Halevy, and Rachel Pottinger. A vision of management of complex models. SIGMOD Record, 29(4):55-63, 2000. [ bib | .pdf | .pdf ]
[grbr00] Steven D. Gribble, Eric A. Brewer, Joseph M. Hellerstein, and David E. Culler. Scalable, distributed data structures for internet service construction. In Proc. of the Symp. on Operating System Design and Implementation (OSDI'00), pages 319-332, 2000. [ bib | .pdf ]
[yuva00] Haifeng Yu and Amin Vahdat. Design and evaluation of a continuous consistency model for replicated services. In Proc. of the Symp. on Operating System Design and Implementation (OSDI'00), pages 21-21, 2000. [ bib | .pdf ]
[grgr97] Jim Gray and Goetz Graefe. The five-minute rule ten years later, and other computer storage rules of thumb. SIGMOD Record, 26(4):63-68, December 1997. [ bib | DOI ]
[pesp97] Karin Petersen, Mike J. Spreitzer, Douglas B. Terry, Marvin M. Theimer, and Alan J. Demers. Flexible update propagation for weakly consistent replication. In Proc. of the ACM Symp. on Operating Systems Principles (SOSP'97), pages 288-301, 1997. [ bib | DOI | .pdf ]
[tsso96] Odysseas G. Tsatalos, Marvin H. Solomon, and Yannis E. Ioannidis. The GMAP: a versatile tool for physical data independence. The VLDB Journal, 5:101-118, 1996. [ bib ]
[lesi92] Eliezer Levy and Avi Silberschatz. Incremental recovery in main memory database systems. IEEE Transactions on Knowledge and Data Engineering, 4(6):529-540, December 1992. [ bib | .pdf ]
Incremental, page-at-a-time database recovery on demand, rather than recovery of entire DB before transaction processing resumes. Disk portion of log has redo records only, grouped by the page they refer to. Parallel processes flush log records to disk and also apply logged updates to disk version of the page. Buffer manager also does page flushes (no steal policy). Safe-fetch rule says log updates are only applied to (disk copies of) pages that are in the buffer pool - not clear how this helps. Relies on non-volatile RAM to store a map indicating which pages may be stale after a crash, so that they can be brought up to date before being read in after a failure. Targets shared memory multiprocessor, where logger and propagator (applies log updates to pages) can run on their own processors. Seems to rely on page-level locking for correctness.
[degr92] David J. DeWitt and Jim Gray. Parallel database systems: The future of high-performance database systems. Communications of the ACM, 35(6):85-98, 1992. [ bib | .pdf ]
Discusses scale-up and speed-up as two distinct parallelism objectives. Discusses shared-memory, shared-disk, and shared-nothing architectures and argues that the latter will provide the best scalability: it minimizes interaction between nodes and so places the least demands on the interconnection network. Discusses data partitioning and parallelization of relational query operators.
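The two objectives differ in whether the problem size is held fixed. A small sketch of the two ratios, plus hash partitioning as one of the declustering strategies; function names and the example numbers are hypothetical.

```python
def speedup(small_system_time, big_system_time):
    """Same problem on both systems. Linear speed-up on N times the
    hardware means this ratio equals N."""
    return small_system_time / big_system_time


def scaleup(small_system_small_problem_time, big_system_big_problem_time):
    """Problem grows along with the system. Linear scale-up means this
    ratio stays at 1.0."""
    return small_system_small_problem_time / big_system_big_problem_time


def hash_partition(rows, key, n_nodes):
    """Decluster rows across n_nodes by hashing the partitioning attribute."""
    partitions = [[] for _ in range(n_nodes)]
    for row in rows:
        partitions[hash(row[key]) % n_nodes].append(row)
    return partitions


rows = [{"id": i, "value": i * i} for i in range(8)]
parts = hash_partition(rows, "id", n_nodes=4)

# Hypothetical timings: 120 s on 1 node, 30 s on 4 nodes, same query.
print(speedup(120.0, 30.0))
```

With the data spread this way, a scan or selection can run on every partition independently, which is what keeps traffic on the interconnect low in a shared-nothing design.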
[grae90] Goetz Graefe. Encapsulation of parallelism in the volcano query processing system. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pages 102-111, 1990. [ bib | DOI | .pdf ]
[gr89] The Tandem Database Group. NonStop SQL, a distributed high performance, high availability implementation of SQL. In D. Gawlick, M. N. Haynie, and A. Reuter, editors, Proc. 2nd Int'l Workshop on High Performance Transaction Systems, volume 359 of Lecture Notes in Computer Science, pages 60-104. Springer-Verlag, 1989. workshop dates September 28-30, 1987. [ bib | .pdf | .pdf ]
[okli88] B. Oki and B. Liskov. Viewstamped replication: A new primary copy method to support highly-available distributed systems. In ACM Symp. on Principles of Distributed Computing, 1988. [ bib ]
[grpu86] Jim Gray and Franco Putzolu. The 5 minute rule for trading memory for disk accesses and the 5 byte rule for trading memory for cpu time. Technical Report 86.1, Tandem Computers, May 1986. Original report was May 1985. [ bib | .pdf | .pdf ]
[lamp78] Leslie Lamport. Time, clocks and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July 1978. [ bib | .pdf ]
[mage70] R. L. Mattson, J. Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2):78-117, June 1970. [ bib | DOI | .pdf ]
Includes a proof of optimality of the MIN algorithm.
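For reference, MIN is the offline replacement policy that, on a miss with a full buffer, evicts the page whose next reference lies farthest in the future. A minimal simulation sketch follows; the function name is hypothetical and the reference string is a common textbook example, not data from the paper.

```python
def min_misses(refs, capacity):
    """Count misses for the MIN policy on reference string refs."""
    cache = set()
    misses = 0
    for i, page in enumerate(refs):
        if page in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(candidate):
                # Position of the next reference, or infinity if never reused.
                try:
                    return refs.index(candidate, i + 1)
                except ValueError:
                    return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(page)
    return misses


refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(min_misses(refs, capacity=3))
```

MIN needs the whole future reference string, so it serves as a lower bound for evaluating practical policies rather than as something a buffer manager can run online.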
[xero11] Xeround. Xeround cloud database, part 1 - technology. Xeround white paper. downloaded March 2011. [ bib | .pdf | .pdf ]
Multiple MySQL front ends, replicated partitioned data. Assignment of partitions to nodes can be adjusted to support elastic scale-out. Supports distributed query execution. Company offers database service, rather than software.