Topics in Database Systems

CS 748M, Waterloo, Fall '00


Synopsis:

Traditional database systems focus on the concurrent execution of numerous but relatively small transactions over large databases: on-line transaction processing (OLTP).

Recently, however, the emphasis has shifted to on-line analytical processing (OLAP): answering a few rather complex queries over very large or complex data sets with a predetermined interpretation, commonly collected over long periods of time.

The class will focus on the requirements of such systems (what questions are interesting and/or useful), on the limitations of current approaches (what is feasible with current technology, or at all), on efficient storage and data models (what do we really need to remember to answer particular OLAP queries), and on query processing algorithms (including approximate answers).

Sample topics:

  • What if I need to store information about Time and Space?
  • What about other a priori interpreted data?
  • What data can we throw away?
  • What if the database (locally) violates integrity constraints?
  • What if I don't need an exact answer?
  • ...
  • The list will be extended with additional topics depending on interest (and sufficient expertise).

    The course will be run as a seminar with few formal lectures. However, participants are expected to prepare for each class session (reading assigned papers/papers relevant to the topic, trying to solve questions from the previous session, etc.). The classes will usually begin with a short presentation followed by discussion.

    Projects:

    Projects are an integral part of the class; to complete your project you have to
  • pick a project: your choice of an in-depth paper (explore a problem in depth and write a paper) or an implementation project (produce a working prototype). I will suggest several topics in one of the first classes. However, I would appreciate your taking the initiative to find your own topic.
  • prepare a proposal: a written description of the goals (1 page) and a short presentation in class (10 minutes). The proposal will be discussed in class, and the agreed-upon result will be your contract for passing the class.
  • work on your project.
  • prepare a final report and a presentation to the class (about 30 minutes; for an implementation project you'll need to show test cases, example runs, and/or a live demo).
  • For large projects you may form groups; in general you can collaborate as much as you want, but you must indicate your own work (and give credit to others when appropriate).

    Assessment:

    There will be no formal examinations or assignments. However, I'll try to summarize at the end of each session what you should look at for the next class (and your grade depends on how well you were prepared for classes, so preparation is effectively mandatory). The assessment will be based on
  • the quality of the submitted project (70%; you can't pass without completing your project), and
  • participation in class discussion (30%).
  • Attendance at classes is semi-mandatory: you are unlikely to pass if I can't remember your face at the end of the semester.

    Prerequisites:

    The lectures assume prior familiarity with database systems at the level of an introductory database class. In addition, you ought to freshen up your skills in the following areas:
  • Elementary math (sets, relations, first-order logic).
  • Basics of complexity theory (e.g., O-notation).


  • Fine print: the usual university policies on academic honesty, fair use of computing facilities, etc., apply by default.