Topics in Database Systems

CS 748M, Waterloo, Fall '98


  Instructor:   David Toman (david@uwaterloo.ca)
  Org. Meeting: DC 3307, Sept. 16, 1998, 2pm
  Office:       DC 3128, x4777
  Lectures:     DC 3307, Th 4pm 
  Class Info:   http://db.uwaterloo.ca/~david/classes/dbtopics-fall98


Slides announcing the class

Class Resources on the Web:

Announcements, materials, and class notes are or will be available on the Web; you are expected to look them up here: Lecture notes, Schedule and Reading list, and Projects (to appear as the class starts).

Synopsis:

Traditional database systems focus on the concurrent execution of numerous but relatively small transactions over large databases: on-line transaction processing (OLTP).

Recently, however, the emphasis has shifted to on-line analytical processing (OLAP): answering few but rather complex queries over very large data sets, commonly collected over long periods of time. The goal is to discover patterns in the data, e.g., association rules or trends in time sequences generated by market data.

The class will focus on the requirements of an OLAP system (what questions are interesting and/or useful), on the limitations of current approaches (what is feasible using current technology, or at all), on efficient storage and data models (what do we really need to remember to answer particular OLAP queries), and on query processing algorithms (including approximate answers).

Sample topics:

  • Time in Databases: how to manage historical data,
  • What about other interpreted data: Constraint Databases and GIS,
  • On-line Analytical Processing (Data Cubes, etc.),
  • Data Warehousing (does it really help?),
  • Data mining and how it relates to statistical methods,
  • Approximate query answering,
  • ...
  • The list will be extended with additional topics depending on interest (and sufficient expertise).

    The course will be run as a seminar with few formal lectures. However, participants are expected to prepare for each class session (reading assigned papers/papers relevant to the topic, trying to solve questions from the previous session, etc.). The classes will usually begin with a short presentation followed by discussion.

    Projects:

    Projects are an integral part of the class; to complete your project you have to
  • pick a project: your choice of an in-depth paper (explore a problem in depth and write a paper), a breadth survey (survey a topic and write a report), or an implementation project (produce a working prototype). I will suggest several topics in one of the first classes. However, I'd appreciate your taking the initiative in finding your own topic.
  • prepare a proposal: a written description of the goals (1 page) and a short presentation in class (10 minutes). The proposal will be discussed in class, and the agreed-upon result will be your contract for passing the class.
  • work on your project.
  • prepare a final report and a presentation to the class (about 30 minutes; for an implementation project you'll need to show test cases, example runs, and/or a live demo).
  • For large projects you may form groups; in general you can collaborate as much as you want, but you must indicate your own work (and give credit to others when appropriate).

    Assessment:

    There will be no formal examinations or assignments. However, I'll try to summarize at the end of each session what you should look at for the next class (your grade depends on how well you are prepared for classes, so preparation is effectively mandatory). The assessment will be based on
  • the quality of the submitted project (70%; you can't pass without completing your project), and
  • participation in class discussion (30%).
  • Attendance at classes is semi-mandatory: you are unlikely to pass if I can't remember your face at the end of the semester.

    Prerequisites:

    The lectures assume prior familiarity with database systems at the level of an introductory database class. In addition, you ought to freshen up your skills in the following areas:
  • Elementary math (sets, relations, first-order logic).
  • Basics of complexity theory (e.g., O-notation).


  • Fine print: the usual university policies on academic honesty, fair use of computing facilities, etc., apply by default.