CS 856 
Web Data Management

Guidelines for Term Projects

Instructor: M. Tamer Özsu

Project Objectives and Scope

The project consists of picking a research problem related to data management on the Internet and Web and working on its solution for the duration of the term. Any topic that is covered in the course is acceptable as the domain from which a problem can be selected. You can also pick an area of your choosing, but in this case, please consult with me as soon as possible.

What I expect in the project is a good understanding of the problem (resulting in a survey part), insight into its solution, and a well-defined strategy for reaching that solution. You should treat the term project as if you were doing the initial background study for further in-depth research. In other words, the report should demonstrate an understanding of and an insight into the problem such that, given enough time, you could carry it to its logical conclusion and complete the research. How far you go into the solution, together with the difficulty of the problem, will determine your final mark.

The term project will be done in groups of two (unless the class is very large and we move to groups of three). The final project mark will be computed as follows: (0.3 * survey part mark) + (0.7 * research part mark). All members of the group will receive the same grade. Therefore, it is incumbent upon you to make sure that both partners share in the work (and let me know very quickly if the partnership is not working).
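As an illustration of the weighting above, here is a minimal Python sketch. The function name and the sample marks of 80 and 70 are hypothetical, and the sketch assumes both parts are marked on the same scale.

```python
def final_project_mark(survey_mark: float, research_mark: float) -> float:
    """Combine the two part marks using the 0.3 / 0.7 weighting stated above."""
    return 0.3 * survey_mark + 0.7 * research_mark


# Hypothetical marks, for illustration only.
example = final_project_mark(survey_mark=80, research_mark=70)
print(round(example, 1))
```

With these sample values the survey part contributes 24 points and the research part 49, so the research part accounts for most of the final mark.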

Deliverables

There are two deliverables of the term project:
  1. A survey report that describes the problem domain, with a proper problem definition, and a survey of existing work. This should be about 25 typed pages (12pt type with 1.5 spacing). This report, as indicated in the schedule below, will be handed in at the end of the 7th week of classes. Each person will then be responsible for presenting the field to the class and leading a discussion. This will start in Week 8 and each topic will get about 1 hour of class time (the timing may be adjusted based on the number of people taking the course).
  2. A research report that describes your own attempt to either solve a problem in this domain or go a long way towards its solution. What I minimally expect is a good solution approach such that if I gave you 2-3 more months, you could complete the solution, conduct the experiments and produce a publishable paper. This report should be about 30 typed pages, including a summary survey (i.e., a summary of part 1) of about 5 pages.
The first part is something that every individual (group) should do well; the second part is going to vary with each individual's (group's) abilities. It is possible that some groups may not do well on the second part; you should, therefore, make sure that you do really well on the survey part to get a decent mark.

Note that the report is very important. It should be written carefully so that I can understand it easily. There will probably not be enough time for me to spend an inordinate amount of effort trying to understand what you mean. So include the necessary introductory material and make sure that the presentation is good. All reports should be typewritten. Use a word processor of your choice.

Schedule

The following is the schedule that we will follow. We may have to adjust the dates a bit based on the number of people in the class (which will determine the presentation schedule).
September 20:
Please find a partner quickly.
September 30:
Find a problem that you want to work on. The list below gives some problems that I am interested in and would like to cover in this course. If you wish to pick another topic, make an appointment and see me first.
October 14:
You should have finished reading the literature in the area by now. You'll have ten days to write the survey report.
October 16:
I would like to receive a problem definition that is 1-3 typed pages. This document should include a clear description of the problem on which you are going to work, within the area covered by your survey.
October 25:
Your survey report should be in by 4PM.
December 20:
Absolute deadline for handing in final reports (by 4PM).

Resources

You should be looking at the proceedings of conferences such as ACM SIGMOD, VLDB, ICDE, WWW, ICDCS, WISE, ... and at journals such as ACM TODS, IEEE TKDE, VLDB Journal, Distributed and Parallel Databases Journal, Journal of Intelligent Information Systems, World Wide Web, and many others.

Most of these publications can be obtained through the University of Waterloo Library's TRELLIS System, the ACM Digital Library, the VLDB Endowment web page, or Michael Ley's Computer Science Bibliography. Michael's Bibliography is probably the best place to start since it incorporates many of the papers. In a few cases where I thought you might have difficulty finding the paper, I have included a link.
 

