The Fall 2009 offering covers the management of text databases. At the end of the course the student will be able to design or evaluate a database subsystem capable of supporting the needs of text creators and users who wish to access machine-readable text interactively or use XML for data interchange. Students will be familiar with structured text standards, including XML. Students will be able to design wrappers, storage structures, and index methods appropriate for text and understand traditional text applications, such as information retrieval.
Students are expected to understand the fundamentals of database systems, programming language specifications, and data structures and algorithms, each at least at the level of an introductory course.
Text:
J. Melton and S. Buxton, Querying XML: XQuery, XPath, and SQL/XML in Context. Morgan Kaufmann, 2006.
There will be no tests or exams.
Please review the materials concerning plagiarism and academic honesty. You must complete and sign the Academic Integrity Acknowledgement Form and hand it in by class time on Thursday, October 1.
Special seminar: Dirk Van Gucht, The Duality Between Query Languages and Index Structures, Thursday, November 19, 9:30-10:30, DC 1304. Note: non-standard start time; non-standard room.
Tuesdays and Thursdays 10-11:20 am
RCH 106
Office hours: Mondays 2-5 pm
1. Introduction
Text-dominated databases. Overview of W3C's XML specifications: XML core, query language, schema, and transformations. Common applications of XML.
2. Data Model
Structured text data model(s) and properties. DOM and SAX. XPath and other path expressions.
3. Text DDLs
DTDs, XML Schema, Relax-NG.
4. Text DMLs
XQuery FLWOR expressions. Full text facilities.
Techniques for storing structured text, including graph and interval encodings. Text indexing techniques. Indexing semi-structured data.
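As a rough illustration of the interval encodings mentioned above, the following sketch labels each element of a small document with a (start, end, level) triple from a single depth-first pass, so that ancestor tests reduce to interval containment. The function names interval_encode and is_ancestor and the sample document are assumptions made for illustration only; they are not taken from the course materials, and real systems use many variations of this labeling.

```python
import xml.etree.ElementTree as ET


def interval_encode(root):
    """Assign a (start, end, level) label to every element in one depth-first pass.

    Hypothetical helper: the counter is incremented once on entering an element
    and once on leaving it, so each element's interval encloses the intervals
    of all of its descendants.
    """
    labels = {}
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        for child in elem:
            visit(child, level + 1)
        counter += 1
        labels[elem] = (start, counter, level)

    visit(root, 0)
    return labels


def is_ancestor(labels, a, d):
    """Return True if element a is a proper ancestor of element d."""
    a_start, a_end, _ = labels[a]
    d_start, d_end, _ = labels[d]
    return a_start < d_start and d_end < a_end


# Sample document, invented for this sketch.
doc = ET.fromstring(
    "<book><chapter><title>Intro</title><para>Text</para></chapter><index/></book>"
)
labels = interval_encode(doc)

for elem in doc.iter():
    print(elem.tag, labels[elem])
```

With labels of this kind stored alongside each element, a structural join (for example, finding all title elements inside a chapter) can be evaluated by comparing numbers rather than by walking the tree.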
6. Updates and Transformations
Support for updates. Transaction management. XSLT.
XQuery core. Static typing. Dynamic semantics.
8. Query Processing and Optimization
Region algebras. Algorithms for native processing of queries. Query optimization techniques.
Mappings to and from the relational model. SQL/XML.
10. View Matching
Using materialized SQL/XML views to answer SQL/XML queries.
11. Web Services
Internationalization. SOAP. WSDL.
12. Streaming Text
Publish-subscribe systems. Pre-filtering documents.