Text Data Manipulation Languages
SQL and OQL
Selection of components, but no ability to reshape them or create new ones
Lorel
For use with the Lore semi-structured data model (OEM directed graph)
Recall: EB bibliography example
What works written after 1960 appear in
references that include at least two citation segments?
select distinct m.title.t
from   eb_bib.para.ref r,
       r.cite m, r.cite n
where  (m != n) and
       (m.title.date > 1960)
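To make the semantics of the query above concrete, here is a minimal Python sketch that evaluates the same path expressions over a small OEM-style graph. The sample data, the nested-dictionary encoding, and the helper names `follow`, `path`, and `works_after_1960` are all assumptions made for illustration; they are not part of Lore or Lorel. The sketch also assumes that `m != n` compares object identity.

```python
# Hypothetical OEM-style data: every object maps edge labels to lists of subobjects.
eb_bib = {
    "para": [
        {"ref": [
            {"cite": [
                {"title": [{"t": ["Work A"], "date": [1965]}]},
                {"title": [{"t": ["Work B"], "date": [1950]}]},
            ]},
            {"cite": [
                {"title": [{"t": ["Work C"], "date": [1970]}]},
            ]},
        ]},
    ],
}


def follow(objs, label):
    """Follow one labelled edge from each object (one step of a path expression)."""
    return [child
            for obj in objs if isinstance(obj, dict)
            for child in obj.get(label, [])]


def path(root, *labels):
    """Evaluate a dotted path such as para.ref starting from root."""
    objs = [root]
    for label in labels:
        objs = follow(objs, label)
    return objs


def works_after_1960(root):
    titles = []
    for r in path(root, "para", "ref"):            # from eb_bib.para.ref r
        cites = path(r, "cite")
        for m in cites:                            # r.cite m
            for n in cites:                        # r.cite n
                recent = any(d > 1960 for d in path(m, "title", "date"))
                if m is not n and recent:          # where (m != n) and (...)
                    for t in path(m, "title", "t"):
                        if t not in titles:        # select distinct
                            titles.append(t)
    return titles


print(works_after_1960(eb_bib))  # ['Work A']
```

Binding `m` and `n` to two different `cite` objects of the same `ref` is what enforces "at least two citation segments": the second reference in the sample has a single citation, so "Work C" is not returned even though it is dated after 1960.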
XQuery
Roots in conventional database querying, information retrieval, and Web search engines
XML vs. relations
Requirements
For-Let-Where-OrderBy-Return (FLWOR) clauses
Functions and Operators
User-defined functions
What works written after 1960 appear in
references that include at least two citation segments?
<biblio>
{
  for $r in doc("eb-bib.xml")//ref
  let $c := $r/cite
  for $t in $c/title
  where $t/@date > 1960
    and count($c) > 1
  order by number($t/@date) descending
  return
    <citation>
      { $r//author }
      <work>{ $t/text() }, Ed. { string($t/@edition) }</work>
      <date>{ string($t/@date) }</date>
    </citation>
}
</biblio>
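The clause-by-clause evaluation of this query can be traced in a short Python sketch using the standard `xml.etree.ElementTree` module. The sample document and the function name `build_biblio` are assumptions for illustration; the real `eb-bib.xml` is not shown here, and the sketch only assumes that each `ref` contains `cite` elements holding `author` and `title`, with `date` and `edition` as attributes of `title`.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for eb-bib.xml.
SAMPLE = """
<bib>
  <ref>
    <cite><author>Smith</author><title date="1965" edition="2">Work A</title></cite>
    <cite><author>Jones</author><title date="1950" edition="1">Work B</title></cite>
    <cite><author>Lee</author><title date="1972" edition="3">Work C</title></cite>
  </ref>
  <ref>
    <cite><author>Kim</author><title date="1980" edition="1">Work D</title></cite>
  </ref>
</bib>
"""


def build_biblio(xml_text):
    root = ET.fromstring(xml_text)
    rows = []
    for r in root.iter("ref"):                    # for: every ref in the document
        cites = r.findall("cite")                 # let: the citations of this ref
        for t in r.findall("cite/title"):         # for: every cited title
            date = int(t.get("date"))
            if date > 1960 and len(cites) > 1:    # where: both conditions
                rows.append((date, r, t))
    rows.sort(key=lambda row: row[0], reverse=True)   # order by date, descending

    biblio = ET.Element("biblio")
    for date, r, t in rows:                       # return: one citation per row
        citation = ET.SubElement(biblio, "citation")
        for a in r.iter("author"):                # all authors within the ref
            ET.SubElement(citation, "author").text = a.text
        work = ET.SubElement(citation, "work")
        work.text = f"{t.text}, Ed. {t.get('edition')}"
        ET.SubElement(citation, "date").text = str(date)
    return biblio


print(ET.tostring(build_biblio(SAMPLE), encoding="unicode"))
```

With this sample, only the first reference qualifies (it has three citations), and its two post-1960 titles come out with the most recent date first. As in the query, every citation lists all authors found anywhere in the enclosing reference, not just the author of the cited work.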
For more examples, examine some of the XML Use Cases or examples available through MonetDB. A list of implementations is maintained by W3C.
XQuery Full Text
Extensions for keyword search, including Boolean combinations, stemming, proximity, thesaurus expansion, stopwords, and ordering:
for $book in doc("http://bstore1.example.com/full-text.xml")/books/book
let $cont := $book/content
where $cont contains text ("software" ftand "developer")
      using stemming distance at most 3 words
return $book
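What the full-text condition tests can be approximated in a few lines of Python. This is only a sketch under stated assumptions: `crude_stem` is a deliberately rough stand-in for a real stemming algorithm, `ft_and_within` is a hypothetical helper name, and distance is assumed to count the words lying between the two matches (so adjacent words are at distance 0).

```python
import re


def crude_stem(word):
    """Rough stand-in for a stemmer: lowercase and strip a few common suffixes."""
    word = word.lower()
    for suffix in ("ers", "ing", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def ft_and_within(text, term_a, term_b, max_gap=3):
    """True if both stemmed terms occur with at most max_gap words between them."""
    tokens = [crude_stem(w) for w in re.findall(r"[A-Za-z]+", text)]
    a, b = crude_stem(term_a), crude_stem(term_b)
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    # Both terms must occur (the "and" part) and be close enough (the distance part).
    return any(abs(i - j) - 1 <= max_gap for i in pos_a for j in pos_b)


print(ft_and_within("Developers of open source software", "software", "developer"))
print(ft_and_within("The software was shipped long before any developer joined",
                    "software", "developer"))
```

The first call matches because stemming maps "Developers" and "developer" to the same form and only three words separate the two terms; the second fails the proximity test because five words lie between them.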
For more examples, examine some of the XML Full Text Use Cases.
Querying XML, Chapters 10, 11, 12, 13.2, 14, C.2