Updating and Transforming Structured Text
Let XQueryExpr be any expression, XQueryExpr1 be an expression that evaluates to a single node, and QName be an expression that evaluates to a single qualified name:
o insert (node | nodes) XQueryExpr (as last | as first)? into XQueryExpr1
o insert (node | nodes) XQueryExpr (before | after) XQueryExpr1
o delete (node | nodes) XQueryExpr
o replace (value of)? node XQueryExpr1 with XQueryExpr
o rename node XQueryExpr1 as QName
o ( )
Note: updates must obey the XQuery data model:
1. Insertion can only be into an element or document node, and before or after an element, text, comment, or processing instruction node.
2. Before inserting or replacing, sequences of atomic values must be cast to text with intervening blanks and inserted as a single text node. After updating, adjacent text nodes must be coalesced with no intervening blanks.
3. Replacement of the value of an element or document node must constitute well-formed element content.
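Rule 2 above can be illustrated with a minimal Python sketch. It is not part of any XQuery processor: the child list is a hypothetical model in which a Python str stands for a text node and any other object stands for a non-text node, and str() stands in for XQuery's cast to xs:string.

```python
def atomics_to_text(values):
    """Cast a sequence of atomic values to the content of ONE text node,
    separating the values with single blanks."""
    return " ".join(str(v) for v in values)


def coalesce_text(children):
    """Merge adjacent text nodes (modelled as str) with no intervening blank.
    Non-text nodes (anything that is not a str) pass through unchanged."""
    merged = []
    for child in children:
        if isinstance(child, str) and merged and isinstance(merged[-1], str):
            merged[-1] += child          # glue onto the previous text node
        else:
            merged.append(child)
    return merged


# Hypothetical usage: three atomic values become one text node,
# and two neighbouring text nodes become one after an update.
text = atomics_to_text([1960, "Morton", 3.5])
kids = coalesce_text(["W.L. ", "Morton", ("element", "date"), "x"])
```

Here text is "1960 Morton 3.5" and kids has three entries: the merged text "W.L. Morton", the element placeholder, and "x".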
Update statements may appear in the return clause of any FLWOR expression.
Incompatible updates result in an error.
Update statements can also be used in conditional expressions and in function definitions that are declared to be "updating".
Delete references that include two citations, at least one of which is to a work written before 1960.
for $r in doc("eb-bib.xml")//ref
let $c := $r/cite
where $r//title/@date < 1960
  and count($c) > 1
return delete node $r
Replace W.L. Morton by William Lewis Morton wherever it appears as the name of an author.
for $r in doc("eb-bib.xml")//ref//author[text()="W.L. Morton"]
return replace value of node $r with "William Lewis Morton"
For each use of "Ibid." insert the appropriate author or authors.
for $r in doc("eb-bib.xml")//ref,
    $c in $r/cite[@type="ibid"]
return (
  insert nodes $r/cite/author as first into $c,
  replace value of node $c/@type with "full"
)
"Snapshot semantics"
1. Pre-update processing
   a. Bind the variables declared in the for and let clauses
   b. Evaluate each update in the list of simple updates, and append to a "pending update" list
2. Perform semantic checking for validity (aborting if result would be invalid)
3. Apply the updates sequentially
4. Re-validate, if desired, to re-establish data types
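The steps above can be sketched in Python to show why the snapshot matters. This is a toy model under stated assumptions, not how any XQuery engine is implemented: the "document" is a dict from node id to value, run_updates and make_update are hypothetical names, and the only incompatibility checked is two replacements of the same node.

```python
def run_updates(doc, bindings, make_update):
    """doc: dict mapping node id -> value (toy document).
    make_update(snapshot, binding) returns ("replace", node, value),
    ("delete", node), or None."""
    snapshot = dict(doc)                 # every expression sees the ORIGINAL state
    pending = []                         # step 1: build the pending update list
    for b in bindings:
        upd = make_update(snapshot, b)
        if upd is not None:
            pending.append(upd)

    # step 2: semantic check, abort before anything has been changed
    replaced = [u[1] for u in pending if u[0] == "replace"]
    if len(replaced) != len(set(replaced)):
        raise ValueError("incompatible updates: node replaced twice")

    # step 3: apply the pending updates (replacements first, deletions last)
    result = dict(doc)
    for u in pending:
        if u[0] == "replace":
            result[u[1]] = u[2]
    for u in pending:
        if u[0] == "delete":
            result.pop(u[1], None)
    return result


# Swapping two values works because each replacement reads the snapshot,
# not the partially updated document.
swap = lambda snap, n: ("replace", n, snap["b" if n == "a" else "a"])
swapped = run_updates({"a": 1, "b": 2}, ["a", "b"], swap)
```

With snapshot semantics swapped is {"a": 2, "b": 1}; applying each update immediately would instead copy one value over both nodes.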
o Consider tree structure corresponding to XML (attributes treated as children before subelements)
o Construct a bottom-up tree automaton that matches nodes from leaves to the root
   1. Content model for an element (regular expression) converted to FSA
   2. Symbol table for an element includes bag of IDs and bag of IDREFs used in subtree
o XML is valid if
   1. automaton for each node matches list of children
   2. no IDs repeated in root's symbol table
   3. set of IDREFs is a subset of set of IDs in root's symbol table
o Construct similar tree automaton for insertion or replacement value and check for "local validity" (only 1 and 2)
o Check for compatibility of update
   1. re-execute automaton at insertion node against updated list of children
      o Note: can start from state corresponding to point of insertion
   2. check IDs(modRoot) = IDs(root) - IDs(deletion) + IDs(insertion) for duplicate values
   3. check IDREFs(root) - IDREFs(deletion) + IDREFs(insertion) - IDs(modRoot) for dangling values
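Checks 2 and 3 of the compatibility test are bag arithmetic, which a short Python sketch can make concrete. It assumes the symbol tables are available as collections.Counter objects (bags); check_update and the sample values are hypothetical and only illustrate the two formulas above.

```python
from collections import Counter


def check_update(root_ids, root_idrefs, del_ids, del_idrefs, ins_ids, ins_idrefs):
    """All arguments are Counters (bags of ID / IDREF values).
    Returns (duplicate IDs, dangling IDREFs) for the modified document."""
    # IDs(modRoot) = IDs(root) - IDs(deletion) + IDs(insertion)
    mod_ids = root_ids - del_ids + ins_ids
    # IDREFs(root) - IDREFs(deletion) + IDREFs(insertion)
    mod_idrefs = root_idrefs - del_idrefs + ins_idrefs
    duplicates = sorted(i for i, n in mod_ids.items() if n > 1)
    dangling = sorted(r for r in mod_idrefs if r not in mod_ids)
    return duplicates, dangling


# Hypothetical update: delete a subtree that defines ID "c" and refers to "a",
# insert a subtree that defines IDs "b" and "d" and refers to "d".
result = check_update(
    root_ids=Counter(["a", "b", "c"]),
    root_idrefs=Counter(["a", "c", "c"]),
    del_ids=Counter(["c"]),
    del_idrefs=Counter(["a"]),
    ins_ids=Counter(["b", "d"]),
    ins_idrefs=Counter(["d"]),
)
```

In this example result is (["b"], ["c"]): "b" would be defined twice, and the remaining references to "c" would dangle. The check touches only the root's symbol table and the two subtrees, not the whole document.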
o Merely wrap statements in begin transaction / end transaction commands
o Atomic (all or nothing)
o Consistent (preserve DB constraints)
o Isolated (independent of other transactions, but relaxed by considering ANSI isolation levels)
(1) read uncommitted
(2) read committed
(3) repeatable read
(4) serializable
o Durable (changes guaranteed upon commit)
o How to apply locks at the XML level?
o based on strict 2PL locking for trees (lock paths from the root)
o account for predicates on content as well
                        Granted
Requested | None | IS | IX | S  | SIX | X
IS        |  +   | +  | +  | +  |  +  | -
IX        |  +   | +  | +  | -  |  -  | -
S         |  +   | +  | -  | +  |  -  | -
SIX       |  +   | +  | -  | -  |  -  | -
X         |  +   | -  | -  | -  |  -  | -
(+ = compatible, the request can be granted; - = conflict, the requester must wait)
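The compatibility matrix above can be encoded directly as a lookup. The following Python sketch treats every entry that is not "+" as a conflict; COMPATIBLE and can_grant are hypothetical names, and a real lock manager would also need queues and deadlock handling, which are omitted here.

```python
# For each requested mode: the set of already-granted modes it can coexist with.
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}


def can_grant(requested, held):
    """held: modes currently granted to OTHER transactions on the same node.
    An empty collection corresponds to the 'None' column of the matrix."""
    return all(mode in COMPATIBLE[requested] for mode in held)


# Hypothetical usage: a reader's S request against existing locks.
ok_with_readers = can_grant("S", ["IS", "S"])
blocked_by_writer_intent = not can_grant("S", ["IX"])
```

Both ok_with_readers and blocked_by_writer_intent evaluate to True for the matrix given above.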
o For an individual update command
   1. determine all nodes (from the root) on each specified path, and predicates for each node
   2. for each node to be read, acquire IS locks on all ancestors (in order), then S on node
   3. for each node to be updated, acquire IX locks on all ancestors (in order), then X on node
o apply locks to DataGuide, rather than to data itself
   o appropriate even when data is not stored as a graph
   o (usually) smaller than data graph
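The locking protocol for a single command can be sketched as a function that lists the lock requests in acquisition order. This is an illustrative Python sketch only: lock_plan is a hypothetical name, paths are given as label sequences (as they would appear in a DataGuide), and predicates on content are ignored.

```python
def lock_plan(path, update=False):
    """path: list of labels from the root to the target node.
    Returns (label path, lock mode) pairs in the order they are acquired:
    intention locks on every ancestor from the root down, then S or X
    on the target node itself."""
    intent, final = ("IX", "X") if update else ("IS", "S")
    plan = []
    for depth in range(1, len(path)):
        plan.append(("/" + "/".join(path[:depth]), intent))
    plan.append(("/" + "/".join(path), final))
    return plan


# Hypothetical label paths; the element names are assumptions for illustration.
read_plan = lock_plan(["bib", "ref", "cite"])
write_plan = lock_plan(["bib", "ref"], update=True)
```

read_plan requests IS on /bib and /bib/ref and then S on /bib/ref/cite; write_plan requests IX on /bib and then X on /bib/ref. Because the plan is expressed over label paths, one lock covers all data nodes reached by that path, which is the trade-off of locking the DataGuide instead of individual nodes.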
o Grammar-preserving (simple changes to content)
o Local structural modifications (simple insertions, deletions, or rearrangements)
o Global rearrangements (including inversions)
o Multi-document segmentation and integration
Usually needs user-defined functions to reconstruct nested structure
XQuery Update Facility defines a transformation operator (again to be used in a return clause)
o copy XQueryVar := XQueryExpr1 ( , XQueryVar := XQueryExpr1 )*
  modify XQueryUpdateExpr1
  return XQueryExpr1
Transformations do not update persistent data.
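The copy / modify / return pattern can be mimicked in Python to show why the source data is left untouched. This is an analogy built on the standard library's ElementTree and copy.deepcopy, not an XQuery implementation; transform, expand_author, and the sample ref element are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def transform(source, modify, result):
    """Analogue of:  copy $c := source  modify modify($c)  return result($c)"""
    c = copy.deepcopy(source)    # copy clause: new nodes with new identities
    modify(c)                    # modify clause: updates touch only the copy
    return result(c)             # return clause: ordinary expression over the copy


# Hypothetical sample data; the real structure of eb-bib.xml is not shown here.
ref = ET.fromstring('<ref><cite type="ibid"/><author>W.L. Morton</author></ref>')


def expand_author(c):
    c.find("author").text = "William Lewis Morton"


out = transform(ref, expand_author, lambda c: c)
```

After the call, out carries the expanded author name while ref still contains "W.L. Morton", mirroring the statement that transformations do not update persistent data.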
o Evolution from Extensible Stylesheet Language (XSL) for producing HTML
o Functional language, converting source tree into result tree
o "stylesheet" = set of templates
o Push-pull model based on matching patterns using XPath
<xsl:template match="class/student">
  <xsl:apply-templates/>
  <newNode>
    <xsl:value-of select="instructor/firstName"/>
  </newNode>
</xsl:template>
o Types of templates:
<xsl:template match=pattern name=qname priority=number mode=qname>
  ... possibly including call/apply with other templates ...
</xsl:template>
<xsl:apply-templates select=sequence-expression mode=qname>
  ... provide sorting criteria or parameters if applicable ...
</xsl:apply-templates>
<xsl:call-template name=qname>
  ... provide template parameters if applicable ...
</xsl:call-template>
<xsl:value-of select=sequence-expression />
<xsl:for-each select=sequence-expression>
  ... provide sorting criteria if applicable ...
</xsl:for-each>
<xsl:if test=expression>
  ...
</xsl:if>
<xsl:choose>
  xsl:when +
  xsl:otherwise ?
</xsl:choose>
<xsl:when test=expression>
  ...
</xsl:when>
<xsl:otherwise>
  ...
</xsl:otherwise>
Default templates:
<xsl:template match="*|/" mode="#all">
  <xsl:apply-templates mode="#current"/>
</xsl:template>
<xsl:template match="text()|@*" mode="#all">
  <xsl:value-of select="."/>
</xsl:template>
<xsl:template match="processing-instruction()|comment()" mode="#all"/>
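The net effect of the default templates, when no other template matches, is to output the text of the document and nothing else. The following Python sketch imitates that behaviour with the standard library's ElementTree; apply_default and the sample input are hypothetical, and the default ElementTree parser is relied on to discard comments and processing instructions.

```python
import xml.etree.ElementTree as ET


def apply_default(node):
    """Imitates the default rules: an element recurses into its children,
    text is copied to the output, attributes are never visited."""
    out = [node.text or ""]                 # text before the first child
    for child in node:
        out.append(apply_default(child))    # element rule: apply-templates
        out.append(child.tail or "")        # text following the child
    return "".join(out)


# Hypothetical input using the element names from the earlier template example.
doc = ET.fromstring(
    '<class><student id="s1">Ann'
    '<instructor><firstName>Bo</firstName></instructor>'
    '</student><!-- note --></class>'
)
text_only = apply_default(doc)
```

For this input text_only is "AnnBo": the id attribute and the comment contribute nothing, because attributes are only reached when a template selects them explicitly.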
o Example (adapted from http://www.topxml.com/xsltstylesheets/)
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <customers>
      <xsl:apply-templates select="/customers" />
    </customers>
  </xsl:template>
  <xsl:template match="customers">
    <xsl:apply-templates />
  </xsl:template>
  <xsl:template match="customer">
    <customer>
      <CompanyName>
        <xsl:value-of select="@CompanyName" />
      </CompanyName>
      <CustomerID>
        <xsl:value-of select="@CustomerID" />
      </CustomerID>
      <Country>
        <xsl:value-of select="@Country" />
      </Country>
    </customer>
    <xsl:apply-templates />
  </xsl:template>
</xsl:stylesheet>
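For comparison, the attribute-to-element conversion performed by the stylesheet above can be approximated in a few lines of Python with ElementTree. This is a rough equivalent under stated assumptions, not a replacement for an XSLT processor: attributes_to_elements is a hypothetical name, the sample attribute values are invented, and any text content in the source (which the default templates would copy to the output) is ignored.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(source):
    """Build a <customers> result tree in which each customer's
    CompanyName, CustomerID, and Country attributes become child elements."""
    out = ET.Element("customers")
    for cust in source.iter("customer"):          # customers in document order
        new = ET.SubElement(out, "customer")
        for name in ("CompanyName", "CustomerID", "Country"):
            # a missing attribute yields an empty element, as xsl:value-of would
            ET.SubElement(new, name).text = cust.get(name, "")
    return out


# Hypothetical input document.
src = ET.fromstring(
    '<customers>'
    '<customer CompanyName="Acme" CustomerID="A1" Country="Canada"/>'
    '</customers>'
)
result_tree = attributes_to_elements(src)
```

The stylesheet expresses the same mapping declaratively: each template states what to emit for a matching node, and the processor decides the traversal, whereas the Python version spells out the loop.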
Querying XML, Chapters 7.1, 7.2, 13.3