A quick intro to XQuery

XQuery is a highly malleable amalgam of an SQL-like syntax grafted on to XPath.

At its simplest, an XQuery expression that will likely look familiar is this:

    //book

This, of course, is standard XPath and is a complete, valid, self-contained XQuery query. It says to return a list of all book elements existing in whatever XML environment the particular query engine is set up to query against. Each query engine vendor determines the scope of the universe it knows about, which might be a single document or several thousand. In the case of XQEngine, the environment is the sum total of all documents that have been previously indexed using XmlEngine.setDocument().
If the above example is querying against the small sample bibliography cited in the W3C's XML Query Use Cases document, we can recast it to point explicitly at that document using XQuery's built-in document() function:

    document("bib.xml")//book

This query returns exactly the same results as the first.
How the results get returned is again vendor-specific, but one possibility (the likeliest?) is that you'll be getting back a traditional, serialized XML text stream. You might get back DOM nodes, if you've requested that and the feature's supported by the engine you're using. If we assume for the moment we're getting back serialized XML, the above queries pull back:

    <book year="1994">
      <title>TCP/IP Illustrated</title>
      <author><last>Stevens</last><first>W.</first></author>
      ...
    <book year="1992">
      <title>Advanced Programming in the Unix environment</title>
      ...
    <book year="2000">
      <title>Data on the Web</title>
      ...
    etc.

We're returning a serialized nodeset. Whether that nodeset gets wrapped by a single enclosing element to form the root of a well-formed document tree -- it doesn't in this example -- is also up to the particular implementation. You might get back a series of unwrapped XML fragments as in the above -- and depending on the query, possibly something even less XML-ish than that. (The query "1" is a perfectly valid XQuery query, although it doesn't return anything that looks at all like XML.)
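To make the "serialized, unwrapped nodeset" idea concrete, here is a minimal sketch in Python using the standard library's ElementTree. It is not XQEngine and not XQuery; it only mimics what //book does against a cut-down stand-in for bib.xml that contains just the elements quoted in this article. The names BIB_XML, query_books and serialize are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down stand-in for bib.xml: only elements quoted in the article.
BIB_XML = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title></book>
  <book year="2000"><title>Data on the Web</title></book>
  <book year="1999"><editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>
"""


def query_books(xml_text):
    """Rough analogue of //book: every book element below the root, in document order."""
    root = ET.fromstring(xml_text)
    return root.findall(".//book")


def serialize(nodes):
    """Serialize each node separately; no enclosing wrapper element is added."""
    return [ET.tostring(node, encoding="unicode").strip() for node in nodes]


if __name__ == "__main__":
    for fragment in serialize(query_books(BIB_XML)):
        print(fragment)
```

Each printed fragment is well-formed on its own, but the output as a whole has no single root element, which is the situation described above.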
XPath expressions in XQuery look almost identical to those defined in the November 1999 W3C XML Path Language (XPath) specification and employed gainfully in several different domains within the W3C world, most noticeably in XSLT. XQuery extends XPath's native syntax in several minor ways, most noticeably to introduce a FLWR expression, built from the keywords FOR, LET, WHERE, and RETURN. (These keywords, by the way, are case-insensitive. I generally uppercase them so that they stand out to better delineate a query's structure, but that's strictly a matter of personal preference.)
The productions in the XQuery grammar that formally define a FLWR expression are:

    FlwrExpr     ::= (ForClause | LetClause)+ WhereClause? ReturnClause
    ForClause    ::= 'FOR' Variable 'IN' Expr (',' Variable 'IN' Expr)*
    LetClause    ::= 'LET' Variable ':=' Expr (',' Variable ':=' Expr)*
    WhereClause  ::= 'WHERE' Expr
    ReturnClause ::= 'RETURN' Expr

This, at least, is the formal definition in the original XQuery Working Draft of February 2001. This might change, although it's unlikely to change greatly.
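Reading this fragment of the grammar shouldn't cause any great difficulties if you're accustomed to the mysteries of DTDs. It says, in more colloquial terms, "A FLWR expression is made up of one or more FOR or LET clauses, followed by an optional WHERE clause, followed by a RETURN clause."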
Because the production labelled
This makes the FLWR production very malleable, highly recursive, and capable of generating a large number of possible query instances, including just about any combination of
Before we do, let's look first at the individual clauses, starting with LET:

    LET $books := //book
    RETURN $books

This expression says: "Evaluate the XPath expression //book, assign the resulting nodeset to the newly created variable $books, and then return that entire list to the user." The earlier XPath form is obviously more compact than this; there are other reasons, discussed below, for using the more verbose XQuery variant.
Indentation, like casing on keywords, is a matter of personal style.
We've just introduced the concept of a variable, an identifier starting with a dollar sign ('$'). In this case the variable $books holds a nodeset. Although it doesn't make much sense in terms of good programming practice, for the sake of showing equivalence, we could just as easily rewrite the above expression as

    LET $books := //book
    RETURN //book

In this case we've assigned the nodeset resulting from evaluating the XPath expression //book to the nodeset variable $books and then promptly disregarded the assignment, re-evaluating the original expression all over again and returning the resulting nodeset. Now that you know you can do that, don't.
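The difference between the two LET variants can be sketched in Python with the standard library's ElementTree. This is only an analogy for the semantics described above, not how any XQuery engine is implemented; BIB_XML and both function names are hypothetical, and the sample is reduced to four empty book elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in document: four books, identified only by year.
BIB_XML = (
    '<bib><book year="1994"/><book year="1992"/>'
    '<book year="2000"/><book year="1999"/></bib>'
)


def let_then_return_variable(root):
    # LET $books := //book   (evaluate the path once, bind the whole nodeset)
    books = root.findall(".//book")
    # RETURN $books          (hand back the bound nodeset unchanged)
    return books


def let_then_reevaluate(root):
    # LET $books := //book   (bound, then never used)
    books = root.findall(".//book")
    # RETURN //book          (the same path is evaluated a second time)
    return root.findall(".//book")


if __name__ == "__main__":
    root = ET.fromstring(BIB_XML)
    print([b.get("year") for b in let_then_return_variable(root)])
    print([b.get("year") for b in let_then_reevaluate(root)])
```

Both functions hand back the same books in the same order; the second simply does the path evaluation twice, which is why the article advises against it.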
FOR expressions

We can change our LET statement to a FOR:
    FOR $book IN //book
    RETURN $book

This is yet another variation on the same theme; we get back the same exact nodeset of books. How does this work? And why? The
Subsequent clauses are then executed once for each iteration, typically doing something useful with each new
The semantics of the
During each iteration, each successive node in the original nodeset is isolated in turn and assigned to the variable Once the
The result set that's constructed in the following query

    FOR $book IN //book
    RETURN $book/editor

is going to contain only one node: only one book out of the four has an editor, so that's the only node that will be returned. Here are the relevant parts of the file being queried against for reference:

    <book year="1994">
      <author><last>Stevens</last><first>W.</first></author>
      ...
    </book>
    <book year="1992">
      <author><last>Stevens</last><first>W.</first></author>
      ...
    </book>
    <book year="2000">
      <author><last>Abiteboul</last><first>Serge</first></author>
      <author><last>Buneman</last><first>Peter</first></author>
      <author><last>Suciu</last><first>Dan</first></author>
      ...
    </book>
    <book year="1999">
      <editor><last>Gerbarg</last><first>Darcy</first></editor>
      ...
    </book>

In the above query, $book will be assigned to four times, once for each <book> in the nodeset that resulted from evaluating //book, but out of all four subsequent evaluations of the expression $book/editor in the RETURN statement, only one is going to produce a non-null, single-node result. That single node ends up comprising the entire nodeset that gets constructed and eventually passed back up and out of the RETURN. The end result of all that work (and explanation!) is:
<editor><last>Gerbarg</last><first>Darcy</first></editor>
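The iteration just described can be imitated in a few lines of Python with the standard library's ElementTree. This is a sketch of the semantics only, under the assumption that each RETURN evaluation contributes zero or more nodes to one flat result; the sample document keeps just the elements quoted above, and BIB_XML and for_book_return_editor are names invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the bibliography, reduced to the elements quoted above.
BIB_XML = """
<bib>
  <book year="1994"><author><last>Stevens</last><first>W.</first></author></book>
  <book year="1992"><author><last>Stevens</last><first>W.</first></author></book>
  <book year="2000">
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </book>
  <book year="1999"><editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>
"""


def for_book_return_editor(xml_text):
    """Mimic: FOR $book IN //book RETURN $book/editor."""
    root = ET.fromstring(xml_text)
    iterations = 0
    result = []
    for book in root.findall(".//book"):       # FOR $book IN //book
        iterations += 1
        result.extend(book.findall("editor"))  # RETURN $book/editor (may be empty)
    return iterations, result


if __name__ == "__main__":
    iterations, editors = for_book_return_editor(BIB_XML)
    print(iterations, "iterations,", len(editors), "node(s) returned")
    for editor in editors:
        print(ET.tostring(editor, encoding="unicode").strip())
```

The loop body runs once per book, but three of the four evaluations add nothing, so the flat result holds a single editor element.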
We can also use more than one FOR:

    FOR $book IN //book
    FOR $author IN $book/author
    RETURN $author/last

The RETURN statement ends up getting executed five times. Let's count. $book gets assigned to four times in the top-level FOR. For each assignment, the variable $author in the second FOR gets assigned to a number of times which depends on how many authors each book contains. Enumerating, $author gets assigned, respectively, one <author> node, one <author> node, three <author> nodes, and no <author> nodes. Concatenate these all together in the RETURN statement and you get five author last names.
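The count of five can be checked with a small Python sketch that treats the two FOR clauses as nested loops. As before, this uses the standard library's ElementTree as a stand-in and only models the behaviour described above; BIB_XML and author_last_names are hypothetical names, and the sample document contains just the elements the article quotes.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the bibliography, reduced to the elements quoted above.
BIB_XML = """
<bib>
  <book year="1994"><author><last>Stevens</last><first>W.</first></author></book>
  <book year="1992"><author><last>Stevens</last><first>W.</first></author></book>
  <book year="2000">
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </book>
  <book year="1999"><editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>
"""


def author_last_names(xml_text):
    """Mimic: FOR $book IN //book FOR $author IN $book/author RETURN $author/last."""
    root = ET.fromstring(xml_text)
    return_count = 0
    names = []
    for book in root.findall(".//book"):        # outer FOR: four books
        for author in book.findall("author"):   # inner FOR: 1, 1, 3, 0 authors
            return_count += 1                   # one RETURN evaluation per author
            names.extend(last.text for last in author.findall("last"))
    return return_count, names


if __name__ == "__main__":
    count, names = author_last_names(BIB_XML)
    print(count, names)
```

The book with only an editor contributes no iterations of the inner loop, which is why the total is five rather than six.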
Again exploring a minor pedagogic variation, we can introduce an intermediate variable using a LET:

    FOR $book IN //book
    LET $authors := $book/author
    FOR $author IN $authors
    RETURN $author/last

This doesn't buy us much in this case, but in some circumstances it might help simplify the code.

Predicates and WHERE's

Coming soon to retail outlets everywhere.

Decorating the result-set

Ditto.