Introduction to XML for Database Developers - XPath
The full XPointer syntax is built on the W3C XPath recommendation. XPath was originally built to be used by XPointer and XSLT (a language for transforming XML documents into other XML documents), but it has found application in other standards and technologies. You will see in the next chapter how it is used by OpenXML() in SQL Server 2000, but first you need to examine its syntax.
Location steps are constructs used to select nodes in an XML document. They have the following syntax:
axis::node_test[predicate]
The location step points to the location of other nodes from the position of the current node. If a current node is not specified in any way, the location step is based on the root element.
Axes break up the XML document in relation to the current node. You can think of them as a first filter that you apply to an XML document to point to target nodes. Possible axes are listed in Table 13-3.
Axes | Description |
parent | The parent of the current node. |
ancestor | All ancestors of the current node (parent, grandparent, and so on), up to and including the root. |
child | All children of the current node (first generation). |
descendant | All descendants (children, grandchildren, and so forth) of the current node. |
self | The current node only. |
descendant-or-self | All descendant nodes and the current node. |
ancestor-or-self | All ancestor nodes and the current node. |
attribute | All attributes of the current node. |
namespace | All namespace nodes of the current node. |
following | All nodes after the current node in the XML document. The set does not include attribute nodes, namespace nodes, or descendants of the current node. |
preceding | All nodes before the current node in the XML document. The set does not include attribute nodes, namespace nodes, or ancestors of the current node. |
following-sibling | All siblings (children of the same parent) after the current node in the XML document. |
preceding-sibling | All siblings (children of the same parent) before the current node in the XML document. |
Table 13-3. Axes in XPath
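To make the table more concrete, here is a minimal sketch of what several of the axes select, written against Python's standard xml.dom.minidom module. It is not an XPath engine; the helper functions and the sample document are hypothetical, and the attribute and namespace axes are left out.

```python
from xml.dom import minidom

# Hypothetical sample document; the element names are illustrative only.
XML = ("<Inventory><Equipment><Make>A</Make><Model>B</Model>"
       "<Year>C</Year></Equipment></Inventory>")

def parent(node):
    """parent:: the parent of the current node, if any."""
    return [node.parentNode] if node.parentNode is not None else []

def ancestor(node):
    """ancestor:: parent, grandparent, and so on, nearest first."""
    found = []
    current = node.parentNode
    while current is not None:
        found.append(current)
        current = current.parentNode
    return found

def child(node):
    """child:: first-generation children only."""
    return list(node.childNodes)

def descendant(node):
    """descendant:: children, grandchildren, and so forth, in document order."""
    found = []
    for kid in node.childNodes:
        found.append(kid)
        found.extend(descendant(kid))
    return found

def following_sibling(node):
    """following-sibling:: later children of the same parent."""
    found = []
    current = node.nextSibling
    while current is not None:
        found.append(current)
        current = current.nextSibling
    return found

def preceding_sibling(node):
    """preceding-sibling:: earlier children of the same parent, nearest first."""
    found = []
    current = node.previousSibling
    while current is not None:
        found.append(current)
        current = current.previousSibling
    return found

def names(nodes):
    return [n.nodeName for n in nodes]

doc = minidom.parseString(XML)
model = doc.getElementsByTagName("Model")[0]  # the current node

print(names(parent(model)))             # ['Equipment']
print(names(ancestor(model)))           # ['Equipment', 'Inventory', '#document']
print(names(child(model)))              # ['#text']
print(names(following_sibling(model)))  # ['Year']
print(names(preceding_sibling(model)))  # ['Make']
```

The self, descendant-or-self, and ancestor-or-self axes follow by adding the current node to the corresponding list.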
The node test is a second filter that you can apply on nodes specified by axes. Table 13-4 lists all node tests that can be applied.
A predicate is a filter in the form of a Boolean expression that is evaluated for each node in the set obtained after applying the axis and node test filters. Developers have a rich set of functions (string, node set, Boolean, and number), comparison operators (=, !=, <=, >=, <, >), Boolean operators (and, or), and arithmetic operators (+, -, *, div, mod). Note that XPath is case sensitive, so the Boolean operators must be written in lowercase. The list is very long (especially the list of functions), and I will not go into detail here. I will just mention the most common function, position(). It returns the position of the node within the set being filtered, counting from 1.
Let’s now review how all segments of the location step function together:
child::Equipment[position()<=10]
This location step first points to child nodes of the current node (root if none is selected). Of all child nodes, only elements named Equipment are left in the set. Finally, each of those nodes is evaluated by position and only the first 10 are selected.
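The three filters can be spelled out by hand. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module; the location_step function and the generated sample inventory are assumptions made for illustration, not part of any XPath library.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory: 12 Equipment elements plus one unrelated child.
xml = ("<Inventory>"
       + "".join(f'<Equipment EquipmentId="{i}"/>' for i in range(1, 13))
       + "<Note/></Inventory>")
current = ET.fromstring(xml)

def location_step(current_node, name, max_position):
    """Mimic child::name[position() <= max_position] step by step."""
    # Axis: child:: gives the direct children of the current node.
    on_axis = list(current_node)
    # Node test: keep only the elements with the requested name.
    named = [node for node in on_axis if node.tag == name]
    # Predicate: position() <= max_position; XPath positions start at 1.
    return [node for position, node in enumerate(named, start=1)
            if position <= max_position]

selected = location_step(current, "Equipment", 10)
print(len(selected))                     # 10
print(selected[0].get("EquipmentId"))    # 1
print(selected[-1].get("EquipmentId"))   # 10
```

Note that the position is counted within the set left after the node test, so the Note element does not affect the numbering.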
Very often, you will try to navigate from node to node through the XML document. You can chain location steps using the forward slash (/). The same character is often used at the beginning of the expression to set the current node to the root of the document.
In the following example, the parser is pointed to the Inventory.xml file, then to its root element, and then to the first child called Equipment, and finally to the first Model node among its children:
Inventory.xml#/child::Equipment[position() = 1]/child::Model[position() = 1]
It all works in a very similar fashion to the notation of files and folders, and naturally you can write them all together:
http://www.trigonblue.com/xml/Inventory.xml#/child::Equipment[position() = 1]/child::Model[position() = 1]
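As a rough illustration of how such a pointer can be taken apart, the sketch below splits the address at the # character and then follows the location steps one at a time. It is written in Python with the standard library only and handles just the child::Name[position() = N] form used above; the walk function, the STEP pattern, and the inline stand-in for Inventory.xml are all hypothetical. Following the description in the text, the walk starts at the root element of the document, and nothing is downloaded.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the content of Inventory.xml.
INVENTORY = """<Inventory>
  <Equipment><Model>M-1</Model><Model>M-2</Model></Equipment>
  <Equipment><Model>M-3</Model></Equipment>
</Inventory>"""

# Matches one step of the form child::Name[position() = N].
STEP = re.compile(r"child::(\w+)\[position\(\) = (\d+)\]")

def walk(root, path):
    """Follow a chain of child::Name[position() = N] steps from root."""
    node = root
    for step in path.strip("/").split("/"):
        match = STEP.fullmatch(step)
        if match is None:
            raise ValueError(f"unsupported location step: {step}")
        name, position = match.group(1), int(match.group(2))
        # Axis and node test: children of the current node with this name.
        candidates = [kid for kid in node if kid.tag == name]
        # Predicate: positions are 1-based; no match means an empty result.
        if not 1 <= position <= len(candidates):
            return None
        node = candidates[position - 1]
    return node

url = ("http://www.trigonblue.com/xml/Inventory.xml"
       "#/child::Equipment[position() = 1]/child::Model[position() = 1]")
document_url, _, pointer = url.partition("#")

root = ET.fromstring(INVENTORY)
print(document_url)              # http://www.trigonblue.com/xml/Inventory.xml
print(walk(root, pointer).text)  # M-1
```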
Node Test | Description |
element name | Selects only the node(s) with the specified name in the set specified by the axis. |
* | All element nodes in the set specified by the axis (on the attribute and namespace axes, all attribute or namespace nodes). |
node() | All nodes in the set specified by the axis, regardless of type. |
comment() | All comment nodes in the set specified by the axis. |
text() | All text nodes in the set specified by the axis. |
processing-instruction() | All processing instruction nodes in the set specified by the axis (if a name is specified in the parentheses, the parser will match only processing instructions with that name). |
Table 13-4. Node Tests in XPath
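The node tests map naturally onto DOM node types. The sketch below, in Python with the standard xml.dom module, applies each test to the children of a small sample element. The node_test function and the sample markup are hypothetical, and the sketch assumes the child axis, where * matches element nodes only.

```python
from xml.dom import minidom, Node

# Hypothetical element with one child of each common node type.
XML = ("<Equipment><!-- note --><Make>Toshiba</Make>plain text"
       "<?render fast?><Model>Portege</Model></Equipment>")

def node_test(nodes, test):
    """Filter a list of DOM nodes the way an XPath node test would."""
    if test == "node()":
        return list(nodes)
    if test == "*":
        return [n for n in nodes if n.nodeType == Node.ELEMENT_NODE]
    if test == "comment()":
        return [n for n in nodes if n.nodeType == Node.COMMENT_NODE]
    if test == "text()":
        return [n for n in nodes if n.nodeType == Node.TEXT_NODE]
    if test == "processing-instruction()":
        return [n for n in nodes
                if n.nodeType == Node.PROCESSING_INSTRUCTION_NODE]
    # Anything else is treated as an element name.
    return [n for n in nodes
            if n.nodeType == Node.ELEMENT_NODE and n.nodeName == test]

doc = minidom.parseString(XML)
children = list(doc.documentElement.childNodes)  # the child axis

print(len(node_test(children, "node()")))                  # 5
print([n.nodeName for n in node_test(children, "*")])      # ['Make', 'Model']
print([n.nodeName for n in node_test(children, "Model")])  # ['Model']
print(node_test(children, "text()")[0].data)               # plain text
print(node_test(children, "processing-instruction()")[0].target)  # render
```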
XPath constructs are very flexible, but also very complex and laborious to write. To reduce the effort, a number of abbreviations are defined. position() = X can be replaced by X (it is enough to type just the number). Thus, an earlier example can be written as
Inventory.xml#/child::Equipment[1]/child::Model[1]
If an axis is not defined, the parser assumes that the child axis was specified. Thus, the preceding example could be written as
Inventory.xml#/Equipment[1]/Model[1]
The attribute:: axis can be abbreviated as @. Therefore, the following two expressions are equivalent:
Inventory.xml#/child::Equipment[1]/attribute::EquipmentId
Inventory.xml#/child::Equipment[1]/@EquipmentId
The current node can be specified using either self::node() or a dot (.). The following two expressions are equivalent:
Order.xml#/self::node()/OrderDate
Order.xml#/./OrderDate
A parent node can be specified either by parent::node() or two dots (..). The following two expressions are equivalent:
parent::node()/Order
../Order
The step /descendant-or-self::node()/ selects the current node and all of its descendants. It can be abbreviated as //. The following two examples select all EquipmentId attributes in the document:
Inventory.xml#/descendant-or-self::node()/@EquipmentId
Inventory.xml#//@EquipmentId
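Several of these abbreviations can be tried out with Python's standard xml.etree.ElementTree module, which understands a limited subset of the abbreviated syntax (it does not accept the unabbreviated axis:: form). The sample document below is hypothetical. ElementTree evaluates paths relative to the element they are called on, so the leading slash is dropped, and because it cannot return attribute nodes, the last example selects the elements that carry the attribute and then reads its value.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the content of Inventory.xml.
INVENTORY = """<Inventory>
  <Equipment EquipmentId="1"><Model>M-1</Model><Model>M-2</Model></Equipment>
  <Equipment EquipmentId="2"><Model>M-3</Model></Equipment>
</Inventory>"""
root = ET.fromstring(INVENTORY)

# The child axis is implied and [1] stands for position() = 1.
first_model = root.find("Equipment[1]/Model[1]")
print(first_model.text)                    # M-1

# A dot is the current node, so this selects the same element.
same_model = root.find("./Equipment[1]/Model[1]")
print(same_model is first_model)           # True

# Two dots step back up to the parent node.
owner = root.find("Equipment[1]/Model[1]/..")
print(owner.get("EquipmentId"))            # 1

# // searches the current node and all of its descendants.
tagged = root.findall(".//*[@EquipmentId]")
print([element.get("EquipmentId") for element in tagged])  # ['1', '2']
```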
Transforming XML
In many cases in business, information that is already in the form of an XML document needs to be converted to another XML structure. For example, a client of mine is participating in RosettaNet, an e-commerce consortium of IT supply chain organizations that defines standard messages to be sent between partners. Although messages are standardized, each pair of partners can agree to modify their messages slightly to better serve their needs. Such changes are mostly structural: new nodes (fields) can be defined, standard ones can be dropped, a node can change its type from element to attribute, and so on. Instead of generating completely different messages each time (and developing two separate procedures for performing similar tasks), it is preferable to create a simple procedure that will transform a standard XML message into another form.
Another typical situation occurs when an application uses a browser to display an XML document. Although modern browsers such as the latest versions of Internet Explorer are able to display the content of an XML document in the form of a hierarchical tree, this format is not user-friendly. More often, the XML document is transformed into an HTML document and information is organized visually into tables and frames. Such HTML applications usually allow the end user to modify the displayed information interactively (for example, to sort the content of the tables, to display different information in linked tables, or to present data in different formats). Each of these tasks could be performed by modifying the original XML document.
A typical problem with HTML browsers from different vendors is that they are not compatible. Naturally (well, actually, it seems quite unnatural), even different versions of the same browser behave differently. Each of them uses a different variation of the HTML standard. However, these differences are not major, and instead of generating a separate XML document for each of them, you can create a procedure to transform the XML document so that it fits the requirements of the browser currently in use.
You can think of HTML as just one type of rendering language. Some systems use other types of rendering languages and appropriate browsers. For example, more and more PDAs and wireless devices such as cellular phones are offering Internet access. They often use a special protocol (Wireless Application Protocol, or WAP) that has its own markup language (Wireless Markup Language, or WML) based on XML. A web server offering information should be able to transform the XML document to fulfill the needs of different viewers.
This article was excerpted from chapter 13 of SQL Server 2000 Stored Procedure & XML Programming, second edition, written by Dejan Sunderic (McGraw-Hill/Osborne, 2004; ISBN: 0072228962).