Working with XPath: The .NET Way - What is inside XPath?
An XML document is a tree of related, structured nodes of textual information. XPath is a language for picking nodes and sets of nodes out of this tree. From the perspective of XPath, there are seven kinds of nodes:
- The root node
- Element nodes
- Text nodes
- Attribute nodes
- Comment nodes
- Processing instruction nodes
- Namespace nodes
None of these are new terms to a developer who knows XML. In everyday usage, “root” refers to the topmost element of the XML document, and the rest of the document is made up of elements nested beneath it. Each element carries its information as text, as attributes, or as child elements, and comments are allowed as well. XPath uses the same terms, but gives some of them a slightly different meaning.
The XPath data model has several features that are not obvious. First, the tree's root node is not the same as its root element. The tree's root node contains the entire document, including the root element and comments and processing instructions that occur before the root element start tag or after the root element end tag. The XPath data model does not include everything in the document. In particular, the XML declaration and DTD are not addressable via XPath. However, if the DTD provides default values for any attributes, then XPath recognizes those attributes.
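The difference between the root node and the root element can be sketched with a small example. This is only an illustration under stated assumptions: it uses Python's standard `xml.dom.minidom`, whose DOM document node is the closest standard-library analog of XPath's root node (it is not an XPath engine), and the sample document and the `app-hint` processing instruction are invented for the demonstration.

```python
from xml.dom import minidom

# Hypothetical sample document: a comment and a processing instruction
# sit before the root element, and another comment sits after it.
XML = (
    '<?xml version="1.0"?>'
    '<!-- before the root element -->'
    '<?app-hint cache="no"?>'
    '<catalog><book/></catalog>'
    '<!-- after the root element -->'
)

doc = minidom.parseString(XML)

# The document node plays the role of XPath's root node: it holds the
# root element plus the comments and processing instruction around it.
top_level = [node.nodeName for node in doc.childNodes]

# The root element is only one of the root node's children.
root_element = doc.documentElement

print(top_level)              # ['#comment', 'app-hint', 'catalog', '#comment']
print(root_element.nodeName)  # catalog
```

Note that the XML declaration does not show up among the children, which matches the point above that it is not addressable.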
Finally, “xmlns” and “xmlns:prefix” attributes are reported as namespace nodes. They are not considered attribute nodes, though a non-namespace-aware parser will see them as such. Furthermore, these nodes are attached to every element for which that declaration has scope, not just to the single element where the namespace is declared.
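The way a namespace-aware processor treats namespace declarations can be seen in a short sketch. It assumes Python's standard `xml.etree.ElementTree`, which is namespace-aware but does not expose XPath namespace nodes as such; it simply resolves prefixes into qualified names. The `urn:example:books` namespace and the element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the namespace is declared once, on <shelf>,
# and used by a descendant element and one of its attributes.
XML = (
    '<shelf xmlns:bk="urn:example:books" id="s1">'
    '<bk:book bk:isbn="123"/>'
    '</shelf>'
)

shelf = ET.fromstring(XML)
book = shelf[0]

# The xmlns:bk declaration is not reported as an attribute of <shelf>.
shelf_attributes = dict(shelf.attrib)

# The declaration is in scope for the descendant, so its prefixed
# names are resolved against the namespace URI.
book_tag = book.tag
book_attributes = dict(book.attrib)

print(shelf_attributes)  # {'id': 's1'}
print(book_tag)          # {urn:example:books}book
print(book_attributes)   # {'{urn:example:books}isbn': '123'}
```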
XPath uses path expressions to select nodes or node-sets in an XML document. The simplest expression (or location path) is the one that selects the document's root node: a single forward slash, /. Much of XPath's syntax was deliberately chosen to resemble the paths used by the Unix shell: there, / is the root of the filesystem; here, / is the root node of an XML document. Longer path expressions look very much like the paths of a traditional computer file system, with steps separated by slashes.
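A few path expressions give a feel for the file-system flavor of the syntax. The sketch below is an assumption-laden illustration: it uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and evaluates paths relative to an element (so the paths start with `.` rather than the absolute `/`), and the catalog document is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """
<catalog>
  <book id="b1"><title>XML Basics</title></book>
  <book id="b2"><title>XPath in Practice</title></book>
  <magazine><title>Markup Monthly</title></magazine>
</catalog>
"""

catalog = ET.fromstring(XML)

# Step by step, like directories: title children of book children.
book_titles = [t.text for t in catalog.findall("./book/title")]

# '//' selects descendants at any depth.
all_titles = [t.text for t in catalog.findall(".//title")]

# A predicate in square brackets filters on an attribute value.
second_title = catalog.find("./book[@id='b2']/title").text

print(book_titles)   # ['XML Basics', 'XPath in Practice']
print(all_titles)    # ['XML Basics', 'XPath in Practice', 'Markup Monthly']
print(second_title)  # XPath in Practice
```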
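XPath also includes built-in functions. XPath 2.0 defines over 100 of them, covering string values, numeric values, date and time comparison, node and QName manipulation, sequence manipulation, Boolean values, and more. XPath 1.0, the version implemented by the System.Xml.XPath classes in .NET, has a much smaller core library of node-set, string, Boolean, and number functions.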
More By Jagadish Chaterjee