Introduction to XML Document Object Model

Learn about XML and the hierarchical structure of the Document Object Model in this down-and-dirty piece. Nodes, NodeLists and NamedNodeMaps, as well as properties such as parentNode, childNodes, nodeName and nodeValue, are explored and explained, and code is given.

February 09, 2004

The Document Object Model (DOM) is an API for HTML and XML documents. It defines the logical structure of documents and the way they can be accessed and manipulated. The DOM gains its importance from being a standard programming interface for working with the XML structure. A simple illustration will help us understand an XML document and how the DOM can be used with it.

Example 1

<BookAuthors>
  <Author>
    <au_id>1001</au_id>
    <au_lname>Gates</au_lname>
    <au_fname>Bill</au_fname>
  </Author>
  <Author>
    <au_id>1002</au_id>
    <au_lname>Potter</au_lname>
    <au_fname>Harry</au_fname>
  </Author>
</BookAuthors>

If you take a closer look, you will see that XML documents are always hierarchical in nature: they always have a top-level or root element, and then child elements beneath it. So the above document could be represented as:

[Figure: XML DOM]

The tree would have been deeper if there were more levels of children. In DOM terms these elements are also called nodes. A node simply represents a generic item in this tree-type structure.
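As a rough illustration of this tree, the sketch below parses the example document and lists the root element and its children. It uses Python's standard xml.dom.minidom module rather than the Microsoft parser used later in this article. That choice, the helper name element_children, and the compact XML string (written without whitespace between tags, so that no whitespace-only text nodes appear) are assumptions made for the example.

```python
from xml.dom import minidom

# The example document, written without whitespace between tags so the
# tree contains only the element and text nodes discussed in the article.
BOOK_AUTHORS_XML = (
    "<BookAuthors>"
    "<Author><au_id>1001</au_id><au_lname>Gates</au_lname>"
    "<au_fname>Bill</au_fname></Author>"
    "<Author><au_id>1002</au_id><au_lname>Potter</au_lname>"
    "<au_fname>Harry</au_fname></Author>"
    "</BookAuthors>"
)


def element_children(node):
    """Return the names of a node's element children (hypothetical helper)."""
    return [child.nodeName for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE]


document = minidom.parseString(BOOK_AUTHORS_XML)
root = document.documentElement

print(root.nodeName)                      # name of the root element
print(element_children(root))             # the root's child elements
print(element_children(root.firstChild))  # children of the first Author
```

Each level of the printed result corresponds to one level of the tree: the root, its Author children, and the fields of one Author.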

Covering Our Base

Base Objects

In order to represent the hierarchical nature of XML, the DOM provides a whole set of objects, methods and properties that allow us to manipulate the DOM. We will not be able to cover them all in this tutorial, but we’ll cover a few to give you the essence of the sort of things you can achieve. First and foremost let’s see the DOM objects:

Object         Description
Node           A single node in the hierarchy.
NodeList       A collection of nodes.
NamedNodeMap   A collection of nodes allowing access by name as well as by index.

The DOM has a large number of properties that allow us to traverse from node to node. The following list gives a few of them. We will sample these DOM objects and properties later.

Property          Description
parentNode        Returns the parent of the current node.
childNodes        Returns a NodeList containing the children of the node.
firstChild        Returns the first child of the current node.
lastChild         Returns the last child of the current node.
previousSibling   Returns the previous sibling, i.e. the previous node at the same level of the hierarchy.
nextSibling       Returns the next sibling, i.e. the next node at the same level of the hierarchy.
nodeName          Returns the name of the node.
nodeValue         Returns the value of the node.

To get the full list, check out MSDN online XML area at msdn.microsoft.com/xml/.

DOM Simplified

Now let us look at the node structure of our XML document in a little more detail. For ease of explanation we will examine only one side of the document structure. Everything said here applies to the other side as well.

[Figure 1: XML DOM]

Here in Figure 1, you can clearly see how these properties are used to navigate around the XML DOM. The lines indicate which nodes the properties point to. The children of the root node, BookAuthors, are held in its childNodes collection. In the figure BookAuthors has only one child, so both its firstChild and its lastChild properties point to the same node. That node is also childNodes(0), since it is the only node in the collection.

The Author node, however, has three children, held in its childNodes collection. Its firstChild property points to au_id, which is the same as childNodes(0), and its lastChild property points to the au_fname node. The previousSibling and nextSibling properties point to the neighbouring nodes at the same level.

So let us assume we have a node named baRoot pointing to BookAuthors. The following table helps demonstrate the parent-child hierarchy.

Code                                                      Points To
baRoot.childNodes(0)                                      Author
baRoot.childNodes(0).firstChild                           au_id
baRoot.childNodes(0).firstChild.nextSibling               au_lname
baRoot.childNodes(0).firstChild.parentNode                Author
baRoot.childNodes(0).firstChild.nextSibling.parentNode    Author
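The navigation in this table can be tried out with a short sketch. It again assumes Python's xml.dom.minidom in place of the Microsoft parser, so collections are indexed with square brackets rather than parentheses. The variable ba_root stands in for baRoot, and the XML string holds a single Author to match the simplified side of the tree; both are choices made for this example.

```python
from xml.dom import minidom

# One Author only, matching the simplified side of the tree discussed above.
XML = (
    "<BookAuthors><Author>"
    "<au_id>1001</au_id><au_lname>Gates</au_lname><au_fname>Bill</au_fname>"
    "</Author></BookAuthors>"
)

ba_root = minidom.parseString(XML).documentElement  # plays the role of baRoot
author = ba_root.childNodes[0]

# Each expression from the table, paired with the name of the node it reaches.
navigation = {
    "childNodes[0]": author.nodeName,
    "childNodes[0].firstChild": author.firstChild.nodeName,
    "childNodes[0].firstChild.nextSibling":
        author.firstChild.nextSibling.nodeName,
    "childNodes[0].firstChild.parentNode":
        author.firstChild.parentNode.nodeName,
    "childNodes[0].lastChild": author.lastChild.nodeName,
}

for expression, target in navigation.items():
    print(f"ba_root.{expression} -> {target}")

# With a single child, firstChild and lastChild are the same node.
print(ba_root.firstChild is ba_root.lastChild)
```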

Specific DOM Objects

XML was designed to be eXtensible, and data integration and data exchange are among its key features. It was built to cater to a tremendous variety of documents. To handle this, the DOM provides specific objects for the different types of node. What makes them interesting is that they inherit most of the properties and methods of the Node object, and add methods and properties relevant to the particular node type. The following table lists the specific DOM objects:

Object                  Description
Document                The root object for an XML document.
DocumentType            Stores information about the DTD or Schema associated with the XML document (e.g. !DOCTYPE in a DTD).
DocumentFragment        A lightweight document object. Useful for temporary storage or document insertions.
Element                 An XML element.
Attribute or Attr       An XML attribute.
Entity                  A parsed or unparsed entity (e.g. !ENTITY in a DTD).
EntityReference         An XML entity reference.
Notation                A notation (e.g. !NOTATION in a DTD).
CharacterData           The base object for text information in an XML document.
CDATASection            Unparsed character data (e.g. a <![CDATA[ ]]> section).
Text                    The text content of an element or attribute node.
Comment                 An XML comment.
ProcessingInstruction   A processing instruction, as held in a <? ?> section.
Implementation          Application-specific implementation details.
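To see a few of these specific objects side by side, here is a small sketch, once more assuming Python's xml.dom.minidom rather than the Microsoft parser. The comment, the processing instruction named render and the country attribute are hypothetical additions to our sample data, included only so that each kind of node appears once.

```python
from xml.dom import minidom

# Hypothetical sample: the comment, the "render" processing instruction and
# the "country" attribute are not part of the article's BookAuthors.xml.
SAMPLE = (
    "<!-- author list -->"
    '<?render mode="plain"?>'
    '<BookAuthors country="UK">'
    "<Author><au_id>1002</au_id></Author>"
    "</BookAuthors>"
)

document = minidom.parseString(SAMPLE)
root = document.documentElement
au_id = root.firstChild.firstChild

nodes = [
    document,                    # Document
    document.childNodes[0],      # Comment
    document.childNodes[1],      # ProcessingInstruction
    root,                        # Element
    root.attributes["country"],  # Attr, looked up by name in the NamedNodeMap
    au_id.firstChild,            # Text
]

# Pair each object's class name with its numeric node type.
kinds = [(type(node).__name__, node.nodeType) for node in nodes]
for kind, node_type in kinds:
    print(f"{kind:<22} nodeType={node_type}")

# The same attribute is reachable by index as well as by name.
print(root.attributes.item(0).nodeName, root.attributes.length)
```

Every object in the list still answers to the generic Node properties such as nodeName and nodeType, which is the inheritance described above.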

Working With XML Data

So, let us quickly write some sample code, in a page called TravelXML.html, to see how the DOM is used to traverse an XML document. We are going to use Internet Explorer here, with an XML data island. A data island is simply an HTML tag that acts like a data control.


<xml ID="diData" SRC="BookAuthors.xml"></xml>

Above we have a data island named diData, containing data from the XML file BookAuthors.xml. Please note that data islands are containers for data; they don’t actually show up on the screen. So we need a way to access the data and display it.


<SPAN ID="txtData"></SPAN>

Our aim is to use the DOM objects to extract the XML information from the data island and display it in the SPAN. We start with the root node and display its details: its name, type and value. We then find its child nodes and repeat the process for each of them, because a child node can contain nodes of its own. This is a tree traversal, so we write a recursive function for it.

One major piece of information we are going to display is the node’s type. The DOM reports it as a number, so we convert it into a string to make it readable. To do this we declare a global array containing the text description of each node type, indexed by the actual node type number. This declaration goes at the very beginning of the script, before the rest of the JScript code.


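// Descriptions indexed by node type number. Index 0 is unused.
var ga_strNodeTypes = new Array(
    '',
    'ELEMENT (1)',
    'ATTRIBUTE (2)',
    'TEXT (3)',
    'CDATA SECTION (4)',
    'ENTITY REFERENCE (5)',
    'ENTITY (6)',
    'PROCESSING INSTRUCTION (7)',
    'COMMENT (8)',
    'DOCUMENT (9)',
    'DOCUMENT TYPE (10)',
    'DOCUMENT FRAGMENT (11)',
    'NOTATION (12)'
);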

The recursive function that we will be calling is showChildNodes. It takes two parameters: an XML node, and an integer that indicates the current level of the node in the hierarchy.


function showChildNodes(baNode, intLevel)
{
    var strNodes = '';      // a string containing the node information
    var intCount = 0;       // the count of attributes or child nodes
    var intNode = 0;        // the index of the current child node
    var intAttr = 0;        // the index of the current attribute
    var baAttrList = null;  // the attributes of a particular node

    // Build the string beginning with the current node name, its type and
    // its value. An integer identifies the type, and the previously defined
    // array ga_strNodeTypes is used to get the description of the node type.
    // The getIndent function returns a blank string containing spaces up to
    // the level in the tree.
    strNodes += getIndent(intLevel) + '<b>' + baNode.nodeName +
        '</b>   Type: <b>' + ga_strNodeTypes[baNode.nodeType] +
        '</b>   Value: <b>' + baNode.nodeValue + '</b><br>';

    // Find out if the node has any attributes. If so, loop through them,
    // adding their details to the string.
    baAttrList = baNode.attributes;
    if (baAttrList != null)
    {
        intCount = baAttrList.length;
        for (intAttr = 0; intAttr < intCount; intAttr++)
        {
            strNodes += getIndent(intLevel) + '<b>' + baAttrList(intAttr).nodeName +
                '</b>   Type: <b>' + ga_strNodeTypes[baAttrList(intAttr).nodeType] +
                '</b>   Value: <b>' + baAttrList(intAttr).nodeValue + '</b><br>';
        }
    }

    // Finally, check for any child nodes, and call this same function
    // for each of them, one level further down.
    intCount = baNode.childNodes.length;
    for (intNode = 0; intNode < intCount; intNode++)
    {
        strNodes += showChildNodes(baNode.childNodes(intNode), intLevel + 1);
    }

    return strNodes;
}

To display the output from the above code using DOM, you could use the following:


var domXMLData = diData;
txtData.innerHTML = showChildNodes(domXMLData, 0);
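For readers without Internet Explorer at hand, the same recursive walk can be sketched outside the browser. The version below is an approximation under stated assumptions, not the article's own code: it uses Python's xml.dom.minidom, returns a list of plain-text lines instead of building HTML, indents with two spaces per level in place of getIndent, and uses the names show_child_nodes and NODE_TYPE_NAMES, which are invented for this example.

```python
from xml.dom import minidom

# Descriptions keyed by the numeric DOM node type.
NODE_TYPE_NAMES = {
    1: "ELEMENT", 2: "ATTRIBUTE", 3: "TEXT", 4: "CDATA SECTION",
    5: "ENTITY REFERENCE", 6: "ENTITY", 7: "PROCESSING INSTRUCTION",
    8: "COMMENT", 9: "DOCUMENT", 10: "DOCUMENT TYPE",
    11: "DOCUMENT FRAGMENT", 12: "NOTATION",
}


def describe(node, level):
    """Format one node's name, type and value, indented by its level."""
    kind = NODE_TYPE_NAMES[node.nodeType]
    return (f"{'  ' * level}{node.nodeName}  "
            f"Type: {kind} ({node.nodeType})  Value: {node.nodeValue}")


def show_child_nodes(node, level=0):
    """Return one line for the node, its attributes, and all descendants."""
    lines = [describe(node, level)]

    # Only some node types carry attributes; the others report None.
    attributes = node.attributes
    if attributes is not None:
        for index in range(attributes.length):
            lines.append(describe(attributes.item(index), level))

    # Recurse into each child, one level further down the tree.
    for child in node.childNodes:
        lines.extend(show_child_nodes(child, level + 1))
    return lines


document = minidom.parseString(
    "<BookAuthors><Author><au_id>1001</au_id></Author></BookAuthors>"
)
lines = show_child_nodes(document)
print("\n".join(lines))
```

The recursion has the same shape as the JScript function: describe the node, then its attributes, then hand each child to the same function with the level increased by one.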

Thrilling Results With DOM

The above code calls the function, passing in the top-level node and a starting level of 0. Load TravelXML.html in Internet Explorer (IE) and the output will look something like this:

Traversing the Nodes in an XML Document

#document  Type: DOCUMENT (9)  Value: null
  BookAuthors  Type: ELEMENT (1)  Value: null
    Author  Type: ELEMENT (1)  Value: null
      au_id  Type: ELEMENT (1)  Value: null
        #text  Type: TEXT (3)  Value: 1001
      au_lname  Type: ELEMENT (1)  Value: null
        #text  Type: TEXT (3)  Value: Gates
      au_fname  Type: ELEMENT (1)  Value: null
        #text  Type: TEXT (3)  Value: Bill
    Author  Type: ELEMENT (1)  Value: null
      au_id  Type: ELEMENT (1)  Value: null
        #text  Type: TEXT (3)  Value: 1002
      au_lname  Type: ELEMENT (1)  Value: null
        #text  Type: TEXT (3)  Value: Potter
      au_fname  Type: ELEMENT (1)  Value: null
        #text  Type: TEXT (3)  Value: Harry

 

Hopefully this gives you a clear and quick picture of the recursive nature of the XML DOM. At the top we have the #document node, which is the inherent parent, that is, the root node of every XML document. Pay careful attention, though: it is not actually an element; it has a type of DOCUMENT. So the root of the XML data is an XML document, and under that you have XML elements.

 

In our case the root element is the BookAuthors element. It in turn contains an element for each Author, and each Author contains an element for each of its properties. We also notice something extra under each leaf element (i.e. an element with no child elements): another node called #text, which actually contains the text of the element. You may ask why each element has a value of null while its child node called #text holds the value. The answer is simple. An element may contain other elements as well as text. If a node contains both text and other nodes, what would its value be: the text or the child nodes? This led the W3C to specify that the text for a node is always held in a child node of type Text.
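A short sketch makes the point concrete. It assumes Python's xml.dom.minidom, and the note and ref elements are hypothetical, added only to show an element that mixes text with a child element.

```python
from xml.dom import minidom

# "note" and "ref" are invented for this example; au_lname is from the article.
document = minidom.parseString(
    "<Author><au_lname>Potter</au_lname>"
    "<note>See <ref>1001</ref> too</note></Author>"
)
au_lname = document.documentElement.firstChild
note = document.documentElement.lastChild

print(au_lname.nodeValue)             # the element itself carries no value
print(au_lname.firstChild.nodeValue)  # the text is held by its child Text node

# Mixed content: text and an element sit side by side as children of note.
mixed = [(child.nodeName, child.nodeValue) for child in note.childNodes]
print(mixed)
```

With mixed content there is no single obvious value for the note element, which is why the text is kept in separate Text children.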

 

So this leaves us with the final part of the code: when we accessed a node, we did not explicitly step down another level in the tree to reach its child. Here is the fragment that reads the value:

 

    '</b>   Value: <b>' + baNode.nodeValue + '</b><br>';

 

Voila! We have used the nodeValue property, too. Note that nodeValue is itself part of the W3C DOM: for an element it is null, and the W3C specified that to get at the text you have to traverse to the child and read the associated Text node, as the output above shows. Microsoft felt that such a common action as getting the text a node holds ought to be simpler, and so its parser adds the text property as an extension to the DOM, which returns the text content of a node directly.

 

This tutorial has introduced the DOM and shown how it stores XML data in a tree structure. Now that you understand how XML documents are structured and have been introduced to the DOM, next we can take a look at how XML integrates with ADO.
