Introduction to XML for Database Developers - XML Namespaces
Some entities from different areas of a document can have the same name. For example, you could receive a purchase order document that contains a <name> tag for the customer and a <name> tag for the company. People reading this document would be able to distinguish them by their context. However, an application would need additional information to interpret the data correctly.
A solution to this problem is to create XML namespaces to provide the XML document with a vocabulary (that is, a context). After that, customer and company names can be referenced using a context prefix:
<contact:name>Tom Jones</contact:name>
<Company:name>Trigon Blue</Company:name>
Naturally, before these prefixes can be used, they have to be defined. The root element of the following document contains three attributes, each of which specifies a namespace and a prefix used to reference it:
<PurchaseOrders
  xmlns:contact="http://www.trigonblue.com/schemas/Contact.xsd"
  xmlns:Company="http://www.trigonblue.com/schemas/Company.xsd"
  xmlns:dsig="http://dsig.org">
  <PurchaseOrder>
    <Customer>
      <contact:name>Tom Jones</contact:name>
    </Customer>
    <PurchaseDate>2000-09-11</PurchaseDate>
    <SalesOrganization>
      <Company:name>Trigon Blue</Company:name>
      <Company:DUNS>817282919</Company:DUNS>
      <Company:ID>1212</Company:ID>
    </SalesOrganization>
    <dsig:digital-signature>78901314</dsig:digital-signature>
  </PurchaseOrder>
</PurchaseOrders>
In some cases, it is critical that the namespace points to an actual URL for a resource so that the XML document can be processed correctly. In others (as in the preceding XML document), it is only important that the URI string in the namespace is globally unique (that is, that no other XML document is using the same URI for some other purpose).
Even when you have to use a specific namespace in an XML document, you can still choose the prefix arbitrarily. However, some prefixes are traditionally associated with particular namespaces. For example, XML Schema documents traditionally use the xsd prefix and UpdateGrams (see Chapter 15) use the updg prefix.
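To illustrate that the namespace URI, not the prefix, is what an application actually sees, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It reuses the two namespace URIs from the purchase order above; the helper names make_doc and names, and the alternative prefixes c and org, are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs taken from the purchase order example.
CONTACT_NS = "http://www.trigonblue.com/schemas/Contact.xsd"
COMPANY_NS = "http://www.trigonblue.com/schemas/Company.xsd"


def make_doc(contact_prefix, company_prefix):
    """Build a small purchase order using the given (arbitrary) prefixes."""
    return f"""<PurchaseOrders
    xmlns:{contact_prefix}="{CONTACT_NS}"
    xmlns:{company_prefix}="{COMPANY_NS}">
  <PurchaseOrder>
    <Customer>
      <{contact_prefix}:name>Tom Jones</{contact_prefix}:name>
    </Customer>
    <SalesOrganization>
      <{company_prefix}:name>Trigon Blue</{company_prefix}:name>
    </SalesOrganization>
  </PurchaseOrder>
</PurchaseOrders>"""


def names(xml_text):
    """Return (customer name, company name) by looking up {uri}localname."""
    root = ET.fromstring(xml_text)
    # The parser expands every prefix to its URI, so the lookup never
    # mentions the prefix that the document author happened to choose.
    customer = root.find(f".//{{{CONTACT_NS}}}name").text
    company = root.find(f".//{{{COMPANY_NS}}}name").text
    return customer, company


book_prefixes = names(make_doc("contact", "Company"))
other_prefixes = names(make_doc("c", "org"))
print(book_prefixes, other_prefixes)
```

Both calls return the same pair of names, because the two <name> elements are distinguished by their namespace URIs rather than by the prefix text.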
Structure of XML Documents
XML documents consist of three parts, as you can see in the following illustration:
The first part of the document, called the prolog, is optional. It can contain the XML declaration, a document type declaration (not to be confused with the Document Type Definition, or DTD, that it references or contains), processing instructions, and comments. The second part of the document is the body, which contains the document’s elements. The data in these elements is organized into a hierarchy of elements, their attributes, and their content. Sometimes an XML document contains a third part, an epilog, which is an optional part that can hold final comments, processing instructions, or just white space.
XML Parsers and DOM
Applications (or user agents) that use XML documents can use proprietary procedures to access the data in them. Usually, such applications use special components called XML parsers. An XML parser is a program or component that loads the XML document into an internal hierarchical structure of nodes (see Figure 13-1) and provides access to the information stored in these nodes to other components or programs.
The XML Document Object Model (DOM) is a set of standard objects, methods, events, and properties used to access elements of an XML document. DOM is a specification that has received Recommendation status from the W3C. Different software vendors have created their own implementations of DOM so that you can use it from (almost) any programming language on (almost) any platform.
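As a rough illustration of what working with a DOM implementation looks like, the following sketch uses xml.dom.minidom from the Python standard library (one of many DOM implementations, not the Microsoft component discussed in this chapter). The sample document is a trimmed, namespace-free version of the earlier purchase order, used here only for brevity.

```python
from xml.dom.minidom import parseString

# The parser loads the text into a tree of nodes.
doc = parseString(
    """<PurchaseOrder>
  <PurchaseDate>2000-09-11</PurchaseDate>
  <SalesOrganization><name>Trigon Blue</name><ID>1212</ID></SalesOrganization>
</PurchaseOrder>"""
)

# documentElement is the root element node of the tree.
root = doc.documentElement

# Element content is stored in a child text node, not on the element itself.
date_node = doc.getElementsByTagName("PurchaseDate")[0]
purchase_date = date_node.firstChild.data

# childNodes also contains white-space text nodes, so keep element nodes only.
child_elements = [
    node.nodeName for node in root.childNodes if node.nodeType == node.ELEMENT_NODE
]

print(root.tagName, purchase_date, child_elements)
```

The same navigation concepts (document element, child nodes, text nodes) carry over to other DOM implementations, although object and method names can differ slightly between them.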
Microsoft initially implemented DOM as a COM component called Microsoft.XMLDOM in msxml.dll. Microsoft used to call it Microsoft XML Parser, but at the time of this writing it is called Microsoft XML Core Services. It is delivered, for example, with Internet Explorer, or you can download it separately from Microsoft’s web site. Developers can use it from any programming language that can access COM components or ActiveX objects (for example, Visual Basic, Visual Basic .NET, VBScript, Visual C# .NET, Visual J++, JScript, and Visual C++).

Figure 13-1. A possible graphical interpretation of a node tree
Nevertheless, it is unlikely that you will use DOM from Transact-SQL. Microsoft has built special tools for development in Transact-SQL (which are reviewed in the next chapter).
XML Document Quality
There are two levels of document quality in XML: well-formed documents and valid documents.
An XML document is said to be a well-formed document when
- There is one and only one root element.
- All elements that are not empty are marked with start and end tags.
- The order of the elements is hierarchical; that is, an element A that starts within an element B also ends within element B.
- Attributes do not occur twice in one element.
- All entities used have been declared.
An XML document is said to be a valid document when
- The XML document is well-formed.
- The XML document complies with a specified DTD document.
The concept of a valid document has been ported to XML from SGML. In SGML, all documents must be valid; in other words, they must comply with the rules defined in the DTD. XML is not so strict. It is possible to use an XML document even without a DTD document. If the user agent knows how to use the XML document without the DTD, then the DTD need not even be sent over the Internet; sending it would only increase traffic and tie up bandwidth.
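The well-formedness rules listed earlier can be tried out with any XML parser, because a parser must reject a document that breaks them. The sketch below uses Python's xml.etree.ElementTree, which checks well-formedness only and does not validate against a DTD; the function name is_well_formed and the sample strings are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


samples = {
    "single root, properly nested": "<a><b>text</b></a>",
    "two root elements": "<a/><b/>",
    "missing end tag": "<a><b></a>",
    "overlapping elements": "<a><b></a></b>",
    "attribute repeated": '<a x="1" x="2"/>',
    "undeclared entity": "<a>&undeclared;</a>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Only the first sample is accepted; each of the others breaks one of the rules in the list. Checking validity requires a validating parser and a DTD, which this sketch does not attempt.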
XML Schema and XML Schemas
The DTD is not the only type of document that can store rules for an XML document. At the current time, several companies (including Microsoft) have submitted a proposal to W3C for an alternative type of metadata document called the XML Schema. In fact, there are other proposed standards for the same use, which are all referred to as XML schemas. In May of 2001, W3C published its XML Schema Recommendation, which should gradually replace all other XML schemas. However, some of these schemas (such as the one defined by Microsoft) are already in use.
XML schemas are written in an XML language for defining the business rules with which a class of XML documents (data) must comply in order to be valid.
These are the major differences between a DTD and an XML schema:
- XML schemas support data types and range constraints.
- XML schemas allow users to define new data types.
- The language in which XML schemas are written is XML. Developers do not have to learn an additional language as they do with DTDs.
- XML schemas support namespaces (XML entities for defining context).
Why are XML schemas important? A huge portion of application development resources is spent on checking whether data complies with (business) rules about structure and content. If you have a simple language to define the structure and content of data (that is, the business rules by which it is constrained) and you have a schema validator (a tool or program that can check compliance), you will be able to reduce development resource requirements significantly, and therefore reduce the cost to implement applications.
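To make concrete the kind of checking that a schema validator takes off the developer's hands, here is a hand-written stand-in in Python. It is not an XML Schema validator and uses no schema language; the rules (an ISO date for PurchaseDate, an ID between 1 and 9999) and all function names are assumptions invented for this sketch.

```python
import xml.etree.ElementTree as ET
from datetime import date


def is_iso_date(text):
    """Data type rule: the value must be a calendar date such as 2000-09-11."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def is_id_in_range(text):
    """Range rule (hypothetical): the value must be an integer from 1 to 9999."""
    return text.isdigit() and 1 <= int(text) <= 9999


# Hypothetical rule table: element name -> function that checks its content.
RULES = {"PurchaseDate": is_iso_date, "ID": is_id_in_range}


def check(xml_text):
    """Return a list of rule violations; an empty list means all rules passed."""
    errors = []
    root = ET.fromstring(xml_text)
    for tag, rule in RULES.items():
        node = root.find(f".//{tag}")
        if node is None:
            errors.append(f"missing <{tag}>")
        elif not rule((node.text or "").strip()):
            errors.append(f"bad value in <{tag}>: {node.text!r}")
    return errors


good = "<PurchaseOrder><PurchaseDate>2000-09-11</PurchaseDate><ID>1212</ID></PurchaseOrder>"
bad = "<PurchaseOrder><PurchaseDate>11/09/2000</PurchaseDate><ID>0</ID></PurchaseOrder>"

print(check(good))
print(check(bad))
```

With a schema and a validator, the rule table and the checking loop would be replaced by a declarative schema document, which is the saving in development effort that the paragraph above describes.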
This article was excerpted from chapter 13 of SQL Server 2000 Stored Procedure & XML Programming, second edition, written by Dejan Sunderic (McGraw-Hill/Osborne, 2004; ISBN: 0072228962).