Introduction to XML for Database Developers

Relational database management systems have been around since the days before the Internet, and Microsoft SQL Server is no exception. In its latest version, it is compatible with XML, which allows it to solve the old challenges faced by RDBM systems (and some new ones) in new ways. This article was excerpted from chapter 13 of SQL Server 2000 Stored Procedure & XML Programming, second edition, written by Dejan Sunderic (McGraw-Hill/Osborne, 2004; ISBN 0072228962).

Contributed by
Rating: 4 stars4 stars4 stars4 stars4 stars / 10
June 22, 2005
Rate this Article:
MEH MEH++


SEARCH ASP FREE
TOOLS YOU CAN USE

advertisement

Microsoft SQL Server has become a giant among the select group of enterprise-ready relational database management systems, but as with those other RDBMSs, its roots are in pre-Internet solutions. The Internet revolution has highlighted a set of old tactical and strategic challenges for the Microsoft SQL Server development team. These challenges include the following:

  • Storing the large amounts of textual information that web-based, user-friendly database applications require
  • Delivering that textual (and other) stored information to the Web
  • Sharing information with other departments and organizations that may not use the same RDBMS system

In earlier editions of SQL Server, Microsoft addressed these issues with features such as Full Text Search, the Web Publishing Wizard, DTS, ADO, and OLE DB. SQL Server 2000 introduces XML compatibility—the new holy grail of the computing industry and the latest attempt to tackle the same old problems.
--------------------------------------------------------------------

XML (R)evolution

To communicate with customers in today’s rich-content world, you need to provide them with information. Until very recently, such information was inevitably encapsulated in proprietary, document-based formats that are not shared easily. For example, word processor documents are optimized for delivery on paper, and relational databases are often structured and normalized in formats unsuitable to end users.

The first step in the right direction was the Standard Generalized Markup Language (SGML). Although it was designed in the late 1960s (by Charles Goldfarb), it became the international standard for defining markup languages in 1986, after the creation of the ISO standard. In the late 1980s, companies and government agencies started to adopt this tag-based language. It allowed them to create and manage paper documentation in a way that was easy to share with others.

Then, in the 1990s, the Web appeared on the scene and our collective focus shifted from isolated islands of personal computers and local networks to a global network of shared information. SGML’s tagged structure would seem to make it a perfect candidate to lead the Internet revolution, but the complexity of SGML makes it difficult to work with and unsuitable for web application design.

Instead of SGML, the developers of the Internet adopted the Hypertext Markup Language (HTML), a simple markup language used to create hypertext documents that are portable from one platform to another. HTML is a simplified subset of SGML. It was originally defined in 1991 by Tim Berners-Lee as a way to organize, view, and transfer scientific documents across different platforms. It uses the Hypertext Transfer Protocol (HTTP) to transfer information over the Internet. This new markup language was an exciting development and soon found nonscientific applications. Eventually, companies and users started to use it as a platform for e-commerce—the processing of business transactions without the exchange of paper-based business documents.

Unfortunately, HTML has some disadvantages. One of the biggest arises as a result of its main purpose. HTML is designed to describe only how information should appear—that is, its format. It was not designed to define the syntax (logical structure) or semantics (meaning) of a document. It could make a document readable to a user, but it required that user to interact with, and interpret, the document. The computer itself could not parse the document because the necessary metadata (literally, data about the data) was not included with the document.

Another problem with HTML is that it is not extensible. It is not possible to create new tags. HTML is also a “standard” that exists in multiple versions—and multiple proprietary implementations. Web developers know that they have to test even their static HTML pages in all of the most popular browsers (and often in several versions of each) because each browser (and each version of each browser) implements this “standard” somewhat differently. Different development tool sets support different versions of this standard (and often different features within a single standard).

In 1996, a group working under the auspices of the World Wide Web Consortium (W3C) created a new standard tagged language called the eXtensible Markup Language (XML). It was designed to address some of the problems of HTML and SGML. XML is a standardized document formatting language (again, a subset of SGML) that enables a publisher to create a single document source that can be viewed, displayed, or printed in a variety of ways. As is the case with HTML, XML is primarily designed for use on the Internet. However, as already mentioned, HTML is designed primarily to address document formatting issues, while XML addresses issues relating to data and object structure. XML is also extensible in that it provides a standard mechanism for any document builder to define new XML tags within any XML document. Its features lower the barriers for creation of integrated, multiplatform, application-to-application protocols.
--------------------------------------------------------------------

Introduction to XML

In today’s world, words such as “tag,” “markup,” “element,” “attributes,” and “schema” are buzzwords that you can hear anywhere (well, at least in the IT industry), but what do these terms mean in the context of markup languages?

Introduction to Markup Languages

In a broader sense, a markup is anything that you place within a document that provides additional meaning or additional information. For example, this book uses italic font to emphasize each new phrase or concept that is defined or introduced. I have a habit of using a highlighter when I am reading books. Each time I use my highlighter, I change the format of the text as a means of helping me find important segments later.

Markups usually define

  • Formatting
  • Structure
  • Meaning

A reader has to have an implicit set of rules for placing markups in a document—otherwise those markups are meaningless to that reader. A markup language is a set of rules that defines

  • What constitutes a markup
  • What a markup means
Building Blocks of Markup Languages

The syntax of markup languages such as SGML, HTML, and XML is based on tags, elements, and attributes.

A tag is a markup language building block that consists of delimiters (angled brackets) and the text between them:

<TITLE>

An element is a markup language part that consists of a pair of tags and the text between them:

<TITLE>SQL Server 2000 Stored Procedure Programming</TITLE>

Each element has an opening tag and a closing tag. The text between these tags is called the content of the element.

An attribute is a component in the form of a name/value pair that delimits a tag:

<font size="2">

Okay, suppose you have created a document and have marked up some parts of it. Now what? You can share it with others. They will use something called a user agent to review the document. In a broader context, a user agent could be a travel agent that helps a customer buy tickets for a trip. However, in the IT industry, a user agent is a program that understands the markup language and presents information to an end user. An example of such a program is a web browser designed to present documents created using HTML.

XML Elements and Attributes

The following is a simple example of an XML document:

<Inventory>
 
<Asset Inventoryid="5">
   
<Equipment>Toshiba Portege 7020CT<Equipment>
   
<EquipmentType>Notebook</EquipmentType>
    
<LocationId>2<LocationId>
   
<StatusId>1<;/StatusId>;
    
<LeaseId>1234</LeaseId>
    
<LeaseScheduleId>1414</LeaseScheduleId>
    
<OwnerId>83749271</OwnerId>
    
<Cost>6295.00</Cost>
   
<AcquisitionType>Lease</AcquisitionType>
  </Asset>
</Inventory>

An XML document must contain one or more elements. One of the elements is not part of any other element and therefore is called the document’s root element. It must be uniquely named. In the preceding example, the root element is named Inventory.

Each element can, in turn, contain one or more elements. In the preceding example, the Inventory element contains one Asset element. The Asset element also contains other elements (Equipment, EquipmentType, and so on). The Equipment element contains just its content—the text string "Toshiba Portege 7020CT."

Unlike HTML, XML is case sensitive. Therefore, Asset, asset, and ASSET would represent different elements.

It is possible to define an empty element. Such elements can be displayed using standard opening and closing tags:

<Inventory></Inventory>

or using special notation:

<Inventory/>

If an element contains attributes but no content, an empty element is an efficient way to write it:

<Asset Inventoryid="5"/>

An element can have more than one attribute. The following example shows an empty element that contains nine attributes:

<Asset Inventoryid="12" EquipmentId="1" LocationId="2" StatusId="1"
LeaseId="1" LeaseScheduleId="1" OwnerId="1" Lease="100.0000"
AcquisitionTypeID="2"/>

You are not allowed to repeat an attribute in the same tag. The following example shows a syntactically incorrect element:

<InventoryInventoryid="12"Inventoryid="13"/>

Processing Instructions

An XML document often starts with a tag that is called a processing instruction. For example, the following processing instruction notifies the reader that the document it belongs to is written in XML that complies with version 1.0:

<?xml version="1.0"?>

A processing instruction has the following format:

<?name data?>

The name portion identifies the processing instruction to the application that is processing the XML document. Names must start with xml. The data portion that follows is optional and includes information that may be used by the application.

TIP


Although it is not required, it is recommended that you start an XML document with a processing instruction that explicitly identifies that document as an XML document defined using a specified version of the standard.

Document Type Definition

As mentioned earlier, markups are meaningless if defining rules for the following is not possible:

  • What constitutes a markup
  • What a markup means

A Document Type Definition (DTD) is a type of document that is often used to define such rules for XML documents. The DTD contains descriptions and constraints (naturally, not Transact-SQL constraints) for each element (such as the order of element attributes and membership). User agents can use the DTD file to verify that an XML document complies with its rules.

The DTD can be an external file that is referenced by an XML document:

<!DOCTYPE Inventory SYSTEM "Inventory.dtd">

or it can be part of the XML document itself:

<?xml version="1.0"?>
<!DOCTYPE Inventory[
<!ELEMENT Inventory (Asset+)>
<!ELEMENT Asset (EquipmentId, LocationId, StatusId, LeaseId,
                LeaseScheduleId, OwnerId, Cost, AcquisitionTypeID)>
<!ATTLIST Asset Inventoryid CDATA #IMPLIED>
<!ELEMENT EquipmentId
(#PCDATA)>
<!ELEMENT LocationId (#PCDATA)>
<!ELEMENT StatusId (#PCDATA)>
<!ELEMENT LeaseId (#PCDATA)>
<!ELEMENT LeaseScheduleId (#PCDATA)>
<!ELEMENT OwnerId (#PCDATA)>
<!ELEMENT Cost (#PCDATA)>
<!ELEMENT AcquisitionTypeID (#PCDATA)>
]>
<Inventory>
  <Asset Inventoryid="5">
    <EquipmentId>1</EquipmentId>
   
<LocationId>2</LocationId>
   
<StatusId>1</StatusId>
   
<LeaseId>1</LeaseId>
   
<LeaseScheduleId>1</LeaseScheduleId>
   
<OwnerId>1</OwnerId>
   
<Cost>1295.00</Cost>
   
<AcquisitionTypeID>1</AcquisitionTypeID>
 
</Asset>
</Inventory>

The DTD document does not have to be stored locally. A reference can include a URL or URI that provides access to the document:

<!DOCTYPE Inventory SYSTEM http://www.trigonblue.com/dtds/Inventory.dtd>

A Uniform Resource Identifier (URI) identifies a persistent resource on the Internet. It is a number or name that is globally unique. A special type of URI is a Uniform Resource Locator (URL) that defines a location of a resource on the Internet. A URI is more general because it should find the closest copy of a resource and because it would eliminate problems in finding a resource that was moved from one server to another.

NOTE


In some cases, it is not important that a URI points to a specific resource, but the string that is supplied must be globally unique, meaning no other XML document (that can be merged with the current XML document) is using the same string for some other resource. However, there are also cases in which a URI points to a specific resource on the Internet and the content of the string is critical for proper processing of an XML document.

XML Comments and CDATA sections

It is possible to write comments within an XML document. The basic syntax of the comment is

<!--commented text-->

where commented text can be any character string that does not contain two consecutive hyphens (--) and that does not end with a hyphen (-).

Comments can stretch over more than one line:

<!-- This is a comment. -->
<!--
This is another comment.
-->

Comments cannot be part of any other tag:

<Order <!-- This is an illegal comment. --> OrderId = "123">
...
</Order>

You can use CDATA sections in XML documents to insulate blocks of text from XML parsers. For example, if you are writing an article about XML and you want also to store it in the form of an XML document, you can use CDATA sections to force XML parsers to ignore markups with sample XML code.

The basic syntax of a CDATA section is

<![CDATA[string]]>

The string can be any character string that does not contain the string ]]>.

CDATA sections can occur anywhere in an XML document where character data is allowed:

<Example>
  <Text>
   
<![CDATA[<Inventory Inventoryid="12"/>]]>
 
</Text>
</Example>

Character and Entity References

Like HTML and SGML, XML also includes a simple way to reference characters that do not belong to the ASCII character set. The syntax of a character reference is

&#dec-value;
&#xhex-value;

The decimal (dec-value) or hexadecimal (hex-value) code of the character must be preceded by &# or &#x, respectively, and followed by a semicolon (;).

Entity references are used in XML to insert characters that would cause problems for the XML parser if they were inserted directly into the document. This type of reference is basically a mnemonic alternative to a character reference. There are five basic entity references:

Entity

Meaning

&amp;

&

&apos;

'

&lt;

<

&gt;

>

&quot;

"

Entity references are often used to represent characters with special meaning in XML. In the following example, entity references are used to prevent the XML parser from parsing the content of the Text element:

<Example>
  <Text>
    &lt;
Inventory Inventoryid="12"/&gt;
 
</Text>
</Example>

XML Namespaces

Some entities from different areas of a document can have the same name. For example, you could receive a purchase order document that contains a <name> tag for the customer and a <name> tag for the company. People reading this document would be able to distinguish them by their context. However, an application would need additional information to interpret the data correctly.

A solution to this problem is to create XML namespaces to provide the XML document with a vocabulary (that is, a context). After that, customer and company names can be referenced using a context prefix:

<contact:name>Tom Jones</contact:name>
<Company:name>Trigon Blue</Company:name>

Naturally, before these prefixes can be used, they have to be defined. The root element of the following document contains three attributes, each of which specifies a namespace and a prefix used to reference it:

<PurchaseOrders 
    xmlns:contact=
http://www.trigonblue.com/schemas/Contact.xsd"   
    xmlns:Company=
http://www.trigonblue.com/schemas/Company.xsd
    xmlns:dsig="http://dsig.org">
  
>PurchaseOrder>
      <Customer>
       
<contact:name>Tom Jones>/contact:name>
      
>/Customer>
<PurchaseDate>2000-09-11</PurchaseDate>
<SalesOrganization>
      <Company:name>Trigon Blue</Company:name>
      <Company:DUNS>817282919</Company:DUNS>
      <Company:ID>1212</Company:ID>
    </SalesOrganization>
    <dsig:digital-signature>78901314</dsig:digital-signature>
  </PurchaseOrder>
</PurchaseOrders>

In some cases, it is critical that the namespace points to an actual URL for a resource so that the XML document can be processed correctly, but in some cases (as in the preceding XML document), it is only important that the URI string in the namespace is globally unique (that is, that no other XML document is using the same URI for some other purpose).

Even when you have to use a specific namespace in an XML document, you can still arbitrarily chose a prefix. However, some prefixes are traditionally associated with some namespaces. For example, XML Schema documents traditionally use the xsd prefix and UpdateGrams (see Chapter 15) use the updg prefix.

Structure of XML Documents

XML documents consist of three parts, as you can see in the following illustration:

 

The first part of the document, called the prolog or document type declaration (not Document Type Definition), is optional. It can contain processing instructions, a DTD, and comments. The second part of the document is the body, which contains the document’s elements. The data in these elements is organized into a hierarchy of elements, their attributes, and their content. Sometimes an XML document contains a third part, an epilog, which is an optional part that can hold final comments, processing instructions, or just white space.

XML Parsers and DOM

Applications (or user agents) that use XML documents can use proprietary procedures to access the data in them. Usually, such applications use special components called XML parsers. An XML parser is a program or component that loads the XML document into an internal hierarchical structure of nodes (see Figure 13-1) and provides access to the information stored in these nodes to other components or programs.

The XML Document Object Model (DOM) is a set of standard objects, methods, events, and properties used to access elements of an XML document. DOM is a specification that has received Recommended status from the W3C. Different software vendors have created their own implementations of DOM so that you can use it from (almost) any programming language on (almost) any platform.

Microsoft has initially implemented DOM as a COM component called Microsoft .XMLDOM in msxml.dll. Microsoft used to call it Microsoft XML Parser, but at the


Figure 13-1.  A possible graphical interpretation of a node tree

time of this writing it is called Microsoft XML Core Services. It is delivered, for example, with Internet Explorer, or you can download it separately from Microsoft’s web site. Developers can use it from any programming language that can access COM components or ActiveX objects (for example, Visual Basic, Visual Basic .NET, VBScript, Visual C# .NET, Visual J++, JScript, and Visual C++).

Nevertheless, it is unlikely that you will use DOM from Transact-SQL. Microsoft has built special tools for development in Transact-SQL (which are reviewed in the next chapter).
--------------------------------------------------------------------

XML Document Quality

There are two levels of document quality in XML: well-formed documents and valid documents.

An XML document is said to be a well-formed document when

  • There is one and only one root element.
  • All elements that are not empty are marked with start and end tags.
  • The order of the elements is hierarchical; that is, an element A that starts within an element B also ends within element B.
  • Attributes do not occur twice in one element.
  • All entities used have been declared.

An XML document is said to be a valid document when

  • The XML document is well-formed.
  • The XML document complies with a specified DTD document.

The concept of a valid document has been ported to XML from SGML. In SGML, all documents must be valid; in other words, they must comply with the rules defined in the DTD. XML is not so strict. It is possible to use an XML document even without a DTD document. If the user agent knows how to use the XML document without the DTD, then the DTD need not even be sent over the Internet. It just increases traffic and ties up bandwidth.

XML Schema and XML Schemas

The DTD is not the only type of document that can store rules for an XML document. At the current time, several companies (including Microsoft) have submitted a proposal to W3C for an alternative type of metadata document called the XML Schema. In fact, there are other proposed standards for the same use, which are all referred to as XML schemas. In May of 2001, W3C published its XML Schema Recommendation, which should gradually replace all other XML schemas. However, some of these schemas (such as the one defined by Microsoft) are already in use.

XML schemas are XML language for defining the business rules with which a class of XML documents (data) must comply in order to be valid.

These are the major differences between a DTD and an XML schema:

  • XML schemas support data types and range constraints.
  • XML schemas allow users to define new data types.
  • The language in which XML schemas are written is XML. Developers do not have to learn an additional language as they do with DTDs.
  • XML schemas support namespaces (XML entities for defining context).

Why are XML schemas important? A huge portion of application development resources is spent on checking whether data complies with (business) rules about structure and content. If you have a simple language to define the structure and content of data (that is, the business rules by which it is constrained) and you have a schema validator (a tool or program that can check compliance), you will be able to reduce development resource requirements significantly, and therefore reduce the cost to implement applications.

XML–Data Reduced (XDR) Schema

When SQL Server 2000 was released, the W3C was still working on its XML Schema specification—it was not even clear which variation would be adopted. Microsoft has implemented a variation of XML schema syntax called XML–Data Reduced (XDR) in the MSXML parser (Microsoft XML Core Services) that was delivered initially as a part of Internet Explorer 5, and later in SQL Server 2000.

Microsoft promised complete support for XML Schema when the W3C awarded it Recommended status, but before that could happen, more and more organizations started using XDR. It is also important to note that Microsoft uses XDR in BizTalk, one of the most significant initiatives in the Web Services market. It is an initiative intended to create e-commerce vocabularies for different vertical markets.

At the time of this writing, SQL Server 2000 is using XDR schemas for several features, and support for XML Schema is increasing with each SQL Server 2000 Web Release.

The following is an example of an XDR schema document:

<Schema name="Schema"
  xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes">
<ElementType name="Inventory" content="empty" model="closed">
      <AttributeType name="Inventoryid" dt:type="i4"/> 
      <AttributeType name="EquipmentId" dt:type="i4"/> 
      <AttributeType name="LocationId" dt:type="i4"/> 
      <AttributeType name="StatusId" dt:type="ui1"/>
      <AttributeType name="LeaseId" dt:type="i4"/>
      <AttributeType name="LeaseScheduleId" dt:type="i4"/>
      <AttributeType name="OwnerId" dt:type="i4"/>
      <AttributeType name="Rent" dt:type="fixed.14.4"/>
      <AttributeType name="Lease" dt:type="fixed.14.4"/> 
      <AttributeType name="Cost" dt:type="fixed.14.4"/>
      <AttributeType name="AcquisitionTypeID" dt:type="ui1"/>
      <attribute type="Inventoryid"/>
      <attribute type="EquipmentId"/>
      <attribute type="LocationId"/>
      <attribute type="StatusId"/>
      <attribute type="LeaseId"/>
      <attribute type="LeaseScheduleId"/>
      <attribute type="OwnerId"/>
      <attribute type="Rent"/>
      <attribute type="Lease"/>
      <attribute type="Cost"/>
      <attribute type="AcquisitionTypeID"/>
   </ElementType>
</Schema>

This XDR schema describes the structure of an XML document that contains Inventory information. The schema describes just one element—ElementType. The definition also specifies its name ("Inventory"), content (the tag is "empty" because all information will be carried in attributes), and content model ("closed"—indicating that it is not possible to add elements that are not specified in the schema).

The element contains several attributes. Each attribute is first defined in an AttributeType element and then instantiated in an attribute element:

  <AttributeType name="Cost" dt:type="fixed.14.4"/>
...
  <attribute type="Cost"/>

For each attribute, the schema defines a name and a data type. You can see a list of acceptable data types in the appendix at the end of this book.

The following listing shows an XML document that complies with the previous XDR schema:

<Inventory xmlns="x-schema:Schema.xml"
           Inventoryid="5"
           EquipmentId="1"
           LocationId="2"
           StatusId="1"
           LeaseId="1"
           LeaseScheduleId="1"
           OwnerId="1"
           Cost="1295.0000"
           AcquisitionTypeID="1"/>

Schema Constraints

This section reviews XDR schema attributes that can be used to declare elements and attributes. These can be classified as

  • Element constraints
  • Attribute constraints
  • XML data types
  • Group constraints
Element Constraints

Elements in an XDR schema can be constrained using attributes of the <ElementType> tag:

  • name
  • content
  • model
  • order
  • group
  • minOccurs
  • maxOccurs

The name attribute defines the name of the subelement.

Possible values for the content attribute are listed in the Table 13-1.

content

Meaning

"textOnly"

Only text is allowed as content

"eltOnly"

Only other elements are allowed as content

"empty"

No content

"mixed"

Both text and elements are allowed as content

    Table 13-1.  content Attribute Values

An important innovation in XDR schemas (that was not available in DTDs) is the capability to add nondeclared elements and attributes to an XML document. By default, every element of every XML document has its model attribute set to "open". To prevent the addition of nondeclared elements and attributes, the model attribute has to be set to "closed".

It is also possible to define how many times a subelement can appear in its parent element by using the maxOccurs and minOccurs attributes. Positive integer values and "*" (unlimited number) are allowed in the maxOccurs attribute, and "0" and positive integer values are allowed in the minOccurs attribute. The default value for minOccurs is "0". The default value for maxOccurs is "1", except that when the content attribute is "mixed", maxOccurs must be "*".

An order attribute specifies the order and quantity of subelements (see Table 13-2). The default value for order is "seq" when the content attribute is set to "eltOnly" and is "many" when the content attribute is set to "mixed".

Attribute Constraints

By their nature, attributes are more constrained than elements. For example, attributes do not have subelements (or subattributes), and it is not possible to have more than one instance of an attribute within the element.

The required attribute (constraint) in a schema specifies that the attribute is mandatory in XML documents that follow the schema. The default attribute

order

Meaning

"seq"

Subelements must appear in the order listed in the schema.

"one"

Only one of the subelements listed in the schema can appear in the XML document.

"many"

Any number of subelements can appear in any order.

    Table 13-2.  order Attribute Values of <ElementType>

(constraint) in a schema specifies the default value of the attribute in an XML document (the parser will use that value if an attribute is not present).

The schema can be set so that an attribute value is constrained to a set of predefined values:

<AttributeType name="status"
               dt:type="enumeration"
               dt:values="open in-process completed" />

XML Data Types

The schema can also enforce the data type of the attribute or element. Table A-2 in the appendix lists data types and their meanings, and Table A-3 in the appendix shows the mapping between XML data types and SQL Server data types.

Group Constraints

The group element allows an author to apply certain constraints to a group of subelements. In the following example, only one price (rent, lease, or cost) can be specified for the Inventory element:

<Schema name="Schema" xmlns="urn:schemas-microsoft-com:xml-data"
 xmlns:dt="urn:schemas-microsoft-com:datatypes"> <ElementType name="Inventory" content="eltOnly"
            
model="closed" order="many">
  <element type="Inventoryid"/>
  <element type="EquipmentId"/>
  <element type="LocationId"/>
  <element type="StatusId"/>
  <element type="LeaseId"/>
  <element type="LeaseScheduleId"/>
  <element type="OwnerId"/>
  <group order = "one">
    <element type="Rent"/>
    <element type="Lease"/>
    <element type="Cost"/>
 
</group>
  <element type="AcquisitionTypeID"/>
</ElementType>
</Schema>

The group constraint accepts order, minOccurs, and maxOccurs attributes.

XML Schema (XSD)

In May of 2001, XML Schema was given Recommended status by the W3C. Unfortunately, this stamp of approval happened after Microsoft had already released SQL Server 2000. However, in subsequent web releases of XML for SQL and SQLXML, and in releases of other products such as Visual Studio .NET, Microsoft has added support for XML Schema.

You can find the W3C XML Schema Recommendation specification, tools, and other resources at www.w3.org/XML/schema.html. I will try, however, to introduce the most important concepts.

The purpose of XML Schema is to define a class of XML documents. Each document of a specified class is an instance of that XML document class. The Equipment.xsd file contains an XML Schema document that defines instances of XML documents with Equipment information:

<?xml version="1.0" encoding="utf-8" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"  
            targetNamespace=
"http://www.trigonblue.com/Equipment.xsd"              
            xmlns=
"http://www.trigonblue.com/Equipment.xsd"   
            xmlns:mstns=
http://www.trigonblue.com/Equipment.xsd
            xmlns:msdata=
"urn:schemas-microsoft-com:xml-msdata" 
            elementFormDefault="qualified" 
            attributeFormDefault="qualified">
<xsd:element name="Document">
  <xsd:complexType>
    <xsd:choice maxOccurs="Unbounded">
      <xsd:element name="Equipment">
        <xsd:complexType>
          <xsd:sequence>
           
<xsd:element name="EquipmentId" 
                         msdata:ReadOnly="true"
                         msdata:AutoIncrement="true"
                         type="xsd:int" />
           
<xsd:element name="Make" type="xsd:string" />
            <xsd:element name="Model" type="xsd:string" />
            <xsd:element name="EqTypeId"type="xsd:short" />
            <xsd:element name="ModelSDX" type="xsd:string" />
            <xsd:element name="MakeSDX" type="xsd:string" />
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:choice>
</xsd:complexType>
      <xsd:unique name="DocumentKey1" msdata:PrimaryKey="true">  
        <xsd:selector xpath=".//mstns:Equipment" />
        <xsd:field xpath="mstns:EquipmentId" />
    </xsd:unique>
  </xsd:element>
</xsd:schema>

You may notice that all elements in all XML Schema documents have an xsd prefix. Therefore, they are often stored in .xsd files and referred to as XSD schemas.

An XSD schema defines the structure and the types of data that can be used in a valid XML document instance. The following XML document is a valid instance of the previous schema:

<?xml version="1.0" encoding="utf-8" ?>
<Document xmlns="http://www.trigonblue.com/Equipment.xsd" 
          xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance
          xsi:schemaLocation="Equipment.xsd">
 
<Equipment>
    <EquipmentId>478</EquipmentId>
    <Make>Compaq</Make>
    <Model>15 Cart. DLT Library Tabletop Conversion Kit </Model>
    <EqTypeId>1</EqTypeId>
 
</Equipment>
  <Equipment>
    <EquipmentId>394</EquipmentId>
    <Make>Compaq</Make>
    <Model>2KVA Prestige W/Ext full Bat</Model>  
    <EqTypeId>1</EqTypeId>
 
</Equipment>
 
<Equipment>
    <EquipmentId>347</EquipmentId>
    <Make>Compaq</Make>
    <Model>Deskpro EN CMT PIII 733 10GB 128MB 48xCD nVidia NT</Model>
    <EqTypeId>1</EqTypeId>
  </Equipment>
...
</Document>

When an instance of an XML document and an XSD schema are processed together in a schema validator, the program checks whether the instance complies with the business rules defined in the schema and reports the result to the caller.

xsd:schema Element

All XML Schema documents must contain the xsd:schema root element:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"  
            targetNamespace= http://www.trigonblue.com/Equipment.xsd
            xmlns=http://www.trigonblue.com/Equipment.xsd
            xmlns:mstns=
"http://www.trigonblue.com/Equipment.xsd

            xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
           
elementFormDefault="qualified
            attributeFormDefault="qualified">
..
.
</xsd:schema>

The xmlns:xsd attribute notifies a parser (or any other agent) that all elements with an xsd prefix should be processed as XSD schemas. You must be very careful to include a reference to the namespace using the URL indicated in the preceding listing. Otherwise, parsers and validators will not recognize it and will not be able to process the XSD schema properly. All components of an XSD schema (such as elements, types, sequence, schema) are defined in this namespace.

The targetNamespace attribute specifies the namespace of a target XML document instance. The XML document instance must have a matching namespace declaration.

The xmlns attribute defines the default name of the namespace in the XML document instance. When you set elementFormDefault and attributeFormDefault to “qualified,” all elements and attributes defined in the XSD schema (not just global elements and attributes) will belong to the target namespace in the XML document instance, and they must be namespace qualified (that is, they must contain the appropriate prefix).

Structure Declarations and Definitions

The primary tasks of the developer (or program) writing an XSD schema are to

  • Declare the components of an XML document instance (elements and attributes)
  • Define the components that are used inside the XSD schema (such as simple and complex types and attribute and model groups)
Element and Attribute Declaration

xsd:element and xsd:attribute are used to declare elements and attributes in an XML document instance. In their simplest forms, elements and attributes can be defined by name and type (data type). In the following case, the LocationId attribute is defined as int and the Location element is defined as string:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
 
<xsd:element   name="Location"   type="xsd:string"/>
 
<xsd:attribute name="LocationId" type="xsd:int"/> </xsd:schema>

Attributes can be declared based only on simple types, while elements can be declared based on both simple and complex types (I will define types in the next two sections).

Attributes and elements can be defined either globally (just below xsd:schema), as in the previous example, or inside other elements and complex types.

Simple Type Declarations

Types are XSD schema equivalents of data types. Simple types are XSD components that cannot contain xsd:elements and xsd:attributes. Some simple types like int, datetime, string, ID, IDREF, language, and gYear (Gregorian year) are defined along with the XML Schema in the xsd namespace, while others can be derived from them in the XSD schema (like user-defined data types in TSQL).

A new simple type is derived in an xsd:simpleType element:

<xsd:simpleType name="ProdYear">
 
<xsd:restriction base="xsd:gYear">
   
<xsd:minInclusive value="1990"/>
   
<xsd:maxInclusive value="2010"/>
 
</xsd:restriction>
</xsd:simpleType>

A new simple type is defined by a name (attribute) and a set of facets (elements) inside the xsd:restriction element. In this case, the minInclusive and maxInclusive facets define the range of acceptable years.

The enumeration facet can be used to define a lookup list (the list of acceptable values):

<xsd:simpleType name="CanProvince">
  <xsd:restriction base="xsd:string">
   
<xsd:enumeration value="ON"/>
   
<xsd:enumeration value="BC"/>
    <xsd:enumeration value="MA"/>
    <xsd:enumeration value="NB"/>
    ..
.
  </xsd:restriction>
</xsd:simpleType>

The pattern facet uses regular expressions to define the format of values in the element:

<xsd:simpleType name="GUID">
  <xsd:restriction base="xsd:string">
   
<xsd:pattern value="[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}
-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"/>
  </xsd:restriction>
</xsd:simpleType>

In this case, the GUID type is defined as a set of 8, 4, 4, 4, and 12 hexadecimal digits divided by dashes (-).

The W3C Recommendation defines the following facets:

  • length
  • minLength
  • maxLength
  • pattern
  • enumeration
  • whiteSpace
  • maxInclusive
  • minInclusive
  • maxExclusive
  • minExclusive
  • totalDigits
  • fractionDigits

Naturally, you cannot use every facet with every type. Only some combinations make sense.

Complex Types

Complex types are XSD schema elements that can be defined to contain additional attributes and elements. In the following example, the EqType complex type definition consists of two element declarations:

<xsd:complexType name="EqType">
 
<xsd:sequence>
    <xsd:element name="EqTypeId"   type="xsd:integer"/>
   
<xsd:element name="EqTypeName" type="xsd:string"/>
 
</xsd:sequence>
</xsd:complexType>

The meaning of the xsd:sequence element is that both elements must be present in the sequence (order) in which they are declared.

Complex types can be built from other simple and complex types:

<xsd:complexType name="Equpment">
  <xsd:sequence>
    <xsd:element name="EqId"  type="xsd:integer" minOccurs="1"/>
    <xsd:element name="Make"  type="string" minOccurs="1"/>
    <xsd:element name="Model" type="string" minOccurs="1"/>
    <xsd:element name="eqType" type="EqType" minOccurs="1"/>
    <xsd:element  ref="comment" minOccurs="0"/>
  </xsd:sequence>
  <xsd:attribute name="guid"    type="GUID"/>
</xsd:complexType>

The first three elements are defined as simple types, but the eqType element is defined as an instance of the EqType complex type defined earlier.

The first four elements are defined inline—the definition of each element also contains its type. There is an alternative means of defining elements: you can name an element in one place and reference it in another. For example, the comment element was named somewhere else in the schema and is only referenced here.

NOTE


The advantage of an inline definition is that it is more compact—the definition and the instance of an element are in the same place. The advantage of a named definition is that it can be referenced in many places.

Groups

An xsd:group element allows you to define a set of xsd:elements that will later be referenced together. In the following example, the ExtendedPrice element has been defined as the group of Price, Currency, and Quantity elements:

<xsd:element name="OrderItem">
    <xsd:complexType>
...
        <xsd:group ref="ExtendedPrice"/>
...
   
</xsd:complexType>
</xsd:element>
...
<xsd:group name="ExtendedPrice">
       
<xsd:sequence>
            <xsd:element name="Price" type="xsd:decimal"/>
            <xsd:element name="Currency" type="xsd:string"/>
            <xsd:element name="Quantity" type="xsd:decimal"/>
       
</xsd:sequence>
</xsd:group>

An attributeGroup element also allows you to define a group of attributes that can later be referenced together:

<xsd:element name="Equipment" maxOccurs="unbounded"> 
  <xsd:complexType>
...
      <xsd:attributeGroup ref="EquipmentProp"/>
...
  </xsd:complexType>
</xsd:element>
...
<xsd:attributeGroup name="EquipmentProp">
     
<xsd:attribute name="Make"    type="xsd:string"/> 
      <xsd:attribute name="Model"   type="xsd:string"/>
      <xsd:attribute name="EqType"  use="required">
        
<xsd:simpleType>
           
<xsd:restriction base="xsd:string"> 
               <xsd:enumeration value="Monitor"/>
               <xsd:enumeration value="Desktop"/>
               <xsd:enumeration value="Keyboard"/>
           
</xsd:restriction>
         </xsd:simpleType>
      </xsd:attribute>
</xsd:attributeGroup>

Annotating Schemas

To add a comment to the schema, you can use the xsd:annotation element. It can contain two subelements. The xsd:documentation element is used to mark comments written for people, while the xsd:appinfo element is used to mark information for programs (such as style sheets and SQLXML). The content of the appinfo element should be well formed XML (so that target applications can parse them).

NOTE


It’s not that programs need a valid comment from the developer. The appinfo annotations simply do not have any meaning for the schema validator, while they might be very important instructions for target programs.

<xsd:annotation>
    <xsd:documentation xml:lang="en-US">
 
This element should be linked with Location.LocId. 
    </xsd:documentation>
    <xsd:appinfo>
         <TbSql proc="Location.LocId">Link</TbSql> 
    </xsd:/appinfo>
<xsd:/annotation>

You should also use the xml:lang attribute inside the xsd:documentation element to indicate the language in which your comment is written.

Annotations can be placed only:

  • Before and after any global component (such as schema, simpleType, and attribute)
  • At the beginning of nonglobal components

XSD Schema Tools

The XML development community, including Microsoft, has developed many useful tools for development and management of XSD schemas. You can find a comprehensive list with links at www.w3.org/XML/Schema#Tools. I will now demonstrate use of three tools developed by Microsoft.

XSD Designer Visual Studio .NET supports the use of XSD schemas primarily to process ADO.NET data sets. It contains the XSD Designer—a graphical tool that allows you to drag and drop relation tables to link them and then build XSD schema out of them. To use it:

  1. Open a new Visual Studio .NET project.

  2. Select File | Add Item from the menu.

  3. Select XSD Schema. The program will open the XSD Schema Designer with the Schema pane active.

  4. Open Server Explorer.

  5. Expand the server node until you reach the tables in the Asset database.

  6. Drag the Equipment table onto the XSD Schema Designer (see Figure 13-2).

  7. You can switch to the XML pane to see the code of the schema.


                Figure 13-2.  The XSD Schema Designer

The XSD Schema Designer generates schemas that have some additional elements and attributes that are needed to validate ADO.NET datasets:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" 
          elementFormDefault="qualified" 
          xmlns="http://tempuri.org/XMLSchema.xsd"
          xmlns:mstns="http://tempuri.org/XMLSchema.xsd"
          xmlns:xs=http://www.w3.org/2001/XMLSchema
          xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
  <xs:element name="Document">
    <xs:complexType>
      <xs:choice maxOccurs="Unbounded">
        <xs:element name="Equipment">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="EquipmentId"
                          
msdata:ReadOnly="true
                          msdata:AutoIncrement="true"
                          
type="xs:int" />
              <xs:element name="Make" type="xs:string" 
                          minOccurs="0" />
             
<xs:element name="Model" type="xs:string" 
                          minOccurs="0" />
              <xs:element name="EqTypeId" type="xs:short"
                          minOccurs="0" />
              <xs:element name="ModelSDX" type="xs:string"
                          minOccurs="0" />
              <xs:element name="MakeSDX" type="xs:string" 
                          minOccurs="0" />
           </xs:sequence>
        </xs:complexType>
     </xs:element>
 
</xs:choice>
</xs:complexType>
<xs:unique name="DocumentKey1"
msdata:PrimaryKey="true">
      <xs:selector xpath=".//mstns:Equipment" />
      <:field xpath="mstns:EquipmentId" />
    </xs:unique>
  </xs:element>
</xs:schema>

The XSD is generated with temporary URI (tempuri.org) namespace references. You can replace them with your own namespaces:

<?xml version="1.0" encoding="utf-8" ?>
<xs:schema targetNamespace=http://www.trigonblue.com/Equipment.xsd
         elementFormDefault="qualified"
         xmlns="http://www.trigonblue.com/XMLSchema.xsd"  
         xmns:mstns=
http://www.trigonblue.com/Equipment.xsd
         xmlns:xs="http://www.w3.org/2001/XMLSchema
         xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">

XSD by Example: Microsoft XSD Inference  Microsoft XSD Inference is a web-based utility that you can use to create an XSD schema from an XML instance document (see Figure 13-3). You can think of it as “XSD by example.” When you select a well-formed XML file, the utility generates an XSD schema that can be used to validate it. You can continue refining the XSD schema by selecting more XML files. At the end, you might also need to edit it manually to implement additional components such as facets (restrictions) and annotations.

XSD Schema Validator  Another tool that you might find useful when working with XSD schemas is the XSD Schema Validator, a web application (see Figure 13-4) that
 
                         Figure 13-3.  XSD by example

 
Figure 13-4.  The XSD Schema Validator

you can find at http://apps.gotdotnet.com/xmltools/xsdvalidator/. If you load your schema and your XML document instance, the XSD Schema Validator will report whether or not the instance complies with rules specified in the schema.

NOTE


You cannot validate an instance if you simply put the instance and schema in a folder on an IIS server and open it using Internet Explorer. Unfortunately, the parser in IE is not set as a validating parser and will simply display the content of the document instance.


Linking and Querying in XML

XML today represents more than a simple language for encoding documents. W3C is working on a whole other set of specifications for using information in XML documents. Specifications such as XLink, XPointer, XPath, and XQL allow querying, linking, and access to specific parts of an XML document.

This is a vast topic, and I will briefly review only XPointer and XPath, since they are used in SQL Server 2000.

XPointer

The XPointer reference works in a fashion very similar to the HTML hyperlink. You can point to a segment of an XML document by appending an XML fragment identifier to the URI of the XML document. A fragment identifier is often enclosed in xpointer(). For example, the following pointer directs the parser to an element with the ID attribute set to "Toshiba" in the document at a specified location:

http://www.trigonblue.com/xml/Equipment.xml#xpointer(Toshiba)

The character # is a fragment specifier. It serves as a delimiter between the URI and the fragment identifier, and it specifies the way that the XML parser will render the target. In the preceding case, the parser renders the whole document to access only a specified fragment. To force the parser to parse only the specified fragment, you should use | as a fragment specifier:

http://www.trigonblue.com/xml/Equipment.xml|xpointer(Toshiba)

Use of the | fragment specifier is recommended because it leads to reduced memory usage.

xpointer() is not always required. If a document has a schema that specifies the ID attribute of an element, you can omit the xpointer() and point to a fragment of the document using only the ID attribute value:

http://www.trigonblue.com/xml/Equipment.xml#Toshiba

Child sequence fragment identifiers use numbers to specify a fragment:

http://www.trigonblue.com/xml/Equipment.xml#/2/1/3

The preceding example should be interpreted as follows: /—start from the top element of the document; 2—then go to the second child element of the top element; 1—then go to the first subelement of that element; 3—then go to the third subelement of that element.

Child sequence fragment identifiers do not have to start from the top element:

http://www.trigonblue.com/xml/Equipment.xml#Toshiba/1/3

In this example, fragment identification starts from the element with its ID set to "Toshiba". The parser then finds its first subelement and points to its third subelement.

XPath

The full XPointer syntax is built on the W3C XPath recommendation. XPath was originally built to be used by XPointer and XSLT (a language for transforming XML documents into other XML documents), but it has found application in other standards and technologies. You will see in the next chapter how it is used by OpenXML() in SQL Server 2000, but first you need to examine its syntax.

Location steps are constructs used to select nodes in an XML document. They have the following syntax:

axis::node_test[predicate]

The location step points to the location of other nodes from the position of the current node. If a current node is not specified in any way, the location step is based on the root element.

Axes break up the XML document in relation to the current node. You can think of them as a first filter that you apply to an XML document to point to target nodes. Possible axes are listed in Table 13-3.

Axes

Description

parent

The parent of the current node.

ancestor

All ancestors (parent, grandparent, and so on) to the root of the current node.

child

All children of the current node (first generation).

descendant

All descendants (children, grandchildren, and so forth) of the current node.

self

The current node only.

descendant-or-self

All descendant nodes and the current node.

ancestor-or-self

All ancestor nodes and the current node.

attribute

All attributes of the current node.

namespace

All namespace nodes of the current node.

following

All nodes after the current node in the XML document. The set does not include attribute nodes, namespace nodes, or ancestors of the context node.

preceding

All nodes before the current node in the XML document. The set does not include attribute nodes, namespace nodes, or ancestors of the current node.

following-sibling

All siblings (children of the same parent) after the current node in the XML document.

preceding-sibling

All siblings (children of the same parent) before the current node in the XML document.

     Table 13-3.  Axes in XPath

The node test is a second filter that you can apply on nodes specified by axes. Table 13-4 lists all node tests that can be applied.

A predicate is a filter in the form of a Boolean expression that evaluates each node in the set obtained after applying axes and node test filters. Developers have a rich set of functions (string, node set, Boolean, and number), comparative operators (=,!=, <=,>= <,>), Boolean operators (And, Or), and operators (+, –, *, div, mod). The list is very long (especially the list of functions), and I will not go into detail here. I will just mention the most common function, position(). It returns the position of the node.

Let’s now review how all segments of the location step function together:

child::Equipment[position()<=10]

This location set first points to child nodes of the current node (root if none is selected). Of all child nodes, only elements named Equipment are left in the set. Finally, each of those nodes is evaluated by position and only the first 10 are specified.

Very often, you will try to navigate from node to node through the XML document. You can attach location sets using the forward slash (/). The same character is often used at the beginning of the expression to establish the current node.

In the following example, the parser is pointed to the Inventory.xml file, then to its root element, and then to the first child called Equipment, and finally to the first Model node among its children:

Inventory.xml#/child::Equipment[position() = 1]/child:: Model[position() = 1]

It all works in a very similar fashion to the notation of files and folders, and naturally you can write them all together:

http://www.trigonblue.com/xml/Inventory.xml#/child:: Equipment[position() = 1]/child::Model[position() = 1]

Node Test

Description

element name

Selects just node(s) with specified name in the set specified by axes.

*

or node()

All nodes in the set specified by axes.

comment()

All comment elements in the set specified by axes.

text()

All text elements in the set specified by axes.

processing-instruction ()

All processing instruction elements in the set specified by axes (if the name is specified in brackets, the parser will match only processing instructions with the specified name).

 

 

    Table 13-4.  Node Tests in XPath 

XPath constructs are very flexible, but also very complex and laborious to write. To reduce the effort, a number of abbreviations are defined. position() = X can be replaced by X (it is enough to type just the number). Thus, an earlier example can be written as

Inventory.xml#/child::Equipment[1]/child::Model[1]

If an axis is not defined, the parser assumes that the child axis was specified. Thus, the preceding example could be written as

Inventory.xml#/Equipment[1]/Model[1]

The attribute:: axis can be abbreviated as @. Therefore, the following two expressions are equivalent:

Inventory.xml#/child::Equipment[1]/attribute::EquipmentId Inventory.xml#/child::Equipment[1]/@EquipmentId

The current node can be specified using either self::node() or a dot (.). The following two expressions are equivalent:

Order.xml#/self::node()/OrderDate
Order.xml#/./OrderDate

A parent node can be specified either by parent::node() or two dots (..). The following two expressions are equivalent:

parent::node()/Order
../Order

/descendant-or-self::node() selects the current node and all descendant nodes. It can be abbreviated with //. The following two examples select all EquipmentId attributes in the document:

Inventory.xml#/descendant-or-self::node()/@EquipmentId
Inventory.xml#//@EquipmentId

--------------------------------------------------------------------Transforming XML

In many cases in business, information that is already in the form of an XML document needs to be converted to another XML structure. For example, a client of mine is participating in RossetaNet, an e-commerce consortium of IT supply chain organizations that defines standard messages to be sent between partners. Although messages are standardized, each pair of partners can agree to modify their messages slightly to better serve their needs. Such changes are mostly structural—new nodes (fields) can be defined, standard ones can be dropped, a node can change its type from element to attribute, and so on. Instead of generating completely different messages each time (and developing two separate procedures for performing similar tasks), it is preferable to create a simple procedure that will transform a standard XML message into another form.

Another typical situation occurs when an application uses a browser to display an XML document. Although modern browsers such as the latest versions of Internet Explorer are able to display the content of an XML document in the form of a hierarchical tree, this format is not user-friendly. More often, the XML document is transformed into an HTML document and information is organized visually into tables and frames. Such HTML applications usually allow the end user to modify the displayed information interactively (for example, to sort the content of the tables, to display different information in linked tables, or to present data in different formats). Each of these tasks could be performed by modifying the original XML document.

A typical problem with HTML browsers from different vendors is that they are not compatible. Naturally (well, actually, it seems quite unnatural), even different versions of the same browser behave differently. Each of them uses a different variation of the HTML standard. However, these differences are not major, and instead of generating a separate XML document for each of them, you can create a procedure to transform the XML document so that it fits the requirements of the browser currently in use.

You can think of XML as just one type of rendering language. Some systems use other types of rendering languages and appropriate browsers. For example, more and more PDAs and wireless devices such as cellular phones are offering Internet access. They often use a special protocol (Wireless Application Protocol, or WAP) that has its own markup language (Wireless Markup Language—WML) based on XML. A web server offering information should be able to transform the XML document to fulfill the needs of different viewers.

XSL

The eXtensible Stylesheet Language (XSL) addresses the need to transform XML documents from one XML form to another and to transform XML documents to other formats such as HTML and WML. It is based on Cascading Style Sheets (CSS), a language for styling HTML documents. Over time, XSL has been transformed into three other languages:

  • XSLT for transforming XML documents
  • XSLF for rendering
  • XPath for accessing a specific part of an XML document
XSLT

XSLT is a (new) language for transforming XML documents. W3C gave it Recommended status in November 1999. XSLT style sheet files are also well-formed XML documents. These files are processed by XSLT processors. Such a processor can be a separate tool or part of an XML parser (as in the case of MSXML).

At this point, I will not go into detail about XSL and XSLT syntax. Such topics are really beyond the scope of this book. Refer to www.w3.org/Style/XSL/ and www.w3.org/TR/xslt for more information on this topic. However, I will cover the use of XSLT in SQL Server 2000 later in “Using XSL,” in Chapter 14.

--------------------------------------------------------------------Why XML?

I have described XML, which is all well and fine, but of course the questions arise: why do you need XML, and what can you do with it? Two major areas of application are

  • Exchange of information between organizations
  • Information publishing
Exchange of Information Between Organizations

XML provides platform-independent data transport for a variety of types of information, from simple messages (commands, information requests) to the most complex business documents. Its extensible nature—the ease with which you can add new nodes or branches, create multiple instances of the same element, and use open schemas to add elements as necessary (provided they comply with schema rules)—makes XML an ideal development language for the rapidly evolving “dotcom” economy. You can use XML to implement solutions that can grow and evolve with an organization and be relatively certain that your solution will not end up on next year’s scrap heap and that the organization will not have to replace it at an enormous cost as the needs of the organization grow and change.

It is no wonder that Microsoft has incorporated support for XML in its new releases of applications such as SQL Server, Exchange, Visual Studio, and Internet Explorer. This support allows Microsoft to remain the major player in operating systems and network solutions even as businesses organize themselves into trading communities and industry associations defined by their ability to exchange information seamlessly and securely via the Internet.

EDI: a Cautionary Tale

XML is finding extensive application within the B2B (business-to-business) and B2C (business-to-consumer) arenas, to name but two of this young century’s most ubiquitous buzzwords. XML’s success in this emerging marketplace is largely due to its platform independence, which translates directly to the bottom line in terms of low implementation costs. Trading partners require only Internet access and a web browser to conduct secure business transactions over the Internet.

One of the buzzwords of the early 1990s was EDI (Electronic Data Interchange). EDI is still around, but it has never fulfilled its promise to make the exchange of paper documents between businesses obsolete. It was the cost of implementation that prevented EDI from fulfilling this promise. The problem that EDI encountered is a variation on the “Tower of Babel” theme: the proliferation of languages and protocols ensured that each implementation would be unique, and therefore costly.

Classic EDI follows a hub-and-spoke model: a large company (the “hub”) that must manage business relationships with a large number of suppliers (the “spokes”) decrees that the spoke organizations must implement EDI or lose their trading-partner status. The spoke organizations have to bear the considerable cost of implementation or lose a considerable portion of their business income.

A company that is forced to implement EDI by virtue of a trading relationship with a hub company receives an “implementation guide” that describes the EDI standard with which it must comply. One EDI veteran described the difference between classic EDI and XML-based e-commerce succinctly: with EDI, your postal carrier delivers an implementation guide printed on paper; with XML-based e-commerce, the implementation guide is attached to the electronic business document/transaction in the form of a DTD or XML schema.

This comparison is a gross oversimplification of the relationship between these two technologies, but it does highlight one reason that XML-based e-commerce has succeeded with small- to medium-sized businesses where EDI could not, and that is its relatively low cost of implementation.

The other reason for this success is that XML-based e-commerce leverages Internet-based communications. The dial-up Value Added Networks (VANs) of the EDI world are more or less glorified (and generally expensive) electronic mailboxes to which you post business documents and from which you download business documents from your trading partners. The XML revolution has spawned Internet-based, third-party Application Service Providers (ASPs) and “Infomediaries” to take the place of the VANs and use XML to conduct business transactions between diverse trading partners.

Of course, these ASPs and Infomediaries are in the business of developing data-based applications for the Web, so you can begin to see why it is so important that SQL Server be XML-ready. The new XML features in SQL Server 2000, along with SQL Server’s ease of use, make it a leader in this emerging market.

Information Publishing

Just as trading partners can use XML to exchange business documents, organizations and individuals can use XML to develop data-based applications that publish information. The only real difference between business document exchange and information publishing is that the information itself becomes the commodity.

Using XML to publish information located in a SQL Server database combines the easy access of the Internet with the power and data integrity of a mature RDBMS. Browser-based applications allow users to retrieve data dynamically from diverse databases.

blog comments powered by Disqus
MS SQL SERVER ARTICLES

- Windows Azure Media Services Launched by Mic...
- Windows Server 8 Cloud Backup Beta Released
- Idera Announces SQL Compliance Manager 3.6
- Idera SQL Doctor 3.0 and MS SQL Changes
- Microsoft Cuts Windows Azure Compute and Sto...
- Express5800 to Mesh with SQL Server 2012
- Microsoft Azure Outage
- Windows Azure Server Supported by RealCloud ...
- Idera Releases SQL Diagnostic Manager v7.1
- MS SQL Sever 2012 Launch, New Idera Release
- OpenText Azure Cloud Solution, Geminaire Raa...
- Melissa Data Releases MatchUp Tool for SQL S...
- Glovia`s G2 ERP Solution to Support SQL Serv...
- Upgrade Assistant for SQL Server 2012 Releas...
- Azure Update Features Several New Improvemen...

ASP Web Hosting ASP.Net Web Hosting Windows Web Hosting
ASP Free Forums 
 RSS  Tutorials RSS
 RSS  Forums RSS
 RSS  All Feeds
Site Map 
Request Media Kit
Write For Us Get Paid 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
Privacy Policy 
Support 


© 2003-2012 by Developer Shed. All rights reserved. DS Cluster 6 - Follow our Sitemap
Most Popular Topics
All ASP.Net Tutorials