This tutorial explores the IIS XML support for SQL 2000 Server. By configuring a virtual root on Microsoft Internet Information Services, data from a SQL 2000 server can be accessed using XML over HTTP, the details of which will be examined in another article. This is possible because of the native support for XML in SQL 2000 Server.
SQL 2000 Server provides T-SQL language extensions to operate bi-directionally with relational and XML sources (see the first picture in this article). It also provides two system stored procedures, sp_XML_preparedocument and sp_XML_removedocument, that assist the XML to Relational transformation. SQL 2000 server's XML access capabilties are further enhanced with SQLXML 3.0, which allows XML formatting both on the client and the server. Prior to SQLXML 3.0 XML, formatting was on the server.
SQL Server is limited to mainly structured hierarchical and relational data, limited by absence of document storge facilities, and limited query capabilites (refer to BOL for restrictions in using For XML, for example) among others. SQL 2005 has many nice features, such as a brand new datatype called XML, available in both Typed and untyped varieties; the possibility of mixing relational and XML data; Querying XML and relational data; user defined functions with XML; and many others. Make time to attend this excellent Web seminar (author: Shanker Pal, Program Manager, Microsoft) to hear about it from the horse's mouth (http://www.microsoft.com/seminar/events/webcasts/default.mspx). The following aspects of XML support for SQL 2000 will be discussed in this article:
T-SQL support for for XML
Retrieving Data in XML formats, for XML
From XML document to Table structure, open XML
Overview of XML support in IIS for SQL 2000 Server
Configuring the IIS Virtual Directory Management for SQL 2000 Server
This somewhat simplifed picture shows how SQL 2000 supports XML by way of some new constructs for retrieving data from a relational database into various formats of XML, and for inserting data from an XML source into relational data. The two way processability of data either from XML to relational(write) and from relational to XML(read) adds a great deal of flexibility in handling hierarchical and relational data.
For XML
SQL 2000 Server has enhanced the Select Statement to retrieve relational data from the tables in XML format. In returning the single column of formatted data in XML, the server runs the query and then XML formats the results, depending on the particular clause, as in the following extensions:
Select From
[where]
[order by]
[for xml (raw|auto[,elements]|
explicit)[,xmldata]
[,binary base64])]
For XML Raw returns rows in a generic row identifier <row>. The intersections for that row with the columns are returned as attributes in name/value pairs. This data may need further processing to map attributes to elements.
For XML raw can have further qualifiers, binary base64 and XMLData. The for xml raw, xmldata returns a schema as well. The next query is run against the pubs database in SQL 2000 server. This may be copied into the Query Analyzer and executed.
The following picture shows the results returned (it starts with the schema first, then the rowset). As the result is stuffed into a column, in the query analyzer's result pane it may look garbled, but it can be copied and pasted into notepad. The results also lack a <root>element, and therefore it is not a well formed XML document, but an XML fragment. In order to view the result in a browser, the returned result has to be enclosed inside <root></root>. The result, as seen in the browser after this, is shown in the next picture (the latter part is truncated to save space).
select pub_id, price, type,ytd_sales,titleauthor.title_id,au_ord
from titles,titleauthor
where titleauthor.title_id=titles.title_id
and price>15
order by pub_id
for xml raw,xmldata
For XML auto returns rows in a more flexible format with reference to tag names. It is essentially element-centric. The row identifier now becomes table name. However, using the qualifier Elements, the For xml auto, elements returns rows qualified by the table name tags, and the columns in each row qualifed by the Column name tag. It is no longer defined by attributes, but rather by elements. The result of running the same query as before but with [for xml raw, xmldata] replaced by [for xml auto] produces the following results as seen in a browser.
For XML Explicit is the mode to use if you require a more flexible way of producing the results of the query. But this requires a more complex query than the other two modes to be run against the database. The syntax required by the for xml explicit mode is demanding because the query has to satisfy the structure of a Universal Table (the row sets mirroring the nesting feature of hierachical data). A very simple example of running a query in the for xml explict mode shows the Universal Table format and the relational parent; child tables are considered here for purposes of illustration.
The tables used in this example are shown in this picture. The two tables have the parent, child relationship and the for XML explicit query will retrieve the data to conform with the Universal Table format. This query is far more complex than the other modes discussed.
The query used to retrieve data that conforms with the Universal Table format is shown here. In order to use the explicit mode, the requirement is the Universal Table format, and since the output is in XML, it has to produce a "well formed document." The first query in this UNION returns the NumId, and this is formatted as Number (third line in the first query, see also output). In the second query NumID is again referenced, because it is required for the join with the Color table.
The column that is produced is the colr column. This is formated as COLOR (see lines 3 and 4 in the first query, and in the order by clause. Also note the output). Running the query by commenting out the for xml explicit clause gives a better insight into this UNION query. It goes without saying that all conditions of forming a UNION must be strictly followed (number of columns, column order, data type, and so forth). This can be extended to deeper nesting orders, although a lot of trial and error may be needed. SQL 2005 removes all these complications by the various enhancements.
select 1 as tag,
null as parent,
nums.NumId as [Number!1!NumId],
null as [COLOR!2!colr]
from nums
union all
select 2,
1,
nums.NumId,
Color.colr
from nums, color
where nums.NumId=color.numid
order by [Number!1!NumId], [COLOR!2!colr]
for xml explicit
The result of this query is as shown here:
Open XML
Going from XML to a relational table takes a couple of steps. However, it is a powerful way of carrying out INSERT, DELETE, UPDATE actions by means of carefully constructed XML documents.
Step 1. Prepare a well formed XML document either in the element-centric representation, or in the attribute-centric representation (the code shown here is attribute-centric representation). @idoc is an integer handle, and @doc is a variable that holds the document specified by the set statement. SQL errors 7000 to 7009 describe errors that may be reported during OPENXML operations.
/* T-SQL batch to create a sample XML Document*/
declare @idoc int
declare @doc varchar(1000)
set @doc=
'
<root>
<student studentID="100"
fname="Sam"
lname="Johnson"
ssn="122-23-3456"
phone="(800)-123-2345"/>
<student studentID="101"
fname="Mona"
lname="Lewis"
ssn="532-23-4556"
phone="(789)-321-2345"/>
</root>'
Step 2. Prepare an internal representation of this document by using the stored procedure sp_xml_preparedocument. This takes an input parameter (@doc) and produces an output parameter (@idoc), an internal handle to the document. This parses the document provided using the MSXML parser and keeps it ready to use. This is available as long as connection is not broken to the server, but remember, it uses memory.
[This is the syntax]
sp_xml_preparedocument hdoc OUTPUT
[, xmltext]
[, xpath_namespaces]
Execute sp_xml_preparedocument @idoc OUTPUT, @doc
Step 3. Execute the select statement with the OPENXML rowset provider. The syntax of the OPENXML is as follows:
[This is the syntax]
OpenXML (idoc int [in],rowpattern nvarchar[in],[flags byte[in]])
[WITH (SchemaDeclaration | TableName)]
*Here
flags values 0,1 are for attribute-centric, and 2 for
element-centric documents.
Step 4. Make sure you remove the internal representation of the stored procedure to free memory.
Exec sp_xml_removedocument @idoc
Putting them all together, the following query is produced:
/*Step 1: T-SQL batch to create a sample XML Document*/ declare @idoc int declare @doc varchar(1000) set @doc= ' <root> <student studentID="100" fname="Sam" lname="Johnson" ssn="122-23-3456" phone="(800)-123-2345"/> <student studentID="101" fname="Mona" lname="Lewis" ssn="532-23-4556" phone="(789)-321-2345"/> </root>'
/*Step 2: Prepare an internal representation of this document*/ Exec sp_xml_preparedocument @idoc output, @doc /* Step 3: Execute a Select statement that uses OPENXML rowset provider look under common objects/rowset in the query analyzer objects */ select * from OPENXML(@idoc, '/root/student',1) with (fname varchar(20), lname varchar(20), ssn varchar(20), phone varchar(20) )
Run this query and you should see:
If the with statement is deleted (no schema declared) from the above Select statement, the results are returned in what is called an Edge Table format with all the columns as shown in the next picture.
Ascii-based XML's self-describing document structure is well suited for data exchange over the Internet, and makes it the standard language for the B2B business sector. XML is already the de facto lingua franca of the Internet. The following picture shows the various hardware/software items involved in a succesful exchange of data over the Internet/intranet.
As seen in the earlier section, support for bidirectional (xml<->Relational) manipulation of data makes it possible to send XML queries through an URL. Besides the URL approach, it is also possible to access data using templates and xpath queries, all of which is facilitated by the SQL 2000 server specific Virtual directory. Also, XML data in XML documents can update the relational database using update datagrams. As XML is a dominant feature of .NET framework, it makes it eminently suitable for databased application development. XML data access is possible both with ADO.NET and ADO, thus ensuring support for both technologies and providing a smooth migration path.
Before one can access data stored on an SQL Server through HTTP requests to the IIS, a SQL Server specific virtual root (also known as the virtual directory) needs to be set up. A detailed, step-by-step procedure is outlined in the next section.
By installing SQL 2000 Server, XML support by IIS is not automatic. You need to use the IIS Virtual Directory Management for SQL Server utility to enable this support. This utility is available as a shortcut when SQL 2000 Server is installed as shown in this picture.The IIS virtual directory defines a connection between IIS and an instance of SQL Server 2000. The virtual directory is displayed as an MMC [Microsoft Management Console] snap-in. However, when SQLXML3.0 is added, an enhanced snap-in is made available. In this tutorial, only the out of the box SQL 2000 Server's version of the virtual directory configuraion is described.
Double clicking the Configure SQL XML Support in IIS shortcut brings up the IIS Virtual Directory Management for SQL server in the MMC console, as shown for this IIS Web server called Nechost. Nechost is the name of the IIS on this machine. By clicking the + symbol, you can expand this node to display the default website as shown in the picture that follows the next.
In the following steps the above default virtual site will be configured to provide XML support. Right click this default web site and choose to create a New -> virtual directory by choosing the drop-down option as shown in this picture.
There are several tabs on this New Virtual Directory Properties window. The General tab is the one that sets up the name of the virtual directory and also the full path to the actual directory that contains the files of the virtual directory. In the next picture both these folders have the same name XMLEnabled
The next tab is the one that configures the Security. This is the authentication needed for the SQL XML ISAPI DLL. The SQLVDIR object is configured by a wizard which steps thru this process. There are three choices for the authentication:
Anonymous Access
Enabling the radio button Always Log on as sets up the Anonymous user access. When this is enabled, the user gets access to the resource unless it is denied by the NTFS permissions. In this case, IIS provides stored credentials to the Windows OS using a special user account IUSR_ machine name and, by default, IIS controls the password.
In the present case the annonymous user is IUSR_Nechost. This should automatically come up when the radio button is enabled and the Windows credentials radio button is enabled. enable Windows account synchronization will also automatically get checked. This allows the IIS to synchronize the annonymous password (password field is grey).
The other option with Always log on as with the radio button SQL Server will look for the built-in internet guest account, IUSR_Machinename. In this case it is only necessary to provide the anonymous user name.This authentication (Annonymous log on by either mode) is optimum for Internet applications and, provides the best performance with no overheads.
Basic Authentication
When this mode is selected, the other two options will be greyed out. The authentication is now sought on the SQL Server account database. This has implications when you go to the next tab Data Source. The databases that can be accessed are limited to the permissions given to the user noted in basic authentication. Since the password is transmitted by HTTP as Base64 encoded data (basically clear text) this mode is insecure. It is mostly for intranet applications. This can be made more secure using SSL.
Integrated Authentication
Users in the domain, or trusted domains are allowed access. This is the best method for intranet users. The Windows users must have login accounts in the SQL Server.
The next tab is the Data Source. Here the browse button can be used to choose the appropriate server from the server group, or an appropriate server instance. You need to pick the default database to use from among the databases for which the credentials have permissions. At this point you may be asked to provide User Name and Password depending on the security mode (Basic) selected. In this example pubs database was chosen to continue with this tutorial.
The next tab specifies the SQL 2000 server access through the IIS. Allow template queries is enabled by default. URL queries may be allowed by checking the Allow URL queries option, but they are not safe and therefore are not recommended. They are useful in developing SQL requests before preparing a template file. As it passes anything that can go into an URL, they can also modify the database, drop tables, and so forth. Allow XPath allows users to execute XPath queries over SQL Views. Allow Post allows posting of data by the user (HTTP GET & POST), this is disabled by default. In this example, all options are chosen. In some versions of SQL Server another option, Allow posted updategrams may also be available, and also the default choices could be different as well.
The next tab, Virtual Names, that specifies a name as a part of the URL to execute, is configured as follows:
When this window comes up, there are no default choices. Click on New which pops up a window where the following choices can be made.
Database object, dbObject. No path information is needed for this option. This allows specifying a database table or view at the URL.
Xpath query against a mapping schema by choosing schema. These queries can return results in native SQL Server format.
SQL queries in template files by choosing, template. Templates can be created to execute both SQLl queries as well as XPath queries. XSL can be used to transform the query results.
The virtual names hide the actual directories where information is stored, thus offering more security. The choices made appear as shown here. The path usually picks up the virtual directory chosen in the general tab.
The last one is the Advanced tab. The location of the sqlisapi.dll is automatially retrieved. In case it does not, or should you like to test a new version, you should browse to the location. The default directory in Windows 2000 Professional is C:\Program Files\Common Files\System\Oledb\sqlisapi.dll. This completes the configuration of the virtual directory. In order to force loading of modified documents, it is best to check the "cache" option.
The next picture shows three different virtual directories on the Nechhost server each with a different set of choices.
Summary
In this first part, two main items were considered. Firstly, the language enhancements to the T-SQL available in and out of the box installation of SQL 2000 Server was discussed. The powerful for XML clause in its various modes to retrieve data in XML format from the SQL 2000 Server, and the open XML clause together with the two new stored procedures for the uploading of XML document to a table structure were discussed. Secondly, the enabling of a SQL 2000 server specific virtual directory was discussed in detail. These are two of the required items to converse with the SQL 2000 Server in XML over the Internet. The next part will focus on the various ways such a data based information transfer takes place across the Internet.