Using Data Transformation Services: Can We Transfer XML Data?

There has been a lot of discussion in the SQL forums on whether or not DTS has XML support. This investigative tutorial looks at this issue with some interesting observations.

Introduction

One of the chief problems with XML support in DTS is that the XML support in SQL 2000, which takes the form of Transact-SQL language enhancements for two-way traffic between XML and relational data, works well on its own but is not adequate for handling that traffic in the DTS context. The main reason is that the XML returned by the FOR XML clause arrives with characters that the XML parser cannot parse. We will see some aspects of this in the tutorial.

It is possible to recover the XML that DTS processes, but this may require some post-processing. An elegant, purely ActiveX Script based answer, gleaned from an Internet link, is also described. To understand this article better, please review the tutorials on FOR XML and on DTS Global Variables.

The FOR XML clause

To go from relational data to XML, SQL 2000 Server extends the Select statement. The server returns a single column of XML-formatted data, which may exceed the size of a query result that the Query Analyzer returns. To produce this column, the server runs the query and then formats the results as XML according to the particular clause, as in the following extensions:

Select <> From <>
[where]<>
[order by]<>
[for xml (raw|auto[,elements]|
          explicit)[,xmldata]
          [,binary base64])]

FOR XML RAW returns each row in a generic <row> element. The column values for that row are returned as attributes, in name/value pairs. This data may need further processing to map attributes to elements. FOR XML RAW can take further qualifiers, BINARY BASE64 and XMLDATA. With FOR XML RAW, XMLDATA the result includes a schema as well.

The next query is run against the pubs database in SQL 2000 Server. It may be copied into the Query Analyzer and executed. The following picture shows the results returned (the schema comes first, followed by the rowset). Because the result is stuffed into a single column, it may look garbled in the Query Analyzer's result pane, but it can be copied and pasted into Notepad. The results also lack a <root> element, so they do not give you a well-formed XML document, only an XML fragment. To view the result in a browser, the returned result has to be enclosed inside <root></root>. The result, as seen in the browser after this, is shown in the next picture (the latter part is truncated to save space).
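As a rough illustration of the attribute-to-element mapping mentioned above, here is a minimal Python sketch that runs outside SQL Server and DTS. The function name attributes_to_elements and the sample row are made up for this example; only the <row> shape, with column values carried as attributes, comes from FOR XML RAW.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(row):
    """Return a new element whose children mirror the attributes of `row`."""
    converted = ET.Element(row.tag)
    for name, value in row.attrib.items():
        # Each name/value attribute pair becomes <name>value</name>.
        child = ET.SubElement(converted, name)
        child.text = value
    return converted


# Hypothetical sample row in the attribute-centric shape of FOR XML RAW.
sample = ET.fromstring('<row pub_id="0736" price="19.9900" title_id="PS3333"/>')
element_form = attributes_to_elements(sample)
print(ET.tostring(element_form, encoding="unicode"))
```

This is only one way to do the mapping; an XSLT stylesheet would serve the same purpose.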

select pub_id, price, type,ytd_sales,titleauthor.title_id,au_ord
from titles,titleauthor
where titleauthor.title_id=titles.title_id
and price>15
order by pub_id
for xml raw, xmldata

However, if you capture the complete result from the Query Analyzer for the above query on pubs, you will see the following. This is not a well-formed XML document, even if you were to get rid of the first line, because the rest still has no single enclosing root element. For the picture above, I removed this line and then enclosed the result between <root> and </root> tags.

XML_F52E2B61-18A1-11d1-B105-00805F49916B
<Schema name="Schema1" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes"><ElementType name="row" content="empty" model="closed"><AttributeType name="pub_id" dt:type="string"/><AttributeType name="price" dt:type="fixed.14.4"/><AttributeType name="type" dt:type="string"/><AttributeType name="ytd_sales" dt:type="i4"/><AttributeType name="title_id" dt:type="string"/><AttributeType name="au_ord" dt:type="ui1"/><attribute type="pub_id"/><attribute type="price"/><attribute type="type"/><attribute type="ytd_sales"/><attribute type="title_id"/><attribute type="au_ord"/></ElementType></Schema>
<row xmlns="x-schema:#Schema1" pub_id="0736" price="19.9900" type="psychology " ytd_sales="4072" title_id="PS3333" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="0877" price="19.9900" type="mod_cook " ytd_sales="2032" title_id="MC2222" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="0877" price="21.5900" type="psychology " ytd_sales="375" title_id="PS1372" au_ord="2"/>
<row xmlns="x-schema:#Schema1" pub_id="0877" price="21.5900" type="psychology " ytd_sales="375" title_id="PS1372" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="0877" price="20.9500" type="trad_cook " ytd_sales="375" title_id="TC3218" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="20.0000" type="popular_comp" ytd_sales="4095" title_id="PC8888" au_ord="2"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="25.0000" type="popular_comp" title_id="PC9999" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="19.9900" type="business " ytd_sales="4095" title_id="BU1032" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="20.0000" type="popular_comp" ytd_sales="4095" title_id="PC8888" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="19.9900" type="business " ytd_sales="4095" title_id="BU1032" au_ord="2"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="22.9500" type="popular_comp" ytd_sales="8780" title_id="PC1035" au_ord="1"/>
<row xmlns="x-schema:#Schema1" pub_id="1389" price="19.9900" type="business " ytd_sales="4095" title_id="BU7832" au_ord="1"/>
(12 row(s) affected)
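The manual clean-up described above (dropping the column-name line, then enclosing the fragment in a root element) could be sketched in Python roughly as follows. The function name wrap_fragment and the shortened sample text are assumptions made for this illustration, and the sketch presumes the capture uses plain straight quotes and contains no stray angle brackets outside the XML.

```python
import xml.etree.ElementTree as ET


def wrap_fragment(raw_text, root_tag="root"):
    """Turn captured FOR XML output into a well-formed document string."""
    # Keep only the text from the first '<' to the last '>'. This drops the
    # column-name line before the XML and the row-count message after it.
    start = raw_text.find("<")
    end = raw_text.rfind(">")
    if start == -1 or end < start:
        raise ValueError("no XML markup found in the captured text")
    fragment = raw_text[start:end + 1]
    # Enclose the fragment in a single root element.
    return "<{0}>{1}</{0}>".format(root_tag, fragment)


# Shortened, hypothetical capture in the same layout as the result above.
captured = (
    "XML_F52E2B61-18A1-11d1-B105-00805F49916B\n"
    '<row pub_id="0736" price="19.9900"/><row pub_id="0877"\n'
    'price="21.5900"/>\n'
    "(2 row(s) affected)"
)

document = wrap_fragment(captured)
root = ET.fromstring(document)  # raises ParseError if still malformed
print(root.tag, len(root))
```

Parsing the wrapped string is a quick way to confirm that the only thing missing from the fragment was the root element.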

Implementing DTS Transfer of XML

I will be using SQL 2000 Server and running a simple select query with the FOR XML clause. This query will be part of the ExecuteSQL task in the designer. The result of this query, which is but a single column (a single line, if you want to call it that), will be assigned to a Global Output Variable. I also fashion an ActiveX Script that reads this recordset (referenced by the Global Variable) through ADODB and then saves (persists) the result to an external XML file on my hard drive. I will also use a workflow to make sure that the query is executed before the ActiveX Script runs. Please refer to my tutorials on DTS with Global Variables. Each of these steps has supporting screen shots. The overall designer view is shown in the next picture.

Description of Table used

In order to implement data transfer, I will be using a very simple table in my test bed database TestWiz. This sample has very few elements with no offending characters that you may sometimes find in tables. The next picture shows the table used.

Query Analyzer Result

The query that will produce the XML data used in the ExecuteSQL Task is shown in the next picture, together with the result from the Query Analyzer. Notice the first line, which is normally the column name while returning records from a relational table.

DTS Design Details

First of all, I will be using a Global Variable NewXMLDTS and into this I will be stuffing the result of my query. The next picture shows the Global Variables tab of the DTS Package Properties. The Type is shown as Dispatch in this picture, as the program had finished executing when the picture was captured. Initially it would be <other>. The name Newxml is not used in this tutorial.

Next I will be using an ExecuteSQL Task which uses the connection that I have established. This process has been described a number of times and will not be repeated. The next two pictures show the two relevant portions of this ExecuteSQL task.

This picture shows the Parameter mapping accessed by clicking the Parameters… button. Notice that the same characters that are in the Query Analyzer are also in the mapped parameter.

The ActiveX Script

The ActiveXScript will glue the query result to the Global variable as well as persist the XML to an XML file. The following script is used in this task. The script instantiates (creates) an ADODB.Recordset, and to this the Global Variable is assigned. The Recordset is saved as an XML file. The scripting interface in DTS is very strict; make sure that the syntax is as close to the one shown as possible.

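'**********************************************************************
'  Visual Basic ActiveX Script
' Jayaram Krishnaswamy
'************************************************************************

Function Main()
    ' adPersistXML is an ADO constant (value 1); it is declared here
    ' in case the script host does not define it
    Const adPersistXML = 1

    Dim RS
    Set RS = CreateObject("ADODB.Recordset")
    ' Point RS at the rowset that the ExecuteSQL task stored in the Global Variable
    Set RS = DTSGlobalVariables("NewXMLDTS").Value

    ' Persist the recordset to an external file
    RS.Save "C:\Documents and Settings\computer user\Desktop\Nov7\Newxml.xml", adPersistXML

    Main = DTSTaskExecResult_Success
End Function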

A Workflow item is also added so that the script runs only after a successful run of the ExecuteSQL task. The package is saved after making sure that none of the individual items is highlighted.

DTS Package run results

This package runs successfully, and a file is created in the location given in the script. If this file is displayed in the browser, what you will see is the following:

Source view of browser display

This is pretty ugly and useless. Now let’s take a look at the source view of this browser output. You will see the following. In addition to what was seen in the Query Analyzer result, other offending characters may also be present. The source view may be cleaned up through some filtering, but this is not a clean solution.
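Before opening a saved file in a browser, one way to find out whether it parses at all is to hand its text to an XML parser and look at the error message. The Python sketch below is only an illustration; the function name check_well_formed and the sample strings are assumptions, not part of the DTS package.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The message includes the line and column where parsing stopped.
        return False, str(err)
    return True, None


# A document with a root element parses; a bare fragment of rows does not.
print(check_well_formed('<root><row fname="a"/><row fname="b"/></root>'))
print(check_well_formed('<row fname="a"/><row fname="b"/>'))
```

The position reported in the parser message can point to the first offending character, which may help decide what kind of filtering is needed.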

ActiveX Script Based Data Transfer Implementation

The article at the link given in the script header below describes an ActiveX Script based export of XML to a file. I did not author that article. I have used its script, with changes, so that I can work with the same example as above: the tiny little table and the query. The script is as shown. In the DTS designer, drag and drop an ActiveX Script task and double-click it to open the scripting interface, then copy and paste this script. If you plan to use this script, make sure that you write out sSQL all on one line.

'**********************************************************************
'  Visual Basic ActiveX Script
'  Please refer to the original author here:
'  http://www.sqlxml.org/faqs.aspx?faq=10
'************************************************************************

Function Main()
    Dim oCmd, sSQL, oDom

    ' DOM document that receives the XML stream returned by the query
    Set oDom = CreateObject("Msxml2.DOMDocument.4.0")

    Set oCmd = CreateObject("ADODB.Command")
    oCmd.ActiveConnection = "Provider=SQLOLEDB;Data Source=(local);" & _
        "Initial Catalog=TestWiz;Integrated Security=SSPI"

    ' Template-style query; keep sSQL on a single line
    sSQL = "<ROOT xmlns:sql='urn:schemas-microsoft-com:xml-sql'><sql:query>select fname from XMLDTSTest for xml auto</sql:query></ROOT>"

    oCmd.CommandText = sSQL
    ' Dialect GUID that tells the provider to treat the command text as an XML template
    oCmd.Dialect = "{5D531CB2-E6Ed-11D2-B252-00C04F681B71}"
    oCmd.Properties("Output Stream") = oDom
    ' 1024 is the value of adExecuteStream: return the result as a stream
    oCmd.Execute , , 1024

    oDom.Save "c:\xmldtsjay.xml"

    Main = DTSTaskExecResult_Success
End Function

The Result of this package

The following picture shows the browser display of the XML file xmldtsjay.xml created by the above script. Isn’t this neat?


Summary

The short answer to the question posed in the title is not a total ‘no’, but a partial ‘yes’. The script-based solution appears to be independent of DTS except for the last line in the code. Does it work outside of DTS? I have not tested it yet. Moreover, it uses a different provider, SQLOLEDB. But perhaps the reason it works so well is the DOM support, and the fact that the query resembles a template query. For other questions regarding the above code, I direct readers to the link given with the script in the previous section.
