CFDJ: Article

Flexible XML Web Design: XML Data Islands

Using XML with XSL doesn't mean you have to eschew your favorite server-side scripting language for XML/XSL. XML data islands offer you the best of both worlds.

XML data islands are a powerful technique for embedding XML data in HTML documents. Internet Explorer 5 (IE5) is currently the only browser supporting them. However, you can still enjoy the flexibility of XML data islands if you process the XML on your server and insert the resulting HTML into your server page. I'll demonstrate this technique using Active Server Pages (ASP) and a C++ Active Template Library (ATL) component. I'll assume the reader has a basic understanding of XML, XSL and Microsoft's XML parser. I'll explain the C++ component in general terms that should be understandable to non-C++ programmers. Although I'm using Microsoft technologies, the concept of XML data islands is relevant regardless of the platform.

As the complexity, importance and size of Web applications increase, design issues become increasingly critical. A dominant strategy is a three-tiered architecture - dividing the application into display, business and database layers. Typically the display layer consists of cascading stylesheets, HTML documents and dynamic scripting pages such as ASP, JavaServer Pages (JSP) and ColdFusion. The business layer consists of server-side scripting pages and, frequently, components such as COM objects or Java servlets. The database layer consists of the database, associated stored procedures and database-related components.

Three-tiered strategies often break down in practice. A single server-side scripting page or component often combines display logic with both business and data logic. For example, an ASP page that gets and transforms data, then formats its display - all in one page - is all too common. This combination of logic (in effect a two-tiered design) is often problematic. In a two-tiered application it's usually harder to debug the code and to tell which parts are doing what. These applications are also less scalable and require more regression testing every time the code changes.

XML and XSL address this problem. Display logic is contained in XSL stylesheets. Business and database logic are contained in components and/or server-side scripting pages. The business logic is responsible for getting XML, performing the necessary transformations and applying an XSL stylesheet to the XML data. This separation results in applications that are easier to debug, modify and maintain.

XML data islands are XML data embedded in HTML documents. IE5 can process an XML data island directly: JavaScript or VBScript combines the XML data with an XSL stylesheet using IE5's XML parser, MSXML. We can also use MSXML on the server, independent of IE5. After processing the XML data on the server, we insert the resulting data-driven HTML into an ASP page. Using MSXML on the server lets us keep our ASP pages while still using XML with XSL. The result of the XSL transformation on the XML data isn't a complete HTML document, but rather an HTML document fragment.

Consider a hypothetical Internet application for viewing a list of all employees. The list must be available as formatted output for Web browsers and handheld devices, and as raw data so future applications can obtain it easily over HTTP. XML combined with XSL meets these requirements. Obtaining different display formats is as easy as applying different XSL stylesheets to the XML data. A switch that makes the ASP page return only XML meets the requirement to expose the raw data to other applications.

Architectural Overview
The class diagram in Figure 1 provides a logical overview of our site and of the components that make it up. The notation is Unified Modeling Language (UML) with Web application extensions. The pages DeptListings and EmpDisplay_1/EmpDisplay_2 are displayed by the browser. EmpDisplay_1 and EmpDisplay_2 are actually the same page as EmpListing, but we model them separately due to the dynamic nature of server-side scripting pages. These pages have two lives: the first on the server (EmpListing) and the second in the client's browser (EmpDisplay_1 and EmpDisplay_2). The further distinction between EmpDisplay_1 and EmpDisplay_2 illustrates that for IE5, EmpDisplay_2 needs to contain two <xml> tags, a <div> tag and a large block of JavaScript to transform the XML (transform.js).

Dynamic Overview
The sequence diagram in Figure 2 illustrates the sequence of messages in our application; it provides a high-level overview of the messages passed among the different elements.


Figure 2:

As the diagram illustrates, first DeptListings calls EmpListing; then, with the help of xmlparser, EmpListing is built. It's built differently depending on whether XML or HTML was requested. Then it's sent to the browser. We break down EmpListing's processing in the following steps:

  1. Dimension any variables needed and instantiate the xmlparser component.
  2. Determine if the URL parameter indicates rawxml or html.
  3. If rawxml, call xmlparser.getData and output the results (XML).
  4. If not rawxml, determine if the browser is IE5.
  5. If not rawxml and not IE5:
    -Include the header.
    -Call xmlparser.getData, making sure to pass the XSL template and the DTD (Document Type Definition) path, and write the processed results to the browser.
    -Include the footer.
  6. If not rawxml and IE5:
    -Write the JavaScript needed to process the XML data island to the browser.
    -Write a body tag with an onLoad handler that calls the JavaScript function.
    -Include the header.
    -Call xmlparser.getData and place the results in a variable in EmpListing.
    -Place the opening tag <xml id="xmldata"> and closing tag </xml> around the XML.
    -Add the opening tag <xml id="xsldata" src="ourpath"> and closing tag to the end of the variable.
    -Write the variable to the browser.
    -Include the footer.

Essentially, if the user isn't using IE5, get HTML from the XML component; otherwise put both the XML data and the XSL template in <xml> tags and include a JavaScript function. This function uses the two <xml> tags to transform the data and display it in a <div> tag using the <div> tag's innerHTML property.
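
The branching in the steps above can be sketched in a few lines. This is only an illustration in standard C++, not the article's ASP code: the RenderMode enum, the chooseRenderMode function and the "rawxml"/"html" parameter values are assumed names introduced here. The "MSIE 5." substring test mirrors the user-agent check used by the ASP page.

```cpp
#include <string>

// Hypothetical names: the ASP page does not define an enum like this.
enum class RenderMode { RawXml, DataIsland, ServerHtml };

// Pick the output path for one request.
// `format` stands in for the URL parameter from step 2
// (assumed values: "rawxml" or "html").
// `userAgent` is the value of the HTTP_USER_AGENT server variable.
RenderMode chooseRenderMode(const std::string& format,
                            const std::string& userAgent) {
    if (format == "rawxml") {
        return RenderMode::RawXml;      // step 3: return the XML untouched
    }
    // Same substring test the ASP page uses to detect IE5.
    if (userAgent.find("MSIE 5.") != std::string::npos) {
        return RenderMode::DataIsland;  // step 6: the browser transforms
    }
    return RenderMode::ServerHtml;      // step 5: the server transforms
}
```

Keeping the decision in one small function makes it easy to see that the raw XML switch takes priority over browser detection.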

There are several techniques for transforming data from a database into XML. If using ADO 2.1, we can use the Save method of a recordset and persist it as XML. If using ADO 2.5, we can write the ADO output directly to the output stream as XML. However, both solutions tie us to Microsoft's view of how XML data should appear (Microsoft's XML-Data Reduced schema format). Two more general techniques are: (1) build the XML DOMDocument node by node or (2) create a large XML string and load it into an XML DOMDocument.

There are benefits to both solutions. In the first one we instantiate an XML document and build it. Building the document directly gives us access to all the XML DOMDocument's methods. However, instantiating the XML DOMDocument object and all its subobjects can be costly. As reported on MSDN by Chris Lovett, program manager for Microsoft's XML team, Microsoft's own tests demonstrate that loading a string is "roughly five times faster" than building a document node by node. In our design, if the user has IE5 there's no reason to instantiate a DOMDocument on our server at all. Only when the client requests formatted output and doesn't have IE5 do we instantiate an XML DOMDocument, load the XML string and transform it. For these performance reasons we chose the latter technique.

The xmlparser component gets the XML and transforms it. This process is summarized in the following steps:

  1. Declare and initialize any needed variables.
  2. Get the recordset (for example, ADO).
  3. Allocate enough space for a large data string (the XML).
  4. Begin the XML string (including a reference to the DTD).
  5. Repeat for each record:
    -Append record opening tag to XML string.
    -Repeat for each field:
      -Append field opening tag to XML string.
      -Append field data to XML string.
      -Append field closing tag to XML string.
    -Append record closing tag to XML string.
  6. Close the XML string.
  7. Determine if XML or HTML is to be returned.
  8. If XML, return the XML string.
  9. If HTML:
    -Instantiate the XML/XSL parser (for example, MSXML.DLL).
    -Ensure the XML parser is set to validate the XML with our DTD.
    -Load the XML.
    -Load the XSL template.
    -Process the XML with the XSL.
    -Return the transformed data.
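
Steps 3 through 6 above can be sketched in standard C++ without any COM or ADO code. This is a minimal sketch under stated assumptions, not Listing 2: the names escapeXml, buildXml and Record are hypothetical, the records are plain vectors of strings rather than an ADO recordset, the sketch uses std::string where the component uses a wide string, and the preallocation size is a rough guess. Escaping of field data is an addition; the article does not say whether the component does this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Escape characters that would break well-formedness in element content.
// (Assumption: not described in the article.)
std::string escapeXml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;
        }
    }
    return out;
}

using Record = std::vector<std::string>;  // one value per field

// Build <records><record><field>value</field>...</record>...</records>
// with a DOCTYPE that points at the DTD.
std::string buildXml(const std::vector<std::string>& fieldNames,
                     const std::vector<Record>& records,
                     const std::string& dtdPath) {
    std::string xml;
    // Step 3: allocate once up front (the 64 bytes per field is a guess).
    xml.reserve(256 + records.size() * fieldNames.size() * 64);
    // Step 4: begin the XML string, including the DTD reference.
    xml += "<?xml version=\"1.0\"?>";
    xml += "<!DOCTYPE records SYSTEM \"" + dtdPath + "\">";
    xml += "<records>";
    // Step 5: one <record> per row, one child element per field.
    for (const Record& record : records) {
        xml += "<record>";
        for (std::size_t i = 0; i < fieldNames.size() && i < record.size(); ++i) {
            xml += "<" + fieldNames[i] + ">";
            xml += escapeXml(record[i]);
            xml += "</" + fieldNames[i] + ">";
        }
        xml += "</record>";
    }
    // Step 6: close the XML string.
    xml += "</records>";
    return xml;
}
```

Reserving the buffer before the loop is the point of the exercise: appending to a string that already has enough capacity avoids repeated reallocation and copying.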

    The server-side scripting page is the "glue" that holds our application together. It inserts any other files needed (such as headers and footers), instantiates our component, passes the required parameters to it, then inserts the output. The component, which converts the ADO recordset to XML, and the XSL stylesheet do most of the work.

    Implementation
    In this example we use ASP, a C++ ATL COM component and Microsoft's XML parser, MSXML. Our C++ component uses ADO and MSXML via the #import statements (see Listing 1). Our data is from a simple two-table Access database (see Figure 3) with an ODBC connection named employees.

    XML Component Implementation
    Neither scripting languages nor Visual Basic handles string concatenation very efficiently. A Visual C++ ATL component that preallocates the space needed for the XML string avoids repeated reallocation as the string grows, so we implement xmlparser in C++. Listing 2 shows our component's primary method. The ATL wizard-generated IDL code isn't listed. Our method's signature is:

    getData(BSTR *pConnection, BSTR *pQuery, BSTR *pDTDPath, BSTR *pXslTemplatePath,
    BSTR *pTransformedData)

    Note that *pTransformedData is the return data, not an input parameter. Our method takes a connection to an ODBC data source, the query, and the paths to the XML DTD and the XSL stylesheet.

    We implement the code in Listing 2 as envisioned earlier. Even without a solid understanding of C++, you should be able to get the gist of what the method is doing: getting an ADO recordset, looping through that recordset and creating an XML string (using the string template in the Standard Template Library). If the method returns XML, the line:

    *pTransformedData = SysAllocString(wstrXmlString.c_str());

    is executed and XML is returned. If the method returns HTML, the string and the XSL template are loaded into XML DOMDocument objects, the XSL transforms the XML and HTML is returned. Since we're using a DTD, we turn on validation with the code:

    spXmlDocument->put_validateOnParse(VARIANT_TRUE);

    The error-handling skeleton code is included but not implemented.

    Our use of MSXML is "low-rent." Although MSXML has a rich object hierarchy, we used only two objects: the top-level DOMDocument object and the parseError object. We needed only six methods. The put_validateOnParse method ensures the parser validates the XML against the DTD. The XML string is loaded via the loadXML method. The XSL stylesheet is loaded from its path using the load method. The code also uses get_parseError and get_errorCode to check for errors after the XML and XSL have finished loading. If there are no errors, the line:

    *pTransformedData = SysAllocString(spXmlDocument->
    transformNode(spXslDocument));

    uses the DOMDocument containing the XSL, transforms the XML to the desired output and writes it to pTransformedData.

    Active Server Page
    Listing 3 contains the implementation of DeptListings, and Listing 4 contains the implementation of EmpListing. First, any needed variables are dimensioned and we instantiate our XML processing component (xmlparser.dll). If we need only XML returned to the browser, then the processing consists of:

    Response.Write(xmlProcessor.getData("employees", cstr(strQuery), "Listing.dtd", "rawxml"))

    where employees is the DSN, strQuery is the SQL query statement, listing.dtd is the DTD and rawxml indicates we want XML returned. If the user wants XML we're finished.

    If the user wants HTML, we must do more processing. First, determine if the user's browser is IE5. If it is, use the following code to write the necessary data, since IE5 can process XML data islands directly:

    <%if instr(Request.ServerVariables("HTTP_USER_AGENT"), "MSIE 5.") > 0 then %>
    <SCRIPT LANGUAGE=javascript src="transform.js">
    </SCRIPT>
    <body onLoad="transformXml();">

    and

    <div id="xslTarget"></div>
    <xml id="xmldata">
    <%
    Response.write(xmlProcessor.getData("employees", cstr(strQuery), "Listing.dtd", "rawxml"))
    %>
    </xml>
    <xml id="xsldata" src="listings.xsl"></xml>

    I'll discuss the JavaScript page, transform.js, later. For now, all we need to know is that it contains a function called transformXml, which converts the XML data contained in <xml id="xmldata"></xml> to HTML using the XSL stylesheet in <xml id="xsldata" src="listings.xsl"></xml>, and writes it to the innerHTML of the tag <div id="xslTarget">. Both the text between the <xml> tags and the text contained in the page indicated by the src attribute are treated as XML by IE5.

    If the user's browser isn't IE5, the ASP page includes the header and the footer, maps a path to the XSL template and calls the getData method of xmlparser:

    strXslTemplate = server.MapPath ("listings.xsl")
    Response.write(xmlProcessor.getData("employees", cstr(strQuery),
    cstr(server.mappath("Listing.dtd")), cstr(strXslTemplate)))

    Rather than passing rawxml as a parameter, EmpListing passes the XSL stylesheet path to xmlparser. The component xmlparser transforms the XML to HTML and returns the results.

    JavaScript for IE5
    Listing 5 contains the code for transform.js. This code applies the XSL stylesheet to the XML data island (when the user's browser is IE5). Although I don't explain MSXML's error handling, I include the code needed to handle errors. The statement:

    result = xmldata.transformNode(xsldata.XMLDocument);

    transforms the XML with the XSL, and the statement:

    xslTarget.innerHTML = result;

    inserts the results into the <div id="xslTarget"> tag. Note that the try/catch block is IE5-specific JavaScript.

    XML Data and DTD
    The resulting XML is shown in Figure 4.

    The root tag is <records>. Each ADO record becomes an XML <record> and each ADO field is a child element of <record>. Our application uses a DTD for a practical reason: specifying the DTD in Listing 6 ensures that MSXML catches changes to the database or query. The line:

    <!ELEMENT record (name, sex, deptname, salary)>

    ensures that each record contains a name, sex, deptname and salary child element. For example, as shown in Figure 5, the DTD code:

    <!ELEMENT record (name, test, sex, deptname, salary)>

    combined with an XML record that contains name, sex, deptname and salary (but not test) causes MSXML to throw an error because the XML doesn't have a <test> element and the DTD requires it. The DTD helps keep the configuration of the database/queries and the XSL templates in sync. Note that there's a performance penalty for validating an XML dataset.

    XSL Stylesheet
    The XSL stylesheet (see Listing 7) applies formatting rules to XML that result in the desired display. Our stylesheet contains multiple templates, indicated by the tag <xsl:template match="matchingnode">. Our main template, <xsl:template match="/">, matches the document root. Inside that template we call a second template when <records> is matched. Since the XML data set has one <records> tag, we apply the second template once. Inside the records template we loop through each record using the statement <xsl:for-each select="record">.

    For each record we build a table row containing five table cells. Two of the cells contain data obtained from the XML data set using the <xsl:value-of> tag. The third cell applies a template to salary. This template contains an <xsl:eval> statement. The JavaScript that's inside the <xsl:eval> tag formats the node if it isn't blank:

    <xsl:template match="salary">
    <xsl:eval>if (this.text != "") formatNumber(this.text, "#,##0.00");</xsl:eval>
    </xsl:template>
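
To make the effect of the "#,##0.00" pattern concrete, here is a rough standard C++ equivalent: thousands separators, exactly two decimal places, and blank input left blank. It is a sketch only. The function name formatSalary is hypothetical, and it does not reproduce every rule of MSXML's number formatting.

```cpp
#include <cmath>
#include <cstddef>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Format numeric text roughly as the "#,##0.00" pattern would.
std::string formatSalary(const std::string& text) {
    if (text.empty()) {
        return "";                           // blank nodes stay blank
    }
    const double value = std::stod(text);    // throws if text is not numeric
    std::ostringstream fixedOut;
    fixedOut.imbue(std::locale::classic());  // force '.' as decimal point
    fixedOut << std::fixed << std::setprecision(2) << std::fabs(value);
    const std::string digits = fixedOut.str();
    const std::size_t dot = digits.find('.');
    if (dot == std::string::npos) {
        return digits;                       // e.g. "inf" or "nan"
    }
    std::string result = (value < 0) ? "-" : "";
    // Copy the integer part, adding a comma before each group of three.
    for (std::size_t i = 0; i < dot; ++i) {
        result += digits[i];
        const std::size_t remaining = dot - 1 - i;
        if (remaining > 0 && remaining % 3 == 0) {
            result += ',';
        }
    }
    result += digits.substr(dot);            // ".NN"
    return result;
}
```

For example, the input "1234567.5" comes back as "1,234,567.50" and "950" as "950.00".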

    Odd and even rows have a different background color. The code:

    <xsl:script>
    <![CDATA[function even(e) {return absoluteChildNumber(e)%2==0;}]]>
    </xsl:script>

    and

    <tr>
    <xsl:choose>
    <xsl:when expr="even(this)">
    <xsl:attribute name="bgcolor">#9999cc</xsl:attribute>
    </xsl:when>
    <xsl:otherwise>
    <xsl:attribute name="bgcolor">#ccccff</xsl:attribute>
    </xsl:otherwise>
    </xsl:choose>

    accomplishes this coloring. The function even returns true or false depending on whether the node's position is even. This function is called by the conditional statement <xsl:when expr="even(this)"> inside the <xsl:choose> block. The parameter "this" refers to the current node, <record>. The JavaScript function then uses the absoluteChildNumber method provided by MSXML's XSL runtime to determine if the current node is odd or even. The attribute bgcolor is added to the <tr> tag and the transformed tag becomes either:

    <tr bgcolor="#9999cc">

    or

    <tr bgcolor="#ccccff">
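
The same choice, written as a plain C++ function for illustration. The name rowBackground is hypothetical, and the sketch assumes child numbering starts at 1, so the first record is odd.

```cpp
#include <string>

// Pick the row background from the record's position among its siblings.
// childNumber is assumed to be 1-based, like the value the stylesheet's
// even() function receives from absoluteChildNumber.
std::string rowBackground(unsigned childNumber) {
    return (childNumber % 2 == 0) ? "#9999cc"   // even rows
                                  : "#ccccff";  // odd rows
}
```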

    Conclusion
    The resulting output is illustrated in Figure 6.


    Figure 6:

    We breezed over quite a few concepts, and I didn't discuss how to program in C++ or write an XSL template. Rather, I focused on a practical example of what you can do with these technologies. Table 1 lists numerous online resources that expand on the architecture and technologies presented.

    Data islands are a great way to implement XML in your site. They enable you to use server-side scripting languages with the power of XML and XSL. The output of XML and XSL is simply an HTML document fragment that's inserted into an ASP page. Localizing display logic to an XSL template avoids many of the difficulties associated with two-tiered Web design. Finally, giving a page a "switch" that causes it to return only XML makes the page's data easily accessible to other applications.

    Reference
    C++ ATL Component Tutorial: www.asptoday.com/articles/20000824.htm

More Stories By James A. Brannan

James A. Brannan is a consultant specializing in Internet programming in the
Washington, DC, metropolitan area. He is also pursuing a master's degree at
the University of Maryland.


Most Recent Comments
LA Bryant 07/17/01 04:09:00 PM EDT

Greetings,

This excellent article details the architectural process of documenting the requirements of a web application. I was wondering where I could find more information pertaining to Unified Modeling Language (UML) with Web application extensions.

Thanks,

Great Magazine by the Way!! Keep it Up!!

Lyle A Bryant
lyle_bryant@hotmail.com