XSL Transformations: XSLT Alleviates XML Schema Incompatibility Headaches -- MSDN Magazine, August 2000

Don Box, Aaron Skonnard, John Lam

This article assumes you’re familiar with XML and XPath
Code for this article: Box0800.exe (32KB)

SUMMARY
The XSL Transformations (XSLT) specification defines an XML-based language for expressing transformation rules that map one XML document to another. XSLT has many of the constructs found in traditional programming languages, including variables, functions, iteration, and conditional statements. In this article you’ll learn how to use the XSLT instructions and template rules, manage namespaces, control transformation output, use multiple stylesheets, and employ pattern-matching with template rules. A sidebar explains how to access XSLT from MSXML using the IXSLTemplate and IXSLProcessor interfaces.

This article was adapted from the forthcoming book Essential XML (Chapter 5), by Don Box, Aaron Skonnard, and John Lam © 2000 Addison Wesley Longman. Reproduced by permission of Addison Wesley Longman. All rights reserved.

The XML Schema definition language is poised to become the dominant way to describe the type and structure of XML documents. XML Schemas provide the basic infrastructure for building interoperable systems based on XML since they give you a common language for describing XML that is based on proven software engineering principles. That stated, the expressiveness of XML Schemas makes it possible (if not likely) that multiple organizations modeling the same set of domain-specific abstractions will come up with different schema documents. Yes, this problem could be solved via industry consortia defining canonical schema for each domain, but until that happens, dealing with multiple schema definitions of the same basic information will be a fact of life. Enter XSL Transformations (XSLT).

The XSLT specification defines an XML-based language for expressing transformation rules from one class of XML document to another. The XSLT language can be thought of as a programming language, and there are at least two XSLT execution engines currently available that can directly execute an XSLT document as a program. But XSLT documents are also useful as a general-purpose language for expressing transformations from one schema type to another. In fact, we could imagine using an XSLT document as one form of input to an arbitrary XML translation engine.

XSLT excels at mapping one XML-based representation onto another. Consider the XML document shown in Figure 1 (which will be referred to often throughout our discussion).

----------------------------------------
Figure 1
XML Source Document

<?xml version="1.0"?>
<product title="Essential XML" xmlns="http://awl.com">
<writer name='Don Box' />
<writer name='Aaron Skonnard'/>
<writer name='John Lam'/>
</product>
----------------------------------------

Note that the element names belong to a namespace (and schema) defined by Addison Wesley Longman (the publisher of our book). Now consider this second representation of the same information:

<?xml version="1.0"?>
<content xmlns="http://www.develop.com/Schemas/book">
  <name>Essential XML</name>
  <contributors>
    <staff principal="true">Don Box</staff>
    <staff>Aaron Skonnard</staff>
    <staff>John Lam</staff>
  </contributors>
</content>

This time the element names belong to a namespace (and schema) defined by DevelopMentor (the home of the authors).

The two documents just shown appear to contain roughly the same information. However, appearances can be deceiving. Without human intervention, it is impossible to algorithmically determine whether there is any correlation whatsoever between the two underlying schemas, even in the presence of similar or identical local element and attribute names. Once a human capable of understanding the semantics of the two schemas has determined that there is in fact some relationship, it would be useful to have a language for describing the transformations necessary to convert instances of one schema into instances of the other.

One way to describe these transformations is to simply write code in a traditional programming language. Figure 2 shows an example of a Document Object Model (DOM)-based program written in the Java language that translates the first document into the second.

----------------------------------------
Figure 2
DOM-based Translation
----------------------------------------
import org.w3c.dom.*;

// Wrapper class (the name is illustrative) so the method compiles on its own.
class BookTranslator {
  // Builds a DevelopMentor-schema document from an AWL-schema source document.
  static Document transform(Document source) throws Exception {
    String tns = "http://www.develop.com/Schemas/book";
    DOMImplementation dom = source.getImplementation();
    Document target = dom.createDocument(tns, "content", null);
    Element sourceRoot = source.getDocumentElement();

    // The title attribute becomes the text of a name element.
    String title = sourceRoot.getAttribute("title");
    Element e1 = target.createElementNS(tns, "name");
    e1.appendChild(target.createTextNode(title));
    target.getDocumentElement().appendChild(e1);

    // Each child element of the source root becomes a staff element.
    e1 = target.createElementNS(tns, "contributors");
    boolean seenFirst = false;
    for (Node author = sourceRoot.getFirstChild();
         author != null; author = author.getNextSibling()) {
      if (author.getNodeType() != Node.ELEMENT_NODE)
        continue;
      String name = ((Element)author).getAttribute("name");
      Element e2 = target.createElementNS(tns, "staff");
      e2.appendChild(target.createTextNode(name));
      if (!seenFirst)  // only the first contributor is marked as principal
        e2.setAttributeNS(null, "principal", "true");
      e1.appendChild(e2);
      seenFirst = true;
    }
    target.getDocumentElement().appendChild(e1);
    return target;
  }
}
----------------------------------------

While this program is functional, it is only readable by virtual machines and, assuming the source code is provided, humans who can read Java-language code. Moreover, this program is very brittle and would require a significant amount of modification to track the independent evolution of both the source and target schemas.

In contrast, consider the XSLT document in Figure 3 that accomplishes the same task as the previous Java-language program.

----------------------------------------
Figure 3
XSLT Translation
----------------------------------------

<?xml version="1.0"?>
<content xmlns="http://www.develop.com/Schemas/book"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xsl:exclude-result-prefixes='src'
      xsl:version='1.0'
      xmlns:src="http://awl.com">
  <name><xsl:value-of select="/src:product/@title"/></name>
  <contributors>
    <xsl:for-each select='/src:product/src:writer'>
      <xsl:if test='position() = 1' >
        <staff principal='true'
        ><xsl:value-of select="@name"/></staff>
      </xsl:if>
      <xsl:if test='position() > 1'>
        <staff><xsl:value-of select="@name"/></staff>
      </xsl:if>
    </xsl:for-each>
  </contributors>
</content>
----------------------------------------

This XML document reflects both the source and target schemas and can be fed to an XSLT processor to translate instances of the source schema into instances of the target schema. As shown here, schema transformations are described by implementing an exemplar of the target schema in terms of its changes from the source. More importantly, this document can be read using a standard XML parser and can act as input to a wide variety of processing software, not just XSLT transformation engines.
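
To make "fed to an XSLT processor" concrete, here is a minimal sketch that runs the Figure 3 stylesheet against the Figure 1 document. It assumes the JAXP API (javax.xml.transform) that ships with the Java platform; the article's sidebar covers MSXML instead, and the class name Figure3Runner is purely illustrative.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical harness (not from the article): runs Figure 3 against Figure 1.
class Figure3Runner {
    // The source document from Figure 1.
    static final String SOURCE =
        "<product title='Essential XML' xmlns='http://awl.com'>"
      + "<writer name='Don Box'/><writer name='Aaron Skonnard'/>"
      + "<writer name='John Lam'/></product>";

    // The literal result element stylesheet from Figure 3.
    static final String STYLESHEET =
        "<content xmlns='http://www.develop.com/Schemas/book'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:src='http://awl.com'"
      + " xsl:exclude-result-prefixes='src' xsl:version='1.0'>"
      + "<name><xsl:value-of select='/src:product/@title'/></name>"
      + "<contributors><xsl:for-each select='/src:product/src:writer'>"
      + "<xsl:if test='position() = 1'><staff principal='true'>"
      + "<xsl:value-of select='@name'/></staff></xsl:if>"
      + "<xsl:if test='position() > 1'><staff>"
      + "<xsl:value-of select='@name'/></staff></xsl:if>"
      + "</xsl:for-each></contributors></content>";

    // Applies a stylesheet to a source document and returns the serialized result.
    static String transform(String stylesheet, String source) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(source)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, SOURCE));
    }
}
```

Compared with Figure 2, all of the schema-specific knowledge now lives in the STYLESHEET string; the Java code is generic plumbing that would not change if either schema evolved.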

As Figure 3 illustrates, simply using XML does not ensure inter-operation. Yes, the world now agrees on where the angle brackets go. But being able to interpret anything beyond that is a much tougher problem to solve. XML Schemas give you the tool for representing type definitions, but in the absence of a common schema, human intervention is still needed to bridge organizational boundaries. XSLT provides an XML-centric way to build that bridge.

XSLT Basics
XSLT is an XML-based language that describes transformations from XML documents into arbitrary text-based formats (which may or may not be XML). As shown in Figure 4, XSLT assumes that three documents are in use: the source document, the XSLT stylesheet document, and the result document. The source document is simply a well-formed XML document that provides the input for the transformation. The stylesheet document is an XML document that uses the XSLT vocabulary for expressing transformation rules. The result document is a text document that is produced by running the source document through the transformations found in the XSLT stylesheet.

Figure 4. XSLT Transformation Process

The XSLT stylesheet contains one or more XSLT templates. A template is a collection of literal result elements and XSLT instructions. Literal result elements are elements that are copied more or less verbatim into the result document. XSLT instructions are well-known elements that alter the processing of the template. Figure 5 shows the complete list of XSLT instructions. XSLT instructions and other constructs are always qualified by the XSLT namespace URI (http://www.w3.org/1999/XSL/Transform), which is typically mapped to the namespace prefix xsl.

An XSLT stylesheet can be written as either a single-template literal result element or as an explicit stylesheet. The former is simply an exemplar of the target schema with namespace-qualified XSLT instructions added. The latter is an XSLT-specific document format whose root element is xsl:stylesheet. The explicit stylesheet will be discussed in the Template-based Programming section later in this article.

Literal result element-based stylesheets support a simple subset of the more flexible and expressive xsl:stylesheet vocabulary. The following is the simplest possible XSLT stylesheet written as a literal result element:

<?xml version='1.0' ?> <doc>Hello, World</doc>

Like the classic program from Kernighan and Ritchie, this stylesheet is extremely useless: it produces the same result document no matter what source document is fed to it.

The xsl:copy-of and xsl:value-of Instructions
The easiest way to include content from the source document is via the xsl:copy-of instruction. This instruction works much like XInclude's include element (which is remotely similar to external parsed entities). The primary difference is that xsl:copy-of is integrated into the XPath context used by XSLT.

Consider the following XSLT literal result element:

<?xml version='1.0' ?> <doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xsl:version='1.0' ><xsl:copy-of select='/book/author'/></doc>

The rules of XSLT indicate that the xsl:copy-of element will be replaced by the node-set produced by the select XPath expression. Assuming that the following source document is supplied as input

<?xml version="1.0"?>
<book title="Essential XML" >
  <author name='Don' age='25' canadian='false'/>
  <author name='Aaron' age='19' canadian='false'/>
  <author name='John' age='20' canadian='true'/>
</book>

the result document would look like this:

<?xml version='1.0' ?>
<doc>
  <author name='Don' age='25' canadian='false'/>
  <author name='Aaron' age='19' canadian='false'/>
  <author name='John' age='20' canadian='true'/>
</doc>

The xsl:copy-of instruction is ideal for copying node-sets; however, it is somewhat cumbersome for copying text from the source document. For this, XSLT provides two mechanisms: one for including source text as attribute [children], the other for including source text as element [children].

All attributes of literal result elements are interpreted as attribute value templates, which are simply strings that contain embedded XPath expressions that are evaluated to produce result text. XSLT treats the { and } characters specially when they appear as attribute [children]. When the { character is encountered, the subsequent characters (up to the closing }) are interpreted as a string-value XPath expression. For example, had the following attribute appeared in the literal result element

bookname='The book is {/book/@title}'

the corresponding attribute in the result document would look like this:

bookname='The book is Essential XML'

To escape the { and } characters in attribute [children], you must use a redundant { or } character to signal that no attribute value template is present. For example, the following attribute from a literal result element

curlies='{{}}'

would expand to this in the result document:

curlies='{}'
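
The two attribute value template rules above can be checked with a short sketch. It assumes the JAXP API (javax.xml.transform) bundled with the Java platform as the XSLT processor; the class name AvtDemo and the one-element source document are illustrative, not part of the article's code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: attribute value templates on a literal result element.
class AvtDemo {
    static final String SOURCE = "<book title='Essential XML'/>";

    // {/book/@title} is evaluated as XPath; {{ and }} are escaped curlies.
    static final String STYLESHEET =
        "<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xsl:version='1.0'"
      + " bookname='The book is {/book/@title}' curlies='{{}}'/>";

    // Applies a stylesheet to a source document and returns the serialized result.
    static String transform(String stylesheet, String source) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(source)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, SOURCE));
    }
}
```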

Attribute value templates are also allowed on a small subset of XSLT instructions. (You can tell immediately whether a given XSLT instruction attribute accepts attribute value templates by the presence of the surrounding { } in the syntax guide for that instruction and attribute in the XSLT specification or the chart in Figure 5.)

----------------------------------------
Figure 5
XSLT Instructions

Instruction Syntax Description

xsl:copy-of
<xsl:copy-of select = expression />
Emits the node-set corresponding to the select expression.

xsl:value-of
<xsl:value-of select = string-expression disable-output-escaping = "yes" | "no" />
Emits the string corresponding to the select expression.

xsl:if
<xsl:if test = boolean-expression> <!-- Content: template --> </xsl:if>
Evaluates the template if and only if the test expression evaluates to true.

xsl:choose
<xsl:choose> <!-- Content: (xsl:when+, xsl:otherwise?) --> </xsl:choose>
Evaluates the template from the first xsl:when clause whose test expression evaluates to true. If none of the test expressions evaluate to true, then the template contained in the xsl:otherwise clause is evaluated.

xsl:for-each
<xsl:for-each select = node-set-expression> <!-- Content: (xsl:sort*, template) --> </xsl:for-each>
Evaluates the template against each node in the node-set returned by the select expression. The order of evaluation can be influenced using one or more xsl:sorts.

xsl:call-template
<xsl:call-template name = qname> <!-- Content: xsl:with-param* --> </xsl:call-template>
Invokes the template rule named by name.

xsl:variable
<xsl:variable name = qname select = expression> <!-- Content: template --> </xsl:variable>
Declares a variable named name and initializes it using the select expression or template.

xsl:text
<xsl:text disable-output-escaping = "yes" | "no"> <!-- Content: #PCDATA --> </xsl:text>
Emits the text found in #PCDATA. Escaping of the five built-in entities is controlled using disable-output-escaping.

xsl:number
<xsl:number level = "single" | "multiple" | "any" count = pattern from = pattern value = number-expression format = { string } lang = { nmtoken } letter-value = { "alphabetic" | "traditional" } grouping-separator = { char } grouping-size = { number } />
Emits a number based on the XPath number expression found in value.

xsl:copy
<xsl:copy use-attribute-sets = qnames> <!-- Content: template --> </xsl:copy>
Copies the current context node (and associated namespace nodes) to the result tree fragment.

xsl:apply-templates
<xsl:apply-templates select = node-set-expression mode = qname> <!-- Content: (xsl:sort | xsl:with-param)* --> </xsl:apply-templates>
Invokes the best-match template rules against the node-set returned by the select expression.

xsl:apply-imports
<xsl:apply-imports />
Promotes the current stylesheet in import precedence.

xsl:message
<xsl:message terminate = "yes" | "no"> <!-- Content: template --> </xsl:message>
Emits a message in a processor-dependent manner.

xsl:fallback
<xsl:fallback> <!-- Content: template --> </xsl:fallback>
Evaluates the template when the parent instruction/directive is not supported by the current processor.

xsl:comment
<xsl:comment> <!-- Content: template --> </xsl:comment>
Emits an XML comment containing the template as its character data.

xsl:processing-instruction
<xsl:processing-instruction name = { ncname }> <!-- Content: template --> </xsl:processing-instruction>
Emits an XML processing instruction whose [target] is name and whose [children] are based on template.

xsl:element
<xsl:element name = { qname } namespace = { uri-reference } use-attribute-sets = qnames> <!-- Content: template --> </xsl:element>
Emits an XML element whose [local name] is name, whose [namespace URI] is namespace, and whose [children] are based on template.

xsl:attribute
<xsl:attribute name = { qname } namespace = { uri-reference }> <!-- Content: template --> </xsl:attribute>
Emits an XML attribute whose [local name] is name, whose [namespace URI] is namespace, and whose [children] are based on template.
----------------------------------------

The { and } only receive special treatment when they appear as attribute [children]. To include source text as element [children], you must use the xsl:value-of instruction. This instruction is similar to xsl:copy-of, the primary difference being that the result of the xsl:value-of select expression is converted to a string prior to replacement.

Consider the following literal result element:

<?xml version='1.0' ?> <doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xsl:version='1.0' ><xsl:value-of select='/book'/></doc>

This XSLT stylesheet indicates that the content of the doc element should consist of the text found by running the XPath expression /book against the source document. Since the select expression is a node-set, it is implicitly converted to a string prior to insertion using the XPath conversion rules. The following is the result document that corresponds to this XSLT stylesheet:

<?xml version='1.0' ?> <doc/>

Note that because the source document contained no character data as element [children], the select expression /book yielded the empty string. Had the select expression been /book/author/@name, the result document would have looked like this

<?xml version='1.0' ?> <doc>Don</doc>

since the node-set to string conversion rules state that the first node is converted to a string-value.
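
The first-node rule is easy to overlook, so here is a small sketch that evaluates both select expressions discussed above. It assumes JAXP (javax.xml.transform) as the processor; the class name ValueOfDemo and its valueOf helper are hypothetical, and the source string is the book document written without inter-element whitespace.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: xsl:value-of converts a node-set to the string-value
// of its first node in document order.
class ValueOfDemo {
    static final String SOURCE =
        "<book title='Essential XML'>"
      + "<author name='Don' age='25' canadian='false'/>"
      + "<author name='Aaron' age='19' canadian='false'/>"
      + "<author name='John' age='20' canadian='true'/></book>";

    // Wraps the given select expression in a literal result element
    // stylesheet, runs it against SOURCE, and returns the serialized result.
    static String valueOf(String select) {
        String stylesheet =
            "<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xsl:version='1.0'><xsl:value-of select='" + select + "'/></doc>";
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SOURCE)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Three name attributes are selected, but only the first is emitted.
        System.out.println(valueOf("/book/author/@name"));
        // The book element has no character data, so nothing is emitted.
        System.out.println(valueOf("/book"));
    }
}
```

To emit every name rather than just the first, iterate over the node-set with xsl:for-each instead of handing the whole node-set to xsl:value-of.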

Conditional Evaluation Instructions

The xsl:copy-of and related instructions are simple mechanisms for importing content from the source document. XSLT also provides a set of conditional evaluation instructions that should be familiar to developers coming from procedural programming languages. The xsl:if instruction is equivalent to the if statement in C++ and the Java language. The xsl:choose, xsl:when, and xsl:otherwise instructions are equivalent to the switch, case, and default constructs in C++ and the Java language.

The simplest of these is, of course, the xsl:if instruction. An element representing an xsl:if instruction must have a test attribute that contains a Boolean XPath expression. If this expression evaluates to true, then the [children] of the xsl:if element (which is itself an XSLT template) will be processed. If the expression evaluates to false, then the [children] of the xsl:if element are ignored.

Consider the following XSLT literal result element:

<?xml version='1.0' ?>
<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xsl:version='1.0'>
  <xsl:if test='count(//author) > 4'>
    <cacophony/>
  </xsl:if>
</doc>

This XSLT stylesheet uses the xsl:if instruction to test the number of author elements. If the source document has more than four author elements, the result document will look like this:

<?xml version='1.0' ?> <doc><cacophony/></doc>

If the source document has four or fewer author elements, the result document will look like this:

<?xml version='1.0' ?> <doc/>

Note that the XPath expression used in the test attribute is a Boolean expression similar to that found in an XPath predicate, rather than a full-blown XPath location path.

The xsl:choose instruction provides the functionality of a Java-language or C++ switch statement. The xsl:choose instruction has one or more xsl:when clauses and an optional xsl:otherwise clause. The xsl:when and xsl:otherwise elements are represented as [children] of a containing xsl:choose element. Each xsl:when element must have a test attribute that contains a Boolean XPath expression. Unlike a Java-language or C++ switch, multiple xsl:when clauses may have tests that evaluate to true. To resolve the case of having more than one clause being true, XSLT only allows the first xsl:when clause whose test is successful to be processed.

Take a look at the following XSLT literal result element:

<?xml version='1.0' ?>
<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xsl:version='1.0'>
  <xsl:choose>
    <xsl:when test='count(//author) = 1'>
      <soloist/>
    </xsl:when>
    <xsl:when test='count(//author) &lt; 5'>
      <ensemble/>
    </xsl:when>
    <xsl:otherwise>
      <cacophony/>
    </xsl:otherwise>
  </xsl:choose>
</doc>

If the source document has exactly one author element, the result document will look like this:

<?xml version='1.0' ?> <doc><soloist/></doc>

If instead the source document has between two and four author elements (or no author elements at all), the result document will look like this:

<?xml version='1.0' ?> <doc><ensemble/></doc>

Under any other circumstances, the result document will look like this:

<?xml version='1.0' ?> <doc><cacophony/></doc>

Note that the xsl:otherwise has no test clause and fills the role of the default clause from a Java-language or C++ switch statement. Also note that the order of xsl:when clauses is important. Had the two xsl:when clauses appeared in reverse order, the <soloist/> element would never be emitted, as the expression

count(//author) = 1

would be masked by the expression ahead of it

count(//author) < 5

since 1 is less than 5.
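
The first-match rule and the masking effect can be exercised with a sketch like the following. It assumes JAXP (javax.xml.transform) as the processor; the class name ChooseDemo, the classify helper, and the generated source documents are illustrative. Note that the less-than sign must be written as &lt; inside the test attribute for the stylesheet to remain well-formed XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: only the first xsl:when whose test succeeds is processed.
class ChooseDemo {
    static final String HEAD =
        "<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xsl:version='1.0'><xsl:choose>";
    static final String SOLO =
        "<xsl:when test='count(//author) = 1'><soloist/></xsl:when>";
    static final String ENSEMBLE =
        "<xsl:when test='count(//author) &lt; 5'><ensemble/></xsl:when>";
    static final String TAIL =
        "<xsl:otherwise><cacophony/></xsl:otherwise></xsl:choose></doc>";

    // Clause order as shown in the text, and the reversed order.
    static final String ARTICLE_ORDER = HEAD + SOLO + ENSEMBLE + TAIL;
    static final String REVERSED_ORDER = HEAD + ENSEMBLE + SOLO + TAIL;

    // Runs the stylesheet against a book with the given number of authors.
    static String classify(String stylesheet, int authorCount) {
        StringBuilder source = new StringBuilder("<book>");
        for (int i = 0; i < authorCount; i++) source.append("<author/>");
        source.append("</book>");
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(source.toString())),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++)
            System.out.println(n + ": " + classify(ARTICLE_ORDER, n));
        // With the clauses reversed, a single author no longer yields soloist.
        System.out.println("reversed, 1: " + classify(REVERSED_ORDER, 1));
    }
}
```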

Repetition and Looping Instructions

In addition to conditional instructions, XSLT also provides instructions for repetition and looping. The most basic of these instructions is xsl:for-each, which relies on an XPath expression to produce a node-set that controls the iteration of the loop. When an xsl:for-each element is encountered, the XPath expression found in its select attribute is evaluated. Each node from the resultant node-set is then fed into the [children] of the xsl:for-each element for further processing.

Consider this XSLT literal result element:

<?xml version='1.0' ?>
<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xsl:version='1.0'>
  <xsl:for-each select='/book/author'>
    <by/>
  </xsl:for-each>
</doc>

Assuming the book source document shown earlier (the one containing three author elements), the following result document will be produced:

<?xml version='1.0' ?> <doc><by/><by/><by/></doc>

This xsl:for-each example is not all that interesting because its content does not rely on the selected node from the node-set. Accessing content from within an xsl:for-each instruction requires a bit more explanation.

XPath expressions that appear in XSLT stylesheets are evaluated relative to a context. This context consists of at least an XPath node and node-set. For absolute location paths, this context is largely immaterial. For relative location paths, however, this context means everything. For example, the following XSLT instruction requires some notion of which node you are talking about and the node-set to which it belongs:

<xsl:value-of select='position()' />

For the examples shown prior to this discussion of the xsl:for-each instruction, the context node is the root node of the source document and the context node-set is the set that contains only the root node. But certain XSLT constructs can change the context used to evaluate XPath expressions. The xsl:for-each instruction is one such construct.

The xsl:for-each instruction alters the context of all XPath expressions found in its [children]. The context node-set is the node-set returned from the select expression. The context node changes for each iteration of the loop. For the nth iteration of the loop, the context node is the nth node of the context node-set.

For example, take a look at this XSLT literal result element:

<?xml version='1.0' ?>
<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xsl:version='1.0'>
  <xsl:for-each select='/book/author'>
    <by id='{position()}'>
      <xsl:value-of select='@name' />
    </by>
  </xsl:for-each>
</doc>

The two XPath expressions that appear inside of the xsl:for-each instruction, position() and @name, will be evaluated against the current node from the node-set produced by the select expression. Given this XSLT stylesheet and the book source document shown earlier, the result document would look like this:

<?xml version='1.0' ?> <doc><by id='1'>Don</by><by id='2'>Aaron</by><by id='3' >John</by></doc>
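
As a sketch of how the context changes on each iteration, the following program runs the same stylesheet through JAXP (javax.xml.transform), which is assumed here as the processor. The class name ForEachDemo is illustrative, and the source string is the book document written without inter-element whitespace.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: position() and @name are evaluated relative to the
// author element selected for the current iteration.
class ForEachDemo {
    static final String SOURCE =
        "<book title='Essential XML'>"
      + "<author name='Don' age='25' canadian='false'/>"
      + "<author name='Aaron' age='19' canadian='false'/>"
      + "<author name='John' age='20' canadian='true'/></book>";

    static final String STYLESHEET =
        "<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xsl:version='1.0'><xsl:for-each select='/book/author'>"
      + "<by id='{position()}'><xsl:value-of select='@name'/></by>"
      + "</xsl:for-each></doc>";

    // Applies STYLESHEET to SOURCE and returns the serialized result.
    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SOURCE)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```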

Note that in this case the node-set produced by the xsl:for-each select expression contained each of the author elements from the source document.

By default, the xsl:for-each instruction iterates over its node-set in document order. This behavior can be changed using the xsl:sort instruction. The xsl:sort instruction elements must appear as the initial [children] of an xsl:for-each element and must adhere to the following syntax:

<xsl:sort select = string-expression lang = { nmtoken } data-type = { "text" | "number" | qname-but-not-ncname } order = { "ascending" | "descending" } case-order = { "upper-first" | "lower-first" } />

The most important attribute is the select attribute, which specifies an XPath expression that is used as the sort key for the node-set. In the absence of a select attribute, the XSLT processor assumes "." as the select expression.

To see xsl:sort in action, consider the following XSLT literal result element:

<?xml version='1.0' ?>
<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xsl:version='1.0'>
  <xsl:for-each select='/book/author'>
    <xsl:sort select='@name' />
    <by><xsl:value-of select='@name' /></by>
  </xsl:for-each>
</doc>

When the source document is fed through this XSLT stylesheet, the following result document would be produced:

<?xml version='1.0' ?> <doc><by>Aaron</by><by>Don</by><by>John</by></doc>

You should note that the node-set produced by the XPath expression /book/author was processed in sorted order based on the name attribute. Had the following xsl:sort instruction been used instead

<xsl:sort select='@age' order='descending' data-type='number' />

the node-set would have been processed in descending order based on the age attribute, resulting in this document:

<?xml version='1.0' ?> <doc><by>Don</by><by>John</by><by>Aaron</by></doc>

The data-type attribute controls how the ordering of the value space is to be interpreted. There are two built-in constants, text and number, whose meaning is obvious. Additionally, the QName referring to an XML Schema datatype is supported in XSLT, but at the time of this writing no processors supported sorting based on schema types.

The xsl:sort instruction can also be used to specify more than one sort key for a node-set. The first xsl:sort instruction encountered within the content of an xsl:for-each instruction is considered the primary sort key. Subsequent xsl:sort instructions are interpreted as secondary sort keys. For example, consider the following XSLT literal result element:

<?xml version='1.0' ?>
<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
     xsl:version='1.0'>
  <xsl:for-each select='/book/author'>
    <xsl:sort select='@canadian' order='descending'/>
    <xsl:sort select='@name' />
    <by><xsl:value-of select='@name' /></by>
  </xsl:for-each>
</doc>

Due to the multiply sorted xsl:for-each instruction, this XSLT document would yield the following result:

<?xml version='1.0' ?> <doc><by>John</by><by>Aaron</by><by>Don</by></doc>

Because the canadian attribute is used by the first xsl:sort instruction, it dominates the sort order.
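
The three sort variations discussed in this section can be compared side by side with a sketch such as this one. JAXP (javax.xml.transform) is assumed as the processor; the class name SortDemo and its sortedBy helper are hypothetical, and the source string is the book document written without inter-element whitespace.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: xsl:sort elements control the iteration order.
class SortDemo {
    static final String SOURCE =
        "<book title='Essential XML'>"
      + "<author name='Don' age='25' canadian='false'/>"
      + "<author name='Aaron' age='19' canadian='false'/>"
      + "<author name='John' age='20' canadian='true'/></book>";

    // Places the given xsl:sort elements at the start of the xsl:for-each
    // body, runs the stylesheet against SOURCE, and returns the result.
    static String sortedBy(String sortInstructions) {
        String stylesheet =
            "<doc xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xsl:version='1.0'><xsl:for-each select='/book/author'>"
          + sortInstructions
          + "<by><xsl:value-of select='@name'/></by>"
          + "</xsl:for-each></doc>";
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SOURCE)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Text sort on name.
        System.out.println(sortedBy("<xsl:sort select='@name'/>"));
        // Numeric sort on age, largest first.
        System.out.println(sortedBy(
            "<xsl:sort select='@age' order='descending' data-type='number'/>"));
        // Primary key canadian (descending), secondary key name.
        System.out.println(sortedBy(
            "<xsl:sort select='@canadian' order='descending'/>"
          + "<xsl:sort select='@name'/>"));
    }
}
```

Omitting data-type='number' from the age sort would compare the ages as text, which happens to give the same order for this data but would not for values such as 9 and 10.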

Template-based Programming

The discussion so far has used a literal result element as the stylesheet. This is effective for simple transformations, but due to its linear structure it is impossible to modularize into smaller chunks of reusable XSLT. For that reason, most nontrivial XSLT stylesheets don't use literal result elements as stylesheets. Instead they use the explicit xsl:stylesheet format.

An xsl:stylesheet is primarily a collection of one or more template rules. Template rules play the role of functions in XSLT and always appear as top-level [children] of the xsl:stylesheet element. A template rule binds a QName or pattern to a template. As we mentioned earlier, a template is a collection of literal result elements and XSLT instructions. The QName is simply a symbolic name that is associated with a template. The following is a named XSLT template rule:

<xsl:template name='emitSignature' >
  <sig><xsl:value-of select='/book/@title'/></sig>
</xsl:template>

This template rule can be called from other templates using the xsl:call-template instruction:

<xsl:template name='enchilada' >
  <doc>
    <xsl:call-template name='emitSignature' />
  </doc>
</xsl:template>

Assuming the book source document shown earlier, the enchilada template rule would yield the following result:

<doc><sig>Essential XML</sig></doc>

To invoke the enchilada template rule, you simply use an xsl:call-template instruction somewhere else in the stylesheet.
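
Here is a minimal sketch of a complete explicit stylesheet built around the two named template rules. The entry-point rule with match='/' is an assumption added so that something calls enchilada when the transformation starts; JAXP (javax.xml.transform) is assumed as the processor, and the class name CallTemplateDemo is illustrative.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: one named template rule calling another.
class CallTemplateDemo {
    static final String SOURCE = "<book title='Essential XML'/>";

    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        // Assumed entry point: matches the root node and calls enchilada.
      + "<xsl:template match='/'>"
      + "<xsl:call-template name='enchilada'/></xsl:template>"
      + "<xsl:template name='emitSignature'>"
      + "<sig><xsl:value-of select='/book/@title'/></sig></xsl:template>"
      + "<xsl:template name='enchilada'>"
      + "<doc><xsl:call-template name='emitSignature'/></doc></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies STYLESHEET to SOURCE and returns the serialized result.
    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SOURCE)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```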

Like functions, template rules can be invoked with parameters. To support parameters, the template rule's [children] may begin with one or more xsl:param instructions that declare a named parameter and set its default value. All template parameters are added to the context of the template and are available to XPath expressions by prefixing the parameter name with a $, as in $arg1. The syntax for the xsl:param instruction is defined as follows:

<xsl:param name = qname select = expression> <!-- Content: template --> </xsl:param>

A parameter declaration can use either an XPath expression (in its select attribute) or a template (as its [children]) to set the default value of the parameter.

The following named template rule declares and uses two template parameters:

<xsl:template name='emitTop' >
  <xsl:param name='arg1' select='/book/author[2]/@name' />
  <xsl:param name='arg2' >true</xsl:param>
  <top>
    <xsl:if test='$arg2' >
      <sometimes/>
    </xsl:if>
    <one><xsl:value-of select='$arg1' /></one>
  </top>
</xsl:template>

If this template rule were to be invoked with no parameters, the default values would be derived based on either the select attribute or [children] of each xsl:param instruction. Given the book source document shown earlier, that would yield the following result tree fragment:

<top> <sometimes/> <one>Aaron</one> </top>

To invoke a template rule using parameters, simply provide one or more xsl:with-param instructions as [children] to the xsl:call-template instruction.

<xsl:call-template name='emitTop' >
  <xsl:with-param name='arg1' >Hello</xsl:with-param>
  <xsl:with-param name='arg2' select='false()' />
</xsl:call-template>

Other than the element name, the syntax of the xsl:with-param instruction is identical to that of the xsl:param instruction. Given this invocation of emitTop, the following result tree fragment would be produced:

<top> <one>Hello</one> </top>

Note that because arg2 evaluated to the Boolean value false, the [children] of the xsl:if instruction are never evaluated.
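
The two invocations of emitTop can be compared with the following sketch. It assumes JAXP (javax.xml.transform) as the processor and adds a match='/' rule as an entry point; both that rule and the names ParamDemo and callEmitTop are illustrative rather than part of the article's code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: default parameter values versus xsl:with-param.
class ParamDemo {
    static final String SOURCE =
        "<book title='Essential XML'>"
      + "<author name='Don'/><author name='Aaron'/><author name='John'/>"
      + "</book>";

    // Calls emitTop with the given xsl:with-param elements (possibly none)
    // and returns the serialized result.
    static String callEmitTop(String withParams) {
        String stylesheet =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            // Assumed entry point that performs the call.
          + "<xsl:template match='/'><xsl:call-template name='emitTop'>"
          + withParams + "</xsl:call-template></xsl:template>"
          + "<xsl:template name='emitTop'>"
          + "<xsl:param name='arg1' select='/book/author[2]/@name'/>"
          + "<xsl:param name='arg2'>true</xsl:param>"
          + "<top><xsl:if test='$arg2'><sometimes/></xsl:if>"
          + "<one><xsl:value-of select='$arg1'/></one></top>"
          + "</xsl:template></xsl:stylesheet>";
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(SOURCE)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No parameters supplied: the defaults apply.
        System.out.println(callEmitTop(""));
        // Both parameters overridden by the caller.
        System.out.println(callEmitTop(
            "<xsl:with-param name='arg1'>Hello</xsl:with-param>"
          + "<xsl:with-param name='arg2' select='false()'/>"));
    }
}
```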

The xsl:param element typically appears as [children] of a template rule. XSLT also allows the stylesheet itself to accept parameters. These stylesheet-level parameter declarations must appear as [children] of the xsl:stylesheet element. Exactly how an XSLT processor initializes the values of these parameters is processor-specific. The sidebar "Using XSLT from MSXML" explains how MSXML deals with stylesheet-level parameters.
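
As one illustration of a processor-specific mechanism, JAXP (javax.xml.transform) exposes stylesheet-level parameters through Transformer.setParameter. The sketch below assumes JAXP rather than the MSXML interfaces covered in the sidebar; the parameter name greeting, the class name GlobalParamDemo, and the greet helper are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: supplying a stylesheet-level parameter from Java.
class GlobalParamDemo {
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        // Stylesheet-level parameter with a default value.
      + "<xsl:param name='greeting'>Hello</xsl:param>"
      + "<xsl:template match='/'>"
      + "<doc><xsl:value-of select='$greeting'/></doc>"
      + "</xsl:template></xsl:stylesheet>";

    // Runs the stylesheet; a non-null override replaces the default value.
    static String greet(String override) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            if (override != null) {
                t.setParameter("greeting", override);
            }
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<book/>")),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(greet(null));       // uses the default
        System.out.println(greet("Bonjour"));  // uses the supplied value
    }
}
```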

In addition to named parameters, XSLT also supports named variables via the xsl:variable instruction. The syntax of the xsl:variable instruction is identical to that of xsl:param except for the element name. The difference between xsl:param and xsl:variable is that parameters, unlike variables, can have their initial values overridden at template invocation time by using xsl:with-param. Additionally, xsl:param instructions must appear at the top of the template in which they are included; xsl:variable instructions can appear anywhere an instruction is allowed. In either case, a given variable or parameter name can only be defined once per template. Variables and parameters defined as [children] of an xsl:stylesheet element are global in scope and are visible across all templates, but a template can hide the global definition by defining a variable or parameter with the same name. So this is a legal XSLT stylesheet:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:param name='bob' select='2' /> <xsl:template name='steve'> <xsl:variable name='bob' select='4'/> </xsl:template> </xsl:stylesheet>

but the following is not since the name bob is defined more than once in the same template:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template name='steve'> <xsl:param name='bob' select='2' /> <xsl:variable name='bob' select='4'/> </xsl:template> </xsl:stylesheet>

It is important to note that there is no way to modify the value of a variable or parameter once it has been defined. This is because XSLT is a functional programming language, not an imperative one like C++ or Java. Unlike C++ or Java-language functions, XSLT templates cannot have side effects. Rather, they can only produce results that act as output of the overall stylesheet or as input to other templates. Part of the reason for this is to simplify the XSLT processing model. Another reason is to support the parallel or out-of-order execution of template rules, as the execution of one rule is guaranteed not to interfere with the execution of another.
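
Because values cannot be reassigned, work that an imperative language would do with a loop counter is usually expressed through recursion. The following is a sketch only; the template name repeat and the parameter name count are hypothetical.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <!-- Hypothetical helper: emits one <item/> per unit of $count. -->
  <xsl:template name='repeat'>
    <xsl:param name='count' select='0' />
    <xsl:if test='$count &gt; 0'>
      <item/>
      <!-- Nothing is assigned here. The recursive call binds a
           fresh $count for the next invocation. -->
      <xsl:call-template name='repeat'>
        <xsl:with-param name='count' select='$count - 1' />
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
  <xsl:template match='/'>
    <items>
      <xsl:call-template name='repeat'>
        <xsl:with-param name='count' select='3' />
      </xsl:call-template>
    </items>
  </xsl:template>
</xsl:stylesheet>
```

Each invocation sees its own binding of count, so the items element should end up with three item children without any variable ever being modified.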

Namespaces and XSLT

The example source and stylesheet documents used throughout this article have not used namespaces other than the required affiliation of XSLT instructions and constructs with the XSLT namespace. This was intentional in order to make the literal result elements and XPath expressions more compact and readable. In general this is unrealistic as most interesting XML documents rely on namespaces quite heavily.

Recall that non-prefixed QNames that appear in XPath expressions are assumed to be unaffiliated with respect to namespaces. This is true irrespective of any default namespace declaration that may be in effect. In contrast, prefixed QNames that appear in XPath expressions are always expanded prior to being used in a NodeTest or other QName-aware construct. Whenever an XPath expression appears as an attribute value of an XSLT element, XSLT adds the declarations found in the [in-scope namespace declarations] property (sans any default namespace declaration that may be in-scope) to the XPath evaluation context.
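
The difference between prefixed and non-prefixed names can be sketched as follows. The source document is an assumption made for this illustration: a single book element in the http://awl.com namespace. The result element names counts, plain, and qualified are likewise hypothetical.

```xml
<!-- Assumed source document:
     <book xmlns='http://awl.com' title='Essential XML'/> -->
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns='http://awl.com'
    xmlns:awl='http://awl.com'>
  <xsl:template match='/'>
    <counts>
      <!-- Unprefixed name: the default namespace declaration is
           ignored, so this looks for a book in no namespace. -->
      <plain><xsl:value-of select='count(/book)' /></plain>
      <!-- Prefixed name: awl is expanded to http://awl.com. -->
      <qualified><xsl:value-of select='count(/awl:book)' /></qualified>
    </counts>
  </xsl:template>
</xsl:stylesheet>
```

Against the assumed source, the first count should be 0 and the second 1. Note that the default namespace declaration still applies to the literal result elements themselves; it is only the XPath expressions that disregard it.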

When processing a literal result element, all namespace declarations that map to http://www.w3.org/1999/XSL/Transform are stripped out of the result. (All XSLT-specific attributes that are used when a literal result element acts as a stylesheet, such as xsl:version, disappear during the translation as well.) However, all other namespace declarations that appear in literal result elements are considered significant and will appear in the result document. This behavior can be suppressed using the xsl:exclude-result-prefixes attribute. This attribute can appear at the root element of the stylesheet document and contains a space-delimited list of namespace prefixes. Namespace declarations whose prefixes appear in this list will not appear in the result document.

For example, here is an XSLT literal result element that makes use of namespaces:

<?xml version='1.0' ?> <dm:out xmlns='http://example.com' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:dm='http://www.develop.com/Schemas/doc' xmlns:awl='http://awl.com' xsl:version='1.0' ><xsl:value-of select='/awl:book/@title'/></dm:out>

As shown, this stylesheet would produce the following output when it is evaluated:

<?xml version='1.0' ?> <dm:out xmlns='http://example.com' xmlns:dm='http://www.develop.com/Schemas/doc' xmlns:awl='http://awl.com' >Essential XML</dm:out>

Because the awl namespace declaration existed only as context for the XPath expression in the template, it is superfluous in the result document. To suppress this unnecessary namespace declaration, the literal result element could be written as follows:

<?xml version='1.0' ?> <dm:out xmlns='http://example.com' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:dm='http://www.develop.com/Schemas/doc' xmlns:awl='http://awl.com' xsl:exclude-result-prefixes='awl' xsl:version='1.0' ><xsl:value-of select='/awl:book/@title'/></dm:out>

This stylesheet would have yielded a result document with no awl namespace declaration:

<?xml version='1.0' ?> <dm:out xmlns='http://example.com' xmlns:dm='http://www.develop.com/Schemas/doc' >Essential XML</dm:out>

To exclude the default namespace declaration as well, include the pseudo-prefix #default in the list of excluded prefixes. Had the exclude-result-prefixes attribute from the previous example looked like this

xsl:exclude-result-prefixes='awl #default'

the result document would have looked like this:

<?xml version='1.0' ?> <dm:out xmlns:dm='http://www.develop.com/Schemas/doc' >Essential XML</dm:out>

Note that when an explicit xsl:stylesheet element is used, the exclude-result-prefixes attribute (like the version attribute) must not be prefixed.
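
For comparison, here is a sketch of the same transformation written with an explicit xsl:stylesheet element. It is an illustrative rewrite rather than code from the article, and it assumes a source document whose root is a book element in the http://awl.com namespace.

```xml
<?xml version='1.0' ?>
<!-- version and exclude-result-prefixes carry no xsl: prefix here
     because they appear on an element in the XSLT namespace. -->
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:dm='http://www.develop.com/Schemas/doc'
    xmlns:awl='http://awl.com'
    exclude-result-prefixes='awl'>
  <xsl:template match='/'>
    <dm:out><xsl:value-of select='/awl:book/@title' /></dm:out>
  </xsl:template>
</xsl:stylesheet>
```

As in the literal result element form, the awl declaration is needed only by the XPath expression and should be absent from the result.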
The fact that all namespace declarations mapping to the XSLT namespace URI are suppressed introduces an interesting puzzle: how do you write a stylesheet that generates a stylesheet as its result document? Consider the following stylesheet document:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template name='emitStylesheet' > <xsl:stylesheet version='1.0'> <xsl:template name='{//@procName}' /> </xsl:stylesheet> </xsl:template> </xsl:stylesheet>

The intent of the previous stylesheet is to produce the following result document:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template name='func2' /> </xsl:stylesheet>

However, there are two problems. First, the xsl:stylesheet element may not legally appear as [children] of an xsl:template. Second, any namespace declarations binding the XSLT namespace URI will be suppressed in the result document. Both of these problems are addressed by xsl:namespace-alias.
The xsl:namespace-alias instruction alters the namespace URI of one namespace declaration by associating it with a second namespace declaration that provides the actual namespace URI to be used in the result document. The syntax of the xsl:namespace-alias is as follows:

<xsl:namespace-alias stylesheet-prefix = prefix | "#default" result-prefix = prefix | "#default" />

The result-prefix indicates which namespace declaration contains the actual namespace URI that will appear in the result document. The stylesheet-prefix indicates which namespace declaration should be redeclared prior to emitting the result document.
Consider the following stylesheet document:

<xslt:stylesheet version='1.0' xmlns:xslt='http://www.w3.org/1999/XSL/Transform' xmlns:xsl='urn:fake:uri'> <xslt:namespace-alias stylesheet-prefix='xsl' result-prefix='xslt' /> <xslt:template name='emitStylesheet' > <xsl:stylesheet version='1.0'> <xsl:template name='{//@procName}' /> </xsl:stylesheet> </xslt:template> </xslt:stylesheet>

Because the [children] of the emitStylesheet template are no longer affiliated with the XSLT namespace, they are treated as literal result elements. However, any namespace declarations for the urn:fake:uri namespace URI will be transformed into declarations for the http://www.w3.org/1999/XSL/Transform namespace URI. That means that the emitStylesheet template would produce the following result document:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template name='func2' /> </xsl:stylesheet>

Had the xsl:namespace-alias instruction not appeared in the stylesheet, the result document would have looked like this:

<xsl:stylesheet version='1.0' xmlns:xsl='urn:fake:uri'> <xsl:template name='func2' /> </xsl:stylesheet>

While it is possible to use namespace aliases with other namespaces, the XSLT namespace URI is the most obvious use of this instruction.

Generated Output

There is a class of transformations that cannot be achieved using literal result elements. If the local name or namespace URI of an element or attribute needs to be calculated based on XSLT processing, a literal result element cannot be used. Conditionally emitting attributes is also complicated when using literal result elements. Additionally, any processing instructions or comments that appear in templates are stripped away during processing.

XSLT defines four instructions that can be used to algorithmically generate specific output nodes (xsl:comment, xsl:processing-instruction, xsl:element, and xsl:attribute). These instructions can be used in place of literal result elements. Because they make extensive use of attribute value templates, they are highly suited to dynamic output formats. Of the four, xsl:comment and xsl:processing-instruction are the easiest to understand. The syntax for these two instructions is as follows:

<xsl:comment>  </xsl:comment> <xsl:processing-instruction name= { ncname }>  </xsl:processing-instruction>

Note that the name attribute of xsl:processing-instruction is an attribute value template, which allows the [target] of the processing instruction to be dynamically generated. To see these instructions in action, consider the following named template:

<xsl:template name='annotated' > <!-- comment number one --> <?Magnum PI?> <xsl:comment>comment number two</xsl:comment> <xsl:processing-instruction name='A'> PI</xsl:processing-instruction> </xsl:template>

When executed, this named template emits the following:

<!--comment number two--><?A PI?>

As mentioned earlier, literal comments and processing instructions that appear in templates are not emitted into the result document.
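
Because the name attribute is an attribute value template, the target of a processing instruction can come from a variable or parameter. The sketch below assumes a hypothetical variable named target and a hypothetical stylesheet file view.xsl.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='/'>
    <!-- Hypothetical variable holding the PI target. -->
    <xsl:variable name='target' select='"xml-stylesheet"' />
    <xsl:comment>generated by an XSLT stylesheet</xsl:comment>
    <!-- {$target} is evaluated when the template runs. -->
    <xsl:processing-instruction name='{$target}'>type='text/xsl' href='view.xsl'</xsl:processing-instruction>
    <doc/>
  </xsl:template>
</xsl:stylesheet>
```

The result should contain a comment, an xml-stylesheet processing instruction, and an empty doc element, in that order.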
The xsl:attribute instruction adds an attribute to the [attributes] property of the current result element. The syntax for the xsl:attribute instruction is shown in Figure 5. The xsl:attribute element's [children] are interpreted as text to produce the [children] of the resultant attribute. The xsl:attribute instruction may not be used after [children] have been added to the current result element.
Take a look at the following named template:

<xsl:template name='simple' > <bob><xsl:attribute name='id'>32</xsl:attribute><x/></bob> </xsl:template>

Because the xsl:attribute instruction appears as [children] of the bob element, the attribute will be associated with that element in the result document. This named template will produce the following result:

<bob id='32'><x/></bob>

Note that the [children] of the id attribute were generated by evaluating the [children] of the xsl:attribute instruction as a template.
The power of the xsl:attribute instruction lies in the fact that it is dynamic. Consider the following named template that uses xsl:if and attribute value templates:

<xsl:template name='fancy' > <xsl:param name='sAttName' select='"bob"' /> <xsl:param name='bUseAttribute' select='true()' /> <xsl:param name='sAttValue' /> <bob> <xsl:if test='$bUseAttribute' > <xsl:attribute name='{$sAttName}'> <xsl:value-of select='$sAttValue' /> </xsl:attribute> </xsl:if><x/> </bob> </xsl:template>

When this named template is invoked like so

<xsl:call-template name='fancy' > <xsl:with-param name='sAttName' >myAttr</xsl:with-param> <xsl:with-param name='bUseAttribute' select='true()' /> <xsl:with-param name='sAttValue'>Hi, XSLT</xsl:with-param> </xsl:call-template>

the following code would be generated as the result:

<bob myAttr='Hi, XSLT'><x/></bob>

As shown here, xsl:attribute provides a great deal of flexibility.
By default, the xsl:attribute instruction emits an attribute that is not affiliated with any namespace. This behavior can be altered in one of two ways. The simplest way to affiliate an xsl:attribute instruction with a namespace is to use the namespace attribute:

<xsl:template name='ns1' > <bob> <xsl:attribute namespace='http://b.com' name='id' >32</xsl:attribute> </bob> </xsl:template>

This template will produce the following result:

<bob xmlns:auto-ns1='http://b.com' auto-ns1:id='32'/>

Note that because only the [local name] property was provided, the XSLT processor needed to manufacture a namespace declaration and prefix. The prefix used can be controlled by specifying a QName instead of an NCName

<xsl:template name='ns1' > <bob> <xsl:attribute namespace='http://b.com' name='b:id' >32</xsl:attribute> </bob> </xsl:template>

which produces the following result:

<bob xmlns:b='http://b.com' b:id='32'/>

An alternative method for affiliating an xsl:attribute instruction with a namespace is to simply use a QName with no corresponding namespace attribute. This causes the XSLT processor to look at the [in-scope namespace declarations] property to derive the namespace URI.

<xsl:template name='ns1' xmlns:b='http://b.com'> <bob> <xsl:attribute name='b:id' >32</xsl:attribute> </bob> </xsl:template>

This template produces a result that is identical to the previous example result.
The fourth instruction in this family is xsl:element. It is an alternative to the literal result element. Both the xsl:element instruction and the literal result element cause an element to be emitted in the result document. Only the xsl:element instruction can generate the element name dynamically using XPath expressions. The syntax for the xsl:element instruction is shown in Figure 5.
The basic usage model of xsl:element is the same as for a literal result element. For example, the following named template that uses a literal result element

<xsl:template name='elems' > <bob xmlns='http://example.com'><steve/></bob> </xsl:template>

could be rewritten like this

<xsl:template name='elems' > <xsl:element name='bob' namespace='http://example.com'> <xsl:element name='steve' namespace='http://example.com'/> </xsl:element> </xsl:template>

or using this hybrid style:

<xsl:template name='elems' > <xsl:element name='bob' namespace='http://example.com'> <steve xmlns='http://example.com'/> </xsl:element> </xsl:template>

The namespace prefixes used by xsl:element are controlled the same way as for xsl:attribute.
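
To illustrate an element name that is computed at run time, the following sketch turns each attribute of a book element into a child element of the same name. It assumes an unqualified source such as <book title='Essential XML'/>, which is an assumption of this example.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='book'>
    <book>
      <xsl:for-each select='@*'>
        <!-- The attribute value template {name()} supplies the
             element name from the current attribute node. -->
        <xsl:element name='{name()}'>
          <xsl:value-of select='.' />
        </xsl:element>
      </xsl:for-each>
    </book>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed source, the result should be a book element with a title child containing the text Essential XML. A literal result element could not express this, since the element name is not known when the stylesheet is written.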
As shown earlier, there are multiple ways to add to the [attributes] property of the result element. The most obvious way is to use the xsl:attribute instruction. The less obvious way is to use attribute sets. Attribute sets are named collections of xsl:attribute instructions that appear as [children] of the xsl:stylesheet element. Attribute sets are named by a QName and referenced by the use-attribute-sets attribute of the xsl:element instruction or a literal result element.
Take a look at the stylesheet shown in Figure 6.

----------------------------------------
Figure 6
Using Attribute Sets
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform' >
<xsl:attribute-set name='myAttrs' >
 <xsl:attribute name='a'>a-val</xsl:attribute>
 <xsl:attribute name='b'>b-val</xsl:attribute>
</xsl:attribute-set>
<xsl:attribute-set name='yourAttrs' >
 <xsl:attribute name='c'>c-val</xsl:attribute>
</xsl:attribute-set>
<xsl:template name='elems2' >
 <xsl:element name='bob'
 use-attribute-sets='myAttrs yourAttrs' >
 <steve xsl:use-attribute-sets='yourAttrs' />
 </xsl:element>
</xsl:template>
</xsl:stylesheet>
----------------------------------------

Given this stylesheet, the elems2 template would produce the following result:

<bob a='a-val' b='b-val' c='c-val' > <steve c='c-val'/> </bob>

It is also legal for one attribute set to include other attribute sets using the use-attribute-sets attribute in its definition.
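
A sketch of one attribute set building on another follows. The set names baseAttrs and allAttrs are hypothetical.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:attribute-set name='baseAttrs'>
    <xsl:attribute name='a'>a-val</xsl:attribute>
  </xsl:attribute-set>
  <!-- allAttrs pulls in baseAttrs and then adds its own attribute. -->
  <xsl:attribute-set name='allAttrs' use-attribute-sets='baseAttrs'>
    <xsl:attribute name='b'>b-val</xsl:attribute>
  </xsl:attribute-set>
  <xsl:template match='/'>
    <!-- On a literal result element the attribute is xsl-prefixed. -->
    <bob xsl:use-attribute-sets='allAttrs' />
  </xsl:template>
</xsl:stylesheet>
```

The emitted bob element should carry both the a and b attributes, even though it references only allAttrs.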

Multiple Stylesheets
XSLT provides two mechanisms for breaking an xsl:stylesheet into multiple documents. The simplest mechanism is the xsl:include directive, which references an external xsl:stylesheet by its URI. When an xsl:include directive appears as a child of an xsl:stylesheet, the XSLT directives found in the referenced document will be inserted into the current document just as if they were defined inline. For example, consider the two XSLT stylesheets shown in Figure 7.

Figure 7

Employing xsl:include

<?xml version='1.0' ?>
<!-- stylesheeta.xsl -->
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:template name='func-a' ><a/></xsl:template>
</xsl:stylesheet>

<?xml version='1.0' ?>
<!-- stylesheetb.xsl -->
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:include href='stylesheeta.xsl' />
<xsl:template name='func-b' >
 <xsl:call-template name='func-a' />
</xsl:template>
</xsl:stylesheet>

The second stylesheet uses the xsl:include directive to include the directives from the first. The resulting stylesheet is equivalent to the following:

<?xml version='1.0' ?>
<!-- stylesheetb.xsl -->
<xsl:stylesheet version='1.0' 
           xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template name='func-a' ><a/></xsl:template>
  <xsl:template name='func-b' >
    <xsl:call-template name='func-a' />
  </xsl:template>
</xsl:stylesheet>

Defining the same template name more than once produces an error, so it is dangerous to xsl:include a document multiple times.

In addition to wholesale inclusion, XSLT supports a collision-aware import mechanism using its xsl:import directive. Like xsl:include, xsl:import causes the contents of an external stylesheet to be merged with the importing stylesheet. The distinction between xsl:import and xsl:include is only evident when there are name collisions. When using xsl:include, if the external stylesheet defines a named construct (such as a template rule) with the same identifier as one in the including stylesheet, an error is generated. When using xsl:import, it is expected that the external stylesheet may define a named construct with the same identifier as one in the importing stylesheet. In the case of such a collision, the construct in the importing stylesheet takes precedence.

In fact, there is a hierarchy for determining precedence for colliding definitions. If an xsl:import directive results in a collision with a construct defined in the importing stylesheet, the construct in the imported stylesheet is hidden by the construct in the importing stylesheet. If two xsl:import directives in the same stylesheet result in a collision, the construct in the first imported stylesheet is hidden by the construct in the second imported stylesheet.

To maintain sanity, all xsl:import directives must appear as the initial [children] of an xsl:stylesheet element. It is illegal for an xsl:import element to appear after an xsl:include, xsl:template, or any other top-level stylesheet element.

Figure 8 shows a collection of stylesheets that use xsl:import to form an aggregate stylesheet.

Figure 8

xsl:import Stylesheets

<?xml version='1.0' ?>
<!-- root.xsl -->
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:import href='first.xsl' />
<xsl:import href='second.xsl' />
<xsl:template name='func-b' >
 <xsl:call-template name='func-a' />
</xsl:template>
</xsl:stylesheet>

<?xml version='1.0' ?>
<!-- first.xsl -->
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:template name='func-a' ><first/></xsl:template>
</xsl:stylesheet>

<?xml version='1.0' ?>
<!-- second.xsl -->
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:import href='third.xsl' />
<xsl:template name='func-a' ><second/></xsl:template>
</xsl:stylesheet>

<?xml version='1.0' ?>
<!-- third.xsl -->
<xsl:stylesheet version='1.0'
 xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:template name='func-a' ><third/></xsl:template>
</xsl:stylesheet>

Assuming that the root.xsl stylesheet is the initial stylesheet, the result of calling the func-b template would be this:

<second/>

The definition of func-a in first.xsl is hidden by the definition in second.xsl. Even though second.xsl imports third.xsl, because second.xsl is considered the importing stylesheet, its definition of func-a hides the definition in third.xsl.

Pattern-based Templates
The template rules shown so far in this article have been named and invoked based on a symbolic name. XSLT also supports invoking template rules based on pattern-matching. An XSLT pattern is a stylized XPath expression that identifies subsets of the source document. An XSLT pattern contains one or more location paths separated by a |. Location paths contained in XSLT patterns can only use the child and attribute axes. However, a location path in an XSLT pattern can use the // operator, but not the equivalent descendant-or-self axis.
Patterns are associated with a template rule using the match attribute instead of the name attribute.

<xsl:template match='author|illustrator' > <contributor /> </xsl:template>

This template rule matches elements whose name is either author or illustrator. Template rules can also use predicates and multiple location steps.

<xsl:template match='book//author[@name != "Don"]' > <contributor /> </xsl:template>

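This template rule matches author elements that have a book element as an ancestor and that have a name attribute whose [children] is not the string "Don". (An author element with no name attribute at all does not match, because a comparison against an empty node-set is always false.)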
It is common for multiple template rules to match a given node. In those scenarios, the template rule that is actually used is determined by the following rules:

1. Only template rules whose match pattern matches the current node may be chosen.
2. Of the matching template rules, template rules with higher import precedence are always chosen over template rules with lower import precedence.
3. Of the matching template rules with the same import precedence, template rules with higher priority are always chosen over template rules with lower priority. The priority is calculated as follows:

a. If the xsl:template has a priority attribute, the decimal number that's found there is used independent of rules c through f.
b. Templates whose patterns use | to separate multiple alternatives are treated as distinct template rules, one per alternative.
c. Templates whose patterns are of the form child::QName, attribute::QName, or child::processing-instruction(literal) or their abbreviated equivalents are assigned a default priority of 0.
d. Templates whose patterns are of the form child::NCName:* or attribute::NCName:* or their abbreviated equivalents are assigned a default priority of -0.25.
e. Templates whose patterns are of the form child::<node-test> or attribute::<node-test> or their abbreviated equivalents are assigned a default priority of -0.5.
f. Templates whose patterns are not of the forms described in rules c through e and that do not have an explicit priority attribute are assigned a default priority of 0.5.

More than one matching template rule with the same priority and import precedence produces an error. In the case of such an error, an XSLT processor may signal the error or recover by breaking the tie by choosing the template rule that occurs last in the enclosing stylesheet.

These rules are explained in more detail in Section 5.5 of the XSLT specification (see http://www.w3.org/TR/xslt).

In general, the pattern matching rules prefer the most specific match from the stylesheet whose import precedence is highest. Consider the following stylesheet fragment:

<xsl:transform version='1.0' xmlns:a='http://awl.com'
         xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template
             match='/a:book/a:author'><a/></xsl:template>
  <xsl:template match='a:author'        ><b/></xsl:template>
  <xsl:template match='a:*'             ><c/></xsl:template>
  <xsl:template match='*'               ><d/></xsl:template>
</xsl:transform>

Using the XML source document shown in Figure 1, an element node corresponding to an author element would match any of these template rules. Because there is more than one match, the priorities must be taken into account. According to the rules just described, the default priorities for these template rules are (in order) 0.5, 0, -0.25, and -0.5. Since there is a uniquely highest priority template rule (the first rule), that rule would be selected for execution. In contrast, the book element would only match the third and fourth template rules, and since the third rule has a higher priority, that rule would be selected.
Note that these calculations assume that all template rules appear in the same stylesheet and that no importing stylesheet has matching template rules. (In the matching rules, import precedence is the most important factor in choosing a template rule.) Also note that explicit priority attributes would affect the prioritization considerably.
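
The effect of an explicit priority attribute can be sketched as follows. This example assumes the unqualified book and author source document that the article calls Figure 1; the out wrapper element is hypothetical.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='/'>
    <out><xsl:apply-templates select='//author' /></out>
  </xsl:template>
  <!-- Default priority 0: the pattern is a bare QName. -->
  <xsl:template match='author'><b/></xsl:template>
  <!-- The default priority would be -0.5. The explicit priority
       attribute raises this rule above the rule for author. -->
  <xsl:template match='*' priority='1'><d/></xsl:template>
</xsl:stylesheet>
```

Without the priority attribute, each author node would select the second rule and emit b. With it, the wildcard rule should win and each author should produce a d element instead.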
Pattern-based template rules are not invoked using the xsl:call-template instruction. Rather, they are invoked using the xsl:apply-templates instruction.

<xsl:apply-templates select=node-set-expression : node() mode=qname>  </xsl:apply-templates>

The xsl:apply-templates instruction is remarkably similar to the xsl:for-each instruction. Both instructions take a node-set and iteratively apply a template to each node in some order. For an xsl:for-each, the template that is applied is simply the [children] of the xsl:for-each instruction. For an xsl:apply-templates instruction, the template is selected based on pattern matching against all known template rules. Also, like the xsl:for-each instruction, invoking a template rule via xsl:apply-templates alters the context of the template being evaluated based on the node-set returned by the select expression and the current node in the iteration sequence.
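
The mode attribute in the syntax above restricts matching to template rules that declare the same mode, which allows one node to be processed in more than one way. The sketch below assumes an unqualified book and author source document; the mode name toc and the result element names are hypothetical.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='/'>
    <doc>
      <!-- Only rules declared with mode='toc' are candidates here. -->
      <toc><xsl:apply-templates select='book/author' mode='toc' /></toc>
      <!-- Only rules with no mode attribute are candidates here. -->
      <body><xsl:apply-templates select='book/author' /></body>
    </doc>
  </xsl:template>
  <xsl:template match='author' mode='toc'>
    <entry><xsl:value-of select='@name' /></entry>
  </xsl:template>
  <xsl:template match='author'>
    <by><xsl:value-of select='@name' /></by>
  </xsl:template>
</xsl:stylesheet>
```

Each author node is visited twice, once per xsl:apply-templates instruction, and the two rules never compete because they belong to different modes.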
Consider the following XSLT stylesheet that uses pattern matching and xsl:apply-templates to invoke template rules:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template match='author' > <by><xsl:value-of select='@name' /></by> </xsl:template> <xsl:template match='book' > <doc><xsl:apply-templates /></doc> </xsl:template> <xsl:template match='/' > <xsl:apply-templates/> </xsl:template> </xsl:stylesheet>

Assuming the same source document shown in Figure 1, this stylesheet produces the following result document:

<?xml version='1.0' ?> <doc><by>Don</by><by>Aaron</by><by>John</by></doc>

Let's walk through the processing of this stylesheet. XSLT processing always starts by executing the template rule that best matches the root node of the source document. This node is matched by the XSLT pattern /, which is always used to find the initial template rule. If no explicit template rule exists, there is a built-in template rule whose contents exactly match the third template rule in this stylesheet. (We'll have more to say on built-in template rules later in this section.) In the case of this stylesheet, the third template rule will be selected as the initial template rule. That template rule simply invokes the xsl:apply-templates instruction with no select attribute, which, via defaults, is equivalent to:

<xsl:apply-templates select='node()' />

Using the same source document, this select expression will return a node-set containing one element node (book). Since there is only one template rule that matches the element node named book (the second rule), that rule is then invoked using the book element node as the current context node.
Upon evaluating the second template rule, the literal result element doc will be emitted. Inside of that element, however, there is another xsl:apply-templates instruction. The implicit select expression for that instruction will return a node-set containing the three author element nodes. The xsl:apply-templates instruction will find only one matching template rule, the first rule. Upon evaluating that template rule, the literal result element by is encountered with an xsl:value-of as its [children]. Because xsl:apply-templates alters the context upon invocation of a template rule, the relative XPath expression used in the xsl:value-of instruction will be evaluated relative to the current author node.
XSLT defines a set of seven built-in template rules (one per node type) that are implicitly considered during template rule matching. These built-in rules are given the lowest possible import precedence, which means they will only be selected when there is absolutely no viable template rule explicitly defined in any of the available stylesheets. The built-in rule for element and root nodes recursively applies the best-match template to all child nodes:

<xsl:template match='*|/' > <xsl:apply-templates select='node()' /> </xsl:template>

The built-in template rule for text and attribute nodes simply copies the text values through:

<xsl:template match='text()|@*' > <xsl:value-of select='.' /> </xsl:template>

The built-in template rule for comments and processing instruction nodes discards the nodes from further processing:

<xsl:template match='processing-instruction()|comment()' />

The built-in template rule for namespace nodes also discards the nodes from further processing, but since XSLT patterns do not support matching namespace nodes, this built-in template rule cannot be overridden.
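
Any of the overridable built-in rules can be replaced simply by declaring an explicit rule with the same pattern. As a sketch, assuming the unqualified book and author source document used earlier, the following drops stray text instead of copying it through:

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:template match='/'>
    <doc><xsl:apply-templates /></doc>
  </xsl:template>
  <!-- Replaces the built-in rule for text and attribute nodes, so
       unmatched text is discarded rather than emitted. -->
  <xsl:template match='text()|@*' />
  <xsl:template match='author'>
    <by><xsl:value-of select='@name' /></by>
  </xsl:template>
</xsl:stylesheet>
```

The book element has no explicit rule here, so the built-in element rule still recurses into its children. The xsl:value-of instruction reads the name attribute directly and is unaffected by the empty rule.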
It is hard to discuss the built-in template rules and not discuss the xsl:copy instruction. Despite their similar names, the xsl:copy and xsl:copy-of instructions are quite different. The xsl:copy instruction emits a copy of the current context node independent of its type. If the node type is an element node, the associated namespace nodes are copied as well, but the [attributes] and [children] of the element are not copied. To copy these nodes as well, you must include an xsl:apply-templates instruction to recursively copy the subordinate nodes. The following is the stylesheet that performs the identity transformation:

<xsl:transform version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> <xsl:template match='@*|node()' > <xsl:copy> <xsl:apply-templates select='@*|node()'/> </xsl:copy> </xsl:template> </xsl:transform>

Note that this lone template rule matches all attribute nodes and all child nodes including comment, processing instruction, and text nodes. The template itself simply uses the xsl:copy instruction to pass the current node through. The subtemplate inside the xsl:copy instruction then recursively invokes the template rule against all of the child and attribute nodes.
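
A common variation, sketched here under the assumption of an unqualified book and author source document, pairs the identity rule with one more specific rule so that everything is copied except the nodes being changed. The contributor element name is hypothetical.

```xml
<xsl:transform version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <!-- Identity rule: default priority -0.5 for each alternative. -->
  <xsl:template match='@*|node()'>
    <xsl:copy>
      <xsl:apply-templates select='@*|node()' />
    </xsl:copy>
  </xsl:template>
  <!-- Default priority 0, so this rule wins for author elements. -->
  <xsl:template match='author'>
    <contributor>
      <xsl:apply-templates select='@*|node()' />
    </contributor>
  </xsl:template>
</xsl:transform>
```

The result should mirror the source document, with each author element renamed to contributor and its attributes carried over by the identity rule.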

Controlling Output

By default, an XSLT stylesheet produces an XML document. This can be changed using the xsl:output directive. The xsl:output directive must appear as a child element of the xsl:stylesheet element and must adhere to the following syntax:

<xsl:output method = "xml" | "html" | "text" | qname-but-not-ncname version = nmtoken encoding = string omit-xml-declaration = "yes" | "no" standalone = "yes" | "no" doctype-public = string doctype-system = string cdata-section-elements = qnames indent = "yes" | "no" media-type = string />

The most important of these attributes is the method attribute. The method attribute sets the output method of the XSLT stylesheet. If not provided, the default is typically XML. (In the absence of an xsl:output element, a set of heuristics is used to detect HTML documents as result documents. Consult the XSLT specification for the details of these heuristics.)
XSLT provides two other output methods: text and HTML. The former assumes that the result document is simply a text file with no implicit structure. The latter assumes that the result document is an HTML document whose version is controlled by the version attribute. Consider the following XSLT stylesheet:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' > <xsl:template match='/' > Hello, World </xsl:template> </xsl:stylesheet>

As it stands, this XSLT stylesheet is illegal; the result document is not well-formed XML. To make this stylesheet legal, an xsl:output directive is needed:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' > <xsl:output method='text' /> <xsl:template match='/' > Hello, World </xsl:template> </xsl:stylesheet>

Given this sample stylesheet, the following result document would be produced:

Hello, World

Because this is not an XML document, there is no XML declaration nor is there any expectation of well-formedness.
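
The html output method mentioned above can be sketched the same way. The page content here is purely illustrative.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform' >
  <!-- Request HTML serialization instead of the XML default. -->
  <xsl:output method='html' />
  <xsl:template match='/' >
    <html>
      <body>
        <p>Hello, World</p>
        <!-- Written as an empty XML element in the stylesheet. -->
        <br/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet itself must still be well-formed XML, but with this output method a conforming processor should serialize the empty br element using HTML conventions (a start tag with no end tag) and should not emit an XML declaration.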

The output method also controls the handling of the five built-in entities (lt, gt, amp, apos, and quot). In method='text' mode, all references to the five built-in entities are converted to their literal values prior to emission into the result document. In all other modes, the default behavior is to always emit entity references for these five characters. This behavior can be disabled using the disable-output-escaping attribute. This attribute appears on both the xsl:value-of and xsl:text instructions.

The xsl:text instruction is used to emit character data based on the character data [children] of the xsl:text instruction. Consider the following named template:

<xsl:template name='emitme' > Hel&lt;lo, <xsl:text>Wo&gt;rld</xsl:text> </xsl:template>

In method='text' mode, the result of this named template would look like this:

Hel<lo, Wo>rld

In method='xml' mode, the result of this named template would look like this:

Hel&lt;lo, Wo&gt;rld

It is possible to suppress the appearance of the second entity reference using the disable-output-escaping attribute. Consider this slight variation on the previous named template:

<xsl:template name='emitme' > Hel&lt;lo, <xsl:text disable-output-escaping='yes' >Wo&gt;rld</xsl:text> </xsl:template>

In method='xml' mode, the result of this named template would look like this:

Hel&lt;lo, Wo>rld

Note that in element content, the > character is perfectly legal.
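
Since disable-output-escaping also appears on xsl:value-of, the same effect can be sketched with a computed string. The template name emitvalue and the string being selected are assumptions made up for this illustration.

```xml
<xsl:template name='emitvalue'>
  <!-- Default behavior: the '<' in the selected string is escaped on output -->
  <xsl:value-of select="'a &lt; b'"/>
  <xsl:text> | </xsl:text>
  <!-- Escaping disabled: the '<' is written literally -->
  <xsl:value-of select="'a &lt; b'" disable-output-escaping='yes'/>
</xsl:template>
```

In method='xml' mode, the first xsl:value-of would be expected to emit a &lt; b and the second a < b. Unlike the > in the earlier example, a literal < is not legal in element content, so disabling escaping this way can leave the result document malformed. The XSLT specification also does not require processors to support disabling output escaping, so it is best treated as a last resort.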

It is also possible to force the use of a CDATA section in the result document. The xsl:output directive's cdata-section-elements attribute can contain a list of QName-based element names. When a text node is encountered, if its parent element's expanded name is in the list of cdata-section-elements, the text node will be emitted inside of a CDATA section.

Take a look at this stylesheet:

<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' > <xsl:output method='xml' cdata-section-elements='bob'/> <xsl:template match='/' > <steve> <bob>&lt;Hello&gt;</bob> <george>&lt;Hello&gt;</george> </steve> </xsl:template> </xsl:stylesheet>

This stylesheet produces the following result document:

<steve> <bob><![CDATA[<Hello>]]></bob> <george>&lt;Hello&gt;</george> </steve>

Note that when a CDATA section is emitted, all built-in entity references are expanded.
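
Because cdata-section-elements takes a list of QNames that are matched by expanded name, more than one element can be listed and namespace prefixes are honored. The following is a minimal sketch; the p prefix and the urn:example:snippets namespace URI are hypothetical names introduced only for this example.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:p='urn:example:snippets'>
  <!-- Two whitespace-separated QNames: bob (no namespace) and p:code -->
  <xsl:output method='xml' cdata-section-elements='bob p:code'/>
  <xsl:template match='/'>
    <steve>
      <bob>a &amp; b</bob>
      <!-- Expanded name matches p:code, so a CDATA section is expected -->
      <p:code>if (x &lt; y) return;</p:code>
      <!-- Same local name but no namespace, so normal escaping is expected -->
      <code>if (x &lt; y) return;</code>
    </steve>
  </xsl:template>
</xsl:stylesheet>
```

Here the text children of bob and p:code would be expected to appear inside CDATA sections, while the unqualified code element keeps its entity reference because its expanded name is not in the list.
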
The discussion so far has managed to avoid the nasty issue of whitespace handling. XSLT provides a well-defined set of rules and mechanisms for controlling whitespace in result documents. In general, XSLT acknowledges the fact that mixed content is the exceptional case and strips all whitespace-only text nodes from the source document tree and from the stylesheet. This stripping can be suppressed by the presence of any xml:space='preserve' attributes that may be in effect. Additionally, the stylesheet can contain one or more xsl:strip-space or xsl:preserve-space directives that list the element names that require overriding behavior.
Consider the following source document:

<?xml version='1.0' ?> <root> <a> <e/> </a> <b xml:space='preserve'> <e/> </b> <c xml:space='default'> <e/> </c> <d> <e/> </d> </root>

If no xsl:strip-space or xsl:preserve-space directives are in effect, this source document is exactly identical to the following stripped source document:

<?xml version='1.0' ?> <root><a><e/></a><b xml:space='preserve'> <e/> </b><c xml:space='default'><e/></c><d><e/></d></root>

Note that only the b element's whitespace was preserved. Assuming that the following two directives were to appear within the stylesheet document

<xsl:strip-space elements='b' /> <xsl:preserve-space elements='a root'/>

the source document would be pre-stripped to be exactly identical to the following:

<?xml version='1.0' ?> <root> <a> <e/> </a> <b xml:space='preserve'><e/></b> <c xml:space='default'><e/></c> <d><e/></d> </root>

Note that the xsl:strip-space directive can override the value of the xml:space attribute. The stripping behavior just described applies to stylesheets as well as source documents. The primary distinction is that the xsl:text element is automatically in the whitespace preserving list and does not need to be explicitly listed in the xsl:preserve-space directive.
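
One way to observe these directives is a small stylesheet that reports how many text node children each element has after stripping. This is a sketch under stated assumptions: the wildcard in xsl:strip-space and the choice of a as the preserved element are made up for the illustration, and it presumes that the parser hands whitespace-only text nodes to the XSLT processor in the first place.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='text'/>
  <!-- Strip whitespace-only text nodes everywhere... -->
  <xsl:strip-space elements='*'/>
  <!-- ...except under a; the specific name outranks the wildcard -->
  <xsl:preserve-space elements='a'/>
  <xsl:template match='/'>
    <xsl:for-each select='root/*'>
      <xsl:value-of select='name()'/>
      <xsl:text>: </xsl:text>
      <!-- Number of text node children that survived stripping -->
      <xsl:value-of select='count(text())'/>
      <!-- Whitespace-only, but kept because it is inside xsl:text -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the source document shown earlier, the a element would be expected to report 2 (the whitespace on either side of its e child), while d, which is covered only by the wildcard xsl:strip-space, would be expected to report 0. The newline emitted between entries also illustrates the xsl:text rule: that text node contains only whitespace, yet it survives stylesheet stripping.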

Conclusion

XSL Transformations solve a major problem caused by the proliferation of multiple XML Schemas describing complementary data. With XSLT, you can use your favorite programming language to map XML documents to one another, creating output in an arbitrary text-based format (including XML). Of course, simply mapping documents doesn't ensure that they can interoperate properly—human interaction is still needed to interpret the data—but XSL Transformations provide a valuable first step that makes the task easier.

For background information see:
http://www.w3.org/TR/xml-infoset
http://www.w3.org/TR/xpath

Don Box is a cofounder of DevelopMentor, a COM think tank that educates the software industry in COM, MTS, and ATL. Don wrote Essential COM, and coauthored the follow-up Effective COM (Addison-Wesley, 1998). Reach Don at http://www.develop.com/dbox/. Aaron Skonnard is an instructor and researcher at DevelopMentor, where he co-manages the XML curriculum. Aaron wrote Essential WinInet (Addison-Wesley Longman, 1998) and coauthored Essential XML (Addison-Wesley Longman, 2000). Get in touch with Aaron at http://www.skonnard.com/default.asp. John Lam wrote and teaches the Essential Web Applications course for DevelopMentor. He can be reached at jlam@iunknown.com.

From the August 2000 issue of MSDN Magazine.

Instruction	Syntax	Description
xsl:copy-of	<xsl:copy-of select = expression />	Emits the node-set corresponding to the select expression.
xsl:value-of	<xsl:value-of select = string-expression disable-output-escaping = "yes" | "no" />	Emits the string corresponding to the select expression.
xsl:if	<xsl:if test = boolean-expression> <!-- Content: template --> </xsl:if>	Evaluates the template if and only if the test expression evaluates to true.
xsl:choose	<xsl:choose> <!-- Content: (xsl:when+, xsl:otherwise?) --> </xsl:choose>	Evaluates the template from the first xsl:when clause whose test expression evaluates to true. If none of the test expressions evaluate to true, then the template contained in the xsl:otherwise clause is evaluated.
xsl:for-each	<xsl:for-each select = node-set-expression> <!-- Content: (xsl:sort*, template) --> </xsl:for-each>	Evaluates the template against each node in node-set returned by the select expression. The order of evaluation can be influenced using one or more xsl:sorts.
xsl:call-template	<xsl:call-template name = qname> <!-- Content: xsl:with-param* --> </xsl:call-template>	Invokes the template rule named by name.
xsl:variable	<xsl:variable name = qname select = expression> <!-- Content: template --> </xsl:variable>	Declares a variable named name and initializes it using the select expression or template.
xsl:text	<xsl:text disable-output-escaping = "yes" | "no"> <!-- Content: #PCDATA --> </xsl:text>	Emits the text found in #PCDATA. Escaping of the five built-in entities is controlled using disable-output-escaping.
xsl:number	<xsl:number level = "single" | "multiple" | "any" count = pattern from = pattern value = number-expression format = { string } lang = { nmtoken } letter-value = { "alphabetic" | "traditional" } grouping-separator = { char } grouping-size = { number } />	Emits a number based on the XPath number expression found in value.
xsl:copy	<xsl:copy use-attribute-sets = qnames> <!-- Content: template --> </xsl:copy>	Copies the current context node (and associated namespace nodes) to the result tree fragment.
xsl:apply-templates	<xsl:apply-templates select = node-set-expression mode = qname> <!-- Content: (xsl:sort | xsl:with-param)* --> </xsl:apply-templates>	Invokes the best-match template rules against the node-set returned by the select expression.
xsl:apply-imports	<xsl:apply-imports />	Promotes the current stylesheet in import precedence.
xsl:message	<xsl:message terminate = "yes" | "no"> <!-- Content: template --> </xsl:message>	Emits a message in a processor-dependent manner.
xsl:fallback	<xsl:fallback> <!-- Content: template --> </xsl:fallback>	Evaluates the template when the parent instruction/directive is not supported by the current processor.
xsl:comment	<xsl:comment> <!-- Content: template --> </xsl:comment>	Emits an XML comment containing the template as its character data.
xsl:processing-instruction	<xsl:processing-instruction name = { ncname }> <!-- Content: template --> </xsl:processing-instruction>	Emits an XML processing instruction whose [target] is name and whose [children] are based on template.
xsl:element	<xsl:element name = { qname } namespace = { uri-reference } use-attribute-sets = qnames> <!-- Content: template --> </xsl:element>	Emits an XML element whose [local name] is name, whose [namespace URI] is namespace, and whose [children] are based on template.
xsl:attribute	<xsl:attribute name = { qname } namespace = { uri-reference }> <!-- Content: template --> </xsl:attribute>	Emits an XML attribute whose [local name] is name, whose [namespace URI] is namespace, and whose [children] are based on template.
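
Several of the instructions in this table are often used together. The sketch below is only an illustration: the input vocabulary (an items element containing item elements with a price attribute) and the result names report, entry, and band are hypothetical and do not come from the article's sample code.

```xml
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method='xml' indent='yes'/>
  <!-- Assumed input: <items><item price='12'>pen</item> ... </items> -->
  <xsl:template match='/'>
    <xsl:element name='report'>
      <xsl:for-each select='items/item'>
        <!-- xsl:sort must precede the rest of the template -->
        <xsl:sort select='@price' data-type='number'/>
        <xsl:element name='entry'>
          <!-- Attributes must be emitted before any child content -->
          <xsl:attribute name='band'>
            <xsl:choose>
              <xsl:when test='@price &gt; 10'>high</xsl:when>
              <xsl:otherwise>low</xsl:otherwise>
            </xsl:choose>
          </xsl:attribute>
          <xsl:value-of select='.'/>
        </xsl:element>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

For each item, visited in ascending price order, this would be expected to emit an entry element whose band attribute is high or low depending on the price and whose content is the string value of the item.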