Posted to commits@xalan.apache.org by mo...@apache.org on 2001/12/14 15:46:56 UTC
cvs commit: xml-xalan/java/xdocs/sources/xsltc xsltc_runtime.xml
morten 01/12/14 06:46:56
Modified: java/xdocs/sources/xsltc xsltc_runtime.xml
Log:
An update of XSLTC's runtime environment design document.
PR: none
Obtained from: n/a
Submitted by: morten@xml.apache.org
Reviewed by: morten@xml.apache.org
Revision Changes Path
1.3 +209 -120 xml-xalan/java/xdocs/sources/xsltc/xsltc_runtime.xml
Index: xsltc_runtime.xml
===================================================================
RCS file: /home/cvs/xml-xalan/java/xdocs/sources/xsltc/xsltc_runtime.xml,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- xsltc_runtime.xml 2001/12/14 13:09:56 1.2
+++ xsltc_runtime.xml 2001/12/14 14:46:56 1.3
@@ -58,72 +58,57 @@
-->
<s1 title="XSLTC runtime environment">
+
+ <s2 title="Contents">
+
+ <p>This document describes the design and overall architecture of XSLTC's
+ runtime environment. This does not include the internal DOM and the DOM
+ iterators, which are all covered in separate documents.</p>
+
<ul>
<li><link anchor="overview">Runtime overview</link></li>
<li><link anchor="translet">The compiled translet</link></li>
<li><link anchor="types">External/internal type mapping</link></li>
<li><link anchor="mainloop">Main program loop</link></li>
+ <li><link anchor="library">Runtime library</link></li>
+ <li><link anchor="output">Output handling</link></li>
</ul>
+ </s2>
+
<!--=================== OVERVIEW SECTION ===========================-->
<anchor name="overview"/>
<s2 title="Runtime overview">
- <p>The actual transformation of the input XML document is initiated by
- one of these classes:</p>
-
- <ul>
- <li>
- <code>com.sun.xslt.runtime.DefaultRun</code> (runs in a terminal)
- </li>
- <li>
- <code>com.sun.xslt.demo.applet.TransformApplet</code> (runs in an applet)
- </li>
- <li>
- <code>com.sun.xslt.demo.servlet.Translate</code> (runs in a servlet)
- </li>
- </ul>
+ <p>This figure shows the main components of XSLTC's runtime environment:</p>
+
+ <p><img src="runtime_design.gif" alt="runtime_design.gif"/></p>
+ <p><ref>Figure 1: Runtime environment overview</ref></p>
- <p>Any one of these classes will have to go through the folloing steps in
- order to initiate a transformation:</p>
+ <p>The various steps these components have to go through to transform a
+ document are:</p>
<ul>
- <li>
- Instanciate the translet object. The name of the translet (ie. class)
- to use is passed to us as a string. We use this string as a parameter
- to the static method <code>Class.forName(String name)</code> to get a
- reference to a translet object.
- </li>
- <li>
- Instanciate a <code>com.sun.xsl.parser.Parser</code> object to parse the
- input XML file, and instanciate a DOM (we have our own DOM
- implementation especially designed for XSLTC) where we store the
- input document.
- </li>
- <li>
- Pass any parameters to the translet (currently only possible when
- running the transformation in a terminal using DefaultRun)
- </li>
- <li>
- Instanciate a handler for the result document. This handler must be
- extend the <code>TransletOutputHandler</code> class.
- </li>
- <li>
- Invoke the <code>transform()</code> method on the translet, passing the
- instanciated DOM and the output handler as parameters.
- </li>
+ <li>instantiate a parser and hand it the input document</li>
+ <li>build an internal DOM from the parser's SAX events</li>
+ <li>instantiate the translet object</li>
+ <li>pass control to the translet object</li>
+ <li>receive output events from the translet</li>
+ <li>format the output document</li>
</ul>
+ <p>This process can be initiated either through XSLTC's native API or
+ through the implementation of the JAXP/TrAX API.</p>
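<p>A minimal sketch of the JAXP/TrAX route (not XSLTC's native API). The class
name TraxSketch and the inline stylesheet are illustrative assumptions. Which
XSLT processor TransformerFactory.newInstance() returns depends on the JAXP
configuration of the running JVM, so this sketch does not by itself guarantee
that XSLTC performs the transformation.</p>

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, not part of XSLTC.
class TraxSketch {
    // Illustrative stylesheet: counts all B elements and emits plain text.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='count(//B)'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String transform(String stylesheet, String document)
            throws TransformerException {
        // Ask JAXP for whichever processor is configured.
        TransformerFactory factory = TransformerFactory.newInstance();
        // Compile the stylesheet and wrap the result in a Transformer.
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
        // Parse the input, run the transformation, collect the output.
        StringWriter output = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(document)),
                              new StreamResult(output));
        return output.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(STYLESHEET, "<A><B/><B/><F/></A>"));
    }
}
```

<p>The caller only sees the parse, transform and output steps as a single
transform() call; the steps listed above happen behind that call.</p>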
+
</s2><anchor name="translet"/>
<s2 title="The compiled translet">
<p>A translet is always a subclass of <code>AbstractTranslet</code>. As well
as having access to the public/protected methods in this class, the
- translet is compiled with these methods:</p>
+ translet is compiled with these methods:</p><source>
+ public void transform(DOM, NodeIterator, TransletOutputHandler);</source>
- <p><code>public void transform(DOM, NodeIterator, TransletOutputHandler);</code></p>
-
<p>This method is passed a <code>DOMImpl</code> object. Depending on whether
the stylesheet had any calls to the <code>document()</code> function this
method will either generate a <code>DOMAdapter</code> object (when only one
@@ -131,13 +116,12 @@
are more than one XML input documents). This DOM object is passed on to
the <code>topLevel()</code> method.</p>
- <p>When the <code>topLevel()</code> method returns we initiate the output
+ <p>When the <code>topLevel()</code> method returns, we initiate the output
document by calling <code>startDocument()</code> on the supplied output
handler object. We then call <code>applyTemplates()</code> to get the actual
output contents, before we close the output document by calling
- <code>endDocument()</code> on the output handler.</p>
-
- <p><code>public void topLevel(DOM, NodeIterator, TransletOutputHandler);</code></p>
+ <code>endDocument()</code> on the output handler.</p><source>
+ public void topLevel(DOM, NodeIterator, TransletOutputHandler);</source>
<p>This method handles all of these top-level elements:</p>
<ul>
@@ -146,54 +130,64 @@
<li><code><xsl:key></code></li>
<li><code><xsl:param></code> (for global parameters)</li>
<li><code><xsl:variable></code> (for global variables)</li>
- </ul>
-
- <p><code>public void applyTemplates(DOM, NodeIterator, TransletOutputHandler);</code></p>
+ </ul><source>
+ public void applyTemplates(DOM, NodeIterator, TransletOutputHandler);</source>
<p>This is the method that produces the actual output. Its central element
- is a big <code>switch()</code> statement that is used to choose the available
- templates for the various node in the input document. See the chapter
- <link anchor="mainloop">Main Program Loop</link> for details on this method.</p>
+ is a big <code>switch()</code> statement that is used to trigger the code
+ that represents the available templates for the various nodes in the input
+ document. See the chapter on the
+ <link anchor="mainloop">main program loop</link> for details on this method.
+ </p><source>
+ public void <init>();</source>
- <p><code>public void <init> ();</code></p>
<anchor name="namesarray"/>
- <p>The translet's constructor initializes a table
- of all the elements we want to search for in the XML input document.
- This table is called the <code>namesArray</code> and it is passed to the DOM
- holding the input XML document.</p>
+ <p>The translet's constructor initializes a table of all the elements we
+ want to search for in the XML input document. This table is called the
+ <code>namesArray</code>, and maps each element name to a unique integer
+ value, known as the element's <i>"translet-type"</i>.
+ The DOMAdapter, which acts as a mediator between the DOM and the translet,
+ will map these element identifiers to the element identifiers used internally
+ in the DOM. See the section on <link anchor="types">external/internal type
+ mapping</link> and the internal DOM design document for details on this.</p>
 <p>The constructor also initializes any <code>DecimalFormatSymbols</code>
objects that are used to format numbers before passing them to the
- output handler.</p>
-
- <p><code>public boolean stripSpace(int nodeType);</code></p>
-
- <p>This method is only present if any <code><xsl:strip-space></code> or
- <code><xsl:preserve-space></code> elements are present in the stylesheet.
- If that is the case, the translet implements the
+ output post-processor. The output processor uses these symbols to format
+ decimal numbers in the output.</p><source>
+ public boolean stripSpace(int nodeType);</source>
+
+ <p>This method is only present if any <code><xsl:strip-space></code>
+ or <code><xsl:preserve-space></code> elements are present in the
+ stylesheet. If that is the case, the translet implements the
<code>StripWhitespaceFilter</code> interface by containing this method.</p>
- </s2><anchor name="types"/>
- <s2 title="External/internal type mapping">
+ </s2>
- <anchor name="external-types"/>
+ <!--=================== TYPE MAPPING SECTION ===========================-->
+
+ <anchor name="types"/>
+ <s2 title="External/internal type mapping">
- <p>This is the very core of XSL transformations: <em>Read carefully!!!</em></p>
+ <p>This is the very core of XSL transformations:
+ <em>Read carefully!!!</em></p>
- <p>Every node in the input XML document(s) is assigned a type by the
- DOM builder class. This type is an integer value which represents the
+ <anchor name="external-types"/>
+ <p>Every node in the input XML document(s) is assigned a type by the DOM
+ builder class. This type is a unique integer value which represents the
element, so that for instance all <code><bob></code> elements in the
- input document will be given type <ref>7</ref> and can be referred to by using
+ input document will be given type <code>7</code> and can be referred to by
that integer. These types can be used for lookups in the
<link anchor="namesarray">namesArray</link> table to get the actual
- element name (in this case "bob"). These types are referred to as
- <em>external types</em> or <em>DOM types</em>, as they are types known only
- to the DOM and the DOM builder.</p>
+ element name (in this case "bob"). The type identifiers used in the DOM are
+ referred to as <em>external types</em> or <em>DOM types</em>, as they are
+ types known only outside of the translet.</p>
<anchor name="internal-types"/>
-
 <p>Similarly, the translet assigns types to all element and attribute names
- that are referenced in the stylesheet. These types are referred to as
+ that are referenced in the stylesheet. This type assignment is done at
+ compile-time, while the DOM builder assigns the external types at runtime.
+ The element type identifiers used by the translet are referred to as
<em>internal types</em> or <em>translet types</em>.</p>
<p>It is not very probable that there will be a one-to-one mapping between
@@ -222,9 +216,9 @@
</source>
<p>In this stylesheet we are looking for elements <code><B></code>,
- <code><C></code> and <code><A></code>. For this example we can assume
- that these element types will be assigned the values 0, 1 and 2. Now, lets
- say we are transforming this XML document:</p>
+ <code><C></code> and <code><A></code>. For this example we can
+ assume that these element types will be assigned the values 0, 1 and 2.
+ Now, let's say we are transforming this XML document:</p>
<source>
<?xml version="1.0"?>
@@ -238,32 +232,34 @@
</source>
<p>This XML document has the elements <code><A></code>,
- <code><B></code> and <code><F></code>, which we assume are assigned the
- types 7, 8 and 9 respectively (the numbers below that are assigned for
- specific element types, such as the root node, text nodes, etc.). This
- causes a mismatch between the type used for <code><B></code> in the
- translet and the type used for <code><B></code> in the DOM. Th
+ <code><B></code> and <code><F></code>, which we assume are
+ assigned the types 7, 8 and 9 respectively (the numbers below that are
+ assigned for specific element types, such as the root node, text nodes, etc.).
+ This causes a mismatch between the type used for <code><B></code> in
+ the translet and the type used for <code><B></code> in the DOM. The
DOMAdapter class (which mediates between the DOM and the translet) has been
- given two tables for convertint between the two types; <code>mapping</code> for
- mapping from internal to external types, and <code>reverseMapping</code> for
- the other way around.</p>
+ given two tables for converting between the two types: <code>mapping</code>
+ for mapping from internal to external types, and <code>reverseMapping</code>
+ for the other way around.</p>
<p>The translet contains a <code>String[]</code> array called
- <code>namesArray</code>. This array will contain all the element and attribute
+ <code>namesArray</code>. This array contains all the element and attribute
names that were referenced in the stylesheet. In our example, this array
 would contain these strings (in this specific order): "B",
"C" and "A". This array is passed as one of the
- parameters to the DOM adapter constructor (the other adapter is the DOM
- itself). The DOM adapter passes this table on to the DOM. The DOM has
- a hashtable that maps known element names to external types. The DOM goes
- through the <code>namesArray</code> from the DOM sequentially, looks up each
- name in the hashtable, and is then able to map the internal type to an
- external type. The result is then passed back to the DOM adapter.</p>
-
- <p>The reverse is done for external types. External types that are not
- interesting for the translet (such as the type for <code><F></code>
- elements in the example above) are mapped to a generic <code>"ELEMENT"</code>
- type 3, and are more or less ignored by the translet.</p>
+ parameters to the DOM adapter constructor (the other parameter is the DOM
+ itself). The DOM adapter passes this table on to the DOM. The DOM generates
+ a hashtable that maps its known element names to the types the translet
+ knows. The DOM does this by going through the <code>namesArray</code> from
+ the translet sequentially and looking up each name in the hashtable; it is
+ then able to map the internal type to an external type. The result is then
+ passed back to the DOM adapter.</p>
+
+ <p>External types that are not interesting for the translet (such as the
+ type for <code><F></code> elements in the example above) are mapped
+ to a generic <code>"ELEMENT"</code> type (integer value 3), and are more or
+ less ignored by the translet. Uninteresting attributes are similarly
+ mapped to internal type <code>"ATTRIBUTE"</code> (integer value 4).</p>
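<p>The two tables can be sketched as follows, using the values from the
example above (internal types 0, 1 and 2 for B, C and A; external types 7, 8
and 9 for A, B and F). The class and method names are hypothetical and this
is not the DOMAdapter code. The sketch only covers element types; the
reserved node types below 7 are ignored, and -1 is an assumed marker for a
name the DOM has never seen.</p>

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the mapping/reverseMapping tables.
class TypeMappingSketch {
    static final int ELEMENT = 3; // generic element type named in the text

    // mapping[internalType] = externalType, or -1 if the DOM has no such element
    static int[] buildMapping(String[] namesArray, Map<String, Integer> domTypes) {
        int[] mapping = new int[namesArray.length];
        for (int internal = 0; internal < namesArray.length; internal++) {
            mapping[internal] = domTypes.getOrDefault(namesArray[internal], -1);
        }
        return mapping;
    }

    // reverseMapping[externalType] = internalType; names the translet does not
    // reference fall back to the generic ELEMENT type. Only the entries for
    // element types are meaningful in this sketch.
    static int[] buildReverseMapping(int[] mapping, int externalTypeCount) {
        int[] reverse = new int[externalTypeCount];
        Arrays.fill(reverse, ELEMENT);
        for (int internal = 0; internal < mapping.length; internal++) {
            int external = mapping[internal];
            if (external >= 0 && external < reverse.length) {
                reverse[external] = internal;
            }
        }
        return reverse;
    }

    public static void main(String[] args) {
        Map<String, Integer> domTypes = new HashMap<>();
        domTypes.put("A", 7);
        domTypes.put("B", 8);
        domTypes.put("F", 9);
        int[] mapping = buildMapping(new String[] {"B", "C", "A"}, domTypes);
        int[] reverse = buildReverseMapping(mapping, 10);
        System.out.println(Arrays.toString(mapping)); // [8, -1, 7]
        System.out.println(reverse[9]);               // 3 (F is not interesting)
    }
}
```

<p>Because both tables live outside the DOM, a second translet with a
different namesArray can build its own pair against the same DOM.</p>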
<p>It is important that we separate the DOM from the translet. In several
cases we want the DOM as a structure completely independent from the
@@ -273,12 +269,16 @@
available for simultaneous access by several translet/servlet couples.</p>
<p><img src="runtime_type_mapping.gif" alt="runtime_type_mapping.gif"/></p>
- <p><ref>Figure 1: Two translets accessing a single dom using different type mappings</ref></p>
+ <p><ref>Figure 2: Two translets accessing a single DOM using different type mappings</ref></p>
- </s2><anchor name="mainloop"/>
- <s2 title="Main program loop">
+ </s2>
+
+ <!--===================== MAIN LOOP SECTION ============================-->
+
+ <anchor name="mainloop"/>
+ <s2 title="Main program loop">
- <p>The main loop in the translet is found in the <code>applyTemplates()</code>
+ <p>The main body of the translet is the <code>applyTemplates()</code>
method. This method goes through these steps:</p>
<ul>
@@ -304,8 +304,10 @@
(a new iterator is created for the node's children, and this iterator
is passed with a recursive call to <code>applyTemplates()</code>).
Unrecognised attribute nodes (type 4) will be handled like text nodes.
- The <code>switch()</code> statement in <code>applyTemplates</code> will thereby
- look something like this:</p>
+ This makes up the default (built in) templates of any stylesheet. Then,
+ we add one <code>"case"</code> for each node type that is matched by any
+ pattern in the stylesheet. The <code>switch()</code> statement in
+ <code>applyTemplates</code> will thereby look something like this:</p>
<source>
public void applyTemplates(DOM dom, NodeIterator,
@@ -346,16 +348,17 @@
<p>Note that each "case" will not lead directly to a single template.
There may be several templates that match node type 7
- (say <code><B></code>). In the sample stylesheet in the previous chapter
- we have to templates that would match a node <code><B></code>. We have
- one <code>match="//B"</code> (match just any <code><B></code> element) and
- one <code>match="A/B"</code> (match a <code><B></code> element that is a
- child of a <code><A></code> element). In this case we would have to
- compile code that first gets the type of the current node's parent, and
- then compared this type with the type for <code><A></code>. If there was
- no match we will have executed the first <code><xsl:for-each></code>
- element, but if there was a match we will have executed the last one.
- Consequentally, the compiler will generate the following code:</p>
+ (say <code><B></code>). In the sample stylesheet in the previous
+ chapter we have two templates that would match a node <code><B></code>.
+ We have one <code>match="//B"</code> (match just any <code><B></code>
+ element) and one <code>match="A/B"</code> (match a <code><B></code>
+ element that is a child of a <code><A></code> element). In this case
+ we would have to compile code that first gets the type of the current node's
+ parent, and then compares this type with the type for
+ <code><A></code>. If there was no match we will have executed the
+ first <code><xsl:for-each></code> element, but if there was a match
+ we will have executed the last one. Consequently, the compiler will
+ generate the following code (well, it will look like this anyway):</p>
<source>
switch(DOM.getType(node)) {
@@ -401,9 +404,8 @@
<source>
<?xml version="1.0"?>
- <DOC
- xmlns:foo="http://foo.com/spec"
- xmlns:bar="http://bar.net/ref">
+ <DOC xmlns:foo="http://foo.com/spec"
+ xmlns:bar="http://bar.net/ref">
<foo:A>In foo namespace</foo:A>
<bar:A>In bar namespace</bar:A>
</DOC>
@@ -413,7 +415,94 @@
regardless of what namespace they are in, and use the same <code>"if"</code>
structure within the <code>switch()</code> statement above. The other option
is to assign different types to <code><foo:A></code> and
- <code><bar:A></code> elements.</p>
+ <code><bar:A></code> elements. The latter is the option we chose, and
+ it is described in detail in the namespace design document.</p>
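<p>The namespace design document is not reproduced here. Purely as an assumed
illustration of the chosen option, the type table could be keyed on the
namespace URI together with the local name, so that the two A elements above
receive different types. The class name, the key format and the starting
value 7 (taken from the earlier example) are all assumptions.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one type per (namespace URI, local name) pair.
class NamespaceTypeSketch {
    private static final int FIRST_ELEMENT_TYPE = 7; // assumed, as in the example
    private final Map<String, Integer> types = new HashMap<>();

    int typeFor(String namespaceUri, String localName) {
        // A local name cannot contain ':', so the last ':' separates the parts.
        String key = namespaceUri + ":" + localName;
        Integer existing = types.get(key);
        if (existing != null) {
            return existing;
        }
        int type = FIRST_ELEMENT_TYPE + types.size();
        types.put(key, type);
        return type;
    }

    public static void main(String[] args) {
        NamespaceTypeSketch table = new NamespaceTypeSketch();
        System.out.println(table.typeFor("http://foo.com/spec", "A")); // 7
        System.out.println(table.typeFor("http://bar.net/ref", "A"));  // 8
    }
}
```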
+
+ </s2>
+
+ <!--===================== RUNTIME SECTION =============================-->
+
+ <anchor name="library"/>
+ <s2 title="Runtime library">
+
+ <p>The runtime library offers basic functionality to the translet at
+ runtime. It is analogous to UNIX's <code>libc</code>. The whole runtime
+ library is contained in a single class file:</p>
+
+<source>
+ org.apache.xalan.xsltc.runtime.BasisLibrary
+</source>
+
+ <p>This class contains a large set of static methods that are invoked by
+ the translet. These methods are largely independent of each other, and
+ they implement the following:</p>
+
+ <ul>
+ <li>simple XPath functions that do not require a lot of code
+ compiled into the translet class</li>
+ <li>functions for formatting decimal numbers to strings</li>
+ <li>functions for comparing nodes, node-sets and strings - used by
+ equality expressions, predicates and other expressions</li>
+ <li>functions for generating localised error messages</li>
+ </ul>
+
+ <p>The runtime library is a central part of XSLTC. But, as mentioned earlier,
+ the functions within the library are rarely related, so there is no real
+ overall design/architecture. The only common attribute of many of the
+ methods in the library is that all static methods that implement an XPath
+ function have names that end with a capital <code>F</code>.</p>
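<p>To illustrate the shape of such a library, here is a small stand-alone
sketch of two XPath string functions written as unrelated static methods.
The class MiniLibrary and both method names are hypothetical; they only
imitate the trailing-F naming convention and are not the signatures found in
BasisLibrary.</p>

```java
// Hypothetical miniature of a static runtime library.
class MiniLibrary {
    // XPath substring-after(): the text following the first occurrence of
    // marker, or the empty string if marker does not occur.
    static String substringAfterF(String value, String marker) {
        int index = value.indexOf(marker);
        return index < 0 ? "" : value.substring(index + marker.length());
    }

    // XPath normalize-space(): strip leading and trailing whitespace and
    // collapse internal runs of space, tab, CR and LF into a single space.
    static String normalizeSpaceF(String value) {
        String collapsed = value.replaceAll("[ \\t\\r\\n]+", " ");
        int start = collapsed.startsWith(" ") ? 1 : 0;
        int end = collapsed.length();
        if (end > start && collapsed.endsWith(" ")) {
            end--;
        }
        return collapsed.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(substringAfterF("1999/04/01", "/")); // 04/01
        System.out.println(normalizeSpaceF("  a \n b  "));      // a b
    }
}
```

<p>Keeping such helpers static and stateless means the compiled translet can
call them with a plain static invocation and needs no library instance.</p>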
</s2>
+
+ <!--====================== OUTPUT SECTION =============================-->
+
+ <anchor name="output"/>
+ <s2 title="Output handler">
+
+ <p>The translet passes its output to an output post-processor before the
+ final result is handed to the client application over a standard SAX
+ interface. The interface between the translet and the output handler is
+ very similar to a SAX interface, but it has a few non-standard additions.
+ This interface is described in this file:</p>
+
+<source>
+ org.apache.xalan.xsltc.TransletOutputHandler
+</source>
+
+ <p>This interface is implemented by:</p>
+
+<source>
+ org.apache.xalan.xsltc.runtime.TextOutput
+</source>
+
+ <p>This class, despite its name, handles all types of output (XML, HTML and
+ TEXT). Our initial idea was to have a base class implementing the
+ <code>TransletOutputHandler</code> interface, and then have one subclass
+ for each of the output types. This proved very difficult, as the output
+ type is not always known until after the transformation has started and
+ some elements have been output. But, this is an area where a change like
+ that has the potential to increase performance significantly. Output
+ handling has a lot to do with analyzing string contents, and by narrowing
+ down the number of string comparisons and string updates one can accomplish
+ a lot.</p>
+
+ <p>The main tasks of the output handler are:</p>
+
+ <ul>
+ <li>determine the output type based on the output generated by the
+ translet (not always necessary)</li>
+ <li>generate SAX events for the client application</li>
+ <li>insert the necessary namespace declarations in the output</li>
+ <li>escape special characters in the output</li>
+ <li>insert <DOCTYPE> and <META> elements in HTML output</li>
+ </ul>
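<p>One of these tasks, escaping special characters, can be sketched with a
tiny stand-alone handler. The class OutputSketch and its four methods are
hypothetical and far smaller than the TransletOutputHandler interface; the
sketch writes XML text to a buffer instead of generating SAX events, and it
skips output type detection and namespace handling.</p>

```java
// Hypothetical, heavily simplified output handler.
class OutputSketch {
    private final StringBuilder out = new StringBuilder();
    private boolean startTagOpen = false;

    void startElement(String name) {
        closeStartTag();
        out.append('<').append(name);
        startTagOpen = true; // attributes may still follow
    }

    void attribute(String name, String value) {
        if (!startTagOpen) {
            throw new IllegalStateException("attribute outside start tag");
        }
        out.append(' ').append(name).append("=\"")
           .append(escape(value, true)).append('"');
    }

    void characters(String text) {
        closeStartTag();
        out.append(escape(text, false));
    }

    void endElement(String name) {
        closeStartTag();
        out.append("</").append(name).append('>');
    }

    String result() {
        return out.toString();
    }

    private void closeStartTag() {
        if (startTagOpen) {
            out.append('>');
            startTagOpen = false;
        }
    }

    // Replace characters that would otherwise be read as markup.
    static String escape(String text, boolean inAttribute) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"':
                    if (inAttribute) sb.append("&quot;"); else sb.append(c);
                    break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

<p>Holding the start tag open until the next event is what lets attributes
arrive as separate calls after startElement().</p>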
+
+ <p>There is a very clear link between the output handler and the
+ <code>org.apache.xalan.xsltc.compiler.Output</code> class that handles
+ the <code><xsl:output></code> element. The <code>Output</code> class
+ stores many output settings and parameters in the translet class file and
+ the translet passes these on to the output handler.</p>
+
+ </s2>
+
</s1>