Posted to commits@xalan.apache.org by sb...@locus.apache.org on 2000/07/28 04:08:28 UTC
cvs commit: xml-xalan/xdocs/sources/design conceptual.gif data.gif design2_0_0.xml org_apache.gif trax.gif xpath.gif design1_1_0.xml
sboag 00/07/27 19:08:27
Added: xdocs/sources/design conceptual.gif data.gif design2_0_0.xml
org_apache.gif trax.gif xpath.gif
Removed: xdocs/sources/design design1_1_0.xml
Log:
Changed name to design_2_0_0.xml, and did a bunch of work to add XPath, add some explanatory diagrams, etc.
Revision Changes Path
1.1 xml-xalan/xdocs/sources/design/conceptual.gif
<<Binary file>>
1.1 xml-xalan/xdocs/sources/design/data.gif
<<Binary file>>
1.1 xml-xalan/xdocs/sources/design/design2_0_0.xml
Index: design2_0_0.xml
===================================================================
<?xml version="1.0"?>
<!DOCTYPE s1 SYSTEM "file:///C:\x\xml-stylebook\styles\apachexml\dtd\document.dtd">
<s1 title="Xalan-J 2.0 Design">
<p><link>Xalan-J 2.0 Design</link><img src="xmllogo.gif" alt="xmllogo.gif"/></p>
<ul>
<li>Author: Scott Boag</li>
<li>State: In Progress</li>
<li><jump href="http://xml.apache.org/xalan-j/apidocs/index.html">Xalan-J 2.0 Javadoc</jump></li>
</ul>
<s2 title="Introduction">
<p><link idref="intro">Introduction</link></p>
<p>This document presents the basic design for Xalan-J 2.0, which is a
<jump href="http://www.awl.com/cseng/titles/0-201-89542-0/techniques/refactoring.htm">refactoring</jump>
and redesign of the Xalan-J 1.x processor. The main goals of this redesign are
to: </p>
<ol>
<li>Make the design and code more understandable to Open Source
contributors.</li>
<li>Reduce code size and complexity.</li>
<li>By simplifying the code, make optimization easier.</li>
<li>Make modules generally more localized, and less tangled with other
modules.</li>
<li>Begin the adoption of the TrAX (Transformations for XML)
interfaces.</li>
<li>Increase the streamability of transformations.</li></ol>
<p>The techniques used toward these goals are to:</p>
<ol>
<li>In general, flatten the hierarchy of packages, in order to make the
structure more apparent from the top-level view.</li>
<li>Break the construction and the validation of the XSLT stylesheet from
the stylesheet objects themselves.</li>
<li>Drive the construction of the stylesheet through a table, so that it
is less prone to error.</li>
<li>Break the transformation process into a separate package, away from
the stylesheet objects.</li>
<li>Create this design document, as a start-point for people wanting to
approach the code.</li>
</ol>
<p>The goals are not:</p>
<ol>
<li>To add more features in the course of this refactoring. This is
design and code clean-up, to meet the above-named goals. It is expected that
adding features will be <em>much</em> easier once this work is
completed.</li>
<li>To optimize code for the sake of optimization. However, it is
expected that the code will be faster once the work is complete.</li>
</ol>
<p>How well we've achieved the goals will be measured by feedback from the
<link anchor="http://xml-archive.webweaving.org/xml-archive-xalan">Xalan-dev</link> list, and by software metrics tools.</p>
<p>Please note that the diagrams in this design document are meant to be
useful abstractions, and may not always be exact.</p>
</s2>
<s2 title="Overview of Architecture">
<p><link idref="overview">Overview of Architecture</link></p>
<p>Xalan 2.0 is divided into four major modules, and various smaller
modules. The main modules are:</p>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/processor/package-summary.html">org.apache.xalan.processor</link></code></label>
<item>The module that processes the stylesheet, and provides the main
entry point into Xalan.</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/templates/package-summary.html">org.apache.xalan.templates</link></code></label>
<item>The module that defines the stylesheet structures, including the
Stylesheet object, template element instructions, and Attribute Value
Templates. </item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/package-summary.html">org.apache.xalan.transformer</link></code></label>
<item>The module that applies the source tree to the Templates, and
produces a result tree.</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xpath/package-summary.html">org.apache.xpath</link></code></label>
<item>The module that processes both XPath expressions and XSLT match
patterns.</item>
</gloss>
<p>In addition to the above modules, Xalan implements the
<link anchor="http://trax.openxml.org/">TrAX</link> interfaces, and depends on the
<link anchor="http://www.megginson.com/SAX/Java/index.html">SAX2</link> and <link anchor="http://www.w3.org/TR/DOM-Level-2/">DOM</link> packages.
</p><p><img src="trax.gif" alt="trax.gif"/></p><p>There is also a general utilities package that contains both XML utility
classes, such as QName, and generally useful classes, such as
StringToIntTable.</p>
<p>In the diagram below, the dashed lines denote visibility. All packages
access the SAX2 and DOM packages.</p>
<p><img src="xalan1_1x1.gif" alt="xalan1_1x1.gif"/></p>
<p>In addition to the above packages, there are the following additional
packages:</p>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/client/package-summary.html">org.apache.xalan.client</link></code></label>
<item>This package has a client applet. I suspect this should be moved
into the samples directory.</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/extensions/package-summary.html">org.apache.xalan.extensions</link></code></label>
<item>This holds classes belonging to the Xalan extensions mechanism,
which allows Java code and script to be called from within a stylesheet.</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/lib/package-summary.html">org.apache.xalan.lib</link></code></label>
<item>This is the built-in Xalan extensions library, which holds
extensions such as Redirect (which allows a stylesheet to produce multiple
output files).</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/res/package-summary.html">org.apache.xalan.res</link></code></label>
<item>This holds resource files needed by Xalan, such as error message
resources.</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/trace/package-summary.html">org.apache.xalan.trace</link></code></label>
<item>This package contains classes and interfaces that allow a caller to
add trace listeners to the transformation, allowing an interface to XSLT
debuggers and similar tools.</item>
</gloss>
<gloss>
<label><code><link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/xslt/package-summary.html">org.apache.xalan.xslt</link></code></label>
<item>This package is for backwards compatibility with applications that
depend on Xalan 1.x interfaces.</item>
</gloss>
<p>A more conceptual view of this architecture is as follows:</p><p><img src="conceptual.gif" alt="Picture of conceptual architecture."/></p></s2><anchor name="process"/>
<s2 title="Process Module">
<p><link idref="process">Process Module</link></p>
<p>The <code>org.apache.xalan.processor</code> module implements the
<code>org.apache.xalan.trax.Processor</code> interface, which provides a
factory method for creating a concrete Processor instance, and provides methods
for creating an <code>org.apache.xalan.trax.Templates</code> instance, which, in
Xalan and XSLT terms, is the Stylesheet. Thus the task of the process module is
to read the XSLT input in the form of a file, stream, SAX events, or a DOM
tree, and produce a Templates/Stylesheet object.</p>
<p>The overall strategy is to define a schema that dictates the legal
structure for XSLT elements and attributes, and to associate with those
elements construction-time processors that can fill in the appropriate fields
in the top-level Stylesheet object, and also associate classes in the templates
module that can be created in a generalized fashion. This makes the validation
and the object-to-class associations centralized and declarative.</p>
<p>The schema's root class is
<code>org.apache.xalan.processor.XSLTSchema</code>, and it is here that the
XSLT schema structure is defined. XSLTSchema uses
<code>org.apache.xalan.processor.XSLTElementDef</code> to define elements, and
<code>org.apache.xalan.processor.XSLTAttributeDef</code> to define attributes.
Both classes hold the allowed namespace, local name, and type of element or
attribute. The XSLTElementDef also holds a reference to an
<code>org.apache.xalan.processor.XSLTElementProcessor</code>, and sometimes a
<code>Class</code> object, with which it can create objects that derive from
<code>org.apache.xalan.templates.ElemTemplateElement</code>. In addition, the
XSLTElementDef instance holds a list of XSLTElementDef instances that define
legal elements or character events that are allowed as children of the given
element.</p>
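<p>As a rough illustration of the table-driven idea described above, the
following sketch validates element nesting against a small declarative table.
The class <code>ElementDef</code> and the three-element schema are hypothetical
simplifications made up for this example; they are not the actual
<code>XSLTElementDef</code> or <code>XSLTSchema</code> classes, which also
carry namespaces, attribute definitions, and processors.</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

class SchemaSketch {
    /** Hypothetical stand-in for an element definition: a name plus its legal children. */
    static final class ElementDef {
        final String localName;
        final Map<String, ElementDef> children = new HashMap<>();

        ElementDef(String localName, ElementDef... allowed) {
            this.localName = localName;
            for (ElementDef child : allowed) {
                children.put(child.localName, child);
            }
        }

        /** Returns the definition of a child element, or null if it is not legal here. */
        ElementDef childDef(String childName) {
            return children.get(childName);
        }
    }

    /** Builds a tiny, illustrative slice of an XSLT-like schema table. */
    static ElementDef buildSchema() {
        ElementDef valueOf = new ElementDef("value-of");
        ElementDef template = new ElementDef("template", valueOf);
        return new ElementDef("stylesheet", template);
    }

    /** Checks a root-to-leaf path of element names against the table. */
    static boolean isLegalPath(ElementDef root, String... path) {
        if (path.length == 0 || !root.localName.equals(path[0])) {
            return false;
        }
        ElementDef current = root;
        for (int i = 1; i < path.length; i++) {
            current = current.childDef(path[i]);
            if (current == null) {
                return false; // element not allowed at this position
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ElementDef schema = buildSchema();
        System.out.println(isLegalPath(schema, "stylesheet", "template", "value-of"));
        System.out.println(isLegalPath(schema, "stylesheet", "value-of"));
    }
}
```
]]></source>
<p>Because legality is looked up in data rather than coded into each element
class, adding or correcting a rule in a design like this means editing one
table entry.</p>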
<p>The implementation of the <code>org.apache.xalan.trax.Processor</code>
interface is in <code>org.apache.xalan.processor.StylesheetProcessor</code>,
which creates an <code>org.apache.xalan.processor.StylesheetHandler</code>
instance. This instance acts as the ContentHandler for the parse events, and is
handed to the <code>org.xml.sax.XMLReader</code>, which the StylesheetProcessor
uses to parse the XSLT document. The StylesheetHandler then receives the parse
events, maintains the state of the construction, and passes the events on
to the appropriate XSLTElementProcessor for the given event, as dictated by the
XSLTElementDef that is associated with the given event.</p>
<p><img src="process.gif" alt="process.gif"/></p>
</s2><anchor name="templates"/>
<s2 title="Templates Module">
<p><link idref="templates">Templates Module</link></p>
<p>The <code>org.apache.xalan.templates</code> module implements the
<code>org.apache.xalan.trax.Templates</code> interface, and defines a set of
classes that represent a Stylesheet. The primary purpose of this module is to
hold stylesheet data, not to perform procedural tasks associated with the
construction of the data, nor tasks associated with the transformation itself.
</p>
<p>A <code>StylesheetRoot</code>, which implements the
<code>Templates</code> interface, is a type of <code>StylesheetComposed</code>,
which is a <code>Stylesheet</code> composed of itself and all included
<code>Stylesheet</code> objects. A <code>StylesheetRoot</code> has a global
imports list, which is a list of all imported <code>StylesheetComposed</code>
instances. From each <code>StylesheetComposed</code> object, one can iterate
through the list of directly or indirectly included <code>Stylesheet</code>
objects, and one can also iterate through the list of all
<code>StylesheetComposed</code> objects of lesser import precedence.
<code>StylesheetRoot</code> is a <code>StylesheetComposed</code>, which is a
<code>Stylesheet</code>.</p>
<p>Each stylesheet has a set of properties, which can be set by various
means, usually either via an attribute on xsl:stylesheet, or via a top-level
xsl instruction (for instance, xsl:attribute-set). The get methods for these
properties only access the declaration within the given <code>Stylesheet</code>
object, and never take into account included or imported stylesheets. The
<code>StylesheetComposed</code> derivative object, if it is a root
<code>Stylesheet</code> or imported <code>Stylesheet</code>, has "composed"
getter methods that do take into account imported and included stylesheets, for
some of these properties. The table of Stylesheet properties, with composed
methods, is as follows. Note that the names of the attributes are according to
a formula for translating the xsl names to the Java get/set method names.</p>
<table>
<tr>
<th>Property</th>
<th>Type</th>
<th>XSL Origin</th>
<th>Composed Methods</th>
<th>Note</th>
</tr>
<tr>
<td>XmlnsXsl</td>
<td>String</td>
<td>xmlns:xsl</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>ExtensionElementPrefixes</td>
<td>StringVector</td>
<td><code><jump href="http://www.w3.org/TR/xslt#extension-element">extension-element-prefixes</jump></code>
attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>ExcludeResultPrefixes</td>
<td>StringVector</td>
<td><code><jump href="http://www.w3.org/TR/xslt#literal-result-element">exclude-result-prefixes
or xsl:exclude-result-prefixes</jump></code> attributes</td>
<td>(not sure about this... only from root?)</td>
<td>I think this should be a root method, and a single list should be
made, like with xsl:output.</td>
</tr>
<tr>
<td>Id</td>
<td>String</td>
<td>The <code><jump href="http://www.w3.org/TR/xslt#section-Embedding-Stylesheets">id</jump></code>
attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>Version</td>
<td>String</td>
<td>The <code><jump href="http://www.w3.org/TR/xslt#forwards">version</jump></code> attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>XmlSpace</td>
<td>boolean</td>
<td><code><jump href="http://www.w3.org/TR/xslt#strip">xml:space</jump></code> attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>Import</td>
<td>Vector (list of StylesheetComposed objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#import">xsl:import</jump></code> element</td>
<td>getImportComposed(int i) / getImportCountComposed()</td>
<td>Composed list contains all imported sheets, not the importing sheet
itself.</td>
</tr>
<tr>
<td>Include</td>
<td>Vector (list of Stylesheet objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#include">xsl:include</jump></code> element</td>
<td>getIncludeComposed(int i) / getIncludeCountComposed()</td>
<td>Composed list contains all directly or indirectly included
stylesheets.</td>
</tr>
<tr>
<td>DecimalFormat</td>
<td>Stack (list of DecimalFormatProperties objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#format-number">xsl:decimal-format</jump></code>
element</td>
<td>getDecimalFormatComposed(QName name)</td>
<td></td>
</tr>
<tr>
<td>StripSpaces</td>
<td>Stack (list of XPath match pattern objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#strip">xsl:strip-space</jump></code>
element</td>
<td>getWhiteSpaceInfo(TransformerImpl transformContext, Node
sourceTree, Element targetElement)</td>
<td></td>
</tr>
<tr>
<td>PreserveSpaces</td>
<td>Stack (list of XPath match pattern objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#strip">xsl:preserve-space</jump></code>
element</td>
<td>getWhiteSpaceInfo(TransformerImpl transformContext, Node
sourceTree, Element targetElement)</td>
<td></td>
</tr>
<tr>
<td>Output</td>
<td>OutputFormatExtended</td>
<td><code><jump href="http://www.w3.org/TR/xslt#output">xsl:output</jump></code> element</td>
<td>getOutputComposed() on StylesheetRoot only</td>
<td></td>
</tr>
<tr>
<td>Key</td>
<td>Vector (list of KeyDeclaration objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#key">xsl:key</jump></code> element</td>
<td>getKeysComposed()</td>
<td></td>
</tr>
<tr>
<td>AttributeSet</td>
<td>Vector (list of ElemAttributeSet objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#attribute-sets">xsl:attribute-set</jump></code>
element</td>
<td>On StylesheetRoot only?</td>
<td></td>
</tr>
<tr>
<td>Variable</td>
<td>Vector (list of ElemVariable objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#top-level-variables">xsl:variable</jump></code>
element</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Param</td>
<td>Vector (list of ElemParam objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#top-level-variables">xsl:param</jump></code>
element</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Template</td>
<td>Vector (list of ElemTemplate objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#section-Defining-Template-Rules">xsl:template</jump></code>
element</td>
<td>getTemplateComposed(TransformerImpl transformContext, Node
sourceTree, Node targetNode, QName mode) and getTemplateComposed(QName
qname)</td>
<td></td>
</tr>
<tr>
<td>NamespaceAlias</td>
<td>Vector (list of ElemTemplate objects)</td>
<td><code><jump href="http://www.w3.org/TR/xslt#literal-result-element">xsl:namespace-alias</jump></code>
element</td>
<td>On StylesheetRoot only?</td>
<td></td>
</tr>
<tr>
<td>NonXslTopLevel</td>
<td>Hashtable (table of opaque objects keyed by QName)</td>
<td>Any top-level non-xslt element.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>Href</td>
<td>URL</td>
<td>The location of the stylesheet, possibly set by xsl:include or
xsl:import.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>StylesheetRoot</td>
<td>StylesheetRoot</td>
<td>The root of the stylesheet tree, for quick access.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>StylesheetParent</td>
<td>Stylesheet</td>
<td>The importing or including stylesheet.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>StylesheetComposed</td>
<td>StylesheetComposed</td>
<td>The closest importing stylesheet.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>NamespaceDecls</td>
<td>Linked list of NameSpace elements</td>
<td>xmlns:foo attribute map</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
</table>
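<p>The difference between a plain getter and a "composed" getter can be
sketched as follows. The <code>Sheet</code> class and its methods are
hypothetical and exist only for this illustration; the ordering of the
composed list is likewise an assumption and does not model XSLT import
precedence.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

class ComposedSketch {
    /** Hypothetical, heavily simplified stylesheet holding only key names. */
    static final class Sheet {
        private final List<String> m_keys = new ArrayList<>();
        private final List<Sheet> m_imports = new ArrayList<>();

        Sheet key(String name) {
            m_keys.add(name);
            return this;
        }

        Sheet imports(Sheet imported) {
            m_imports.add(imported);
            return this;
        }

        /** Plain getter: only the declarations made in this sheet. */
        List<String> getKeys() {
            return m_keys;
        }

        /** Composed getter: this sheet's declarations plus those of every imported sheet. */
        List<String> getKeysComposed() {
            List<String> all = new ArrayList<>(m_keys);
            for (Sheet imported : m_imports) {
                all.addAll(imported.getKeysComposed());
            }
            return all;
        }
    }

    public static void main(String[] args) {
        Sheet imported = new Sheet().key("by-id");
        Sheet root = new Sheet().key("by-name").imports(imported);
        System.out.println(root.getKeys().size());         // declarations in root only
        System.out.println(root.getKeysComposed().size()); // root plus imported
    }
}
```
]]></source>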
</s2><anchor name="transformer"/>
<s2 title="Transformer Module">
<p><link idref="transformer">Transformer Module</link></p>
<p>The <link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/package-summary.html">Transformer</link> module is in charge of run-time transformations.
The <link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/TransformerImpl.html">TransformerImpl</link> object, which implements the TrAX <link anchor="http://trax.openxml.org/javadoc/trax/Transformer.html">Transformer</link> interface, and has an association with a <link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/templates/StylesheetRoot.html">StylesheetRoot</link> object, begins the processing of the source tree (or provides a <link anchor="http://www.megginson.com/SAX/Java/javadoc/org/xml/sax/ContentHandler.html">ContentHandler</link> reference), and performs the transformation.
The Transformer package does as much of the transformation as it can, but element level operations are generally performed in the <link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/templates/ElemTemplateElement.html#execute(org.apache.xalan.transformer.TransformerImpl, org.w3c.dom.Node, org.apache.xalan.utils.QName)">ElemTemplateElement.execute(...)</link> methods.</p>
<p>Result Tree events are fed into a <link anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/ResultTreeHandler.html">ResultTreeHandler</link> object, which acts as a layer between the direct calls to the result
tree content handler (often a Serializer), and the Transformer. For one thing,
we have to delay the call to
startElement(name, atts) because of the
xsl:attribute and xsl:copy calls. In other words,
the attributes have to be fully collected before you
can call startElement.</p><p>Other important classes in this package are:</p><gloss><label>CountersTable and Counter</label><item>The Counter class does incremental counting for support of xsl:number.
This class stores a cache of counted nodes (m_countNodes).
It tries to cache the counted nodes in document order...
the node count is based on its position in the cache list. The CountersTable class is a table of counters, keyed by ElemNumber objects, each
of which has a list of Counter objects.</item></gloss><gloss><label>KeyIterator, KeyManager, and KeyTable</label><item>These classes handle mapping of keys declared with the xsl:key element.</item></gloss><gloss><label>TransformState</label><item>This interface is meant to be used by a consumer of SAX2 events produced by Xalan, and enables the consumer
to get information about the state of the transform. It
is primarily intended as a tooling interface.</item></gloss><p>Even though the following modules are defined in the org.apache.xalan package, instead of the transformer package, they are described in this section as they are mostly related to runtime transformation.</p>
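<p>Returning briefly to the ResultTreeHandler: the need to delay
<code>startElement</code> until all attributes are collected can be sketched
with the standard SAX2 classes. The <code>Deferrer</code> class below is a
hypothetical simplification written for this document, not the actual
ResultTreeHandler, and it ignores namespaces.</p>
<source><![CDATA[
```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class DeferredStartSketch {
    /** Hypothetical buffer that holds back startElement until the attributes are complete. */
    static final class Deferrer {
        private final ContentHandler m_target;
        private final AttributesImpl m_pendingAtts = new AttributesImpl();
        private String m_pendingName; // null when no start tag is pending

        Deferrer(ContentHandler target) {
            m_target = target;
        }

        void startElement(String name) throws SAXException {
            flushPending();
            m_pendingName = name;
            m_pendingAtts.clear();
        }

        /** Attributes may arrive after startElement, as with xsl:attribute. */
        void addAttribute(String name, String value) {
            if (m_pendingName == null) {
                throw new IllegalStateException("no pending start tag");
            }
            m_pendingAtts.addAttribute("", name, name, "CDATA", value);
        }

        void characters(String text) throws SAXException {
            flushPending();
            m_target.characters(text.toCharArray(), 0, text.length());
        }

        void endElement(String name) throws SAXException {
            flushPending();
            m_target.endElement("", name, name);
        }

        /** Emits the buffered start tag, if any, with everything collected so far. */
        private void flushPending() throws SAXException {
            if (m_pendingName != null) {
                m_target.startElement("", m_pendingName, m_pendingName, m_pendingAtts);
                m_pendingName = null;
            }
        }
    }

    /** Runs a small event sequence and returns a log of what the target received. */
    static String demo() {
        final StringBuilder log = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                log.append("start ").append(qName)
                   .append(" atts=").append(atts.getLength()).append(';');
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                log.append("text ").append(new String(ch, start, length)).append(';');
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                log.append("end ").append(qName).append(';');
            }
        };
        try {
            Deferrer deferrer = new Deferrer(recorder);
            deferrer.startElement("a");
            deferrer.addAttribute("href", "x"); // added after the start tag was announced
            deferrer.characters("hi");
            deferrer.endElement("a");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // start a atts=1;text hi;end a;
    }
}
```
]]></source>
<p>The target handler sees the start tag only once the first non-attribute
event arrives, so the late attribute is already part of it.</p>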
<s3 title="Stree Module"><p><link idref="stree">Stree Module [And discussions about streaming]</link></p>
<p>The Stree module implements the default <link anchor="http://www.w3.org/TR/xpath#data-model">Source Tree</link> for Xalan, that is, the tree that is to be transformed. It implements read-only <link anchor="http://www.w3.org/TR/DOM-Level-2/">DOM2</link> interfaces, and provides some information needed for fast transforms, such as document order indexes. It also attempts to allow a streaming transform by launching the transform on a secondary thread as soon as the SAX2 <link anchor="http://www.megginson.com/SAX/Java/javadoc/org/xml/sax/ContentHandler.html#startDocument()">startDocument</link> event has occurred. When the transform requests a node, and the node is not present, the getFirstChild and getNextSibling methods will wait until the child node has arrived, or an <link anchor="http://www.megginson.com/SAX/Java/javadoc/org/xml/sax/ContentHandler.html#endElement(java.lang.String,%20java.lang.String,%20java.lang.String)">endElement</link> event has occurred.</p>
<p>Note that the secondary thread is an issue. It would be better to do the same thing as described above on a single thread, but using the parser in 'pull' mode, or simply with a parseNext method so the parse would occur in blocks.</p>
<p>This kind of streaming is not perfect because it still requires an entire source tree to be concretely built. There have been a lot of good discussions on the xalan-dev list about how to do static analysis of a stylesheet, and be able to allocate only the nodes needed by the transform, while they are needed (or not allocate source objects at all).</p>
<p>Vincent-Olivier Arsenault &lt;vincent@neuro6.com&gt; has proposed the following design:</p>
<p>By looking at the stylesheet you know how streamable it is (of course this
needs strict adherence to the xslt recommendation). Since there's a root
template and no &lt;xsl:apply-templates/&gt; you can build your context list
containing only absolute x-path (which means nodes get out of context
faster).</p>
<p>The paths of the relevant nodes, for this stylesheet, are (ok this is an
example, so I may be missing some):</p>
<ol>
<li>path: "/address" context: "address" (at &lt;/address&gt;, you get rid of the
whole "person/address" stuff);</li>
<li>path: "/adn" context: "adn";</li>
<li>path: "/medicalrecord" context: "/" (for possibly repetitive nodes, the
context is always the parent node).</li>
</ol>
<p>And all the rest goes to trash!!!!</p>
<p>Let me refine:</p>
<p>You analyze the whole stylesheet like that (it would be good if optimization
and the x-path list could be done simultaneously) and you end up with a list of
expanded paths mapped to all the templates.</p>
<p>An entry in the list (I would call this list the transformation stack) would
consist of 4 things:</p>
<ol>
<li>the relevance context xpath (on which the input nodes will be tested for
pertinence: do we keep it or not);</li>
<li>the transformation rule to apply to the matching nodes (this can just be a
forwarder to another template transformation stack);</li>
<li>a result buffer (in which the nodes that can't be streamed are temporarily
stored);</li>
<li>the streaming context xpath (triggers streaming of the buffer to the
output).</li>
</ol>
</s3><s3 title="Extensions Module"><p><link idref="extensions">Extensions Module</link></p><p>This package contains an implementation of the Xalan extension mechanism, which uses the <link anchor="http://oss.software.ibm.com/developerworks/opensource/bsf/">Bean Scripting Framework</link>.
The Bean Scripting Framework (BSF) is an architecture for incorporating scripting into Java applications and applets. Scripting languages such as Netscape Rhino (Javascript), VBScript, Perl, Tcl, Python, NetRexx and Rexx can be used to augment XSLT's functionality. In addition, the Xalan extension mechanism allows use of Java classes. See the <link anchor="http://xml.apache.org/xalan/extensions.html">XalanJ 1 extension documentation</link> for a description of using extensions in a stylesheet. Please note that the W3C XSL Working Group is working on a specification for standard extension bindings, and this module will change to follow that specification. </p><p>[More needed... -sb]</p></s3></s2><anchor name="xpath"/>
<s2 title="XPath Module">
<p><link idref="xpath">XPath Module</link></p>
<p>This module is pulled out of the Xalan package, and put in the org.apache package, to emphasize that the intention is that this package can be used independently of the XSLT engine, even though it has dependencies on the Xalan utils module.</p><p><img src="org_apache.gif" alt="xalan ---> xpath"/></p>
<p>The XPath module first compiles the XPath strings into expression trees, and then executes these expressions via a call to the XPath execute(...) function. </p> <p>Major classes are:</p><gloss><label>XPath</label><item>Represents a compiled XPath. Major function is <code>XObject execute(XPathContext xctxt, Node contextNode,
PrefixResolver namespaceContext).</code></item></gloss><gloss><label>XPathAPI</label><item>The methods in this class are convenience methods into the
low-level XPath API.</item></gloss><gloss><label>XPathContext</label><item>Used as the runtime execution context for XPath.</item></gloss><gloss><label>DOMHelper</label><item>Used as a helper for handling DOM issues. May be subclassed to take advantage
of specific DOM implementations.</item></gloss><gloss><label>SourceTreeManager</label><item>Centralizes all management of source trees. The methods
in this class should allow easy garbage collection of source
trees, and should centralize parsing for those source trees.</item></gloss><gloss><label>Expression</label><item>The base-class of all expression objects, allowing polymorphic behaviors.</item></gloss>
<p>The general architecture of the XPath module is divided into the compiler, and categories of expression objects.</p><p><img src="xpath.gif" alt="xpath modules"/></p>
<p>The most important module is the axes module. This module implements the DOM2 <link anchor="http://www.w3.org/TR/DOM-Level-2/traversal.html#Iterator-overview">NodeIterator</link> interface, and is meant to allow XPath clients to either override the default behavior or to replace this behavior.</p>
<p>The LocPathIterator and UnionPathIterator classes implement the <link anchor="http://www.w3.org/TR/DOM-Level-2/java-binding.html#org.w3c.dom.traversal.NodeIterator">NodeIterator</link> interface, and polymorphically use AxesWalker derived objects to execute each step in the path. The whole trick is to execute the LocationPath in depth-first document order so that nodes can be found without necessarily looking ahead or performing a breadth-first search.</p>
<s3 title="XPath Database Connection"><p><link idref="xpath-database">XPath Direct Database Connections</link></p>
<p>An important part of the XPath design in both Xalan 1 and Xalan 2 is to enable database connections to be used as drivers directly to the XPath <link anchor="http://www.w3.org/TR/xpath#location-paths">LocationPath</link> handling. This allows databases to be directly connected to the transform, and be able to take advantage of internal indexing and the like. While in Xalan 1 this was done via the <link anchor="http://xml.apache.org/xalan/apidocs/org/apache/xalan/xpath/XLocator.html">XLocator</link> interface, in Xalan 2 this interface is no longer used, and has been replaced by the DOM2 <link anchor="http://www.w3.org/TR/DOM-Level-2/traversal.html#Iterator-overview">NodeIterator</link> interface. An application or extension should be able to install their own NodeIterator for a
given document.</p><p><img src="data.gif" alt="data.gif"/></p><p>[More to do]</p></s3></s2>
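<p>The compile-then-execute flow described for the XPath module above can be
illustrated with a short sketch. Note that it deliberately uses the
<code>javax.xml.xpath</code> API that ships with current JDKs, not the
<code>org.apache.xpath</code> classes described in this document; the class
and method names <code>XPathSketch</code> and <code>countMatches</code> are
made up for the example.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    /** Compiles the expression once, executes it against the parsed source, and counts the nodes. */
    static int countMatches(String xml, String expression) {
        try {
            // Build the source tree.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Step 1: compile the XPath string into a reusable expression object.
            XPathExpression compiled =
                    XPathFactory.newInstance().newXPath().compile(expression);
            // Step 2: execute the compiled expression with the document as context node.
            NodeList nodes = (NodeList) compiled.evaluate(doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><item/><item/><other/></doc>";
        System.out.println(countMatches(xml, "/doc/item")); // 2
    }
}
```
]]></source>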
<s2 title="Utils Package">
<p><link idref="utils">Utils Package</link></p>
<p>This package contains general utilities for use by both the xalan and xpath packages. It is the intention that many of these utility classes (or their equivalents) be eventually brought into the org.apache.xml package for general use. The major utilities are as follows:</p>
<gloss><label>AttList</label><item>Wraps a DOM attribute list in a SAX Attributes.</item></gloss>
<gloss><label>BoolStack, IntStack, IntVector, etc.</label><item>Simple stacks and vectors for primitive values.</item></gloss>
<gloss><label>DefaultErrorHandler</label><item>Implements SAX error handler for default reporting.</item></gloss>
<gloss><label>DOMBuilder</label><item>Takes SAX events (in addition to some extra events
that SAX doesn't handle yet) and adds the result to a document
or document fragment.</item></gloss><gloss><label>Heap</label><item>Classic heap implementation.</item></gloss><gloss><label>MutableAttrListImpl</label><item>Mutable version of AttributesImpl.</item></gloss><gloss><label>NameSpace</label><item>A representation of a namespace.</item></gloss><gloss><label>NodeVector</label><item>A very simple table that stores a list of Nodes.</item></gloss><gloss><label>ObjectPool</label><item>Used for reuse of objects.</item></gloss><gloss><label>PrefixResolver</label><item>The class that implements this interface can resolve prefixes
to namespaces.</item></gloss><gloss><label>PrefixResolverDefault</label><item>This class implements a generic PrefixResolver for a DOM, that
can be used to perform prefix-to-namespace lookup
for an XPath.</item></gloss><gloss><label>QName</label><item>Class to represent a qualified XML name.</item></gloss><gloss><label>StringToStringTable</label><item>A very simple lookup table that stores a list of strings for lookup. Used when a hashtable is too much overhead.</item></gloss><gloss><label>SystemIDResolver</label><item>Able to take a SystemID string and try and turn it into a good absolute URL.</item></gloss><gloss><label>TreeWalker</label><item>Implements a Visitor design pattern, doing a pre-order walk of the DOM tree, calling a ContentHandler interface as it goes. Used for DOM-to-SAX conversion.</item></gloss><gloss><label>Trie</label><item>A digital search trie for 7-bit ASCII text.</item></gloss><gloss><label>UnImplNode</label><item>To be subclassed by classes that wish to act as DOM nodes, without having to implement all the methods. Widely used.</item></gloss></s2>
<s2 title="Other Packages">
<p><link idref="other">Other Packages</link></p>
<gloss><label>client</label><item>Implementation of Xalan Applet [should we keep this?].
</item></gloss>
<gloss><label>dtm</label><item>Implementation of the Document Table Model (DTM) [Should we keep this?].</item></gloss>
<gloss><label>extensions</label><item>Implementation of Xalan Extension Mechanism, which uses the Bean Scripting Framework.</item></gloss>
<gloss><label>lib</label><item>Implementation of Xalan-specific extensions [I want to add lots more extensions to this package!].</item></gloss><gloss><label>res</label><item>Contains strings that require internationalization.</item></gloss></s2>
<s2 title="Coding Conventions">
<p><link idref="coding-conventions">Coding Conventions</link></p>
<p>This section documents the coding conventions used in the Xalan
source.</p>
<ol>
<li>Class files are arranged with constructors and possibly an init()
function first, public API methods second, package specific, protected, and
private methods following (arranged based on related functionality), member
variables with their getter/setter access methods last.</li>
<li>Non-static member variables are prefixed with "m_".</li>
<li>static final member variables should always be upper case, without
the "m_" prefix. They need not have accessors.</li>
<li>Private member variables that are not accessed outside the class need
not have getter/setter methods declared.</li>
<li>Private member variables that are accessed outside the class should
have either package specific or public getter/setter methods declared. All
accessors should follow the bean design patterns.</li>
<li>Package-scoped member variables, public member variables, and
protected member variables should not be declared.</li>
</ol>
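<p>A small class laid out according to these conventions might look like the
following. The class name and its members are invented purely for
illustration.</p>
<source><![CDATA[
```java
class ConventionSketch {
    /** static final member: upper case, no "m_" prefix, no accessor required. */
    public static final int DEFAULT_LIMIT = 10;

    // Constructors first.
    ConventionSketch(int limit) {
        m_limit = limit;
    }

    // Public API methods second.
    public boolean accepts(int value) {
        return !isNegative(value) && value <= m_limit;
    }

    // Package specific, protected, and private methods next.
    private boolean isNegative(int value) {
        return value < 0;
    }

    // Member variables last, each followed by its bean-style accessors.
    private int m_limit;

    public int getLimit() {
        return m_limit;
    }

    public void setLimit(int limit) {
        m_limit = limit;
    }

    public static void main(String[] args) {
        ConventionSketch sketch = new ConventionSketch(DEFAULT_LIMIT);
        System.out.println(sketch.accepts(3));
    }
}
```
]]></source>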
</s2>
<s2 title="Open Issues">
<p><link idref="open-issues">Open Issues</link></p>
<p>This section documents architectural and design issues that I still
consider to be open or unsolved. (This list is ongoing, and will change over
time... it's simply a place for me to note problems that are ongoing and need
to be solved.)</p>
<gloss>
<label>Space stripping</label>
<item>In Xalan 1.x, it is clear that space stripping was a major
performance issue. This needs to be solved in Xalan 2.0 by stripping the
space nodes as the document is being parsed. This is a major problem though for
DOM trees. This can perhaps be solved by preprocessing the DOM tree and
creating a table of space-stripping parent elements, when the nodes can't be
pre-stripped.</item>
</gloss>
</s2>
</s1>
1.1 xml-xalan/xdocs/sources/design/org_apache.gif
<<Binary file>>
1.1 xml-xalan/xdocs/sources/design/trax.gif
<<Binary file>>
1.1 xml-xalan/xdocs/sources/design/xpath.gif
<<Binary file>>