Posted to commits@xalan.apache.org by sh...@apache.org on 2014/05/16 18:11:35 UTC

svn commit: r1595253 [14/18] - in /xalan/java/branches/WebSite: ./ xalan-j/ xalan-j/design/ xalan-j/design/resources/ xalan-j/resources/ xalan-j/xsltc/ xalan-j/xsltc/resources/

Added: xalan/java/branches/WebSite/xalan-j/xsltc/xsl_whitespace_design.html
URL: http://svn.apache.org/viewvc/xalan/java/branches/WebSite/xalan-j/xsltc/xsl_whitespace_design.html?rev=1595253&view=auto
==============================================================================
--- xalan/java/branches/WebSite/xalan-j/xsltc/xsl_whitespace_design.html (added)
+++ xalan/java/branches/WebSite/xalan-j/xsltc/xsl_whitespace_design.html Fri May 16 16:11:33 2014
@@ -0,0 +1,471 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html>
+<head>
+<title>ASF: &lt;xsl:strip/preserve-space&gt;</title>
+<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
+<meta http-equiv="Content-Style-Type" content="text/css" />
+<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
+</head>
+<!--
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the  "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ -->
+<body>
+<div id="title">
+<table class="HdrTitle">
+<tbody>
+<tr>
+<th rowspan="2">
+<a href="../index.html">
+<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
+</a>
+</th>
+<th text-align="center" width="75%">
+<a href="index.html">XSLTC Design</a>
+</th>
+</tr>
+<tr>
+<td valign="middle">&lt;xsl:strip/preserve-space&gt;</td>
+</tr>
+</tbody>
+</table>
+<table class="HdrButtons" align="center" border="1">
+<tbody>
+<tr>
+<td>
+<a href="http://www.apache.org">Apache Foundation</a>
+</td>
+<td>
+<a href="http://xalan.apache.org">Xalan Project</a>
+</td>
+<td>
+<a href="http://xerces.apache.org">Xerces Project</a>
+</td>
+<td>
+<a href="http://www.w3.org/TR">Web Consortium</a>
+</td>
+<td>
+<a href="http://www.oasis-open.org/standards">Oasis Open</a>
+</td>
+</tr>
+</tbody>
+</table>
+</div>
+<div id="navLeft">
+<ul>
+<li>
+<a href="index.html">Overview</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_compiler.html">Compiler design</a>
+</li></ul><hr /><ul>
+<li>Whitespace<br />
+</li>
+<li>
+<a href="xsl_sort_design.html">xsl:sort</a>
+</li>
+<li>
+<a href="xsl_key_design.html">Keys</a>
+</li>
+<li>
+<a href="xsl_comment_design.html">Comment design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_lang_design.html">lang()</a>
+</li>
+<li>
+<a href="xsl_unparsed_design.html">Unparsed entities</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_if_design.html">If design</a>
+</li>
+<li>
+<a href="xsl_choose_design.html">Choose|When|Otherwise design</a>
+</li>
+<li>
+<a href="xsl_include_design.html">Include|Import design</a>
+</li>
+<li>
+<a href="xsl_variable_design.html">Variable|Param design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_runtime.html">Runtime</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_dom.html">Internal DOM</a>
+</li>
+<li>
+<a href="xsltc_namespace.html">Namespaces</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_trax.html">Translet &amp; TrAX</a>
+</li>
+<li>
+<a href="xsltc_predicates.html">XPath Predicates</a>
+</li>
+<li>
+<a href="xsltc_iterators.html">Xsltc Iterators</a>
+</li>
+<li>
+<a href="xsltc_native_api.html">Xsltc Native API</a>
+</li>
+<li>
+<a href="xsltc_trax_api.html">Xsltc TrAX API</a>
+</li>
+<li>
+<a href="xsltc_performance.html">Performance Hints</a>
+</li>
+</ul>
+</div>
+<div id="content">
+<h2>&lt;xsl:strip/preserve-space&gt;</h2>
+
+  <ul>
+    <li>
+<a href="#functionality">Functionality</a>
+</li>
+    <li>
+<a href="#identify">Identifying strippable whitespace nodes</a>
+</li>
+    <li>
+<a href="#which">Determining which nodes to strip</a>
+</li>
+    <li>
+<a href="#strip">Stripping nodes</a>
+</li>
+    <li>
+<a href="#filter">Filtering whitespace nodes</a>
+</li>
+  </ul>
+  
+  <a name="functionality">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Functionality</h3>
+
+  <p>The <code>&lt;xsl:strip-space&gt;</code> and <code>&lt;xsl:preserve-space&gt;</code>
+  elements are used to control the way whitespace nodes in the source XML
+  document are handled. These elements have no impact on whitespace in the XSLT
+  stylesheet. Both elements can occur only as top-level elements, possibly more
+  than once, and both are always empty.</p>
+ 
+  <p>Both elements take one attribute, "elements", which contains a
+  whitespace-separated list of named nodes which should be stripped from or
+  preserved in the source document. These names can be in one of these three
+  formats (NameTest format):</p>
+
+  <ul>
+    <li>
+      Whitespace nodes under all elements:
+      <code>elements="*"</code>
+    </li>
+    <li>
+      Whitespace nodes under all elements in a namespace:
+      <code>elements="&lt;namespace&gt;:*"</code>
+    </li>
+    <li>
+      Whitespace nodes under specific elements: <code>elements="&lt;qname&gt;"</code>
+    </li>
+  </ul>
+
+  <a name="identify">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Identifying strippable whitespace nodes</h3>
+
+  <p>The DOM detects all text nodes and assigns them the type <code>TEXT</code>.
+  All text nodes are scanned to detect whitespace-only nodes. A text-node is
+  considered a whitespace node only if it consists entirely of characters from
+  the set { 0x09, 0x0a, 0x0d, 0x20 }. The DOM implementation class has a static
+  method used to detect such nodes:</p>
+
+<blockquote class="source">
+<pre>
+    private static final boolean isWhitespaceChar(char c) {
+        return c == 0x20 || c == 0x0A || c == 0x0D || c == 0x09;
+    }
+</pre>
+</blockquote>
+
+  <p>The characters are checked in probable order.</p>
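+  <p>As a rough illustration of how such a character test extends to a whole
+  text node, the following sketch scans a string and reports whether it is
+  whitespace-only. The class name <code>WhitespaceCheck</code> and the method
+  <code>isWhitespaceOnly()</code> are hypothetical and are not taken from the
+  DOM implementation class; treating an empty string as whitespace-only is
+  also an assumption of the sketch.</p>
+<blockquote class="source">
+<pre>
```java
// Sketch only: every name except isWhitespaceChar is hypothetical.
final class WhitespaceCheck {

    // Same test as the DOM helper quoted above.
    static boolean isWhitespaceChar(char c) {
        return c == 0x20 || c == 0x0A || c == 0x0D || c == 0x09;
    }

    // True when every character of the text is one of the four
    // whitespace characters. An empty string counts as whitespace-only
    // here, which is an assumption of this sketch.
    static boolean isWhitespaceOnly(String text) {
        for (char c : text.toCharArray()) {
            if (!isWhitespaceChar(c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isWhitespaceOnly(" \t\r\n")); // true
        System.out.println(isWhitespaceOnly("  a  "));   // false
    }
}
```
+</pre>
+</blockquote>
+  <p>Note that other Unicode space characters, such as the no-break space
+  0xA0, are not in the set and therefore make a node non-strippable.</p>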
+
+  <p> The DOM has a bit-array that is used to  tag text-nodes as strippable
+  whitespace nodes:</p>
+
+  <blockquote class="source">
+<pre>private int[] _whitespace;</pre>
+</blockquote>
+
+  <p>There are two methods in the DOM implementation class for accessing this
+  bit-array: <code>markWhitespace(node)</code> and <code>isWhitespace(node)</code>.
+  The array is resized together with all other arrays in the DOM by the
+  <code>DOM.resizeArrays()</code> method. The bits in the array are set in the
+  <code>DOM.maybeCreateTextNode()</code> method. This method must know whether
+  the current node is located under an element with an
+  <code>xml:space="&lt;value&gt;"</code> attribute in the DOM, in which
+  case it is not a strippable whitespace node.</p>
+
+  <p>An auxiliary class, WhitespaceHandler, is used for this purpose. The class
+  works somewhat like a stack, where you "push" a new strip/preserve setting
+  together with the node in which this setting was determined. This means that
+  each time the DOM builder encounters an <code>xml:space</code> attribute
+  it will invoke a method on an instance of the WhitespaceHandler class to
+  signal that a new preserve/strip setting has been encountered. This is done
+  in the <code>makeAttributeNode()</code> method. The whitespace handler stores the
+  new setting and pushes the current element node on its stack. When the
+  DOM builder closes up an element (in <code>endElement()</code>), it invokes
+  another method of the whitespace handler to check if the strip/preserve
+  setting is still valid. If the setting is now invalid (we're closing the
+  element whose node id is on the top of the stack) the handler inverts the
+  setting and pops the element node id off the stack. The previous
+  strip/preserve setting is then valid, and the id of node where this setting
+  was defined is on the top of the stack.</p>
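+  <p>The push/pop behaviour described above might be sketched as follows. The
+  real WhitespaceHandler API is not shown in this document, so the class name
+  <code>WhitespaceHandlerSketch</code>, its methods and the fixed stack
+  capacity are all assumptions. The sketch also restores a saved setting on
+  pop instead of inverting the current one, which gives the same result when
+  settings alternate and is simpler to follow.</p>
+<blockquote class="source">
+<pre>
```java
// Sketch only: names and the fixed capacity are assumptions, not XSLTC code.
final class WhitespaceHandlerSketch {
    private static final int MAX_DEPTH = 64;   // arbitrary limit for the sketch

    // Ids of the elements that carried an xml:space attribute.
    private final int[] nodeStack = new int[MAX_DEPTH];
    // Setting that was in force before each push.
    private final boolean[] savedSetting = new boolean[MAX_DEPTH];
    private int depth = 0;
    private boolean preserve = false;          // false means "default"

    // Called when the DOM builder sees xml:space on element 'node'.
    void xmlSpace(int node, boolean preserveValue) {
        nodeStack[depth] = node;
        savedSetting[depth] = preserve;
        depth++;
        preserve = preserveValue;
    }

    // Called when an element is closed; pops only if that element is the
    // one on top of the stack.
    void endElement(int node) {
        if (depth > 0) {
            if (nodeStack[depth - 1] == node) {
                depth--;
                preserve = savedSetting[depth];
            }
        }
    }

    boolean preserveSpace() {
        return preserve;
    }

    public static void main(String[] args) {
        WhitespaceHandlerSketch h = new WhitespaceHandlerSketch();
        h.xmlSpace(3, true);                    // element 3: xml:space="preserve"
        h.xmlSpace(7, false);                   // element 7: xml:space="default"
        System.out.println(h.preserveSpace()); // false
        h.endElement(7);
        System.out.println(h.preserveSpace()); // true
        h.endElement(3);
        System.out.println(h.preserveSpace()); // false
    }
}
```
+</pre>
+</blockquote>
+  <p>Closing an element that never carried <code>xml:space</code> leaves the
+  stack untouched, which matches the check described for
+  <code>endElement()</code>.</p>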
+
+  <a name="which">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Determining which nodes to strip</h3>
+
+  <p>A text node is never stripped unless it contains only whitespace
+  characters (Unicode characters 0x09, 0x0A, 0x0D and 0x20). Stripping a text
+  node means that the node disappears from the DOM; so that it is never
+  included in the output and that it is ignored by all functions such as
+  <code>count()</code>. A text node is preserved if any of the following apply:</p>
+
+  <ul>
+    <li>
+      the element name of the parent of the text node is in the set of
+      elements listed in <code>&lt;xsl:preserve-space&gt;</code>
+    </li>
+    <li>
+      the text node contains at least one non-whitespace character
+    </li>
+    <li>
+      an ancestor of the whitespace text node has an attribute of
+      <code>xml:space="preserve"</code>, and no closer ancestor has an
+      attribute of <code>xml:space="default"</code>.
+    </li>
+  </ul>
+
+  <p>Otherwise, the text node is stripped. Initially the set of 
+  whitespace-preserving element names contains all element names, so the
+  default behaviour is to preserve all whitespace text nodes.</p>
+
+  <p>This seems simple enough, but resolving conflicts between matching
+  <code>&lt;xsl:strip-space&gt;</code> and <code>&lt;xsl:preserve-space&gt;</code>
+  elements requires a lot of thought. Our first consideration is import
+  precedence; the match with the highest import precedence is always chosen.
+  Import precedence is determined by the order in which the compared elements
+  are visited. (In this case those elements are the top-level
+  <code>&lt;xsl:strip-space&gt;</code> and <code>&lt;xsl:preserve-space&gt;</code>
+  elements.) This example is taken from the XSLT recommendation:</p>
+
+  <ul>
+    <li>stylesheet A imports stylesheets B and C in that order;</li>
+    <li>stylesheet B imports stylesheet D;</li>
+    <li>stylesheet C imports stylesheet E.</li>
+  </ul>
+
+  <p>Then the order of import precedence (lowest first) is D, B, E, C, A.</p>
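+  <p>This ordering corresponds to a post-order walk of the import tree. The
+  following sketch reproduces the example; the <code>Sheet</code> class is
+  hypothetical and only stands in for whatever structure a compiler uses to
+  record imports.</p>
+<blockquote class="source">
+<pre>
```java
// Sketch only: the Sheet class is hypothetical and not part of XSLTC.
final class ImportPrecedence {

    static final class Sheet {
        final String name;
        final Sheet[] imports;   // imported stylesheets, in document order

        Sheet(String name, Sheet... imports) {
            this.name = name;
            this.imports = imports;
        }
    }

    // Post-order walk: everything a stylesheet imports is visited before
    // the stylesheet itself, so names come out lowest precedence first.
    static void collect(Sheet sheet, StringBuilder out) {
        for (Sheet imported : sheet.imports) {
            collect(imported, out);
        }
        if (out.length() > 0) {
            out.append(", ");
        }
        out.append(sheet.name);
    }

    static String order(Sheet root) {
        StringBuilder out = new StringBuilder();
        collect(root, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Sheet d = new Sheet("D");
        Sheet e = new Sheet("E");
        Sheet b = new Sheet("B", d);
        Sheet c = new Sheet("C", e);
        Sheet a = new Sheet("A", b, c);
        System.out.println(order(a)); // D, B, E, C, A
    }
}
```
+</pre>
+</blockquote>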
+
+  <p>Our next consideration is the priority of NameTests (XPath spec):</p>
+  <ul>
+    <li>
+      <code>elements="&lt;qname&gt;"</code> has priority 0
+    </li>
+    <li>
+      <code>elements="&lt;namespace&gt;:*"</code> has priority -0.25
+    </li>
+    <li>
+      <code>elements="*"</code> has priority -0.5
+    </li>
+  </ul>
+
+  <p>It is considered an error if the decision is still ambiguous after this,
+  and it is up to the implementors to decide what the appropriate action is.</p>
+
+  <p>Despite all this complexity, the normal usage for these elements is quite
+  simple; either preserve all whitespace nodes but one type:</p>
+
+  <blockquote class="source">
+<pre>&lt;xsl:strip-space elements="foo"/&gt;</pre>
+</blockquote>
+
+  <p>or strip all whitespace nodes but one type:</p>
+
+  <blockquote class="source">
+<pre>
+    &lt;xsl:strip-space elements="*"/&gt;
+    &lt;xsl:preserve-space elements="foo"/&gt;</pre>
+</blockquote>
+
+  <a name="strip">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Stripping nodes</h3>
+
+  <p>The ultimate goal of our design would be to totally screen all stripped
+  nodes from the translet; to either physically remove them from the DOM or to
+  make it appear as if they are not there. The first approach will cause
+  problems in cases where multiple translets access the same DOM. In the future
+  we wish to let translets run within servlets / JSPs with a common DOM cache.
+  This DOM cache will keep copies of DOMs in memory to prevent the same XML
+  file from being downloaded and parsed several times. This is a scenario we
+  might see:</p>
+
+    <p>
+<img src="DOMInterface.gif" alt="DOMInterface.gif" />
+</p>
+    <p>
+<b>
+<i>Figure 1: Multiple translets accessing a common pool of DOMs</i>
+</b>
+</p>
+
+  <p>The three translets running on this host access a common pool of 4 DOMs.
+  The DOMs are accessed through a common DOM interface. Translets accessing
+  a single DOM will have a DOMAdapter and a single DOMImpl object behind this
+  interface, while translets accessing several DOMs will be given a MultiDOM
+  and a set of DOMImpl objects.</p>
+
+  <p>The translet to the left may want to strip some nodes from the shared DOM
+  in the cache, while the other translets may want to preserve all whitespace
+  nodes. Our initial thought then is to keep the DOM as it is and somehow
+  screen the left-hand translet of all the whitespace nodes it does not want to
+  process. There are a few ways in which we can accomplish this:</p>
+
+  <ul>
+    <li>
+      The translet can, prior to starting to traverse the DOM, send a reference
+      to the tables containing information on which nodes we want stripped to
+      the DOM interface. The DOM interface is then responsible for hiding all
+      stripped whitespace nodes from the iterators and the translet. A problem
+      with this approach is that we want to omit the DOM interface layer if
+      the translet is only accessing a single DOM. The DOM interface layer will
+      only be instantiated by the translet if the stylesheet contained a call
+      to the <code>document()</code> function.<br />
+<br />
+    </li>
+    <li>
+      The translet can provide its iterators with information on which nodes it
+      does not want to see. The translet is still shielded from unwanted
+      whitespace nodes, but it has the hassle of passing extra information over
+      to most iterators it ever instantiates. Note that not all iterators
+      need to be aware of whitespace nodes in this case. If you have a look at
+      the figure again you will see that only the first level iterator (that is
+      the one closest to the DOM or DOM interface) will have to strip off
+      whitespace nodes. But, there may be several iterators that operate
+      directly on the DOM (invoked by code handling XSL functions such as
+      <code>count()</code>) and every single one of those will need to be told
+      which whitespace nodes the translet does not want to see.<br />
+<br />
+    </li>
+    <li>
+      The third approach will take advantage of the fact that not all
+      translets will want to strip whitespace nodes. The most effective way of
+      removing unwanted whitespace nodes is to do it once and for all, before
+      the actual traversal of the DOM starts. This can be done by making a
+      clone of the DOM with exclusive-access rights for this translet only. We
+      still gain performance from the cache because we do not have to pay the
+      cost of the delay caused by downloading and parsing the XML source file.
+      The cost we have to pay is the time needed for the actual cloning and the
+      extra memory we use.<br />
+<br />
+      Normally one would imagine the translet (or the wrapper class that
+      invokes the translet) calls the DOM cache with just a URL and receives
+      a reference to an instantiated DOM. The cache will either have built
+      this DOM on-demand or just passed back a reference to an existing tree.
+      In this case the DOM would need an extra call that a translet would use
+      to clone a DOM, passing the existing DOM reference to the cache and
+      receiving a new reference to the cloned DOM. The translet can then do
+      whatever it wants with this DOM (the cache need not even keep a reference
+      to this tree).
+    </li>
+  </ul>
+  
+  <p>We are lucky enough to be able to combine the first two approaches. All
+  iterators that directly access the DOM (axis iterators) are instantiated by
+  calls to the DOM interface layer (the DOM class). The actual iterators are
+  created in the DOM implementation layer (the DOMImpl class). So, we can pass
+  references to the preserve/strip whitespace tables to the DOM, and the DOM
+  will make sure that all axis iterators return node sets with respect to these
+  tables.</p>
+  <a name="filter">‌</a> 
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Filtering whitespace nodes</h3>
+
+  <p>For each axis iterator and for <code>DOM.makeStringValue()</code> and
+  <code>DOM.stringValueAux()</code> we must apply a filter for eliminating all
+  unwanted whitespace nodes. To achieve this we must build a very efficient
+  predicate for determining if the current node should be stripped or not. This
+  predicate is built by <code>Whitespace.compilePredicate()</code>. This method is
+  static and builds a predicate for a vector of WhitespaceRule objects. (The
+  WhitespaceRule class is defined within the Whitespace class.) Each
+  WhitespaceRule object contains information for a single element listed
+  in an <code>&lt;xsl:strip/preserve-space&gt;</code> element, and is broken down
+  into the following information:</p>
+
+  <ul>
+    <li>the namespace (can be the default namespace)</li>
+    <li>the element name or "<code>*</code>"</li>
+    <li>the type of rule; NS:EL, NS:<code>*</code> or <code>*</code>
+</li>
+    <li>the priority of the rule (based on import precedence and type)</li>
+    <li>the action; either strip or preserve</li>
+  </ul>
+
+  <p>The Vector of WhitespaceRules is arranged in order of priority and
+  redundant rules are removed. A predicate method is then compiled into the
+  translet as:</p>
+
+<blockquote class="source">
+<pre>
+    public boolean stripSpace(int node);
+</pre>
+</blockquote>
+
+  <p>Unfortunately this method cannot be declared static.</p>
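+  <p>XSLTC generates the predicate as translet bytecode, but the decision it
+  has to encode can be sketched in plain Java. In the sketch below the
+  <code>Rule</code> class, its fields and the lookup by namespace and element
+  name (rather than by node id) are assumptions made for illustration; only
+  the ordering by import precedence and then by NameTest priority comes from
+  the text above. When two matching rules tie, the sketch keeps the first one
+  it saw, which is just one possible recovery.</p>
+<blockquote class="source">
+<pre>
```java
// Sketch only: XSLTC compiles this decision into the translet as bytecode.
// The Rule class, its fields and the name-based lookup are assumptions.
final class StripSpaceSketch {

    static final class Rule {
        final String namespace;      // namespace URI, "" for none, "*" for any
        final String local;          // element name, or "*"
        final int importPrecedence;  // higher wins
        final double priority;       // 0, -0.25 or -0.5, from the NameTest type
        final boolean strip;         // true: strip-space, false: preserve-space

        Rule(String namespace, String local, int importPrecedence, boolean strip) {
            this.namespace = namespace;
            this.local = local;
            this.importPrecedence = importPrecedence;
            this.strip = strip;
            if (!local.equals("*")) {
                this.priority = 0.0;       // qname
            } else if (namespace.equals("*")) {
                this.priority = -0.5;      // *
            } else {
                this.priority = -0.25;     // namespace:*
            }
        }

        boolean matches(String ns, String name) {
            if (!namespace.equals("*")) {
                if (!namespace.equals(ns)) {
                    return false;
                }
            }
            return local.equals("*") || local.equals(name);
        }
    }

    // Import precedence is compared first, NameTest priority second.
    static boolean beats(Rule a, Rule b) {
        if (a.importPrecedence != b.importPrecedence) {
            return a.importPrecedence > b.importPrecedence;
        }
        return a.priority > b.priority;
    }

    // Decides for whitespace text whose parent element is {ns}name.
    static boolean stripSpace(Rule[] rules, String ns, String name) {
        Rule best = null;
        for (Rule r : rules) {
            if (r.matches(ns, name)) {
                if (best == null || beats(r, best)) {
                    best = r;
                }
            }
        }
        if (best == null) {
            return false;   // no rule matched: preserve by default
        }
        return best.strip;
    }

    public static void main(String[] args) {
        Rule[] rules = {
            new Rule("*", "*", 1, true),     // like strip-space elements="*"
            new Rule("", "foo", 1, false)    // like preserve-space elements="foo"
        };
        System.out.println(stripSpace(rules, "", "bar")); // true
        System.out.println(stripSpace(rules, "", "foo")); // false
    }
}
```
+</pre>
+</blockquote>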
+
+  <p>When the Stylesheet object compiles the <code>topLevel()</code> method of the
+  translet it checks for the existence of the <code>stripSpace()</code> method. If
+  this method exists the <code>topLevel()</code> will be compiled to pass the
+  translet to the DOM as a StripWhitespaceFilter (the translet implements this
+  interface when the <code>stripSpace()</code> method is compiled).</p>
+
+  <p>All axis iterators and the <code>DOM.makeStringValue()</code> and
+  <code>DOM.stringValueAux()</code> methods check for the existence of this filter
+  (it is kept in a global variable in the DOM implementation class) and take
+  the appropriate actions. The methods in the DOM for returning axis iterators
+  will place a StrippingIterator on top of the axis iterator if the filter is
+  present, and the two methods just mentioned will return empty strings for
+  whitespace nodes that should be stripped.</p>
+ 
+  
+<p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+</div>
+<div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
+</div>
+</body>
+</html>

Propchange: xalan/java/branches/WebSite/xalan-j/xsltc/xsl_whitespace_design.html
------------------------------------------------------------------------------
    svn:eol-style = native

Added: xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_compiler.html
URL: http://svn.apache.org/viewvc/xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_compiler.html?rev=1595253&view=auto
==============================================================================
--- xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_compiler.html (added)
+++ xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_compiler.html Fri May 16 16:11:33 2014
@@ -0,0 +1,560 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html>
+<head>
+<title>ASF: XSLTC Compiler Design</title>
+<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
+<meta http-equiv="Content-Style-Type" content="text/css" />
+<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
+</head>
+<!--
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the  "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ -->
+<body>
+<div id="title">
+<table class="HdrTitle">
+<tbody>
+<tr>
+<th rowspan="2">
+<a href="../index.html">
+<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
+</a>
+</th>
+<th text-align="center" width="75%">
+<a href="index.html">XSLTC Design</a>
+</th>
+</tr>
+<tr>
+<td valign="middle">XSLTC Compiler Design</td>
+</tr>
+</tbody>
+</table>
+<table class="HdrButtons" align="center" border="1">
+<tbody>
+<tr>
+<td>
+<a href="http://www.apache.org">Apache Foundation</a>
+</td>
+<td>
+<a href="http://xalan.apache.org">Xalan Project</a>
+</td>
+<td>
+<a href="http://xerces.apache.org">Xerces Project</a>
+</td>
+<td>
+<a href="http://www.w3.org/TR">Web Consortium</a>
+</td>
+<td>
+<a href="http://www.oasis-open.org/standards">Oasis Open</a>
+</td>
+</tr>
+</tbody>
+</table>
+</div>
+<div id="navLeft">
+<ul>
+<li>
+<a href="index.html">Overview</a>
+</li></ul><hr /><ul>
+<li>Compiler design<br />
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_whitespace_design.html">Whitespace</a>
+</li>
+<li>
+<a href="xsl_sort_design.html">xsl:sort</a>
+</li>
+<li>
+<a href="xsl_key_design.html">Keys</a>
+</li>
+<li>
+<a href="xsl_comment_design.html">Comment design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_lang_design.html">lang()</a>
+</li>
+<li>
+<a href="xsl_unparsed_design.html">Unparsed entities</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_if_design.html">If design</a>
+</li>
+<li>
+<a href="xsl_choose_design.html">Choose|When|Otherwise design</a>
+</li>
+<li>
+<a href="xsl_include_design.html">Include|Import design</a>
+</li>
+<li>
+<a href="xsl_variable_design.html">Variable|Param design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_runtime.html">Runtime</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_dom.html">Internal DOM</a>
+</li>
+<li>
+<a href="xsltc_namespace.html">Namespaces</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_trax.html">Translet &amp; TrAX</a>
+</li>
+<li>
+<a href="xsltc_predicates.html">XPath Predicates</a>
+</li>
+<li>
+<a href="xsltc_iterators.html">Xsltc Iterators</a>
+</li>
+<li>
+<a href="xsltc_native_api.html">Xsltc Native API</a>
+</li>
+<li>
+<a href="xsltc_trax_api.html">Xsltc TrAX API</a>
+</li>
+<li>
+<a href="xsltc_performance.html">Performance Hints</a>
+</li>
+</ul>
+</div>
+<div id="content">
+<h2>XSLTC Compiler Design</h2>
+
+  <ul>  
+    <li>
+<a href="#overview">Compiler Overview</a>
+</li>
+    <li>
+<a href="#ast">Building the Abstract Syntax Tree</a>
+</li>
+    <li>
+<a href="#typecheck">Type-check and Cast Expressions</a>
+</li>
+    <li>
+<a href="#compile">JVM byte-code generation</a>
+</li>
+  </ul>
+
+  
+
+  <a name="overview">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Compiler overview</h3>
+
+    <p>The main component of the XSLTC compiler is the class</p>   
+    <ul>
+      <li>
+<code>org.apache.xalan.xsltc.compiler.XSLTC</code>
+</li>
+    </ul>
+
+    <p>This class uses three parsers to consume the input stylesheet(s):</p>
+
+    <ul>
+      <li>
+<code>javax.xml.parsers.SAXParser</code>
+</li>
+    </ul>
+
+    <p>is used to parse the stylesheet document and pass its contents to
+    the compiler as basic SAX2 events.</p>
+
+    <ul>
+      <li>
+<code>com.sun.xslt.compiler.XPathParser</code>
+</li>
+    </ul>
+
+    <p> is a parser used to parse XPath expressions and patterns. This parser
+    is generated using JavaCUP and JavaLEX from Princeton University.</p>
+
+    <ul>
+      <li>
+<code>com.sun.xslt.compiler.Parser</code>
+</li>
+    </ul>
+
+    <p>is a wrapper for the other two parsers. This parser is responsible for
+    using the other two parsers to build the compiler's abstract syntax tree
+    (which is described in more detail in the next section of this document).
+    </p>
+
+  
+
+  
+  <a name="ast">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Building an Abstract Syntax Tree</h3>
+
+    <p>An abstract syntax tree (AST) is a data-structure commonly used by
+    compilers to separate the parse-phase from the later phases of the
+    compilation. The AST has one node for each parsed token from the stylesheet
+    and can easily be parsed at the stages of type-checking and bytecode
+    generation.</p>
+
+    <ul>
+      <li>
+        <a href="#mapping">Mapping stylesheet elements to AST nodes</a>
+      </li>
+      <li>
+        <a href="#domxsl">Building the AST from AST nodes</a>
+      </li>
+      <li>
+        <a href="#mapping">Mapping XPath expressions and patterns to additional AST nodes</a>
+      </li>
+    </ul>
+
+    <p>The SAX parser passes the contents of the stylesheet to XSLTC's main
+    parser. The SAX events represent a decomposition of the XML document that
+    contains the stylesheet. The main parser needs to create one AST node from
+    each node that it receives from the SAX parser. It also needs to use the
+    XPath parser to decompose attributes that contain XPath expressions and
+    patterns. Remember that XSLT is in effect two languages: XML and XPath,
+    and one parser is needed for each of these languages. The SAX parser breaks
+    down the stylesheet document, the XPath parser breaks down XPath expressions
+    and patterns, and the main parser maps the decomposed elements into nodes
+    in the abstract syntax tree.</p>
+
+    <a name="mapping">‌</a>
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Mapping stylesheet elements to AST nodes</h4>
+
+    <p>Every element that is defined in the XSLT 1.0 spec is represented by
+    a class in the <code>org.apache.xalan.xsltc.compiler</code> package. The
+    main parser class contains a <code>Hashtable</code> that maps XSL
+    elements into Java classes that make up the nodes in the AST. These Java
+    classes all reside in the <code>org.apache.xalan.xsltc.compiler</code>
+    package and extend either the <code>TopLevelElement</code> or the
+    <code>Instruction</code> classes. (Both these classes extend the
+    <code>SyntaxTreeNode</code> class.)</p>
+
+    <p>The mapping from XSL element names to Java classes/AST nodes is set up
+    in the <code>initStdClasses()</code> method of the main parser:</p>
+<blockquote class="source">
+<pre>
+    private void initStdClasses() {
+	try {
+	    initStdClass("template",    "Template");
+	    initStdClass("param",       "Param");
+	    initStdClass("with-param",  "WithParam");
+	    initStdClass("variable",    "Variable");
+	    initStdClass("output",      "Output");
+	    :
+	    :
+	    :
+	}
+    }
+
+    // Maps one XSL element name to the compiler class that represents it.
+    private void initStdClass(String elementName, String className)
+	throws ClassNotFoundException {
+	_classes.put(elementName,
+		     Class.forName(COMPILER_PACKAGE + '.' + className));
+    }</pre>
+</blockquote>
+
+    
+
+    <a name="domxsl">‌</a>
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Building the AST from AST nodes</h4>
+    <p>The parser builds an AST from the various syntax tree nodes. Each node
+    contains a reference to its parent node, a vector containing references
+    to all child nodes and a structure containing all attribute nodes:</p>
+<blockquote class="source">
+<pre>
+    protected SyntaxTreeNode _parent; // Parent node
+    private   Vector _contents;       // Child nodes
+    protected Attributes _attributes; // Attributes of this element</pre>
+</blockquote>
+
+
+    <p>These variables should be accessed using these methods:</p>
+<blockquote class="source">
+<pre>
+    protected final SyntaxTreeNode getParent();
+    protected final Vector getContents();
+    protected String getAttribute(String qname);
+    protected Attributes getAttributes();</pre>
+</blockquote>
+
+    <p>At this time the AST only contains nodes that represent the XSL elements
+    from the stylesheet. A SAX parser is generic and can only handle XML files
+    and will not break up and identify XPath patterns/expressions (these are
+    stored as attributes to the various nodes in the tree). Each XSL instruction
+    gets its own node in the AST, and the XPath patterns/expressions are stored
+    as attributes of these nodes. A stylesheet looking like this:</p>
+<blockquote class="source">
+<pre>
+    &lt;xsl:stylesheet .......&gt;
+      &lt;xsl:template match="chapter"&gt;
+        &lt;xsl:text&gt;Chapter&lt;/xsl:text&gt;
+        &lt;xsl:value-of select="."/&gt;
+      &lt;/xsl:template&gt;
+    &lt;/xsl:stylesheet&gt;</pre>
+</blockquote>
+
+    <p>will be stored in the AST as indicated in the following picture:</p>
+    <p>
+<img src="ast_stage1.gif" alt="ast_stage1.gif" />
+</p>
+    <p>
+<b>
+<i>Figure 1: The AST in its first stage</i>
+</b>
+</p>
+
+    <p>All objects that make up the nodes in the initial AST have a
+    <code>parseContents()</code> method. This method is responsible for:</p>
+
+    <ul>
+      <li>parsing the values of those attributes that contain XPath expressions
+      or patterns, breaking each expression/pattern into AST nodes and inserting
+      them into the tree.</li>
+      <li>reading/checking all other required attributes</li>
+      <li>propagating the <code>parseContents()</code> call down the tree</li>
+    </ul>
+    
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Mapping XPath expressions and patterns to additional AST nodes</h4>
+
+    <p>The nodes that represent the XPath expressions and patterns extend
+    either the <code>Expression</code> or <code>Pattern</code> class
+    respectively. These nodes are not appended to the <code>_contents</code>
+    vector of each node, but rather stored as individual references in each
+    AST element node. One example is the <code>ForEach</code> class that
+    represents the <code>&lt;xsl:for-each&gt;</code> element. This class has
+    a variable that contains a reference to the AST sub-tree that represents
+    its <code>select</code> attribute:</p>
+<blockquote class="source">
+<pre>
+    private Expression _select;</pre>
+</blockquote>
+   
+    <p>There is no standard way of storing these XPath expressions and each
+    AST node that contains one or more XPath expression/pattern must handle
+    that itself. This handling basically involves passing the attribute's
+    value to the XPath parser and receiving back an AST sub-tree.</p>
+
+    <p>With all XPath expressions/patterns expanded, the AST will look somewhat
+    like this:</p>
+
+    <p>
+<img src="ast_stage2.gif" alt="ast_stage2.gif" />
+</p>
+    <p>
+<b>
+<i>Figure 2: The AST in its second stage</i>
+</b>
+</p>
+
+    
+  
+
+  
+
+  <a name="typecheck">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Type-check and Cast Expressions</h3>
+
+    <p>In many cases we will need to typecast the top node in the expression
+    sub-tree to suit the expected result-type of the expression, or to typecast
+    child nodes to suit the allowed types for the various operators in the
+    expression. This is done by calling <code>typeCheck()</code> on the root-node in the
+    XSL tree. Each SyntaxTreeNode node is responsible for inserting type-cast
+    nodes between itself and its child nodes or XPath nodes. These type-cast
+    nodes will convert the output-types of the child/XPath nodes to the expected
+    input-type of the parent node. Let's look at our AST again and the node that
+    represents the <code>&lt;xsl:value-of&gt;</code> element. This element
+    expects to receive a string from its <code>select</code> XPath expression,
+    but the <code>Step</code> expression will return either a node-set or a
+    single node. An extra node is inserted into the AST to perform the
+    necessary type conversions:</p>
+
+    <p>
+<img src="ast_stage3.gif" alt="ast_stage3.gif" />
+</p>
+    <p>
+<b>
+<i>Figure 3: XPath expression type cast</i>
+</b>
+</p>
+
+    <p>The <code>typeCheck()</code> method of each SyntaxTreeNode object will
+    call <code>typeCheck()</code> on each of its XPath expressions. This method
+    will return the native type returned by the expression. The AST node will
+    insert an additional type-conversion node if the return-type does not match
+    the expected data-type. Each possible return type is represented by a class
+    in the <code>org.apache.xalan.xsltc.compiler.util</code> package. These
+    classes all contain methods that will generate bytecodes needed to perform
+    the actual type conversions (at runtime). The type-cast nodes in the AST
+    mainly consist of calls to these methods.</p>
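+    <p>The cast-insertion step can be sketched as follows. This is a minimal
+    illustration and not XSLTC's actual code: the <code>Type</code> enum and the
+    <code>Expr</code>, <code>StepExpr</code>, <code>CastExpr</code> and
+    <code>ValueOf</code> types are hypothetical stand-ins for the real compiler
+    classes, and no bytecode is generated here.</p>

```java
/** Hypothetical, simplified model of type-checking with cast insertion. */
class TypeCheckSketch {
    enum Type { NODE_SET, STRING }

    interface Expr {
        /** Returns the native result type of this expression. */
        Type typeCheck();
    }

    /** Stand-in for a location step, which natively yields a node-set. */
    static final class StepExpr implements Expr {
        public Type typeCheck() { return Type.NODE_SET; }
    }

    /** Conversion node placed between a parent node and a child expression. */
    static final class CastExpr implements Expr {
        final Expr inner;
        final Type target;

        CastExpr(Expr inner, Type target) {
            this.inner = inner;
            this.target = target;
        }

        public Type typeCheck() { return target; }
    }

    /** Stand-in for xsl:value-of, which expects a string from its select expression. */
    static final class ValueOf {
        Expr select;

        ValueOf(Expr select) { this.select = select; }

        void typeCheck() {
            Type actual = select.typeCheck();
            if (actual != Type.STRING) {
                // Wrap the expression so the parent receives the type it expects
                select = new CastExpr(select, Type.STRING);
            }
        }
    }

    public static void main(String[] args) {
        ValueOf valueOf = new ValueOf(new StepExpr());
        valueOf.typeCheck();
        System.out.println(valueOf.select.getClass().getSimpleName());
    }
}
```

+    <p>After <code>typeCheck()</code> has run, the <code>select</code> reference
+    of the hypothetical <code>ValueOf</code> node points at a cast node that
+    wraps the original step.</p>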
+  
+
+  
+
+  <a name="compile">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>JVM byte-code generation</h3>
+
+    <ul>
+      <li>
+<a href="#stylesheet">Compiling the stylesheet</a>
+</li>
+      <li>
+<a href="#toplevel">Compiling top-level elements</a>
+</li>
+      <li>
+<a href="#templates">Compiling template code</a>
+</li>
+      <li>
+<a href="#instructions">Compiling instructions, functions, expressions and patterns</a>
+</li>
+    </ul>
+
+    <p>Every node in the AST extends the <code>SyntaxTreeNode</code> base class
+    and implements the <code>translate()</code> method. This method is
+    responsible for outputting the actual bytecodes that make up the
+    functionality required for each element, function, expression or pattern.
+    </p>
+
+    <a name="stylesheet">‌</a>
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Compiling the stylesheet</h4>
+    <p>Some nodes in the AST require more complex code than others. The best
+    example is the <code>&lt;xsl:stylesheet&gt;</code> element. The code that
+    represents this element has to tie together the code that is generated by
+    all the other elements and generate the actual class definition for the main
+    translet class. The <code>Stylesheet</code> class generates the translet's
+    constructor and methods that handle all top-level elements.</p>
+    
+
+    <a name="toplevel">‌</a>
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Compiling top-level elements</h4>
+    <p>The bytecode that handles top-level elements must be generated before any
+    other code. The <code>translate()</code> methods in these classes are
+    mainly called from these methods in the Stylesheet class:</p>
+<blockquote class="source">
+<pre>
+    private String compileBuildKeys(ClassGenerator);
+    private String compileTopLevel(ClassGenerator, Enumeration);
+    private void compileConstructor(ClassGenerator, Output);</pre>
+</blockquote>
+
+    <p>These methods handle most top-level elements, such as global variables
+    and parameters, <code>&lt;xsl:output&gt;</code> and
+    <code>&lt;xsl:decimal-format&gt;</code> instructions.</p>
+    
+
+    <a name="templates">‌</a>
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Compiling template code</h4>
+    <p>All XPath patterns in <code>&lt;xsl:template&gt;</code>
+    elements are converted into numeric values (known as the pattern's
+    kernel 'type'). All templates with identical pattern kernel types are
+    grouped together and inserted into a table known as a test sequence.
+    (The table of test sequences is found in the Mode class in the compiler
+    package. There will be one such table for each mode that is used in the
+    stylesheet). This table is used to build a big <code>switch()</code>
+    statement in the translet's <code>applyTemplates()</code> method. This
+    method is initially called with the root node of the input document.</p>
+
+    <p>The <code>applyTemplates()</code> method determines the node's type and
+    passes this type to the <code>switch()</code> statement to look up the
+    matching template. The test sequence code (the <code>TestSeq</code> class)
+    is responsible for inserting bytecodes to find one matching template
+    in cases where more than one template matches the current node type.</p>
+
+    <p>There may be several templates that share the same pattern kernel type.
+    Here are a few examples of templates with patterns that all have the same
+    kernel type:</p>
+<blockquote class="source">
+<pre>
+    &lt;xsl:template match="A/C"&gt;
+    &lt;xsl:template match="A/B/C"&gt;
+    &lt;xsl:template match="A | C"&gt;</pre>
+</blockquote>
+
+    <p>All these templates will be grouped under the type for
+    <code>&lt;C&gt;</code> and will all get the same kernel type (the type for
+    <code>"C"</code>). The last template will be grouped both under
+    <code>"C"</code> and <code>"A"</code>, since it matches either element.
+    If the type identifier for <code>"C"</code> in this case is 8, all these
+    templates will be put under <code>case 8:</code> in
+    <code>applyTemplates()</code>'s big <code>switch()</code> statement. The
+    <code>TestSeq</code> class will insert some code under the
+    <code>case 8:</code> statement (similar to if's and then's) in order to
+    determine which of the three templates to trigger.</p>
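+    <p>The grouping of templates by kernel type can be sketched as follows.
+    This is a simplified illustration under stated assumptions: the class
+    <code>KernelGroupingSketch</code> and its methods are hypothetical, kernel
+    types are represented by element names rather than the integers XSLTC
+    uses, and the naive pattern handling only understands child steps
+    (<code>/</code>) and unions (<code>|</code>).</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: group match patterns by the name test of their last step. */
class KernelGroupingSketch {

    /** Kernel of one path alternative: the name test of its last step. */
    static String kernelOf(String path) {
        String trimmed = path.trim();
        int slash = trimmed.lastIndexOf('/');
        return (slash < 0) ? trimmed : trimmed.substring(slash + 1);
    }

    /** Maps each kernel to the patterns that must be tested for that node type. */
    static Map<String, List<String>> group(List<String> patterns) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String pattern : patterns) {
            // A union pattern is filed under the kernel of every alternative
            for (String alternative : pattern.split("\\|")) {
                groups.computeIfAbsent(kernelOf(alternative), k -> new ArrayList<>())
                      .add(pattern);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(group(Arrays.asList("A/C", "A/B/C", "A | C")));
    }
}
```

+    <p>Each map entry corresponds to one <code>case</code> label in the
+    <code>switch()</code> statement; the list under the entry is what a test
+    sequence would have to choose from.</p>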
+    
+
+    <a name="instructions">‌</a>
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Compiling instructions, functions, expressions and patterns</h4>
+
+    <p>The template code is generated by calling <code>translate()</code> on
+    each <code>Template</code> object in the abstract syntax tree. This call
+    will be propagated down the abstract syntax tree and every element will
+    output the bytecodes necessary to complete its task.</p>
+
+    <p>The Java Virtual Machine is stack-based, which goes hand-in-hand with
+    the tree structure of a stylesheet and the AST. A node in the AST will
+    call <code>translate()</code> on its child nodes and any XPath nodes before
+    it generates its own bytecodes. In that way the correct sequence of JVM
+    instructions is generated. Each one of the child nodes is responsible for
+    creating code that leaves the node's output value (if any) on the stack.
+    The typical procedure for the parent node is to create JVM code that
+    consumes these values off the stack and then leave its own output on the
+    stack (for its parent).</p>
+
+    <p>The tree-structure of the stylesheet is in this way closely tied with
+    the stack-based JVM. The design does not offer any obvious way of extending
+    the compiler to output code for other non-stack-based VMs or processors.</p>
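+    <p>The children-first ordering of <code>translate()</code> can be
+    illustrated with a toy example. The sketch below is an assumption-laden
+    simplification: the <code>Node</code>, <code>Literal</code> and
+    <code>Add</code> types are hypothetical, and the emitted strings are
+    made-up mnemonics for a generic stack machine, not real JVM bytecodes.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: post-order code generation for a stack machine. */
class TranslateSketch {

    interface Node {
        /** Appends code that leaves this node's value on the stack. */
        void translate(List<String> code);
    }

    static final class Literal implements Node {
        final int value;

        Literal(int value) { this.value = value; }

        public void translate(List<String> code) {
            code.add("push " + value);
        }
    }

    static final class Add implements Node {
        final Node left;
        final Node right;

        Add(Node left, Node right) {
            this.left = left;
            this.right = right;
        }

        public void translate(List<String> code) {
            // Children first: each leaves one value on the stack
            left.translate(code);
            right.translate(code);
            // Then the parent consumes both values and leaves its own result
            code.add("add");
        }
    }

    static List<String> compile(Node root) {
        List<String> code = new ArrayList<>();
        root.translate(code);
        return code;
    }

    public static void main(String[] args) {
        Node tree = new Add(new Literal(1), new Add(new Literal(2), new Literal(3)));
        System.out.println(compile(tree));
    }
}
```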
+    
+
+  
+
+<p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+</div>
+<div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
+</div>
+</body>
+</html>

Propchange: xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_compiler.html
------------------------------------------------------------------------------
    svn:eol-style = native

Added: xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_dom.html
URL: http://svn.apache.org/viewvc/xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_dom.html?rev=1595253&view=auto
==============================================================================
--- xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_dom.html (added)
+++ xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_dom.html Fri May 16 16:11:33 2014
@@ -0,0 +1,748 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html>
+<head>
+<title>ASF: XSLTC Internal DOM</title>
+<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
+<meta http-equiv="Content-Style-Type" content="text/css" />
+<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
+</head>
+<!--
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the  "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ -->
+<body>
+<div id="title">
+<table class="HdrTitle">
+<tbody>
+<tr>
+<th rowspan="2">
+<a href="../index.html">
+<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
+</a>
+</th>
+<th text-align="center" width="75%">
+<a href="index.html">XSLTC Design</a>
+</th>
+</tr>
+<tr>
+<td valign="middle">XSLTC Internal DOM</td>
+</tr>
+</tbody>
+</table>
+<table class="HdrButtons" align="center" border="1">
+<tbody>
+<tr>
+<td>
+<a href="http://www.apache.org">Apache Foundation</a>
+</td>
+<td>
+<a href="http://xalan.apache.org">Xalan Project</a>
+</td>
+<td>
+<a href="http://xerces.apache.org">Xerces Project</a>
+</td>
+<td>
+<a href="http://www.w3.org/TR">Web Consortium</a>
+</td>
+<td>
+<a href="http://www.oasis-open.org/standards">Oasis Open</a>
+</td>
+</tr>
+</tbody>
+</table>
+</div>
+<div id="navLeft">
+<ul>
+<li>
+<a href="index.html">Overview</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_compiler.html">Compiler design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_whitespace_design.html">Whitespace</a>
+</li>
+<li>
+<a href="xsl_sort_design.html">xsl:sort</a>
+</li>
+<li>
+<a href="xsl_key_design.html">Keys</a>
+</li>
+<li>
+<a href="xsl_comment_design.html">Comment design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_lang_design.html">lang()</a>
+</li>
+<li>
+<a href="xsl_unparsed_design.html">Unparsed entities</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_if_design.html">If design</a>
+</li>
+<li>
+<a href="xsl_choose_design.html">Choose|When|Otherwise design</a>
+</li>
+<li>
+<a href="xsl_include_design.html">Include|Import design</a>
+</li>
+<li>
+<a href="xsl_variable_design.html">Variable|Param design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_runtime.html">Runtime</a>
+</li></ul><hr /><ul>
+<li>Internal DOM<br />
+</li>
+<li>
+<a href="xsltc_namespace.html">Namespaces</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_trax.html">Translet &amp; TrAX</a>
+</li>
+<li>
+<a href="xsltc_predicates.html">XPath Predicates</a>
+</li>
+<li>
+<a href="xsltc_iterators.html">Xsltc Iterators</a>
+</li>
+<li>
+<a href="xsltc_native_api.html">Xsltc Native API</a>
+</li>
+<li>
+<a href="xsltc_trax_api.html">Xsltc TrAX API</a>
+</li>
+<li>
+<a href="xsltc_performance.html">Performance Hints</a>
+</li>
+</ul>
+</div>
+<div id="content">
+<h2>XSLTC Internal DOM</h2>
+
+  <ul>
+    <li>
+<a href="#functionality">General functionality</a>
+</li>
+    <li>
+<a href="#components">Components of the internal DOM</a>
+</li>
+    <li>
+<a href="#structure">Internal structure</a>
+</li>
+    <li>
+<a href="#navigation">Tree navigation</a>
+</li>
+    <li>
+<a href="#namespaces">Namespaces</a>
+</li>
+    <li>
+<a href="#w3c">W3C DOM2 navigation support</a>
+</li>
+    <li>
+<a href="#adapter">The DOM adapter - DOMAdapter</a>
+</li>
+    <li>
+<a href="#multiplexer">The DOM multiplexer - MultiDOM</a>
+</li>
+    <li>
+<a href="#builder">The DOM builder - DOMImpl$DOMBuilder</a>
+</li>
+  </ul>
+
+  <a name="functionality">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>General functionality</h3>
+  <p>The internal DOM gives the translet access to the XML document(s) it has
+  to transform. The interface to the internal DOM is specified in DOM.java.
+  This is the interface that the translet uses to access the DOM.
+  There is also an interface specified for DOM caches: DOMCache.java.</p>
+
+  <a name="components">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Components of the internal DOM</h3>
+
+  <p>The DOM interface is implemented by the first three classes listed below; the fourth implements DOMCache:</p>
+  <ul>
+    <li>
+<b>org.apache.xalan.xsltc.dom.DOMImpl</b>
+<br />
+<br />
+      This is the main DOM class. An instance of this class contains the nodes
+      of a <b>single</b> XML document.<br />
+<br />
+    </li>
+    <li>
+<b>org.apache.xalan.xsltc.dom.MultiDOM</b>
+<br />
+<br />
+      This class is best described as a DOM multiplexer. XSLTC was initially
+      designed to operate on a single XML document, and the initial DOM and
+      the DOM interface were designed and implemented without the
+      <code>document()</code> function in mind. This class will allow a translet
+      to access multiple DOMs through the original DOM interface.<br />
+<br />
+    </li>
+    <li>
+<b>org.apache.xalan.xsltc.dom.DOMAdapter</b>
+<br />
+<br />
+      The DOM adapter is a mediator between a DOMImpl or a MultiDOM object and
+      a single translet. A DOMAdapter object contains mappings and reverse
+      mappings between node types in the DOM(s) and node types in the translet.
+      This mediator is needed to allow several translets access to a single DOM.
+      <br />
+<br />
+    </li>
+    <li>
+<b>org.apache.xalan.xsltc.dom.DocumentCache</b>
+<br />
+<br />
+      A sample DOM cache (implementing DOMCache) that is used with our sample
+      transformation applications.
+    </li>
+  </ul>
+
+  <p>
+<img src="DOMInterface.gif" alt="DOMInterface.gif" />
+</p>
+  <p>
+<b>
+<i>Figure 1: Main components of the internal DOM</i>
+</b>
+</p>
+
+  <p>The figure above shows how several translets can access one or more
+  internal DOMs from a shared pool of cached DOMs. A translet can also access a
+  DOM tree outside of a cache. The Stylesheet class that represents the XSL
+  stylesheet to compile contains a flag that indicates if the translet uses the
+  <code>document()</code> function. The code compiled into the translet will act
+  accordingly and instantiate a MultiDOM object if needed (this code is compiled
+   in the compiler's <code>Stylesheet.compileTransform()</code> method).</p>
+
+  <a name="structure">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Internal Structure</h3>
+  <ul>
+    <li>
+<a href="#node-id">Node identification</a>
+</li>
+    <li>
+<a href="#element-nodes">Element nodes</a>
+</li>
+    <li>
+<a href="#attribute-nodes">Attribute nodes</a>
+</li>    
+    <li>
+<a href="#text-nodes">Text nodes</a>
+</li>
+    <li>
+<a href="#comment-nodes">Comment nodes</a>
+</li>    
+    <li>
+<a href="#pi">Processing instructions</a></li>
+  </ul>
+  <a name="node-id">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Node identification</h4>
+
+  <p>Each node in the DOM is represented by an integer. This integer is an
+  index into a series of arrays that describe the node. Most important is
+  the <code>_type[]</code> array, which holds the (DOM internal) node type. There
+  are some general node types that are described in the DOM.java interface:</p>
+
+<blockquote class="source">
+<pre>
+    public final static int ROOT                   = 0;
+    public final static int TEXT                   = 1;
+    public final static int UNUSED                 = 2;
+    public final static int ELEMENT                = 3;
+    public final static int ATTRIBUTE              = 4;
+    public final static int PROCESSING_INSTRUCTION = 5;
+    public final static int COMMENT                = 6;
+    public final static int NTYPES                 = 7;
+</pre>
+</blockquote>
+
+  <p>Element and attribute nodes will be assigned types based on their expanded
+  QNames. The <code>_type[]</code> array is used for this:</p>
+
+<blockquote class="source">
+<pre>
+    int    type      = _type[node];             // get node type
+</pre>
+</blockquote>
+
+  <p>The node type can be used to look up the element/attribute name in the
+  element/attribute name array <code>_namesArray[]</code>:</p>
+
+<blockquote class="source">
+<pre>
+    String name      = _namesArray[type-NTYPES]; // get node element name
+</pre>
+</blockquote>
+
+  <p>The resulting string contains the full, expanded QName of the element or
+  attribute. Retrieving the namespace URI of an element/attribute is done in a
+  very similar fashion:</p>
+
+<blockquote class="source">
+<pre>
+    int    nstype    = _namespace[type-NTYPES]; // get namespace type
+    String namespace = _nsNamesArray[nstype];   // get node namespace name
+</pre>
+</blockquote>
+  <a name="element-nodes">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Element nodes</h4>
+
+  <p>The contents of an element node (child nodes) can be identified using
+  the <code>_offsetOrChild[]</code> and <code>_nextSibling[]</code> arrays. The
+  <code>_offsetOrChild[]</code> array will give you the first child of an element
+  node:</p>
+
+<blockquote class="source">
+<pre>
+    int    child     = _offsetOrChild[node];    // first child
+    child = _nextSibling[child];                // next child
+</pre>
+</blockquote>
+
+  <p>The last child will have a "<code>_nextSibling[]</code>" of 0 (zero).
+  This value is OK since the root node (the 0 node) will not be a child of
+  any element.</p>
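+  <p>The child traversal can be sketched with two small arrays. This is a
+  minimal illustration, not the DOMImpl code: the class
+  <code>ArrayDomSketch</code> and its sample tree are hypothetical, and it
+  assumes that an element without children has 0 in its first-child slot,
+  which the text above does not state.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of first-child / next-sibling navigation over int arrays. */
class ArrayDomSketch {
    // Sample tree: root(0) -> a(1) -> { b(2), c(3) }
    // Node 0 is the root, so 0 can double as the "no node" marker.
    static final int[] OFFSET_OR_CHILD = { 1, 2, 0, 0 };  // first child of each node
    static final int[] NEXT_SIBLING    = { 0, 0, 3, 0 };  // next sibling of each node

    /** Collects the child node indexes of the given node in document order. */
    static List<Integer> children(int node) {
        List<Integer> result = new ArrayList<>();
        for (int child = OFFSET_OR_CHILD[node]; child != 0; child = NEXT_SIBLING[child]) {
            result.add(child);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(children(1));
    }
}
```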
+
+  <a name="attribute-nodes">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Attribute nodes</h4>
+
+  <p>The first attribute node of an element is found by a lookup in the
+  <code>_lengthOrAttr[]</code> array using the node index:</p>
+
+<blockquote class="source">
+<pre>
+    int    attribute = _lengthOrAttr[node];     // first attribute
+    attribute = _nextSibling[attribute];        // next attribute
+</pre>
+</blockquote>
+
+  <p>The names of attributes are contained in the <code>_namesArray[]</code> just
+  like the names of element nodes. The values of attributes are stored the same
+  way as text nodes:</p>
+
+<blockquote class="source">
+<pre>
+    int    offset    = _offsetOrChild[attribute]; // offset into character array
+    int    length    = _lengthOrAttr[attribute];  // length of attribute value
+    String value     = new String(_text, offset, length);
+</pre>
+</blockquote>
+  <a name="text-nodes">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Text nodes</h4>
+
+  <p>Text nodes are stored identically to attribute values. See the previous
+  section on <a href="#attribute-nodes">attribute nodes</a>.</p>
+  <a name="comment-nodes">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Comment nodes</h4>
+
+  <p>The internal DOM does <b>not</b> currently contain comment nodes. Yes, I
+  am quite aware that the DOM has a type assigned to comment nodes, but comments
+  are still not inserted into the DOM.</p>
+  <a name="pi">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Processing instructions</h4>
+
+  <p>Processing instructions are handled as text nodes. These nodes are stored
+  identically to attribute values. See the previous section on
+  <a href="#attribute-nodes">attribute nodes</a>.</p>
+
+  <a name="navigation">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Tree navigation</h3>
+
+  <p>The DOM implementation contains a series of iterators that implement the
+  XPath axes. All these iterators implement the NodeIterator interface and
+  extend the NodeIteratorBase base class. These iterators do the job of
+  navigating the tree using the <code>_offsetOrChild[]</code>, <code>_nextSibling[]</code>
+  and <code>_parent[]</code> arrays. All iterators that handle XPath axes are
+  implemented as private inner classes of DOMImpl. The translet uses a handful
+  of methods to instantiate these iterators:</p>
+
+<blockquote class="source">
+<pre>
+    public NodeIterator getIterator();
+    public NodeIterator getChildren(final int node);
+    public NodeIterator getTypedChildren(final int type);
+    public NodeIterator getAxisIterator(final int axis);
+    public NodeIterator getTypedAxisIterator(final int axis, final int type);
+    public NodeIterator getNthDescendant(int node, int n);
+    public NodeIterator getNamespaceAxisIterator(final int axis, final int ns);
+    public NodeIterator orderNodes(NodeIterator source, int node);
+</pre>
+</blockquote>
+
+  <p>There are a few iterators in addition to these, such as sorting/ordering
+  iterators and filtering iterators. These iterators are implemented in
+  separate classes and can be instantiated directly by the translet.</p>
+
+  <a name="namespaces">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Namespaces</h3>
+
+  <p>Namespace support was added to the internal DOM at a late stage, and the
+  design and implementation of the DOM bears a few scars because of this. 
+  There is a separate <a href="xsltc_namespace.html">design
+  document</a> that covers namespaces.</p>
+
+  <a name="w3c">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>W3C DOM2 navigation support</h3>
+
+  <p>The DOM has a few methods that give basic W3C-type DOM navigation. These
+  methods are:</p>
+
+<blockquote class="source">
+<pre>
+    public Node makeNode(int index);
+    public Node makeNode(NodeIterator iter);
+    public NodeList makeNodeList(int index);
+    public NodeList makeNodeList(NodeIterator iter);
+</pre>
+</blockquote>
+
+  <p>These methods return instances of inner classes of the DOM that implement
+  the W3C Node and NodeList interfaces.</p>
+
+  <a name="adapter">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>The DOM adapter - DOMAdapter</h3>
+  <ul>
+    <li>
+<a href="#translet-dom">Translet/DOM type mapping</a>
+</li>
+    <li>
+<a href="#whitespace">Whitespace text-node stripping</a>
+</li>
+    <li>
+<a href="#method-mapping">Method mapping</a>
+</li>
+  </ul>
+  <a name="translet-dom">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Translet/DOM type mapping</h4>
+
+  <p>The DOMAdapter class performs the mappings between DOM and translet node
+  types, and vice versa. These mappings are necessary in order for the translet
+  to correctly identify an element/attribute in the DOM and for the DOM to
+  correctly identify the element/attribute type of a typed iterator requested
+  by the translet. Note that the DOMAdapter also maps translet namespace types
+  to DOM namespace types, and vice versa.</p>
+
+  <p>The DOMAdapter class has four global tables that hold the translet/DOM
+  type and namespace-type mappings. If the DOM knows an element as type
+  19, the DOMAdapter will translate this to some other integer using the
+  <code>_mapping[]</code> array:</p>
+
+<blockquote class="source">
+<pre>
+    int transletType = _mapping[domType];
+</pre>
+</blockquote>
+
+  <p>This action will be performed when the translet asks what type a specific
+  node is. The reverse is done when the translet wants an iterator for a specific
+  node type. The DOMAdapter must translate the translet-type to the type used
+  internally in the DOM by looking up the <code>_reverse[]</code> array:</p>
+
+<blockquote class="source">
+<pre>
+    int domType = _reverse[transletType];
+</pre>
+</blockquote>
+
+  <p>There are two additional mapping tables: <code>_NSmapping[]</code> and
+  <code>_NSreverse[]</code> that do the same for namespace types.</p>
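+  <p>How such a pair of tables could be built is sketched below. This is a
+  hypothetical illustration, not the DOMAdapter code: it assumes that both the
+  DOM and the translet number their element/attribute names starting at
+  <code>NTYPES</code>, that the general node types are shared, and that
+  <code>-1</code> marks a name with no counterpart. The direction of the tables
+  follows the prose above (<code>_mapping[]</code> goes from DOM type to
+  translet type, <code>_reverse[]</code> goes back).</p>

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of building DOM/translet type translation tables. */
class TypeMappingSketch {
    static final int NTYPES = 7;     // general node types occupy 0..6
    static final int NO_TYPE = -1;   // assumed marker: name unknown on the other side

    /** Builds table[fromType] = toType for two independently numbered name lists. */
    static int[] buildTable(List<String> fromNames, List<String> toNames) {
        int[] table = new int[NTYPES + fromNames.size()];
        for (int type = 0; type < NTYPES; type++) {
            table[type] = type;      // general types are the same on both sides
        }
        for (int i = 0; i < fromNames.size(); i++) {
            int index = toNames.indexOf(fromNames.get(i));
            table[NTYPES + i] = (index < 0) ? NO_TYPE : NTYPES + index;
        }
        return table;
    }

    public static void main(String[] args) {
        List<String> domNames = Arrays.asList("chapter", "title", "para");
        List<String> transletNames = Arrays.asList("para", "chapter");
        int[] mapping = buildTable(domNames, transletNames);  // DOM type -> translet type
        int[] reverse = buildTable(transletNames, domNames);  // translet type -> DOM type
        System.out.println(Arrays.toString(mapping));
        System.out.println(Arrays.toString(reverse));
    }
}
```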
+  <a name="whitespace">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Whitespace text-node stripping</h4>
+
+  <p>The DOMAdapter class has the additional function of stripping whitespace
+  nodes in the DOM. This functionality had to be put in the DOMAdapter, as
+  different translets will have different preferences for node stripping.</p>
+  <a name="method-mapping">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Method mapping</h4>
+
+  <p>The DOMAdapter class implements the same <code>DOM</code> interface as the
+  DOMImpl class. A DOMAdapter object will look like a DOMImpl tree, but the
+  translet can access it directly without being concerned with type mapping
+  and whitespace stripping. The <code>getTypedChildren()</code> method demonstrates very
+  well how this is done:</p>
+
+<blockquote class="source">
+<pre>
+    public NodeIterator getTypedChildren(int type) {
+        // Get the DOM type for the requested typed iterator
+        final int domType = _reverse[type];
+        // Now get the typed child iterator from the DOMImpl object
+        NodeIterator iterator = _domImpl.getTypedChildren(domType);
+        // Wrap the iterator in a WS stripping iterator if child-nodes are text nodes
+        if ((domType == DOM.TEXT) &amp;&amp; (_filter != null))
+            iterator = _domImpl.strippingIterator(iterator, _mapping, _filter);
+        return iterator;
+    }
+</pre>
+</blockquote>
+
+  <a name="multiplexer">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>The DOM multiplexer - MultiDOM</h3>
+
+  <p>The DOM multiplexer class is only used when the compiled stylesheet uses
+  the <code>document()</code> function. An instance of the MultiDOM class also
+  implements the DOM interface, so that it can be accessed in the same way
+  as a DOMAdapter object.</p>
+
+  <p>A node in the DOM is identified by an integer. The upper 8 bits of this
+  integer are used to identify the DOM to which the node belongs, while the
+  lower 24 bits are used to identify the node within the DOM:</p>
+    <table border="1">
+      <tr>
+        <td class="content" rowspan="1" colspan="1">31-24</td>
+        <td class="content" rowspan="1" colspan="1">23-16</td>
+        <td class="content" rowspan="1" colspan="1">15-8</td>
+        <td class="content" rowspan="1" colspan="1">7-0</td>
+      </tr>
+      <tr>
+        <td class="content" rowspan="1" colspan="1">DOM id</td>
+        <td class="content" rowspan="1" colspan="3">node id</td>
+      </tr>
+    </table>
+
+  <p>The DOM multiplexer has an array of DOMAdapter objects. The topmost 8
+  bits of the identifier are used to find the correct DOM from the array. Then
+  the lower 24 bits are used in calls to methods in the DOMAdapter object:</p>
+
+<blockquote class="source">
+<pre>
+    public int getParent(int node) {
+        return _adapters[node &gt;&gt;&gt; 24].getParent(node &amp; 0x00ffffff) | (node &amp; 0xff000000);
+    }
+</pre>
+</blockquote>
+
+  <p>Note that the node identifier returned by this method has the same upper 8
+  bits as the input node. This is why we <code>OR</code> the result from
+  <code>DOMAdapter.getParent()</code> with the top 8 bits of the input node.</p>
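+  <p>The bit layout can be exercised in isolation with a few helper methods.
+  The class <code>NodeIdSketch</code> and its method names are hypothetical and
+  only illustrate the 8/24-bit split described above; they are not part of
+  MultiDOM.</p>

```java
/** Hypothetical helpers for the 8-bit DOM id / 24-bit node id layout. */
class NodeIdSketch {
    static final int DOM_SHIFT = 24;
    static final int NODE_MASK = 0x00ffffff;
    static final int DOM_MASK  = 0xff000000;

    /** Combines a DOM index (0..255) and a DOM-local node index into one identifier. */
    static int pack(int domIndex, int localNode) {
        return (domIndex << DOM_SHIFT) | (localNode & NODE_MASK);
    }

    /** Extracts the DOM index; the unsigned shift keeps indexes above 127 positive. */
    static int domIndex(int node) {
        return node >>> DOM_SHIFT;
    }

    /** Extracts the node index that is valid inside the selected DOM. */
    static int localNode(int node) {
        return node & NODE_MASK;
    }

    /** Re-attaches the DOM id of the input node to a DOM-local result. */
    static int rebase(int node, int localResult) {
        return localResult | (node & DOM_MASK);
    }

    public static void main(String[] args) {
        int id = pack(2, 42);
        System.out.println(Integer.toHexString(id));
        System.out.println(domIndex(id) + " / " + localNode(id));
    }
}
```

+  <p>The unsigned shift (<code>&gt;&gt;&gt;</code>) matters here: with a signed
+  shift, any DOM index of 128 or more would come back negative.</p>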
+
+  <a name="builder">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>The DOM builder - DOMImpl$DOMBuilder</h3>
+  <ul>
+    <li>
+<a href="#startelement">startElement()</a>
+</li>
+    <li>
+<a href="#endelement">endElement()</a>
+</li>
+    <li>
+<a href="#startprefixmapping">startPrefixMapping()</a>
+</li>
+    <li>
+<a href="#endprefixmapping">endPrefixMapping()</a>
+</li>
+    <li>
+<a href="#characters">characters()</a>
+</li>
+    <li>
+<a href="#startdocument">startDocument()</a>
+</li>
+    <li>
+<a href="#enddocument">endDocument()</a>
+</li>
+  </ul>
+
+  <p>The DOM builder is an inner class of the DOM implementation. The builder
+  implements the SAX2 <code>ContentHandler</code> interface and populates the DOM
+  by receiving SAX2 events from a SAX2 parser (presently Xerces). An instance
+  of the DOM builder class can be retrieved from the <code>DOMImpl.getBuilder()</code>
+  method, and this handler can be set as an XMLReader's content handler:</p>
+
+<blockquote class="source">
+<pre>
+    final SAXParserFactory factory = SAXParserFactory.newInstance();
+    final SAXParser parser = factory.newSAXParser();
+    final XMLReader reader = parser.getXMLReader();
+    final DOMImpl dom = new DOMImpl();
+    reader.setContentHandler(dom.getBuilder());
+</pre>
+</blockquote>
+
+  <p>The DOM builder will start to populate the DOM once the XML parser starts
+  generating SAX2 events:</p>
+  <a name="startelement">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>startElement()</h4>
+
+  <p>This method can be called in one of two ways: either with the expanded
+  QName (the element's separate URI and local name are supplied) or with a
+  normal QName (one String in the format prefix:local-name). The DOM stores
+  elements as expanded QNames, so it needs to know the element's namespace URI.
+  Since the URI is not supplied in the second case, we have to keep track of
+  namespace prefix/URI mappings while we're building the DOM. See 
+  <code>
+<a href="#startprefixmapping">startPrefixMapping()</a>
+</code> below for details on
+  namespace handling.</p>
+
+  <p>The <code>startElement()</code> method inserts the element as a child of the current
+  parent element, creates attribute nodes for all attributes in the supplied
+  "<code>Attributes</code>" attribute list (by a series of calls to
+  <code>makeAttributeNode()</code>), and finally creates the actual element node
+  (by calling <code>internElement()</code>, which inserts a new entry in the
+  <code>_type[]</code> array).</p>
+  <a name="endelement">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>endElement()</h4>
+
+  <p>This method does some cleanup after the <code>startElement()</code> method,
+  such as reverting <code>xml:space</code> settings and linking the element's
+  child nodes.</p>
+  <a name="startprefixmapping">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>startPrefixMapping()</h4>
+
+  <p>This method is called for each namespace declaration in the source
+  document. The parser should call this method before the prefix is referenced
+  in a QName that is passed to the <code>startElement()</code> call. Namespace
+  prefix/uri mappings are stored in a Hashtable structure. Namespace prefixes
+  are used as the keys in the Hashtable, and each key maps to a Stack that
+  contains the various URIs that the prefix maps to. The URI on top of the
+  stack is the URI that the prefix currently maps to.</p>
+
+  
+    <p>
+<img src="namespace_stack.gif" alt="namespace_stack.gif" />
+</p>
+    <p>
+<b>
+<i>Figure 2: Namespace handling in the DOM builder</i>
+</b>
+</p>
+
+
+  <p>Each call to <code>startPrefixMapping()</code> results in a lookup in the
+  Hashtable (using the prefix), and a <code>push()</code> of the URI onto the
+  Stack that the prefix maps to.</p>
+  <a name="endprefixmapping">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>endPrefixMapping()</h4>
+
+  <p>A namespace prefix/uri mapping is closed by locating the Stack for the
+  prefix, and then <code>pop()</code>'ing the topmost URI off this Stack.</p>
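+  <p>The prefix bookkeeping described for <code>startPrefixMapping()</code> and
+  <code>endPrefixMapping()</code> can be sketched roughly as follows. This is an
+  illustration only: the class name <code>PrefixMappingSketch</code> and the
+  helper <code>currentUri()</code> are hypothetical and are not taken from the
+  DOM builder's source.</p>

```java
import java.util.Hashtable;
import java.util.Stack;

// Hypothetical stand-in for the builder's namespace bookkeeping.
class PrefixMappingSketch {
    // prefix -> stack of URIs; the top of each stack is the current mapping
    private final Hashtable<String, Stack<String>> mappings = new Hashtable<>();

    void startPrefixMapping(String prefix, String uri) {
        Stack<String> stack = mappings.get(prefix);
        if (stack == null) {               // first declaration of this prefix
            stack = new Stack<>();
            mappings.put(prefix, stack);
        }
        stack.push(uri);
    }

    void endPrefixMapping(String prefix) {
        Stack<String> stack = mappings.get(prefix);
        if (stack != null && !stack.isEmpty()) {
            stack.pop();                   // the enclosing mapping becomes current again
        }
    }

    // Returns the URI the prefix currently maps to, or null if it is unmapped.
    String currentUri(String prefix) {
        Stack<String> stack = mappings.get(prefix);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    public static void main(String[] args) {
        PrefixMappingSketch ns = new PrefixMappingSketch();
        ns.startPrefixMapping("a", "urn:outer");
        ns.startPrefixMapping("a", "urn:inner");   // redeclared on a nested element
        System.out.println(ns.currentUri("a"));
        ns.endPrefixMapping("a");
        System.out.println(ns.currentUri("a"));
    }
}
```

+  <p>Keeping one stack per prefix means that a redeclaration on a nested
+  element shadows the outer mapping only until the matching
+  <code>endPrefixMapping()</code> call.</p>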
+  <a name="characters">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>characters()</h4>
+
+  <p>Text nodes are stored as simple character sequences in the character array
+  <code>_text[]</code>. The start and length of a node's text can be determined by
+  using the node index to look up <code>_offsetOrChild[]</code> and
+  <code>_lengthOrAttribute[]</code>.</p>
+
+  <p>We want to re-use character sequences if two or more text nodes have
+  identical content. This can be achieved by having two different text node
+  indexes map to the same character sequence. The <code>maybeReuseText()</code>
+  method is always called before a new character string is stored in the
+  <code>_text[]</code> array. This method will locate the offset of an existing
+  instance of a character sequence.</p>
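+  <p>One possible way to share character sequences between text nodes is
+  sketched here. It is not the builder's actual code: the class
+  <code>TextReuseSketch</code>, its methods, and the hash-map lookup are all
+  assumptions made for illustration, and the real <code>maybeReuseText()</code>
+  may locate existing text differently.</p>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the DOM builder's actual code.
class TextReuseSketch {
    private final StringBuilder text = new StringBuilder();     // stands in for _text[]
    private final Map<String, Integer> seen = new HashMap<>();  // content -> offset in text
    private final List<int[]> nodes = new ArrayList<>();        // {offset, length} per text node

    // Stores a text node and returns its node index.
    int addTextNode(String content) {
        Integer offset = seen.get(content);
        if (offset == null) {              // first occurrence: append to the buffer
            offset = text.length();
            text.append(content);
            seen.put(content, offset);
        }
        nodes.add(new int[] { offset, content.length() });
        return nodes.size() - 1;
    }

    // Rebuilds a node's text from its stored offset and length.
    String textOf(int node) {
        int[] entry = nodes.get(node);
        return text.substring(entry[0], entry[0] + entry[1]);
    }

    int offsetOf(int node) { return nodes.get(node)[0]; }

    int storedChars() { return text.length(); }
}
```

+  <p>Two nodes with identical content end up with the same offset, so the
+  shared buffer grows only when previously unseen text arrives.</p>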
+  <a name="startdocument">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>startDocument()</h4>
+
+  <p>This method initialises a bunch of data structures that are used by the
+  builder. It also pushes the default namespace on the namespace stack (so that
+  the "" prefix maps to the <code>null</code> namespace).</p>
+  <a name="enddocument">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>endDocument()</h4>
+
+  <p>This method builds the <code>_namesArray[]</code>, <code>_namespace[]</code>
+  and <code>_nsNamesArray[]</code> structures from temporary data structures used
+  in the DOM builder.</p>
+
+   
+ 
+<p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+</div>
+<div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
+</div>
+</body>
+</html>

Propchange: xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_dom.html
------------------------------------------------------------------------------
    svn:eol-style = native

Added: xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_iterators.html
URL: http://svn.apache.org/viewvc/xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_iterators.html?rev=1595253&view=auto
==============================================================================
--- xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_iterators.html (added)
+++ xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_iterators.html Fri May 16 16:11:33 2014
@@ -0,0 +1,573 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html>
+<head>
+<title>ASF: XSLTC node iterators</title>
+<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
+<meta http-equiv="Content-Style-Type" content="text/css" />
+<link rel="stylesheet" type="text/css" href="resources/apache-xalan.css" />
+</head>
+<!--
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the  "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ -->
+<body>
+<div id="title">
+<table class="HdrTitle">
+<tbody>
+<tr>
+<th rowspan="2">
+<a href="../index.html">
+<img alt="Trademark Logo" src="resources/XalanJ-Logo-tm.png" width="190" height="90" />
+</a>
+</th>
+<th text-align="center" width="75%">
+<a href="index.html">XSLTC Design</a>
+</th>
+</tr>
+<tr>
+<td valign="middle">XSLTC node iterators</td>
+</tr>
+</tbody>
+</table>
+<table class="HdrButtons" align="center" border="1">
+<tbody>
+<tr>
+<td>
+<a href="http://www.apache.org">Apache Foundation</a>
+</td>
+<td>
+<a href="http://xalan.apache.org">Xalan Project</a>
+</td>
+<td>
+<a href="http://xerces.apache.org">Xerces Project</a>
+</td>
+<td>
+<a href="http://www.w3.org/TR">Web Consortium</a>
+</td>
+<td>
+<a href="http://www.oasis-open.org/standards">Oasis Open</a>
+</td>
+</tr>
+</tbody>
+</table>
+</div>
+<div id="navLeft">
+<ul>
+<li>
+<a href="index.html">Overview</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_compiler.html">Compiler design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_whitespace_design.html">Whitespace</a>
+</li>
+<li>
+<a href="xsl_sort_design.html">xsl:sort</a>
+</li>
+<li>
+<a href="xsl_key_design.html">Keys</a>
+</li>
+<li>
+<a href="xsl_comment_design.html">Comment design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_lang_design.html">lang()</a>
+</li>
+<li>
+<a href="xsl_unparsed_design.html">Unparsed entities</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsl_if_design.html">If design</a>
+</li>
+<li>
+<a href="xsl_choose_design.html">Choose|When|Otherwise design</a>
+</li>
+<li>
+<a href="xsl_include_design.html">Include|Import design</a>
+</li>
+<li>
+<a href="xsl_variable_design.html">Variable|Param design</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_runtime.html">Runtime</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_dom.html">Internal DOM</a>
+</li>
+<li>
+<a href="xsltc_namespace.html">Namespaces</a>
+</li></ul><hr /><ul>
+<li>
+<a href="xsltc_trax.html">Translet &amp; TrAX</a>
+</li>
+<li>
+<a href="xsltc_predicates.html">XPath Predicates</a>
+</li>
+<li>Xsltc Iterators<br />
+</li>
+<li>
+<a href="xsltc_native_api.html">Xsltc Native API</a>
+</li>
+<li>
+<a href="xsltc_trax_api.html">Xsltc TrAX API</a>
+</li>
+<li>
+<a href="xsltc_performance.html">Performance Hints</a>
+</li>
+</ul>
+</div>
+<div id="content">
+<h2>XSLTC node iterators</h2>
+
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Contents</h3>
+
+  <p>This document describes the function of XSLTC's node iterators. It also
+  describes the <code>NodeIterator</code> interface and covers some
+  implementations of this interface in detail:</p>
+
+  <ul>
+    <li>
+<a href="#purpose">Node iterator function</a>
+</li>
+    <li>
+<a href="#interface">NodeIterator interface</a>
+</li>
+    <li>
+<a href="#baseclass">Node iterator base class</a>
+</li>    
+    <li>
+<a href="#details">Implementation details</a>
+</li>    
+  </ul>
+
+  
+
+   
+
+  <a name="purpose">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Node Iterator Function</h3>
+
+    <p>Node iterators have several functions in XSLTC. The most obvious is
+    acting as a placeholder for node-sets. Node iterators also act as a link
+    between the translet and the DOM(s), they can act as filters (implementing
+    predicates), they contain the functionality necessary to cover all XPath
+    axes and they even serve as a front-end to XSLTC's node-indexing mechanism
+    (for the <code>id()</code> and <code>key()</code> functions).</p>
+  
+
+   
+
+  <a name="interface">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Node Iterator Interface</h3>
+
+    <p>The node iterator interface is defined in
+    <code>org.apache.xalan.xsltc.NodeIterator</code>.</p>
+
+    <p>The most basic operations in the <code>NodeIterator</code> interface are
+    for setting the iterator's start-node. The "start-node" is
+    an index into the DOM. This index, and the axis of the iterator, determine
+    the node-set that the iterator contains. The axis is programmed into the
+    various node iterator implementations, while the start-node can be set by
+    calling:</p>
+<blockquote class="source">
+<pre>
+    public NodeIterator setStartNode(int node);</pre>
+</blockquote>
+
+    <p>Once the start node is set the node-set can be traversed by a sequence of
+    calls to:</p>
+<blockquote class="source">
+<pre>
+    public int next();</pre>
+</blockquote>
+
+    <p>This method will return the constant <code>NodeIterator.END</code> when
+    the whole node-set has been returned. The iterator can be reset to the start
+    of the node-set by calling:</p>
+<blockquote class="source">
+<pre>
+    public NodeIterator reset();</pre>
+</blockquote>
+
+    <p>Two additional methods are provided to set the position within the
+    node-set. The first method below will mark the current node in the
+    node-set, while the second will (at any point) set the iterator's position
+    back to that node.</p>
+<blockquote class="source">
+<pre>
+    public void setMark();
+    public void gotoMark();</pre>
+</blockquote>
+
+    <p>Every node iterator implements two methods that make up the
+    functionality behind XPath's <code>position()</code> and
+    <code>last()</code> functions.</p>
+<blockquote class="source">
+<pre>
+    public int getPosition();
+    public int getLast();</pre>
+</blockquote>
+
+    <p>The <code>getLast()</code> function returns the number of nodes in the
+    set, while the <code>getPosition()</code> returns the current position
+    within the node-set. The value returned by <code>getPosition()</code> for
+    the first node in the set is always 1 (one), and the value returned for the
+    last node in the set is always the same value as is returned by
+    <code>getLast()</code>.</p>
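+    <p>The traversal contract described so far can be illustrated with a
+    minimal array-backed iterator. This is a sketch under stated assumptions:
+    <code>ArrayIteratorSketch</code> is a hypothetical class that only mirrors
+    the method names above, it has no start-node, and its <code>END</code>
+    value is a placeholder that need not equal
+    <code>NodeIterator.END</code>.</p>

```java
// Hypothetical array-backed iterator; it mirrors the method names above only.
class ArrayIteratorSketch {
    static final int END = -1;     // placeholder; not necessarily NodeIterator.END
    private final int[] nodes;     // the node-set, in iteration order
    private int cursor = 0;        // number of nodes returned so far
    private int mark = 0;

    ArrayIteratorSketch(int... nodes) { this.nodes = nodes; }

    int next() { return cursor < nodes.length ? nodes[cursor++] : END; }

    ArrayIteratorSketch reset() { cursor = 0; return this; }

    void setMark() { mark = cursor; }

    void gotoMark() { cursor = mark; }

    int getPosition() { return cursor; }    // 1 once the first node has been returned

    int getLast() { return nodes.length; }

    public static void main(String[] args) {
        ArrayIteratorSketch it = new ArrayIteratorSketch(10, 20, 30);
        // The usual traversal loop: call next() until END comes back.
        for (int node = it.next(); node != END; node = it.next()) {
            System.out.println(node + " at position " + it.getPosition()
                    + " of " + it.getLast());
        }
    }
}
```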
+
+    <p>All node iterators that implement an XPath axis will return the node-set
+    in the natural order of the axis. For example, the iterator implementing the
+    ancestor axis will return nodes in reverse document order (bottom to
+    top), while the iterator implementing the descendant axis will return
+    nodes in document order. The node iterator interface has a method that can
+    be used to determine if an iterator returns nodes in reverse document order:
+    </p>
+<blockquote class="source">
+<pre>
+    public boolean isReverse();</pre>
+</blockquote>
+
+    <p>Two methods are provided for when node iterators are encapsulated inside
+    a variable or parameter. To understand the purpose behind these two methods
+    we should have a look at a sample XML document and stylesheet first:</p>
+    <blockquote class="source">
+<pre>
+    &lt;?xml version="1.0"?&gt;
+    &lt;foo&gt;
+        &lt;bar&gt;
+            &lt;baz&gt;A&lt;/baz&gt;
+            &lt;baz&gt;B&lt;/baz&gt;
+        &lt;/bar&gt;
+        &lt;bar&gt;
+            &lt;baz&gt;C&lt;/baz&gt;
+            &lt;baz&gt;D&lt;/baz&gt;
+        &lt;/bar&gt;
+    &lt;/foo&gt;
+
+    &lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;
+
+        &lt;xsl:template match="foo"&gt;
+            &lt;xsl:variable name="my-nodes" select="//foo/bar/baz"/&gt;
+            &lt;xsl:for-each select="bar"&gt;
+                &lt;xsl:for-each select="baz"&gt;
+                    &lt;xsl:value-of select="."/&gt;
+                &lt;/xsl:for-each&gt;
+                &lt;xsl:for-each select="$my-nodes"&gt;
+                    &lt;xsl:value-of select="."/&gt;
+                &lt;/xsl:for-each&gt;
+            &lt;/xsl:for-each&gt;
+        &lt;/xsl:template&gt;
+
+    &lt;/xsl:stylesheet&gt;</pre>
+</blockquote>
+
+    <p>Now, there are three iterators at work here. The first iterator is the
+    one that is wrapped inside the variable <code>my-nodes</code> - this
+    iterator contains  all  <code>&lt;baz/&gt;</code> elements in the
+    document. The second iterator contains all <code>&lt;bar&gt;</code>
+    elements under the current element (this is the iterator used by the
+    outer <code>for-each</code> loop). The third and last iterator is the one
+    used by the first of the inner <code>for-each</code> loops. When the outer
+    loop is run the first time, this third iterator will be initialized to
+    contain the first two <code>&lt;baz&gt;</code> elements under the context
+    node (the first <code>&lt;bar&gt;</code> element). Iterators are by default
+    restarted from the current node when used inside a <code>for-each</code>
+    loop like this. But what about the iterator inside the variable
+    <code>my-nodes</code>? The variable should keep its assigned value, no
+    matter what the context node is. In order to prevent the iterator from being
+    reset, we must use a mechanism to block calls to the
+    <code>setStartNode()</code> method. This is done in three steps:</p>
+
+    <ul>
+      <li>The iterator is created and initialized when the variable gets
+      assigned its value (node-set).</li>
+      <li>When the variable is read, the iterator is copied (cloned). The
+      original iterator inside the variable is never used directly. This is
+      to make sure that the iterator inside the variable is always in its
+      original state when read.</li>
+      <li>The iterator clone is marked as not restartable to prevent it from
+      being restarted when used to iterate the <code>&lt;xsl:for-each&gt;</code>
+      element loop.</li>
+    </ul>
+
+    <p>These are the two methods used for the three steps above:</p>
+<blockquote class="source">
+<pre>
+    public NodeIterator cloneIterator();
+    public void setRestartable(boolean isRestartable);</pre>
+</blockquote>
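+    <p>The effect of cloning an iterator and marking the clone as not
+    restartable can be sketched as below. The class
+    <code>RestartableSketch</code>, its child-table representation of the
+    tree, and the use of <code>-1</code> as an end marker are assumptions made
+    for this illustration; only the method names come from the interface.</p>

```java
// Hypothetical sketch of a variable-held iterator; names mirror the text only.
class RestartableSketch implements Cloneable {
    private final int[][] children;     // children[n] = child indexes of node n
    private int startNode;
    private int cursor;
    private boolean restartable = true;

    RestartableSketch(int[][] children) { this.children = children; }

    RestartableSketch setStartNode(int node) {
        if (restartable) {              // silently ignored once the iterator is locked
            startNode = node;
            cursor = 0;
        }
        return this;
    }

    int next() {
        int[] kids = children[startNode];
        return cursor < kids.length ? kids[cursor++] : -1;   // -1 marks the end here
    }

    void setRestartable(boolean isRestartable) { restartable = isRestartable; }

    RestartableSketch cloneIterator() {
        try {
            RestartableSketch copy = (RestartableSketch) super.clone();
            copy.cursor = 0;            // the copy starts at the beginning of the node-set
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        int[][] tree = { {1, 2}, {3}, {4}, {}, {} };
        RestartableSketch variable = new RestartableSketch(tree).setStartNode(0);
        RestartableSketch copy = variable.cloneIterator();
        copy.setRestartable(false);
        copy.setStartNode(1);           // a for-each loop tries to restart: no effect
        System.out.println(copy.next());
    }
}
```

+    <p>Because the clone ignores <code>setStartNode()</code>, it keeps
+    returning the node-set that was assigned to the variable, whatever the
+    context node of the surrounding loop is.</p>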
+
+    <p>Special care must be taken when implementing these methods in some
+    iterators. The <code>StepIterator</code> class is the best example of this.
+    This iterator wraps two other iterators; one of which is used to generate
+    start-nodes for the other - so one of the encapsulated node iterators must
+    always remain restartable - even when used inside variables. The
+    <code>StepIterator</code> class is described in detail later in this
+    document.</p>
+
+  
+
+
+   
+
+  <a name="baseclass">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Node Iterator Base Class</h3>
+
+    <p>A node iterator base class is provided to contain some common
+    functionality. The base class implements the node iterator interface, and
+    has a few additional methods:</p>
+<blockquote class="source">
+<pre>
+    public NodeIterator includeSelf();
+    protected final int returnNode(final int node);
+    protected final NodeIterator resetPosition();</pre>
+</blockquote>
+
+    <p>The <code>includeSelf()</code> is used with certain axis iterators that
+    implement both the <code>ancestor</code> and <code>ancestor-or-self</code>
+    axis and similar. One common implementation is used for these axes and
+    this method is used to signal that the  "self"  node should
+    also be included in the node-set.</p>
+
+    <p>The <code>returnNode()</code> method is called by the implementation of
+    the <code>next()</code> method. <code>returnNode()</code> increments an
+    internal node counter/cursor that keeps track of the current position within
+    the node set. This counter/cursor is then used by the 
+    <code>getPosition()</code> implementation to return the current position.
+    The node cursor can be reset by calling <code>resetPosition()</code>. This
+    method is normally called by an iterator's <code>reset()</code> method.</p>
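+    <p>A rough sketch of how <code>returnNode()</code> and
+    <code>resetPosition()</code> might cooperate with a subclass is given
+    below. <code>IteratorBaseSketch</code> and
+    <code>RangeIteratorSketch</code> are hypothetical classes, and the
+    <code>END</code> value is a placeholder; the real base class may differ in
+    detail.</p>

```java
// Hypothetical base class; END and the subclass below are placeholders.
abstract class IteratorBaseSketch {
    static final int END = -1;
    private int position = 0;      // how many nodes have been handed out

    abstract int next();

    // Called by next() implementations for every node they return.
    protected final int returnNode(final int node) {
        position++;
        return node;
    }

    protected final IteratorBaseSketch resetPosition() {
        position = 0;
        return this;
    }

    int getPosition() { return position; }
}

// Iterates over the node indexes first..last (inclusive).
class RangeIteratorSketch extends IteratorBaseSketch {
    private final int first;
    private final int last;
    private int current;

    RangeIteratorSketch(int first, int last) {
        this.first = first;
        this.last = last;
        this.current = first;
    }

    @Override
    int next() {
        // END is returned directly so that it does not advance the position.
        return current <= last ? returnNode(current++) : END;
    }

    RangeIteratorSketch reset() {
        current = first;
        resetPosition();
        return this;
    }
}
```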
+
+  
+
+   
+
+  <a name="details">‌</a>
+  <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h3>Node Iterator Implementation Details</h3>
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Axis iterators</h4>
+
+    <p>All axis iterators are implemented as inner classes of the internal
+    DOM implementation <code>org.apache.xalan.xsltc.dom.DOMImpl</code>. In this
+    way all axis iterator classes have direct access to the internal node
+    type- and navigation arrays of the DOM:</p>
+<blockquote class="source">
+<pre>
+    private short[]   _type;          // Node types
+    private short[]   _namespace;     // Namespace URI types
+    private short[]   _prefix;        // Namespace prefix types
+
+    private int[]     _parent;        // Index of a node's parent
+    private int[]     _nextSibling;   // Index of a node's next sibling node
+    private int[]     _offsetOrChild; // Index of an element's first child node
+    private int[]     _lengthOrAttr;  // Index of an element's first attribute node</pre>
+</blockquote>
+
+    <p>The axis iterators can be instantiated by calling either of these two
+    methods of the DOM:</p>
+<blockquote class="source">
+<pre>
+    public NodeIterator getAxisIterator(final int axis);
+    public NodeIterator getTypedAxisIterator(final int axis, final int type);</pre>
+</blockquote>
+
+    
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>StepIterator</h4>
+    
+    <p>The <code>StepIterator</code> is used to  chain  other iterators. A
+    very basic example is this XPath expression:</p>
+<blockquote class="source">
+<pre>
+    &lt;xsl:for-each select="foo/bar"&gt;</pre>
+</blockquote>
+
+    <p>To generate the appropriate node-set for this loop we need three
+    iterators. The compiler will generate code that first creates a typed axis
+    iterator; the axis will be  child  and the type will be that assigned
+    to <code>&lt;foo&gt;</code> elements. Then a second typed axis iterator will
+    be created; this is also a child iterator, but this one with the type
+    assigned to <code>&lt;bar&gt;</code> elements. The third iterator is a
+    step iterator that encapsulates the two axis iterators. The step iterator is
+    then initialized with the context node.</p>
+
+    <p>The step iterator will use the first axis iterator to generate
+    start-nodes for the second axis iterator. In plain English this means that
+    the step iterator will scan all <code>foo</code> elements for any
+    <code>bar</code> child elements. When a <code>StepIterator</code> is
+    initialized with a start-node it passes the start node to the
+    <code>setStartNode()</code> method of its <code>_source</code> iterator (left).
+    It then calls <code>next()</code> on that iterator to get the start-node
+    for the <code>_iterator</code> iterator (right):</p>
+<blockquote class="source">
+<pre>
+    // Set start node for left-hand iterator...
+    _source.setStartNode(_startNode);
+    // ... and get start node for right-hand iterator from left-hand.
+    _iterator.setStartNode(_source.next());</pre>
+</blockquote>
+
+    <p>The step iterator will keep returning nodes from its right iterator until
+    it runs out of nodes. Then a new start-node is retrieved by again calling
+    <code>next()</code> on the source iterator. This is why the
+    right-hand iterator always has to be restartable - even if the step iterator
+    is placed inside a variable or parameter. This becomes even more complicated
+    for step iterators that encapsulate other step iterators. We'll make our
+    previous example a bit more interesting:</p>
+<blockquote class="source">
+<pre>
+    &lt;xsl:for-each select="foo/bar[@name='cat and cage']/baz"&gt;</pre>
+</blockquote>
+
+    <p>This will result in an iterator-tree similar to this:</p>
+
+    <p>
+<img src="iterator_stack.gif" alt="iterator_stack.gif" />
+</p>
+    <p>
+<b>
+<i>Figure 1: Stacked step iterators</i>
+</b>
+</p>
+
+    <p>The  "foo"  iterator is used to supply the second step
+    iterator with start nodes. The second step iterator will pass these start
+    nodes to the  "bar"  iterator, which will be used to get the
+    start nodes for the third step iterator, and so on....</p>
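+    <p>The chaining behaviour of a step iterator can be sketched with a tiny
+    tree held in arrays. Everything here is hypothetical: the
+    <code>Iter</code> interface, the <code>TypedChildren</code> and
+    <code>Step</code> classes, and the <code>END</code> value are stand-ins
+    chosen for the example, and element types are compared by name rather than
+    by the numeric types XSLTC assigns.</p>

```java
// Hypothetical sketch of iterator chaining; not the XSLTC classes themselves.
class StepSketch {
    static final int END = -1;                  // placeholder end marker

    interface Iter {
        Iter setStartNode(int node);
        int next();
    }

    // Returns the children of the start node that carry a given element name.
    static class TypedChildren implements Iter {
        private final int[][] children;         // children[n] = child indexes of node n
        private final String[] names;           // names[n] = element name of node n
        private final String wanted;
        private int start;
        private int cursor;

        TypedChildren(int[][] children, String[] names, String wanted) {
            this.children = children;
            this.names = names;
            this.wanted = wanted;
        }

        public Iter setStartNode(int node) { start = node; cursor = 0; return this; }

        public int next() {
            int[] kids = children[start];
            while (cursor < kids.length) {
                int kid = kids[cursor++];
                if (names[kid].equals(wanted)) return kid;
            }
            return END;
        }
    }

    // Feeds each node from the left-hand iterator to the right-hand one.
    static class Step implements Iter {
        private final Iter source;              // left: produces start nodes
        private final Iter iterator;            // right: restarted for each start node
        private boolean active;

        Step(Iter source, Iter iterator) { this.source = source; this.iterator = iterator; }

        public Iter setStartNode(int node) {
            source.setStartNode(node);
            int first = source.next();
            active = first != END;
            if (active) iterator.setStartNode(first);
            return this;
        }

        public int next() {
            while (active) {
                int node = iterator.next();
                if (node != END) return node;
                int start = source.next();      // right side ran dry: fetch a new start node
                if (start == END) active = false;
                else iterator.setStartNode(start);
            }
            return END;
        }
    }
}
```

+    <p>Since <code>Step</code> implements the same interface as its operands, a
+    step iterator can itself serve as the left-hand side of another step
+    iterator, which is how longer paths can be stacked.</p>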
+
+    
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>Iterators for Filtering/Predicates</h4>
+
+    <p>The <code>org.apache.xalan.xsltc.dom</code> package contains a few
+    iterators that are used to implement predicates and filters. Such iterators
+    are normally placed on top of another iterator, and return only those nodes
+    that match a specific node value, position, etc.
+    These iterators include:</p>
+
+    <ul>
+      <li>NthIterator</li>
+      <li>NodeValueIterator</li>
+      <li>FilteredStepIterator</li>
+      <li>CurrentNodeListIterator</li>
+    </ul>
+
+    <p>The last one is the most interesting. This iterator is used to implement
+    chained predicates, such as:</p>
+<blockquote class="source">
+<pre>
+    &lt;xsl:value-of select="foo[@blob='boo'][2]"/&gt;</pre>
+</blockquote>
+
+    <p>The first predicate reduces the node set from containing all
+    <code>&lt;foo&gt;</code> elements, to containing only those elements that
+    have a  "blob"  attribute with the value 'boo'. The
+    <code>CurrentNodeListIterator</code> is used to contain this reduced
+    node-set. The iterator is constructed by passing it a source iterator (in
+    this case an iterator that contains all <code>&lt;foo&gt;</code> elements)
+    and a filter that implements the predicate (<code>@blob = 'boo'</code>).</p>
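+    <p>Stacking a filter on top of a source iterator, and a positional
+    predicate on top of the filter, could look roughly like this. The sketch
+    is deliberately simplified and all names in it
+    (<code>PredicateSketch</code>, <code>Filtered</code>, <code>Nth</code>,
+    <code>secondBoo()</code>) are hypothetical; in particular it streams nodes
+    through the filter and does not attempt to reproduce how
+    <code>CurrentNodeListIterator</code> holds the reduced node-set.</p>

```java
import java.util.function.IntPredicate;

// Hypothetical sketch of predicate chaining; not the XSLTC classes themselves.
class PredicateSketch {
    static final int END = -1;                  // placeholder end marker

    interface Iter { int next(); }

    // Source iterator over a fixed list of node indexes.
    static class ArrayIter implements Iter {
        private final int[] nodes;
        private int cursor;
        ArrayIter(int... nodes) { this.nodes = nodes; }
        public int next() { return cursor < nodes.length ? nodes[cursor++] : END; }
    }

    // Passes on only the source nodes accepted by the filter.
    static class Filtered implements Iter {
        private final Iter source;
        private final IntPredicate filter;
        Filtered(Iter source, IntPredicate filter) {
            this.source = source;
            this.filter = filter;
        }
        public int next() {
            for (int node = source.next(); node != END; node = source.next()) {
                if (filter.test(node)) return node;
            }
            return END;
        }
    }

    // Returns only the node at a 1-based position within the source.
    static class Nth implements Iter {
        private final Iter source;
        private final int position;
        private boolean done;
        Nth(Iter source, int position) {
            this.source = source;
            this.position = position;
        }
        public int next() {
            if (done) return END;
            done = true;
            int node = END;
            for (int i = 0; i < position; i++) {
                node = source.next();
                if (node == END) break;         // fewer nodes than the requested position
            }
            return node;
        }
    }

    // Evaluates foo[@blob='boo'][2]; blob[n] holds node n's "blob" attribute value.
    static int secondBoo(String[] blob, int... fooNodes) {
        Iter filtered = new Filtered(new ArrayIter(fooNodes), n -> "boo".equals(blob[n]));
        return new Nth(filtered, 2).next();
    }
}
```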
+
+    
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>SortingIterator</h4>
+
+    <p>The sorting iterator is one of the main functional components behind the
+    implementation of the <code>&lt;xsl:sort&gt;</code> element. This element,
+    including the sorting iterator, is described in detail in the
+    <code>&lt;xsl:sort&gt;</code>
+    <a href="xsl_sort_design.html">design document</a>.</p>
+
+    
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>SingletonIterator</h4>
+
+    <p>The singleton iterator is a wrapper for a single node. The node passed
+    in to the <code>setStartNode()</code> method is the only node that will be
+    returned by the <code>next()</code> method. The singleton iterator is used
+    mainly for node to node-set type conversions.</p>
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>UnionIterator</h4>
+
+    <p>The union iterator is used to contain unions of node-sets contained in
+    other iterators. Some of the methods in this iterator are unnecessarily
+    complicated. The <code>next()</code> method contains an algorithm for
+    ensuring that the union node-set is returned in document order. We might be
+    better off by simply wrapping the union iterator inside a duplicate filter
+    iterator, but there could be some performance implications. Worth checking.
+    </p>
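+    <p>One simple way to return a union in document order is a merge over the
+    heads of the input node-sets. The sketch below is not the algorithm used by
+    <code>UnionIterator</code>; it assumes, for illustration, that node indexes
+    increase in document order and that every input is already sorted and free
+    of duplicates. <code>UnionSketch</code> and its <code>END</code> value are
+    hypothetical.</p>

```java
// Hypothetical merge-based union; assumes node indexes follow document order.
class UnionSketch {
    static final int END = -1;        // placeholder end marker
    private final int[][] sets;       // each input is sorted and duplicate-free
    private final int[] cursors;      // read position within each input

    UnionSketch(int[]... sets) {
        this.sets = sets;
        this.cursors = new int[sets.length];
    }

    int next() {
        int best = -1;                // index of the input whose head node is smallest
        for (int i = 0; i < sets.length; i++) {
            if (cursors[i] < sets[i].length
                    && (best < 0 || sets[i][cursors[i]] < sets[best][cursors[best]])) {
                best = i;
            }
        }
        if (best < 0) return END;     // every input is exhausted
        int node = sets[best][cursors[best]];
        // Advance every input positioned on this node so duplicates collapse.
        for (int i = 0; i < sets.length; i++) {
            if (cursors[i] < sets[i].length && sets[i][cursors[i]] == node) {
                cursors[i]++;
            }
        }
        return node;
    }
}
```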
+
+    
+
+    <p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+<h4>KeyIndex</h4>
+
+    <p>This is not just a node iterator. An index used for keys and ids will
+    return a set of nodes that are contained within the named index and that
+    share a certain property. The <code>KeyIndex</code> implements the node
+    iterator interface, so that these nodes can be returned and handled just
+    like any other node set. See the
+    <a href="xsl_key_design.html">design document</a> for 
+    <code>&lt;xsl:key&gt;</code>, <code>key()</code> and <code>id()</code>
+    for further details.</p>
+
+    
+
+  
+
+<p align="right" size="2">
+<a href="#content">(top)</a>
+</p>
+</div>
+<div id="footer">Copyright © 1999-2014 The Apache Software Foundation<br />Apache, Xalan, and the Feather logo are trademarks of The Apache Software Foundation<div class="small">Web Page created on - Thu 2014-05-15</div>
+</div>
+</body>
+</html>

Propchange: xalan/java/branches/WebSite/xalan-j/xsltc/xsltc_iterators.html
------------------------------------------------------------------------------
    svn:eol-style = native


