You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@xalan.apache.org by bu...@apache.org on 2001/09/07 01:24:17 UTC

[DO NOT REPLY: Bug 3477] New: RFC: Iterator code review WRT reinitialization

PLEASE DO NOT REPLY TO THIS MESSAGE. TO FURTHER COMMENT
ON THE STATUS OF THIS BUG PLEASE FOLLOW THE LINK BELOW
AND USE THE ON-LINE APPLICATION. REPLYING TO THIS MESSAGE
DOES NOT UPDATE THE DATABASE, AND SO YOUR COMMENT WILL
BE LOST SOMEWHERE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=3477

*** shadow/3477	Thu Sep  6 16:24:17 2001
--- shadow/3477.tmp.24620	Thu Sep  6 16:24:17 2001
***************
*** 0 ****
--- 1,159 ----
+ +============================================================================+
+ | RFC: Iterator code review WRT reinitialization                             |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 3477                        Product: XalanJ2                 |
+ |       Status: NEW                         Version: CurrentCVS              |
+ |   Resolution:                            Platform: PC                      |
+ |     Severity: Enhancement              OS/Version: Windows NT/2K           |
+ |     Priority: Other                     Component: org.apache.xalan.xsltc  |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: xalan-dev@xml.apache.org                                     |
+ |  Reported By: johnh@schemasoft.com                                         |
+ |      CC list: Cc:                                                          |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                              DESCRIPTION                                   |
+ Herein lay the possibly quite-out-to-lunch observations of a
+ possessed soul: forgiveness and corrections are welcome.
+ 
+ I'm first making the assumption that iterators should be
+ reinitializable back to their creation context. Iterators
+ that are 'composite' (made up of an assembly of iterators
+ linked by reference at constructor time) should also be able
+ to have their component iterators reinitializable back to
+ their own creation states.
+ 
+ Bugs related to the reinitialization of iterators appear to
+ manifest themselves in xsltc when variables or parameters
+ containing xpath specified nodesets are used in node contexts
+ other than the one they were created in. Constructs that
+ typically cause problems are xsl:for-each and xsl:template
+ (the latter when invoked via xsl:call-template).
+ Reinitialization problems occur when the compiler generates
+ the codes to 'reset' the iterator prior to it's use, as
+ in the case of for-each.
+ 
+ To permit proper reinitialization the creation node must be
+ tracked and not lost during subsequent reinitialization. In
+ my review, it appeared that about half (15) of the iterators
+ in xsltc have a problem with reinitialization (either they
+ weren't written to support it or they might support it but
+ the compiler won't generate reinitialization code properly).
+ 
+ The abstract class NodeIteratorBase defines the methods
+ setStartNode(int) and reset() which would appear to be designed to
+ facilitate the initialization and reinitialization of an iterator, the
+ former passed a node context as a parameter. In the generated code
+ it appears that both get called to perform an iterator's
+ reinitialization. The parameter passed to setStartNode(int)
+ usually being the current node context (unfortunately for
+ iterators that could be reinitialized properly the compiler
+ always generates a call to this method).
+ 
+ Here-in appears to be the two main re/initialization problems with
+ about half of the current iterators:
+ 1) calling reset() may not be sufficient: they may require a call
+    to setStartNode(int) with the original creation node context (as
+    in the case of KeyIndex or ReverseIterator),
+ 2) calling setStartNode(int) may (depending upon the iterator) cause
+    a 'deep initialization' which could destroy the initial states of
+    any composite iterators (as in the case of StepIterator).
+ 
+ Important to note is the case of an iterator being stored in a
+ variable and that variable being passed as a parameter to a
+ template. In this case the iterator is 'cloned' (a deep copy) prior
+ to being pushed on the stack. Most iterators clear the flag
+ _isRestartable upon being cloned and use this flag to change
+ their setStartNode(int) behaviour (usually, so it does nothing).
+ The net effect is that the iterator can be used once but not
+ reinitialized in the called context. So, clones try to solve
+ the problem - but not quite.
+ 
+ So without any provocation, here is my review of the iterators: 
+ 
+ The following are cases where setStartNode(int) appears to call
+ _source.setStartNode(int), possibly causing a 'deep initialization'.
+ I couldn't find any existing problems with the implementation
+ of reset(). Also, AbsoluteIterator.setStartNode(int) is a bit
+ unique in that rather than returning 'this' it returns the
+ _source iterator - precluding further calls to the original
+ iterator.
+ 
+ DOMImpl.NodeValueIterator
+ AbsoluteIterator
+ CurrentNodeListIterator
+ DupFilterIterator
+ FilteredStepIterator
+ FilterIterator
+ AxisIterator
+ NthIterator
+ SortingIterator
+ 
+ The following is also a case where setStartNode(int) calls
+ _source.setStartNode(int) but reset() calls it as well!
+ 
+ StepIterator
+ 
+ The following is also a case where setStartNode(int) calls
+ _source.setStartNode(int). As well, they both fail to initialize
+ _startNode which may lead to problems when reset() is called.
+ 
+ ReverseIterator
+ DOMImpl.StrippingIterator
+ 
+ In the following, setStartNode fails to initialize _startNode.
+ It may not matter though since this class reimplements reset().
+ 
+ KeyIndex
+ 
+ Here also, setStartNode doesn't initialize _startNode and there
+ appears to be no ill effects since reset() is reimplemented. I
+ do notice however that the constructor for this class calls
+ source.setStartNode(int), possibly changing the state of a parameter!
+ 
+ DOMImpl.OrderedIterator
+ 
+ For this, setStartNode(int) calls _source.setStartNode(int). Reset()
+ is implemented but does no work.
+ 
+ MatchingIterator
+ 
+ Here also, setStartNode(int) calls _heap[i].iterator.setStartNode(int).
+ Since the default implementation of reset() will call setStartNode(int)
+ with _startNode (that this class properly initializes) there may be a
+ big performance hit when resetting this iterator.
+ 
+ UnionIterator
+ 
+ The following are cases where I couldn't find any incompatabilities
+ between setStartNode(int) and reset():
+ 
+ DOMImpl.ChildrenIterator
+ DOMImpl.ParentIterator
+ DOMImpl.TypedChildrenIterator
+ DOMImpl.NamespaceChildrenIterator
+ DOMImpl.NamespaceAttributeIterator
+ DOMImpl.FollowingSiblingIterator
+ DOMImpl.AttributeIterator
+ DOMImpl.TypedAttributeIterator
+ DOMImpl.PrecedingSiblingIterator
+ DOMImpl.PrecedingIterator
+ DOMImpl.FollowingIterator
+ DOMImpl.AncestorIterator
+ DOMImpl.DescendantIterator
+ DOMImpl.NthDescendantIterator
+ SingletonIterator
+ 
+ Where to go from here?
+ - Get some feedback from anyone who reads this.
+ - scope more runtime details
+ -- why does OrderedIterator call setStartNode(int) on a parameter?
+ -- why does AbsoluteIterator return _source from setStartNode(int)?
+ -- look more into clones and their behaviour
+ -- look more into reinitialization of composite iterators
+ -- look more into how to split initialization and reinitialization
+ -- who actually relys upon NodeIteratorBase?
+ - scope more compiletime details
+ -- the current use of startResetIterator(...)
+ -- how to determine when to initialize or reinitialize