You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@xalan.apache.org by John Howard <Jo...@schemasoft.com> on 2001/11/27 04:39:03 UTC

xsltc NodeIterator reset() question

Hi again all,

I think I've found a problem with some of the mechanics of reset(). I'm
particularly wondering about whats done when _iterators (that are part of a
cloned iterator tree) that contain _sources are told to reset(). Any
additional thoughts would be helpful in analyzing the situation.

Consider the following iterator (tree) (compiled from the xpath
dataset/lump[1]/point - the sample path used in
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4905):

         SI1         SI = StepIterator
        /   \        TCI = TypedChildrenIterator
     SI2     TCI3    NI = NthIterator
    /   \            left links = _source
TCI1     NI          right links = _iterator
        /  \
    TCI2    1

What's attracted my attention is how StepIterator performs it's reset():

  public NodeIterator reset() {
    _source.reset();
    if (_includeSelf)
      _iterator.setStartNode(_startNode);
    else
      _iterator.setStartNode(_source.next());
    return resetPosition();
  }

After cloning, when _iterator.setStartNode(int) is called, it is presumed
that _iterator is _isRestartable but it is not ensured that _iterator's
_source is _isRestartable. StepIterator.cloneIterator() ensures it's
_iterator is restartable using the likes of:

  // Special case -> _iterator must be restartable
  if (clone._iterator instanceof NodeIteratorBase) {
    ((NodeIteratorBase)(clone._iterator))._isRestartable = true;
  }

Here's the situation that happends with the xpath/tree described above.
Assume that the tree is a non-clone and is about to be cloned and that the
patch described in #4905 has not been applied.

SI1.cloneIterator() clones it's _source and _iterator. Before SI1 changes
_iterator._isRestartable or calls reset() SI2 clones it's _source and
_iterator. Before SI2 changes _iterator._isRestartable or calls reset(),
NI, TCI1 and TCI2 are cloned (the latter 2 use the default cloneIterator
implementation that clears _isRestartable, the former leaves _isRestartable
unchanged). On the way out of the callstack SI2 changes _iterator to be
restartable (the NI object) and calls reset(). SI2.reset() calls
_source.reset() (_source being object _TCI1) and then calls
_iterator.setStartNode(int) (_iterator being object NI). NI's
setStartNode(int) calls _source.setStartNode(int) (_source being object
TCI2) but since TCI2 is a clone (remember - we're crawling out of the stack
after the deepest clones have been made) it's not restartable and
TCI2.setStartNode(int) does nothing (_currentChild is left unreset). I think
I'll stop the crawl there.

For this example what we get is an iterator that, once it's cloned, can be
neither completely reinitialized nor completely reset (TCI2.setStartNode is
needed for both codepaths and in both cases does nothing). It can however be
used and recloned as long as the context is the same and there are no
intervening resets (patch #4905 prevents a reset call that upsets
StepIterator, whereas the default implementation of
NodeIteratorBase.cloneIterator ensures that TypedChildrenIterator.reset()
gets called to set _currentChild properly). Phew.

The reason the patch doesn't work for the second stylesheet is that count()
uses BasisLibrary.countF(NodeIterator) which performs a reset() on the
iterator (incompletely resetting the iterator).

It may be possible to determine the syntax of any xpaths that would exhibit
problems due to this bug. However, I'm neither familiar with enough
iterators nor how the tree is constructed - perhaps someone in-the-know
would want to check that out.

Badly in need of a giggle at this point I ripped out the whole
_isRestartable stuff from the codepaths used in this example - everything
worked fine (vars and parms appear to be cloned prior to use anyway).

A quick code-search leads me to consider FilteredStepIterator.reset() a
possible problem method as well.

Can anybody tell if I screwed-up my analysis of the situation? If it seems
ok, does a fix jump out at anybody?

I guess I'd like to say it sure would be nice if there were fewer codepaths
in the cloning/initialization/resetting of these iterators - it would make
debugging lots easier. Like maybe more iterator-framework logic built into
NodeIteratorBase or something (like what SyntaxTreeNode does for
parsing/typechecking/code generation).

cheers,

john

Re: xsltc NodeIterator reset() question

Posted by Morten Jorgensen <mo...@Sun.COM>.
John,

Your analysis is 100% correct. I spent a few hours yesterday as well
looking into the StepIterator's cloneIterator() and reset() function.
The problem occurs, as you said yourself, when two or more
StepIterators are used to represent a path like this "A/B/C":

                 StepIterator
                   /     \
          StepIterator  Child('C')
            /     \
     Child('A')  Child('B')

If you look at the cloneIterator() method you'll see that when the
topmost StepIterator is cloned, it marks the other StepIterator as
not restartable, while the Child('C') iterator is kept restartable.
The problem is that the Child('B') iterator needs to be restarted
for each node produced by Child('A'). The conclusion is that in
cases like this, only the leftmost iterator (the one representing
the very first step) should be tagged as not restartable.

You say that you have removed the _isRestartable flag and that this
does not cause you any problems. I do not think this is a desired
approach for us. We have a huge set of regression tests that will
fail if we do that.

Thanks for all your work. I'll try to come up with a patch for the
StepIterator and FilteredStepIterator. The latter seems to be in a
very bad state.

Thanks again,

Morten Jorgensen,
XML Technology Centre,
Sun Microsystems Ireland ltd.
John Howard wrote:
> 
> Hi again all,
> 
> I think I've found a problem with some of the mechanics of reset(). I'm
> particularly wondering about whats done when _iterators (that are part of a
> cloned iterator tree) that contain _sources are told to reset(). Any
> additional thoughts would be helpful in analyzing the situation.
> 
> Consider the following iterator (tree) (compiled from the xpath
> dataset/lump[1]/point - the sample path used in
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=4905):
> 
>          SI1         SI = StepIterator
>         /   \        TCI = TypedChildrenIterator
>      SI2     TCI3    NI = NthIterator
>     /   \            left links = _source
> TCI1     NI          right links = _iterator
>         /  \
>     TCI2    1
> 
> What's attracted my attention is how StepIterator performs it's reset():
> 
>   public NodeIterator reset() {
>     _source.reset();
>     if (_includeSelf)
>       _iterator.setStartNode(_startNode);
>     else
>       _iterator.setStartNode(_source.next());
>     return resetPosition();
>   }
> 
> After cloning, when _iterator.setStartNode(int) is called, it is presumed
> that _iterator is _isRestartable but it is not ensured that _iterator's
> _source is _isRestartable. StepIterator.cloneIterator() ensures it's
> _iterator is restartable using the likes of:
> 
>   // Special case -> _iterator must be restartable
>   if (clone._iterator instanceof NodeIteratorBase) {
>     ((NodeIteratorBase)(clone._iterator))._isRestartable = true;
>   }
> 
> Here's the situation that happends with the xpath/tree described above.
> Assume that the tree is a non-clone and is about to be cloned and that the
> patch described in #4905 has not been applied.
> 
> SI1.cloneIterator() clones it's _source and _iterator. Before SI1 changes
> _iterator._isRestartable or calls reset() SI2 clones it's _source and
> _iterator. Before SI2 changes _iterator._isRestartable or calls reset(),
> NI, TCI1 and TCI2 are cloned (the latter 2 use the default cloneIterator
> implementation that clears _isRestartable, the former leaves _isRestartable
> unchanged). On the way out of the callstack SI2 changes _iterator to be
> restartable (the NI object) and calls reset(). SI2.reset() calls
> _source.reset() (_source being object _TCI1) and then calls
> _iterator.setStartNode(int) (_iterator being object NI). NI's
> setStartNode(int) calls _source.setStartNode(int) (_source being object
> TCI2) but since TCI2 is a clone (remember - we're crawling out of the stack
> after the deepest clones have been made) it's not restartable and
> TCI2.setStartNode(int) does nothing (_currentChild is left unreset). I think
> I'll stop the crawl there.
> 
> For this example what we get is an iterator that, once it's cloned, can be
> neither completely reinitialized nor completely reset (TCI2.setStartNode is
> needed for both codepaths and in both cases does nothing). It can however be
> used and recloned as long as the context is the same and there are no
> intervening resets (patch #4905 prevents a reset call that upsets
> StepIterator, whereas the default implementation of
> NodeIteratorBase.cloneIterator ensures that TypedChildrenIterator.reset()
> gets called to set _currentChild properly). Phew.
> 
> The reason the patch doesn't work for the second stylesheet is that count()
> uses BasisLibrary.countF(NodeIterator) which performs a reset() on the
> iterator (incompletely resetting the iterator).
> 
> It may be possible to determine the syntax of any xpaths that would exhibit
> problems due to this bug. However, I'm neither familiar with enough
> iterators nor how the tree is constructed - perhaps someone in-the-know
> would want to check that out.
> 
> Badly in need of a giggle at this point I ripped out the whole
> _isRestartable stuff from the codepaths used in this example - everything
> worked fine (vars and parms appear to be cloned prior to use anyway).
> 
> A quick code-search leads me to consider FilteredStepIterator.reset() a
> possible problem method as well.
> 
> Can anybody tell if I screwed-up my analysis of the situation? If it seems
> ok, does a fix jump out at anybody?
> 
> I guess I'd like to say it sure would be nice if there were fewer codepaths
> in the cloning/initialization/resetting of these iterators - it would make
> debugging lots easier. Like maybe more iterator-framework logic built into
> NodeIteratorBase or something (like what SyntaxTreeNode does for
> parsing/typechecking/code generation).
> 
> cheers,
> 
> john