
Posted to j-users@xalan.apache.org by Turaukar Yur <yu...@hotmail.com> on 2004/01/06 13:04:02 UTC

Why does Xalan not use DOM's getElementsByTag method?

I have a fairly large XML schema in which certain elements may have a great
many different children, e.g.
<A>
   <B1>...</B1>
   <B2>...</B2>
    ...
   <B99>...</B99>
</A>
Running a stylesheet with many XPath expressions of the form
<xsl:value-of select="A/Bnn"/>, where nn is a number, is very inefficient
with Xalan.

To improve performance, I decided to parse the data into my own DOM 
implementation, which supports quick access to a given child element by 
name, i.e. a very fast implementation of getElementsByTagName.
To my surprise, Xalan seems to ignore this method and "prefers" to iterate 
over all child elements of an element whenever it searches for a specific 
child (getFirstChild + getNextSibling).
Am I missing an important point here? My solution really relies on quick 
access to given elements.
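
For illustration, the access pattern described above (getFirstChild followed
by repeated getNextSibling calls) can be sketched against the standard
org.w3c.dom API. This is only a minimal sketch, not Xalan's code: the class
name ChildScan and the helpers parse and firstChildNamed are hypothetical.
Note that DOM's getElementsByTagName returns all matching descendants, not
only direct children, so it is not a drop-in replacement for a child step
such as A/Bnn; whether that is the reason it goes unused is not stated here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ChildScan {
    /** Parses an XML string into a DOM Document (hypothetical helper). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Linear scan over the children of parent: cost grows with the number
     * of siblings, which is what hurts when A has B1..B99.
     */
    static Element firstChildNamed(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null; // no child element with that name
    }

    public static void main(String[] args) {
        Document doc = parse("<A><B1>one</B1><B2>two</B2><B99>last</B99></A>");
        Element a = doc.getDocumentElement();
        // Child-only lookup via sibling iteration.
        System.out.println(firstChildNamed(a, "B99").getTextContent());
        // Descendant lookup via the DOM method discussed in this thread.
        System.out.println(a.getElementsByTagName("B2").getLength());
    }
}
```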

Thanks for feedback


Re: Why does Xalan not use DOM's getElementsByTag method?

Posted by Joseph Kesselman <ke...@us.ibm.com>.

Xalan does not use the DOM directly.  It operates in terms of its own
internal APIs, which are wrapped around the DOM you provide. Hence
optimizations in your DOM may not currently translate into improvements in
Xalan; we may or may not ever call those specific methods.
getElementsByTagName in particular turns out to be something of a pain to
take advantage of when accessing a document through DTM. Conversely, our
wrapper layer may apply optimizations you haven't; for example, we do
(under some circumstances) pre-index the document by tag name and thus
perform our own optimized version of that search.
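
To make the idea of pre-indexing by tag name concrete, here is a minimal
sketch using only the standard org.w3c.dom API. It is not the DTM
implementation and makes no claim about how Xalan builds or uses its index;
the class TagIndex and its methods are hypothetical. It walks the tree once
and records every element under its tag name, so later lookups by name are a
single hash-map access instead of a scan.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TagIndex {
    // Tag name -> elements with that name, in document order.
    private final Map<String, List<Element>> byName = new HashMap<>();

    TagIndex(Element root) {
        index(root);
    }

    /** One pre-order pass over the tree; each element is visited once. */
    private void index(Element e) {
        byName.computeIfAbsent(e.getNodeName(), k -> new ArrayList<>()).add(e);
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                index((Element) n);
            }
        }
    }

    /** Constant-time lookup; returns an empty list for unknown names. */
    List<Element> elementsNamed(String name) {
        return byName.getOrDefault(name, Collections.emptyList());
    }

    /** Hypothetical convenience: parse a string and index its root. */
    static TagIndex fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return new TagIndex(root);
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        TagIndex idx = fromXml("<A><B1>one</B1><B2>two</B2><B2>three</B2></A>");
        for (Element e : idx.elementsNamed("B2")) {
            System.out.println(e.getTextContent());
        }
    }
}
```

The trade-off in a sketch like this is the up-front pass and the extra
memory for the map, which only pays off when the same document is queried by
name many times.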

Nature of the beast, I'm afraid. We have to work against a wide variety of
sources, and that means we don't generally optimize for specific
implementations.

If you really wanted to, you could attempt to write a variant of DOM2DTM
which leveraged your DOM implementation more effectively, and create a
modified DTMManager which recognized your implementation and used that
wrapper rather than our standard one. That's a nontrivial amount of work,
and I don't think the payback would be worth the investment.

In the future, as we move from DTM to XDM as our internal model API, we
hope that writing custom wrappers will become a bit easier and that DOM
support generally will become more efficient... but that's work still in
progress and not yet ready to be published.

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk