Posted to c-users@xalan.apache.org by Jo...@it-informatik.de on 2003/06/24 17:52:11 UTC

Performance of XPathEvaluator operating on Xerces DOM

Hi all,

Technicalities first:

I'm talking about Xalan C++ 1.5 using Xerces C++ 2.2.0.
The platform is NT 4.0 SP 5 and the compiler I use is VC 6.0 SP 5.

I'm successfully using Xerces and Xalan for the following scenario:

My application maintains a Xerces DOM which is programmatically
created and at times extended by parsing and/or transforming external
documents and importing them into my main DOM. This is all handled
very well and efficiently by Xerces and Xalan.

Additionally I need to *manipulate* my main DOM programmatically
(mainly via DOMNode::setTextValue()) and extract information
(naturally using DOMNode::getTextValue()).

In order to find the relevant nodes I use XPathEvaluator to obtain the
node sets via XPathEvaluator::selectNodeList() and map the XalanNode(s)
back to DOMNode(s) via XercesDocumentWrapper::mapNode(XalanNode *)
as was discussed in various postings on this list.

All is well until my Xerces DOM grows large, at which point XPath
performance degrades heavily.

I am sure it has to do with my XPath expressions and the peculiar document
structure, which I'm well aware is not optimal. Unfortunately, I'm obliged
to do it that way.

The following XPath expression will illustrate the problem:

"/Model/Business"
"/FLNBenutzerCLASS"
"/FLNUserclassCLASS"
"/FLNWerkstattBerechtigungCLASS[WerkstattBerechtigungCLASS/ZugelasseneWerkstatt = /Model/Intern/AuswahlInfo/AusgewaehlteWerkstatt]"
"/FLNWerkstattCLASS"
"/FLNAnlageCLASS[AnlageCLASS/Anlage = /Model/Intern/AuswahlInfo/AusgewaehlteAnlage]"
"/FLNMDETerminalCLASS"
"/FLNTagesSummenCLASS[position() = /Model/Intern/AuswahlInfo/AusgewaehlteTagesSummen]"
"/FLNZustandsWechselCLASS[ZustandsWechselCLASS/AktuellerZustand = 'PROD']"
"/ZustandsWechselCLASS"
"/ZeitSummeZustandBeiEnde"

It is taken from my code, where it is written as a concatenated string literal for readability.
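
For readers unfamiliar with the idiom: in C++, adjacent string literals are merged
by the compiler into a single literal, so an expression laid out one location step
per line is still passed around as one string. The following is only a minimal
sketch of that idiom. The function name shortenedExpression is hypothetical and the
path is cut down to the first three steps of the expression above; it is not the
actual application code.

```cpp
#include <string>

// Hypothetical helper, for illustration only: returns the first three
// location steps of the expression quoted in the post.
// The three adjacent literals are merged at compile time into one literal,
// so no runtime concatenation takes place.
inline std::string shortenedExpression()
{
    return
        "/Model/Business"
        "/FLNBenutzerCLASS"
        "/FLNUserclassCLASS";
}
```

The returned value is the single string
/Model/Business/FLNBenutzerCLASS/FLNUserclassCLASS, with no separators added
between the pieces, which is why each literal has to begin with its own "/".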

If the underlying document contains up to 20 or 30 of these FLNZustandsWechselCLASS
elements (the third location step read from the back) the node set is obtained rather quickly.
If it's 50 to 100 it takes several seconds, and if it's 200 to 500 or more the operation
can take two minutes or longer.

Interestingly, if I leave out the last two location steps the number of elements
does not seem to matter that much.  It is pretty fast then.

It seems as if it has something to do with diving into the FLNZustandsWechselCLASS element
for evaluating the predicate (which works fast) and then after having the candidate node set
going down again in the final two location steps (which slows the process down).

Evaluating the same expression with the SimpleXPathAPI sample or within the xpath console
of the CookTop 2.500 tool indicated no performance problem at all. In the case of
SimpleXPathAPI it may be due to the fact that it's using Xalan's native DOM instead of
mapping to/from the Xerces DOM.

Is there a remedy for this performance degradation or do you have a list of DOs and DON'Ts
regarding XPath operating on a Xerces DOM?

Any help is highly appreciated and thanks for listening.

Regards
Joerg Seidler

joerg.seidler@it-informatik.de





Re: Performance of XPathEvaluator operating on Xerces DOM

Posted by da...@us.ibm.com.



> It seems as if it has something to do with diving into the FLNZustandsWechselCLASS element
> for evaluating the predicate (which works fast) and then after having the candidate node set
> going down again in the final two location steps (which slows the process down).
>
> Evaluating the same expression with the SimpleXPathAPI sample or within the xpath console
> of the CookTop 2.500 tool indicated no performance problem at all. In the case of
> SimpleXPathAPI it may be due to the fact that it's using Xalan's native DOM instead of
> mapping to/from the Xerces DOM.

I suspect you're building the bridge/wrapper the wrong way.  A couple of
questions:

   1. Which Xerces DOM are you using?  The deprecated one, or the new one?
   2. How do you wrap the Xerces DOM instance?  A snippet of your code
   that builds the wrapper would help diagnose this.

Unfortunately, some performance degradation is inherent in wrapping the
Xerces DOM.

Dave