Posted to j-users@xerces.apache.org by Ritu Raj Tiwari <ri...@yahoo.com> on 2005/02/28 20:22:13 UTC

Xerces vs Crimson performance

Folks,
I am migrating an application that relies on
validating XML parsing. It used the Crimson parser,
and we are moving to Xerces.

The XML documents I encounter must be validated
against a huge DTD. In many cases the documents are
actually much smaller than the DTD! Crimson had no way
to cache grammars, so the Xerces grammar pool looked
really exciting for performance gains.

However, after enabling grammar caching and comparing
my codebase on JDK 1.4.2 + Crimson against JDK 5 +
Xerces, I see negligible, if any, performance gain. I
am measuring the total CPU time of the Java process as
it runs through a suite of about 400 XML files. There
is a lot going on apart from XML parsing, but Xerces
vs Crimson (and the JDK) are the only major
differences between the codebases.
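
Since total process CPU time includes everything else the
application does, one way to narrow the comparison is to
time only the validating parse. This is a rough sketch
using plain JAXP, not code from the application: the
class name ParseTimer, the tiny inline DTD in SAMPLE, and
the run count are all made up for illustration.

```java
import java.io.StringReader;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ParseTimer {
    // Hypothetical stand-in document; a real test would load the actual files.
    public static final String SAMPLE =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc [\n"
        + "<!ELEMENT doc (item*)>\n"
        + "<!ELEMENT item (#PCDATA)>\n"
        + "]>\n"
        + "<doc><item>a</item><item>b</item></doc>";

    // DefaultHandler ignores validation errors; rethrow them so they are visible.
    static class StrictHandler extends DefaultHandler {
        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    // Returns the time spent in `runs` validating parses of `xml`, in nanoseconds.
    // Uses per-thread CPU time when the JVM supports it, wall-clock time otherwise.
    public static long cpuNanosForParses(String xml, int runs) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // DTD validation
        SAXParser parser = factory.newSAXParser();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpu = threads.isCurrentThreadCpuTimeSupported();

        long start = cpu ? threads.getCurrentThreadCpuTime() : System.nanoTime();
        for (int i = 0; i < runs; i++) {
            parser.parse(new InputSource(new StringReader(xml)), new StrictHandler());
        }
        long end = cpu ? threads.getCurrentThreadCpuTime() : System.nanoTime();
        return end - start;
    }

    public static void main(String[] args) throws Exception {
        long nanos = cpuNanosForParses(SAMPLE, 400);
        System.out.println("400 validating parses took " + nanos + " ns");
    }
}
```

Running the same harness under each JDK/parser combination
would show whether the parser itself differs, independent
of the rest of the suite.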

My questions are:
- Are there any obvious ways of boosting Xerces
performance? In my application, all the documents make
use of the same DTD.
- I am currently on the Xerces version that ships with
JDK 5. Will moving to Xerces 2.6.2 bring any gains?

Thanks.
-Raj

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: Xerces vs Crimson performance

Posted by "Eric J. Schwarzenbach" <Er...@wrycan.com>.
This is a late reply, but FWIW, have you tried turning this feature on or off?
       
    http://apache.org/xml/features/dom/defer-node-expansion

The intention of the feature is to make parsing faster, but in certain
situations / with certain document sizes it actually slows parsing down
considerably. Or at least it used to; I haven't played with it for a
while, so my data may be old, but it's easy enough to try. Note that the
default value seems to depend on which implementation you use, so I would
make sure to try it explicitly set both ways.
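
Setting the feature explicitly both ways might look roughly like the
sketch below. It assumes the JAXP DocumentBuilderFactory in use is
Xerces or Xerces-derived (a parser that does not recognize the feature
will throw ParserConfigurationException); the class name, the sample
document, and the run count are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DeferNodeExpansionDemo {
    static final String DEFER =
        "http://apache.org/xml/features/dom/defer-node-expansion";

    // Parses `xml` into a DOM with the feature explicitly set to `defer`.
    public static Document parse(String xml, boolean defer) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(DEFER, defer); // do not rely on the default
        DocumentBuilder builder = factory.newDocumentBuilder();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return builder.parse(new ByteArrayInputStream(bytes));
    }

    // Wall-clock nanoseconds for `runs` parses with the given setting.
    public static long timeNanos(String xml, boolean defer, int runs) throws Exception {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            parse(xml, defer);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder document; substitute a representative real file.
        String xml = "<doc><item id=\"1\">a</item><item id=\"2\">b</item></doc>";
        System.out.println("deferred:     " + timeNanos(xml, true, 400) + " ns");
        System.out.println("not deferred: " + timeNanos(xml, false, 400) + " ns");
    }
}
```

The resulting DOM should be the same either way; only the timing and
memory behavior are expected to differ.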

Eric

