Posted to general@xerces.apache.org by Mikael Helbo Kjær <mh...@dia.dk> on 2000/02/21 11:31:56 UTC

Results of XMLParser comparison

During research for a big Java application relying heavily on XML (and XSL),
I have compared several XML parsers written in Java. I've tested 5 parsers
for feature richness, XML and XSL standards support, speed, developer
friendliness (like this list) and stability. The tests were very simple, and
I didn't attempt to optimize anything or do any advanced charts or stuff like
that. Here are my results:
Aelfred (a very small SAX parser): discarded for age and lack of features.
Speed test wasn't run.
XP (James Clark's parser): discarded for lack of DOM support. Speed test
wasn't run.

So for the really important stuff:
General:
Xerces-J 1.0.1:
	Fully compliant XML implementation (DOM 1, SAX 1, although we can't
seem to find a selectNodes(XSLPATTERN,...) function <-hint ). Very
feature-rich collection of additional APIs for serialization and so on. XSLT
support through Xalan 0.19 (when is the 1.0 out?). Open source, with the
possibility of our developers co-developing and learning about an XML parser.
Test for memory usage hinted at a very well-behaved DOMParser (especially
compared to Oracle's parser).

Oracle XMLParser v2:
	Fully compliant XML implementation (DOM 1 and SAX 1), fully
integrated XSL processor, not open source. Test for memory usage hinted at a
very, very memory-hungry parser overall. Developer support at a very low
level and unsupportive. No co-development possible.

Java API for XML Parsing early access 1:
	Fully compliant XML implementation (DOM 1 and SAX 1), no XSL
processor (which caused us to drop this parser). Not open source. Test for
memory usage hinted that this was the best-behaved parser memory-wise.
Developer support exists through java.sun.com's tutorials and the JDC. No
co-development possible (we ignore the community forum as we don't think that
model works).

Speed: This was the area on which we fixated most. We need to be
able to parse very small, big and HUGE XML files. All tests were run on
a Windows 2000 Server (which :-) has already crashed mysteriously 5 times),
using JDK 1.2.2 from JavaSoft and code which largely looks like this:
Pseudo-code:

void main ()
{
	Parser parser = new Parser(); // both SAX and DOM
	long before = System.currentTimeMillis();
	parser.parse(url or inputsource);
	long after = System.currentTimeMillis();
	System.out.println( "Test.xml parsed in: " + (after-before) + " ms" );
}

This yielded the following results (all are averages of 10 separate runs of
the application):

HotSpot 1.0.1:
DOM:
115 kb size xmlfile
Xerces 1.0.1: 1282 ms.
JAXP 1.0 ea1: 1553 ms.
Oracle XmlParser v2: 1121 ms.

SAX:
8.975 kb size xmlfile
Xerces 1.0.1: 6158 ms.
JAXP 1.0 ea1: 4366 ms.
Oracle XmlParser v2: 4366 ms.

Classic VM (JDK 1.2.2):
DOM:
2.436 kb size xmlfile
Xerces 1.0.1: 11016 ms.
JAXP 1.0 ea1: 5358 ms.
Oracle XmlParser v2: 4366 ms.

SAX:
115 kb size xmlfile
Xerces 1.0.1: 661 ms.
JAXP 1.0 ea1: 771 ms.
Oracle XmlParser v2: 551 ms.

8.975 kb size xmlfile
Xerces 1.0.1: 4156 ms.
JAXP 1.0 ea1: 6029 ms.
Oracle XmlParser v2: 3936 ms.

Of course my tests weren't very thorough, but I think that this is still
enough to see a trend amongst the parsers. Oracle and Xerces are clearly
neck and neck. While Xerces is better behaved memory-wise and very, very
feature-rich, the Oracle parser is faster, but is also very memory-hungry
and isn't open source. Now we'd rather use the Xerces parser, but if it
doesn't allow the selection of nodes through an XSL pattern, we just can't
commit ourselves to it. My results are therefore still slightly inconclusive.

Mikael Helbo Kjær
Software Developer @ DIA a/s

Re: Results of XMLParser comparison

Posted by Steve Muench <sm...@us.oracle.com>.
| Oracle XMLParser v2:
| Fully compliant XML implementation( DOM 1 and SAX 1), fully
| integrated XSL processor, not open source. Test for Memory
| usage hinted at a very very memory hungry Parser overall.
   :
| the Oracle Parser is faster, but is also very memory hungry
|

Can you please forward your tests to the list so
that we may validate your claims of "memory hungriness" :-)

|
| Developer support very low level and non supportive.
|

You must not have ever visited our Online technical forum
for XML. We have expert developers and technical evangelists
monitoring the XML Forum all day long.

Thanks.

__________________________________________________________
Steve Muench, Lead XML Evangelist / Consulting Product Mgr
Oracle Corp, Business Components for Java Development Team
http://technet.oracle.com/tech/xml

Re: Results of XMLParser comparison

Posted by Andy Heninger <he...@us.ibm.com>.
Looking at these test results, I think that there's a good chance that most
of the time being reported is spent by the Java JIT compilers in compiling
the code, rather than in the XML parse itself. I've seen this happen on timing
tests of Java programs that I've done here.

The way to factor out the JIT time is to run the test code (parse the file)
once first, and then do the timing test for real.  This assumes that what
you want to measure is performance after an application is fully up and
running, as opposed to start-up time.
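
A minimal sketch of that warm-up approach, assuming a current JDK with the
bundled JAXP SAXParserFactory rather than any of the specific parsers tested
above. The class name WarmupTiming, the helper timeParseMillis, and the small
in-memory document standing in for Test.xml are all made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class WarmupTiming {
    // Parse the document once with a do-nothing handler and return the
    // elapsed wall-clock time in milliseconds. Parser creation happens
    // before the clock starts, so only parse() itself is timed.
    static long timeParseMillis(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            InputSource in = new InputSource(new StringReader(xml));
            long before = System.currentTimeMillis();
            parser.parse(in, new DefaultHandler());
            return System.currentTimeMillis() - before;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for Test.xml.
        String xml = "<doc><item>1</item><item>2</item></doc>";
        // First run: pays for class loading and JIT compilation.
        long cold = timeParseMillis(xml);
        // Second run: the figure to report for a running application.
        long warm = timeParseMillis(xml);
        System.out.println("Warm-up parse: " + cold + " ms");
        System.out.println("Timed parse:   " + warm + " ms");
    }
}
```

With a document this small the second figure will often be 0 ms at
millisecond resolution; the point is only the order of operations, with the
untimed (or separately reported) first parse ahead of the measured one.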

Trying to account for garbage collection times in Java benchmarks is another
real challenge. The problem is that, in a short test, the difference
between zero or one GC, or between one and two, can be huge, and can
completely obscure the time spent in actual code. One solution is to run
the test in a loop, with run times of several minutes, and with at least a
hundred or so GCs over the duration of the test. This has two big benefits:
1) one more or fewer GCs will have a negligible impact on the overall
results, and 2) the memory allocation/collection costs are reasonably
factored into the overall results.
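
The loop idea could be sketched roughly as follows, again assuming a current
JDK with the bundled JAXP parser and the java.lang.management API for reading
GC counts. The names LoopTiming, averageParseMillis and gcCount are
hypothetical, and the 2-second duration in main is only a demo value; a real
run would use several minutes, as described above:

```java
import java.io.StringReader;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class LoopTiming {
    // Total number of collections reported by all collectors so far.
    // A collector may report -1 (undefined); that is counted as 0.
    static long gcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Parse the document repeatedly until at least minMillis have elapsed
    // and return the average time per parse in milliseconds.
    static double averageParseMillis(String xml, long minMillis) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DefaultHandler handler = new DefaultHandler();
            // Untimed warm-up parse.
            parser.parse(new InputSource(new StringReader(xml)), handler);
            long runs = 0;
            long start = System.currentTimeMillis();
            long elapsed;
            do {
                parser.parse(new InputSource(new StringReader(xml)), handler);
                runs++;
                elapsed = System.currentTimeMillis() - start;
            } while (elapsed < minMillis);
            return (double) elapsed / runs;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for Test.xml.
        String xml = "<doc><item>1</item><item>2</item></doc>";
        long gcBefore = gcCount();
        double avg = averageParseMillis(xml, 2000); // demo value only
        long gcs = gcCount() - gcBefore;
        System.out.println("Average per parse: " + avg + " ms");
        System.out.println("GCs during test:   " + gcs);
    }
}
```

If the reported GC count is far below a hundred, the run is probably too
short for one extra collection to be negligible, and the duration should be
increased.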

I'm glad that you posted your numbers, and, if you have time to take this
effort any further, I will look forward to those results too.

   Regards,

     -- Andy

