Posted to j-dev@xerces.apache.org by Andy Clark <an...@apache.org> on 2001/01/11 06:24:33 UTC

[X1] Validation Performance

I just applied a patch submitted by Akihiko TOZAWA that dramatically
improves the compilation time of content models based on DFAs. Since
we don't have grammar caching yet (coming in Xerces2), applications
such as servers that parse and validate many documents with semi-
complex to complex content models should see a marked improvement.

Please note that this does not improve validation time -- just
the compilation of a content model into a DFAContentModel object.
But if you validate a lot of small documents, your time may be
dominated by the compilation of the content models, so this patch
should help.
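
To see how much of a run goes to per-document setup rather than to
the documents themselves, one rough approach is to time many
validating parses of a tiny document. The following is only a sketch
under stated assumptions: it uses the standard JAXP/SAX API rather
than Xerces 1.x classes directly, the class name ValidationTiming,
the inline DTD, and the run count are all made up for illustration,
and it measures whole parses, not DFA construction on its own.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class ValidationTiming {
    // Hypothetical tiny document with an inline DTD (not one of the
    // sample files from the measurements in this message).
    static final String DOC =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE root [\n"
        + "  <!ELEMENT root (a, (b | c)*, d?)>\n"
        + "  <!ELEMENT a EMPTY>\n"
        + "  <!ELEMENT b EMPTY>\n"
        + "  <!ELEMENT c EMPTY>\n"
        + "  <!ELEMENT d EMPTY>\n"
        + "]>\n"
        + "<root><a/><b/><c/><b/><d/></root>\n";

    // One shared factory; a fresh parser is created for every document.
    private static final SAXParserFactory FACTORY = newFactory();

    private static SAXParserFactory newFactory() {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true); // DTD validation
        return factory;
    }

    /** Parses and validates xml; returns the number of elements seen. */
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            SAXParser parser = FACTORY.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }

                @Override
                public void error(SAXParseException e) throws SAXParseException {
                    throw e; // treat validity errors as failures
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }

    /** Wall-clock milliseconds for the given number of validating parses. */
    static long timeParses(String xml, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            countElements(xml);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int runs = 1000; // arbitrary
        System.out.println(runs + " validating parses: "
            + timeParses(DOC, runs) + " ms");
    }
}
```

Swapping in a document with a more complex content model and
comparing the totals gives a crude feel for how much the grammar
side costs; early iterations also include JIT warm-up, so the
figures are only indicative.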

Here are some comparison numbers provided by Akihiko:

- original one (Xerces 1.2.3)

[atozawa@atozawa xml]$ java -classpath /home/atozawa/java/xerces-1_2_3/bin/xerces.jar:/home/atozawa/java/xerces-1_2_3/bin/xercesSamples.jar sax.SAXCount -v sample.xml
DFA build: 30142ms
sample.xml: 30979 ms (3 elems, 0 attrs, 0 spaces, 0 chars)
[atozawa@atozawa xml]$ java -classpath /home/atozawa/java/xerces-1_2_3/bin/xerces.jar:/home/atozawa/java/xerces-1_2_3/bin/xercesSamples.jar sax.SAXCount -v sample3.xml
DFA build: 192488ms
sample3.xml: 192917 ms (26 elems, 0 attrs, 0 spaces, 0 chars)
[atozawa@atozawa xml]$ /opt/jdk1.3/bin/java -Xint -classpath /home/atozawa/java/xerces-1_2_3/bin/xerces.jar:/home/atozawa/java/xerces-1_2_3/bin/xercesSamples.jar sax.SAXCount -v REC-xml-20001006.xml
DFA build: 22ms
DFA build: 22ms
DFA build: 22ms
DFA build: 24ms
DFA build: 30ms
DFA build: 31ms
DFA build: 31ms
DFA build: 32ms
DFA build: 44ms
DFA build: 47ms
DFA build: 49ms
DFA build: 50ms
DFA build: 52ms
DFA build: 52ms
DFA build: 52ms
DFA build: 56ms
DFA build: 64ms
DFA build: 66ms
DFA build: 66ms
DFA build: 66ms
DFA build: 68ms
DFA build: 68ms
DFA build: 68ms
DFA build: 68ms
REC-xml-20001006.xml: 2589 ms (3037 elems, 3023 attrs, 1440 spaces, 116063 chars)

- modified one (patch)

[atozawa@atozawa xml]$ java -classpath /home/atozawa/java/xerces-1_2_3/bin/xerces.jar:/home/atozawa/java/xerces-1_2_3/bin/xercesSamples.jar sax.SAXCount -v sample.xml
DFA build: 500ms
sample.xml: 1030 ms (3 elems, 0 attrs, 0 spaces, 0 chars)
[atozawa@atozawa xml]$ java -classpath /home/atozawa/java/xerces-1_2_3/bin/xerces.jar:/home/atozawa/java/xerces-1_2_3/bin/xercesSamples.jar sax.SAXCount -v sample3.xml
DFA build: 8ms
sample3.xml: 413 ms (26 elems, 0 attrs, 0 spaces, 0 chars)
[atozawa@atozawa xml]$ /opt/jdk1.3/bin/java -Xint -classpath /home/atozawa/java/xerces-1_2_3/bin/xerces.jar:/home/atozawa/java/xerces-1_2_3/bin/xercesSamples.jar sax.SAXCount -v REC-xml-20001006.xml
DFA build: 5ms
DFA build: 5ms
DFA build: 5ms
DFA build: 6ms
DFA build: 7ms
DFA build: 7ms
DFA build: 8ms
DFA build: 9ms
DFA build: 11ms
DFA build: 12ms
DFA build: 13ms
DFA build: 17ms
DFA build: 18ms
DFA build: 18ms
DFA build: 19ms
DFA build: 20ms
DFA build: 21ms
DFA build: 22ms
DFA build: 22ms
DFA build: 23ms
DFA build: 25ms
DFA build: 25ms
DFA build: 25ms
DFA build: 26ms
REC-xml-20001006.xml: 2591 ms (3037 elems, 3023 attrs, 1440 spaces, 116063 chars)
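
For the two sample files, the share of each run spent in the DFA
build can be worked out directly from the figures above. A small
sketch of that arithmetic follows; the class and method names are
made up for illustration, and the millisecond values are copied from
the "DFA build" and total lines of the sample.xml and sample3.xml
runs.

```java
final class DfaBuildShare {
    /** Percentage of the total run time spent building the DFA. */
    static double sharePercent(long dfaMs, long totalMs) {
        return 100.0 * dfaMs / totalMs;
    }

    /** How many times faster the "after" figure is than the "before" one. */
    static double speedup(long beforeMs, long afterMs) {
        return (double) beforeMs / afterMs;
    }

    static void report(String file, long dfaBefore, long totalBefore,
                       long dfaAfter, long totalAfter) {
        System.out.printf(
            "%s: DFA share %.1f%% -> %.1f%%, DFA build %.0fx faster, total %.0fx faster%n",
            file,
            sharePercent(dfaBefore, totalBefore),
            sharePercent(dfaAfter, totalAfter),
            speedup(dfaBefore, dfaAfter),
            speedup(totalBefore, totalAfter));
    }

    public static void main(String[] args) {
        // Figures (ms) taken from the runs quoted in this message.
        report("sample.xml", 30142, 30979, 500, 1030);
        report("sample3.xml", 192488, 192917, 8, 413);
    }
}
```

Before the patch the DFA build accounts for roughly 97% or more of
both sample runs; afterwards it is just under half of the sample.xml
run and about 2% of the sample3.xml run.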

-- 
Andy Clark * IBM, TRL - Japan * andyc@apache.org