Posted to c-users@xalan.apache.org by Holger Flörke <fl...@doctronic.de> on 2004/04/01 08:58:34 UTC

performance of xsl:strip-space/xsl:preserve-space

Hi there,

First of all, I would like to mention that I am not a performance freak. I
think conformance and stability are more important than a few percent of
performance.

I had a look at the 2003 "Sarvega XSLT Benchmark study". There is one
test case where Xalan-C (1.5, but also 1.7) reaches only 13% of the average
throughput of all other tested processors (xalanj, libxslt, saxon, resin,
xsltc, xt, msxml, jd). A really bad outlier. The test case is a
transformation of one DocBook sample document to HTML using the standard
DocBook stylesheets.

I glanced at the code and did some performance measurements of my own.
Xalan-C is up to 5 times faster for this DocBook transformation if I
disable the whitespace-stripping processing. I did this by modifying the
function StylesheetRoot::shouldStripSourceNode to return false all the time
(a really radical method that definitely produces wrong results ;^). The
reason for this performance loss is, in my opinion, the handling of the
element names given in the stylesheet's "xsl:preserve-space" and
"xsl:strip-space" elements. They are evaluated and scored as full XPaths,
which is an expensive operation.
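
To illustrate the kind of per-node work involved, here is a minimal C++
sketch of the strip/preserve decision for the parent element of a
whitespace-only text node, following the XSLT 1.0 rules for name tests
("*" has default priority -0.5, "prefix:*" has -0.25, a plain name has 0).
This is not Xalan-C code: SpaceRule, priority, matches, and shouldStrip
(the free function here) are hypothetical names, prefixes stand in for
namespace URIs, import precedence is ignored, and ties go to the later
rule. Every call scans and scores every rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One name test taken from xsl:strip-space or xsl:preserve-space
// (hypothetical type, for illustration only).
struct SpaceRule {
    std::string nameTest;  // "*", "prefix:*", or a name such as "para"
    bool strip;            // true: xsl:strip-space, false: xsl:preserve-space
};

// True if the name test has the form "prefix:*".
static bool isPrefixWildcard(const std::string& t) {
    return t.size() > 2 && t.compare(t.size() - 2, 2, ":*") == 0;
}

// Default priority of a name test as defined for patterns in XSLT 1.0.
double priority(const std::string& t) {
    if (t == "*") return -0.5;
    if (isPrefixWildcard(t)) return -0.25;
    return 0.0;
}

// Assumption: comparing prefixes stands in for comparing namespace URIs.
bool matches(const std::string& t, const std::string& qname) {
    if (t == "*") return true;
    if (isPrefixWildcard(t)) {
        const std::string::size_type n = t.size() - 1;  // length of "prefix:"
        return qname.compare(0, n, t, 0, n) == 0;
    }
    return t == qname;
}

// Decide whether whitespace-only text below parentName is stripped.
// The default is to preserve; the best-scoring matching rule wins.
bool shouldStrip(const std::vector<SpaceRule>& rules,
                 const std::string& parentName) {
    assert(!parentName.empty());
    bool strip = false;
    double best = -1.0;  // below the lowest default priority
    for (const SpaceRule& r : rules) {
        const double p = priority(r.nameTest);
        if (p >= best && matches(r.nameTest, parentName)) {
            best = p;
            strip = r.strip;
        }
    }
    return strip;
}
```

With the rules "*" (strip) and "programlisting" (preserve), a call for
"para" reports stripping and a call for "programlisting" does not. The cost
grows with the number of rules times the number of whitespace-only text
nodes in the source document.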

Maybe I am wrong with my analysis, but if I am right, I think one should 
mention this behaviour within the section "What can I do to speed up 
transformations?" of the Xalan-C-FAQ.

I do not use any xml:space, xsl:preserve-space, or xsl:strip-space within
my stylesheets and documents, so I have no problem with this
bottleneck, but other people may be affected.

HolgeR
-- 
holger floerke                      d  o  c  t  r  o  n  i  c
email floerke@doctronic.de          information publishing + retrieval
phone +49 2222 9292 90              http://www.doctronic.de

Re: performance of xsl:strip-space/xsl:preserve-space

Posted by Dmitry Hayes <dm...@ca.ibm.com>.



Hi!

>Maybe I am wrong with my analysis, but if I am right, I think one should
>mention this behaviour within the section "What can I do to speed up
>transformations?" of the Xalan-C-FAQ.

Thanks, we will do that for the upcoming release.

Dmitry.



Re: performance of xsl:strip-space/xsl:preserve-space

Posted by Holger Flörke <fl...@doctronic.de>.

Hi David,

> OK, that's really bad.  Can you provide more details about what we need to
> reproduce this and we'll see what we can do?
I used

   http://www.sarvega.com/xslt-benchmark.php

to download the benchmark.

There is a PDF document describing the results and the testbed, with all
sample data and stylesheets. I have not compiled the driver for Xalan, but
used the testXSLT command-line program for testing.

"""
time `testXSLT17 \
       -IN testfiles/docbook-xsl/test/book2.xml \
       -XSL testfiles/docbook-xsl/html/docbook.xsl \
       > /dev/null  2> /dev/null`
"""
=>
real    0m3.728s
user    0m3.670s
sys     0m0.040s

"""
time `saxon testfiles/docbook-xsl/test/book2.xml \
      testfiles/docbook-xsl/html/docbook.xsl \
      > /dev/null  2> /dev/null`
"""
=>
real    0m2.553s
user    0m2.360s
sys     0m0.100s

That does not look so bad for Xalan, but the saxon script has to load the
JVM, and the real benchmark is run with precompiled stylesheets.

HolgeR



Re: performance of xsl:strip-space/xsl:preserve-space

Posted by da...@us.ibm.com.



Hi HolgeR,

> First of all, I would like to mention that I am not a performance freak.
> I think conformance and stability are more important than a few percent
> of performance.

Me too...

> I had a look at the 2003 "Sarvega XSLT Benchmark study". There is one
> test case where Xalan-C (1.5, but also 1.7) reaches only 13% of the
> average throughput of all other tested processors (xalanj, libxslt,
> saxon, resin, xsltc, xt, msxml, jd). A really bad outlier. The test case
> is a transformation of one DocBook sample document to HTML using the
> standard DocBook stylesheets.

OK, that's really bad.  Can you provide more details about what we need to
reproduce this and we'll see what we can do?

> I glanced at the code and did some performance measurements of my own.
> Xalan-C is up to 5 times faster for this DocBook transformation if I
> disable the whitespace-stripping processing. I did this by modifying the
> function StylesheetRoot::shouldStripSourceNode to return false all the
> time (a really radical method that definitely produces wrong results
> ;^). The reason for this performance loss is, in my opinion, the
> handling of the element names given in the stylesheet's
> "xsl:preserve-space" and "xsl:strip-space" elements. They are evaluated
> and scored as full XPaths, which is an expensive operation.

OK, there are a couple of problems from what I can see.  We rely on using
match patterns (not full XPath expressions), because that was probably the
easiest way to do it when things were implemented.  I've wanted to make
improvements to this code for a while now, but I think we will need to
prioritize it.  We also have a bit of an implementation issue, because we
are storing all of these match patterns in the root stylesheet, which is
not correct.  I think we can do a staged improvement of this which will
help alleviate some of the performance problems.
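
As a rough idea of what one stage could look like, here is a minimal C++
sketch that resolves the name tests once, when the stylesheet is built, into
a hash table, so each source node costs one lookup instead of a scan over
match patterns. It is only a sketch under stated assumptions, not the
planned Xalan-C change: SpaceTable is a hypothetical class, only plain
element names and "*" are handled (no "prefix:*", no namespace URIs), and
import precedence is ignored, so a later entry simply overrides an earlier
one for the same name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical lookup table built once per stylesheet from the
// xsl:strip-space / xsl:preserve-space name tests.
class SpaceTable {
public:
    // Each entry is (element name or "*", strip?), in stylesheet order.
    explicit SpaceTable(
        const std::vector<std::pair<std::string, bool>>& entries) {
        for (const auto& e : entries) {
            assert(!e.first.empty());
            if (e.first == "*") {
                wildcardStrip_ = e.second;   // applies to names not listed
            } else {
                exact_[e.first] = e.second;  // plain name beats the wildcard
            }
        }
    }

    // One hash lookup per whitespace-only text node.
    bool shouldStrip(const std::string& parentName) const {
        const auto it = exact_.find(parentName);
        return it != exact_.end() ? it->second : wildcardStrip_;
    }

private:
    std::unordered_map<std::string, bool> exact_;
    bool wildcardStrip_ = false;  // XSLT default: whitespace is preserved
};
```

An exact name always wins over "*" here because a plain name has a higher
default priority than the wildcard, which is why the two can be stored
separately.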

> Maybe I am wrong with my analysis, but if I am right, I think one should
> mention this behaviour within the section "What can I do to speed up
> transformations?" of the Xalan-C-FAQ.

Yes, we can mention it, since it can be a performance issue even with
a good implementation.

Thanks for the post!

Dave