Posted to j-users@xalan.apache.org by Pavel Ausianik <Pa...@epam.com> on 2002/09/20 11:33:45 UTC

Using single input twice

Hello,

I need to transform a single XML document, held as a byte[], twice, and
I wonder what the fastest way to do this is.

Two ways I have already tried are:

1. Parse the input stream once to create a DOM document, then use a DOMSource twice.
2. Create a StreamSource twice and let each transformation parse it independently.
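
For concreteness, the two approaches might look roughly like the sketch below. It uses only the standard JAXP interfaces; the class name TwiceFromBytes and the helpers compile, run, viaDom and viaStreams are made-up names for illustration, and the stylesheets are assumed to be available as strings.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

final class TwiceFromBytes {

    // Hypothetical helper: compile a stylesheet held in a String once,
    // so the compilation cost is not part of either approach.
    static Templates compile(String xslt) throws Exception {
        return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));
    }

    // Hypothetical helper: run one transformation and return the output as text.
    static String run(Templates templates, Source source) throws Exception {
        StringWriter out = new StringWriter();
        templates.newTransformer().transform(source, new StreamResult(out));
        return out.toString();
    }

    // Approach 1: parse the bytes once into a DOM, then wrap the same
    // Document in a DOMSource for each transformation.
    static String[] viaDom(byte[] xml, Templates first, Templates second) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT processing expects a namespace-aware DOM
        Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        return new String[] {
            run(first, new DOMSource(doc)),
            run(second, new DOMSource(doc))
        };
    }

    // Approach 2: build a fresh StreamSource over the same bytes for each
    // transformation, so the document is parsed once per pass.
    static String[] viaStreams(byte[] xml, Templates first, Templates second) throws Exception {
        return new String[] {
            run(first, new StreamSource(new ByteArrayInputStream(xml))),
            run(second, new StreamSource(new ByteArrayInputStream(xml)))
        };
    }
}
```

Both methods should produce the same output for the same input; only the amount of parsing and tree building differs.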

Surprisingly, the performance of the two approaches is very close (the
transformation in the second case takes more time, but the first solution
requires an additional step to parse the input).

What other ways could make this faster?

Thanks,
Pavel

Re: Using single input twice

Posted by Simon Kitching <si...@ecnetwork.co.nz>.
I have a similar issue. I am actually applying a *tree* of stylesheets
to an input message, generating a set of result messages.

I currently use DOMs as intermediate stages, but am well aware of the
performance and space penalty this imposes.

I have been contemplating creating a SAX filter which essentially
"copies" events to multiple outputs. I haven't got much further than the
general idea...
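
A rough sketch of that idea, untested against any real workload: a handler that forwards each SAX content event to several TransformerHandlers, so the input is parsed only once. The class name SaxTee and the method transformBoth are made up for illustration. Only ContentHandler events are copied (comments and other lexical events are not), and each TransformerHandler presumably still builds its own internal model of the document, so this would save parsing time rather than memory.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical "tee": forwards every content event to all targets, in order.
final class SaxTee extends DefaultHandler {
    private final ContentHandler[] targets;

    SaxTee(ContentHandler... targets) { this.targets = targets; }

    @Override public void startDocument() throws SAXException {
        for (ContentHandler h : targets) h.startDocument();
    }
    @Override public void endDocument() throws SAXException {
        for (ContentHandler h : targets) h.endDocument();
    }
    @Override public void startPrefixMapping(String prefix, String uri) throws SAXException {
        for (ContentHandler h : targets) h.startPrefixMapping(prefix, uri);
    }
    @Override public void endPrefixMapping(String prefix) throws SAXException {
        for (ContentHandler h : targets) h.endPrefixMapping(prefix);
    }
    @Override public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        for (ContentHandler h : targets) h.startElement(uri, local, qName, atts);
    }
    @Override public void endElement(String uri, String local, String qName) throws SAXException {
        for (ContentHandler h : targets) h.endElement(uri, local, qName);
    }
    @Override public void characters(char[] ch, int start, int length) throws SAXException {
        for (ContentHandler h : targets) h.characters(ch, start, length);
    }
    @Override public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        for (ContentHandler h : targets) h.ignorableWhitespace(ch, start, length);
    }
    @Override public void processingInstruction(String target, String data) throws SAXException {
        for (ContentHandler h : targets) h.processingInstruction(target, data);
    }
    @Override public void skippedEntity(String name) throws SAXException {
        for (ContentHandler h : targets) h.skippedEntity(name);
    }

    // Parse the bytes once and feed two stylesheets (given as strings) from the same event stream.
    static String[] transformBoth(byte[] xml, String xsltA, String xsltB) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("factory does not support SAX transformer handlers");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        StringWriter outA = new StringWriter();
        StringWriter outB = new StringWriter();
        TransformerHandler a = stf.newTransformerHandler(new StreamSource(new StringReader(xsltA)));
        TransformerHandler b = stf.newTransformerHandler(new StreamSource(new StringReader(xsltB)));
        a.setResult(new StreamResult(outA)); // results must be set before parsing starts
        b.setResult(new StreamResult(outB));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(new SaxTee(a, b));
        reader.parse(new InputSource(new ByteArrayInputStream(xml)));
        return new String[] { outA.toString(), outB.toString() };
    }
}
```

For a tree of stylesheets, the Result of one TransformerHandler could in principle be a SAXResult wrapping another SaxTee, though I have not tried chaining them that way.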

Cheers,

Simon

On Sat, 2002-09-21 at 01:36, Joseph Kesselman wrote:
> I'm not very surprised by your results. There are known performance issues 
> right now in operating from a DOMSource -- essentially, DOM2DTM is a 
> fairly expensive wrapper and comes all too close to "reparsing" the 
> document as the stylesheet walks through it. We knew this was an issue 
> when we wrote it; DOM2DTM was produced in a bit of a hurry and we hadn't 
> had time to go back and reconsider its design.
> 
> I'm currently working on an alternative that might or might not be better 
> for some DOMs, and specifically addresses the question of repeatedly 
> processing the same nodes -- and yes, that's weasel wording; I honestly 
> don't know whether it will work out or not.  I hope to have an initial 
> prototype running next week. More news when and if.
> 
> Meanwhile... All I can suggest is that you use whichever seems to work 
> better for you, and be prepared to reconsider that choice as our code or 
> your application continue to evolve.
> 
> "It is very clear that this is highly ambiguous."
> 
> ______________________________________
> Joe Kesselman  / IBM Research
> 



Re: Using single input twice

Posted by Joseph Kesselman <ke...@us.ibm.com>.
I'm not very surprised by your results. There are known performance issues 
right now in operating from a DOMSource -- essentially, DOM2DTM is a 
fairly expensive wrapper and comes all too close to "reparsing" the 
document as the stylesheet walks through it. We knew this was an issue 
when we wrote it; DOM2DTM was produced in a bit of a hurry and we hadn't 
had time to go back and reconsider its design.

I'm currently working on an alternative that might or might not be better 
for some DOMs, and specifically addresses the question of repeatedly 
processing the same nodes -- and yes, that's weasel wording; I honestly 
don't know whether it will work out or not.  I hope to have an initial 
prototype running next week. More news when and if.

Meanwhile... All I can suggest is that you use whichever seems to work 
better for you, and be prepared to reconsider that choice as our code or 
your application continue to evolve.

"It is very clear that this is highly ambiguous."

______________________________________
Joe Kesselman  / IBM Research