Posted to dev@synapse.apache.org by "Andreas Veithen (JIRA)" <ji...@apache.org> on 2007/12/24 00:42:42 UTC

[jira] Created: (SYNAPSE-213) Improve handling of input in XSLTMediator

Improve handling of input in XSLTMediator
-----------------------------------------

Key: SYNAPSE-213
URL: https://issues.apache.org/jira/browse/SYNAPSE-213
Project: Synapse
Issue Type: Improvement
Components: Core
Affects Versions: 1.1, NIGHTLY
Reporter: Andreas Veithen
Priority: Minor

Currently XSLTMediator uses two different strategies to feed the XML input into the XSLT processor:

* When useDOMSourceAndResults is set to false, the Axiom tree will be serialized to a byte stream (in memory or to a temporary file for large documents) and then fed into the XSLT processor using a StreamSource object.
* When useDOMSourceAndResults is set to true, the code will call ElementHelper.importOMElement to get a DOM compliant version of the Axiom tree. The resulting DOM tree is then passed to the XSLT processor using a DOMSource.
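
For illustration, the two input paths can be sketched with plain JAXP. This is only a sketch, not the XSLTMediator code: the class name InputStrategies, the sample stylesheet and the helper methods are hypothetical, and a byte array parsed by the JDK stands in for both the serialized Axiom tree and the result of ElementHelper.importOMElement.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class InputStrategies {
    // Hypothetical stylesheet: prints the number of <item> elements as text.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet against any JAXP Source and returns the output.
    static String transform(Source input) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(input, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // useDOMSourceAndResults = false: the serialized bytes are handed to the
    // processor, which parses them again into its own representation.
    public static String viaStreamSource(byte[] serialized) {
        return transform(new StreamSource(new ByteArrayInputStream(serialized)));
    }

    // useDOMSourceAndResults = true: a separate DOM tree is built first
    // (here by parsing; in Synapse by copying the Axiom tree) and passed on.
    public static String viaDomSource(byte[] serialized) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new ByteArrayInputStream(serialized));
            return transform(new DOMSource(doc));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<list><item/><item/><item/></list>"
            .getBytes(StandardCharsets.UTF_8);
        System.out.println(viaStreamSource(xml));
        System.out.println(viaDomSource(xml));
    }
}
```

In both paths an intermediate representation (byte array or DOM tree) exists alongside the original tree before the processor builds its own model.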

First it should be noted that using a temporary file for the XML input (in contrast to the output of the transformation) doesn't eliminate the need to keep the entire input document in memory. Indeed:

* When the input is read, Axiom will build the entire tree and keep it in memory.
* Due to the way XSLT works, the XSLT processor also requires a complete in-memory representation of the input document. The only exception is for XSLT processors that support streaming, which is not the case for Xalan. Xalan uses its own object model called DTM (Document Table Model) to store the input document in memory.

Since the input document must be kept in memory anyway, the only question is how to efficiently feed the original Axiom tree into the XSLT processor without creating too much overhead and consuming too much memory. Assuming that Xalan is used, the current situation is as follows:

* When useDOMSourceAndResults is set to false, three copies of the XML input will be built: the Axiom tree, the serialized byte stream and Xalan's DTM representation. When temporary files are used for large documents, only two will coexist in memory. However, using temporary files introduces a large overhead.
* When useDOMSourceAndResults is set to true, at least two copies of the input will be built: the Axiom tree and the DOM tree. Indeed, from the code in ElementHelper.importOMElement it can be seen that an entirely new copy of the input tree will be created. In addition, Xalan will create a DTM representation of the DOM tree. The document at http://xml.apache.org/xalan-j/dtm.html suggests that this representation is not a complete copy of the DOM tree, but a wrapper/adapter that is backed by the original DOM tree.

Both strategies used by XSLTMediator are far from optimal. There are at least two strategies that should give better results (with at least one of them being actually simpler):

* Trick Axis2 into producing a DOM compatible tree from the outset, by using a StAXSOAPModelBuilder with a DOMSOAPFactory (this produces objects that implement both the Axiom and DOM interfaces). This however might require some tweaking. The advantage is that there is no need to create a copy anymore. Xalan will only create a DTM wrapper around the existing tree.
* Make sure that a DTM representation is created directly from the Axiom tree without an intermediate copy (byte stream or DOM tree). With Java 6/JAXP 1.4 this would be very easy because it has support for StAXSource, which integrates nicely with Axiom. In the meantime, the solution is to pull StAX events from Axiom, convert them to SAX events and push them to the XSLT processor. The Spring WS project has a utility class StaxSource (extending SAXSource) that does this in a completely transparent way (new StaxSource(omElement.getXMLStreamReader())). By using getXMLStreamReaderWithoutCaching instead of getXMLStreamReader, this could probably be further optimized to instruct Axiom not to create the tree for the part of the input message that is being transformed (unless it has already been constructed at that moment).
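
As a rough sketch of the second strategy on Java 6/JAXP 1.4, an XMLStreamReader can be passed to the processor through javax.xml.transform.stax.StAXSource. The class name StaxInput and the sample stylesheet are hypothetical, and a reader created by the JDK's XMLInputFactory stands in for the one that omElement.getXMLStreamReader() would return. It also assumes the TransformerFactory in use accepts StAXSource, which the JDK's built-in processor does.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class StaxInput {
    // Hypothetical stylesheet: extracts the text of /order/customer.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='/order/customer'/></xsl:template>"
      + "</xsl:stylesheet>";

    // Feeds StAX events straight into the XSLT processor: no byte stream
    // and no DOM tree is created between the reader and the processor.
    public static String transform(XMLStreamReader reader) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            // The reader must be positioned at START_DOCUMENT or START_ELEMENT.
            t.transform(new StAXSource(reader), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for obtaining a reader from an Axiom element.
    public static String transform(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            return transform(reader);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            transform("<order><customer>Alice</customer></order>"));
    }
}
```

On a pre-Java 6 runtime the same shape would apply with a StAX-to-SAX adapter such as the Spring WS StaxSource in place of StAXSource.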

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: synapse-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: synapse-dev-help@ws.apache.org

[jira] Updated: (SYNAPSE-213) Improve handling of input in XSLTMediator

Posted by "Asankha C. Perera (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/SYNAPSE-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Asankha C. Perera updated SYNAPSE-213:
--------------------------------------

    Fix Version/s: 1.2

fix post 1.1.1 release
