
Posted to c-users@xalan.apache.org by Anthony Zawacki <zw...@us.ibm.com> on 2003/01/14 21:15:38 UTC

Upgrade option recommendation request




Our team is in the process of upgrading from Xerces-C++ 1.7 to Xerces-C++
2.1 and from Xalan-C++ 1.3 to Xalan-C++ 1.4.

As it turns out, I am the only developer who uses Xalan in my code.
Currently, I use a mixture of Xerces and Xalan to do the appropriate
processing.  Now with the upgrade to Xerces-C++ 2.1, I was planning to
continue to use the deprecated APIs.  Of course, I found very quickly that
that wouldn't work.  So now I have to make a decision:

1)  Upgrade my Xerces related code to use the new API and continue to wrap
the DOM for Xalan use.
2)  Remove Xerces usage altogether, and use the Xalan DOM strictly.

The reason I have not moved to #2 before this point is that my application
has quite a bit of direct DOM manipulation:
o  Inserting and deleting attributes directly on every message that is
processed.
o  Inserting, storing, and removing entire subtrees on every message that
is processed.

The current usage is fairly optimized: where possible, compiled XSL and
XPath expressions are cached, and I have not run into any performance
issues up until now.
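
For readers unfamiliar with the caching mentioned above, a minimal sketch of
the idea follows. It is not taken from the application and does not use any
Xalan-C++ class: CompiledCache, its Compiler callback, and the Compiled type
are hypothetical names, and the real code would plug in whatever compiled
stylesheet or XPath object the library hands back.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache of compiled expressions keyed by their source text.
// "Compiled" stands in for whatever object the XSLT/XPath library produces.
template <typename Compiled>
class CompiledCache {
public:
    using Compiler =
        std::function<std::shared_ptr<Compiled>(const std::string&)>;

    explicit CompiledCache(Compiler compile) : compile_(std::move(compile)) {}

    // Returns the cached object, compiling the source only on first use.
    std::shared_ptr<Compiled> get(const std::string& source) {
        auto it = cache_.find(source);
        if (it != cache_.end()) {
            return it->second;
        }
        std::shared_ptr<Compiled> compiled = compile_(source);
        cache_.emplace(source, compiled);
        return compiled;
    }

    std::size_t size() const { return cache_.size(); }

private:
    Compiler compile_;
    std::unordered_map<std::string, std::shared_ptr<Compiled>> cache_;
};
```

With a cache like this, each expression named in the configuration file is
compiled once, and every later message reuses the same compiled object.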

However, if I go for #2, the number of transformations required to process
a message will go up significantly.  Processing of an average message
starts by inserting three attributes into the root node of the message.
Next, two subtrees are inserted a few layers down in the message.  Now,
several XPath expressions are tested based on a configuration file, and
based on the results of the XPath expressions, various actions take place,
including execution of transformations and additional XPath expressions.
At some point, my application determines that it needs to send outbound
messages.  At that point, the three attributes and two subtrees are removed
from the document, the document is serialized, and transmitted to the
appropriate process (again, in case anyone is wondering, based on the
configuration file).  The application is highly configurable, and basically
interprets a configuration file and executes the instructions.  It's also
important to note that the XSL is embedded in the configuration file, and I
pull the XSL out and create my own DOM document for compiling the XSL.  The
configuration file is XML based, but I require only read-only access to it,
so I think I can use the Xalan DOM just to parse the document, and then
access it.  I don't think that I can continue to use the Xerces based DOM
simply because I would then have to map it back to the Xalan DOM for
compiling anyway.

Will the performance benefit of using just the Xalan DOM be enough of a
boost to overcome performing, literally, tens of additional transformations
per message?
What parameters should go into my decision between #1 and #2 above?
Is there additional information that would be useful in helping me make
this decision?

Sorry for the long post, and thank you for your help.

Thanks,
Anthony Zawacki

410-571-7161
zwacki@us.ibm.com

Re: Upgrade option recommendation request

Posted by David N Bertoni/Cambridge/IBM <da...@us.ibm.com>.

Hi Anthony,

I'm not sure what you mean when you say continuing to use the deprecated
DOM won't work -- Xalan 1.4 only supports that DOM. There will be support
for the new Xerces DOM in 1.5, and that support is also available in the
latest CVS code.

One way you could re-work your application is to use SAX parsing. If you
can determine which modifications you want to make to the tree at parse
time, you can do it on the fly and build the modified tree directly using
Xalan's source tree. That will be much faster for transformations. Of
course, you cannot modify the tree once it's built, but you could certainly
filter out the stuff you added when you go to serialize the tree.
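
To make the filtering idea concrete, here is a rough sketch of the pattern,
assuming the three root attributes are known at parse time. It deliberately
does not use the Xerces-C++ or Xalan-C++ interfaces: EventHandler,
Serializer, RootAttributeInjector, and RootAttributeStripper are
hypothetical stand-ins for SAX-style handlers, and the real application
would use the libraries' own handler classes instead.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical SAX-style events; NOT the Xerces-C++ or Xalan-C++ API.
using Attributes = std::map<std::string, std::string>;

struct EventHandler {
    virtual ~EventHandler() = default;
    virtual void startElement(const std::string& name,
                              const Attributes& attrs) = 0;
    virtual void endElement(const std::string& name) = 0;
    virtual void characters(const std::string& text) = 0;
};

// Terminal handler: writes events out as XML text (no escaping, for brevity).
class Serializer : public EventHandler {
public:
    void startElement(const std::string& name,
                      const Attributes& attrs) override {
        out_ << '<' << name;
        for (const auto& kv : attrs) {
            out_ << ' ' << kv.first << "=\"" << kv.second << '"';
        }
        out_ << '>';
    }
    void endElement(const std::string& name) override {
        out_ << "</" << name << '>';
    }
    void characters(const std::string& text) override { out_ << text; }
    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

// Parse-time filter: adds extra attributes to the root element only.
class RootAttributeInjector : public EventHandler {
public:
    RootAttributeInjector(EventHandler& next, Attributes extra)
        : next_(next), extra_(std::move(extra)) {}
    void startElement(const std::string& name,
                      const Attributes& attrs) override {
        if (depth_++ == 0) {
            Attributes merged = attrs;
            merged.insert(extra_.begin(), extra_.end());  // existing keys win
            next_.startElement(name, merged);
        } else {
            next_.startElement(name, attrs);
        }
    }
    void endElement(const std::string& name) override {
        --depth_;
        next_.endElement(name);
    }
    void characters(const std::string& text) override {
        next_.characters(text);
    }

private:
    EventHandler& next_;
    Attributes extra_;
    int depth_ = 0;
};

// Serialize-time filter: drops the named attributes from the root element.
class RootAttributeStripper : public EventHandler {
public:
    RootAttributeStripper(EventHandler& next, std::set<std::string> names)
        : next_(next), names_(std::move(names)) {}
    void startElement(const std::string& name,
                      const Attributes& attrs) override {
        if (depth_++ == 0) {
            Attributes kept;
            for (const auto& kv : attrs) {
                if (names_.count(kv.first) == 0) kept.insert(kv);
            }
            next_.startElement(name, kept);
        } else {
            next_.startElement(name, attrs);
        }
    }
    void endElement(const std::string& name) override {
        --depth_;
        next_.endElement(name);
    }
    void characters(const std::string& text) override {
        next_.characters(text);
    }

private:
    EventHandler& next_;
    std::set<std::string> names_;
    int depth_ = 0;
};
```

In this sketch the injector would sit between the parser and whatever
handler builds the source tree, and the stripper between the tree and the
serializer. The inserted subtrees could be handled the same way, by emitting
extra events at the right depth on the way in and suppressing them on the
way out.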

At any rate, if XPath and XSLT performance is more important to you than a
randomly-mutable tree, you can re-work your application. Otherwise, stay
with the old Xerces DOM and migrate to the new one when the next version of
Xalan is released.

Does that make sense?

Dave
