Posted to c-dev@xerces.apache.org by "Qiu, Wenning" <We...@csgsystems.com> on 2004/05/04 18:30:21 UTC

RE: Performance degradation

About half of the performance degradation in DOMWriterImpl was due to namespace handling. Even applications that do not use namespaces have to pay the price for it.

The overhead comes from the data structure used to handle namespaces: a stack of hash maps of namespace bindings. The current implementation creates a hash map for every element node and pushes it onto the stack as it traverses the DOM tree. My test case did not use any namespaces, but half of DOMWriterImpl's 65% performance drop came from constructing and destroying the stack of empty hash maps.

I wonder if the following optimizations are possible:

1) Both DOMParser and DOMBuilder have a DoNamespace feature. Is it possible for the DOMDocument to carry that feature flag, so that DOMWriter can completely bypass the namespace handling if it knows the DOM does not use namespaces?

2) Even when namespaces are used, it seems highly inefficient to create and destroy a hash map for each DOMElement node. A map is only necessary when that node introduces a new namespace binding.
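
To illustrate suggestion 2, here is a minimal sketch of a scope stack that allocates a hash map only for elements that actually declare a namespace binding. It uses standard C++ containers rather than the Xerces hash table and vector classes, and LazyNamespaceStack and all of its members are hypothetical names, not part of DOMWriterImpl.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical sketch: one stack slot per open element, but the hash map
// behind a slot is only allocated when that element declares a binding.
class LazyNamespaceStack {
public:
    using Bindings = std::unordered_map<std::string, std::string>;

    // Entering an element pushes a null slot; no map is constructed.
    void pushScope() { scopes_.emplace_back(); }

    // Leaving an element destroys a map only if one was ever created.
    void popScope() {
        assert(!scopes_.empty());
        scopes_.pop_back();
    }

    // Record prefix -> URI for the innermost open element.
    void declare(const std::string& prefix, const std::string& uri) {
        assert(!scopes_.empty());
        std::unique_ptr<Bindings>& top = scopes_.back();
        if (!top) {
            top = std::make_unique<Bindings>();
            ++mapsCreated_;
        }
        (*top)[prefix] = uri;
    }

    // Search from the innermost scope outwards; nullptr means "unbound".
    const std::string* lookup(const std::string& prefix) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            if (!*it) continue;  // element declared nothing, skip it
            auto found = (*it)->find(prefix);
            if (found != (*it)->end()) return &found->second;
        }
        return nullptr;
    }

    // Number of hash maps actually allocated so far.
    std::size_t mapsCreated() const { return mapsCreated_; }

private:
    std::vector<std::unique_ptr<Bindings>> scopes_;
    std::size_t mapsCreated_ = 0;
};
```

With this layout, a document that declares no namespaces never constructs a map while it is being traversed; pushScope() and popScope() only touch the vector of null slots.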

It seems to me there are some places we can definitely optimize DOMWriter's performance. 

My application runs about 10% faster with DOMWriter replaced by my own serializer (built with xercesc.2.5.0, STLport.4.6.2, libhoard.2.1.2d, expat.1.95.7) than the same code built with xercesc.2.1.0, STLport.4.5.3, libhoard.2.1.0 and expat.1.95.4. I hope I can drop my serializer when DOMWriter gets more efficient.

-Wenning Qiu

-----Original Message-----
From: Qiu, Wenning [mailto:Wenning_Qiu@csgsystems.com]
Sent: Wednesday, April 28, 2004 5:11 PM
To: xerces-c-dev@xml.apache.org
Subject: RE: Performance degradation


I appreciate the feedback on this subject; it has been very helpful.

I quantified my application and I'd like to report the results that I have obtained.

I compared xercesc.2.1.0 with xercesc.2.5.0, both built with STLport.4.5.3. libhoard was taken out since it would not work with Quantify. The quantified program has only one thread; it parses XML messages or builds a DOM, then serializes the DOM using xercesc::DOMWriter.

I noticed only a slight performance drop for parsing (~0.5%) and DOM building (~3%). The bottleneck turns out to be the serialization part, where the degradation in performance is around 60%.

DOMWriterImpl::writeNode() used 75,800,391 cycles with Xerces.2.1.0. It used 124,670,672 cycles with Xerces.2.5.0. It's 64.5% slower.

The major contributors are: 

1) XMLFormatter::operator << (const unsigned short*const) used 39,802,051 cycles with Xerces.2.1.0. It used 41,861,068 cycles with Xerces.2.5.0. It's 5.2% slower.

2) XMLFormatter::operator << (const unsigned short) used 16,760,000 cycles with Xerces.2.1.0. It used 18,032,000 cycles with Xerces.2.5.0. It's 7.6% slower.

3) Some new functions in Xerces.2.5.0 also contributed:
   o xercesc_2_5::RefHashTableOf<unsigned short>::RefHashTableOf() 16.8% of total cycles. Almost all the time is spent calling MemoryManagerImpl::allocate() and XMemory::operator new().
   o xercesc_2_5::XMLFormatter::specialFormat() 9.33% of total cycles.
   o xercesc_2_5::BaseRefVectorOf<xercesc_2_5::RefHashTableOf<unsigned short> >::removeLastElement() 9.23% of total cycles.
   o xercesc_2_5::XMemory::operator new() 4.06% of total cycles.
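
The percentages quoted for writeNode() and the two operator<< overloads follow directly from the raw cycle counts. A small helper makes the arithmetic explicit; slowdownPercent is a hypothetical name used only for this illustration.

```cpp
#include <cassert>

// Relative slowdown of `after` compared with `before`, in percent.
// Example: 75,800,391 -> 124,670,672 cycles is roughly a 64.5% slowdown.
double slowdownPercent(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

Calling it with the counts above gives about 64.5 for DOMWriterImpl::writeNode(), about 5.2 for the pointer overload of operator<<, and about 7.6 for the single-character overload.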

In addition to the degradation in processing time, xerces.2.5.0 seems not to scale beyond 2 threads when DOM serialization is involved. 


-----Original Message-----
From: Karande Samir [mailto:Samir.Karande@comverse.com]
Sent: Wednesday, April 28, 2004 2:00 PM
To: 'xerces-c-dev@xml.apache.org'
Subject: RE: Performance degradation


Hi Wenning,
	I have seen similar performance degradation (though not because of a
xerces upgrade) in the past. It turned out that some of the components I was
using had their own malloc/new implementation, and hoard could not
override the malloc/new calls that were made through those components.
Unfortunately, most of the memory management schemes in third-party
components are not optimized for SMP systems, and you would start seeing a lot
of context switching between threads/processes due to idle waits in malloc.
	If the new memory allocation scheme in xerces is preventing hoard from
taking over memory management, it's likely that you would see performance
degradation. Maybe we want to supply libc's new/malloc to the xerces parser
calls explicitly, if that's possible.

I hope this helps.

-Samir

-----Original Message-----
From: Qiu, Wenning [mailto:Wenning_Qiu@csgsystems.com]
Sent: Wednesday, April 28, 2004 2:47 PM
To: xerces-c-dev@xml.apache.org
Subject: RE: Performance degradation


Yes, I am using a Solaris box with 4 processors.

-----Original Message-----
From: Karande Samir [mailto:Samir.Karande@comverse.com]
Sent: Wednesday, April 28, 2004 12:35 PM
To: 'xerces-c-dev@xml.apache.org'
Subject: RE: Performance degradation


Hi Wenning,
	Do you use multiprocessor (SMP) systems? I am assuming an SMP system
because you are using hoard.
-Samir

-----Original Message-----
From: Qiu, Wenning [mailto:Wenning_Qiu@csgsystems.com]
Sent: Tuesday, April 27, 2004 5:16 PM
To: xerces-c-dev@xml.apache.org
Subject: RE: Performance degradation


Hi Neil,

You are right, the "per-document memory heap" in Xerces's DOM implementation
is still there. I missed that when I looked at the source code.

My application accepts XML messages, parses them into DOM, converts the DOM
to messages in a proprietary format and sends them to a server. When response
messages come back from the server, my application parses the proprietary
response message and builds a DOM; it then serializes the DOM into an XML
byte stream and sends it out.

We use expat to do the parsing for better performance, so the
functionality we actually use from Xerces is DOM building and
serialization. Validation is not included.


Thanks,
Wenning Qiu

-----Original Message-----
From: Neil Graham [mailto:neilg@ca.ibm.com]
Sent: Tuesday, April 27, 2004 2:56 PM
To: xerces-c-dev@xml.apache.org
Subject: RE: Performance degradation







Hi Wenning,

If by the "per-document memory heap" you're referring to the way Xerces's
DOM implementation works, then nothing has changed since 2.2.  The same
memory paradigm is used.

The pluggable memory management certainly does introduce overhead:  every
time the parser needs memory, it has to reach out to a virtual function
instead of directly calling the system libraries.  Some work was done to
mitigate the parser's habit of creating and destroying short-lived objects
in the 2.4 time-frame, and this bought back a good portion of the
performance that the pluggable memory scheme cost.
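
As a rough sketch of where that overhead comes from, the following shows a pluggable allocator reached through a virtual call. The class names here (SketchMemoryManager, ForwardingManager) are hypothetical stand-ins written for this illustration; they are not the actual Xerces headers.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical interface standing in for a pluggable memory manager.
class SketchMemoryManager {
public:
    virtual ~SketchMemoryManager() = default;
    virtual void* allocate(std::size_t size) = 0;
    virtual void deallocate(void* p) = 0;
};

// Default implementation: simply forwards to the global operator new/delete.
class ForwardingManager : public SketchMemoryManager {
public:
    void* allocate(std::size_t size) override {
        ++calls_;
        return ::operator new(size);
    }
    void deallocate(void* p) override { ::operator delete(p); }
    std::size_t calls() const { return calls_; }

private:
    std::size_t calls_ = 0;
};

struct Node {
    int value;
};

// Library-side code sees only the interface, so every request for memory
// is an indirect call followed by the system allocation itself.
Node* makeNode(SketchMemoryManager& mm, int value) {
    void* raw = mm.allocate(sizeof(Node));
    return new (raw) Node{value};  // placement new into manager-owned memory
}

void destroyNode(SketchMemoryManager& mm, Node* n) {
    assert(n != nullptr);
    n->~Node();
    mm.deallocate(n);
}
```

The extra cost per object is small, but it is paid on every short-lived allocation, which is why reducing the number of such objects recovers much of it.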

To be more helpful, we'd have to understand the characteristics of your
application.  I conjecture you're DOM-based; do you do any validation?  If
so, then some of the grammar caching/persistence capabilities introduced
since 2.3 might be helpful to you.

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  neilg@ca.ibm.com




 

-----Original Message-----
From: "Qiu, Wenning" <Wenning_Qiu@csgsystems.com>
Sent: 04/27/2004 03:29 PM
To: <xe...@xml.apache.org>
Subject: RE: Performance degradation
Please respond to xerces-c-dev





I've yet to Quantify my application. But as I took a brief look at the
xercesc.2.3.0 source code, it seems that the per-document memory heap is
gone with the introduction of the pluggable memory manager. The default
memory manager just turns around and calls new() and delete(). This means
higher overhead for handling a large number of small objects. I suspect that
the default memory manager causes the performance degradation. I have to wait
for my application to be quantified to prove that.

Is the per-document memory heap logic provided somewhere in the source
distribution as a memory manager implementation? It seems more reasonable
to provide that as the default memory manager.
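
To make the idea concrete, here is a minimal sketch of a per-document heap: small objects are carved out of large blocks with a bump pointer, and everything is released at once when the document goes away. DocumentArena and its members are hypothetical names for this illustration, not code from the Xerces distribution.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical per-document arena: many small allocations share a few
// large blocks, so the system allocator is called once per block rather
// than once per object.
class DocumentArena {
public:
    explicit DocumentArena(std::size_t blockSize = 64 * 1024)
        : blockSize_(blockSize) {}

    void* allocate(std::size_t size) {
        assert(size > 0);
        // Round up so every returned pointer keeps fundamental alignment.
        const std::size_t align = alignof(std::max_align_t);
        size = (size + align - 1) / align * align;

        // Start a new block when the current one cannot hold the request.
        if (blocks_.empty() || used_ + size > capacity_) {
            capacity_ = size > blockSize_ ? size : blockSize_;
            blocks_.push_back(std::make_unique<unsigned char[]>(capacity_));
            used_ = 0;
        }
        void* p = blocks_.back().get() + used_;
        used_ += size;
        return p;
    }

    // There is no per-object deallocate: all blocks are freed together
    // when the arena is destroyed. Destructors of objects placed in the
    // arena are not run by this sketch.
    std::size_t blockCount() const { return blocks_.size(); }

private:
    std::size_t blockSize_;
    std::size_t capacity_ = 0;
    std::size_t used_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};
```

The trade-off is that memory is only returned when the whole document is released, which suits a build-serialize-discard pattern but not long-lived documents that shrink over time.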

      -----Original Message-----
      From: Jesse Pelton [mailto:jsp@PKC.com]
      Sent: Tuesday, April 27, 2004 9:24 AM
      To: xerces-c-dev@xml.apache.org
      Subject: RE: Performance degradation

      Hmm. I wonder if the pluggable memory manager introduced in 2.3 is
      responsible for the degradation. If I understand your benchmarks
      correctly, changing from Xerces 2.2 or earlier to 2.3 or later
      results in a 28% decrease in message throughput, from 50/sec to
      36/sec. That's pretty serious.

      Can you profile your application to see if there are any obvious
      bottlenecks in Xerces or elsewhere? Knowing where the problem lies
      would help you and/or the maintainers address it.

       From: Qiu, Wenning [mailto:Wenning_Qiu@csgsystems.com]
       Sent: Tuesday, April 27, 2004 9:59 AM
       To: xerces-c-dev@xml.apache.org
       Subject: Performance degradation




             Hi, All


             We have observed a performance degradation when upgrading some
             third-party packages in our production systems.


             We are currently using xercesc.2.1.0 with STLport.4.5.3,
             libhoard.2.1.0 and expat.1.95.4. We are looking at upgrading
             to xercesc.2.5.0, STLport.4.6.2, libhoard.2.1.2d and
             expat.1.95.7.


             Our current production code can process about 40 messages per
             CPU-second in our test environment, while the new build with
             all new third-party packages can do only 36 per CPU-second.
             However, when built with xercesc.2.1.0 (or 2.2.0),
             STLport.4.6.2, libhoard.2.1.2d and expat.1.95.7, it can handle
             close to 50 messages per CPU-second.


             We have tested all xercesc releases since 2.1.0; it seems that
             the performance drop started with 2.3.0 and has remained through
             the latest release.


             Is there a way to turn off the unwanted features in the new
             releases so that good performance is retained?


             Does anybody have any idea when performance is to be addressed
             in future releases?


             For now it looks like we can move up to 2.2.0 at best since
             the performance is of great importance for our system.


             Thanks for any feedback.






             Wenning Qiu
             CSG Systems Inc.
             Phone: (402)963-8364
             Email: wenning_qiu@csgsystems.com







---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


______________________________________________________________________
  This email message has been scanned by PineApp Mail-Secure and has been
found clean.



RE: Performance degradation

Posted by Alberto Massari <am...@progress.com>.
Thanks for the analysis; I'll see what I can do to fix the degradation.

Alberto
