Posted to java-dev@axis.apache.org by "Rich Scheuerle (JIRA)" <ji...@apache.org> on 2007/09/18 22:21:46 UTC

[jira] Created: (AXIS2-3210) MessageContext Persistence Performance Improvement

MessageContext Persistence Performance Improvement
--------------------------------------------------

                 Key: AXIS2-3210
                 URL: https://issues.apache.org/jira/browse/AXIS2-3210
             Project: Axis 2.0 (Axis2)
          Issue Type: Improvement
          Components: kernel
            Reporter: Rich Scheuerle
            Assignee: Rich Scheuerle


MessageContext Persistence Performance Improvement
---------------------------------------------------------------------------------

Background: 
-----------------
When a MessageContext is persisted (for reliable messaging), the MessageContext object and its associated
objects are written to an ObjectOutput.  When a MessageContext is hydrated, it is read from an
ObjectInput.  The utility class, ObjectStateUtils, provides static functions that add safety
mechanisms around writing and reading the data.

Problem:
--------------
The IBM performance team has profiled this code.  They found that the writing and reading of these objects is time 
consuming.  Some of the performance penalties are due to the use of static methods (thus hindering the ability to reuse
byte buffers).  Other penalties are due to the way that we determine if an object can be "safely written".
This JIRA issue addresses a number of these concerns.

Scope of Changes (Important):
-------------------------------------------
These changes only amend the existing writeExternal and readExternal support.  There is no impact on any code that 
does not use these methods.  No additional logic APIs are added or changed.


Specific Concerns and Solutions:
----------------------------------------------
  A) The original logic writes objects into a buffer.  If a serialization error occurs, the algorithm safely 
     accommodates the error.  The downside is that it is very expensive to write each object to a temporary buffer.

     Solution:
     A new marker interface, SafeSerializable, is introduced.  If an object (e.g. MessageContext) implements this marker
     interface or is a java.lang wrapper object (e.g. String), then the object is written directly to the ObjectOutput.
     Eliminating the extra buffer write increases throughput.
     A similar change is made to the read algorithm.  The new algorithm detects whether the object was written directly
     or whether it was written as a byte buffer.  In the case where it is written directly, no extra buffering is needed
     when reading.

  B) If a buffer is needed to write or read an object, the ObjectStateUtils class creates a new buffer.  This 
     excessive allocation of buffers and subsequent garbage collection can hinder performance.

     Solution:
     The code is refactored to use two new classes: SafeObjectOutputStream and SafeObjectInputStream.  These classes
     wrap the ObjectOutput and ObjectInput objects and provide logic similar to that of ObjectStateUtils.
     The key difference is that these are not static utility classes.  Therefore any buffers used during writing or reading
     are reused for the life of the *Stream object.  In one series of tests, this reduced the number of buffers from 40 to 2 for
     persisting a MessageContext.

  C) When an outbound MessageContext is persisted, its associated inbound MessageContext (if present) is also persisted.
     The problem is that the inbound MessageContext may have a large message.  Writing out this message can impact performance
     and in some cases causes logic errors.
  
     Solution:
     Any code that hydrates an outbound MessageContext should never need the message (SOAP envelope) associated with the
     inbound MessageContext.  The solution is to not persist the inbound message.

  D) In the current code, "marker" strings are persisted along with the data.  These marker strings may contain a lengthy 
     correlation id.   This extra information can impact performance and file size.

     Solution:
     I reduced the number of "marker" strings.  The remaining marker strings are changed to the "common name" of the object
     being persisted.  In most cases, the log correlation id is no longer present in the marker string.  In addition, I made
     changes to only create a log correlation id "on demand".  The log correlation code uses the (synchronized) UUIDGenerator.  
     Creating the log correlation id "on demand" limits unnecessary locking.

  E) Miscellaneous.  I spent time fine-tuning the algorithmic logic in SafeObjectInputStream and SafeObjectOutputStream
     to eliminate extra buffers (e.g. ByteArrayOutputStream optimizations).  These are all localized changes.
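
Items A and B can be illustrated together with a minimal sketch.  This is not the code from the patch: DirectlyWritable,
ReusableBuffer and SketchSafeOutput are hypothetical names standing in for SafeSerializable and SafeObjectOutputStream,
and the exact layout (a boolean flag, then either the object or a length-prefixed byte block) is an assumption made
for illustration.  The sketch shows known-safe objects going straight to the ObjectOutput, while other objects are
trial-serialized into a single scratch buffer that the wrapper instance reuses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical stand-in for the SafeSerializable marker interface.
interface DirectlyWritable extends Serializable {}

// A buffer that can be reset, reused, and copied out without an extra toByteArray() copy.
class ReusableBuffer extends ByteArrayOutputStream {
    synchronized void copyTo(ObjectOutput out) throws IOException {
        out.write(buf, 0, count); // buf and count are protected fields of ByteArrayOutputStream
    }
}

// Hypothetical stand-in for SafeObjectOutputStream: an instance, not a static utility.
class SketchSafeOutput {
    private final ObjectOutput out;
    private final ReusableBuffer scratch = new ReusableBuffer(); // one buffer per wrapper
    int bufferedWrites = 0; // how many objects needed a trial serialization

    SketchSafeOutput(ObjectOutput out) {
        this.out = out;
    }

    // Returns true if the object was persisted, false if it was replaced by null.
    boolean writeObject(Object obj) throws IOException {
        if (obj == null || isKnownSafe(obj)) {
            out.writeBoolean(true); // flag: written directly
            out.writeObject(obj);
            return true;
        }
        scratch.reset(); // reuse the same backing array
        bufferedWrites++;
        try (ObjectOutputStream trial = new ObjectOutputStream(scratch)) {
            trial.writeObject(obj);
        } catch (NotSerializableException e) {
            // The real output has not been touched yet, so it is still consistent.
            out.writeBoolean(true);
            out.writeObject(null);
            return false;
        }
        out.writeBoolean(false); // flag: written as a byte buffer
        out.writeInt(scratch.size());
        scratch.copyTo(out);
        return true;
    }

    // A few java.lang types, chosen only as examples of "known safe" classes.
    private static boolean isKnownSafe(Object obj) {
        return obj instanceof DirectlyWritable || obj instanceof String
                || obj instanceof Integer || obj instanceof Long || obj instanceof Boolean;
    }
}

class SafeWriteDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            SketchSafeOutput safe = new SketchSafeOutput(oos);
            System.out.println(safe.writeObject("hello"));                                   // direct path
            System.out.println(safe.writeObject(new ArrayList<>(Arrays.asList("a", "b")))); // buffered path
            System.out.println(safe.writeObject(new Object()));                              // not serializable
            System.out.println("trial serializations: " + safe.bufferedWrites);
        }
    }
}
```

Because the scratch buffer belongs to the wrapper instance, it is allocated once per stream rather than once per object,
which is the effect item B describes.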

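The "on demand" creation of the log correlation id in item D can be sketched as follows.  The class name LazyLogId is
hypothetical, and java.util.UUID is used here only as a stand-in for the Axis2 UUIDGenerator; the point is that the
short marker string never triggers id generation, and the id is built at most once, on first request.

```java
import java.util.UUID;

// Hypothetical holder for a short persisted marker and a lazily created log id.
class LazyLogId {
    private final String commonName;
    private String logId; // stays null until somebody asks for it

    LazyLogId(String commonName) {
        this.commonName = commonName;
    }

    // The marker persisted with the data: just the common name, no correlation id.
    String marker() {
        return commonName;
    }

    // Created on first use only, so callers that never log pay no generation cost.
    synchronized String logCorrelationId() {
        if (logId == null) {
            logId = commonName + "@" + UUID.randomUUID();
        }
        return logId;
    }

    synchronized boolean idCreated() {
        return logId != null;
    }
}

class LazyLogIdDemo {
    public static void main(String[] args) {
        LazyLogId id = new LazyLogId("MessageContext");
        System.out.println(id.marker() + " idCreated=" + id.idCreated());
        System.out.println(id.logCorrelationId());
        System.out.println("idCreated=" + id.idCreated());
    }
}
```
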
Other Non-performance Related Changes
------------------------------------------------------------

  i) The externalize-related code is refactored so that it all lives in the new org.apache.axis2.context.externalize package.
 
  ii) The ObjectStateUtils class is retained for legacy reasons.  I didn't want to remove any APIs.  The implementation
      of ObjectStateUtils is changed to delegate to the new classes.

  iii) New tests are added.

  iv) I added classes DebugObjectOutputStream and DebugObjectInputStream.
      These classes are installed when log.isDebugEnabled() is true.  
      The classes log all method calls to and from the underlying ObjectOutput and ObjectInput; thus they are helpful 
      in debugging errors.

  v) Andy Gatford has provided code that uses the context classloader when reading persisted data.

  vi) The high level logic used to write and read the objects is generally the same.  The implementation of the algorithms is changed/improved.
     In some cases, this required changes to the format of the persisted data.  An example is that each object is preceded by
     a boolean that indicates whether the object was written directly or written into a byte buffer.  I increased the revision id because
     I changed the format.
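
The format change in item vi can be pictured from the reading side with a minimal sketch.  It is not the patch's
SafeObjectInputStream: the class name SketchSafeInput, the REVISION_ID value, and the layout (an int revision id,
then per object a boolean flag followed by either the object itself or a length-prefixed byte block) are assumptions
for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical stand-in for SafeObjectInputStream.
class SketchSafeInput {
    static final int REVISION_ID = 2; // hypothetical value, bumped whenever the layout changes

    private final ObjectInput in;
    private byte[] scratch = new byte[0]; // reused across reads, grown only when needed

    SketchSafeInput(ObjectInput in) {
        this.in = in;
    }

    // Rejects data written with a different layout instead of misreading it.
    void checkRevision() throws IOException {
        int found = in.readInt();
        if (found != REVISION_ID) {
            throw new IOException("Unsupported persisted format revision: " + found);
        }
    }

    Object readObject() throws IOException, ClassNotFoundException {
        boolean writtenDirectly = in.readBoolean();
        if (writtenDirectly) {
            return in.readObject(); // no extra buffering needed
        }
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Corrupt length: " + length);
        }
        if (length > scratch.length) {
            scratch = new byte[length];
        }
        in.readFully(scratch, 0, length);
        try (ObjectInputStream nested =
                 new ObjectInputStream(new ByteArrayInputStream(scratch, 0, length))) {
            return nested.readObject();
        }
    }
}

class SafeReadDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream nestedBytes = new ByteArrayOutputStream();
        try (ObjectOutputStream nested = new ObjectOutputStream(nestedBytes)) {
            nested.writeObject(new ArrayList<>(Arrays.asList("a", "b")));
        }
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeInt(SketchSafeInput.REVISION_ID);
            oos.writeBoolean(true);          // written directly
            oos.writeObject("direct");
            oos.writeBoolean(false);         // written as a byte buffer
            oos.writeInt(nestedBytes.size());
            oos.write(nestedBytes.toByteArray());
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            SketchSafeInput in = new SketchSafeInput(ois);
            in.checkRevision();
            System.out.println(in.readObject());
            System.out.println(in.readObject());
        }
    }
}
```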

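Item v refers to resolving classes through the context classloader when reading persisted data.  One common way to do
this, shown here as a sketch rather than as the contributed code, is to override ObjectInputStream.resolveClass.  The
class name ContextLoaderObjectInputStream is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical ObjectInputStream that tries the thread context classloader first.
class ContextLoaderObjectInputStream extends ObjectInputStream {

    ContextLoaderObjectInputStream(InputStream in) throws IOException {
        super(in);
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader != null) {
            try {
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                // Fall through: the default lookup also handles primitive type names.
            }
        }
        return super.resolveClass(desc);
    }
}

class ContextLoaderDemo {
    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
            oos.writeObject(new ArrayList<>(Arrays.asList("a", "b")));
        }
        try (ObjectInputStream in = new ContextLoaderObjectInputStream(
                new ByteArrayInputStream(sink.toByteArray()))) {
            System.out.println(in.readObject());
        }
    }
}
```

This matters when the persisted classes are visible to the application's classloader but not to the loader that
defined the deserializing code.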


Kudos
---------
Many thanks to the following people who contributed to this work, helped with brainstorming, helped with testing, or provided performance profiles:
Ann Robinson, Andy Gatford, Dan Zhong, Doug Larson, and Richard Slade.

Next Steps
---------------
I am attaching the patch to this JIRA.  I will be committing the patch in the next day or two.  Please let me know if you have any questions or concerns.

Thanks
Rich Scheuerle





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: axis-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: axis-dev-help@ws.apache.org


Re: [jira] Created: (AXIS2-3210) MessageContext Persistence Performance Improvement

Posted by Sanjiva Weerawarana <sa...@opensource.lk>.
This looks ok to me .. glad to see this improved!

Sanjiva.


-- 
Sanjiva Weerawarana, Ph.D.
Founder & Director; Lanka Software Foundation; http://www.opensource.lk/
Founder, Chairman & CEO; WSO2, Inc.; http://www.wso2.com/
Member; Apache Software Foundation; http://www.apache.org/
Visiting Lecturer; University of Moratuwa; http://www.cse.mrt.ac.lk/



[jira] Updated: (AXIS2-3210) MessageContext Persistence Performance Improvement

Posted by "Rich Scheuerle (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AXIS2-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rich Scheuerle updated AXIS2-3210:
----------------------------------

    Attachment: patchJIRA.txt

Code Solution

ObjectInputStreamWithCL is contributed by Andrew Gatford (Sandesha).
The remaining changes are contributed by Rich Scheuerle



[jira] Resolved: (AXIS2-3210) MessageContext Persistence Performance Improvement

Posted by "Rich Scheuerle (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AXIS2-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rich Scheuerle resolved AXIS2-3210.
-----------------------------------

    Resolution: Fixed

Revision = 577367
