Posted to java-dev@axis.apache.org by "Rich Scheuerle (JIRA)" <ji...@apache.org> on 2007/09/19 19:08:14 UTC

[jira] Resolved: (AXIS2-3210) MessageContext Persistence Performance Improvement

     [ https://issues.apache.org/jira/browse/AXIS2-3210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rich Scheuerle resolved AXIS2-3210.
-----------------------------------

    Resolution: Fixed

Revision = 577367

> MessageContext Persistence Performance Improvement
> --------------------------------------------------
>
>                 Key: AXIS2-3210
>                 URL: https://issues.apache.org/jira/browse/AXIS2-3210
>             Project: Axis 2.0 (Axis2)
>          Issue Type: Improvement
>          Components: kernel
>            Reporter: Rich Scheuerle
>            Assignee: Rich Scheuerle
>         Attachments: patchJIRA.txt
>
>
> MessageContext Persistence Performance Improvement
> ---------------------------------------------------------------------------------
> Background: 
> -----------------
> When a MessageContext is persisted (for reliable messaging), the MessageContext object and associated
> objects are written out to an ObjectOutput.  When a MessageContext is hydrated, it is read from an
> ObjectInput.  The utility class, ObjectStateUtils, provides static utility functions that add safety
> mechanisms for writing and reading the data.
> Problem:
> --------------
> The IBM performance team has profiled this code.  They found that the writing and reading of these objects is time 
> consuming.  Some of the performance penalties are due to the use of static methods (thus hindering the ability to reuse
> byte buffers).  Other penalties are due to the way that we determine if an object can be "safely written".
> This JIRA issue addresses a number of these concerns.
> Scope of Changes (Important):
> -------------------------------------------
> These changes only amend the existing writeExternal and readExternal support.  There is no impact on any code that 
> does not use these methods.  No additional logic APIs are added or changed.
> Specific Concerns and Solutions:
> ----------------------------------------------
>   A) The original logic writes objects into a buffer.  If a serialization error occurs, the algorithm safely 
>      accommodates the error.  The downside is that it is very expensive to write each object to a temporary buffer.
>      Solution:
>      A new marker interface, SafeSerializable, is introduced.  If an object (e.g. MessageContext) has this marker
>      interface or is a lang wrapper object (e.g. String) then the object is written directly to the ObjectOutput.
>      Eliminating the extra buffer write increases throughput.
>      A similar change is made to the read algorithm.  The new algorithm detects whether the object was written directly
>      or whether it was written as a byte buffer.  In the case where it is written directly, no extra buffering is needed
>      when reading.
>   B) If a buffer is needed to write or read an object, the ObjectStateUtils class creates a new buffer each time.
>      This excessive allocation of buffers and subsequent garbage collection can hinder performance.
>      Solution:
>      The code is refactored to use two new classes: SafeObjectOutputStream and SafeObjectInputStream.  These classes
>      wrap the ObjectOutput and ObjectInput objects and provide logic similar to that of ObjectStateUtils.
>      The key difference is that these are not static utility classes.  Therefore any buffers used during writing or reading
>      are reused for the life of the *Stream object.  In one series of tests, this reduced the number of buffers from 40 to 2 for
>      persisting a MessageContext.
>   C) When an outbound MessageContext is persisted, its associated inbound MessageContext (if present) is also persisted.
>      The problem is that the inbound MessageContext may have a large message.  Writing out this message can impact performance
>      and in some cases causes logic errors.
>   
>      Solution:
>      Any code that hydrates an outbound MessageContext should never need the message (SOAP envelope) associated with the
>      inbound MessageContext.  The solution is to not persist the inbound message.
>   D) In the current code, "marker" strings are persisted along with the data.  These marker strings may contain a lengthy 
>      correlation id.   This extra information can impact performance and file size.
>      Solution:
>      I reduced the number of "marker" strings.  The remaining marker strings are changed to the "common name" of the object
>      being persisted.  In most cases, the log correlation id is no longer present in the marker string.  In addition, I made
>      changes to only create a log correlation id "on demand".  The log correlation code uses the (synchronized) UUIDGenerator.  
>      Creating the log correlation id "on demand" limits unnecessary locking.
>   E) Miscellaneous.  I spent time fine tuning the algorithmic logic in SafeObjectInputStream and SafeObjectOutputStream
>      to eliminate extra buffers (e.g. ByteArrayOutputStream optimizations).  These are all localized changes.
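As a rough illustration of concerns A and B, here is a minimal sketch of the direct-versus-buffered write with a reusable scratch buffer. It is not the Axis2 code: the class SafeWriteSketch, the nested SafeOut wrapper, the stand-in SafeSerializable interface, and the length-prefixed layout are all assumptions made for this example. Only the two ideas come from the description above: a leading boolean that records whether the object was written directly, and a buffer that lives as long as the wrapper instead of being allocated per call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical sketch; names and persisted layout are illustrative, not the Axis2 ones.
final class SafeWriteSketch {

    /** Stand-in marker: implementors promise that writeObject will not fail. */
    interface SafeSerializable extends Serializable {}

    /** Wraps an ObjectOutput; the scratch buffer lives as long as this wrapper. */
    static final class SafeOut {
        private final ObjectOutput out;
        private final ByteArrayOutputStream scratch = new ByteArrayOutputStream();

        SafeOut(ObjectOutput out) { this.out = out; }

        void writeSafely(Object value) throws IOException {
            if (value == null || value instanceof SafeSerializable
                    || value instanceof String || value instanceof Number
                    || value instanceof Boolean) {
                out.writeBoolean(true);      // flag: written directly
                out.writeObject(value);
                return;
            }
            scratch.reset();                 // keeps the backing array, drops old content
            int length;
            try {
                ObjectOutputStream tmp = new ObjectOutputStream(scratch);
                tmp.writeObject(value);
                tmp.flush();
                length = scratch.size();
            } catch (NotSerializableException e) {
                length = -1;                 // accommodate the failure: record "nothing"
            }
            out.writeBoolean(false);         // flag: byte-buffer form
            out.writeInt(length);
            if (length >= 0) {
                out.write(scratch.toByteArray(), 0, length);
            }
        }
    }

    static Object readSafely(ObjectInput in) throws IOException, ClassNotFoundException {
        if (in.readBoolean()) {
            return in.readObject();          // direct form: no extra buffering
        }
        int length = in.readInt();
        if (length < 0) {
            return null;                     // the writer could not serialize this value
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        try (ObjectInputStream tmp = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return tmp.readObject();
        }
    }

    /** Writes every value through one SafeOut, then reads them all back. */
    static Object[] roundTrip(Object... values) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(sink)) {
                SafeOut out = new SafeOut(oos);
                for (Object v : values) out.writeSafely(v);
            }
            Object[] result = new Object[values.length];
            try (ObjectInputStream ois =
                    new ObjectInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
                for (int i = 0; i < result.length; i++) result[i] = readSafely(ois);
            }
            return result;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // new Object() is not Serializable, so it is read back as null.
        System.out.println(Arrays.toString(
                roundTrip("hello", 42, Arrays.asList(1, 2), new Object())));
    }
}
```

In this sketch the String and Integer take the direct path, the list goes through the shared scratch buffer, and the plain Object fails inside the scratch buffer without corrupting the real stream.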
> Other Non-performance Related Changes
> ------------------------------------------------------------
>   i) The externalize related code is refactored so that it all lives in the new org.apache.axis2.context.externalize package.
>  
>   ii) The ObjectStateUtils class is retained for legacy reasons.  I didn't want to remove any APIs.  The implementation
>       of ObjectStateUtils is changed to delegate to the new classes.
>   iii) New tests are added.
>   iv) I added classes DebugObjectOutputStream and DebugObjectInputStream.
>       These classes are installed when log.isDebugEnabled() is true.  
>       The classes log all method calls to and from the underlying ObjectOutput and ObjectInput; thus they are helpful 
>       in debugging errors.
>   v) Andy Gatford has provided code that uses the context classloader when reading persisted data.
>   vi) The high level logic used to write and read the objects is generally the same.  The implementation of the algorithms is changed/improved.
>      In some cases, this required changes to the format of the persisted data.  An example is that each object is preceded by
>      a boolean that indicates whether the object was written directly or written into a byte buffer.  I increased the revision id because
>      I changed the format.
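The debug wrappers in item iv can be pictured with the sketch below. It deliberately swaps in a different technique: instead of a hand-written wrapper class like the ones in the patch, it uses a java.lang.reflect.Proxy to record every call made on an ObjectOutput before forwarding it. The class DebugOutputSketch and its logging and demo methods are hypothetical names for this example, and a plain list stands in for the logger.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a dynamic proxy standing in for a hand-written debug wrapper.
final class DebugOutputSketch {

    /** Returns an ObjectOutput that records each method name, then forwards the call. */
    static ObjectOutput logging(ObjectOutput delegate, List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();          // rethrow the delegate's own exception
            }
        };
        return (ObjectOutput) Proxy.newProxyInstance(
                DebugOutputSketch.class.getClassLoader(),
                new Class<?>[] {ObjectOutput.class},
                handler);
    }

    /** Writes a few values through the logging wrapper and returns the recorded calls. */
    static List<String> demo() {
        List<String> calls = new ArrayList<>();
        try (ObjectOutputStream real = new ObjectOutputStream(new ByteArrayOutputStream())) {
            ObjectOutput out = logging(real, calls);
            out.writeBoolean(true);
            out.writeObject("payload");
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A caller would wrap the stream only when debug logging is enabled, so the normal path pays nothing for the extra indirection.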
> Kudos
> ---------
> Many thanks to the following people who contributed to this work, helped with brainstorming, helped with testing or provided performance profiles:
> Ann Robinson, Andy Gatford, Dan Zhong, Doug Larson, and Richard Slade.
> Next Steps
> ---------------
> I am attaching the patch to this JIRA.  I will be committing the patch in the next day or two.  Please let me know if you have any questions or concerns.
> Thanks
> Rich Scheuerle

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: axis-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: axis-dev-help@ws.apache.org