Posted to users@cxf.apache.org by Mayank Mishra <ma...@pramati.com> on 2008/11/14 08:13:29 UTC

Mtom attachment Performance

Hi All,

On CXF 2.0.7, I was evaluating the performance of sending attachments 
with MTOM enabled and disabled. As promised by MTOM, I see messages 
around 30% smaller than their Base64-encoded equivalents. Surprisingly, 
though, the MTOM-enabled scenarios take more time than expected.

My simple test included sending a string and a raw byte array in a 
request and receiving them back in the response.

    public void testMtom(
        @WebParam(mode = WebParam.Mode.INOUT, name = "name",
                  targetNamespace = "http://cxf.apache.org/mime/types")
        javax.xml.ws.Holder<java.lang.String> name,
        @WebParam(mode = WebParam.Mode.INOUT, name = "attachinfo",
                  targetNamespace = "http://cxf.apache.org/mime/types")
        javax.xml.ws.Holder<javax.activation.DataHandler> attachinfo
    );

I ran the above test with byte arrays ranging from 32KB to 12MB.

I checked where an unreasonable amount of time is spent during the 
request and response cycle. These are my observations:

1. StaxUtils Change: The XML namespace-aware and namespace-unaware 
factories are created as static members of StaxUtils. I moved them into 
a static method that initializes them as needed. This improves the 
performance of the StaxInterceptors by around 50 ms on both client and 
server. For example:

public static XMLInputFactory getXMLInputFactory(boolean nsAware) {
    if (nsAware) {
        if (XML_NS_AWARE_INPUT_FACTORY == null) {
            XML_NS_AWARE_INPUT_FACTORY = XMLInputFactory.newInstance();
            XML_NS_AWARE_INPUT_FACTORY.setProperty(
                XMLInputFactory.IS_NAMESPACE_AWARE, true);
        }
        return XML_NS_AWARE_INPUT_FACTORY;
    } else {
        if (XML_INPUT_FACTORY == null) {
            XML_INPUT_FACTORY = XMLInputFactory.newInstance();
            XML_INPUT_FACTORY.setProperty(
                XMLInputFactory.IS_NAMESPACE_AWARE, false);
        }
        return XML_INPUT_FACTORY;
    }
}
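
The null checks in a lazy getter like this are not synchronized, so two
threads calling it at the same time could each create a factory. One
possible way around that is the initialization-on-demand holder idiom,
sketched below. The class name LazyStaxFactories and the holder class
names are made up for illustration and are not part of CXF. Whether a
shared XMLInputFactory may then be used concurrently depends on the StAX
implementation in use.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical stand-in for StaxUtils; not the actual CXF class.
class LazyStaxFactories {

    // The JVM initializes each holder class once, on first access,
    // so no explicit locking or null check is needed.
    private static final class NsAwareHolder {
        static final XMLInputFactory FACTORY = create(true);
    }

    private static final class NsUnawareHolder {
        static final XMLInputFactory FACTORY = create(false);
    }

    private static XMLInputFactory create(boolean nsAware) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, nsAware);
        return factory;
    }

    static XMLInputFactory getXMLInputFactory(boolean nsAware) {
        return nsAware ? NsAwareHolder.FACTORY : NsUnawareHolder.FACTORY;
    }

    public static void main(String[] args) {
        XMLInputFactory first = getXMLInputFactory(true);
        XMLInputFactory second = getXMLInputFactory(true);
        System.out.println("same instance reused: " + (first == second));
    }
}
```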

2. MimeBodyPartInputStream Change: The read(byte[] b, int off, int len) 
method is not overridden, so the parent class InputStream serves the 
invocation. It reads one byte at a time, and each byte is then processed 
in MimeBodyPartInputStream.
An alternative implementation that provides read(buf, off, len) in 
MimeBodyPartInputStream works around 6 times faster. To check this, I 
wrote a simple test program that just reads an input stream and performs 
the same boundary check with both the alternative and the CXF 
MimeBodyPartInputStream implementations. For 12 MB of data, the 
alternative takes around 250 ms, whereas the original CXF implementation 
takes around 1250 ms. If required, I can provide a patch with this 
alternative MimeBodyPartInputStream file.
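
The effect described in this item can be reproduced in isolation. The
sketch below is not the CXF class and does no boundary checking. It only
shows that a stream implementing just read() has every bulk read served
by InputStream's byte-by-byte loop, while a subclass overriding
read(byte[], int, int) avoids those calls. All class and method names
here are invented for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class BulkReadSketch {

    // Implements only read(); InputStream.read(byte[], int, int) then
    // calls read() once for every byte requested.
    static class SingleByteStream extends InputStream {
        protected final InputStream in;
        int singleByteCalls;

        SingleByteStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            singleByteCalls++;
            return in.read();
        }
    }

    // Overrides the bulk method and delegates whole chunks at once.
    static class BulkStream extends SingleByteStream {
        BulkStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            return in.read(buf, off, len);
        }
    }

    // Reads the stream to the end in 4 KB chunks and returns the byte count.
    static int drain(InputStream stream) throws IOException {
        byte[] buf = new byte[4096];
        int total = 0;
        int n;
        while ((n = stream.read(buf, 0, buf.length)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1 << 20];
        SingleByteStream slow = new SingleByteStream(new ByteArrayInputStream(data));
        BulkStream fast = new BulkStream(new ByteArrayInputStream(data));
        drain(slow);
        drain(fast);
        System.out.println("read() calls without override: " + slow.singleByteCalls);
        System.out.println("read() calls with override:    " + fast.singleByteCalls);
    }
}
```

A real MimeBodyPartInputStream override would also have to stop at the
MIME boundary inside each chunk, which this sketch leaves out.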

3. AttachmentInInterceptor Change: AttachmentDeserializer contains 
static java.util.regex.Patterns that have to be compiled from 
particular String expressions, which takes a substantial number of 
milliseconds. An AttachmentDeserializer instance is created in 
AttachmentInInterceptor during each handleMessage() call. This creation 
can be moved to the AttachmentInInterceptor constructor; given a 
setMessage(Message message) method in AttachmentDeserializer, we can 
then set the message during the handleMessage() call.
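
The general idea of paying the regex compilation cost once rather than
on the message path can be sketched as follows. BoundaryParser and its
pattern are hypothetical and are not taken from AttachmentDeserializer.
The sketch relies only on the documented fact that a compiled
java.util.regex.Pattern is immutable and safe to share between threads,
while each Matcher is created per call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the expression is an illustration, not CXF's.
class BoundaryParser {

    // Compiled once when the class is initialized, then reused.
    private static final Pattern BOUNDARY =
        Pattern.compile("boundary=\"?([^\";]+)\"?", Pattern.CASE_INSENSITIVE);

    // Returns the boundary parameter of a Content-Type value, or null.
    static String boundaryOf(String contentType) {
        Matcher matcher = BOUNDARY.matcher(contentType);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String header = "multipart/related; type=\"application/xop+xml\"; "
            + "boundary=\"uuid:1234\"";
        System.out.println(boundaryOf(header));
    }
}
```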

4. AttachmentUtil Change: A Universally Unique ID is created for each 
attachment in the message. I am not sure whether we require a UUID as 
the MIME Content-ID. We may need it for adherence to the specification 
or for interoperability with other vendors. If the MIME spec does not 
require a universally unique ID, a sequential counter can be used to 
provide unique identifiers for the different attachments in a message. 
This also saves substantial time.

The MTOM-enabled scenarios perform better with these changes.

With Regards,
Mayank

Re: Mtom attachment Performance

Posted by Benson Margulies <bi...@gmail.com>.
Please post these as a patch, preferably one that can be applied to
trunk. If you can't, one of us will do it the hard way.


Re: Mtom attachment Performance

Posted by Mayank Mishra <ma...@pramati.com>.
Daniel Kulp wrote:
> Very nice work.  I'll need to double-check the attachment spec to see what is 
> required there.  I THINK we should be able to create a single UUID in the 
> message and append a counter to it for each part.  Not really sure though.
>
> Also, the first change may have a problem.  We tried that with 
> DocumentBuilderFactories and TransformerFactories and ran into classloader 
> issues when CXF was embedded into other applications like 
> Geronimo, Camel, and ServiceMix.  If you look in XMLUtils, we ended up 
> creating maps for ClassLoader -> factory.  It would probably be good to do 
> the same.
>
> That said, can you file a couple of JIRAs with patches?  We cannot accept code 
> patches via email (the "grant to apache" box needs to be checked in JIRA).
>
> Anyway, very nice job!   Thanks!
>
> Dan
>   

Thanks a lot Dan, Benson. I will file JIRAs and will submit patches to 
them very soon.

With Regards,
Mayank


Re: Mtom attachment Performance

Posted by Daniel Kulp <dk...@apache.org>.
Very nice work.  I'll need to double-check the attachment spec to see what is 
required there.  I THINK we should be able to create a single UUID in the 
message and append a counter to it for each part.  Not really sure though.
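
A minimal sketch of that idea, one UUID per message plus a per-part
counter, could look like the following. ContentIdGenerator, the "-"
separator and the domain argument are assumptions made for the example
and are not what AttachmentUtil does. For reference, RFC 2045 asks that
Content-ID values be generated to be world-unique, which a bare counter
on its own would not satisfy but a UUID prefix plausibly would.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: create one instance per outgoing message.
class ContentIdGenerator {

    // The expensive part, done once per message instead of once per part.
    private final String messageId = UUID.randomUUID().toString();
    private final AtomicInteger counter = new AtomicInteger();
    private final String domain;

    ContentIdGenerator(String domain) {
        this.domain = domain;
    }

    // Produces ids of the form <uuid>-<n>@<domain>, with n starting at 1.
    String next() {
        return messageId + "-" + counter.incrementAndGet() + "@" + domain;
    }

    public static void main(String[] args) {
        ContentIdGenerator ids = new ContentIdGenerator("example.org");
        System.out.println(ids.next());
        System.out.println(ids.next());
    }
}
```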

Also, the first change may have a problem.  We tried that with 
DocumentBuilderFactories and TransformerFactories and ran into classloader 
issues when CXF was embedded into other applications like 
Geronimo, Camel, and ServiceMix.  If you look in XMLUtils, we ended up 
creating maps for ClassLoader -> factory.  It would probably be good to do 
the same.
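
A rough sketch of a ClassLoader -> factory map applied to
XMLInputFactory is shown below. It is not the XMLUtils code. The class
name, the choice of the thread context class loader as the key, and the
use of a WeakHashMap are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.WeakHashMap;
import javax.xml.stream.XMLInputFactory;

// Hypothetical cache; not the actual XMLUtils implementation.
class PerClassLoaderFactories {

    // Weak keys, so the map itself does not pin an application's loader.
    private static final Map<ClassLoader, XMLInputFactory> NS_AWARE =
        new WeakHashMap<ClassLoader, XMLInputFactory>();

    static XMLInputFactory getNsAwareFactory() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = PerClassLoaderFactories.class.getClassLoader();
        }
        synchronized (NS_AWARE) {
            XMLInputFactory factory = NS_AWARE.get(loader);
            if (factory == null) {
                // newInstance() looks the implementation up through the
                // context class loader, so each application gets its own.
                factory = XMLInputFactory.newInstance();
                factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
                NS_AWARE.put(loader, factory);
            }
            return factory;
        }
    }

    public static void main(String[] args) {
        XMLInputFactory first = getNsAwareFactory();
        System.out.println("reused: " + (first == getNsAwareFactory()));
    }
}
```

One caveat with this sketch: if the factory class was itself loaded by
the key class loader, the cached value keeps the key strongly reachable
and the weak key alone will not let it be collected.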

That said, can you file a couple of JIRAs with patches?  We cannot accept code 
patches via email (the "grant to apache" box needs to be checked in JIRA).

Anyway, very nice job!   Thanks!

Dan


 



-- 
Daniel Kulp
dkulp@apache.org
http://dankulp.com/blog