Posted to users@cxf.apache.org by Mayank Mishra <ma...@pramati.com> on 2008/11/14 08:13:29 UTC
Mtom attachment Performance
Hi All,
On CXF 2.0.7, I evaluated the performance of sending attachments with
MTOM enabled and with MTOM disabled. As promised by MTOM, I see around
30% smaller messages compared to Base64-encoded messages. But
surprisingly, the MTOM-enabled scenarios take more time than expected.
My simple test sends a string and a raw byte array in a request and
receives them back in the response.
public void testMtom(
    @WebParam(mode = WebParam.Mode.INOUT, name = "name",
              targetNamespace = "http://cxf.apache.org/mime/types")
    javax.xml.ws.Holder<java.lang.String> name,
    @WebParam(mode = WebParam.Mode.INOUT, name = "attachinfo",
              targetNamespace = "http://cxf.apache.org/mime/types")
    javax.xml.ws.Holder<javax.activation.DataHandler> attachinfo
);
I ran the above test with byte arrays ranging in size from 32 KB to 12 MB.
I checked where an unreasonable amount of time was being spent during the
request and response cycle. My observations are as follows:
1. StaxUtils change: The XML namespace-aware and namespace-unaware
factories are created as static members of StaxUtils. Moving them into a
static method that initializes them as needed gives an improvement of
around 50 ms in the StaxInterceptors on both client and server, for example:
public static XMLInputFactory getXMLInputFactory(boolean nsAware) {
    if (nsAware) {
        if (XML_NS_AWARE_INPUT_FACTORY == null) {
            XML_NS_AWARE_INPUT_FACTORY = XMLInputFactory.newInstance();
            XML_NS_AWARE_INPUT_FACTORY.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
        }
        return XML_NS_AWARE_INPUT_FACTORY;
    } else {
        if (XML_INPUT_FACTORY == null) {
            XML_INPUT_FACTORY = XMLInputFactory.newInstance();
            XML_INPUT_FACTORY.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false);
        }
        return XML_INPUT_FACTORY;
    }
}
2. MimeBodyPartInputStream change: Implement read(byte[] b, int off, int
len). Currently the parent class InputStream serves this call, which
reads one byte at a time and processes each byte in
MimeBodyPartInputStream.
An alternative implementation that provides read(buf, off, len) in
MimeBodyPartInputStream works 6 times faster than this one. To check the
performance of this alternative implementation, I wrote a simple test
program that just reads an input stream and performs the same boundary
check with both the alternative and the CXF MimeBodyPartInputStream
implementations. For 12 MB of data, the alternative takes around 250 ms,
whereas the CXF original takes around 1250 ms. If required, I can
provide a patch with this alternative MimeBodyPartInputStream file.
3. AttachmentInInterceptor change: AttachmentDeserializer contains
static java.util.regex.Patterns that have to be compiled for particular
String expressions, which takes a substantial number of milliseconds.
An AttachmentDeserializer instance is created in AttachmentInInterceptor
during the handleMessage() call. This creation can be moved to the
AttachmentInInterceptor constructor; given a setMessage(Message message)
method in AttachmentDeserializer, we can then set the message during the
handleMessage() call.
4. AttachmentUtil change: I am not sure whether we require a
Universally Unique ID as the MIME Content-ID. It may be that we need a
UUID for adherence to the specification or for interoperability with
other vendors. Currently a UUID is created for each attachment in the
message. If the MIME spec does not require a universally unique ID, a
sequential counter can be used to provide unique identifiers for the
different attachments in a message. This also saves substantial time.
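To illustrate observation 2, here is a minimal sketch of why overriding read(byte[], int, int) matters. It is not the CXF class and not the alternative implementation mentioned above: the class names are made up for illustration, and the MIME boundary check is left out entirely. It only counts how often the single-byte read() is hit when a caller drains the stream through a 4 KB buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BulkReadSketch {

    /** Hypothetical wrapper that only implements read(); bulk reads fall back to InputStream's per-byte loop. */
    static class SingleByteStream extends InputStream {
        protected final InputStream in;
        int singleByteCalls;

        SingleByteStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            singleByteCalls++;
            return in.read();
        }
    }

    /** Same wrapper, but read(byte[], int, int) is overridden to hand over a whole chunk per call. */
    static class BulkStream extends SingleByteStream {
        int bulkCalls;

        BulkStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            bulkCalls++;
            return in.read(buf, off, len);
        }
    }

    /** Reads the stream to the end through a 4 KB buffer and returns the number of bytes seen. */
    static long drain(InputStream in) {
        byte[] buf = new byte[4096];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buf, 0, buf.length)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20]; // 1 MB of zeros as stand-in attachment data
        SingleByteStream slow = new SingleByteStream(new ByteArrayInputStream(data));
        BulkStream fast = new BulkStream(new ByteArrayInputStream(data));
        System.out.println(drain(slow) + " bytes, read() calls: " + slow.singleByteCalls);
        System.out.println(drain(fast) + " bytes, read() calls: " + fast.singleByteCalls
                + ", bulk calls: " + fast.bulkCalls);
    }
}
```

In the first wrapper every byte costs one virtual call; in the second the caller's buffer is filled in a handful of calls. A real MimeBodyPartInputStream would still have to scan each chunk for the boundary, so the gain there depends on how that scan is done.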
The MTOM-enabled scenario performs better with these changes.
With Regards,
Mayank
Re: Mtom attachment Performance
Posted by Benson Margulies <bi...@gmail.com>.
Please post these as a patch, preferably one that can be applied to
trunk. If you can't, one of us will do it the hard way.
Re: Mtom attachment Performance
Posted by Mayank Mishra <ma...@pramati.com>.
Thanks a lot Dan, Benson. I will file JIRAs and will submit patches to
them very soon.
With Regards,
Mayank
Re: Mtom attachment Performance
Posted by Daniel Kulp <dk...@apache.org>.
Very nice work. I'll need to double check the attachment spec to see what is
required there. I THINK we should be able to create a single UUID in the
message and append a counter on it for each part. Not really sure though.
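The "single UUID plus counter" idea could look roughly like the sketch below. This is only an illustration under assumptions: the class name, the "-" separator, and the "cxf.apache.org" domain part are made up here, and whether such IDs satisfy the attachment spec is exactly the open question above.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

public class ContentIdSketch {

    /** Hypothetical generator: one UUID per message, then a cheap counter per attachment. */
    static final class ContentIdGenerator {
        private final String messageId = UUID.randomUUID().toString();
        private final AtomicInteger counter = new AtomicInteger();
        private final String domain;

        ContentIdGenerator(String domain) {
            this.domain = domain;
        }

        /** Returns the next Content-ID value in the form uuid-N@domain. */
        String next() {
            return messageId + "-" + counter.incrementAndGet() + "@" + domain;
        }
    }

    public static void main(String[] args) {
        // One generator per outgoing message; the domain is an assumed placeholder.
        ContentIdGenerator ids = new ContentIdGenerator("cxf.apache.org");
        System.out.println(ids.next());
        System.out.println(ids.next());
    }
}
```

The UUID cost is paid once per message, and each additional part only increments an integer, while IDs from different messages still differ in their UUID prefix.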
Also, the first change may have a problem. We tried that with
DocumentBuilderFactory and TransformerFactory instances and ran into
classloader issues when CXF was embedded into other applications like
Geronimo, Camel, and ServiceMix. If you look in XMLUtils, we ended up
creating maps for ClassLoader -> factory. It would probably be good to do
the same.
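A ClassLoader -> factory map for the StAX factories might be sketched as follows. This is not the XMLUtils code, which is not shown in this thread; the class name and the use of the thread context class loader as the key are assumptions for illustration.

```java
import java.util.Map;
import java.util.WeakHashMap;
import javax.xml.stream.XMLInputFactory;

public class PerClassLoaderFactoryCache {

    // Weak keys are meant to avoid pinning class loaders of undeployed applications.
    // Caveat: a factory whose implementation was loaded by the key loader still
    // references that loader, so this is only a partial safeguard.
    private static final Map<ClassLoader, XMLInputFactory> NS_AWARE = new WeakHashMap<>();
    private static final Map<ClassLoader, XMLInputFactory> NS_UNAWARE = new WeakHashMap<>();

    /** Returns a factory cached per context class loader, creating it on first use. */
    public static synchronized XMLInputFactory getXMLInputFactory(boolean nsAware) {
        // May be null for some threads; WeakHashMap accepts a null key.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        Map<ClassLoader, XMLInputFactory> cache = nsAware ? NS_AWARE : NS_UNAWARE;
        XMLInputFactory factory = cache.get(loader);
        if (factory == null) {
            factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, nsAware);
            cache.put(loader, factory);
        }
        return factory;
    }

    public static void main(String[] args) {
        XMLInputFactory first = getXMLInputFactory(true);
        XMLInputFactory second = getXMLInputFactory(true);
        System.out.println("same instance reused: " + (first == second));
    }
}
```

The method is synchronized so the lazy initialization stays safe when several threads hit it at once; each embedding application then gets the factory resolved through its own class loader instead of whichever one happened to initialize a shared static first.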
That said, can you file a couple of JIRAs with patches? We cannot accept code
patches via email (the "grant to Apache" box needs to be checked in JIRA).
Anyway, very nice job! Thanks!
Dan
--
Daniel Kulp
dkulp@apache.org
http://dankulp.com/blog