You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-dev@axis.apache.org by Archit Shah <ar...@alum.mit.edu> on 2004/11/12 21:27:12 UTC
Axis memory usage and performance
Inspired by jira issue AXIS-1391, I have been investigating server side
performance and memory consumption by looking at the memory allocations
(using the JVM's -Xrunhprof:heap=sites) and using the SOAP benchmark
from Alek (http://www.extreme.indiana.edu/xgws/soap_bench/) and
SimpleAxisServer. The most illustrative test involved 10 calls to a
simple SOAP method:
public int receiveDoubles(double[] input) throws Exception {
return input.length;
}
with an input array of length 5000. Memory profiling seems to show that
the data is copied repeatedly during deserialization and that there is
a lot of per-array element work (about 40 objects allocated). Reducing
this work could help boost performance and reduce memory footprint.
Here is what I think is happening:
Data starts in char[] buffer.
1) sent to the SAX2EventRecorder where it is converted to a string
(line 91)
2) during replay it is converted back to char[] in order to use the
ContentHandler API
3) the replay client (SimpleDeserializer, I think) copies the data to
its internal CharArrayWriter
4) SimpleDeserializer.onEndElement converts its CharArrayWriter to a
String
That covers 4 copies of each piece of data that needs deserialization.
In addition, I see significant allocations from SOAPHander.onEndElement
(lines 114 and 118). SOAPHandler.onEndElement calls addTextNode, first
converting to String from a CharArrayWriter. addTextNode ends up
calling into crimson which converts the string back to a char[]. I
believe this happens before the replay. I am not sure why this is
necessary, as the data is stored in the event recorder.
In addition to the input data being copied, there are other sources of
per-array element allocations:
- SimpleDeserializer initialization (1 allocated per array element)
- NSStack (1 call to cloneFrame per array element)
- NodeImpl.initializeChildren (3 array list allocations per array
element)
- DeserializerImpl.addChildDeserializer (1 call per array element,
allocating 1 CallbackTarget and 1 vector)
Here is the the output from the allocation profiler showing the per
array element allocations (the full output is available on jira as an
attachment to issue 1391 (it will be once I can successfully upload,
something seems to be broken right now):
percent live alloc'ed stack class
rank self accum bytes objs bytes objs trace name
1 5.83% 5.83% 520104 5001 5201040 50010 12694
org.apache.axis.message.MessageElement
2 5.74% 11.57% 512016 1 7424400 25 13463 java.lang.Object
3 4.57% 16.14% 407328 4243 4798464 49984 13160
org.apache.axis.encoding.ser.SimpleDeserializer
4 3.81% 19.94% 339440 4243 3998720 49984 13165
java.util.HashMap$Entry
5 3.81% 23.75% 339440 4243 3998720 49984 13162 [C
6 3.81% 27.55% 339440 4243 3998720 49984 13167 [C
7 3.14% 30.69% 280056 5001 2800560 50010 12696 java.lang.Object
8 3.14% 33.83% 280000 5000 2800000 50000 12750 java.lang.Object
9 3.14% 36.97% 280000 5000 2800000 50000 12724 java.lang.Object
10 2.69% 39.66% 240000 5000 2400000 50000 12714
org.apache.axis.message.Text
11 2.66% 42.33% 237608 4243 2800000 50000 13066 java.lang.Object
12 1.90% 44.23% 169720 4243 1999360 49984 13164 java.util.HashMap
13 1.90% 46.13% 169720 4243 1999360 49984 13171
java.lang.FloatingDecimal
14 1.79% 47.92% 160000 5000 1600000 50000 12721
org.apache.crimson.tree.TextNode
15 1.79% 49.72% 160000 5000 1520000 50000 12722 [C
16 1.79% 51.51% 159920 4999 1519320 49991 12703 [C
17 1.52% 53.03% 135808 4246 1520384 50036 13075 [C
18 1.52% 54.56% 135776 4243 1519616 49984 13170 [C
19 1.52% 56.08% 135776 4243 1520000 50000 13077 [C
20 1.52% 57.60% 135776 4243 1599488 49984 13161
java.io.CharArrayWriter
21 1.52% 59.12% 135776 4243 1599488 49984 13166
java.io.CharArrayWriter
22 1.35% 60.47% 120024 5001 1200240 50010 12695
java.util.ArrayList
23 1.35% 61.81% 120000 5000 1200000 50000 12749
java.util.ArrayList
24 1.35% 63.16% 120000 5000 1200000 50000 12723
java.util.ArrayList
25 1.35% 64.50% 119976 4999 1199784 49991 12702 java.lang.String
26 1.14% 65.65% 101832 4243 1200000 50000 13065 java.util.Vector
27 1.14% 66.79% 101832 4243 1200000 50000 13076 java.lang.String
28 1.14% 67.93% 101832 4243 1200000 50000 13044 java.lang.Object
29 1.14% 69.07% 101832 4243 1200000 50000 13067
java.util.HashMap$Entry
30 1.14% 70.21% 101808 4242 1200000 50000 13020
org.apache.axis.encoding.TypeMappingImpl$Pair
31 1.14% 71.35% 101808 4242 1200000 50000 13021
org.apache.axis.encoding.TypeMappingImpl$Pair
32 0.99% 72.35% 88648 865 88768 867 2811 [C
33 0.90% 73.24% 80000 5000 799744 49984 13168 java.lang.Double
34 0.88% 74.13% 78768 478 78768 478 558 [B
35 0.76% 74.89% 67888 4243 799744 49984 13163 java.util.HashSet
36 0.76% 75.65% 67888 4243 800000 50000 13058
org.apache.axis.encoding.DeserializerTarget
37 0.76% 76.41% 67888 4243 800000 50000 13100 java.util.Vector$1
38 0.76% 77.17% 67888 4243 800000 50000 13068
org.apache.axis.encoding.CallbackTarget
39 0.76% 77.93% 67888 4243 799744 49984 13169 java.lang.Double
40 0.76% 78.69% 67888 4243 800000 50000 13058 java.lang.Integer
41 0.76% 79.45% 67888 4243 800000 50000 13081 java.lang.Object
42 0.76% 80.22% 67888 4243 800000 50000 13022
org.apache.axis.utils.cache.MethodCache$MethodKey
-- Archit
Re: Axis memory usage and performance
Posted by Davanum Srinivas <da...@gmail.com>.
Archit,
checked in your patch. Thanks for the same. If you see something else
fishy, give us a holler.
-- dims
On Tue, 16 Nov 2004 11:09:14 -0500, Archit Shah
<ar...@alum.mit.edu> wrote:
> I forgot to mention that the revised version gives the same performance
> boost as before, about 5-10% in the benchmark from issue 1391.
>
>
>
> -- Archit
>
> On Tuesday, Nov 16, 2004, at 10:50 US/Eastern, Davanum Srinivas wrote:
>
> > this patch looks good....How much perf boost are u getting with this?
> > (same as before 5-10%)
>
>
--
Davanum Srinivas - http://webservices.apache.org/~dims/
Re: Axis memory usage and performance
Posted by Archit Shah <ar...@alum.mit.edu>.
I forgot to mention that the revised version gives the same performance
boost as before, about 5-10% in the benchmark from issue 1391.
-- Archit
On Tuesday, Nov 16, 2004, at 10:50 US/Eastern, Davanum Srinivas wrote:
> this patch looks good....How much perf boost are u getting with this?
> (same as before 5-10%)
Re: Axis memory usage and performance
Posted by Davanum Srinivas <da...@gmail.com>.
Archit,
this patch looks good....How much perf boost are u getting with this?
(same as before 5-10%)?
thanks,
dims
On Tue, 16 Nov 2004 10:19:29 -0500, Archit Shah
<ar...@alum.mit.edu> wrote:
> Revised patch attached. It does not touch the Deserializer interface
> and does not make private the targets field of the public
> DeserializerImpl class.
>
> -- Archit
>
>
>
> On Tuesday, Nov 16, 2004, at 06:55 US/Eastern, Davanum Srinivas wrote:
>
> > problem is...it's too late to change method signatures...can you take
> > a stab at keeping the method signatures?
> >
> > On Tue, 16 Nov 2004 01:34:20 -0500, Archit Shah
> > <ar...@alum.mit.edu> wrote:
> >> I have attached a patch that changes NodeListImpl and DeserializerImpl
> >> to use ArrayList instead of Vector as Vector should only be used when
> >> synchronization is needed. The patch also changes the targets field of
> >> DeserializerImpl to be private instead of protected and removes the
> >> overrides of removeValueTargets in subclasses SimpleDeserializer and
> >> SimpleListDeserializer. In 2 places I have switched to using index
> >> based for loops instead of iteration. Altogether, this should provide
> >> a
> >> 5-10% improvement for the benchmark in issue 1391.
>
>
>
>
>
>
>
--
Davanum Srinivas - http://webservices.apache.org/~dims/
Re: Axis memory usage and performance
Posted by Archit Shah <ar...@alum.mit.edu>.
Revised patch attached. It does not touch the Deserializer interface
and does not make private the targets field of the public
DeserializerImpl class.
-- Archit
On Tuesday, Nov 16, 2004, at 06:55 US/Eastern, Davanum Srinivas wrote:
> problem is...it's too late to change method signatures...can you take
> a stab at keeping the method signatures?
>
> On Tue, 16 Nov 2004 01:34:20 -0500, Archit Shah
> <ar...@alum.mit.edu> wrote:
>> I have attached a patch that changes NodeListImpl and DeserializerImpl
>> to use ArrayList instead of Vector as Vector should only be used when
>> synchronization is needed. The patch also changes the targets field of
>> DeserializerImpl to be private instead of protected and removes the
>> overrides of removeValueTargets in subclasses SimpleDeserializer and
>> SimpleListDeserializer. In 2 places I have switched to using index
>> based for loops instead of iteration. Altogether, this should provide
>> a
>> 5-10% improvement for the benchmark in issue 1391.
Re: Axis memory usage and performance
Posted by Davanum Srinivas <da...@gmail.com>.
Archit,
problem is...it's too late to change method signatures...can you take
a stab at keeping the method signatures?
thanks,
dims
On Tue, 16 Nov 2004 01:34:20 -0500, Archit Shah
<ar...@alum.mit.edu> wrote:
> I have attached a patch that changes NodeListImpl and DeserializerImpl
> to use ArrayList instead of Vector as Vector should only be used when
> synchronization is needed. The patch also changes the targets field of
> DeserializerImpl to be private instead of protected and removes the
> overrides of removeValueTargets in subclasses SimpleDeserializer and
> SimpleListDeserializer. In 2 places I have switched to using index
> based for loops instead of iteration. Altogether, this should provide a
> 5-10% improvement for the benchmark in issue 1391.
>
> -- Archit
>
>
>
> On Friday, Nov 12, 2004, at 15:27 US/Eastern, Archit Shah wrote:
>
> > with an input array of length 5000. Memory profiling seems to show
> > that the data is copied repeatedly during deserialization and that
> > there is a lot of per-array element work (about 40 objects allocated).
> > Reducing this work could help boost performance and reduce memory
> > footprint. Here is what I think is happening:
>
>
>
>
>
>
>
--
Davanum Srinivas - http://webservices.apache.org/~dims/
Re: Axis memory usage and performance
Posted by Archit Shah <ar...@alum.mit.edu>.
I have attached a patch that changes NodeListImpl and DeserializerImpl
to use ArrayList instead of Vector as Vector should only be used when
synchronization is needed. The patch also changes the targets field of
DeserializerImpl to be private instead of protected and removes the
overrides of removeValueTargets in subclasses SimpleDeserializer and
SimpleListDeserializer. In 2 places I have switched to using index
based for loops instead of iteration. Altogether, this should provide a
5-10% improvement for the benchmark in issue 1391.
-- Archit
On Friday, Nov 12, 2004, at 15:27 US/Eastern, Archit Shah wrote:
> with an input array of length 5000. Memory profiling seems to show
> that the data is copied repeatedly during deserialization and that
> there is a lot of per-array element work (about 40 objects allocated).
> Reducing this work could help boost performance and reduce memory
> footprint. Here is what I think is happening:
Re: Axis memory usage and performance
Posted by Samuel Meder <me...@mcs.anl.gov>.
On Sun, 2004-11-14 at 17:37 -0500, Davanum Srinivas wrote:
> Archit,
>
> Sam/Jarek just checked in a bunch of changes...
These are all Jarek's, so no blaming me ;)
/Sam
> Can you please re-run
> your tests and send us the latest hprof output? I will check on JIRA
> outage.
>
> thanks,
> dims
>
>
> On Sun, 14 Nov 2004 12:26:17 -0500, Archit Shah
> <ar...@alum.mit.edu> wrote:
> > Not sure why I can't upload to jira. I get a 502 error. "The proxy
> > server received an invalid response from an upstream server. The proxy
> > server could not handle the request <em><a
> > href="/jira/secure/AttachFile.jspa">POST /jira/secure/
> > AttachFile.jspa</a></em>.<p> Reason: <strong>Error reading from remote
> > server</strong></p>"
> >
> > The file is attached here instead.
> >
> > -- Archit
> >
> > On Sunday, Nov 14, 2004, at 02:07 US/Eastern, Jarek Gawor wrote:
> >
> >
> >
> > >> -----Original Message-----
> > >> Here is the the output from the allocation profiler showing the per
> > >> array element allocations (the full output is available on jira as an
> > >> attachment to issue 1391 (it will be once I can successfully upload,
> > >> something seems to be broken right now):
> > >
> > > Where is that attachment? I can't find it with that bug.
> >
> >
> >
> >
> >
> >
> >
>
>
Re: Axis memory usage and performance
Posted by Archit Shah <ar...@alum.mit.edu>.
Jira is working fine now. File uploaded.
http://nagoya.apache.org/jira/secure/attachment/18141/rd2.hprof.txt.gz
-- Archit
On Sunday, Nov 14, 2004, at 17:37 US/Eastern, Davanum Srinivas wrote:
> Sam/Jarek just checked in a bunch of changes...Can you please re-run
> your tests and send us the latest hprof output? I will check on JIRA
> outage.
Re: Axis memory usage and performance
Posted by Davanum Srinivas <da...@gmail.com>.
Archit,
Sam/Jarek just checked in a bunch of changes...Can you please re-run
your tests and send us the latest hprof output? I will check on JIRA
outage.
thanks,
dims
On Sun, 14 Nov 2004 12:26:17 -0500, Archit Shah
<ar...@alum.mit.edu> wrote:
> Not sure why I can't upload to jira. I get a 502 error. "The proxy
> server received an invalid response from an upstream server. The proxy
> server could not handle the request <em><a
> href="/jira/secure/AttachFile.jspa">POST /jira/secure/
> AttachFile.jspa</a></em>.<p> Reason: <strong>Error reading from remote
> server</strong></p>"
>
> The file is attached here instead.
>
> -- Archit
>
> On Sunday, Nov 14, 2004, at 02:07 US/Eastern, Jarek Gawor wrote:
>
>
>
> >> -----Original Message-----
> >> Here is the the output from the allocation profiler showing the per
> >> array element allocations (the full output is available on jira as an
> >> attachment to issue 1391 (it will be once I can successfully upload,
> >> something seems to be broken right now):
> >
> > Where is that attachment? I can't find it with that bug.
>
>
>
>
>
>
>
--
Davanum Srinivas - http://webservices.apache.org/~dims/
Re: Axis memory usage and performance
Posted by Archit Shah <ar...@alum.mit.edu>.
Not sure why I can't upload to jira. I get a 502 error. "The proxy
server received an invalid response from an upstream server. The proxy
server could not handle the request <em><a
href="/jira/secure/AttachFile.jspa">POST /jira/secure/
AttachFile.jspa</a></em>.<p> Reason: <strong>Error reading from remote
server</strong></p>"
The file is attached here instead.
-- Archit
On Sunday, Nov 14, 2004, at 02:07 US/Eastern, Jarek Gawor wrote:
>> -----Original Message-----
>> Here is the the output from the allocation profiler showing the per
>> array element allocations (the full output is available on jira as an
>> attachment to issue 1391 (it will be once I can successfully upload,
>> something seems to be broken right now):
>
> Where is that attachment? I can't find it with that bug.
RE: Axis memory usage and performance
Posted by Jarek Gawor <ga...@mcs.anl.gov>.
> -----Original Message-----
> From: Archit Shah [mailto:archit.shah@alum.mit.edu]
> Sent: Friday, November 12, 2004 3:27 PM
> To: axis-dev@ws.apache.org
> Subject: Axis memory usage and performance
>
>
> Inspired by jira issue AXIS-1391, I have been investigating
> server side
> performance and memory consumption by looking at the memory
> allocations
> (using the JVM's -Xrunhprof:heap=sites) and using the SOAP benchmark
> from Alek (http://www.extreme.indiana.edu/xgws/soap_bench/) and
> SimpleAxisServer. The most illustrative test involved 10 calls to a
> simple SOAP method:
>
> public int receiveDoubles(double[] input) throws Exception {
> return input.length;
> }
>
> with an input array of length 5000. Memory profiling seems to
> show that
> the data is copied repeatedly during deserialization and that
> there is
> a lot of per-array element work (about 40 objects allocated).
> Reducing
> this work could help boost performance and reduce memory footprint.
> Here is what I think is happening:
>
> Data starts in char[] buffer.
> 1) sent to the SAX2EventRecorder where it is converted to a string
> (line 91)
> 2) during replay it is converted back to char[] in order to use the
> ContentHandler API
> 3) the replay client (SimpleDeserializer, I think) copies the data to
> its internal CharArrayWriter
> 4) SimpleDeserializer.onEndElement converts its CharArrayWriter to a
> String
I updated the SAX2EventRecorder not to convert to String and then back to
char[]. The input char[] still needs to be cloned, but that's one less clone
then before.
> That covers 4 copies of each piece of data that needs
> deserialization.
> In addition, I see significant allocations from
> SOAPHander.onEndElement
> (lines 114 and 118). SOAPHandler.onEndElement calls
> addTextNode, first
> converting to String from a CharArrayWriter. addTextNode ends up
> calling into crimson which converts the string back to a char[]. I
> believe this happens before the replay. I am not sure why this is
> necessary, as the data is stored in the event recorder.
I'm not sure why addTextNode is called in the first place either. I ran all
axis tests and tests in our software and saw no problems with that code
removed.
> In addition to the input data being copied, there are other
> sources of
> per-array element allocations:
>
> - SimpleDeserializer initialization (1 allocated per array element)
>
> - NSStack (1 call to cloneFrame per array element)
>
> - NodeImpl.initializeChildren (3 array list allocations per array
> element)
>
> - DeserializerImpl.addChildDeserializer (1 call per array element,
> allocating 1 CallbackTarget and 1 vector)
>
> Here is the the output from the allocation profiler showing the per
> array element allocations (the full output is available on jira as an
> attachment to issue 1391 (it will be once I can successfully upload,
> something seems to be broken right now):
Where is that attachment? I can't find it with that bug.
Jarek