Posted to httpclient-users@hc.apache.org by "Brooks, Kenneth S" <ke...@chase.com> on 2010/07/18 08:04:03 UTC

httpclient performance

Oleg,

We're replacing a t3 solution with one that is similar to SpringRemoting.. Serializing POJOs over http.

I've been able to optimize the CPU utilization and memory usage very well.. We're running in a smaller memory footprint and less CPU than on t3.

The one thing I'm struggling with is the actual response times.
We're about 2-6x slower over http.

Keep in mind, the service itself can serve both t3 and http so it's not an issue of the server side.. it is responding consistently.

One thing that I'm focusing in on is what JProfiler is telling me about httpclient.

Please see the attached html file.. it's a Call tree output showing the CPU time for the call tree.
It shows that 60% of my call time is buried in httpclient.execute.
I would expect that, but what I don't get is that when I drill down to doSendRequest and doReceiveResponse I find out that the majority of the time is eaten up by sendRequestHeader and receiveResponseHeader.. I would have thought that retrieving the entity would consume the majority of the time..

I understand that some of the overhead might be related to t3 being persistent and http not really (having to do the stale check, etc.. ).

But the difference is pretty significant... 2000 invocations on t3 take 5300ms, 2000 invocations on http take 30000ms..
14 of those 30 seconds are spent inside of httpclient (and that's just CPU time)..
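
As a quick sanity check, the totals above can be broken down per call. The sketch below is only arithmetic on the figures quoted in this message; the class and method names are made up for illustration.

```java
// Per-call breakdown of the totals quoted in this message.
// The class and method names are illustrative only.
final class InvocationTimings {

    /** Mean milliseconds per call for a run of sequential invocations. */
    static double perCallMillis(long totalMillis, int calls) {
        if (calls <= 0) {
            throw new IllegalArgumentException("calls must be positive");
        }
        return (double) totalMillis / calls;
    }

    public static void main(String[] args) {
        int calls = 2000;
        double t3 = perCallMillis(5_300, calls);          // wall-clock, t3
        double http = perCallMillis(30_000, calls);       // wall-clock, http
        double clientCpu = perCallMillis(14_000, calls);  // CPU time inside httpclient

        System.out.println("t3 per call (ms):             " + t3);
        System.out.println("http per call (ms):           " + http);
        System.out.println("httpclient CPU per call (ms): " + clientCpu);
        System.out.println("http / t3 ratio:              " + (http / t3));
    }
}
```

That works out to 2.65 ms per call on t3 versus 15 ms per call on http (a factor of about 5.7), with roughly 7 ms of each http call being CPU time attributed to httpclient.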

What in the world am I missing?
Is that just to be expected.. or am I doing something wrong?

What further information do you need from me?
-k




This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law.  If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED.  Although this transmission and
any attachments are believed to be free of any virus or other
defect that might affect any computer system into which it is
received and opened, it is the responsibility of the recipient to
ensure that it is virus free and no responsibility is accepted by
JPMorgan Chase & Co., its subsidiaries and affiliates, as
applicable, for any loss or damage arising in any way from its use.
 If you received this transmission in error, please immediately
contact the sender and destroy the material in its entirety,
whether in electronic or hard copy format. Thank you.

RE: httpclient performance

Posted by "Brooks, Kenneth S" <ke...@chase.com>.
> 
> There is no need to turn on full wire logging, which is very noisy and
> expensive. Only context logging should be enough.
> 
> http://hc.apache.org/httpcomponents-client-4.0.1/logging.html
> 
> ...
> 

[[-KenBrooks-]] 
Understood.. yeah I would turn it on with only context logging.. I was just explaining why I had it completely off before. :)
I'm building the 4.1 trunk (so that it includes the illegal state fix) and I'll then run it with context logging and hopefully shoot that across here later today.

> > Are there interceptors (like the cookie interceptors) that I can
> disable for authentication & state management?
> > That would at least be a good test that circumventing client and going
> to core might make a difference.
> >

[[-KenBrooks-]] 
Browsing around I see that in 4.1 (instead of 4.0.X) there are a few more interceptors.. like RequestAuthCache.. 


> Most of the overhead is added by DefaultRequestDirector which cannot be
> easily removed. One possibility would be to use a simplified custom
> implementation of RequestDirector to execute requests with minimal
> overhead, but this would be hardly any different than using HttpCore
> directly.
> 
> Here's a simple benchmark which can be used to compare performance of
> HttpCore and HttpClient
> 
> http://wiki.apache.org/HttpComponents/HttpClient3vsHttpClient4vsHttpCore
> 

[[-KenBrooks-]] 
I ran across this over the weekend.. I was mostly focused on httpclient3 vs. httpclient4 because at first I wasn't sure if the overhead was a result of us upgrading from 3.1 to 4.0.1. Now that you mention possibly using HttpCore it sheds new light on those metrics..

Thanks again for all the helpful suggestions.. 

-k


RE: httpclient performance

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2010-07-19 at 10:22 -0400, Brooks, Kenneth S wrote:
> 

...

> 
> > What in the world am I missing?
> > Is that just to be expected.. or am I doing something wrong?
> >  
> > What further information do you need from me?
> 
> A context log of the HTTP session would be most useful. 
> 
> [[-KenBrooks-]] Ok.. I'll turn the logging back on (I had it turned off because previously I was running httpclient trace and that in itself was obviously a huge performance hit..)  :) 
> 

There is no need to turn on full wire logging, which is very noisy and
expensive. Only context logging should be enough.

http://hc.apache.org/httpcomponents-client-4.0.1/logging.html
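
For reference, a minimal sketch of "context logging without wire logging", assuming HttpClient's Commons Logging output ends up in java.util.logging (with log4j the equivalent levels would go into the log4j configuration instead). The logger names org.apache.http and org.apache.http.wire are the ones described on the logging page linked above; the class and method names are made up.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ContextLoggingSetup {

    // Strong references: java.util.logging holds loggers weakly, so settings
    // made on an otherwise unreferenced logger can be lost.
    static final Logger HTTP = Logger.getLogger("org.apache.http");
    static final Logger WIRE = Logger.getLogger("org.apache.http.wire");

    /** Call once at startup, before the first request is executed. */
    static void enableContextLogging() {
        HTTP.setLevel(Level.FINEST);     // context (and header) logging
        WIRE.setLevel(Level.SEVERE);     // keep the expensive wire log quiet

        Handler console = new ConsoleHandler();
        console.setLevel(Level.FINEST);  // the handler's default level is INFO
        HTTP.addHandler(console);
        HTTP.setUseParentHandlers(false); // avoid duplicate output via the root handler
    }

    public static void main(String[] args) {
        enableContextLogging();
        Logger.getLogger("org.apache.http.impl.conn").fine("context message: printed");
        Logger.getLogger("org.apache.http.wire").fine("wire message: suppressed");
    }
}
```

Loggers below org.apache.http inherit FINEST, while the wire logger keeps its own, stricter level.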

...

> 
> If the transport you are developing does not require authentication,
> HTTP state management and complex connection pooling, one possibility to
> speed things up by approximately 40-50% would be using HttpCore directly
> without all those additional services provided by HttpClient 
> 
> [[-KenBrooks-]] Very interesting thought.. No we don't need any authentication or http state management, cookies (which I disabled by removing the cookie interceptors)
> The only things that we really need that I can think of are the 
> * stale check 
> * idleconnectionmanagerthread (which is my class) but as long as the ConnectionManager it uses is in httpcore then we might be good.
> * socket timeout
> * connection timeout
> * tcp nodelay
> 
> I think a good many of those (at least the bottom 3) are all underlying protocol settings so are probably in core.. not sure about stale check and the connectionmanager.
> 
> Are there interceptors (like the cookie interceptors) that I can disable for authentication & state management?
> That would at least be a good test that circumventing client and going to core might make a difference.
> 

Most of the overhead is added by DefaultRequestDirector which cannot be
easily removed. One possibility would be to use a simplified custom
implementation of RequestDirector to execute requests with minimal
overhead, but this would be hardly any different than using HttpCore
directly.

Here's a simple benchmark which can be used to compare performance of
HttpCore and HttpClient

http://wiki.apache.org/HttpComponents/HttpClient3vsHttpClient4vsHttpCore

Hope this helps

Oleg



---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


RE: httpclient performance

Posted by "Brooks, Kenneth S" <ke...@chase.com>.

-----Original Message-----
From: Oleg Kalnichevski [mailto:olegk@apache.org] 
Sent: Sunday, July 18, 2010 3:54 PM
To: HttpClient User Discussion
Subject: Re: httpclient performance

On Sun, 2010-07-18 at 02:04 -0400, Brooks, Kenneth S wrote:
> Oleg,
>  
> We’re replacing a t3 solution with one that is similar to
> SpringRemoting.. Serializing POJOs over http.
>  
> I’ve been able to optimize the CPU utilization and memory usage very
> well.. We’re running in a smaller memory footprint and less CPU than
> on t3.
>  
> The one thing I’m struggling with is the actual response times.
> We’re about 2-6x slower over http.
>  
> Keep in mind, the service itself can serve both t3 and http so it’s not
> an issue of the serverside.. it is responding consistently.
>  
> One thing that I’m focusing in on is what JProfiler is telling me
> about httpclient.
>  
> Please see the attached html file.. it’s a Call tree output showing
> the CPU time for the call tree.
> It shows that 60% of my call time is buried in httpclient.execute.
> I would expect that, but what I don’t get is that when I drill down to
> doSendRequest and doReceiveResponse I find out that the majority of the
> time is eaten up by sendRequestHeader and receiveResponseHeader.. I
> would have thought that retrieving the entity would consume the
> majority of the time..
>  

Right. But the response entity is not retrieved in the #execute method.
#execute reads the response head, constructs an HTTP entity bound to the
underlying HTTP connection and returns without reading the entity.


> I understand that some of the overhead might be related to t3 being
> persistent and http not really (having to do the stale check, etc.. ).
>  
> But the difference is pretty significant… 2000 invocations on t3 take
> 5300ms, 2000 invocations on http take 30000ms..
> 14 of those 30 seconds are spent inside of httpclient (and that’s just
> CPU time).. 
>  

Make sure you have the stale connection check disabled. It usually costs
50ms per request invocation. It is on by default.

[[-KenBrooks-]] I have it configured via an external property file that I can turn the stale check on and off.
It does make a marginal difference, but not a significant one.
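
For context on why a stale check has a per-request cost: conceptually it is a very short blocking read on the pooled connection before the request goes out. The following JDK-only sketch shows that idea. It is not HttpClient's actual code; the class and method names are made up, and the 1 ms timeout is an assumption chosen for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class StaleCheckSketch {

    /** Returns true if the peer appears to have closed the connection. */
    static boolean looksStale(Socket socket) {
        if (socket.isClosed() || !socket.isConnected()) {
            return true;
        }
        try {
            int oldTimeout = socket.getSoTimeout();
            try {
                socket.setSoTimeout(1); // block for roughly 1 ms at most
                // -1 means the peer closed. A real implementation would read into
                // the connection's buffer so that an unexpected byte is not lost.
                return socket.getInputStream().read() == -1;
            } catch (SocketTimeoutException stillOpen) {
                return false;           // nothing to read, connection still open
            } finally {
                socket.setSoTimeout(oldTimeout);
            }
        } catch (IOException broken) {
            return true;
        }
    }

    /** Checks a local connection before and after the server side drops it. */
    static boolean[] demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            boolean before = looksStale(client);
            accepted.close();           // server drops the idle connection
            Thread.sleep(200);          // give the close time to reach the client
            boolean after = looksStale(client);
            return new boolean[] { before, after };
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("stale before close: " + result[0]);
        System.out.println("stale after close:  " + result[1]);
    }
}
```

When the server does not drop idle connections between calls, that extra read is pure overhead on every request, which is the reason for trying with the check disabled.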


> What in the world am I missing?
> Is that just to be expected.. or am I doing something wrong?
>  
> What further information do you need from me?

A context log of the HTTP session would be most useful. 

[[-KenBrooks-]] Ok.. I'll turn the logging back on (I had it turned off because previously I was running httpclient trace and that in itself was obviously a huge performance hit..)  :) 

Are you using multiple threads to execute requests concurrently, or are all
2000 requests executed sequentially?

[[-KenBrooks-]] I am using the multithreading implementation; however, in this scenario it is just a sequential invocation.
It is just simulating a single client machine making 2000 sequential calls.
Both t3 and http are using the same client test runner.. so it is the same loop invoking the code..
The only difference is when the actual call is made our client libraries decide to send it via t3 or http (depending upon the url prefix).
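
A runner of the kind described might look roughly like the sketch below: one loop, with the transport picked from the URL prefix. Everything in it (the Transport interface, the stub transports, the URLs and names) is hypothetical and only stands in for the real client libraries.

```java
final class SequentialRunner {

    /** Hypothetical abstraction over the two transports. */
    interface Transport {
        Object invoke(String url, Object request);
    }

    /** Picks the transport from the URL prefix. */
    static Transport select(String url, Transport t3, Transport http) {
        if (url.startsWith("t3://")) {
            return t3;
        }
        if (url.startsWith("http://") || url.startsWith("https://")) {
            return http;
        }
        throw new IllegalArgumentException("Unsupported URL prefix: " + url);
    }

    /** Runs the given number of sequential calls and returns the elapsed nanoseconds. */
    static long run(String url, int calls, Transport t3, Transport http) {
        Transport transport = select(url, t3, http);
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            transport.invoke(url, "request-" + i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Stub transports that just echo the request; real ones would go over the wire.
        Transport t3Stub = (url, request) -> request;
        Transport httpStub = (url, request) -> request;
        long nanos = run("http://example.invalid/service", 2000, t3Stub, httpStub);
        System.out.println("2000 sequential calls took " + (nanos / 1_000_000.0) + " ms");
    }
}
```

Because the loop and the timing are shared, any difference between the two runs comes from the transport behind the prefix.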


If the transport you are developing does not require authentication,
HTTP state management and complex connection pooling, one possibility to
speed things up by approximately 40-50% would be using HttpCore directly
without all those additional services provided by HttpClient 

[[-KenBrooks-]] Very interesting thought.. No we don't need any authentication or http state management, cookies (which I disabled by removing the cookie interceptors)
The only things that we really need that I can think of are the 
* stale check 
* idleconnectionmanagerthread (which is my class) but as long as the ConnectionManager it uses is in httpcore then we might be good.
* socket timeout
* connection timeout
* tcp nodelay

I think a good many of those (at least the bottom 3) are all underlying protocol settings so are probably in core.. not sure about stale check and the connectionmanager.
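
The bottom three items do map onto plain java.net.Socket options, so they do not depend on which layer opens the connection. A JDK-only sketch follows; the class name, method names and timeout values are illustrative assumptions.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class SocketSettingsSketch {

    /** Opens a socket with the three settings from the list above applied. */
    static Socket open(InetSocketAddress target, int connectTimeoutMs, int soTimeoutMs)
            throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(target, connectTimeoutMs); // connection timeout
            socket.setSoTimeout(soTimeoutMs);         // socket timeout: max wait per blocking read
            socket.setTcpNoDelay(true);               // tcp nodelay: disable Nagle's algorithm
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Connects to a throwaway local listener and reports whether the options took effect. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = open(
                     new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                     2_000, 5_000)) {
            return socket.getTcpNoDelay() && socket.getSoTimeout() == 5_000;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("options applied: " + demo());
    }
}
```

In the 4.0 line, HttpCore's HttpConnectionParams class exposes setters for all three (setSoTimeout, setConnectionTimeout, setTcpNoDelay) and also for the stale check (setStaleCheckingEnabled).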

Are there interceptors (like the cookie interceptors) that I can disable for authentication & state management?
That would at least be a good test that circumventing client and going to core might make a difference.


Hope this helps

Oleg




Re: httpclient performance

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2010-07-18 at 02:04 -0400, Brooks, Kenneth S wrote:
> Oleg,
>  
> We’re replacing a t3 solution with one that is similar to
> SpringRemoting.. Serializing POJOs over http.
>  
> I’ve been able to optimize the CPU utilization and memory usage very
> well.. We’re running in a smaller memory footprint and less CPU than
> on t3.
>  
> The one thing I’m struggling with is the actual response times.
> We’re about 2-6x slower over http.
>  
> Keep in mind, the service itself can serve both t3 and http so it’s not
> an issue of the serverside.. it is responding consistently.
>  
> One thing that I’m focusing in on is what JProfiler is telling me
> about httpclient.
>  
> Please see the attached html file.. it’s a Call tree output showing
> the CPU time for the call tree.
> It shows that 60% of my call time is buried in httpclient.execute.
> I would expect that, but what I don’t get is that when I drill down to
> doSendRequest and doReceiveResponse I find out that the majority of the
> time is eaten up by sendRequestHeader and receiveResponseHeader.. I
> would have thought that retrieving the entity would consume the
> majority of the time..
>  

Right. But the response entity is not retrieved in the #execute method.
#execute reads the response head, constructs an HTTP entity bound to the
underlying HTTP connection and returns without reading the entity.
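
The same split between "response head received" and "entity read" can be observed with nothing but the JDK. The sketch below uses HttpURLConnection as a stand-in for HttpClient (an assumption made purely so the example has no dependencies) against a throwaway local server that sends the head, waits, and only then sends the body. All names are made up.

```java
import java.io.BufferedReader;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class HeadVsEntityTiming {

    /** Result of one timed request. */
    static final class Timing {
        final int status;
        final String body;
        final long headNanos;
        final long entityNanos;

        Timing(int status, String body, long headNanos, long entityNanos) {
            this.status = status;
            this.body = body;
            this.headNanos = headNanos;
            this.entityNanos = entityNanos;
        }
    }

    /** One-shot server: sends the response head, waits, then sends the body. */
    private static void serveOnce(ServerSocket server, String body, long delayMillis) {
        try (Socket socket = server.accept()) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // skip the request head
            }
            byte[] payload = body.getBytes(StandardCharsets.UTF_8);
            OutputStream out = socket.getOutputStream();
            out.write(("HTTP/1.1 200 OK\r\nContent-Length: " + payload.length
                    + "\r\nConnection: close\r\n\r\n").getBytes(StandardCharsets.ISO_8859_1));
            out.flush();
            Thread.sleep(delayMillis); // simulate a slow entity
            out.write(payload);
            out.flush();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Times the response head and the entity separately. */
    static Timing measure(String body, long delayMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            Thread serverThread = new Thread(() -> serveOnce(server, body, delayMillis));
            serverThread.start();

            URL url = URI.create("http://127.0.0.1:" + server.getLocalPort() + "/").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);

            long t0 = System.nanoTime();
            int status = conn.getResponseCode();           // returns once the head is parsed
            long t1 = System.nanoTime();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (InputStream in = conn.getInputStream()) { // the entity is read only here
                byte[] chunk = new byte[4096];
                for (int n; (n = in.read(chunk)) != -1; ) {
                    buffer.write(chunk, 0, n);
                }
            }
            long t2 = System.nanoTime();
            serverThread.join();
            return new Timing(status, new String(buffer.toByteArray(), StandardCharsets.UTF_8),
                    t1 - t0, t2 - t1);
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Timing t = measure("hello", 400);
        System.out.println("head:   " + t.headNanos / 1_000_000.0 + " ms");
        System.out.println("entity: " + t.entityNanos / 1_000_000.0 + " ms");
    }
}
```

With HttpClient the corresponding boundary is the return of #execute: time spent reading the entity only shows up where the application later consumes the entity's content, so a profile of #execute alone will not include it.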


> I understand that some of the overhead might be related to t3 being
> persistent and http not really (having to do the stale check, etc.. ).
>  
> But the difference is pretty significant… 2000 invocations on t3 take
> 5300ms, 2000 invocations on http take 30000ms..
> 14 of those 30 seconds are spent inside of httpclient (and that’s just
> CPU time).. 
>  

Make sure you have the stale connection check disabled. It usually costs
50ms per request invocation. It is on by default.


> What in the world am I missing?
> Is that just to be expected.. or am I doing something wrong?
>  
> What further information do you need from me?

A context log of the HTTP session would be most useful. 

Are you using multiple threads to execute requests concurrently, or are all
2000 requests executed sequentially?

If the transport you are developing does not require authentication,
HTTP state management and complex connection pooling, one possibility to
speed things up by approximately 40-50% would be using HttpCore directly
without all those additional services provided by HttpClient 

Hope this helps

Oleg


