Posted to httpclient-users@hc.apache.org by Sam Perman <sa...@permans.com> on 2012/07/03 15:55:34 UTC

performance issue with CachingHttpClient

Hello

We're using the CachingHttpClient and are seeing a spike in CPU usage when
it is enabled. We've profiled our application and see that most of the time
is being spent parsing dates. Specifically, it is trying to get the age of
a cache entry on a cache hit by parsing the "Date" header on the
HttpCacheEntry. I have a couple of questions:

1) Why can't this use the responseDate value that lives on HttpCacheEntry?
(This would avoid the overhead of parsing)
2) If it needs to parse, is it possible to remember the result on the
HttpCacheEntry so it doesn't need to be parsed every time?

We are using version 4.2

thanks for any advice
sam

ps - Here is the full backtrace we are seeing:

org.apache.http.impl.cookie.DateUtils.parseDate(String)

org.apache.http.impl.client.cache.CacheValidityPolicy.getDateValue(HttpCacheEntry)

org.apache.http.impl.client.cache.CacheValidityPolicy.getApparentAgeSecs(HttpCacheEntry)

org.apache.http.impl.client.cache.CacheValidityPolicy.getCorrectedReceivedAgeSecs(HttpCacheEntry)

org.apache.http.impl.client.cache.CacheValidityPolicy.getCorrectedInitialAgeSecs(HttpCacheEntry)

org.apache.http.impl.client.cache.CacheValidityPolicy.getCurrentAgeSecs(HttpCacheEntry, Date)

There are a couple of callers of "getCurrentAgeSecs":

CacheValidityPolicy.isResponseFresh(HttpCacheEntry, Date)
  CachedResponseSuitabilityChecker.isFreshEnough(HttpCacheEntry, HttpRequest, Date)
    CachedResponseSuitabilityChecker.canCachedResponseBeUsed(HttpHost, HttpRequest, HttpCacheEntry, Date)
      CachingHttpClient.handleCacheHit(HttpHost, HttpRequest, HttpContext, HttpCacheEntry)

CachedHttpResponseGenerator.generateResponse(HttpCacheEntry)
  CachingHttpClient.generateCachedResponse(HttpRequest, HttpContext, HttpCacheEntry, Date)
    CachingHttpClient.handleCacheHit(HttpHost, HttpRequest, HttpContext, HttpCacheEntry)


Looking at the code, this section from CachingHttpClient.handleCacheHit
appears to result in parsing the date twice (apologies if I'm misreading
this):

        if (suitabilityChecker.canCachedResponseBeUsed(target, request, entry, now)) {
            return generateCachedResponse(request, context, entry, now);
        }

Both the call to "canCachedResponseBeUsed" and the call to
"generateCachedResponse" will ultimately call "getCurrentAgeSecs" and
parse the Date header.
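
For illustration, idea (2) above (remembering the parsed result) could look
roughly like the sketch below. It assumes a hypothetical MemoizedDateEntry
class standing in for HttpCacheEntry, and it parses with java.time's
RFC_1123_DATE_TIME formatter rather than HttpClient's DateUtils, so it is not
how the library itself is written. The parseCount field exists only to show
that the header is parsed once.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-in for a cache entry; NOT the real HttpCacheEntry API.
final class MemoizedDateEntry {
    private final String dateHeader; // raw "Date" header value, may be null
    private boolean parsed;          // true once a parse has been attempted
    private Instant dateValue;       // parsed result, or null if absent/invalid
    private int parseCount;          // illustration only: counts parse attempts

    MemoizedDateEntry(String dateHeader) {
        this.dateHeader = dateHeader;
    }

    // Parses the header on first use and returns the remembered result afterwards.
    synchronized Instant getDate() {
        if (!parsed) {
            parsed = true;
            parseCount++;
            if (dateHeader != null) {
                try {
                    dateValue = ZonedDateTime
                            .parse(dateHeader, DateTimeFormatter.RFC_1123_DATE_TIME)
                            .toInstant();
                } catch (DateTimeParseException e) {
                    dateValue = null; // treat an unparseable header as missing
                }
            }
        }
        return dateValue;
    }

    synchronized int getParseCount() {
        return parseCount;
    }
}
```

With something like this, both the suitability check and the response
generation could call getDate() on the same entry and only the first call
would pay for the parse.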

Re: performance issue with CachingHttpClient

Posted by Sam Perman <sa...@permans.com>.
>
> Sam
>
> Could you please raise a JIRA for this issue?
>
> Oleg
>

No problem: https://issues.apache.org/jira/browse/HTTPCLIENT-1213

thanks

Re: performance issue with CachingHttpClient

Posted by Oleg Kalnichevski <ol...@apache.org>.

Sam

Could you please raise a JIRA for this issue?

Oleg




Re: performance issue with CachingHttpClient

Posted by Jon Moore <jo...@apache.org>.
Hi Sam,

The client can't use the responseDate on the HttpCacheEntry because it
isn't the same thing as the value of the Date header from the server
response. The responseDate is the (local) time at which the response was
received. If you look at the RFC 2616 specification for how to calculate the
age of an entry, it treats this value distinctly from the Date header value
on the response.
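
As a rough sketch of that calculation (RFC 2616, section 13.2.3), with all
times as seconds since the epoch: the class and parameter names below are made
up for illustration and are not the HttpClient API. Here dateValue is the
parsed Date header, ageValue is the Age header (0 if absent), and responseTime
plays the role of responseDate.

```java
// Sketch of the RFC 2616 section 13.2.3 age calculation.
// Hypothetical helper for illustration; NOT part of HttpClient.
final class AgeCalculation {
    private AgeCalculation() {}

    static long currentAgeSecs(long dateValue, long ageValue,
                               long requestTime, long responseTime, long now) {
        // How old the response looked on arrival, judged by the origin's Date header.
        long apparentAge = Math.max(0L, responseTime - dateValue);
        // Trust whichever is larger: our estimate or the Age header from upstream caches.
        long correctedReceivedAge = Math.max(apparentAge, ageValue);
        // Time the request/response exchange took on the network.
        long responseDelay = responseTime - requestTime;
        long correctedInitialAge = correctedReceivedAge + responseDelay;
        // Time the entry has been sitting in the local cache.
        long residentTime = now - responseTime;
        return correctedInitialAge + residentTime;
    }
}
```

For example, with dateValue=1000, ageValue=0, requestTime=1003,
responseTime=1005 and now=1065 this gives 67 seconds. Substituting
responseTime for dateValue would drop the apparent age to 0 and give 62, which
is why the two values can't be treated as interchangeable.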

That said, there is probably some benefit to not parsing this over and over
again. Please open an Improvement ticket on the HttpClient JIRA so we can
track this:
https://issues.apache.org/jira/browse/HTTPCLIENT

And, as always, patches welcome. :)

Jon
