Posted to httpclient-users@hc.apache.org by Michael Osipov <mi...@apache.org> on 2015/05/23 22:09:57 UTC

Cannot saturate LAN connection with HttpClient

Hi,

we are experiencing a (slight) performance problem with HttpClient 4.4.1 
while downloading big files from a remote server in the corporate intranet.

A simple test client:
HttpClientBuilder builder = HttpClientBuilder.create();
try (CloseableHttpClient client = builder.build()) {
   HttpGet get = new HttpGet("...");
   long start = System.nanoTime();
   HttpResponse response = client.execute(get);
   HttpEntity entity = response.getEntity();

   File file = File.createTempFile("prefix", null);
   try (OutputStream os = new FileOutputStream(file)) {
       entity.writeTo(os); // stream the entity to disk; the stream is auto-closed
   }
   long stop = System.nanoTime();
   long contentLength = file.length();

   long diff = stop - start;
   System.out.printf("Duration: %d ms%n", 
TimeUnit.NANOSECONDS.toMillis(diff));
   System.out.printf("Size: %d%n", contentLength);

   float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);

   System.out.printf("Speed: %.2f MB/s%n", speed);
}
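The speed arithmetic in the snippet reduces to bytes per nanosecond times 1000, which gives MB/s with MB = 10^6 bytes. A small helper makes the conversion explicit and sanity-checks the figures quoted in this thread:

```java
public class Throughput {
    // bytes / elapsedNanos gives bytes per nanosecond; multiplying by 1000
    // converts to MB/s, since 1e9 ns/s divided by 1e6 bytes/MB is 1000.
    static double megabytesPerSecond(long bytes, long elapsedNanos) {
        return bytes / (double) elapsedNanos * 1_000d;
    }

    public static void main(String[] args) {
        // 182 MB in 24 000 ms works out to roughly 7.6 MB/s,
        // matching the "about 8 MB/s max" observation.
        System.out.printf("%.2f MB/s%n",
                megabytesPerSecond(182_000_000L, 24_000_000_000L));
    }
}
```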

After at least 10 repetitions I see that the 182 MB file is downloaded 
within 24 000 ms, at about 8 MB/s max. I cannot top that.

I have tried this over and over again with curl and see that curl is 
able to saturate the entire LAN connection (100 Mbit/s).

My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.

Any idea what the bottleneck might be?

Downloading the same file locally with HttpClient gives me 180 MB/s, so 
this should not be a problem with the client itself.

Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org


Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <19...@gmx.net>.
On 2015-05-25 at 00:53, Oleg Kalnichevski wrote:
> On May 25, 2015 12:23:39 AM GMT+02:00, Michael Osipov <mi...@apache.org> wrote:
>> On 2015-05-24 at 14:25, Oleg Kalnichevski wrote:
>>> [...]
>>>>> It all sounds very bizarre. I see no reason why HttpAsyncClient
>> without
>>>>> zero copy transfer should do any better than HttpClient in this
>>>>> scenario.
>>>>
>>>> So you are saying something is probably wrong with my client setup?
>>>>
>>>
>>> I think it is not unlikely.
>>
>> Guess what, I have changed the sample client to use HttpURLConnection.
>> It instantly saturated the entire connection. Now I am completely
>> lost...
>>
>> I have tried 4.5 from branches, to no avail. Then I have started to play
>> around with the builder options and disabled content compression.
>> Performance was off the charts. Full saturation: 11.4 MB/s.
>>
>> That made me wonder, I have redone the testing:
>>
>> 1. Curl from a server with gigabit connection topped 30 MB/s.
>> 2. Curl again but this time with Accept-Encoding. The speed drops to 7
>> MB/s. Server is sending chunks.
>>
>> Compared to curl, HC does a slightly better job when it comes to
>> compressed files.
>>
>> Looking at the source code and Javadocs, I do not see any option to
>> disable this by RequestConfig or via HttpContext attribute for this
>> single request. RequestContentEncoding class does not use HttpContext
>> variable in #process().
>>
>> I am about to file an issue for that. WDYT?
>>
> Ah. Good catch. Makes sense. Please raise a Jira with a change request. I'll have to cancel RC1 vote, unfortunately, if we want this included in 4.5 release.

Just did: https://issues.apache.org/jira/browse/HTTPCLIENT-1651
It'd be great to have this in 4.5.

Thanks again for your guidance.

Michael




Re: Cannot saturate LAN connection with HttpClient

Posted by Oleg Kalnichevski <ol...@apache.org>.
On May 25, 2015 12:23:39 AM GMT+02:00, Michael Osipov <mi...@apache.org> wrote:
>On 2015-05-24 at 14:25, Oleg Kalnichevski wrote:
>> [...]
>>>> It all sounds very bizarre. I see no reason why HttpAsyncClient
>without
>>>> zero copy transfer should do any better than HttpClient in this
>>>> scenario.
>>>
>>> So you are saying something is probably wrong with my client setup?
>>>
>>
>> I think it is not unlikely.
>
>Guess what, I have changed the sample client to use HttpURLConnection. 
>It instantly saturated the entire connection. Now I am completely
>lost...
>
>I have tried 4.5 from branches, to no avail. Then I have started to play 
>around with the builder options and disabled content compression. 
>Performance was off the charts. Full saturation: 11.4 MB/s.
>
>That made me wonder, I have redone the testing:
>
>1. Curl from a server with gigabit connection topped 30 MB/s.
>2. Curl again but this time with Accept-Encoding. The speed drops to 7 
>MB/s. Server is sending chunks.
>
>Compared to curl, HC does a slightly better job when it comes to 
>compressed files.
>
>Looking at the source code and Javadocs, I do not see any option to 
>disable this by RequestConfig or via HttpContext attribute for this 
>single request. RequestContentEncoding class does not use HttpContext 
>variable in #process().
>
>I am about to file an issue for that. WDYT?
>
Ah. Good catch. Makes sense. Please raise a Jira with a change request. I'll have to cancel RC1 vote, unfortunately, if we want this included in 4.5 release.

Oleg

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <mi...@apache.org>.
On 2015-05-24 at 14:25, Oleg Kalnichevski wrote:
> [...]
>>> It all sounds very bizarre. I see no reason why HttpAsyncClient without
>>> zero copy transfer should do any better than HttpClient in this
>>> scenario.
>>
>> So you are saying something is probably wrong with my client setup?
>>
>
> I think it is not unlikely.

Guess what: I have changed the sample client to use HttpURLConnection. 
It instantly saturated the entire connection. Now I am completely lost...
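For comparison, a minimal HttpURLConnection download loop looks roughly like this (URL and file name are placeholders). One plausible explanation for the gap is that the JDK client sends no Accept-Encoding header by default, so the server returns the raw, uncompressed entity:

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class UrlConnectionDownload {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; HttpURLConnection does not request compression
        // by default, so no gzip/chunked decoding happens on the client.
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://server.example/big-file.bin").openConnection();
        try (InputStream in = conn.getInputStream();
             OutputStream out = new FileOutputStream("big-file.bin")) {
            byte[] buffer = new byte[8192];
            int n;
            // Plain blocking copy loop, comparable to HttpEntity#writeTo
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```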

I have tried 4.5 from branches, to no avail. Then I started to play 
around with the builder options and disabled content compression. 
Performance was off the charts. Full saturation: 11.4 MB/s.

That made me wonder, I have redone the testing:

1. Curl from a server with a gigabit connection topped out at 30 MB/s.
2. Curl again, but this time with Accept-Encoding. The speed drops to 7 
MB/s. The server is sending chunks.
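The two measurements can be reproduced with curl's `--compressed` flag, which adds the Accept-Encoding header much like HttpClient does by default (the URL is a placeholder):

```shell
# Raw transfer vs. compressed (chunked) transfer of the same resource
curl -o /dev/null -w 'raw:  %{speed_download} bytes/s\n' http://server.example/big-file.bin
curl -o /dev/null -w 'gzip: %{speed_download} bytes/s\n' --compressed http://server.example/big-file.bin
```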

Compared to curl, HttpClient does a slightly better job when it comes to 
compressed files.

Looking at the source code and Javadocs, I do not see any option to 
disable this via RequestConfig or an HttpContext attribute for a 
single request. The RequestContentEncoding class does not consult the 
HttpContext in #process().
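Until a per-request switch exists, compression can be disabled for the whole client at build time. A minimal sketch against the 4.x builder API:

```java
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;

public class NoCompressionClient {
    public static CloseableHttpClient create() {
        // disableContentCompression() removes the Accept-Encoding request
        // interceptor, so the server sends the raw entity (no gzip, and no
        // client-side decompression overhead).
        return HttpClientBuilder.create()
                .disableContentCompression()
                .build();
    }
}
```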

I am about to file an issue for that. WDYT?

Michael




Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <mi...@apache.org>.
On 2015-05-24 at 14:25, Oleg Kalnichevski wrote:
> On Sun, 2015-05-24 at 13:02 +0200, Michael Osipov wrote:
>> On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
>>> On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
>>>> On 2015-05-23 at 22:29, Oleg Kalnichevski wrote:
>>>>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>>>>>> while downloading big files from a remote server in the corporate intranet.
>>>>>>
>>>>>> A simple test client:
>>>>>> HttpClientBuilder builder = HttpClientBuilder.create();
>>>>>> try (CloseableHttpClient client = builder.build()) {
>>>>>>       HttpGet get = new HttpGet("...");
>>>>>>       long start = System.nanoTime();
>>>>>>       HttpResponse response = client.execute(get);
>>>>>>       HttpEntity entity = response.getEntity();
>>>>>>
>>>>>>       File file = File.createTempFile("prefix", null);
>>>>>>       OutputStream os = new FileOutputStream(file);
>>>>>>       entity.writeTo(os);
>>>>>>       long stop = System.nanoTime();
>>>>>>       long contentLength = file.length();
>>>>>>
>>>>>>       long diff = stop - start;
>>>>>>       System.out.printf("Duration: %d ms%n",
>>>>>> TimeUnit.NANOSECONDS.toMillis(diff));
>>>>>>       System.out.printf("Size: %d%n", contentLength);
>>>>>>
>>>>>>       float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>>>>>
>>>>>>       System.out.printf("Speed: %.2f MB/s%n", speed);
>>>>>> }
>>>>>>
>>>>>> After at least 10 repetitions I see that the 182 MB file is downloaded
>>>>>> within 24 000 ms with about 8 MB/s max. I cannot top that.
>>>>>>
>>>>>> I have tried this over and over again with curl and see that curl is
>>>>>> able to saturate the entire LAN connection (100 Mbit/s).
>>>>>>
>>>>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>>>>>
>>>>>> Any idea what the bottleneck might be?
>>>>
>>>> Thanks for the quick response.
>>>>
>>>>> (1) Curl should be using zero copy file transfer which Java blocking i/o
>>>>> does not support. HttpAsyncClient on the other hand supports zero copy
>>>>> file transfer and generally tends to perform better when writing content
>>>>> out directly to the disk.
>>>>
>>>> I did try this [1] example and my heap exploded. After increasing it to
>>>> -Xmx1024M, it did saturate the entire connection.
>>>>
>>>
>>> This sounds wrong. The example below does not use zero copy (with zero
>>> copy there should be no heap memory allocation at all).
>>>
>>> This example demonstrates how to use zero copy file transfer
>>>
>>> http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
>>
>> I have seen this example but there is no ZeroCopyGet. I haven't found
>> any example which explicitly says use zero-copy for GETs. The example
>> from [1] did work, but with the heap explosion. What did I do wrong here?
>>
>
> Zero copy can be employed only if a message encloses an entity in it.
> Therefore there is no such thing as ZeroCopyGet in HC. One can execute a
> normal GET request and use a ZeroCopyConsumer to stream content out
> directly to a file without any intermediate buffering in memory.

OK, that has confirmed my assumptions.

>>>>> (2) Use larger socket / intermediate buffers. Default buffer size used
>>>>> by Entity implementations is most likely suboptimal.
>>>>
>>>> That did not make any difference. I have changed:
>>>>
>>>> 1. Socket receive size
>>>> 2. Employed a buffered input stream
>>>> 3. Manually copied the stream to a file
>>>>
>>>> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
>>>> Regardless of this, your tip with zero copy helped me a lot.
>>>>
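[Point 1 above, the socket receive size, can be set through SocketConfig; a sketch against the 4.4 API, with the buffer size chosen arbitrarily:]

```java
import org.apache.http.config.SocketConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;

public class BigBufferClient {
    public static CloseableHttpClient create() {
        // Request a 128 KiB socket receive buffer (the OS may clamp this);
        // SocketConfig.Builder#setRcvBufSize is available since HttpCore 4.4.
        SocketConfig socketConfig = SocketConfig.custom()
                .setRcvBufSize(1 << 17)
                .build();
        return HttpClientBuilder.create()
                .setDefaultSocketConfig(socketConfig)
                .build();
    }
}
```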
>>>> Unfortunately, this is just a little piece in a performance degradation
>>>> chain a colleague has figured out. HttpClient acts as an intermediary in
>>>> a webapp which receives a request via REST from a client, processes it,
>>>> and opens up the stream to the huge files from a remote server. Without
>>>> caching the files to disk, I am passing the Entity#getContent stream
>>>> back to the client. The degradation is about 75 %.
>>>>
>>>> After rethinking your tips, I just checked the servers I am pulling
>>>> data from. One is slow, the other one is fast. Transfer speed when piping
>>>> streams from the fast server remains at 8 MB/s, which is what I wanted
>>>> after I had identified an issue with my custom HttpResponseInputStream.
>>>>
>>>> I modified my code to use the async client and it seems to pipe with
>>>> maximum LAN speed though it looks weird with curl now. Curl blocks for
>>>> 15 seconds and within a second the entire stream is written down to disk.
>>>>
>>>
>>> It all sounds very bizarre. I see no reason why HttpAsyncClient without
>>> zero copy transfer should do any better than HttpClient in this
>>> scenario.
>>
>> So you are saying something is probably wrong with my client setup?
>>
>
> I think it is not unlikely.

Assuming I'd have the time to investigate that, I currently have no 
idea where to start looking for the mismatch.




Re: Cannot saturate LAN connection with HttpClient

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2015-05-24 at 13:02 +0200, Michael Osipov wrote:
> On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
> > On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
> >> On 2015-05-23 at 22:29, Oleg Kalnichevski wrote:
> >>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> >>>> Hi,
> >>>>
> >>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> >>>> while downloading big files from a remote server in the corporate intranet.
> >>>>
> >>>> A simple test client:
> >>>> HttpClientBuilder builder = HttpClientBuilder.create();
> >>>> try (CloseableHttpClient client = builder.build()) {
> >>>>      HttpGet get = new HttpGet("...");
> >>>>      long start = System.nanoTime();
> >>>>      HttpResponse response = client.execute(get);
> >>>>      HttpEntity entity = response.getEntity();
> >>>>
> >>>>      File file = File.createTempFile("prefix", null);
> >>>>      OutputStream os = new FileOutputStream(file);
> >>>>      entity.writeTo(os);
> >>>>      long stop = System.nanoTime();
> >>>>      long contentLength = file.length();
> >>>>
> >>>>      long diff = stop - start;
> >>>>      System.out.printf("Duration: %d ms%n",
> >>>> TimeUnit.NANOSECONDS.toMillis(diff));
> >>>>      System.out.printf("Size: %d%n", contentLength);
> >>>>
> >>>>      float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> >>>>
> >>>>      System.out.printf("Speed: %.2f MB/s%n", speed);
> >>>> }
> >>>>
> >>>> After at least 10 repetitions I see that the 182 MB file is downloaded
> >>>> within 24 000 ms with about 8 MB/s max. I cannot top that.
> >>>>
> >>>> I have tried this over and over again with curl and see that curl is
> >>>> able to saturate the entire LAN connection (100 Mbit/s).
> >>>>
> >>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> >>>>
> >>>> Any idea what the bottleneck might be?
> >>
> >> Thanks for the quick response.
> >>
> >>> (1) Curl should be using zero copy file transfer which Java blocking i/o
> >>> does not support. HttpAsyncClient on the other hand supports zero copy
> >>> file transfer and generally tends to perform better when writing content
> >>> out directly to the disk.
> >>
> >> I did try this [1] example and my heap exploded. After increasing it to
> >> -Xmx1024M, it did saturate the entire connection.
> >>
> >
> > This sounds wrong. The example below does not use zero copy (with zero
> > copy there should be no heap memory allocation at all).
> >
> > This example demonstrates how to use zero copy file transfer
> >
> > http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
> 
> I have seen this example but there is no ZeroCopyGet. I haven't found 
> any example which explicitly says use zero-copy for GETs. The example 
> from [1] did work, but with the heap explosion. What did I do wrong here?
> 

Zero copy can be employed only if a message encloses an entity in it.
Therefore there is no such thing as ZeroCopyGet in HC. One can execute a
normal GET request and use a ZeroCopyConsumer to stream content out
directly to a file without any intermediate buffering in memory.
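Oleg's description can be sketched as follows (URL and file name are placeholders); the request is an ordinary GET, only the response consumer differs:

```java
import java.io.File;
import java.util.concurrent.Future;

import org.apache.http.HttpResponse;
import org.apache.http.entity.ContentType;
import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;
import org.apache.http.nio.client.methods.HttpAsyncMethods;
import org.apache.http.nio.client.methods.ZeroCopyConsumer;

public class ZeroCopyGetSketch {
    public static void main(String[] args) throws Exception {
        try (CloseableHttpAsyncClient client = HttpAsyncClients.createDefault()) {
            client.start();
            // Plain GET request; ZeroCopyConsumer streams the response body
            // straight to the file channel without intermediate heap buffers.
            Future<File> future = client.execute(
                    HttpAsyncMethods.createGet("http://server.example/big-file.bin"),
                    new ZeroCopyConsumer<File>(new File("big-file.bin")) {
                        @Override
                        protected File process(HttpResponse response, File file,
                                ContentType contentType) {
                            return file; // content is already on disk
                        }
                    }, null);
            File result = future.get();
            System.out.println("Saved " + result.length() + " bytes");
        }
    }
}
```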


> >>> (2) Use larger socket / intermediate buffers. Default buffer size used
> >>> by Entity implementations is most likely suboptimal.
> >>
> >> That did not make any difference. I have changed:
> >>
> >> 1. Socket receive size
> >> 2. Employed a buffered input stream
> >> 3. Manually copied the stream to a file
> >>
> >> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
> >> Regardless of this, your tip with zero copy helped me a lot.
> >>
> >> Unfortunately, this is just a little piece in a performance degradation
> >> chain a colleague has figured out. HttpClient acts as an intermediary in
> >> a webapp which receives a request via REST from a client, processes it,
> >> and opens up the stream to the huge files from a remote server. Without
> >> caching the files to disk, I am passing the Entity#getContent stream
> >> back to the client. The degradation is about 75 %.
> >>
> >> After rethinking your tips, I just checked the servers I am pulling
> >> data from. One is slow, the other one is fast. Transfer speed when piping
> >> streams from the fast server remains at 8 MB/s, which is what I wanted
> >> after I had identified an issue with my custom HttpResponseInputStream.
> >>
> >> I modified my code to use the async client and it seems to pipe with
> >> maximum LAN speed though it looks weird with curl now. Curl blocks for
> >> 15 seconds and within a second the entire stream is written down to disk.
> >>
> >
> > It all sounds very bizarre. I see no reason why HttpAsyncClient without
> > zero copy transfer should do any better than HttpClient in this
> > scenario.
> 
> So you are saying something is probably wrong with my client setup?
> 

I think it is not unlikely.

Oleg





Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <mi...@apache.org>.
On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
> On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
>> On 2015-05-23 at 22:29, Oleg Kalnichevski wrote:
>>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>>>> Hi,
>>>>
>>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>>>> while downloading big files from a remote server in the corporate intranet.
>>>>
>>>> A simple test client:
>>>> HttpClientBuilder builder = HttpClientBuilder.create();
>>>> try (CloseableHttpClient client = builder.build()) {
>>>>      HttpGet get = new HttpGet("...");
>>>>      long start = System.nanoTime();
>>>>      HttpResponse response = client.execute(get);
>>>>      HttpEntity entity = response.getEntity();
>>>>
>>>>      File file = File.createTempFile("prefix", null);
>>>>      OutputStream os = new FileOutputStream(file);
>>>>      entity.writeTo(os);
>>>>      long stop = System.nanoTime();
>>>>      long contentLength = file.length();
>>>>
>>>>      long diff = stop - start;
>>>>      System.out.printf("Duration: %d ms%n",
>>>> TimeUnit.NANOSECONDS.toMillis(diff));
>>>>      System.out.printf("Size: %d%n", contentLength);
>>>>
>>>>      float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>>>
>>>>      System.out.printf("Speed: %.2f MB/s%n", speed);
>>>> }
>>>>
>>>> After at least 10 repetitions I see that the 182 MB file is downloaded
>>>> within 24 000 ms with about 8 MB/s max. I cannot top that.
>>>>
>>>> I have tried this over and over again with curl and see that curl is
>>>> able to saturate the entire LAN connection (100 Mbit/s).
>>>>
>>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>>>
>>>> Any idea what the bottleneck might be?
>>
>> Thanks for the quick response.
>>
>>> (1) Curl should be using zero copy file transfer which Java blocking i/o
>>> does not support. HttpAsyncClient on the other hand supports zero copy
>>> file transfer and generally tends to perform better when writing content
>>> out directly to the disk.
>>
>> I did try this [1] example and my heap exploded. After increasing it to
>> -Xmx1024M, it did saturate the entire connection.
>>
>
> This sounds wrong. The example below does not use zero copy (with zero
> copy there should be no heap memory allocation at all).
>
> This example demonstrates how to use zero copy file transfer
>
> http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java

I have seen this example, but there is no ZeroCopyGet. I haven't found 
any example which explicitly shows zero copy for GETs. The example 
from [1] did work, but with the heap explosion. What did I do wrong here?

>>> (2) Use larger socket / intermediate buffers. Default buffer size used
>>> by Entity implementations is most likely suboptimal.
>>
>> That did not make any difference. I have changed:
>>
>> 1. Socket receive size
>> 2. Employed a buffered input stream
>> 3. Manually copied the stream to a file
>>
>> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
>> Regardless of this, your tip with zero copy helped me a lot.
>>
>> Unfortunately, this is just a little piece in a performance degradation
>> chain a colleague has figured out. HttpClient acts as an intermediary in
>> a webapp which receives a request via REST from a client, processes it,
>> and opens up the stream to the huge files from a remote server. Without
>> caching the files to disk, I am passing the Entity#getContent stream
>> back to the client. The degradation is about 75 %.
>>
>> After rethinking your tips, I just checked the servers I am pulling
>> data from. One is slow, the other one is fast. Transfer speed when piping
>> streams from the fast server remains at 8 MB/s, which is what I wanted
>> after I had identified an issue with my custom HttpResponseInputStream.
>>
>> I modified my code to use the async client and it seems to pipe with
>> maximum LAN speed though it looks weird with curl now. Curl blocks for
>> 15 seconds and within a second the entire stream is written down to disk.
>>
>
> It all sounds very bizarre. I see no reason why HttpAsyncClient without
> zero copy transfer should do any better than HttpClient in this
> scenario.

So you are saying something is probably wrong with my client setup?

> These are micro-benchmark workers that I use to compare relative
> performance of the clients
>
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>
> Could you please run them against your test URI?

Yes, I will run them. I can only run the GET requests for now. In order 
to run them I have to modify the code and add custom HTTP headers which 
the ConfigParser does not provide at the moment.

I'll get back to you.

Michael



Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <mi...@apache.org>.
On 2015-05-24 at 14:30, Oleg Kalnichevski wrote:
> On Sun, 2015-05-24 at 14:18 +0200, Michael Osipov wrote:
>> On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
>>> [...]
>>> These are micro-benchmark workers that I use to compare relative
>>> performance of the clients
>>>
>>> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
>>> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>>>
>>> Could you please run them against your test URI?
>>
>> Tests are done, though they were confusing at first sight. Warmup
>> and target URIs always point to localhost. I have swapped them for
>> config.getUri.
>>
>> =================================
>> HTTP agent: HttpCore NIO (ver: 4.4)
>> ---------------------------------
>> 10000 POST requests
>> ---------------------------------
>> Document URI:		...
>> Document Length:	0 bytes
>>
>> Concurrency level:	50
>> Time taken for tests:	81.198 seconds
>> Complete requests:	0
>> Failed requests:	10000
>
> Please note all requests failed for some reason
>
>> Content transferred:	0 bytes
>> Requests per second:	0.0 [#/sec] (mean)
>> ---------------------------------
>> =================================
>> HTTP agent: Apache HttpClient (ver: 4.4)
>> ---------------------------------
>> 10000 POST requests
>> ---------------------------------
>> Document URI:		...
>> Document Length:	1279 bytes
>>
>> Concurrency level:	50
>> Time taken for tests:	4.259 seconds
>> Complete requests:	0
>> Failed requests:	10000
>> Content transferred:	12790000 bytes
>> Requests per second:	0.0 [#/sec] (mean)
>> ---------------------------------
>> =================================
>> HTTP agent: Apache HttpAsyncClient (ver: 4.1)
>> ---------------------------------
>> 10000 POST requests
>> ---------------------------------
>> Document URI:		...
>> Document Length:	1279 bytes
>>
>> Concurrency level:	50
>> Time taken for tests:	3.667 seconds
>> Complete requests:	0
>> Failed requests:	10000
>> Content transferred:	12790000 bytes
>> Requests per second:	0.0 [#/sec] (mean)
>> ---------------------------------
>>
>> Please note that the remote server does not accept any POST requests
>> without several other preparatory requests and additionally requires a
>> one-time ticket. Results may be biased.
>>
>
> This is what I would expect: with a concurrency level below, say, 2000
> blocking HC should be somewhat faster than async HC.
>
> The document length looks kind of small to me (1279 bytes). Were you not
> talking about downloading some huge file and not being able to saturate
> network bandwidth?

Your remarks are correct but reread what I have written above:

 > Please note that the remote server does not accept any POST requests
 > without several other preparatory requests and additionally requires a
 > one-time ticket. Results may be biased.

To perform a real-world test, I have to rewrite the BenchRunner to perform 
GET requests only, with custom headers filled with one-time tokens. POST 
is used to upload files to the server. This is an area I have not 
analyzed yet. Downloads happen more frequently than uploads.

This is something I cannot do before next week or June.

The issue is buried somewhere between BIO and NIO.

Thanks for the support!

Michael



Re: Cannot saturate LAN connection with HttpClient

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2015-05-24 at 14:18 +0200, Michael Osipov wrote:
> On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
> > [...]
> > These are micro-benchmark workers that I use to compare relative
> > performance of the clients
> >
> > http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> > http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
> >
> > Could you please run them against your test URI?
> 
> Tests are done, though they were confusing at first sight. Warmup 
> and target URIs always point to localhost. I have swapped them for 
> config.getUri.
> 
> =================================
> HTTP agent: HttpCore NIO (ver: 4.4)
> ---------------------------------
> 10000 POST requests
> ---------------------------------
> Document URI:		...
> Document Length:	0 bytes
> 
> Concurrency level:	50
> Time taken for tests:	81.198 seconds
> Complete requests:	0
> Failed requests:	10000

Please note all requests failed for some reason

> Content transferred:	0 bytes
> Requests per second:	0.0 [#/sec] (mean)
> ---------------------------------
> =================================
> HTTP agent: Apache HttpClient (ver: 4.4)
> ---------------------------------
> 10000 POST requests
> ---------------------------------
> Document URI:		...
> Document Length:	1279 bytes
> 
> Concurrency level:	50
> Time taken for tests:	4.259 seconds
> Complete requests:	0
> Failed requests:	10000
> Content transferred:	12790000 bytes
> Requests per second:	0.0 [#/sec] (mean)
> ---------------------------------
> =================================
> HTTP agent: Apache HttpAsyncClient (ver: 4.1)
> ---------------------------------
> 10000 POST requests
> ---------------------------------
> Document URI:		...
> Document Length:	1279 bytes
> 
> Concurrency level:	50
> Time taken for tests:	3.667 seconds
> Complete requests:	0
> Failed requests:	10000
> Content transferred:	12790000 bytes
> Requests per second:	0.0 [#/sec] (mean)
> ---------------------------------
> 
> Please note that the remote server does not accept any POST requests 
> without several other preparatory requests and additionally requires a 
> one-time ticket. Results may be biased.
> 

This is what I would expect: with a concurrency level below, say, 2000
blocking HC should be somewhat faster than async HC.

The document length looks kind of small to me (1279 bytes). Were you not
talking about downloading some huge file and not being able to saturate
network bandwidth?

Oleg





Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <mi...@apache.org>.
On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
> [...]
> These are micro-benchmark workers that I use to compare relative
> performance of the clients
>
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>
> Could you please run them against your test URI?

Tests are done, though they were confusing at first sight. The warmup 
and target URIs always point to localhost. I have swapped them for 
config.getUri.

=================================
HTTP agent: HttpCore NIO (ver: 4.4)
---------------------------------
10000 POST requests
---------------------------------
Document URI:		...
Document Length:	0 bytes

Concurrency level:	50
Time taken for tests:	81.198 seconds
Complete requests:	0
Failed requests:	10000
Content transferred:	0 bytes
Requests per second:	0.0 [#/sec] (mean)
---------------------------------
=================================
HTTP agent: Apache HttpClient (ver: 4.4)
---------------------------------
10000 POST requests
---------------------------------
Document URI:		...
Document Length:	1279 bytes

Concurrency level:	50
Time taken for tests:	4.259 seconds
Complete requests:	0
Failed requests:	10000
Content transferred:	12790000 bytes
Requests per second:	0.0 [#/sec] (mean)
---------------------------------
=================================
HTTP agent: Apache HttpAsyncClient (ver: 4.1)
---------------------------------
10000 POST requests
---------------------------------
Document URI:		...
Document Length:	1279 bytes

Concurrency level:	50
Time taken for tests:	3.667 seconds
Complete requests:	0
Failed requests:	10000
Content transferred:	12790000 bytes
Requests per second:	0.0 [#/sec] (mean)
---------------------------------

Please note that the remote server does not accept any POST requests 
without several preparatory requests and additionally requires a 
one-time ticket. Results may be biased.

Michael




Re: Cannot saturate LAN connection with HttpClient

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
> Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
> > On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> >> Hi,
> >>
> >> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> >> while downloading big files from a remote server in the corporate intranet.
> >>
> >> A simple test client:
> >> HttpClientBuilder builder = HttpClientBuilder.create();
> >> try (CloseableHttpClient client = builder.build()) {
> >>     HttpGet get = new HttpGet("...");
> >>     long start = System.nanoTime();
> >>     HttpResponse response = client.execute(get);
> >>     HttpEntity entity = response.getEntity();
> >>
> >>     File file = File.createTempFile("prefix", null);
> >>     OutputStream os = new FileOutputStream(file);
> >>     entity.writeTo(os);
> >>     long stop = System.nanoTime();
> >>     long contentLength = file.length();
> >>
> >>     long diff = stop - start;
> >>     System.out.printf("Duration: %d ms%n",
> >> TimeUnit.NANOSECONDS.toMillis(diff));
> >>     System.out.printf("Size: %d%n", contentLength);
> >>
> >>     float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> >>
> >>     System.out.printf("Speed: %.2f MB/s%n", speed);
> >> }
> >>
>> After at least 10 repetitions I see that the 182 MB file is downloaded
>> within 24 000 ms at about 8 MB/s max. I cannot top that.
> >>
> >> I have tried this over and over again with curl and see that curl is
> >> able to saturate the entire LAN connection (100 Mbit/s).
> >>
> >> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> >>
>> Any idea what the bottleneck might be?
> 
> Thanks for the quick response.
> 
> > (1) Curl should be using zero copy file transfer which Java blocking i/o
> > does not support. HttpAsyncClient on the other hand supports zero copy
> > file transfer and generally tends to perform better when writing content
> > out directly to the disk.
> 
> I did try this [1] example and my heap exploded. After increasing it to 
> -Xmx1024M, it did saturate the entire connection.
> 

This sounds wrong. The example below does not use zero copy (with zero
copy there should be no heap memory allocation at all). 

This example demonstrates how to use zero copy file transfer

http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
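
For comparison, the zero-copy idea itself can be sketched with nothing 
but the JDK: NIO's FileChannel.transferFrom lets the JVM hand the copy 
to the kernel, without an intermediate heap buffer, where the platform 
supports it. This is only an illustration of the concept (the class and 
method names below are made up for the example), not the HttpAsyncClient 
API:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ZeroCopySketch {

    // Drains an InputStream into a file via FileChannel.transferFrom.
    // For blocking sources the JVM may delegate the copy to the kernel,
    // avoiding the user-space byte[] round trip of a classic read/write loop.
    static long copyToFile(InputStream in, Path target) throws IOException {
        try (ReadableByteChannel src = Channels.newChannel(in);
             FileChannel dst = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // transferFrom may move fewer bytes than requested, so loop
            // until the source signals end-of-stream (transferFrom returns 0)
            while ((transferred = dst.transferFrom(src, position, 1 << 20)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

With the blocking HttpClient you could feed entity.getContent() into 
such a method instead of entity.writeTo(...); whether the kernel 
actually short-circuits the copy depends on the source channel and the 
platform.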

> > (2) Use larger socket / intermediate buffers. Default buffer size used
> > by Entity implementations is most likely suboptimal.
> 
> That did not make any difference. I have changed:
> 
> 1. Socket receive size
> 2. Employed a buffered input stream
> 3. Manually copied the stream to a file
> 
> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
> Regardless of this, your tip with zero copy helped me a lot.
> 
> Unfortunately, this is just a little piece in a performance degradation 
> chain a colleague has figured out. HttpClient acts as an intermediary in 
> a webapp which receives a request via REST from a client, processes it, 
> and opens up the stream to the huge files from a remote server. Without 
> caching the files to disk, I am passing the Entity#getContent stream 
> back to the client. The degradation is about 75 %.
> 
> After rethinking your tips, I just checked the servers I am pulling 
> data off. One is slow, the other one is fast. Transfer speed when piping 
> the streams from the fast server remains at 8 MB/s, which is what I wanted 
> after I had identified an issue with my custom HttpResponseInputStream.
> 
> I modified my code to use the async client and it seems to pipe at 
> maximum LAN speed, though it looks weird with curl now: curl blocks for 
> 15 seconds, and within a second the entire stream is written to disk.
> 

It all sounds very bizarre. I see no reason why HttpAsyncClient without
zero copy transfer should do any better than HttpClient in this
scenario.

These are micro-benchmark workers that I use to compare relative
performance of the clients
 
http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java   

Could you please run them against your test URI?

Oleg 




Re: Cannot saturate LAN connection with HttpClient

Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>> Hi,
>>
>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>> while downloading big files from a remote server in the corporate intranet.
>>
>> A simple test client:
>> HttpClientBuilder builder = HttpClientBuilder.create();
>> try (CloseableHttpClient client = builder.build()) {
>>     HttpGet get = new HttpGet("...");
>>     long start = System.nanoTime();
>>     HttpResponse response = client.execute(get);
>>     HttpEntity entity = response.getEntity();
>>
>>     File file = File.createTempFile("prefix", null);
>>     OutputStream os = new FileOutputStream(file);
>>     entity.writeTo(os);
>>     long stop = System.nanoTime();
>>     long contentLength = file.length();
>>
>>     long diff = stop - start;
>>     System.out.printf("Duration: %d ms%n",
>> TimeUnit.NANOSECONDS.toMillis(diff));
>>     System.out.printf("Size: %d%n", contentLength);
>>
>>     float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>
>>     System.out.printf("Speed: %.2f MB/s%n", speed);
>> }
>>
>> After at least 10 repetitions I see that the 182 MB file is downloaded
>> within 24 000 ms at about 8 MB/s max. I cannot top that.
>>
>> I have tried this over and over again with curl and see that curl is
>> able to saturate the entire LAN connection (100 Mbit/s).
>>
>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>
>> Any idea what the bottleneck might be?

Thanks for the quick response.

> (1) Curl should be using zero copy file transfer which Java blocking i/o
> does not support. HttpAsyncClient on the other hand supports zero copy
> file transfer and generally tends to perform better when writing content
> out directly to the disk.

I did try this [1] example and my heap exploded. After increasing it to 
-Xmx1024M, it did saturate the entire connection.

> (2) Use larger socket / intermediate buffers. Default buffer size used
> by Entity implementations is most likely suboptimal.

That did not make any difference. I have changed:

1. Socket receive size
2. Employed a buffered input stream
3. Manually copied the stream to a file

I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
Regardless of this, your tip with zero copy helped me a lot.

Unfortunately, this is just a little piece in a performance degradation 
chain a colleague has figured out. HttpClient acts as an intermediary in 
a webapp which receives a request via REST from a client, processes it, 
and opens up the stream to the huge files from a remote server. Without 
caching the files to disk, I am passing the Entity#getContent stream 
back to the client. The degradation is about 75 %.

After rethinking your tips, I just checked the servers I am pulling 
data off. One is slow, the other one is fast. Transfer speed when piping 
the streams from the fast server remains at 8 MB/s, which is what I wanted 
after I had identified an issue with my custom HttpResponseInputStream.

I modified my code to use the async client and it seems to pipe at 
maximum LAN speed, though it looks weird with curl now: curl blocks for 
15 seconds, and within a second the entire stream is written to disk.

But anyway, you helped me a lot to sort out the issue. I will think 
about the async stuff. I am not very experienced with it; the 
handling/API looks different, and it has to fit into the model mentioned 
above.

[1] https://hc.apache.org/httpcomponents-asyncclient-4.1.x/quickstart.html

Michael



Re: Cannot saturate LAN connection with HttpClient

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> Hi,
> 
> we are experiencing a (slight) performance problem with HttpClient 4.4.1 
> while downloading big files from a remote server in the corporate intranet.
> 
> A simple test client:
> HttpClientBuilder builder = HttpClientBuilder.create();
> try (CloseableHttpClient client = builder.build()) {
>    HttpGet get = new HttpGet("...");
>    long start = System.nanoTime();
>    HttpResponse response = client.execute(get);
>    HttpEntity entity = response.getEntity();
> 
>    File file = File.createTempFile("prefix", null);
>    OutputStream os = new FileOutputStream(file);
>    entity.writeTo(os);
>    long stop = System.nanoTime();
>    long contentLength = file.length();
> 
>    long diff = stop - start;
>    System.out.printf("Duration: %d ms%n", 
> TimeUnit.NANOSECONDS.toMillis(diff));
>    System.out.printf("Size: %d%n", contentLength);
> 
>    float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> 
>    System.out.printf("Speed: %.2f MB/s%n", speed);
> }
> 
> After at least 10 repetitions I see that the 182 MB file is downloaded 
> within 24 000 ms at about 8 MB/s max. I cannot top that.
> 
> I have tried this over and over again with curl and see that curl is 
> able to saturate the entire LAN connection (100 Mbit/s).
> 
> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> 
> Any idea what the bottleneck might be?
> 

(1) Curl should be using zero copy file transfer which Java blocking i/o
does not support. HttpAsyncClient on the other hand supports zero copy
file transfer and generally tends to perform better when writing content
out directly to the disk.

(2) Use larger socket / intermediate buffers. Default buffer size used
by Entity implementations is most likely suboptimal.
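
As a minimal sketch of suggestion (2): a plain copy loop whose buffer 
size the caller controls, so 2^16 or 2^20 bytes can be tried instead of 
the library default. The class and method names here are hypothetical, 
purely for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferedCopy {

    // Copies everything from in to out using a caller-chosen buffer size.
    // Small default buffers can throttle large downloads, so benchmarking
    // with 64 KiB or larger buffers is a cheap experiment.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

On the socket side, HttpClient 4.x should let you raise the receive 
buffer via something like SocketConfig.custom().setRcvBufSize(1 << 16) 
passed to the client builder; treat that as a pointer to check against 
the API docs rather than a verified recipe.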

Oleg


