Posted to httpclient-users@hc.apache.org by Michael Osipov <mi...@apache.org> on 2015/05/23 22:09:57 UTC
Cannot saturate LAN connection with HttpClient
Hi,
we are experiencing a (slight) performance problem with HttpClient 4.4.1
while downloading big files from a remote server in the corporate intranet.
A simple test client:
HttpClientBuilder builder = HttpClientBuilder.create();
try (CloseableHttpClient client = builder.build()) {
    HttpGet get = new HttpGet("...");
    long start = System.nanoTime();
    HttpResponse response = client.execute(get);
    HttpEntity entity = response.getEntity();

    File file = File.createTempFile("prefix", null);
    try (OutputStream os = new FileOutputStream(file)) {
        entity.writeTo(os);
    }
    long stop = System.nanoTime();
    long contentLength = file.length();

    long diff = stop - start;
    System.out.printf("Duration: %d ms%n",
        TimeUnit.NANOSECONDS.toMillis(diff));
    System.out.printf("Size: %d%n", contentLength);

    float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
    System.out.printf("Speed: %.2f MB/s%n", speed);
}
After at least 10 repetitions I see that the 182 MB file is downloaded
within 24 000 ms at about 8 MB/s max. I cannot top that.
I have tried this over and over again with curl and see that curl is
able to saturate the entire LAN connection (100 Mbit/s).
My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
Any idea what the bottleneck might be?
Downloading the same file locally with HttpClient gives me 180 MB/s, so
the bottleneck should not be HttpClient itself.
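A quick arithmetic check of the figures reported above, sketched as a standalone class (the "line rate" figure ignores TCP/IP and HTTP framing overhead, so real-world ceilings are a bit lower):

```java
public class ThroughputCheck {

    // Decimal MB/s given bytes transferred and elapsed nanoseconds:
    // bytes/ns * 1000 == megabytes per second.
    public static double mbPerSec(long bytes, long nanos) {
        return bytes / (double) nanos * 1000.0;
    }

    public static void main(String[] args) {
        // 182 MB in 24 000 ms, as reported in the thread.
        double observed = mbPerSec(182_000_000L, 24_000_000_000L);
        // A 100 Mbit/s link carries at most 12.5 MB/s before overhead.
        double lineRate = 100_000_000 / 8 / 1_000_000.0;
        System.out.printf("observed: %.2f MB/s, line rate: %.1f MB/s%n",
                observed, lineRate);
        // → observed: 7.58 MB/s, line rate: 12.5 MB/s
    }
}
```

So the observed 8 MB/s sits well below the roughly 11-12 MB/s that curl achieves on the same 100 Mbit/s link.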
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: httpclient-users-unsubscribe@hc.apache.org
For additional commands, e-mail: httpclient-users-help@hc.apache.org
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <19...@gmx.net>.
Am 2015-05-25 um 00:53 schrieb Oleg Kalnichevski:
> On May 25, 2015 12:23:39 AM GMT+02:00, Michael Osipov <mi...@apache.org> wrote:
>> Am 2015-05-24 um 14:25 schrieb Oleg Kalnichevski:
>>> [...]
>>>>> It all sounds very bizarre. I see no reason why HttpAsyncClient
>> without
>>>>> zero copy transfer should do any better than HttpClient in this
>>>>> scenario.
>>>>
>>>> So you are saying something is probably wrong with my client setup?
>>>>
>>>
>>> I think it is not unlikely.
>>
>> Guess what, I have changed the sample client to use HttpUrlConnection.
>> It instantly saturated the entire connection. Now I am completely
>> lost...
>>
>> I have tried 4.5 from branches, no avail. Then I have started to play
>> around with the builder options and disabled content compression.
>> Performance was off the charts. Full saturation: 11.4 MB/s.
>>
>> That made me wonder, I have redone the testing:
>>
>> 1. Curl from a server with gigabit connection topped 30 MB/s.
>> 2. Curl again but this time with Accept-Encoding. The speed drops to 7
>> MB/s. Server is sending chunks.
>>
>> Compared to HttpClient, HC does a slightly better job when it comes to
>> compressed files.
>>
>> Looking at the source code and Javadocs, I do not see any option to
>> disable this by RequestConfig or via HttpContext attribute for this
>> single request. RequestContentEncoding class does not use HttpContext
>> variable in #process().
>>
>> I am about to file an issue for that. WDYT?
>>
> Ah. Good catch. Makes sense. Please raise a Jira with a change request. I'll have to cancel RC1 vote, unfortunately, if we want this included in 4.5 release.
Just did: https://issues.apache.org/jira/browse/HTTPCLIENT-1651
It'd be great to have this in 4.5.
Thanks again for your guidance.
Michael
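For readers finding this thread later: HttpClientBuilder has long offered a client-wide switch to turn transparent content compression off, and the per-request flag requested in HTTPCLIENT-1651 shipped in 4.5 as RequestConfig#setContentCompressionEnabled. A minimal sketch of both, assuming HttpClient 4.5 on the classpath (the URL is a placeholder):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.http.HttpEntity;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public class NoCompressionDownload {
    public static void main(String[] args) throws Exception {
        // Option 1: disable transparent compression for the whole client.
        try (CloseableHttpClient client = HttpClients.custom()
                .disableContentCompression()
                .build()) {
            HttpGet get = new HttpGet("http://example.invalid/bigfile"); // placeholder
            // Option 2 (4.5+): disable it per request via RequestConfig.
            get.setConfig(RequestConfig.custom()
                    .setContentCompressionEnabled(false)
                    .build());
            try (CloseableHttpResponse response = client.execute(get)) {
                HttpEntity entity = response.getEntity();
                File file = File.createTempFile("download", null);
                try (OutputStream os = new FileOutputStream(file)) {
                    entity.writeTo(os);
                }
            }
        }
    }
}
```

Either option stops HttpClient from sending Accept-Encoding, so the server returns the raw bytes instead of a chunked, compressed stream.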
Re: Cannot saturate LAN connection with HttpClient
Posted by Oleg Kalnichevski <ol...@apache.org>.
On May 25, 2015 12:23:39 AM GMT+02:00, Michael Osipov <mi...@apache.org> wrote:
>Am 2015-05-24 um 14:25 schrieb Oleg Kalnichevski:
>> [...]
>>>> It all sounds very bizarre. I see no reason why HttpAsyncClient
>without
>>>> zero copy transfer should do any better than HttpClient in this
>>>> scenario.
>>>
>>> So you are saying something is probably wrong with my client setup?
>>>
>>
>> I think it is not unlikely.
>
>Guess what, I have changed the sample client to use HttpUrlConnection.
>It instantly saturated the entire connection. Now I am completely
>lost...
>
>I have tried 4.5 from branches, no avail. Then I have started to play
>around with the builder options and disabled content compression.
>Performance was off the charts. Full saturation: 11.4 MB/s.
>
>That made me wonder, I have redone the testing:
>
>1. Curl from a server with gigabit connection topped 30 MB/s.
>2. Curl again but this time with Accept-Encoding. The speed drops to 7
>MB/s. Server is sending chunks.
>
>Compared to HttpClient, HC does a slightly better job when it comes to
>compressed files.
>
>Looking at the source code and Javadocs, I do not see any option to
>disable this by RequestConfig or via HttpContext attribute for this
>single request. RequestContentEncoding class does not use HttpContext
>variable in #process().
>
>I am about to file an issue for that. WDYT?
>
Ah. Good catch. Makes sense. Please raise a Jira with a change request. I'll have to cancel RC1 vote, unfortunately, if we want this included in 4.5 release.
Oleg
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-24 um 14:25 schrieb Oleg Kalnichevski:
> [...]
>>> It all sounds very bizarre. I see no reason why HttpAsyncClient without
>>> zero copy transfer should do any better than HttpClient in this
>>> scenario.
>>
>> So you are saying something is probably wrong with my client setup?
>>
>
> I think it is not unlikely.
Guess what: I have changed the sample client to use HttpURLConnection.
It instantly saturated the entire connection. Now I am completely lost...
I have tried 4.5 from branches, to no avail. Then I started to play
around with the builder options and disabled content compression.
Performance was off the charts. Full saturation: 11.4 MB/s.
That made me wonder, so I redid the testing:
1. Curl from a server with a gigabit connection topped 30 MB/s.
2. Curl again, but this time with Accept-Encoding. The speed drops to 7
MB/s; the server is sending chunks.
Compared to curl, HttpClient actually does a slightly better job when it
comes to compressed files.
Looking at the source code and Javadocs, I do not see any option to
disable this via RequestConfig or via an HttpContext attribute for a
single request. The RequestContentEncoding class does not consult the
HttpContext in #process().
I am about to file an issue for that. WDYT?
Michael
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-24 um 14:25 schrieb Oleg Kalnichevski:
> On Sun, 2015-05-24 at 13:02 +0200, Michael Osipov wrote:
>> Am 2015-05-24 um 12:17 schrieb Oleg Kalnichevski:
>>> On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
>>>> Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
>>>>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>>>>>> Hi,
>>>>>>
>>>>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>>>>>> while downloading big files from a remote server in the corporate intranet.
>>>>>>
>>>>>> A simple test client:
>>>>>> HttpClientBuilder builder = HttpClientBuilder.create();
>>>>>> try (CloseableHttpClient client = builder.build()) {
>>>>>> HttpGet get = new HttpGet("...");
>>>>>> long start = System.nanoTime();
>>>>>> HttpResponse response = client.execute(get);
>>>>>> HttpEntity entity = response.getEntity();
>>>>>>
>>>>>> File file = File.createTempFile("prefix", null);
>>>>>> OutputStream os = new FileOutputStream(file);
>>>>>> entity.writeTo(os);
>>>>>> long stop = System.nanoTime();
>>>>>> long contentLength = file.length();
>>>>>>
>>>>>> long diff = stop - start;
>>>>>> System.out.printf("Duration: %d ms%n",
>>>>>> TimeUnit.NANOSECONDS.toMillis(diff));
>>>>>> System.out.printf("Size: %d%n", contentLength);
>>>>>>
>>>>>> float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>>>>>
>>>>>> System.out.printf("Speed: %.2f MB/s%n", speed);
>>>>>> }
>>>>>>
>>>>>> After at least 10 repetions I see that the 182 MB file is download
>>>>>> within 24 000 ms with about 8 MB/s max. I cannot top that.
>>>>>>
>>>>>> I have tried this over and over again with curl and see that curl is
>>>>>> able to saturate the entire LAN connection (100 Mbit/s).
>>>>>>
>>>>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>>>>>
>>>>>> Any idea what the bottleneck might me?
>>>>
>>>> Thanks for the quick response.
>>>>
>>>>> (1) Curl should be using zero copy file transfer which Java blocking i/o
>>>>> does not support. HttpAsyncClient on the other hand supports zero copy
>>>>> file transfer and generally tends to perform better when writing content
>>>>> out directly to the disk.
>>>>
>>>> I did try this [1] example and my heap exploaded. After increasing it to
>>>> -Xmx1024M, it did saturate the entire connection.
>>>>
>>>
>>> This sounds wrong. The example below does not use zero copy (with zero
>>> copy there should be no heap memory allocation at all).
>>>
>>> This example demonstrates how to use zero copy file transfer
>>>
>>> http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
>>
>> I have seen this example but there is no ZeroCopyGet. I haven't found
>> any example which explicitly says use zero-copy for GETs. The example
>> from [1] did work but with the explosion. What did I wrong here.
>>
>
> Zero copy can be employed only if a message encloses an entity in it.
> Therefore there is no such thing as ZeroCopyGet in HC. One can execute a
> normal GET request and use a ZeroCopyConsumer to stream content out
> directly to a file without any intermediate buffering in memory.
OK, that has confirmed my assumptions.
>>>>> (2) Use larger socket / intermediate buffers. Default buffer size used
>>>>> by Entity implementations is most likely suboptimal.
>>>>
>>>> That did not make any difference. I have changed:
>>>>
>>>> 1. Socket receive size
>>>> 2. Employed a buffered input stream
>>>> 3. Manually copied the stream to a file
>>>>
>>>> I have varied the buffer size from 2^14 to 2^20 bytes. No avail.
>>>> Regardless of this, your tip with zero copy helped me a lot.
>>>>
>>>> Unfortunately, this is just a little piece in a performance degregation
>>>> chain a colleague has figured out. HttpClient acts as an intermediate in
>>>> a webapp which receives a request via REST from a client, processes that
>>>> and opens up the stream to the huge files from a remote server. Without
>>>> caching the files to disk, I am passing the Entity#getContent stream
>>>> back to the client. The degreation is about 75 %.
>>>>
>>>> After rethinking your tips, I just checked the servers I am pulling off
>>>> data. One is slow the otherone is fast. Transfer speeds with piping the
>>>> streams from the fast server remains at 8 MB/s which is what I wanted
>>>> after I have identified an issue with my custom HttpResponseInputStream.
>>>>
>>>> I modified my code to use the async client and it seems to pipe with
>>>> maximum LAN speed though it looks weird with curl now. Curl blocks for
>>>> 15 seconds and within a second the entire stream is written down to disk.
>>>>
>>>
>>> It all sounds very bizarre. I see no reason why HttpAsyncClient without
>>> zero copy transfer should do any better than HttpClient in this
>>> scenario.
>>
>> So you are saying something is probably wrong with my client setup?
>>
>
> I think it is not unlikely.
Even assuming I had the time to investigate that, I currently have no
idea where to start looking for the mismatch.
Re: Cannot saturate LAN connection with HttpClient
Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2015-05-24 at 13:02 +0200, Michael Osipov wrote:
> Am 2015-05-24 um 12:17 schrieb Oleg Kalnichevski:
> > On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
> >> Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
> >>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> >>>> Hi,
> >>>>
> >>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> >>>> while downloading big files from a remote server in the corporate intranet.
> >>>>
> >>>> A simple test client:
> >>>> HttpClientBuilder builder = HttpClientBuilder.create();
> >>>> try (CloseableHttpClient client = builder.build()) {
> >>>> HttpGet get = new HttpGet("...");
> >>>> long start = System.nanoTime();
> >>>> HttpResponse response = client.execute(get);
> >>>> HttpEntity entity = response.getEntity();
> >>>>
> >>>> File file = File.createTempFile("prefix", null);
> >>>> OutputStream os = new FileOutputStream(file);
> >>>> entity.writeTo(os);
> >>>> long stop = System.nanoTime();
> >>>> long contentLength = file.length();
> >>>>
> >>>> long diff = stop - start;
> >>>> System.out.printf("Duration: %d ms%n",
> >>>> TimeUnit.NANOSECONDS.toMillis(diff));
> >>>> System.out.printf("Size: %d%n", contentLength);
> >>>>
> >>>> float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> >>>>
> >>>> System.out.printf("Speed: %.2f MB/s%n", speed);
> >>>> }
> >>>>
> >>>> After at least 10 repetions I see that the 182 MB file is download
> >>>> within 24 000 ms with about 8 MB/s max. I cannot top that.
> >>>>
> >>>> I have tried this over and over again with curl and see that curl is
> >>>> able to saturate the entire LAN connection (100 Mbit/s).
> >>>>
> >>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> >>>>
> >>>> Any idea what the bottleneck might me?
> >>
> >> Thanks for the quick response.
> >>
> >>> (1) Curl should be using zero copy file transfer which Java blocking i/o
> >>> does not support. HttpAsyncClient on the other hand supports zero copy
> >>> file transfer and generally tends to perform better when writing content
> >>> out directly to the disk.
> >>
> >> I did try this [1] example and my heap exploaded. After increasing it to
> >> -Xmx1024M, it did saturate the entire connection.
> >>
> >
> > This sounds wrong. The example below does not use zero copy (with zero
> > copy there should be no heap memory allocation at all).
> >
> > This example demonstrates how to use zero copy file transfer
> >
> > http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
>
> I have seen this example but there is no ZeroCopyGet. I haven't found
> any example which explicitly says use zero-copy for GETs. The example
> from [1] did work but with the explosion. What did I wrong here.
>
Zero copy can be employed only if a message encloses an entity in it.
Therefore there is no such thing as ZeroCopyGet in HC. One can execute a
normal GET request and use a ZeroCopyConsumer to stream content out
directly to a file without any intermediate buffering in memory.
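The pattern described above can be sketched roughly as follows, assuming HttpAsyncClient 4.x on the classpath (the URL and target file name are placeholders): a plain GET producer is paired with a ZeroCopyConsumer, which streams the response body straight to the file without intermediate heap buffering.

```java
import java.io.File;
import java.util.concurrent.Future;

import org.apache.http.HttpResponse;
import org.apache.http.entity.ContentType;
import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;
import org.apache.http.nio.client.methods.HttpAsyncMethods;
import org.apache.http.nio.client.methods.ZeroCopyConsumer;

public class ZeroCopyGetSketch {
    public static void main(String[] args) throws Exception {
        try (CloseableHttpAsyncClient client = HttpAsyncClients.createDefault()) {
            client.start();
            File target = new File("bigfile.tmp"); // placeholder path
            // The consumer writes incoming content directly to the file;
            // process() is only called once the transfer is complete.
            ZeroCopyConsumer<File> consumer = new ZeroCopyConsumer<File>(target) {
                @Override
                protected File process(HttpResponse response, File file,
                        ContentType contentType) throws Exception {
                    return file;
                }
            };
            Future<File> future = client.execute(
                    HttpAsyncMethods.createGet("http://example.invalid/bigfile"),
                    consumer, null);
            File result = future.get();
            System.out.println("Downloaded to " + result.getAbsolutePath());
        }
    }
}
```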
> >>> (2) Use larger socket / intermediate buffers. Default buffer size used
> >>> by Entity implementations is most likely suboptimal.
> >>
> >> That did not make any difference. I have changed:
> >>
> >> 1. Socket receive size
> >> 2. Employed a buffered input stream
> >> 3. Manually copied the stream to a file
> >>
> >> I have varied the buffer size from 2^14 to 2^20 bytes. No avail.
> >> Regardless of this, your tip with zero copy helped me a lot.
> >>
> >> Unfortunately, this is just a little piece in a performance degregation
> >> chain a colleague has figured out. HttpClient acts as an intermediate in
> >> a webapp which receives a request via REST from a client, processes that
> >> and opens up the stream to the huge files from a remote server. Without
> >> caching the files to disk, I am passing the Entity#getContent stream
> >> back to the client. The degreation is about 75 %.
> >>
> >> After rethinking your tips, I just checked the servers I am pulling off
> >> data. One is slow the otherone is fast. Transfer speeds with piping the
> >> streams from the fast server remains at 8 MB/s which is what I wanted
> >> after I have identified an issue with my custom HttpResponseInputStream.
> >>
> >> I modified my code to use the async client and it seems to pipe with
> >> maximum LAN speed though it looks weird with curl now. Curl blocks for
> >> 15 seconds and within a second the entire stream is written down to disk.
> >>
> >
> > It all sounds very bizarre. I see no reason why HttpAsyncClient without
> > zero copy transfer should do any better than HttpClient in this
> > scenario.
>
> So you are saying something is probably wrong with my client setup?
>
I think it is not unlikely.
Oleg
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-24 um 12:17 schrieb Oleg Kalnichevski:
> On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
>> Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
>>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>>>> Hi,
>>>>
>>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>>>> while downloading big files from a remote server in the corporate intranet.
>>>>
>>>> A simple test client:
>>>> HttpClientBuilder builder = HttpClientBuilder.create();
>>>> try (CloseableHttpClient client = builder.build()) {
>>>> HttpGet get = new HttpGet("...");
>>>> long start = System.nanoTime();
>>>> HttpResponse response = client.execute(get);
>>>> HttpEntity entity = response.getEntity();
>>>>
>>>> File file = File.createTempFile("prefix", null);
>>>> OutputStream os = new FileOutputStream(file);
>>>> entity.writeTo(os);
>>>> long stop = System.nanoTime();
>>>> long contentLength = file.length();
>>>>
>>>> long diff = stop - start;
>>>> System.out.printf("Duration: %d ms%n",
>>>> TimeUnit.NANOSECONDS.toMillis(diff));
>>>> System.out.printf("Size: %d%n", contentLength);
>>>>
>>>> float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>>>
>>>> System.out.printf("Speed: %.2f MB/s%n", speed);
>>>> }
>>>>
>>>> After at least 10 repetions I see that the 182 MB file is download
>>>> within 24 000 ms with about 8 MB/s max. I cannot top that.
>>>>
>>>> I have tried this over and over again with curl and see that curl is
>>>> able to saturate the entire LAN connection (100 Mbit/s).
>>>>
>>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>>>
>>>> Any idea what the bottleneck might me?
>>
>> Thanks for the quick response.
>>
>>> (1) Curl should be using zero copy file transfer which Java blocking i/o
>>> does not support. HttpAsyncClient on the other hand supports zero copy
>>> file transfer and generally tends to perform better when writing content
>>> out directly to the disk.
>>
>> I did try this [1] example and my heap exploaded. After increasing it to
>> -Xmx1024M, it did saturate the entire connection.
>>
>
> This sounds wrong. The example below does not use zero copy (with zero
> copy there should be no heap memory allocation at all).
>
> This example demonstrates how to use zero copy file transfer
>
> http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
I have seen this example, but there is no ZeroCopyGet. I haven't found
any example which explicitly uses zero copy for GETs. The example
from [1] did work, but with the heap explosion. What did I do wrong here?
>>> (2) Use larger socket / intermediate buffers. Default buffer size used
>>> by Entity implementations is most likely suboptimal.
>>
>> That did not make any difference. I have changed:
>>
>> 1. Socket receive size
>> 2. Employed a buffered input stream
>> 3. Manually copied the stream to a file
>>
>> I have varied the buffer size from 2^14 to 2^20 bytes. No avail.
>> Regardless of this, your tip with zero copy helped me a lot.
>>
>> Unfortunately, this is just a little piece in a performance degregation
>> chain a colleague has figured out. HttpClient acts as an intermediate in
>> a webapp which receives a request via REST from a client, processes that
>> and opens up the stream to the huge files from a remote server. Without
>> caching the files to disk, I am passing the Entity#getContent stream
>> back to the client. The degreation is about 75 %.
>>
>> After rethinking your tips, I just checked the servers I am pulling off
>> data. One is slow the otherone is fast. Transfer speeds with piping the
>> streams from the fast server remains at 8 MB/s which is what I wanted
>> after I have identified an issue with my custom HttpResponseInputStream.
>>
>> I modified my code to use the async client and it seems to pipe with
>> maximum LAN speed though it looks weird with curl now. Curl blocks for
>> 15 seconds and within a second the entire stream is written down to disk.
>>
>
> It all sounds very bizarre. I see no reason why HttpAsyncClient without
> zero copy transfer should do any better than HttpClient in this
> scenario.
So you are saying something is probably wrong with my client setup?
> These are micro-benchmark workers that I use to compare relative
> performance of the clients
>
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>
> Could you please run them against your test URI?
Yes, I will run them, but I can only run the GET requests for now. In
order to run them I have to modify the code and add custom HTTP headers,
which the ConfigParser does not provide at the moment.
I'll get back to you.
Michael
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-24 um 14:30 schrieb Oleg Kalnichevski:
> On Sun, 2015-05-24 at 14:18 +0200, Michael Osipov wrote:
>> Am 2015-05-24 um 12:17 schrieb Oleg Kalnichevski:
>>> [...]
>>> These are micro-benchmark workers that I use to compare relative
>>> performance of the clients
>>>
>>> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
>>> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>>>
>>> Could you please run them against your test URI?
>>
>> Tests are done, though they were confusing at the first sight. Warmup
>> and target URIs always point to localhost. I have swapped them for
>> config.getUri.
>>
>> =================================
>> HTTP agent: HttpCore NIO (ver: 4.4)
>> ---------------------------------
>> 10000 POST requests
>> ---------------------------------
>> Document URI: ...
>> Document Length: 0 bytes
>>
>> Concurrency level: 50
>> Time taken for tests: 81.198 seconds
>> Complete requests: 0
>> Failed requests: 10000
>
> Please note all requests failed for some reason
>
>> Content transferred: 0 bytes
>> Requests per second: 0.0 [#/sec] (mean)
>> ---------------------------------
>> =================================
>> HTTP agent: Apache HttpClient (ver: 4.4)
>> ---------------------------------
>> 10000 POST requests
>> ---------------------------------
>> Document URI: ...
>> Document Length: 1279 bytes
>>
>> Concurrency level: 50
>> Time taken for tests: 4.259 seconds
>> Complete requests: 0
>> Failed requests: 10000
>> Content transferred: 12790000 bytes
>> Requests per second: 0.0 [#/sec] (mean)
>> ---------------------------------
>> =================================
>> HTTP agent: Apache HttpAsyncClient (ver: 4.1)
>> ---------------------------------
>> 10000 POST requests
>> ---------------------------------
>> Document URI: ...
>> Document Length: 1279 bytes
>>
>> Concurrency level: 50
>> Time taken for tests: 3.667 seconds
>> Complete requests: 0
>> Failed requests: 10000
>> Content transferred: 12790000 bytes
>> Requests per second: 0.0 [#/sec] (mean)
>> ---------------------------------
>>
>> Please note that the remote server does not accept any post requests
>> without several other preparing requests and additionally require a
>> one-time ticket. Results may be biased.
>>
>
> This is what I would expect: with a concurrency level below, say, 2000
> blocking HC should be somewhat faster than async HC.
>
> The document length looks kind of small to me (1279 bytes). Were you not
> talking about downloading some huge file and not being able to saturate
> network bandwidth?
Your remarks are correct but reread what I have written above:
> Please note that the remote server does not accept any post requests
> without several other preparing requests and additionally require a
> one-time ticket. Results may be biased.
To perform a real-world test, I have to rewrite the BenchRunner to perform
GET requests only, with custom headers filled with one-time tokens. POST
is used to upload files to the server; this is an area I have not
analyzed yet. Downloads happen more frequently than uploads.
This is something I cannot do before next week or June.
The issue is somewhere buried between BIO and NIO.
Thanks for the support!
Michael
Re: Cannot saturate LAN connection with HttpClient
Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2015-05-24 at 14:18 +0200, Michael Osipov wrote:
> Am 2015-05-24 um 12:17 schrieb Oleg Kalnichevski:
> > [...]
> > These are micro-benchmark workers that I use to compare relative
> > performance of the clients
> >
> > http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> > http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
> >
> > Could you please run them against your test URI?
>
> Tests are done, though they were confusing at the first sight. Warmup
> and target URIs always point to localhost. I have swapped them for
> config.getUri.
>
> =================================
> HTTP agent: HttpCore NIO (ver: 4.4)
> ---------------------------------
> 10000 POST requests
> ---------------------------------
> Document URI: ...
> Document Length: 0 bytes
>
> Concurrency level: 50
> Time taken for tests: 81.198 seconds
> Complete requests: 0
> Failed requests: 10000
Please note all requests failed for some reason
> Content transferred: 0 bytes
> Requests per second: 0.0 [#/sec] (mean)
> ---------------------------------
> =================================
> HTTP agent: Apache HttpClient (ver: 4.4)
> ---------------------------------
> 10000 POST requests
> ---------------------------------
> Document URI: ...
> Document Length: 1279 bytes
>
> Concurrency level: 50
> Time taken for tests: 4.259 seconds
> Complete requests: 0
> Failed requests: 10000
> Content transferred: 12790000 bytes
> Requests per second: 0.0 [#/sec] (mean)
> ---------------------------------
> =================================
> HTTP agent: Apache HttpAsyncClient (ver: 4.1)
> ---------------------------------
> 10000 POST requests
> ---------------------------------
> Document URI: ...
> Document Length: 1279 bytes
>
> Concurrency level: 50
> Time taken for tests: 3.667 seconds
> Complete requests: 0
> Failed requests: 10000
> Content transferred: 12790000 bytes
> Requests per second: 0.0 [#/sec] (mean)
> ---------------------------------
>
> Please note that the remote server does not accept any post requests
> without several other preparing requests and additionally require a
> one-time ticket. Results may be biased.
>
This is what I would expect: with a concurrency level below, say, 2000
blocking HC should be somewhat faster than async HC.
The document length looks kind of small to me (1279 bytes). Were you not
talking about downloading some huge file and not being able to saturate
network bandwidth?
Oleg
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-24 um 12:17 schrieb Oleg Kalnichevski:
> [...]
> These are micro-benchmark workers that I use to compare relative
> performance of the clients
>
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>
> Could you please run them against your test URI?
Tests are done, though they were confusing at first sight. The warmup
and target URIs always point to localhost; I have swapped them for
config.getUri().
=================================
HTTP agent: HttpCore NIO (ver: 4.4)
---------------------------------
10000 POST requests
---------------------------------
Document URI: ...
Document Length: 0 bytes
Concurrency level: 50
Time taken for tests: 81.198 seconds
Complete requests: 0
Failed requests: 10000
Content transferred: 0 bytes
Requests per second: 0.0 [#/sec] (mean)
---------------------------------
=================================
HTTP agent: Apache HttpClient (ver: 4.4)
---------------------------------
10000 POST requests
---------------------------------
Document URI: ...
Document Length: 1279 bytes
Concurrency level: 50
Time taken for tests: 4.259 seconds
Complete requests: 0
Failed requests: 10000
Content transferred: 12790000 bytes
Requests per second: 0.0 [#/sec] (mean)
---------------------------------
=================================
HTTP agent: Apache HttpAsyncClient (ver: 4.1)
---------------------------------
10000 POST requests
---------------------------------
Document URI: ...
Document Length: 1279 bytes
Concurrency level: 50
Time taken for tests: 3.667 seconds
Complete requests: 0
Failed requests: 10000
Content transferred: 12790000 bytes
Requests per second: 0.0 [#/sec] (mean)
---------------------------------
Please note that the remote server does not accept any POST requests
without several other preparatory requests and additionally requires a
one-time ticket. Results may be biased.
Michael
Re: Cannot saturate LAN connection with HttpClient
Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
> Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
> > On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> >> Hi,
> >>
> >> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> >> while downloading big files from a remote server in the corporate intranet.
> >>
> >> A simple test client:
> >> HttpClientBuilder builder = HttpClientBuilder.create();
> >> try (CloseableHttpClient client = builder.build()) {
> >> HttpGet get = new HttpGet("...");
> >> long start = System.nanoTime();
> >> HttpResponse response = client.execute(get);
> >> HttpEntity entity = response.getEntity();
> >>
> >> File file = File.createTempFile("prefix", null);
> >> try (OutputStream os = new FileOutputStream(file)) {
> >>     entity.writeTo(os);
> >> }
> >> long stop = System.nanoTime();
> >> long contentLength = file.length();
> >>
> >> long diff = stop - start;
> >> System.out.printf("Duration: %d ms%n",
> >> TimeUnit.NANOSECONDS.toMillis(diff));
> >> System.out.printf("Size: %d%n", contentLength);
> >>
> >> float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> >>
> >> System.out.printf("Speed: %.2f MB/s%n", speed);
> >> }
> >>
> >> After at least 10 repetitions I see that the 182 MB file is downloaded
> >> within 24 000 ms at about 8 MB/s max. I cannot top that.
> >>
> >> I have tried this over and over again with curl and see that curl is
> >> able to saturate the entire LAN connection (100 Mbit/s).
> >>
> >> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> >>
> >> Any idea what the bottleneck might be?
>
> Thanks for the quick response.
>
> > (1) Curl should be using zero copy file transfer which Java blocking i/o
> > does not support. HttpAsyncClient on the other hand supports zero copy
> > file transfer and generally tends to perform better when writing content
> > out directly to the disk.
>
> I did try this [1] example and my heap exploded. After increasing it to
> -Xmx1024M, it did saturate the entire connection.
>
This sounds wrong. The example below does not use zero copy (with zero
copy there should be no heap memory allocation at all).
This example demonstrates how to use zero copy file transfer
http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
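For illustration, the underlying zero-copy primitive in the JDK can be
sketched like this (plain NIO, not the HttpAsyncClient API from the
example above; the class name is made up):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;

public class ZeroCopySketch {
    // Drains the source stream into the file backing the given output
    // stream. FileChannel.transferFrom can let the kernel move the bytes
    // without an intermediate copy through a Java heap buffer when both
    // endpoints support it (with an arbitrary InputStream wrapped via
    // Channels.newChannel, the JDK may still buffer internally).
    public static long copy(InputStream in, FileOutputStream out) throws IOException {
        ReadableByteChannel src = Channels.newChannel(in);
        FileChannel dst = out.getChannel();
        long position = 0;
        long transferred;
        // transferFrom returns 0 once a blocking source channel hits EOF
        while ((transferred = dst.transferFrom(src, position, 64 * 1024)) > 0) {
            position += transferred;
        }
        return position;
    }
}
```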
> > (2) Use larger socket / intermediate buffers. Default buffer size used
> > by Entity implementations is most likely suboptimal.
>
> That did not make any difference. I have changed:
>
> 1. Socket receive size
> 2. Employed a buffered input stream
> 3. Manually copied the stream to a file
>
> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
> Regardless of this, your tip with zero copy helped me a lot.
>
> Unfortunately, this is just a small piece in a performance degradation
> chain a colleague has figured out. HttpClient acts as an intermediary in
> a webapp which receives a request via REST from a client, processes it,
> and opens a stream to the huge files on a remote server. Without
> caching the files to disk, I pass the Entity#getContent stream
> back to the client. The degradation is about 75 %.
>
> After rethinking your tips, I just checked the servers I am pulling
> data from. One is slow, the other one is fast. Transfer speed when piping
> the streams from the fast server remains at 8 MB/s, which is what I wanted
> after I identified an issue with my custom HttpResponseInputStream.
>
> I modified my code to use the async client and it seems to pipe at
> maximum LAN speed, though it now looks weird with curl: curl blocks for
> 15 seconds, and then the entire stream is written to disk within a second.
>
It all sounds very bizarre. I see no reason why HttpAsyncClient without
zero copy transfer should do any better than HttpClient in this
scenario.
These are micro-benchmark workers that I use to compare relative
performance of the clients
http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
Could you please run them against your test URI?
Oleg
Re: Cannot saturate LAN connection with HttpClient
Posted by Michael Osipov <mi...@apache.org>.
Am 2015-05-23 um 22:29 schrieb Oleg Kalnichevski:
> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>> Hi,
>>
>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>> while downloading big files from a remote server in the corporate intranet.
>>
>> A simple test client:
>> HttpClientBuilder builder = HttpClientBuilder.create();
>> try (CloseableHttpClient client = builder.build()) {
>> HttpGet get = new HttpGet("...");
>> long start = System.nanoTime();
>> HttpResponse response = client.execute(get);
>> HttpEntity entity = response.getEntity();
>>
>> File file = File.createTempFile("prefix", null);
>> try (OutputStream os = new FileOutputStream(file)) {
>>     entity.writeTo(os);
>> }
>> long stop = System.nanoTime();
>> long contentLength = file.length();
>>
>> long diff = stop - start;
>> System.out.printf("Duration: %d ms%n",
>> TimeUnit.NANOSECONDS.toMillis(diff));
>> System.out.printf("Size: %d%n", contentLength);
>>
>> float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>
>> System.out.printf("Speed: %.2f MB/s%n", speed);
>> }
>>
>> After at least 10 repetitions I see that the 182 MB file is downloaded
>> within 24 000 ms at about 8 MB/s max. I cannot top that.
>>
>> I have tried this over and over again with curl and see that curl is
>> able to saturate the entire LAN connection (100 Mbit/s).
>>
>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>
>> Any idea what the bottleneck might be?
Thanks for the quick response.
> (1) Curl should be using zero copy file transfer which Java blocking i/o
> does not support. HttpAsyncClient on the other hand supports zero copy
> file transfer and generally tends to perform better when writing content
> out directly to the disk.
I did try this [1] example and my heap exploded. After increasing it to
-Xmx1024M, it did saturate the entire connection.
> (2) Use larger socket / intermediate buffers. Default buffer size used
> by Entity implementations is most likely suboptimal.
That did not make any difference. I have changed:
1. Socket receive size
2. Employed a buffered input stream
3. Manually copied the stream to a file
I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
Regardless of this, your tip with zero copy helped me a lot.
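For reference, the manual copy loop I varied the buffer size on can be
sketched like this (plain JDK; the class name and buffer sizes are
illustrative):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferedCopy {
    // Drains the input stream into the output stream through an
    // intermediate buffer of the given size (e.g. 2^14 .. 2^20 bytes)
    // and returns the total number of bytes copied.
    public static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```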
Unfortunately, this is just a small piece in a performance degradation
chain a colleague has figured out. HttpClient acts as an intermediary in
a webapp which receives a request via REST from a client, processes it,
and opens a stream to the huge files on a remote server. Without
caching the files to disk, I pass the Entity#getContent stream
back to the client. The degradation is about 75 %.
After rethinking your tips, I just checked the servers I am pulling
data from. One is slow, the other one is fast. Transfer speed when piping
the streams from the fast server remains at 8 MB/s, which is what I wanted
after I identified an issue with my custom HttpResponseInputStream.
I modified my code to use the async client and it seems to pipe at
maximum LAN speed, though it now looks weird with curl: curl blocks for
15 seconds, and then the entire stream is written to disk within a second.
But anyway, you helped me a lot in sorting out the issue. I will think
about the async stuff. I am not very experienced with it; the
handling/API looks different and it has to fit into the model mentioned
above.
[1] https://hc.apache.org/httpcomponents-asyncclient-4.1.x/quickstart.html
Michael
Re: Cannot saturate LAN connection with HttpClient
Posted by Oleg Kalnichevski <ol...@apache.org>.
On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> Hi,
>
> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> while downloading big files from a remote server in the corporate intranet.
>
> A simple test client:
> HttpClientBuilder builder = HttpClientBuilder.create();
> try (CloseableHttpClient client = builder.build()) {
> HttpGet get = new HttpGet("...");
> long start = System.nanoTime();
> HttpResponse response = client.execute(get);
> HttpEntity entity = response.getEntity();
>
> File file = File.createTempFile("prefix", null);
> try (OutputStream os = new FileOutputStream(file)) {
>     entity.writeTo(os);
> }
> long stop = System.nanoTime();
> long contentLength = file.length();
>
> long diff = stop - start;
> System.out.printf("Duration: %d ms%n",
> TimeUnit.NANOSECONDS.toMillis(diff));
> System.out.printf("Size: %d%n", contentLength);
>
> float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>
> System.out.printf("Speed: %.2f MB/s%n", speed);
> }
>
> After at least 10 repetitions I see that the 182 MB file is downloaded
> within 24 000 ms at about 8 MB/s max. I cannot top that.
>
> I have tried this over and over again with curl and see that curl is
> able to saturate the entire LAN connection (100 Mbit/s).
>
> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>
> Any idea what the bottleneck might be?
>
(1) Curl should be using zero copy file transfer which Java blocking i/o
does not support. HttpAsyncClient on the other hand supports zero copy
file transfer and generally tends to perform better when writing content
out directly to the disk.
(2) Use larger socket / intermediate buffers. Default buffer size used
by Entity implementations is most likely suboptimal.
Oleg