Posted to dev@hc.apache.org by Oleg Kalnichevski <ol...@apache.org> on 2013/02/20 22:47:02 UTC

HttpCore NIO performance improvements

Folks

I made really major changes to HttpCore NIO in order to reduce packet
fragmentation on the TCP level when transmitting relatively short (less
than 1 TCP frame) entity enclosing messages. In my tests I am seeing 25
to 30% performance improvements for short PUT and POST requests on the
client side and for short responses on the server side as a result of
reduced TCP packet fragmentation.
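
As a rough illustration of the idea (this is not the actual HttpCore code;
the CoalescingWriter and CountingChannel classes and their parameters are
hypothetical names invented for this sketch), small writes can be collected
in a buffer up to a size hint and pushed to the channel in one call, so that
a message head and a short body are handed to the socket together rather
than as separate writes:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical helper, not part of HttpCore: collects small chunks and
// writes them to the channel in a single call.
final class CoalescingWriter {
    private final WritableByteChannel channel;
    private final ByteBuffer pending; // capacity 0 means "no buffering"

    CoalescingWriter(WritableByteChannel channel, int fragmentSizeHint) {
        this.channel = channel;
        this.pending = ByteBuffer.allocate(Math.max(0, fragmentSizeHint));
    }

    void write(ByteBuffer src) throws IOException {
        if (src.remaining() > pending.remaining()) {
            // The chunk does not fit next to what is already buffered:
            // push the buffered bytes out first to preserve ordering.
            flush();
        }
        if (src.remaining() > pending.capacity()) {
            // Larger than the hint (or the hint is 0): bypass the buffer.
            writeFully(src);
        } else {
            pending.put(src);
        }
    }

    void flush() throws IOException {
        pending.flip();
        writeFully(pending);
        pending.clear();
    }

    // Assumes a blocking channel. A real non-blocking implementation would
    // wait for write readiness instead of looping.
    private void writeFully(ByteBuffer buf) throws IOException {
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
    }
}

// Stand-in for a socket channel: counts how many write calls reach it.
final class CountingChannel implements WritableByteChannel {
    int writes;
    int bytes;

    @Override
    public int write(ByteBuffer src) {
        int n = src.remaining();
        src.position(src.limit()); // consume everything
        writes++;
        bytes += n;
        return n;
    }

    @Override
    public boolean isOpen() {
        return true;
    }

    @Override
    public void close() {
    }
}
```

With a hint of 4096, writing a 200-byte head (an assumed size) followed by a
2048-byte body and then calling flush() results in a single channel write.
With a hint of 0 the same two chunks go out as two separate writes.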

The downside is that all this comes at the price of destabilizing areas
of code that have been well tested and stable for years. I made sure
that those areas have close to 100% unit test coverage, but still there
can be regressions.

I would be enormously thankful if you could take the latest SVN trunk
for a spin and test it with your applications. It would be interesting to
know if the latest changes actually translate into any tangible
performance gains at the application level.

Another interesting thing. For the first time I have seen NIO transports
outperform blocking ones with as few as 100 concurrent connections when
using the latest Java 6 or Java 7 Oracle JREs. It used to take 1000 or
more concurrent connections a few years back. I am now very curious to
see if I can make HttpAsyncClient outperform HttpClient with, say, 250
concurrent connections.

Cheers

Oleg


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


Re: HttpCore NIO performance improvements

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Mon, 2013-02-25 at 08:23 +0300, Dmitry Potapov wrote:
> Ok, let me explain. When TCP_NODELAY is set to false, Nagle's
> algorithm (RFC 896) is enabled for the socket. What does this algorithm do?
> For each portion of data written to the socket, no data will be transmitted
> over TCP until at least one of two conditions is met:
> 1. ACK for the previous portion of data is received from the remote
> host (i.e. no data is sent until the remote host confirms that the previous
> portion of data is received)
> 2. Socket buffer is full (i.e. the overhead of TCP headers is minimal,
> no more data can be added to the current buffer without creating a new
> packet with its own TCP headers)
> 
> The second condition is what your modification does.
> 
> When TCP_NODELAY is set to true, Nagle's algorithm is disabled and
> the user expects that all data written to the socket is sent immediately.
> That is: Nagle's algorithm is about efficiency, TCP_NODELAY == true is
> about latency.
> It is no wonder that you gained a speed boost with TCP_NODELAY set to
> true, because in fact you partially repeated the TCP_NODELAY == false
> behaviour, but users who intentionally set TCP_NODELAY to true will
> get increased latency and it will be hard to find the root cause
> without reading this thread or the source code.
> 
> My opinion is that this kind of data buffering must be enabled only
> when TCP_NODELAY is set to false, as it will reduce the number of
> syscalls. For TCP_NODELAY set to true, no buffering is acceptable.
> 

Dmitry

The primary purpose of Nagle's algorithm is _congestion_ control and
flooding prevention. This is in part achieved by avoiding excessive
packet fragmentation, mainly when transmitting terminal interactions
(that often consist of a single keystroke). But efficiency is not the
main point of RFC 896. More importantly, though, the RFC imposes _no_
restriction of any nature on data buffering in protocol layers on top of
TCP.

Oleg




Re: HttpCore NIO performance improvements

Posted by Dmitry Potapov <po...@gmail.com>.
Ok, let me explain. When TCP_NODELAY is set to false, Nagle's
algorithm (RFC 896) is enabled for the socket. What does this algorithm do?
For each portion of data written to the socket, no data will be transmitted
over TCP until at least one of two conditions is met:
1. ACK for the previous portion of data is received from the remote
host (i.e. no data is sent until the remote host confirms that the previous
portion of data is received)
2. Socket buffer is full (i.e. the overhead of TCP headers is minimal,
no more data can be added to the current buffer without creating a new
packet with its own TCP headers)

The second condition is what your modification does.

When TCP_NODELAY is set to true, Nagle's algorithm is disabled and
the user expects that all data written to the socket is sent immediately.
That is: Nagle's algorithm is about efficiency, TCP_NODELAY == true is
about latency.
It is no wonder that you gained a speed boost with TCP_NODELAY set to
true, because in fact you partially repeated the TCP_NODELAY == false
behaviour, but users who intentionally set TCP_NODELAY to true will
get increased latency and it will be hard to find the root cause
without reading this thread or the source code.

My opinion is that this kind of data buffering must be enabled only
when TCP_NODELAY is set to false, as it will reduce the number of
syscalls. For TCP_NODELAY set to true, no buffering is acceptable.
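
The two conditions listed above can be written down as a small decision
function. This is only a simplified model of the sender-side rule, not code
from any real TCP stack: the class name, the method name and its parameters
are made up for the example, and "socket buffer is full" is interpreted here
(as an assumption) as "at least one full segment's worth of data is queued".

```java
// Simplified model of the send decision described above. All names are
// hypothetical; real TCP stacks apply further rules (windows, timers).
final class NagleSketch {
    private NagleSketch() {
    }

    /**
     * @param queuedBytes     bytes waiting to be sent
     * @param mss             maximum segment size, in bytes
     * @param unackedInFlight true if earlier data has not been ACKed yet
     * @param tcpNoDelay      true if TCP_NODELAY is set (Nagle disabled)
     * @return true if the queued data may be transmitted now
     */
    static boolean sendNow(int queuedBytes, int mss,
                           boolean unackedInFlight, boolean tcpNoDelay) {
        if (queuedBytes <= 0) {
            return false; // nothing to send
        }
        if (tcpNoDelay) {
            return true; // Nagle disabled: send whatever is queued
        }
        if (queuedBytes >= mss) {
            return true; // condition 2: a full segment is ready
        }
        return !unackedInFlight; // condition 1: previous data was ACKed
    }
}
```

For instance, with Nagle enabled a 100-byte write is held back while earlier
data is still unacknowledged, whereas with TCP_NODELAY set to true the same
write is eligible to go out at once.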

-- 
Best regards,
Dmitry

On Fri, Feb 22, 2013 at 12:31 AM, Oleg Kalnichevski <ol...@apache.org> wrote:
> On Thu, 2013-02-21 at 22:21 +0300, Dmitry Potapov wrote:
>> I just want to say that when I'm setting TCP_NODELAY I expect that every
>> portion of data is being transmitted right after I called
>> ContentEncoder.write() (even without waiting for ACK for previous TCP
>> packet)
>> I've looked through the code in trunk, and looks like this expectation
>> will fail after this change.
>>
>
> Dmitry
>
> I am not entirely sure I understand what your expectation is based upon
> but I am not a TCP/IP specialist. At any rate you can force HttpCore to
> always write things out directly to the underlying channel by setting
> the fragment hint parameter to zero.
>
> Hope this helps
>
> Oleg
>



Re: HttpCore NIO performance improvements

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2013-02-21 at 22:21 +0300, Dmitry Potapov wrote:
> I just want to say that when I'm setting TCP_NODELAY I expect that every
> portion of data is being transmitted right after I called
> ContentEncoder.write() (even without waiting for ACK for previous TCP
> packet)
> I've looked through the code in trunk, and looks like this expectation
> will fail after this change.
> 

Dmitry

I am not entirely sure I understand what your expectation is based upon
but I am not a TCP/IP specialist. At any rate you can force HttpCore to
always write things out directly to the underlying channel by setting
the fragment hint parameter to zero.
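
For reference, TCP_NODELAY itself is an option on the socket, set through
the standard java.net API, while the fragment hint is a separate setting on
the HttpCore side (elsewhere in this thread it is named as
ConnectionConfig.Builder.setFragmentSizeHint(0)). A minimal sketch of the
socket part only, where the NoDelayExample class and its roundTrip method
are names invented for the example:

```java
import java.io.IOException;
import java.net.Socket;

// Hypothetical example class: sets TCP_NODELAY on an unconnected socket
// and reads the option back.
final class NoDelayExample {
    private NoDelayExample() {
    }

    static boolean roundTrip(boolean noDelay) throws IOException {
        try (Socket socket = new Socket()) {
            // true disables Nagle's algorithm for this socket
            socket.setTcpNoDelay(noDelay);
            return socket.getTcpNoDelay();
        }
    }
}
```

Setting this option says nothing about buffering done above the socket, so
an application that wants each write passed straight to the channel would
also need the fragment hint set to zero, as described above.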

Hope this helps

Oleg




Re: HttpCore NIO performance improvements

Posted by Dmitry Potapov <po...@gmail.com>.
I just want to say that when I'm setting TCP_NODELAY I expect that every
portion of data is being transmitted right after I called
ContentEncoder.write() (even without waiting for ACK for previous TCP
packet)
I've looked through the code in trunk, and looks like this expectation
will fail after this change.

-- 
Best regards,
Dmitry

On Thursday, February 21, 2013, Oleg Kalnichevski wrote:

> On Thu, Feb 21, 2013 at 11:39:49AM +0300, Dmitry Potapov wrote:
> > On Thu, Feb 21, 2013 at 12:05 PM, Oleg Kalnichevski <olegk@apache.org> wrote:
> > > On Thu, 2013-02-21 at 12:26 +0530, Asankha C. Perera wrote:
> > >> Hi Oleg
> > >> > I made really major changes to HttpCore NIO in order to reduce packet
> > >> > fragmentation on the TCP level when transmitting relatively short (less
> > >> > than 1 TCP frame) entity enclosing messages. In my tests I am seeing 25
> > >> > to 30% performance improvements for short PUT and POST requests on the
> > >> > client side and for short responses on the server side as a result of
> > >> > reduced TCP packet fragmentation.
> > >> This is interesting.. What was the size of the messages that yielded the
> > >> improvement? Does this also relate to the use of tcpnodelay?
> > >
> > > 2048 bytes. I was using this micro-benchmark to compare performance of
> > > different versions. I had tcpnodelay set to true for all test scenarios.
> > So, in fact HttpCore NIO doesn't respect behaviour of TCP_NODELAY, if
> > ConnectionConfig.Builder.setFragmentSizeHint(0) wasn't called?
> > I think this should be explicitly stated in javadocs, because this is
> > not clear without reading the code.
> >
>
> Dmitry
>
> TCP_NODELAY and fragmentation parameters are completely unrelated. What I
> was trying to say was that I had run all my tests with TCP_NODELAY set to
> true.
>
> Oleg
>

Re: HttpCore NIO performance improvements

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, Feb 21, 2013 at 11:39:49AM +0300, Dmitry Potapov wrote:
> On Thu, Feb 21, 2013 at 12:05 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
> > On Thu, 2013-02-21 at 12:26 +0530, Asankha C. Perera wrote:
> >> Hi Oleg
> >> > I made really major changes to HttpCore NIO in order to reduce packet
> >> > fragmentation on the TCP level when transmitting relatively short (less
> >> > than 1 TCP frame) entity enclosing messages. In my tests I am seeing 25
> >> > to 30% performance improvements for short PUT and POST requests on the
> >> > client side and for short responses on the server side as a result of
> >> > reduced TCP packet fragmentation.
> >> This is interesting.. What was the size of the messages that yielded the
> >> improvement? Does this also relate to the use of tcpnodelay?
> >
> > 2048 bytes. I was using this micro-benchmark to compare performance of
> > different versions. I had tcpnodelay set to true for all test scenarios.
> So, in fact HttpCore NIO doesn't respect behaviour of TCP_NODELAY, if
> ConnectionConfig.Builder.setFragmentSizeHint(0) wasn't called?
> I think this should be explicitly stated in javadocs, because this is
> not clear without reading the code.
> 

Dmitry

TCP_NODELAY and fragmentation parameters are completely unrelated. What I was trying to say was that I had run all my tests with TCP_NODELAY set to true.

Oleg




Re: HttpCore NIO performance improvements

Posted by Dmitry Potapov <po...@gmail.com>.
On Thu, Feb 21, 2013 at 12:05 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
> On Thu, 2013-02-21 at 12:26 +0530, Asankha C. Perera wrote:
>> Hi Oleg
>> > I made really major changes to HttpCore NIO in order to reduce packet
>> > fragmentation on the TCP level when transmitting relatively short (less
>> > than 1 TCP frame) entity enclosing messages. In my tests I am seeing 25
>> > to 30% performance improvements for short PUT and POST requests on the
>> > client side and for short responses on the server side as a result of
>> > reduced TCP packet fragmentation.
>> This is interesting.. What was the size of the messages that yielded the
>> improvement? Does this also relate to the use of tcpnodelay?
>
> 2048 bytes. I was using this micro-benchmark to compare performance of
> different versions. I had tcpnodelay set to true for all test scenarios.
So, in fact HttpCore NIO doesn't respect behaviour of TCP_NODELAY, if
ConnectionConfig.Builder.setFragmentSizeHint(0) wasn't called?
I think this should be explicitly stated in javadocs, because this is
not clear without reading the code.

>
> https://svn.apache.org/repos/asf/httpcomponents/benchmark/httpcore/trunk/
>
>> > The downside is that all this comes at the price of destabilizing areas
>> > of code that have been well tested and stable for years. I made sure
>> > that those areas have close to 100% unit test coverage, but still there
>> > can be regressions.
>> >
>> > I would be enormously thankful if you could take the latest SVN trunk
>> > for a spin and test it with your applications. It would be interesting to
>> > know if the latest changes actually translate into any tangible
>> > performance gains at the application level.
>> During our migration, at one point we also did migrate to 4.3-alpha1 -
>> so it would be something we would love to do, but it will need some more
>> time, as we are yet to finalize the fallback to 4.2.x..
>
> HttpCore 4.3 should be backward compatible with 4.1 and 4.2. You should
> be able to drop it in place of an older version and see improved
> performance.
>
> Oleg
>

-- 
Dmitry



Re: HttpCore NIO performance improvements

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2013-02-21 at 12:26 +0530, Asankha C. Perera wrote:
> Hi Oleg
> > I made really major changes to HttpCore NIO in order to reduce packet
> > fragmentation on the TCP level when transmitting relatively short (less
> > than 1 TCP frame) entity enclosing messages. In my tests I am seeing 25
> > to 30% performance improvements for short PUT and POST requests on the
> > client side and for short responses on the server side as a result of
> > reduced TCP packet fragmentation.
> This is interesting.. What was the size of the messages that yielded the 
> improvement? Does this also relate to the use of tcpnodelay?

2048 bytes. I was using this micro-benchmark to compare performance of
different versions. I had tcpnodelay set to true for all test scenarios.

https://svn.apache.org/repos/asf/httpcomponents/benchmark/httpcore/trunk/

> > The downside is that all this comes at the price of destabilizing areas
> > of code that have been well tested and stable for years. I made sure
> > that those areas have close to 100% unit test coverage, but still there
> > can be regressions.
> >
> > I would be enormously thankful if you could take the latest SVN trunk
> > for a spin and test it with your applications. It would be interesting to
> > know if the latest changes actually translate into any tangible
> > performance gains at the application level.
> During our migration, at one point we also did migrate to 4.3-alpha1 - 
> so it would be something we would love to do, but it will need some more 
> time, as we are yet to finalize the fallback to 4.2.x..

HttpCore 4.3 should be backward compatible with 4.1 and 4.2. You should
be able to drop it in place of an older version and see improved
performance.

Oleg





Re: HttpCore NIO performance improvements

Posted by "Asankha C. Perera" <as...@apache.org>.
Hi Oleg
> I made really major changes to HttpCore NIO in order to reduce packet
> fragmentation on the TCP level when transmitting relatively short (less
> than 1 TCP frame) entity enclosing messages. In my tests I am seeing 25
> to 30% performance improvements for short PUT and POST requests on the
> client side and for short responses on the server side as a result of
> reduced TCP packet fragmentation.
This is interesting.. What was the size of the messages that yielded the 
improvement? Does this also relate to the use of tcpnodelay?
> The downside is that all this comes at the price of destabilizing areas
> of code that have been well tested and stable for years. I made sure
> that those areas have close to 100% unit test coverage, but still there
> can be regressions.
>
> I would be enormously thankful if you could take the latest SVN trunk
> for a spin and test it with your applications. It would be interesting to
> know if the latest changes actually translate into any tangible
> performance gains at the application level.
During our migration, at one point we also did migrate to 4.3-alpha1 - 
so it would be something we would love to do, but it will need some more 
time, as we are yet to finalize the fallback to 4.2.x..
> Another interesting thing. For the first time I have seen NIO transports
> outperform blocking ones with as few as 100 concurrent connections when
> using the latest Java 6 or Java 7 Oracle JREs. It used to take 1000 or
> more concurrent connections a few years back. I am now very curious to
> see if I can make HttpAsyncClient outperform HttpClient with, say, 250
> concurrent connections.
That's great to hear, and I look forward to trying out the AsyncClient
over raw HC/NIO

Thanks to all of your efforts

regards
asankha

-- 
Asankha C. Perera
AdroitLogic, http://adroitlogic.org

http://esbmagic.blogspot.com



