Posted to dev@trafficserver.apache.org by Leif Hedstrom <zw...@apache.org> on 2012/01/23 03:13:02 UTC

Re: Fast Accept

Still watching the game at In The Zone, but, with your patch, does it still support both accept threads (1 or more) and non-blocking accept on each net-thread ?

On Jan 22, 2012, at 5:42 PM, "Alan M. Carroll" <am...@network-geographics.com> wrote:

> I could be wrong. It happened once before, so why not twice?
> 
> Sunday, January 22, 2012, 6:10:04 PM, you wrote:
> 
>> Hmmm, I'll have to go read the code again, after the 49ers game. That is not how I remember it at all.
> 

Re: Fast Accept

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
Monday, January 23, 2012, 12:19:00 PM, you wrote:

> The key question is "why is the test failing", and AFAICT it looks like NetAccept::init_accept() isn't doing:

>     PollDescriptor *pd = get_PollDescriptor(t);
>     if (a->ep.start(pd, a, EVENTIO_READ) < 0)

> like in init_accept_per_thread().  So it isn't clear to me why it would get called ever.

I'm not sure what you mean by "it" in "why it would get called ever". Could you clarify?

I checked on this after reading your email. That logic isn't exercised normally, either in testing or at runtime, only in the case of explicitly passing in accept_threads == 0 && frequent_accept. The default configuration has accept_threads == 1 and testing has frequent_accept == false. My attempt to clean up the NetProcessor interface by eliminating the obsolete massive-argument accept() method changed the test logic to exercise this case, so it's possible that there is a general problem which has been hidden.

For now, I have reverted the default to frequent_accept == true and changed TSNetAccept to set it explicitly to false, preserving the previous behavior. That fixes my current problem.

While I'm still not clear on what the root cause is, I did notice that only in the failing path is the VC put on the read_ready_list and marked triggered (contrast UnixNetVConnection::acceptEvent and NetAccept::acceptFastEvent).
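For anyone following along, here is a minimal sketch of that difference, using generic epoll and hypothetical names rather than the actual ATS classes: one hand-off only registers the new connection with the thread's poller and waits for the kernel to report readability, while the other additionally puts the connection on a read-ready list and marks it triggered so the net-thread services it on its next pass.

// A minimal sketch of the two hand-off styles (generic epoll, hypothetical
// names, not the actual ATS classes). Path A only registers the accepted fd
// with the thread's poller and waits for EPOLLIN; path B additionally puts
// the connection on a read-ready list and marks it triggered, so the
// net-thread attempts a read on its very next pass.
#include <sys/epoll.h>
#include <list>

struct Conn {
  int fd = -1;
  bool read_triggered = false;        // analogous to "marked triggered"
};

struct NetThread {
  int epfd = epoll_create1(0);
  std::list<Conn *> read_ready_list;  // serviced without waiting for the poller

  // Path A: wait for the kernel to report readability.
  bool watch(Conn *c) {
    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLET;
    ev.data.ptr = c;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev) == 0;
  }

  // Path B: also assume data may already be waiting and service the
  // connection on the next loop iteration.
  bool watch_and_trigger(Conn *c) {
    if (!watch(c))
      return false;
    c->read_triggered = true;
    read_ready_list.push_back(c);
    return true;
  }
};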



Re: Fast Accept

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> I think if we switch to fine-grained locking and away from large-grained
> try-locks we could deal with the latency issue and get the most out of the
> accept thread since we could synchronise between the accept and net threads
> faster. This is likely to be easier with a simplified codebase as well.

So the solution to this can'o'worms is the same as to the
last one Alan opened?

i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE


Re: Fast Accept

Posted by John Plevyak <jp...@acm.org>.
I think if we switch to fine-grained locking and away from large-grained
try-locks we could deal with the latency issue and get the most out of the
accept thread since we could synchronise between the accept and net threads
faster. This is likely to be easier with a simplified codebase as well.
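To make that concrete, here is a sketch of what such fine-grained synchronisation between the accept and net threads could look like (generic C++ with hypothetical names, not the actual ATS locking): the lock covers only a small hand-off queue and is held just long enough to push or pop an fd, so neither side has to try-lock a larger per-thread structure and reschedule on contention.

// A sketch of a fine-grained hand-off queue (generic C++, hypothetical
// names, not ATS internals). The accept thread pushes accepted fds; a net
// thread pops them. The mutex guards only this queue.
#include <condition_variable>
#include <deque>
#include <mutex>

class AcceptQueue {
  std::mutex m_;                  // guards only the fd queue below
  std::condition_variable cv_;
  std::deque<int> fds_;

public:
  void push(int fd) {             // called by the accept thread
    {
      std::lock_guard<std::mutex> lock(m_);
      fds_.push_back(fd);
    }
    cv_.notify_one();
  }

  int pop() {                     // called by a net thread
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return !fds_.empty(); });
    int fd = fds_.front();
    fds_.pop_front();
    return fd;
  }
};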

On Mon, Jan 23, 2012 at 10:35 AM, Leif Hedstrom <le...@godaddy.com> wrote:

> On 1/23/12 11:21 AM, Theo Schlossnagle wrote:
>
>> On Linux, FreeBSD and Solaris (at least), even with the most recent
>> nonblocking kernel facilities, it is still faster to accept new TCP
>> connections in a blocking fashion via thread pool. FWIW.
>>
>
> That definitely was the case when we worked on Accept threads a while ago
> (a few years ago), I clearly remember it making a noticeable difference
> with the old code base. However, today, it seems to introduce more latency,
> compared to doing non-blocking accept() on all net-threads. I haven't spent
> a whole lot of time investigating why, but it certainly seems that now the
> dedicated (blocking) accept threads are "worse". They can do more accepts per
> second, but at a higher latency.
>
> This could quite possibly be a problem in our code, since with accept
> threads enabled, the session has to be migrated between threads, so perhaps
> we're doing something inefficient there (or something that requires a
> scheduling retry etc. due to lock contention).
>
> -- leif
>
>

Re: Fast Accept

Posted by Leif Hedstrom <le...@godaddy.com>.
On 1/23/12 11:21 AM, Theo Schlossnagle wrote:
> On Linux, FreeBSD and Solaris (at least), even with the most recent nonblocking kernel facilities, it is still faster to accept new TCP connections in a blocking fashion via thread pool. FWIW.

That definitely was the case when we worked on Accept threads a while ago (a 
few years ago), I clearly remember it making a noticeable difference with 
the old code base. However, today, it seems to introduce more latency, 
compared to doing non-blocking accept() on all net-threads. I haven't spent 
a whole lot of time investigating why, but it certainly seems that now the 
dedicated (blocking) accept threads are "worse". They can do more accepts per 
second, but at a higher latency.

This could quite possibly be a problem in our code, since with accept 
threads enabled, the session has to be migrated between threads, so perhaps 
we're doing something inefficient there (or something that requires a 
scheduling retry etc. due to lock contention).

-- leif


Re: Fast Accept

Posted by Theo Schlossnagle <je...@omniti.com>.
On Linux, FreeBSD and Solaris (at least), even with the most recent nonblocking kernel facilities, it is still faster to accept new TCP connections in a blocking fashion via thread pool. FWIW.

--
Theo Schlossnagle (mobile)
http://omniti.com/is/theo-schlossnagle

On Jan 23, 2012, at 1:19 PM, John Plevyak <jp...@acm.org> wrote:

> I think Leif has it right about the use cases and meaning of
> frequent_accept.   Given how much has changed kernel-wise since that code
> was originally written, it isn't clear how much we save by limiting some
> accepts to a single thread non-blocking (i.e. epoll() makes the marginal
> cost of another fd small, and modern OSes handle many more threads
> efficiently than they used to).
> 
> The key question is "why is the test failing", and AFAICT it looks like
> NetAccept::init_accept() isn't doing:
> 
>    PollDescriptor *pd = get_PollDescriptor(t);
>    if (a->ep.start(pd, a, EVENTIO_READ) < 0)
> 
> like in init_accept_per_thread().  So it isn't clear to me why it would get
> called ever.
> 
> It just looks like this code needs to be simplified and cleaned up a bit.
> There is a lot of commented out code and convoluted paths and the code is
> split between NetAccept and UnixNetProcessor which further complicates
> things.  There are too many different paths which do largely the same thing
> under different circumstances, and yes, we might lose a few cycles by
> factoring the code better, but it would be much more maintainable, stable
> and (hopefully) bug free.
> 
> The way the code is now, we tend to have the main entry points for a
> processor in the XXXXProcessor.cc file and then some code for each point in
> another XXXFoo file.  This tends to split up the code which is confusing.
> Perhaps for main entry point Foo we should just have all the code in
> XXXFoo, in this case UnixAccept.cc for example.
> 
> I'll see if I can break away some time.
> 
> john
> 
> On Mon, Jan 23, 2012 at 7:50 AM, Leif Hedstrom <zw...@apache.org> wrote:
> 
>> On 1/23/12 7:57 AM, Alan M. Carroll wrote:
>> 
>>> Sunday, January 22, 2012, 10:20:58 PM, you wrote:
>>> 
>>>      if (frequent_accept) { // true
>>>> 
>>> I read that as "checking to see if frequent_accept is true".
>>> 
>> 
>> Right, and it used to be always true. And you are suggesting to change it
>> to "false", no ?
>> 
>> 
>> 
>>> What's semi confusing is that the declaration of accept_internal() has
>>>> the
>>>> frequent_accept parameter defaulted to "false", yet, as far as I can
>>>> tell,
>>>> in this older version of the code, we always called it with "true".
>>>> Fwiw, I also checked v2.0.1, and it also had frequent_accept (as passed
>>>> around) default to true in every case.
>>>> 
>>> 
>>> MuxVC.cc, InkAPI.cc, and LogCollationAccept.cc call
>>> NetProcessor::accept() with frequent_accept false or defaulted. So it was
>>> not true everywhere.
>>> 
>> 
>> I see, yeah, that makes sense, I guess those auxiliary 'server' features
>> did not want to create multiple accept threads. I meant "always true" as in
>> for the main traffic_server process / HTTP server (i.e. there was no way to
>> make it false before).
>> 
>> 
>>> Perhaps we should leave the default true, but change the calls to
>>> NetProcessor::accept to set it false?
>>> 
>>> So, unless I'm totally missing something, just changing the default back
>>>> to
>>>> 'false' again will break accept_internal() as it was intended to work,
>>>> no ?
>>>> 
>>> accept_internal should be unaffected by the default. I think it was
>>> intended to check the value and do something based on it. Otherwise there
>>> should not even be an option to check.
>>> 
>>> The question is what the callers should do. I can try tweaking
>>> TSNetAccept to turn off frequent_accept, which is original (2.1.6) behavior
>>> (although that seems to make mock of the accept_threads parameter). We're
>>> still left with the question of why my code fails regression with fast
>>> accept.
>>> 
>> Alright, I think we are on the same page now. For some reason, I thought
>> you meant to turn off (set to false) the frequent_accept for the main
>> server. And like I said, I know we had that bug once before (before v2.1.7,
>> when you initially made the changes to pass the "opt" parameter, with it
>> defaulted to false). My concern is/was that we'd make the same mistake
>> again.
>> 
>> Seeing your explanation, the frequent_accept parameter does make some
>> sense, since there are (apparent) use cases where you don't want multiple
>> accept threads, or accepts per net-thread (e.g. for the log collation
>> server). So I don't think we should remove it (i.e. making it always on
>> seems like a bad idea).
>> 
>> It might be that we should clean this up though, a more obvious solution
>> would be to only have two options:
>> 
>>   1) One, or several, dedicated accept threads (e.g. num_accept_threads).
>> 
>>   2) Accept on one, or all, of the net-threads
>> 
>> 
>> This is basically the same as it works today, but less convoluted. If I
>> understand the code correctly, if frequent_accept is false, then you get
>> assigned a "random" ET_NET thread to do the non-blocking accept() on. If it
>> is true, you either create one or several accept threads to do blocking
>> accept() on, or you do non-blocking accept() on all the net-threads.
>> 
>> Then all users of this would allow for this, and especially the InkAPI
>> accept APIs would allow for this control as well (unless they already do, I
>> didn't check).
>> 
>> Ciao,
>> 
>> -- Leif
>> 
>> 

Re: Fast Accept

Posted by John Plevyak <jp...@acm.org>.
I think Leif has it right about the use cases and meaning of
frequent_accept.   Given how much has changed kernel-wise since that code
was originally written, it isn't clear how much we save by limiting some
accepts to a single thread non-blocking (i.e. epoll() makes the marginal
cost of another fd small, and modern OSes handle many more threads
efficiently than they used to).

The key question is "why is the test failing", and AFAICT it looks like
NetAccept::init_accept() isn't doing:

    PollDescriptor *pd = get_PollDescriptor(t);
    if (a->ep.start(pd, a, EVENTIO_READ) < 0)

like in init_accept_per_thread().  So it isn't clear to me why it would get
called ever.
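For context, the registration in question amounts to adding the accept socket to a net-thread's poll set so its event loop is woken when a connection is pending; a rough sketch using plain epoll (hypothetical names, not the ATS EventIO API) is below.

// A rough epoll-based equivalent of that registration step (hypothetical
// names; in ATS these roles are played by the per-thread PollDescriptor and
// the NetAccept's EventIO): watch the accept fd for read events.
#include <sys/epoll.h>

struct PollSet  { int epfd; };       // stand-in for the per-thread poll descriptor
struct Acceptor { int listen_fd; };  // stand-in for the accept state

int start_watching(PollSet *pd, Acceptor *a) {
  epoll_event ev{};
  ev.events = EPOLLIN;
  ev.data.ptr = a;                   // hand the acceptor back on wakeup
  return epoll_ctl(pd->epfd, EPOLL_CTL_ADD, a->listen_fd, &ev);
}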

It just looks like this code needs to be simplified and cleaned up a bit.
There is a lot of commented out code and convoluted paths and the code is
split between NetAccept and UnixNetProcessor which further complicates
things.  There are too many different paths which do largely the same thing
under different circumstances, and yes, we might lose a few cycles by
factoring the code better, but it would be much more maintainable, stable
and (hopefully) bug free.

The way the code is now, we tend to have the main entry points for a
processor in the XXXXProcessor.cc file and then some code for each point in
another XXXFoo file.  This tends to split up the code which is confusing.
Perhaps for main entry point Foo we should just have all the code in
XXXFoo, in this case UnixAccept.cc for example.

I'll see if I can break away some time.

john

On Mon, Jan 23, 2012 at 7:50 AM, Leif Hedstrom <zw...@apache.org> wrote:

> On 1/23/12 7:57 AM, Alan M. Carroll wrote:
>
>> Sunday, January 22, 2012, 10:20:58 PM, you wrote:
>>
>>       if (frequent_accept) { // true
>>>
>> I read that as "checking to see if frequent_accept is true".
>>
>
> Right, and it used to be always true. And you are suggesting to change it
> to "false", no ?
>
>
>
>>  What's semi confusing is that the declaration of accept_internal() has
>>> the
>>> frequent_accept parameter defaulted to "false", yet, as far as I can
>>> tell,
>>> in this older version of the code, we always called it with "true".
>>> Fwiw, I also checked v2.0.1, and it also had frequent_accept (as passed
>>> around) default to true in every case.
>>>
>>
>> MuxVC.cc, InkAPI.cc, and LogCollationAccept.cc call
>> NetProcessor::accept() with frequent_accept false or defaulted. So it was
>> not true everywhere.
>>
>
> I see, yeah, that makes sense, I guess those auxiliary 'server' features
> did not want to create multiple accept threads. I meant "always true" as in
> for the main traffic_server process / HTTP server (i.e. there was no way to
> make it false before).
>
>
>> Perhaps we should leave the default true, but change the calls to
>> NetProcessor::accept to set it false?
>>
>>  So, unless I'm totally missing something, just changing the default back
>>> to
>>> 'false' again will break accept_internal() as it was intended to work,
>>> no ?
>>>
>> accept_internal should be unaffected by the default. I think it was
>> intended to check the value and do something based on it. Otherwise there
>> should not even be an option to check.
>>
>> The question is what the callers should do. I can try tweaking
>> TSNetAccept to turn off frequent_accept, which is original (2.1.6) behavior
>> (although that seems to make mock of the accept_threads parameter). We're
>> still left with the question of why my code fails regression with fast
>> accept.
>>
> Alright, I think we are on the same page now. For some reason, I thought
> you meant to turn off (set to false) the frequent_accept for the main
> server. And like I said, I know we had that bug once before (before v2.1.7,
> when you initially made the changes to pass the "opt" parameter, with it
> defaulted to false). My concern is/was that we'd make the same mistake
> again.
>
> Seeing your explanation, the frequent_accept parameter does make some
> sense, since there are (apparent) use cases where you don't want multiple
> accept threads, or accepts per net-thread (e.g. for the log collation
> server). So I don't think we should remove it (i.e. making it always on
> seems like a bad idea).
>
> It might be that we should clean this up though, a more obvious solution
> would be to only have two options:
>
>    1) One, or several, dedicated accept threads (e.g. num_accept_threads).
>
>    2) Accept on one, or all, of the net-threads
>
>
> This is basically the same as it works today, but less convoluted. If I
> understand the code correctly, if frequent_accept is false, then you get
> assigned a "random" ET_NET thread to do the non-blocking accept() on. If it
> is true, you either create one or several accept threads to do blocking
> accept() on, or you do non-blocking accept() on all the net-threads.
>
> Then all users of this would allow for this, and especially the InkAPI
> accept APIs would allow for this control as well (unless they already do, I
> didn't check).
>
> Ciao,
>
> -- Leif
>
>

Re: Fast Accept

Posted by Leif Hedstrom <zw...@apache.org>.
On 1/23/12 7:57 AM, Alan M. Carroll wrote:
> Sunday, January 22, 2012, 10:20:58 PM, you wrote:
>
>>       if (frequent_accept) { // true
> I read that as "checking to see if frequent_accept is true".

Right, and it used to be always true. And you are suggesting to change it to 
"false", no ?

>
>> What's semi confusing is that the declaration of accept_internal() has the
>> frequent_accept parameter defaulted to "false", yet, as far as I can tell,
>> in this older version of the code, we always called it with "true".
>> Fwiw, I also checked v2.0.1, and it also had frequent_accept (as passed
>> around) default to true in every case.
>
> MuxVC.cc, InkAPI.cc, and LogCollationAccept.cc call NetProcessor::accept() with frequent_accept false or defaulted. So it was not true everywhere.

I see, yeah, that makes sense, I guess those auxiliary 'server' features did 
not want to create multiple accept threads. I meant "always true" as in for 
the main traffic_server process / HTTP server (i.e. there was no way to make 
it false before).
>
> Perhaps we should leave the default true, but change the calls to NetProcessor::accept to set it false?
>
>> So, unless I'm totally missing something, just changing the default back to
>> 'false' again will break accept_internal() as it was intended to work, no ?
> accept_internal should be unaffected by the default. I think it was intended to check the value and do something based on it. Otherwise there should not even be an option to check.
>
> The question is what the callers should do. I can try tweaking TSNetAccept to turn off frequent_accept, which is original (2.1.6) behavior (although that seems to make mock of the accept_threads parameter). We're still left with the question of why my code fails regression with fast accept.
Alright, I think we are on the same page now. For some reason, I thought you 
meant to turn off (set to false) the frequent_accept for the main server. 
And like I said, I know we had that bug once before (before v2.1.7, when you 
initially made the changes to pass the "opt" parameter, with it defaulted to 
false). My concern is/was that we'd make the same mistake again.

Seeing your explanation, the frequent_accept parameter does make some sense, 
since there are (apparent) use cases where you don't want multiple accept 
threads, or accepts per net-thread (e.g. for the log collation server). So I 
don't think we should remove it (i.e. making it always on seems like a bad 
idea).

It might be that we should clean this up though, a more obvious solution 
would be to only have two options:

     1) One, or several, dedicated accept threads (e.g. num_accept_threads).

     2) Accept on one, or all, of the net-threads


This is basically the same as it works today, but less convoluted. If I 
understand the code correctly, if frequent_accept is false, then you get 
assigned a "random" ET_NET thread to do the non-blocking accept() on. If it 
is true, you either create one or several accept threads to do blocking 
accept() on, or you do non-blocking accept() on all the net-threads.
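To illustrate the two options in isolation, here is a simplified sketch assuming a generic event-loop server rather than the real ATS option handling (the function names and the hand_off() hook are hypothetical): one or more dedicated threads blocking in accept(), versus a non-blocking listen fd watched by every net-thread's poller.

// A simplified sketch of the two modes (generic event-loop server, not the
// real ATS code; hand_off() is a hypothetical placeholder).
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <thread>
#include <vector>

static void hand_off(int fd) {
  // Placeholder: a real server would queue the new fd to a net-thread.
  close(fd);
}

static void blocking_accept_loop(int listen_fd) {
  for (;;) {
    int fd = accept(listen_fd, nullptr, nullptr);   // blocks in the kernel
    if (fd >= 0)
      hand_off(fd);
  }
}

void start_accepting(int listen_fd, int accept_threads,
                     const std::vector<int> &net_thread_epfds) {
  if (accept_threads >= 1) {
    // Mode 1: dedicated accept thread(s) doing blocking accept().
    for (int i = 0; i < accept_threads; ++i)
      std::thread(blocking_accept_loop, listen_fd).detach();
  } else {
    // Mode 2: non-blocking accept() on every net-thread; each thread's event
    // loop calls accept() when its poller reports the listen fd readable.
    fcntl(listen_fd, F_SETFL, fcntl(listen_fd, F_GETFL) | O_NONBLOCK);
    for (int epfd : net_thread_epfds) {
      epoll_event ev{};
      ev.events = EPOLLIN;
      ev.data.fd = listen_fd;
      epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
    }
  }
}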

Then all users of this would allow for this, and especially the InkAPI accept 
APIs would allow for this control as well (unless they already do, I didn't 
check).

Ciao,

-- Leif


Re: Fast Accept

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
Sunday, January 22, 2012, 10:20:58 PM, you wrote:

>      if (frequent_accept) { // true

I read that as "checking to see if frequent_accept is true". 

> What's semi confusing is that the declaration of accept_internal() has the 
> frequent_accept parameter defaulted to "false", yet, as far as I can tell, 
> in this older version of the code, we always called it with "true".

> Fwiw, I also checked v2.0.1, and it also had frequent_accept (as passed 
> around) default to true in every case.
 
MuxVC.cc, InkAPI.cc, and LogCollationAccept.cc call NetProcessor::accept() with frequent_accept false or defaulted. So it was not true everywhere.

Perhaps we should leave the default true, but change the calls to NetProcessor::accept to set it false?
 
> So, unless I'm totally missing something, just changing the default back to 
> 'false' again will break accept_internal() as it was intended to work, no ?

accept_internal should be unaffected by the default. I think it was intended to check the value and do something based on it. Otherwise there should not even be an option to check.

The question is what the callers should do. I can try tweaking TSNetAccept to turn off frequent_accept, which is original (2.1.6) behavior (although that seems to make mock of the accept_threads parameter). We're still left with the question of why my code fails regression with fast accept.


Re: Fast Accept

Posted by Leif Hedstrom <zw...@apache.org>.
On 1/22/12 7:13 PM, Leif Hedstrom wrote:
> Still watching the game at In The Zone, but, with your patch, does it still support both accept threads (1 or more) and non-blocking accept on each net-thread ?
>

So, looking at some older code, v2.1.6 for example, where we passed around 
frequent_accept as a parameter, it was always true. And the code still said

     if (frequent_accept) { // true


What's semi confusing is that the declaration of accept_internal() has the 
frequent_accept parameter defaulted to "false", yet, as far as I can tell, 
in this older version of the code, we always called it with "true". Hence, I 
think the initial commit made the "opt" structure default the value to 
false, and when we noticed it was breaking shit, it was changed to 'true'.

Fwiw, I also checked v2.0.1, and it also had frequent_accept (as passed 
around) default to true in every case. See the code below for how it was 
invoked back then (notice how the frequent_accept parameter is always 'true').

So, unless I'm totally missing something, just changing the default back to 
'false' again will break accept_internal() as it was intended to work, no ?

-- leif

Action *
NetProcessor::main_accept(Continuation * cont, SOCKET fd, int port,
                           sockaddr * bound_sockaddr, int *bound_sockaddr_size,
                           bool accept_only,
                           int recv_bufsize, int send_bufsize, unsigned long sockopt_flags,
                           EventType etype, bool callback_on_open)
{
   (void) accept_only;           // NT only
   Debug("net_processor", "NetProcessor::main_accept - port %d,recv_bufsize %d, send_bufsize %d, sockopt 0x%0lX",
         port, recv_bufsize, send_bufsize, sockopt_flags);
   return ((UnixNetProcessor *) this)->accept_internal(cont, fd, port,
                                                       bound_sockaddr,
                                                       bound_sockaddr_size,
                                                       true,
                                                       net_accept,
                                                       recv_bufsize,
                                                       send_bufsize,
                                                       sockopt_flags,
                                                       ((UnixNetProcessor *) this)->incoming_ip_to_bind_saddr,
                                                       callback_on_open, etype);
}