Posted to dev@httpd.apache.org by Brian Pane <br...@apache.org> on 2005/08/26 09:42:19 UTC
PATCH: lazy initialization of TCP_NODELAY (workaround for 2.6 TCP_CORK problem)
The attached patch delays the setting of TCP_NODELAY on client connections
until the first time core_output_filter has to do a writev_it_all() or
emulate_sendfile().
My motivation for this is to work around the TCP_NODELAY/TCP_CORK problem
in Linux 2.6. However, it also will save a couple of syscalls for any
request that is handled with sendfile(2).
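For illustration only, here is a minimal Python sketch of the lazy approach described above. It is not the attached patch, which changes C code in httpd. The class and method names are hypothetical; write_buffers() stands in for the writev_it_all()/emulate_sendfile() path and send_file() for the sendfile(2) path.

```python
import socket


class LazyNoDelayConnection:
    """Hypothetical wrapper that defers TCP_NODELAY until it is needed."""

    def __init__(self, sock):
        self.sock = sock
        self.nodelay_set = False  # nothing is set at accept time

    def _ensure_nodelay(self):
        # At most one setsockopt call per connection, made on first use.
        if not self.nodelay_set:
            self.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            self.nodelay_set = True

    def write_buffers(self, buffers):
        """Non-sendfile path: enable TCP_NODELAY first, then write."""
        self._ensure_nodelay()
        for buf in buffers:
            self.sock.sendall(buf)

    def send_file(self, fileobj):
        """sendfile path: TCP_NODELAY is never touched here."""
        return self.sock.sendfile(fileobj)
```

A connection that only ever serves files through send_file() makes no TCP_NODELAY call at all, which is where the syscall saving would come from in this sketch.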
Note that there was an APR bug that caused TCP_NODELAY to be set on the
listener socket at startup as a side-effect of TCP_DEFER_ACCEPT being set.
I just committed a fix for this. Without that fix, Linux 2.6 systems using
this httpd patch will still exhibit the corking problem.
Can a couple of volunteers please test and/or review this patch? I'd
appreciate a second opinion before I commit, given the subtlety of the
NODELAY and CORK semantics on various platforms.
Thanks,
Brian
Re: PATCH: lazy initialization of TCP_NODELAY (workaround for 2.6 TCP_CORK problem)
Posted by Brian Pane <br...@apache.org>.
On Aug 26, 2005, at 1:59 AM, Joe Orton wrote:
> On Fri, Aug 26, 2005 at 01:23:15AM -0700, Brian Pane wrote:
>
>> I didn't think it was actually possible for APR to allow TCP_CORK and
>> TCP_NODELAY at the same time. From the tcp(7) manual page on my FC4
>> installation,
>>
>
> That's out of date yes, see recent thread ;)
>
> All 2.6.x kernels do allow setting both TCP_CORK and TCP_NODELAY at the
> same time.
>
> All current 2.6.x kernels have the bug in TCP_CORK handling which means
> that if TCP_NODELAY was ever enabled on the socket, TCP_CORK will not
> take effect.
>
> The fix for that was in the 2.6.13-rc7 release, for the curious:
> http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=89ebd197eb2cd31d6187db344d5117064e19fdde
>
Ah, thanks. With this kernel fix forthcoming, the current httpd
implementation (sans my patch) will do the right thing for both sendfile
and non-sendfile responses.
Brian
Re: PATCH: lazy initialization of TCP_NODELAY (workaround for 2.6 TCP_CORK problem)
Posted by Joe Orton <jo...@redhat.com>.
On Fri, Aug 26, 2005 at 01:23:15AM -0700, Brian Pane wrote:
>
> On Aug 26, 2005, at 12:55 AM, Joe Orton wrote:
>
> >On Fri, Aug 26, 2005 at 12:42:19AM -0700, Brian Pane wrote:
> >
> >>The attached patch delays the setting of TCP_NODELAY on client
> >>connections until the first time core_output_filter has to do a
> >>writev_it_all() or emulate_sendfile(). My motivation for this is to
> >>work around the TCP_NODELAY/TCP_CORK problem in Linux 2.6. However,
> >>it also will save a couple of syscalls for any request that is handled
> >>with sendfile(2).
> >>
> >
> >This will actually end up being *more* expensive on 2.6 systems in the
> >long run, though, right? I'm not convinced this is a good idea. With
> >APR changed to allow setting TCP_CORK and TCP_NODELAY at the same time
> >with 2.6, it is cheaper to just set TCP_NODELAY once on the listening
> >socket and never have to touch it again.
>
> I didn't think it was actually possible for APR to allow TCP_CORK and
> TCP_NODELAY at the same time. From the tcp(7) manual page on my FC4
> installation,
That's out of date yes, see recent thread ;)
All 2.6.x kernels do allow setting both TCP_CORK and TCP_NODELAY at the
same time.
All current 2.6.x kernels have the bug in TCP_CORK handling which means
that if TCP_NODELAY was ever enabled on the socket, TCP_CORK will not
take effect.
The fix for that was in the 2.6.13-rc7 release, for the curious:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=89ebd197eb2cd31d6187db344d5117064e19fdde
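As a rough illustration of the first point, the following Python sketch probes whether the running kernel accepts both options on one socket. The function name is hypothetical, and TCP_CORK is a Linux-specific constant, so the probe simply reports False where it is missing. Note that this only checks that both setsockopt calls succeed; it cannot detect the bug described above, where TCP_CORK is accepted but has no effect.

```python
import socket


def cork_and_nodelay_coexist():
    """Return True if TCP_NODELAY and TCP_CORK can both be enabled."""
    cork = getattr(socket, "TCP_CORK", None)  # Linux-only constant
    if cork is None:
        return False
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            sock.setsockopt(socket.IPPROTO_TCP, cork, 1)
            nodelay_on = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
            cork_on = sock.getsockopt(socket.IPPROTO_TCP, cork) != 0
            return nodelay_on and cork_on
    except OSError:
        # The kernel (or a sandbox) rejected one of the calls.
        return False


if __name__ == "__main__":
    print("both options accepted:", cork_and_nodelay_coexist())
```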
...
> If it's possible to use both TCP_CORK and TCP_NODELAY on the same
> socket in 2.6.13 or later (assuming that's when the fix for the current
> NODELAY toggling problem becomes available), then yes, my lazy
> evaluation approach will be less efficient than just setting TCP_NODELAY
> on the listener socket--for requests that don't use sendfile.
> For requests that do use sendfile, I think the logic implemented by my
> patch will be optimal for both 2.6.1-2.6.12 and 2.6.13+
To be clear, this is only a partial workaround for the kernel bug: if at
any time something is sent on the connection which requires enabling
TCP_NODELAY, any subsequent TCP_CORKs will have no effect.
Given that fact I'm not convinced it's worth changing httpd: this is (1)
a kernel bug and (2) an APR lack-of-feature; with both of those things
fixed the current httpd code is perfectly correct.
joe
Re: PATCH: lazy initialization of TCP_NODELAY (workaround for 2.6 TCP_CORK problem)
Posted by Brian Pane <br...@apache.org>.
On Aug 26, 2005, at 12:55 AM, Joe Orton wrote:
> On Fri, Aug 26, 2005 at 12:42:19AM -0700, Brian Pane wrote:
>
>> The attached patch delays the setting of TCP_NODELAY on client
>> connections until the first time core_output_filter has to do a
>> writev_it_all() or emulate_sendfile(). My motivation for this is to
>> work around the TCP_NODELAY/TCP_CORK problem in Linux 2.6. However,
>> it also will save a couple of syscalls for any request that is handled
>> with sendfile(2).
>>
>
> This will actually end up being *more* expensive on 2.6 systems in the
> long run, though, right? I'm not convinced this is a good idea. With
> APR changed to allow setting TCP_CORK and TCP_NODELAY at the same time
> with 2.6, it is cheaper to just set TCP_NODELAY once on the listening
> socket and never have to touch it again.
I didn't think it was actually possible for APR to allow TCP_CORK and
TCP_NODELAY at the same time. From the tcp(7) manual page on my FC4
installation,
TCP_CORK
    If set, don’t send out partial frames. All queued partial frames
    are sent when the option is cleared again. This is useful for
    prepending headers before calling sendfile(2), or for throughput
    optimization. This option cannot be combined with TCP_NODELAY.
    This option should not be used in code intended to be portable.

and

TCP_NODELAY
    If set, disable the Nagle algorithm. This means that segments are
    always sent as soon as possible, even if there is only a small
    amount of data. When not set, data is buffered until there is a
    sufficient amount to send out, thereby avoiding the frequent
    sending of small packets, which results in poor utilization of
    the network. This option cannot be used at the same time as the
    option TCP_CORK.
Is the manpage out of date?
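To make the "prepending headers before calling sendfile(2)" use case from the excerpt concrete, here is a small Python sketch. It is an illustration under assumptions, not httpd or APR code: the function name is hypothetical, and TCP_CORK is looked up with getattr because the constant only exists on Linux.

```python
import socket

# TCP_CORK is Linux-specific; None means "no corking available here".
TCP_CORK = getattr(socket, "TCP_CORK", None)


def send_corked(sock, header_bytes, body_file):
    """Send headers and a file body, corking the socket around both."""
    if TCP_CORK is not None:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 1)
    try:
        sock.sendall(header_bytes)  # small write, held back while corked
        sock.sendfile(body_file)    # body follows the headers
    finally:
        if TCP_CORK is not None:
            # Clearing the option sends any queued partial frame.
            sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 0)
```

The intent is that the headers and the start of the file leave in the same segment rather than as a small header packet followed by the body.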
If it's possible to use both TCP_CORK and TCP_NODELAY on the same
socket in 2.6.13 or later (assuming that's when the fix for the current
NODELAY toggling problem becomes available), then yes, my lazy
evaluation approach will be less efficient than just setting TCP_NODELAY
on the listener socket--for requests that don't use sendfile. For
requests that do use sendfile, I think the logic implemented by my patch
will be optimal for both 2.6.1-2.6.12 and 2.6.13+.
Brian
Re: PATCH: lazy initialization of TCP_NODELAY (workaround for 2.6 TCP_CORK problem)
Posted by Joe Orton <jo...@redhat.com>.
On Fri, Aug 26, 2005 at 12:42:19AM -0700, Brian Pane wrote:
> The attached patch delays the setting of TCP_NODELAY on client
> connections until the first time core_output_filter has to do a
> writev_it_all() or emulate_sendfile(). My motivation for this is to
> work around the TCP_NODELAY/TCP_CORK problem in Linux 2.6. However,
> it also will save a couple of syscalls for any request that is handled
> with sendfile(2).
This will actually end up being *more* expensive on 2.6 systems in the
long run, though, right? I'm not convinced this is a good idea. With
APR changed to allow setting TCP_CORK and TCP_NODELAY at the same time
with 2.6, it is cheaper to just set TCP_NODELAY once on the listening
socket and never have to touch it again.
> Note that there was an APR bug that caused TCP_NODELAY to be set on
> the listener socket at startup as a side-effect of TCP_DEFER_ACCEPT
> being set.
good catch!
joe