Posted to dev@httpd.apache.org by Jeff Trawick <tr...@gmail.com> on 2005/04/07 14:57:51 UTC

Re: svn commit: r160348 - httpd/httpd/trunk/server/core_filters.c

On Apr 6, 2005 8:22 PM, wsanchez@apache.org <ws...@apache.org> wrote:
> Author: wsanchez
> Date: Wed Apr  6 17:22:29 2005
> New Revision: 160348
> 
> URL: http://svn.apache.org/viewcvs?view=rev&rev=160348
> Log:
> In emulate_sendfile(), handle APR_EAGAIN from apr_socket_send().

> -            rv = apr_socket_send(c->client_socket, &buffer[o], &bytes_sent);
> -            *nbytes += bytes_sent;
> -            if (rv == APR_SUCCESS) {
> -                sendlen -= bytes_sent; /* sendlen != bytes_sent ==> partial write */
> -                o += bytes_sent;       /* o is where we are in the buffer */
> -                togo -= bytes_sent;    /* track how much of the file we've sent */
> +        if (rv == APR_SUCCESS && sendlen) {
> +            while ((rv == APR_SUCCESS || rv == APR_EAGAIN) && sendlen) {

Why would EAGAIN be returned?  There should be a timeout set on the
APR socket.  Either the send works within the timeout period or we get
timeout error or we get some lower-level socket error.

If EAGAIN is really returned, I suspect there is something else to investigate.

EAGAIN on write after poll() on Darwin (re: commit r160348)

Posted by Wilfredo Sánchez Vega <ws...@wsanchez.net>.
   Mike Smith tells me that, for a number of reasons, poll() can report
a socket as writable and the socket can then become unwritable before
you get around to writing to it.  So it's reasonable (but presumably
uncommon) to do a poll() and then get back EAGAIN when you subsequently
try to write().  Which implies that it may not be a bug in the OS after
all.
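
   To make that race concrete, here is a minimal POSIX sketch (not APR
or httpd code) of a send loop that tolerates it.  The name send_all()
and its parameters are hypothetical, chosen only for illustration.  It
treats EAGAIN after a successful poll() as stale readiness and simply
polls again, while tracking partial writes much like sendlen and o do
in the commit.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper, for illustration only.
 * Sends all len bytes of buf on fd, waiting up to timeout_ms for each
 * poll().  *sent is updated with the number of bytes sent so far.
 * Returns 0 on success, -1 on error or timeout (errno is set).
 * Callers may want to ignore SIGPIPE so a closed peer surfaces as EPIPE. */
static int send_all(int fd, const char *buf, size_t len,
                    int timeout_ms, size_t *sent)
{
    size_t o = 0;                       /* o is where we are in the buffer */

    *sent = 0;
    while (o < len) {
        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: poll again */
            return -1;
        }
        if (rc == 0) {
            errno = ETIMEDOUT;          /* nothing became writable in time */
            return -1;
        }

        ssize_t n = send(fd, buf + o, len - o, 0);
        if (n < 0) {
            /* poll() said writable, but the state changed before send():
             * treat the readiness as stale and go back to poll(). */
            if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
                continue;
            return -1;
        }
        o += (size_t)n;                 /* n < len - o means a partial write */
        *sent = o;
    }
    return 0;
}
```

   Note that this sketch restarts the timeout on every poll(), so the
total wait is not bounded by timeout_ms; a real implementation would
probably track a deadline.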

   If so, then the question is whether the correct fix would be in  
APR instead of in HTTPd, since HTTPd isn't really aware here that  
it's dealing with a non-blocking socket.

   Does this make sense?

     -wsv


On Apr 14, 2005, at 12:02 PM, Wilfredo Sánchez Vega wrote:

>   We're investigating possible issues in the system.  One comment  
> from a kernel developer:
>
>     We are returning EWOULDBLOCK because the socket is in non-blocking
>     mode.  Inspecting the socket, so_state is 0x182 (0x100 is SS_NBIO).
>     Setting a breakpoint on soioctl for SS_NBIO I can clearly see that
>     httpd is setting the socket as non-blocking.  httpd is using fcntl,
>     which translates the non-blocking change to an soioctl.
>
>   Does it make sense that the socket is non-blocking?
>
>     -wsv
>
>
>
> On Apr 7, 2005, at 7:00 AM, Paul Querna wrote:
>
>
>> Jeff Trawick wrote:
>>
>>
>>> On Apr 6, 2005 8:22 PM, wsanchez@apache.org <ws...@apache.org> wrote:
>>>
>>>
>>>> Author: wsanchez
>>>> Date: Wed Apr  6 17:22:29 2005
>>>> New Revision: 160348
>>>>
>>>> URL: http://svn.apache.org/viewcvs?view=rev&rev=160348
>>>> Log:
>>>> In emulate_sendfile(), handle APR_EAGAIN from apr_socket_send().
>>>> -            rv = apr_socket_send(c->client_socket, &buffer[o], &bytes_sent);
>>>> -            *nbytes += bytes_sent;
>>>> -            if (rv == APR_SUCCESS) {
>>>> -                sendlen -= bytes_sent; /* sendlen != bytes_sent ==> partial write */
>>>> -                o += bytes_sent;       /* o is where we are in the buffer */
>>>> -                togo -= bytes_sent;    /* track how much of the file we've sent */
>>>> +        if (rv == APR_SUCCESS && sendlen) {
>>>> +            while ((rv == APR_SUCCESS || rv == APR_EAGAIN) && sendlen) {
>>>>
>>>>
>>> Why would EAGAIN be returned?  There should be a timeout set on the
>>> APR socket.  Either the send works within the timeout period or we get
>>> timeout error or we get some lower-level socket error.
>>> If EAGAIN is really returned, I suspect there is something else to investigate.
>>>
>>>
>>
>> Yes, I was talking to wsanchez on IRC, and I suspect there might  
>> be a bug in the OS-X kernel, causing a blocking socket to return  
>> EAGAIN on write().
>>
>>
>
>


Re: svn commit: r160348 - httpd/httpd/trunk/server/core_filters.c

Posted by Jeff Trawick <tr...@gmail.com>.
On 4/15/05, Joe Orton <jo...@redhat.com> wrote:
> On Thu, Apr 14, 2005 at 12:02:35PM -0700, Wilfredo Sánchez Vega wrote:
> >   We're investigating possible issues in the system.  One comment
> > from a kernel developer:
> >
> >     We are returning EWOULDBLOCK because the socket is in non-blocking
> >     mode.  Inspecting the socket, so_state is 0x182 (0x100 is SS_NBIO).
> >     Setting a breakpoint on soioctl for SS_NBIO I can clearly see that
> >     httpd is setting the socket as non-blocking.  httpd is using fcntl,
> >     which translates the non-blocking change to an soioctl.
> >
> >   Does it make sense that the socket is non-blocking?
> 
> Recent httpd releases (2.0.49 onwards) will set listening sockets as
> non-blocking if more than one listener is configured, to fix
> CAN-2004-0174.

At the point of doing actual I/O, these sockets were non-blocking
anyway as far as the kernel is concerned, because that is how APR
implements I/O with timeout (tell kernel that socket is non-blocking,
and handle EAGAIN/EWOULDBLOCK internally with poll()/select()).
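
As a rough illustration of that pattern (a POSIX sketch only, not APR's
actual implementation), the idea looks something like the following.
The helper names set_nonblocking() and send_with_timeout() are
hypothetical.  The caller sees blocking-with-timeout semantics even
though the kernel sees a non-blocking socket.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: tell the kernel the socket is non-blocking. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: send up to len bytes on a non-blocking fd.
 * EAGAIN/EWOULDBLOCK never reaches the caller; it is handled here by
 * waiting in poll() for at most timeout_ms and then retrying.
 * Returns the number of bytes sent (possibly fewer than len), or -1
 * with errno set (ETIMEDOUT if the socket never became writable). */
static ssize_t send_with_timeout(int fd, const void *buf, size_t len,
                                 int timeout_ms)
{
    for (;;) {
        ssize_t n = send(fd, buf, len, 0);

        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* lower-level socket error */

        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        int rc;

        do {
            rc = poll(&pfd, 1, timeout_ms);
        } while (rc < 0 && errno == EINTR);

        if (rc == 0) {
            errno = ETIMEDOUT;          /* the timeout error case */
            return -1;
        }
        if (rc < 0)
            return -1;
        /* poll() reported writable: retry the send.  If that send hits
         * EAGAIN again, we simply wait once more. */
    }
}
```

With a wrapper shaped like this, the caller should only ever see
success, a timeout, or a lower-level error.  Whether APR's retry
behaves the same way after its poll() is exactly what would need
checking.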

> This caused a number of obscure regressions on BSD platforms where APR
> failed to detect (or, in some cases, the OS failed to correctly report)
> whether or not O_NONBLOCK is inherited across to the socket returned by
> accept().  That would be the first thing to check if you're seeing
> issues like this - the test program APR uses is here:

Another thing to check is if apr_poll() is telling the I/O routine
that data is ready when in fact it is not.  I recall some recent
complaints about APR using poll() on OS X 10.3, where poll() has some
negative attributes (I don't recall details).

Re: svn commit: r160348 - httpd/httpd/trunk/server/core_filters.c

Posted by Joe Orton <jo...@redhat.com>.
On Thu, Apr 14, 2005 at 12:02:35PM -0700, Wilfredo Sánchez Vega wrote:
>   We're investigating possible issues in the system.  One comment  
> from a kernel developer:
> 
>     We are returning EWOULDBLOCK because the socket is in non-blocking
>     mode.  Inspecting the socket, so_state is 0x182 (0x100 is SS_NBIO).
>     Setting a breakpoint on soioctl for SS_NBIO I can clearly see that
>     httpd is setting the socket as non-blocking.  httpd is using fcntl,
>     which translates the non-blocking change to an soioctl.
> 
>   Does it make sense that the socket is non-blocking?

Recent httpd releases (2.0.49 onwards) will set listening sockets as
non-blocking if more than one listener is configured, to fix
CAN-2004-0174.

This caused a number of obscure regressions on BSD platforms where APR
failed to detect (or, in some cases, the OS failed to correctly report)
whether or not O_NONBLOCK is inherited across to the socket returned by
accept().  That would be the first thing to check if you're seeing
issues like this - the test program APR uses is here:

http://people.apache.org/~jorton/nonblock.c
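
For readers without access to that file, the check described above can
be sketched roughly as follows.  This is an independent illustration,
not the contents of nonblock.c, and the function name
nonblock_inherited() is hypothetical.  It sets O_NONBLOCK on a loopback
listener, accepts a connection, and asks fcntl() whether the accepted
socket reports the flag.

```c
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if the accepted socket reports O_NONBLOCK, 0 if it does
 * not, and -1 if the test could not be set up. */
static int nonblock_inherited(void)
{
    struct sockaddr_in sa;
    socklen_t salen = sizeof sa;
    struct pollfd pfd;
    int lfd = -1, cfd = -1, afd = -1, flags, result = -1;

    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0;                    /* let the kernel pick a port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&sa, sizeof sa) < 0) goto out;
    if (listen(lfd, 1) < 0) goto out;
    if (getsockname(lfd, (struct sockaddr *)&sa, &salen) < 0) goto out;

    /* Make the listener non-blocking, as httpd does with several listeners. */
    flags = fcntl(lfd, F_GETFL, 0);
    if (flags < 0 || fcntl(lfd, F_SETFL, flags | O_NONBLOCK) < 0) goto out;

    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd < 0) goto out;
    if (connect(cfd, (struct sockaddr *)&sa, sizeof sa) < 0) goto out;

    /* Wait until the connection is actually ready to be accepted. */
    pfd.fd = lfd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) <= 0) goto out;

    afd = accept(lfd, NULL, NULL);
    if (afd < 0) goto out;

    /* This only shows what the OS reports; as noted above, the report
     * itself may be wrong on some platforms. */
    flags = fcntl(afd, F_GETFL, 0);
    if (flags >= 0)
        result = (flags & O_NONBLOCK) ? 1 : 0;

out:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

Because the result depends on the platform, it is best compared against
what APR's build-time detection concluded for the same system.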

joe

Re: svn commit: r160348 - httpd/httpd/trunk/server/core_filters.c

Posted by Wilfredo Sánchez Vega <ws...@wsanchez.net>.
   We're investigating possible issues in the system.  One comment  
from a kernel developer:

     We are returning EWOULDBLOCK because the socket is in non-blocking
     mode.  Inspecting the socket, so_state is 0x182 (0x100 is SS_NBIO).
     Setting a breakpoint on soioctl for SS_NBIO I can clearly see that
     httpd is setting the socket as non-blocking.  httpd is using fcntl,
     which translates the non-blocking change to an soioctl.

   Does it make sense that the socket is non-blocking?

     -wsv



On Apr 7, 2005, at 7:00 AM, Paul Querna wrote:

> Jeff Trawick wrote:
>
>> On Apr 6, 2005 8:22 PM, wsanchez@apache.org <ws...@apache.org> wrote:
>>
>>> Author: wsanchez
>>> Date: Wed Apr  6 17:22:29 2005
>>> New Revision: 160348
>>>
>>> URL: http://svn.apache.org/viewcvs?view=rev&rev=160348
>>> Log:
>>> In emulate_sendfile(), handle APR_EAGAIN from apr_socket_send().
>>> -            rv = apr_socket_send(c->client_socket, &buffer[o], &bytes_sent);
>>> -            *nbytes += bytes_sent;
>>> -            if (rv == APR_SUCCESS) {
>>> -                sendlen -= bytes_sent; /* sendlen != bytes_sent ==> partial write */
>>> -                o += bytes_sent;       /* o is where we are in the buffer */
>>> -                togo -= bytes_sent;    /* track how much of the file we've sent */
>>> +        if (rv == APR_SUCCESS && sendlen) {
>>> +            while ((rv == APR_SUCCESS || rv == APR_EAGAIN) && sendlen) {
>>>
>> Why would EAGAIN be returned?  There should be a timeout set on the
>> APR socket.  Either the send works within the timeout period or we get
>> timeout error or we get some lower-level socket error.
>> If EAGAIN is really returned, I suspect there is something else to investigate.
>>
>
> Yes, I was talking to wsanchez on IRC, and I suspect there might be  
> a bug in the OS-X kernel, causing a blocking socket to return  
> EAGAIN on write().
>


Re: svn commit: r160348 - httpd/httpd/trunk/server/core_filters.c

Posted by Paul Querna <ch...@force-elite.com>.
Jeff Trawick wrote:
> On Apr 6, 2005 8:22 PM, wsanchez@apache.org <ws...@apache.org> wrote:
> 
>>Author: wsanchez
>>Date: Wed Apr  6 17:22:29 2005
>>New Revision: 160348
>>
>>URL: http://svn.apache.org/viewcvs?view=rev&rev=160348
>>Log:
>>In emulate_sendfile(), handle APR_EAGAIN from apr_socket_send().
> 
> 
>>-            rv = apr_socket_send(c->client_socket, &buffer[o], &bytes_sent);
>>-            *nbytes += bytes_sent;
>>-            if (rv == APR_SUCCESS) {
>>-                sendlen -= bytes_sent; /* sendlen != bytes_sent ==> partial write */
>>-                o += bytes_sent;       /* o is where we are in the buffer */
>>-                togo -= bytes_sent;    /* track how much of the file we've sent */
>>+        if (rv == APR_SUCCESS && sendlen) {
>>+            while ((rv == APR_SUCCESS || rv == APR_EAGAIN) && sendlen) {
> 
> 
> Why would EAGAIN be returned?  There should be a timeout set on the
> APR socket.  Either the send works within the timeout period or we get
> timeout error or we get some lower-level socket error.
> 
> If EAGAIN is really returned, I suspect there is something else to investigate.
> 

Yes, I was talking to wsanchez on IRC, and I suspect there might be a 
bug in the OS-X kernel, causing a blocking socket to return EAGAIN on 
write().