You are viewing a plain text version of this content. The canonical link for it is here.

Posted to commits@apr.apache.org by tr...@apache.org on 2001/05/04 00:38:02 UTC

cvs commit: apr/network_io/unix sendrecv.c

trawick     01/05/03 15:38:02

  Modified:    .        CHANGES
               test     sendfile.c
               network_io/unix sendrecv.c
  Log:
  fix a problem with the FreeBSD flavor of apr_sendfile(); we could
  return APR_EAGAIN+bytes sent for a non-blocking socket
  
  Revision  Changes    Path
  1.101     +3 -0      apr/CHANGES
  
  Index: CHANGES
  ===================================================================
  RCS file: /home/cvs/apr/CHANGES,v
  retrieving revision 1.100
  retrieving revision 1.101
  diff -u -r1.100 -r1.101
  --- CHANGES	2001/05/02 02:54:11	1.100
  +++ CHANGES	2001/05/03 22:37:54	1.101
  @@ -1,5 +1,8 @@
   Changes with APR b1  
   
  +  *) Fix a problem with the FreeBSD flavor of apr_sendfile() where we 
  +     could return APR_EAGAIN+bytes_sent.  [Jeff Trawick]
  +
     *) Fix a problem on unixware where clearing h_errno wouldn't work.
        Use set_h_errno() instead.  PR #7651  [Jeff Trawick]
   
  
  
  
  1.13      +1 -0      apr/test/sendfile.c
  
  Index: sendfile.c
  ===================================================================
  RCS file: /home/cvs/apr/test/sendfile.c,v
  retrieving revision 1.12
  retrieving revision 1.13
  diff -u -r1.12 -r1.13
  --- sendfile.c	2001/03/31 22:37:13	1.12
  +++ sendfile.c	2001/05/03 22:37:58	1.13
  @@ -373,6 +373,7 @@
               printf("apr_sendfile()->%d, sent %ld bytes\n", rv, (long)tmplen);
               if (rv) {
                   if (APR_STATUS_IS_EAGAIN(rv)) {
  +                    assert(tmplen == 0);
                       nsocks = 1;
                       tmprv = apr_poll(pfd, &nsocks, -1);
                       assert(!tmprv);
  
  
  
  1.65      +12 -6     apr/network_io/unix/sendrecv.c
  
  Index: sendrecv.c
  ===================================================================
  RCS file: /home/cvs/apr/network_io/unix/sendrecv.c,v
  retrieving revision 1.64
  retrieving revision 1.65
  diff -u -r1.64 -r1.65
  --- sendrecv.c	2001/03/31 13:25:45	1.64
  +++ sendrecv.c	2001/05/03 22:38:00	1.65
  @@ -376,11 +376,6 @@
       return rv < 0 ? errno : APR_SUCCESS;
   }
   
  -/* These are just demos of how the code for the other OSes.
  - * I haven't tested these, but they're right in terms of interface.
  - * I just wanted to see what types of vars would be required from other OSes. 
  - */
  -
   #elif defined(__FreeBSD__)
   
   /* Release 3.1 or greater */
  @@ -423,6 +418,10 @@
            * first chunk of data twice, once in the first call and once in the
            * second.  If we are using a timed write, then we check to make sure
            * we can send data before trying to send it.
  +         *
  +         * JLT: doing this first doesn't eliminate the possibility that
  +         *      we get -1/EAGAIN/nbytes>0; AFAICT it just means extra syscalls
  +         *      from time to time
            */
           apr_status_t arv = wait_for_io_or_timeout(sock, 0);
           if (arv != APR_SUCCESS) {
  @@ -444,6 +443,13 @@
                             &headerstruct, /* Headers/footers */
                             &nbytes,       /* number of bytes written */
                             flags);        /* undefined, set to 0 */
  +            /* FreeBSD's sendfile can return -1/EAGAIN even if it
  +             * sent bytes.  Sanitize the result so we get normal EAGAIN
  +             * semantics w.r.t. bytes sent.
  +             */
  +            if (rv == -1 && errno == EAGAIN && nbytes) {
  +                rv = 0;
  +            }
           }
           else {
               /* just trailer bytes... use writev()
  @@ -462,7 +468,7 @@
       } while (rv == -1 && errno == EINTR);
   
       (*len) = nbytes;
  -    if (rv == -1 && (errno != EAGAIN || (errno == EAGAIN && sock->timeout < 0))) {
  +    if (rv == -1) {
           return errno;
       }
       return APR_SUCCESS;

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On Sat, May 05, 2001 at 12:11:26AM -0400, Jeff Trawick wrote:
 
> If it were non-blocking, it would busy-loop while waiting for acks
> from the client since there is no call to select or poll to wait for
> writability.  Extremely unlikely that such looping would go unnoticed :)

well, like i said, it was just off the top of my head ;-)

in that case, i have absolutely no idea why it would be in a loop...
paranoia?

-- 
garrett rooney                     Unix was not designed to stop you from 
rooneg@electricjellyfish.net       doing stupid things, because that would  
http://electricjellyfish.net/      stop you from doing clever things.

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by Jeff Trawick <tr...@bellsouth.net>.

Garrett Rooney <ro...@electricjellyfish.net> writes:

> On Fri, May 04, 2001 at 01:15:17PM -0400, Jeff Trawick wrote:
> > I noticed this weird use of sendfile() in FreeBSD's ftpd.c:
> > 
> >                         while (err != -1 && cnt < filesize) {
> >                                 err = sendfile(filefd, netfd, offset, len,
> >                                         (struct sf_hdtr *) NULL, &cnt, 0);
> >                                 byte_count += cnt;
> >                                 offset += cnt;
> >                                 len -= cnt;
> > 
> >                                 if (err == -1) {
> >                                         if (!cnt)
> >                                                 goto oldway;
> > 
> >                                         goto data_err;
> >                                 }
> >                         }
> > 
> > I don't understand why there is a loop...  (I have no idea if there is
> > a connection between the unclear need for this loop and BSD's
> > sendfile() returning 0/0...)
> 
> this is just off the top of my head, but could that socket (in the ftpd code) 
> be set for non-blocking io?  the man page for sendfile() on freebsd indicates
> that in such a case, less than the total number of bytes could be sent, and
> EAGAIN would be returned.  thus the loop.

If it were non-blocking, it would busy-loop while waiting for acks
from the client since there is no call to select or poll to wait for
writability.  Extremely unlikely that such looping would go unnoticed :)

-- 
Jeff Trawick | trawickj@bellsouth.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by Garrett Rooney <ro...@electricjellyfish.net>.

On Fri, May 04, 2001 at 01:15:17PM -0400, Jeff Trawick wrote:
> We have one coredump for an assertion (ohmygod) failure on
> apache.org where apr_sendfile() returned APR_SUCCESS/zero-bytes-read
> for a socket with a timeout.  After reading the code many times I just
> don't see how, unless the kernel had a slip-up.
> 
> I noticed this weird use of sendfile() in FreeBSD's ftpd.c:
> 
>                         while (err != -1 && cnt < filesize) {
>                                 err = sendfile(filefd, netfd, offset, len,
>                                         (struct sf_hdtr *) NULL, &cnt, 0);
>                                 byte_count += cnt;
>                                 offset += cnt;
>                                 len -= cnt;
> 
>                                 if (err == -1) {
>                                         if (!cnt)
>                                                 goto oldway;
> 
>                                         goto data_err;
>                                 }
>                         }
> 
> I don't understand why there is a loop...  (I have no idea if there is
> a connection between the unclear need for this loop and BSD's
> sendfile() returning 0/0...)

this is just off the top of my head, but could that socket (in the ftpd code) 
be set for non-blocking io?  the man page for sendfile() on freebsd indicates
that in such a case, less than the total number of bytes could be sent, and
EAGAIN would be returned.  thus the loop.

-- 
garrett rooney                     Unix was not designed to stop you from 
rooneg@electricjellyfish.net       doing stupid things, because that would  
http://electricjellyfish.net/      stop you from doing clever things.

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by Jeff Trawick <tr...@bellsouth.net>.

> trawick     01/05/03 15:38:02
> 
>   Modified:    .        CHANGES
>                test     sendfile.c
>                network_io/unix sendrecv.c

>   Index: sendrecv.c
>   ===================================================================
>   RCS file: /home/cvs/apr/network_io/unix/sendrecv.c,v
>   retrieving revision 1.64
>   retrieving revision 1.65
>   diff -u -r1.64 -r1.65
>   --- sendrecv.c	2001/03/31 13:25:45	1.64
>   +++ sendrecv.c	2001/05/03 22:38:00	1.65
...
>   @@ -444,6 +443,13 @@
>                              &headerstruct, /* Headers/footers */
>                              &nbytes,       /* number of bytes written */
>                              flags);        /* undefined, set to 0 */
>   +            /* FreeBSD's sendfile can return -1/EAGAIN even if it
>   +             * sent bytes.  Sanitize the result so we get normal EAGAIN
>   +             * semantics w.r.t. bytes sent.
>   +             */
>   +            if (rv == -1 && errno == EAGAIN && nbytes) {
>   +                rv = 0;
>   +            }
>            }
>            else {
>                /* just trailer bytes... use writev()
>   @@ -462,7 +468,7 @@
>        } while (rv == -1 && errno == EINTR);
>    
>        (*len) = nbytes;
>   -    if (rv == -1 && (errno != EAGAIN || (errno == EAGAIN && sock->timeout < 0))) {
>   +    if (rv == -1) {
>            return errno;
>        }
>        return APR_SUCCESS;

We have one coredump for an assertion (ohmygod) failure on
apache.org where apr_sendfile() returned APR_SUCCESS/zero-bytes-read
for a socket with a timeout.  After reading the code many times I just
don't see how, unless the kernel had a slip-up.

I noticed this weird use of sendfile() in FreeBSD's ftpd.c:

                        while (err != -1 && cnt < filesize) {
                                err = sendfile(filefd, netfd, offset, len,
                                        (struct sf_hdtr *) NULL, &cnt, 0);
                                byte_count += cnt;
                                offset += cnt;
                                len -= cnt;

                                if (err == -1) {
                                        if (!cnt)
                                                goto oldway;

                                        goto data_err;
                                }
                        }

I don't understand why there is a loop...  (I have no idea if there is
a connection between the unclear need for this loop and BSD's
sendfile() returning 0/0...)

-- 
Jeff Trawick | trawickj@bellsouth.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by Jeff Trawick <tr...@bellsouth.net>.

"Roy T. Fielding" <fi...@ebuilt.com> writes:

> >   Index: sendfile.c
> >   ===================================================================
> >   RCS file: /home/cvs/apr/test/sendfile.c,v
> >   retrieving revision 1.12
> >   retrieving revision 1.13
> >   diff -u -r1.12 -r1.13
> >   --- sendfile.c	2001/03/31 22:37:13	1.12
> >   +++ sendfile.c	2001/05/03 22:37:58	1.13
> >   @@ -373,6 +373,7 @@
> >                printf("apr_sendfile()->%d, sent %ld bytes\n", rv, (long)tmplen);
> >                if (rv) {
> >                    if (APR_STATUS_IS_EAGAIN(rv)) {
> >   +                    assert(tmplen == 0);
> >                        nsocks = 1;
> >                        tmprv = apr_poll(pfd, &nsocks, -1);
> >                        assert(!tmprv);
> 
> No assert should ever be present in server code, period.  I don't care if
> the only way it could be triggered is by a cosmic ray hittng a memory cell
> at just the wrong moment in time, it doesn't belong in the server
> code.

My first take was to suggest that you stop being excited about an
assert() in a test program.  But then I disagree 100% with your
comments on assertions :)  I practically always prefer that software
go out with a bang rather than fail silently, even when I don't have
time or don't want to complicate the logic by handling unanticipated
conditions more carefully than with an assertion.

When I used to work on mission critical code we had a fair number of
asserts, usually added when they could be done so with little extra
pathlength... They are infinitely more reliable than any comment and
they also acknowledge the fact that shit happens and you need to
figure out where things went south.

I like that Apache has asserts that only work with AP_DEBUG builds as
well as asserts that are always active.  Some AP_DEBUG asserts added
by Greg Ames and others have helped us get some nasty bugs out.

-- 
Jeff Trawick | trawickj@bellsouth.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by rb...@covalent.net.

On Thu, 3 May 2001, Roy T. Fielding wrote:

> > We have asserts throughout the server these days.  Do we want to remove
> > them all?
>
> I want to remove them all, but won't do so until I've done all of the
> other things I want to do.  The problem with asserts is that causing the
> server process to core dump is never a good idea, even when a person is
> in debug mode, unless the condition being checked would result in a
> fatal error in any case (like a later pointer de-ref or divide-by-zero).
> Expected run-time errors should just be logged and handled.
> In all other cases it is better to bullet-proof the code or simply ignore
> that condition as a possibility.  Cosmic rays and memory errors do occur,
> so an assert should not be used unless it is a truly fatal condition.
>
> The problem with conditionally-compiled debug switches is they change
> the code such that the stuff we test isn't the same as gets deployed,
> which can mask race conditions.
>
> However, the main reason that I don't like asserts is that they lead to
> programmer mistakes -- people assuming that a condition will never occur
> in all environments simply because none of their own debug tests in their
> own environment triggered the assert.  I think the code should either be
> logically complete (handle all cases) or fail-soft.

Well, you have just said what I believe in a much more concise way then I
ever could.  :-)  I seriously asked the question because I was hoping for
an answer like this, but couldn't voice it.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by "Roy T. Fielding" <fi...@ebuilt.com>.

> We have asserts throughout the server these days.  Do we want to remove
> them all?

I want to remove them all, but won't do so until I've done all of the
other things I want to do.  The problem with asserts is that causing the
server process to core dump is never a good idea, even when a person is
in debug mode, unless the condition being checked would result in a
fatal error in any case (like a later pointer de-ref or divide-by-zero).
Expected run-time errors should just be logged and handled.
In all other cases it is better to bullet-proof the code or simply ignore
that condition as a possibility.  Cosmic rays and memory errors do occur,
so an assert should not be used unless it is a truly fatal condition.

The problem with conditionally-compiled debug switches is they change
the code such that the stuff we test isn't the same as gets deployed,
which can mask race conditions.

However, the main reason that I don't like asserts is that they lead to
programmer mistakes -- people assuming that a condition will never occur
in all environments simply because none of their own debug tests in their
own environment triggered the assert.  I think the code should either be
logically complete (handle all cases) or fail-soft.

....Roy

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by Dirk-Willem van Gulik <di...@covalent.net>.

You two guys might violently agree :-)

Asserts - when used properly - are very usefull tools to check for
conceptional bugs, But you should always assume that you compile for
production with the -NA, -DNDEBUG, -NOASSERT or your local relevant flags.

Production code which relies on assert()s are capital bad.

As long as that is our premisse :-)

Dw

On Thu, 3 May 2001 rbb@covalent.net wrote:

>
> > >   Index: sendfile.c
> > >   ===================================================================
> > >   RCS file: /home/cvs/apr/test/sendfile.c,v
> > >   retrieving revision 1.12
> > >   retrieving revision 1.13
> > >   diff -u -r1.12 -r1.13
> > >   --- sendfile.c	2001/03/31 22:37:13	1.12
> > >   +++ sendfile.c	2001/05/03 22:37:58	1.13
> > >   @@ -373,6 +373,7 @@
> > >                printf("apr_sendfile()->%d, sent %ld bytes\n", rv, (long)tmplen);
> > >                if (rv) {
> > >                    if (APR_STATUS_IS_EAGAIN(rv)) {
> > >   +                    assert(tmplen == 0);
> > >                        nsocks = 1;
> > >                        tmprv = apr_poll(pfd, &nsocks, -1);
> > >                        assert(!tmprv);
> >
> > No assert should ever be present in server code, period.  I don't care if
> > the only way it could be triggered is by a cosmic ray hittng a memory cell
> > at just the wrong moment in time, it doesn't belong in the server code.
> >
> > Either check the error condition or be prepared to ignore it.
>
> We have asserts throughout the server these days.  Do we want to remove
> them all?
>
> Ryan
>
> _______________________________________________________________________________
> Ryan Bloom                        	rbb@apache.org
> 406 29th St.
> San Francisco, CA 94131
> -------------------------------------------------------------------------------
>
>

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by rb...@covalent.net.

> >   Index: sendfile.c
> >   ===================================================================
> >   RCS file: /home/cvs/apr/test/sendfile.c,v
> >   retrieving revision 1.12
> >   retrieving revision 1.13
> >   diff -u -r1.12 -r1.13
> >   --- sendfile.c	2001/03/31 22:37:13	1.12
> >   +++ sendfile.c	2001/05/03 22:37:58	1.13
> >   @@ -373,6 +373,7 @@
> >                printf("apr_sendfile()->%d, sent %ld bytes\n", rv, (long)tmplen);
> >                if (rv) {
> >                    if (APR_STATUS_IS_EAGAIN(rv)) {
> >   +                    assert(tmplen == 0);
> >                        nsocks = 1;
> >                        tmprv = apr_poll(pfd, &nsocks, -1);
> >                        assert(!tmprv);
>
> No assert should ever be present in server code, period.  I don't care if
> the only way it could be triggered is by a cosmic ray hittng a memory cell
> at just the wrong moment in time, it doesn't belong in the server code.
>
> Either check the error condition or be prepared to ignore it.

We have asserts throughout the server these days.  Do we want to remove
them all?

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: cvs commit: apr/network_io/unix sendrecv.c

Posted by "Roy T. Fielding" <fi...@ebuilt.com>.

>   Index: sendfile.c
>   ===================================================================
>   RCS file: /home/cvs/apr/test/sendfile.c,v
>   retrieving revision 1.12
>   retrieving revision 1.13
>   diff -u -r1.12 -r1.13
>   --- sendfile.c	2001/03/31 22:37:13	1.12
>   +++ sendfile.c	2001/05/03 22:37:58	1.13
>   @@ -373,6 +373,7 @@
>                printf("apr_sendfile()->%d, sent %ld bytes\n", rv, (long)tmplen);
>                if (rv) {
>                    if (APR_STATUS_IS_EAGAIN(rv)) {
>   +                    assert(tmplen == 0);
>                        nsocks = 1;
>                        tmprv = apr_poll(pfd, &nsocks, -1);
>                        assert(!tmprv);

No assert should ever be present in server code, period.  I don't care if
the only way it could be triggered is by a cosmic ray hittng a memory cell
at just the wrong moment in time, it doesn't belong in the server code.

Either check the error condition or be prepared to ignore it.

....Roy