Posted to dev@httpd.apache.org by "Roy T. Fielding" <fi...@kiwi.ics.uci.edu> on 1997/07/14 16:25:31 UTC

[PATCH] Unbuffering slow CGI output

As far as I can tell, this is the right way to handle slow CGI scripts
that need output sent before a full buffer is available.  It has no
ill effects on fast-running scripts or files.  In fact, it probably
speeds them up as well, since we avoid the overhead of stdio.  On OSes
that do not support POSIX or BSD non-blocking I/O, it will still work as
it does today.  The only problem might be odd OSes that lie about
non-blocking support, which I seem to remember being the case for MPE.

Could someone test this on a real site?  (and MPE)

....Roy


Index: http_protocol.c
===================================================================
RCS file: /export/home/cvs/apache/src/http_protocol.c,v
retrieving revision 1.136
diff -c -r1.136 http_protocol.c
*** http_protocol.c	1997/07/14 11:28:55	1.136
--- http_protocol.c	1997/07/14 14:22:34
***************
*** 1532,1540 ****
--- 1532,1556 ----
      char buf[IOBUFSIZE];
      long total_bytes_sent = 0;
      register int n, w, o, len;
+     int fudstat;
+     int fud = fileno(f);
      
      if (length == 0) return 0;
  
+ #if defined(O_NONBLOCK) || defined(F_NDELAY)
+     /* Use non-blocking reads so that we can flush the output buffer
+      * instead of blocking on a slow CGI script (or module).
+      */
+     if ((fudstat = fcntl(fud, F_GETFL, 0)) != -1) {
+ #if defined(O_NONBLOCK)
+         fudstat |= O_NONBLOCK;
+ #elif defined(F_NDELAY)
+         fudstat |= F_NDELAY;
+ #endif
+         fudstat = fcntl(fud, F_SETFL, fudstat);
+     }
+ #endif
+ 
      soft_timeout("send body", r);
  
      while (!r->connection->aborted) {
***************
*** 1542,1554 ****
  	    len = length - total_bytes_sent;
  	else len = IOBUFSIZE;
  
!         while ((n= fread(buf, sizeof(char), len, f)) < 1
! 	       && ferror(f) && errno == EINTR && !r->connection->aborted)
! 	    continue;
! 	
! 	if (n < 1) {
              break;
          }
          o=0;
  	total_bytes_sent += n;
  
--- 1558,1581 ----
  	    len = length - total_bytes_sent;
  	else len = IOBUFSIZE;
  
!         while (((n = read(fud, buf, len)) < 0)
!                && errno == EINTR && !r->connection->aborted)
!             continue;
! 
!         if (n == 0 || r->connection->aborted)  /* EOF */
              break;
+         else if (n < 0) {
+             if (errno == EAGAIN) {             /* read would have blocked */
+                 errno = 0;
+                 bflush(r->connection->client);
+                 continue;
+             }
+             else {
+                 log_unixerr("read failed in send body", NULL, NULL, r->server);
+                 break;
+             }
          }
+                 
          o=0;
  	total_bytes_sent += n;
  

Re: [PATCH] Unbuffering slow CGI output

Posted by Brian Behlendorf <br...@organic.com>.
Hmm, I tried this patch, and got some weird problems.  You can see them at
http://hyperreal.org:8001/ if you're interested.  I'll dive in some more to
see what's wrong.  It appears to be only a problem with CGI scripts...

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
"Why not?" - TL           brian@organic.com - hyperreal.org - apache.org

Re: [PATCH] Unbuffering slow CGI output

Posted by Dean Gaudet <dg...@arctic.org>.
That's because Roy's patch calls fileno(f) and reads from the raw
descriptor after various other operations have already been performed on
the FILE *.  This is a no-no: stdio has already buffered data at that
point, so read() never sees it.  There is no quick fix; this whole CGI
buffering thing requires some pretty extensive work, imho.
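
The effect can be reproduced without Apache.  The sketch below is an
illustration under assumptions, not the server's code: the hypothetical
helper raw_read_after_headers() consumes header lines through stdio and
then bypasses stdio with read(fileno(f), ...), as the patch does.  On
common stdio implementations the first fgets() pulls everything available
on the pipe into the FILE buffer, so the raw read() finds no body.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fdopen() */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read header lines via stdio up to the blank line,
 * then read the "body" directly from the underlying descriptor.
 * Returns whatever read() returns. */
ssize_t raw_read_after_headers(FILE *f, char *rest, size_t rest_len)
{
    char line[256];

    assert(f != NULL);
    /* stdio fills its own buffer here, typically with far more than one
     * line if the script wrote headers and body in a single flush. */
    while (fgets(line, sizeof line, f) != NULL) {
        if (strcmp(line, "\r\n") == 0 || strcmp(line, "\n") == 0)
            break;
    }
    /* Bypassing stdio: only bytes still in the kernel are visible, so
     * anything already sitting in the FILE buffer is skipped. */
    return read(fileno(f), rest, rest_len);
}
```

If the script sends "Content-type: text/html\r\n\r\nblah blah" in one
write and then exits, the raw read() typically returns 0 and "blah blah"
is left behind in the FILE buffer, where only stdio calls can reach it.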

I want to turn the whole CGI thing into a select event loop with
non-blocking descriptors.  It'll require help from buff.c to deal with
problems like the above... and I'll probably ditch the use of FILE *
completely, though that could hamper backwards compatibility.
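
As a rough sketch of what such a loop could look like (an assumption for
illustration, not a design anyone has committed to), the code below
multiplexes a script's stdout and stderr with select().  The names
struct sink, drain() and pump() are hypothetical, and the fixed-size
buffers stand in for the client connection and the error log.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical fixed-size sink standing in for the client or the log. */
struct sink { char data[256]; size_t used; };

/* Read once from fd into s.  Returns 1 while fd should stay in the loop,
 * 0 at EOF, on error, or when the sink is full. */
static int drain(int fd, struct sink *s)
{
    ssize_t n;

    do {
        n = read(fd, s->data + s->used, sizeof s->data - s->used);
    } while (n < 0 && errno == EINTR);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 1;                 /* spurious wakeup on a non-blocking fd */
    if (n <= 0)
        return 0;
    s->used += (size_t)n;
    return 1;
}

/* Collect out_fd and err_fd until both reach EOF.  Returns 0 on success,
 * -1 if select() fails. */
int pump(int out_fd, int err_fd, struct sink *out, struct sink *err)
{
    int out_open = 1, err_open = 1;

    assert(out_fd >= 0 && err_fd >= 0);
    while (out_open || err_open) {
        fd_set rfds;
        int maxfd = -1;

        FD_ZERO(&rfds);
        if (out_open) { FD_SET(out_fd, &rfds); if (out_fd > maxfd) maxfd = out_fd; }
        if (err_open) { FD_SET(err_fd, &rfds); if (err_fd > maxfd) maxfd = err_fd; }

        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (out_open && FD_ISSET(out_fd, &rfds))
            out_open = drain(out_fd, out);
        if (err_open && FD_ISSET(err_fd, &rfds))
            err_open = drain(err_fd, err);
    }
    return 0;
}
```

A real server would write each chunk to the client instead of storing it,
and could pass select() a timeout so that pending output is flushed
whenever the script goes quiet.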

As a side benefit, this could actually process the stderr stream from the
CGI and tack a timestamp or other data onto the beginning of each line.

But it's not a trivial task. 

Dean

On Mon, 14 Jul 1997, Brian Behlendorf wrote:

> 
> Okay, it looks like this patch is losing whatever body part is in the first
> flush from the CGI script to the server, though whatever headers are a part
> of that flush are captured correctly.  I.e. if the CGI output looks like this:
> 
> flush1:    Content-type: text/html\r
>            \r
> flush2:    blah blah
> 
> The server will do the right thing, but if it sees
> 
> flush1:    Content-type: text/html\r
>            \r
>            blah blah
> flush2:    bletch bletch
> 
> The client will only see "bletch bletch".  
> 
> So, for scripts which don't flush at the end of the headers (which I would
> presume to be any perl script which doesn't set "$| = 1" and have the
> headers in a separate print() statement, for example) a large chunk of
> their CGI script output will get eaten.
> 
> This should be enough to help track down the problem...
> 
> 	Brian
> 
> 
> --=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
> "Why not?" - TL           brian@organic.com - hyperreal.org - apache.org
> 


Re: [PATCH] Unbuffering slow CGI output

Posted by Brian Behlendorf <br...@organic.com>.
Okay, it looks like this patch is losing whatever body part is in the first
flush from the CGI script to the server, though whatever headers are a part
of that flush are captured correctly.  I.e. if the CGI output looks like this:

flush1:    Content-type: text/html\r
           \r
flush2:    blah blah

The server will do the right thing, but if it sees

flush1:    Content-type: text/html\r
           \r
           blah blah
flush2:    bletch bletch

The client will only see "bletch bletch".  

So, for scripts which don't flush at the end of the headers (which I would
presume to be any Perl script which doesn't set "$| = 1" and print the
headers in a separate print() statement, for example), a large chunk of
their output will get eaten.

This should be enough to help track down the problem...

	Brian


--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
"Why not?" - TL           brian@organic.com - hyperreal.org - apache.org