Posted to dev@httpd.apache.org by Marc Slemko <ma...@znep.com> on 1998/03/06 03:33:23 UTC

non-buffered CGIs suck

...from a network perspective.

Try running this:

#include <sys/types.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

#define HEADERS "Content-type: text/plain\n\n"
int main () {
        char *s = "this is a line that is being sent\n ";
        int i;
        write(STDOUT_FILENO, HEADERS, strlen(HEADERS));
        for (i = 0; i < 200; i++) {
                write(STDOUT_FILENO, s, strlen(s));
                usleep(1);      /* just here to force a context switch */
        }
        return 0;
}

And you will see many small packets, it will take twice as long to
transfer as buffered CGI did, etc.  It is not very nice to the network.
While many CGIs will have their own buffering (eg. stdio), I'm still not
comfortable.
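
For comparison, a minimal sketch of the same loop pushed through stdio
(assuming stdout is a pipe to the server, so stdio fully buffers it): the
~7k of output reaches the network in a couple of large writes instead of
200 tiny ones.

#include <stdio.h>
#include <unistd.h>

int main () {
        char *s = "this is a line that is being sent\n";
        int i;
        printf("Content-type: text/plain\n\n");
        for (i = 0; i < 200; i++) {
                /* fputs only fills stdio's buffer; the kernel sees a few
                 * large write()s when the buffer fills and at exit, not
                 * one small write() per line */
                fputs(s, stdout);
                usleep(1);
        }
        return 0;
}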

How about something like the below?  Note that this isn't complete; there
really should be a limit on the total length of time we will do this for
before we flush.  eg. a CGI writing a byte every 50 ms won't get
flushed in this case.  This is one of those (few) times I wish the world
were a Linux, since it would be easy then. 

For everything else, I really think that losing this bit of "realtime" is
worthwhile and has minimal impact.  If we didn't disable Nagle, we
wouldn't have to worry about it, however currently we do disable Nagle so
we have to fake our own without being able to do it right.

Index: http_protocol.c
===================================================================
RCS file: /export/home/cvs//apache-1.3/src/main/http_protocol.c,v
retrieving revision 1.194
diff -u -r1.194 http_protocol.c
--- http_protocol.c	1998/03/04 02:28:16	1.194
+++ http_protocol.c	1998/03/06 02:28:47
@@ -1658,11 +1658,27 @@
             len = IOBUFSIZE;
 
         do {
+	    struct timeval tv;
+
             n = bread(fb, buf, len);
             if (n >= 0 || r->connection->aborted)
                 break;
             if (n < 0 && errno != EAGAIN)
                 break;
+
+	    /*
+	     * we really don't want to be shoving lots of small data out
+	     * to the network, so hang around for 100ms to see if we can
+	     * grab anything else.
+	     */
+	    tv.tv_sec = 0;
+	    tv.tv_usec = 100000;
+	    FD_SET(fd, &fds);
+	    if (ap_select(fd + 1, &fds, NULL, &fds, &tv) > 0) {
+		/* something more to read, lets give it a shot */
+		continue;
+	    }
+
             /* we need to block, so flush the output first */
             bflush(r->connection->client);
             if (r->connection->aborted)


Re: non-buffered CGIs suck

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Dean Gaudet wrote:
> 
> And I still disagree with every single CGI FAQ that says "set $| =1; in
> your perl scripts".  I've never understood why that is there.  I never
> seem to require it.  At least our FAQ explains that you should turn
> buffering back on.

That's to ensure that the header gets sent to the server before it
times out, so it won't assume the script is completely dysfunctional.
If your script didn't do this, emitted a small header, and then went
into mondo-time-computation mode, the server wouldn't get the header
because Perl's stdio hadn't flushed.  This way at least the server
gets a "no thirty" message rather than an aching silence.

True, scripts really *ought* to provide *some* content so the
end user won't bang on the stop button (generating another of
those "client went blooey" messages each time) - but there's
no requirement that scripts not be lame.  Remember, we believe
in giving people rope - and this way we can short-circuit
claims that it's the server's fault when they get hanged.

#ken	P-)}

Ken Coar                    <http://Web.Golux.Com/coar/>
Apache Group member         <http://www.apache.org/>
"Apache Server for Dummies" <http://WWW.Dummies.Com/

Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
Oh yea, and to my _great_ surprise in an informal survey of clients
connecting to a typical Internet server (ie. porn server) I found that a
tremendous number (90%+) of clients are advertising an MSS of 1460 or
close, and a significant fraction (50%+) are doing PMTU-D.  This means that
taking care to keep our segment sizes up is good.

On Thu, 5 Mar 1998, Marc Slemko wrote:

> On Thu, 5 Mar 1998, Dean Gaudet wrote:
> 
> > 
> > 
> > On Thu, 5 Mar 1998, Marc Slemko wrote:
> > 
> > > Why should it have any significant impact at all on them?  Heck, you have
> > > less overhead when there is a delay of less than the select timeout
> > > because you avoid pointless flushes.  When it does timeout and go to
> > > block, you have one extra syscall overhead.
> > > 
> > > What other overhead is there?
> > 
> > 4k chunks never get buffered.  So waiting 100ms for each of them hurts
> > overall throughput. 
> 
> I'm not sure I follow.  If they don't get buffered, where is the problem?
> You do a 4k write.  It doesn't get buffered, so it goes out without a
> flush.  You then wait for either 100ms or the next write, whichever comes
> first.  If the next write comes right away, there is no difference.  This
> code only comes into play if we need to block for the next read.  If you
> do 2k writes, for example, then that 2k could end up being delayed an
> extra 100 ms.
> 
> If you did a 4k write and it didn't get sent until the flush or more data
> was written, it could add delay.  Not necessarily that much though, since
> you have to remember you still have the send buffer size in the TCP stack
> so in bulk data flow I can see no delays since the CGI should be able to
> write at speeds >> than the network can send.
> 
> What really should be done here is to prevent sending things if there
> isn't a full segment, but we have no way to do that.
> 
> > 
> > > Remember prior to 1.1?  We had Nagle enabled.
> > 
> > Doesn't help in all cases though.  But point taken.  How do things look if
> > you re-enable Nagle?
> 
> Things are fine from the packet size perspective if you reenable Nagle,
> but it may cause performance problems in some cases.  The original reason
> for disabling it was due to the fact that we sent the headers in a
> separate segment.  
> 
> We should only run into trouble with Nagle if we have two short segments
> in a row.  Before, that could be the end of one response body and the
> headers of the next response.  Now we don't flush after the headers are
> sent, so that (common) case doesn't happen.  It could happen with just the
> right sequence of cache validation stuff; not when we have a whole bunch
> of requests pipelined at once, but when a new one comes in after we sent
> the last segment of the previous response but before we have the ACK back
> for it.  I am planning on looking to see if it is possible to enable Nagle
> without causing problems.  Nagle is a lot smarter than Apache can be about
> this because of the layer it is at.  I am also looking to see how many
> systems have the sucky segment size problem; I am told that most don't,
> and I don't even see it with all FreeBSD systems.  Not sure why yet. 
> 
> > 
> > And maybe I should check your script on Linux to see if it's another
> > freebsd feature ;)  (couldn't resist ;) 
> > 
> > > > And I still disagree with every single CGI FAQ that says "set $| =1; in
> > > > your perl scripts".  I've never understood why that is there.  I never
> > > > seem to require it.  At least our FAQ explains that you should turn
> > > > buffering back on. 
> > > 
> > > If you do anything that mixes non-buffered and buffered IO you need it or
> > > something similar.  If you do:
> > > 
> > > print "Content-type: text/plain\n\n";
> > > system("/bin/hostname");
> > > 
> > > you need it.
> > 
> > Yeah you're right, I guess I don't write these sort of lame CGIs so I
> > never run into it. 
> > 
> > Dean
> > 
> > 
> 
> 


Re: non-buffered CGIs suck

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Marc Slemko wrote:
> 
> I do really like the concept of a CGI header the script can output to tell
> the server not to buffer, but it is of limited use in this case.

Well, with CGI/1.2 in definition, please go right ahead and propose it..
Seriously.

#ken	P-)}

Ken Coar                    <http://Web.Golux.Com/coar/>
Apache Group member         <http://www.apache.org/>
"Apache Server for Dummies" <http://WWW.Dummies.Com/

Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 5 Mar 1998, Marc Slemko wrote:

> But it is small because you are adding extra overhead for no reason.  Any
> decent filesystem is going to be reading more than 4k anyway, and bumping
> it up to 16k or 32k or more would remove a certain amount of overhead
> while adding other overhead.  It isn't that easy though, since there are
> oodles of interactions that go on between things that aren't obvious.

The larger you make these buffers, the more you exacerbate the pipelining
problem that I just mentioned -- a bunch of small responses get stuck in
the buffer waiting for one long response to fill it up. 

> Look at it this way: is it worth it to use 1/8th of the system calls
> (non-mmap, of course) for reading and sending the body data in exchange
> for giving up 28k of memory?  

I'm not interested in non-mmap and performance really... the systems we
don't do mmap on aren't typically going to be used in performance critical
situations. 

PIPE_BUF is typically 4k, and I bet if you look around you'll find that
you only ever get 4k reads from pipes on many systems -- I know linux is
this way... but I haven't tested others.  It may be 8k on alphas and a few
others. 
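
A quick way to check that on a given system (a rough sketch, hypothetical
test program): push one large write through a pipe and print how many bytes
each read() actually returns.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main () {
        int p[2];
        char buf[65536];
        ssize_t n;

        if (pipe(p) < 0) {
                perror("pipe");
                exit(1);
        }
        if (fork() == 0) {
                /* child: shove 64k into the pipe in a single write() */
                close(p[0]);
                memset(buf, 'x', sizeof(buf));
                write(p[1], buf, sizeof(buf));
                _exit(0);
        }
        /* parent: report the size of each read from the pipe */
        close(p[1]);
        while ((n = read(p[0], buf, sizeof(buf))) > 0)
                printf("read returned %ld bytes\n", (long) n);
        wait(NULL);
        return 0;
}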

So that essentially leaves mod_include... (and mod_proxy, but mod_proxy's
cache should be changed to use mmap as well). 

Although I'm finding right now what looks like a linux performance bug if
you've got a system that's just a bit over its RAM -- with mmap() I'm
seeing more swap activity than I do with read().  I think it's a known
problem... linux does great until it starts swapping...

> (unrelated) What does Linux default to for a send buffer size?

65535 (tested on 2.1.86 and 2.0.33) 

Dean


Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Thu, 5 Mar 1998, Dean Gaudet wrote:

> On Thu, 5 Mar 1998, Marc Slemko wrote:
> 
> > The problem is that Apache is making this possible by disabling Nagle, so
> > we should deal with all the consequences of disabling Nagle or not do it.
> 
> i.e. we tell CGI authors "it's your responsibility".
> 
> How about someone go and get some random CGIs from the public repositories
> of them and see if this is even an issue.  I don't think it's an issue.

Mmm.

I'm thinking that some version of my patch to make CGIs nearly unbuffered
(instead of almost fully), plus disabling it for nph- scripts may work.
Yes, it adds back the concept of nph- scripts being unbuffered which isn't
how they are defined, but is common practice.

I do really like the concept of a CGI header the script can output to tell
the server not to buffer, but it is of limited use in this case.

> 
> > Naw, I just wait for you to abstract timeouts then use that.  <g>
> 
> Not unless there's a bug to be fixed.
> 
> > On a related note, I want to look into how the various buffer sizes
> > interact with each other and if there is any reason at all why it makes
> > sense to use such small buffers for reading and writing.
> 
> 4k isn't small.  Remember an ethernet segment is much smaller than that. 

But it is small because you are adding extra overhead for no reason.  Any
decent filesystem is going to be reading more than 4k anyway, and bumping
it up to 16k or 32k or more would remove a certain amount of overhead
while adding other overhead.  It isn't that easy though, since there are
oodles of interactions that go on between things that aren't obvious.

Look at it this way: is it worth it to use 1/8th of the system calls
(non-mmap, of course) for reading and sending the body data in exchange
for giving up 28k of memory?  

(unrelated) What does Linux default to for a send buffer size?

> 
> > > You notate only when you put data in an empty buffer, and you remove the
> > > notation when you flush the buffer.  The accuracy is +0 to +1s from when
> > > you want it, and you never make a syscall to do it. 
> > 
> > I am not convinced that sort of accuracy is really enough for this.  I
> > would almost always rather put an extra segment on the network then wait a
> > second.
> 
> Good luck.  Not without something more expensive than what we're doing
> now... well you could use -DSCOREBOARD_MAINTENANCE_INTERVAL=100000 and you
> could get a 200ms delivery.  But you'd be scanning the scoreboard 10 times
> per second. 

Exactly my problem.  Sigh.  Making Apache a kernel module sounds better
and better.  <g>


> 
> If we had cheap userland semaphores we could do more... ;)  we won't have
> that until someone does a pthreads port though.
> 
> Dean
> 


Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Fri, 6 Mar 1998, Dean Gaudet wrote:

> 
> 
> On Fri, 6 Mar 1998, Marc Slemko wrote:
> 
> > I don't care about the performace of their system.  For all I care they
> > could go shove it off a cliff.  I do care about my network falling over
> > from that sort of crap.  This does not just impact the server and the
> > client accessing it, and that is the whole point.  The code, as written,
> > can perform fine for many people and can do everything they need, so why
> > change it?  
> 
> I'll repeat an earlier question... show me that your CGI examples aren't
> degenerate.

Well gee, that's easy: all CGIs are degenerate by definition.  <g>

I see nothing "wrong" with my sample CGI.  That isn't the way I would
write it, but expecting users to... doesn't make sense to me.  

> 
> How about one of the Stronghold folks tell us if they've experienced any
> troubles with this CGI buffering change... which has been in stronghold
> for over a year now I believe.

I do know of one person who has had routers fall over due to load from a
high-traffic CGI on their site doing one-byte writes which ended up in a
separate packet each.  Is the CGI broken?  Yes.  Does that matter?  No,
Apache is still broken.  The only reason this is Apache's fault is that it
explicitly goes and disables Nagle; and if we do that, we need to deal with
the possible consequences.
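
For reference, the Nagle-disabling in question is just the TCP_NODELAY
socket option set on each client connection; roughly (a sketch, not the
actual Apache code):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn Nagle off on an accepted client socket.  Once this is set,
 * every small write() can go out as its own packet, which is why the
 * server then has to do its own coalescing of small writes.  Leaving
 * Nagle on simply means never making this call. */
static void disable_nagle(int client_fd)
{
    int on = 1;

    setsockopt(client_fd, IPPROTO_TCP, TCP_NODELAY, (void *) &on, sizeof(on));
}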


Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Fri, 6 Mar 1998, Marc Slemko wrote:

> I don't care about the performace of their system.  For all I care they
> could go shove it off a cliff.  I do care about my network falling over
> from that sort of crap.  This does not just impact the server and the
> client accessing it, and that is the whole point.  The code, as written,
> can perform fine for many people and can do everything they need, so why
> change it?  

I'll repeat an earlier question... show me that your CGI examples aren't
degenerate.

How about one of the Stronghold folks tell us if they've experienced any
troubles with this CGI buffering change... which has been in stronghold
for over a year now I believe.

Dean



Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Fri, 6 Mar 1998, Dean Gaudet wrote:

> > Consider how many people always set "$| = 1" in their perl scripts.  While
> > that alone won't cause problems... if they are doing anything that blocks
> > it will.
> > 
> > For example:
> > 
> > #!/usr/local/bin/perl
> > $| = 1;
> > print "Content-type: text/plain\n\n";
> > while (</etc/*>) {
> >         printf ("haha: ");
> >         system ("ls -ld $_");
> > }
> > 
> > normally results in one packet for each haha, and one for each ls output,
> > at least on the systems I tried it.  
> 
> If someone writes that perl code then they're not going to have a high
> performance system.  So... why should we worry about it?

I don't care about the performance of their system.  For all I care they
could go shove it off a cliff.  I do care about my network falling over
from that sort of crap.  This does not just impact the server and the
client accessing it, and that is the whole point.  The code, as written,
can perform fine for many people and can do everything they need, so why
change it?  


Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Fri, 6 Mar 1998, Marc Slemko wrote:

> Except that the world isn't static content.

I know that.  But you cut out the part of my message where I examined all
the other dynamic content sources that we ship with.  If you want to
speculate on mod_perl and mod_php then go for it.  I still don't see a
problem with a 4k buffer.  Dynamic content by its very nature is going to
cost more CPU to produce and so its top end is much lower than static
content... and I doubt using a 4k buffer is going to make it much worse. 
At the moment I'm completely willing to trade off a bit of speed on the
dynamic content issue in order to make sure pipelined requests work
well... and that's where you'll be trading off.

Without threads and millisecond resolution timers I don't think we can
change this stuff that much. 

> On Thu, 5 Mar 1998, Dean Gaudet wrote:
> 
> > 
> > 
> > On Thu, 5 Mar 1998, Marc Slemko wrote:
> > 
> > > The problem is that Apache is making this possible by disabling Nagle, so
> > > we should deal with all the consequences of disabling Nagle or not do it.
> > 
> > i.e. we tell CGI authors "it's your responsibility".
> > 
> > How about someone go and get some random CGIs from the public repositories
> > of them and see if this is even an issue.  I don't think it's an issue.
> 
> Consider how many people always set "$| = 1" in their perl scripts.  While
> that alone won't cause problems... if they are doing anything that blocks
> it will.
> 
> For example:
> 
> #!/usr/local/bin/perl
> $| = 1;
> print "Content-type: text/plain\n\n";
> while (</etc/*>) {
>         printf ("haha: ");
>         system ("ls -ld $_");
> }
> 
> normally results in one packet for each haha, and one for each ls output,
> at least on the systems I tried it.  

If someone writes that perl code then they're not going to have a high
performance system.  So... why should we worry about it?

Dean



Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Thu, 5 Mar 1998, Dean Gaudet wrote:

> 
> 
> On Thu, 5 Mar 1998, Marc Slemko wrote:
> > Look at it this way: is it worth it to use 1/8th of the system calls
> > (non-mmap, of course) for reading and sending the body data in exchange
> > for giving up 28k of memory?  
> 
> I'm not interested in non-mmap and performance really... the systems we
> don't do mmap on aren't typically going to be used in performance critical
> situations. 
> 

Except that the world isn't static content.

On Thu, 5 Mar 1998, Dean Gaudet wrote:

> 
> 
> On Thu, 5 Mar 1998, Marc Slemko wrote:
> 
> > The problem is that Apache is making this possible by disabling Nagle, so
> > we should deal with all the consequences of disabling Nagle or not do it.
> 
> i.e. we tell CGI authors "it's your responsibility".
> 
> How about someone go and get some random CGIs from the public repositories
> of them and see if this is even an issue.  I don't think it's an issue.

Consider how many people always set "$| = 1" in their perl scripts.  While
that alone won't cause problems... if they are doing anything that blocks
it will.

For example:

#!/usr/local/bin/perl
$| = 1;
print "Content-type: text/plain\n\n";
while (</etc/*>) {
        printf ("haha: ");
        system ("ls -ld $_");
}

normally results in one packet for each haha, and one for each ls output,
at least on the systems I tried it.  

While there are better ways to implement this in perl with a lot less
overhead all around, the script isn't doing anything that is so wrong.  It
has to enable autoflushing in perl because of the system call.  It can't
gather writes nicely because the whole idea is that each file has to be
passed to some external program. 

Or is there less overhead in all this from just enabling Nagle while
sending the CGI output, then disabling it?


Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 5 Mar 1998, Marc Slemko wrote:

> The problem is that Apache is making this possible by disabling Nagle, so
> we should deal with all the consequences of disabling Nagle or not do it.

i.e. we tell CGI authors "it's your responsibility".

How about someone go and get some random CGIs from the public repositories
of them and see if this is even an issue.  I don't think it's an issue.

> Naw, I just wait for you to abstract timeouts then use that.  <g>

Not unless there's a bug to be fixed.

> On a related note, I want to look into how the various buffer sizes
> interact with each other and if there is any reason at all why it makes
> sense to use such small buffers for reading and writing.

4k isn't small.  Remember an ethernet segment is much smaller than that. 

> > You notate only when you put data in an empty buffer, and you remove the
> > notation when you flush the buffer.  The accuracy is +0 to +1s from when
> > you want it, and you never make a syscall to do it. 
> 
> I am not convinced that sort of accuracy is really enough for this.  I
> would almost always rather put an extra segment on the network then wait a
> second.

Good luck.  Not without something more expensive than what we're doing
now... well you could use -DSCOREBOARD_MAINTENANCE_INTERVAL=100000 and you
could get a 200ms delivery.  But you'd be scanning the scoreboard 10 times
per second. 

If we had cheap userland semaphores we could do more... ;)  we won't have
that until someone does a pthreads port though.

Dean


Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Thu, 5 Mar 1998, Dean Gaudet wrote:

> 
> 
> On Thu, 5 Mar 1998, Marc Slemko wrote:
> 
> > (actually, it could be).  If the OS modified tv to indicate time left it
> > is easy, but otherwise there is no nice way to do that. 
> 
> i.e. linux.  The timevalue is modified to indicate the remaining time. 

Exactly.

> Linus tried to revert it during 2.1.x because Linux is the only unix that
> supports this and so nobody could use it.  But I showed that the C library
> depended on this functionality and he left it in. 
> 
> > Yes.  It was just there to force a context switch.
> > 
> > It is an inaccurate representation of unbuffered CGIs sending static
> > content, but I would suggest it may be very accurate for a CGI sending
> > short bits of information that each require a disk read, etc.  A well
> > designed app won't do that because of buffering on reading that input
> > data. I'm not worried about well designed apps though, since they will
> > watch their output too.
> 
> If it's not a well designed app it can do far worse than spit small
> packets on the net.  But if you feel this is a fun challenge to solve go

The problem is that Apache is making this possible by disabling Nagle, so
we should deal with all the consequences of disabling Nagle or not do it.

> for it :)
> 
> Maybe you just want to solve the "I don't want a buffer to age more than N
> seconds" problem in general.  It affects more than just mod_cgi you
> know... for example if you're in a pipelined connection a bunch of small
> short responses can be in the buffer, unsent, waiting for a long running
> request to generate enough output to flush the buffer. 
> 
> It's probably as easy as making a second timeout notation in the
> scoreboard and sending a different signal when that timeout expires.  This
> works for all OPTIMIZE_TIMEOUTS configurations... which uh... are all I
> care about -- i.e. it covers probably 95% of our installations.  (And
> probably covers more except we don't have detailed info on the systems so
> we don't use shmget or mmap... see autoconf.)

Naw, I just wait for you to abstract timeouts then use that.  <g>

On a related note, I want to look into how the various buffer sizes
interact with each other and if there is any reason at all why it makes
sense to use such small buffers for reading and writing.

> 
> You notate only when you put data in an empty buffer, and you remove the
> notation when you flush the buffer.  The accuracy is +0 to +1s from when
> you want it, and you never make a syscall to do it. 

I am not convinced that sort of accuracy is really enough for this.  I
would almost always rather put an extra segment on the network than wait a
second.

> 
> Critical section?  Easy.  It's just like SIGALRM handling.  You need have
> a nesting counter, and sometimes you have to defer the flush until the
> nesting goes to 0. 
> 
> Dean
> 
> 


Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 5 Mar 1998, Marc Slemko wrote:

> (actually, it could be).  If the OS modified tv to indicate time left it
> is easy, but otherwise there is no nice way to do that. 

i.e. linux.  The timevalue is modified to indicate the remaining time. 
Linus tried to revert it during 2.1.x because Linux is the only unix that
supports this and so nobody could use it.  But I showed that the C library
depended on this functionality and he left it in. 

> Yes.  It was just there to force a context switch.
> 
> It is an inaccurate representation of unbuffered CGIs sending static
> content, but I would suggest it may be very accurate for a CGI sending
> short bits of information that each require a disk read, etc.  A well
> designed app won't do that because of buffering on reading that input
> data. I'm not worried about well designed apps though, since they will
> watch their output too.

If it's not a well designed app it can do far worse than spit small
packets on the net.  But if you feel this is a fun challenge to solve go
for it :)

Maybe you just want to solve the "I don't want a buffer to age more than N
seconds" problem in general.  It affects more than just mod_cgi you
know... for example if you're in a pipelined connection a bunch of small
short responses can be in the buffer, unsent, waiting for a long running
request to generate enough output to flush the buffer. 

It's probably as easy as making a second timeout notation in the
scoreboard and sending a different signal when that timeout expires.  This
works for all OPTIMIZE_TIMEOUTS configurations... which uh... are all I
care about -- i.e. it covers probably 95% of our installations.  (And
probably covers more except we don't have detailed info on the systems so
we don't use shmget or mmap... see autoconf.)

You notate only when you put data in an empty buffer, and you remove the
notation when you flush the buffer.  The accuracy is +0 to +1s from when
you want it, and you never make a syscall to do it. 

Critical section?  Easy.  It's just like SIGALRM handling.  You need to have
a nesting counter, and sometimes you have to defer the flush until the
nesting goes to 0. 
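
A rough sketch of that bookkeeping (hypothetical names; the once-a-second
check stands in for whatever the scoreboard/signal mechanism would actually
be):

#include <string.h>
#include <time.h>
#include <unistd.h>

/* hypothetical per-connection output buffer */
struct conn_buf {
    int     fd;
    char    data[4096];
    int     len;
    time_t  first_byte_at;      /* 0 == buffer empty, no notation */
};

#define MAX_BUFFER_AGE 1        /* seconds; accuracy is +0 to +1s */

static void buf_flush(struct conn_buf *b)
{
    if (b->len > 0)
        write(b->fd, b->data, b->len);
    b->len = 0;
    b->first_byte_at = 0;       /* remove the notation */
}

static void buf_append(struct conn_buf *b, const char *p, int n)
{
    while (n > 0) {
        int room, chunk;

        if (b->len == 0)
            b->first_byte_at = time(NULL);      /* notate: empty -> non-empty */
        room = (int) sizeof(b->data) - b->len;
        chunk = n < room ? n : room;
        memcpy(b->data + b->len, p, chunk);
        b->len += chunk;
        p += chunk;
        n -= chunk;
        if (b->len == (int) sizeof(b->data))
            buf_flush(b);
    }
}

/* called from the roughly once-a-second maintenance pass */
static void buf_age_check(struct conn_buf *b)
{
    if (b->first_byte_at != 0 && time(NULL) - b->first_byte_at >= MAX_BUFFER_AGE)
        buf_flush(b);
}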

Dean



Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Thu, 5 Mar 1998, Dean Gaudet wrote:

> 
> 
> On Thu, 5 Mar 1998, Marc Slemko wrote:
> 
> > I'm not sure I follow.  If they don't get buffered, where is the problem?
> 
> Ah ok, I've patched in your patch and it makes more sense in the full
> context.  I thought the write was hiding somewhere behind your new select
> and so all writes would be 100ms delayed.  Sorry.  +1 on the patch. 

Note that the patch is not acceptable as is, because it doesn't allow for
unbuffered CGIs anymore.  I am not concerned about a 100ms delay, although
actually perhaps I should be.  Darn.  Yes, that sucks.  

eg.

#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

int main () {
        int i = 0;
        char s[32];
        printf("Content-type: text/plain\n\n");
        fflush(stdout);
        while (i++ < 100000) {
                sprintf(s, "%ld\n", (long) time(NULL));
                write(STDOUT_FILENO, s, strlen(s));
                usleep(50000);
        }
        return 0;
}

Will not output until it fills a buffer.

In any case, as I was saying, what I am more concerned about is that there
really should be some total time limit on how long a script can go before
it gets flushed; that doesn't fix this case unless that limit is quite low
(actually, it could be).  If the OS modified tv to indicate the time left it
would be easy, but otherwise there is no nice way to do that.

(yea, it doesn't make sense to print a clock without millisecond
resolution more than once a second, but...)

I'm not sure what to do about this.  

Given:

1. Gathering small writes is good.  We either leave Nagle on, or do it
ourselves.

2. Pseudo-unbuffered CGIs are good.

I'm not sure of the solution.  Perhaps some way for a CGI script to
disable this sort of delay?  eg. send a particular header?  The vast
majority of CGIs will be fine with my suggested change.  It is reasonable
to ask those that aren't to do something about it to avoid it.  It could
be disabled for nph- scripts (lame hack) and by defining a header to be
sent by the CGI to indicate it wants all buffering removed as far as
possible (actually a pretty good way; would be nice if it were standard). 

> 
> Note, I'm not sure about freebsd, but I do know that on Linux, a context
> switch doesn't occur on each write to a pipe.  The pipe is allowed to fill
> up.  So there's buffering in the kernel too.  I'm assuming this is why you
> stuck in the usleep(1).  I think that's actually an inaccurate portrayal
> of even unbuffered CGIs. 

Yes.  It was just there to force a context switch.

It is an inaccurate representation of unbuffered CGIs sending static
content, but I would suggest it may be very accurate for a CGI sending
short bits of information that each require a disk read, etc.  A well
designed app won't do that because of buffering on reading that input
data. I'm not worried about well designed apps though, since they will
watch their output too.

> 
> > Things are fine from the packet size perspective if you reenable Nagle,
> > but it may cause performance problems in some cases.  The original reason
> > for disabling it was due to the fact that we sent the headers in a
> > separate segment.  
> > 
> > We should only run into trouble with Nagle if we have two short segments
> > in a row.  Before, that could be the end of one response body and the
> > headers of the next response.  Now we don't flush after the headers are
> > sent, so that (common) case doesn't happen.  It could happen with just the
> > right sequence of cache validation stuff; not when we have a whole bunch
> > of requests pipelined at once, but when a new one comes in after we sent
> > the last segment of the previous response but before we have the ACK back
> > for it.  I am planning on looking to see if it is possible to enable Nagle
> > without causing problems.  Nagle is a lot smarter than Apache can be about
> > this because of the layer it is at.  I am also looking to see how many
> > systems have the sucky segment size problem; I am told that most don't,
> > and I don't even see it with all FreeBSD systems.  Not sure why yet. 
> 
> Leaving nagle on would mean 1 less syscall per connection.  You can bet
> I'd support that ;)

Hehe.  


Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 5 Mar 1998, Marc Slemko wrote:

> I'm not sure I follow.  If they don't get buffered, where is the problem?

Ah ok, I've patched in your patch and it makes more sense in the full
context.  I thought the write was hiding somewhere behind your new select
and so all writes would be 100ms delayed.  Sorry.  +1 on the patch. 

Note, I'm not sure about freebsd, but I do know that on Linux, a context
switch doesn't occur on each write to a pipe.  The pipe is allowed to fill
up.  So there's buffering in the kernel too.  I'm assuming this is why you
stuck in the usleep(1).  I think that's actually an inaccurate portrayal
of even unbuffered CGIs. 

> Things are fine from the packet size perspective if you reenable Nagle,
> but it may cause performance problems in some cases.  The original reason
> for disabling it was due to the fact that we sent the headers in a
> separate segment.  
> 
> We should only run into trouble with Nagle if we have two short segments
> in a row.  Before, that could be the end of one response body and the
> headers of the next response.  Now we don't flush after the headers are
> sent, so that (common) case doesn't happen.  It could happen with just the
> right sequence of cache validation stuff; not when we have a whole bunch
> of requests pipelined at once, but when a new one comes in after we sent
> the last segment of the previous response but before we have the ACK back
> for it.  I am planning on looking to see if it is possible to enable Nagle
> without causing problems.  Nagle is a lot smarter than Apache can be about
> this because of the layer it is at.  I am also looking to see how many
> systems have the sucky segment size problem; I am told that most don't,
> and I don't even see it with all FreeBSD systems.  Not sure why yet. 

Leaving nagle on would mean 1 less syscall per connection.  You can bet
I'd support that ;)

Dean


Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Thu, 5 Mar 1998, Dean Gaudet wrote:

> 
> 
> On Thu, 5 Mar 1998, Marc Slemko wrote:
> 
> > Why should it have any significant impact at all on them?  Heck, you have
> > less overhead when there is a delay of less than the select timeout
> > because you avoid pointless flushes.  When it does timeout and go to
> > block, you have one extra syscall overhead.
> > 
> > What other overhead is there?
> 
> 4k chunks never get buffered.  So waiting 100ms for each of them hurts
> overall throughput. 

I'm not sure I follow.  If they don't get buffered, where is the problem?
You do a 4k write.  It doesn't get buffered, so it goes out without a
flush.  You then wait for either 100ms or the next write, whichever comes
first.  If the next write comes right away, there is no difference.  This
code only comes into play if we need to block for the next read.  If you
do 2k writes, for example, then that 2k could end up being delayed an
extra 100 ms.

If you did a 4k write and it didn't get sent until the flush or more data
was written, it could add delay.  Not necessarily that much though, since
you have to remember you still have the send buffer size in the TCP stack
so in bulk data flow I can see no delays since the CGI should be able to
write at speeds >> than the network can send.

What really should be done here is to prevent sending things if there
isn't a full segment, but we have no way to do that.

> 
> > Remember prior to 1.1?  We had Nagle enabled.
> 
> Doesn't help in all cases though.  But point taken.  How do things look if
> you re-enable Nagle?

Things are fine from the packet size perspective if you reenable Nagle,
but it may cause performance problems in some cases.  The original reason
for disabling it was due to the fact that we sent the headers in a
separate segment.  

We should only run into trouble with Nagle if we have two short segments
in a row.  Before, that could be the end of one response body and the
headers of the next response.  Now we don't flush after the headers are
sent, so that (common) case doesn't happen.  It could happen with just the
right sequence of cache validation stuff; not when we have a whole bunch
of requests pipelined at once, but when a new one comes in after we sent
the last segment of the previous response but before we have the ACK back
for it.  I am planning on looking to see if it is possible to enable Nagle
without causing problems.  Nagle is a lot smarter than Apache can be about
this because of the layer it is at.  I am also looking to see how many
systems have the sucky segment size problem; I am told that most don't,
and I don't even see it with all FreeBSD systems.  Not sure why yet. 

> 
> And maybe I should check your script on Linux to see if it's another
> freebsd feature ;)  (couldn't resist ;) 
> 
> > > And I still disagree with every single CGI FAQ that says "set $| =1; in
> > > your perl scripts".  I've never understood why that is there.  I never
> > > seem to require it.  At least our FAQ explains that you should turn
> > > buffering back on. 
> > 
> > If you do anything that mixes non-buffered and buffered IO you need it or
> > something similar.  If you do:
> > 
> > print "Content-type: text/plain\n\n";
> > system("/bin/hostname");
> > 
> > you need it.
> 
> Yeah you're right, I guess I don't write these sort of lame CGIs so I
> never run into it. 
> 
> Dean
> 
> 



Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 5 Mar 1998, Marc Slemko wrote:

> Why should it have any significant impact at all on them?  Heck, you have
> less overhead when there is a delay of less than the select timeout
> because you avoid pointless flushes.  When it does timeout and go to
> block, you have one extra syscall overhead.
> 
> What other overhead is there?

4k chunks never get buffered.  So waiting 100ms for each of them hurts
overall throughput. 

> Remember prior to 1.1?  We had Nagle enabled.

Doesn't help in all cases though.  But point taken.  How do things look if
you re-enable Nagle?

And maybe I should check your script on Linux to see if it's another
freebsd feature ;)  (couldn't resist ;) 

> > And I still disagree with every single CGI FAQ that says "set $| =1; in
> > your perl scripts".  I've never understood why that is there.  I never
> > seem to require it.  At least our FAQ explains that you should turn
> > buffering back on. 
> 
> If you do anything that mixes non-buffered and buffered IO you need it or
> something similar.  If you do:
> 
> print "Content-type: text/plain\n\n";
> system("/bin/hostname");
> 
> you need it.

Yeah you're right, I guess I don't write these sort of lame CGIs so I
never run into it. 

Dean



Re: non-buffered CGIs suck

Posted by Marc Slemko <ma...@worldgate.com>.
On Thu, 5 Mar 1998, Dean Gaudet wrote:

> Huh?  This would absolutely kill CGI performance for scripts that shovel
> back loads of data in nice 4k chunks.  No way. 

Why should it have any significant impact at all on them?  Heck, you have
less overhead when there is a delay of less than the select timeout
because you avoid pointless flushes.  When it does timeout and go to
block, you have one extra syscall overhead.

What other overhead is there?

> 
> I suggest that it's your CGI that's broken... one of the comments when we
> did this change was that most folks use a language like perl, or use
> stdio, which do buffering already.

I have trouble with that assumption and am not convinced it is wise to
make.  No need to trust CGIs unless you have to.

> 
> Remember, prior to apache 1.1?, whenever we got HTTP/1.1 support, the CGI
> was connected to the client directly and so it could spurt the smallest
> packets it wanted to.  It was the abstraction that we had to put in to
> support chunked encoding that broke that.

Remember prior to 1.1?  We had Nagle enabled.

> 
> And I still disagree with every single CGI FAQ that says "set $| =1; in
> your perl scripts".  I've never understood why that is there.  I never
> seem to require it.  At least our FAQ explains that you should turn
> buffering back on. 

If you do anything that mixes non-buffered and buffered IO you need it or
something similar.  If you do:

print "Content-type: text/plain\n\n";
system("/bin/hostname");

you need it.
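
The same trap in C, as a small sketch: the header goes into stdio's buffer
(fully buffered, since stdout is a pipe), while /bin/hostname writes
straight to the shared stdout fd, so without a flush the server sees the
hostname before the header.

#include <stdio.h>
#include <stdlib.h>

int main () {
        printf("Content-type: text/plain\n\n");
        /* without this the header is still sitting in stdio's buffer
         * when the child writes its output directly to the pipe */
        fflush(stdout);
        system("/bin/hostname");
        return 0;
}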

> 
> Dean
> 
> On Thu, 5 Mar 1998, Marc Slemko wrote:
> 
> > ...from a network perspective.
> > 
> > Try running this:
> > 
> > #include <sys/types.h>
> > #include <sys/uio.h>
> > #include <unistd.h>
> > 
> > #define HEADERS "Content-type: text/plain\n\n"
> > int main () {
> >         char *s = "this is a line that is being sent\n ";
> >         int i;
> >         write(STDOUT_FILENO, HEADERS, strlen(HEADERS));
> >         for (i = 0; i < 200; i++) {
> >                 write(STDOUT_FILENO, s, strlen(s));
> >                 usleep(1);
> > 
> >         }
> > }
> > 
> > And you will see many small packets, it will take twice as long to
> > transfer as buffered CGI did, etc.  It is not very nice to the network.
> > While many CGIs will have their own buffering (eg. stdio), I'm still not
> > comfortable.
> > 
> > How about something like the below?  Note that this isn't complete; there
> > really should be a limit on the total length of time we will do this for
> > before we flush.  eg. a client writing a byte every 50 ms won't get
> > flushed in this case.  This is one of those (few) times I wish the world
> > were a Linux, since it would be easy then. 
> > 
> > For everything else, I really think that loosing this bit of "realtime" is
> > worthwhile and has minimal impact.  If we didn't disable Nagle, we
> > wouldn't have to worry about it, however currently we do disable Nagle so
> > we have to fake our own without being able to do it right.
> > 
> > Index: http_protocol.c
> > ===================================================================
> > RCS file: /export/home/cvs//apache-1.3/src/main/http_protocol.c,v
> > retrieving revision 1.194
> > diff -u -r1.194 http_protocol.c
> > --- http_protocol.c	1998/03/04 02:28:16	1.194
> > +++ http_protocol.c	1998/03/06 02:28:47
> > @@ -1658,11 +1658,27 @@
> >              len = IOBUFSIZE;
> >  
> >          do {
> > +	    struct timeval tv;
> > +
> >              n = bread(fb, buf, len);
> >              if (n >= 0 || r->connection->aborted)
> >                  break;
> >              if (n < 0 && errno != EAGAIN)
> >                  break;
> > +
> > +	    /*
> > +	     * we really don't want to be shoving lots of small data out
> > +	     * to the network, so hang around for 100ms to see if we can
> > +	     * grab anything else.
> > +	     */
> > +	    tv.tv_sec = 0;
> > +	    tv.tv_usec = 100000;
> > +	    FD_SET(fd, &fds);
> > +	    if (ap_select(fd + 1, &fds, NULL, &fds, &tv) > 0) {
> > +		/* something more to read, lets give it a shot */
> > +		continue;
> > +	    }
> > +
> >              /* we need to block, so flush the output first */
> >              bflush(r->connection->client);
> >              if (r->connection->aborted)
> > 
> > 
> 


Re: non-buffered CGIs suck

Posted by Dean Gaudet <dg...@arctic.org>.
Huh?  This would absolutely kill CGI performance for scripts that shovel
back loads of data in nice 4k chunks.  No way. 

I suggest that it's your CGI that's broken... one of the comments when we
did this change was that most folks use a language like perl, or use
stdio, which do buffering already.

Remember, prior to apache 1.1?, whenever we got HTTP/1.1 support, the CGI
was connected to the client directly and so it could spurt the smallest
packets it wanted to.  It was the abstraction that we had to put in to
support chunked encoding that broke that.

And I still disagree with every single CGI FAQ that says "set $| =1; in
your perl scripts".  I've never understood why that is there.  I never
seem to require it.  At least our FAQ explains that you should turn
buffering back on. 

Dean

On Thu, 5 Mar 1998, Marc Slemko wrote:

> ...from a network perspective.
> 
> Try running this:
> 
> #include <sys/types.h>
> #include <sys/uio.h>
> #include <unistd.h>
> 
> #define HEADERS "Content-type: text/plain\n\n"
> int main () {
>         char *s = "this is a line that is being sent\n ";
>         int i;
>         write(STDOUT_FILENO, HEADERS, strlen(HEADERS));
>         for (i = 0; i < 200; i++) {
>                 write(STDOUT_FILENO, s, strlen(s));
>                 usleep(1);
> 
>         }
> }
> 
> And you will see many small packets, it will take twice as long to
> transfer as buffered CGI did, etc.  It is not very nice to the network.
> While many CGIs will have their own buffering (eg. stdio), I'm still not
> comfortable.
> 
> How about something like the below?  Note that this isn't complete; there
> really should be a limit on the total length of time we will do this for
> before we flush.  eg. a client writing a byte every 50 ms won't get
> flushed in this case.  This is one of those (few) times I wish the world
> were a Linux, since it would be easy then. 
> 
> For everything else, I really think that loosing this bit of "realtime" is
> worthwhile and has minimal impact.  If we didn't disable Nagle, we
> wouldn't have to worry about it, however currently we do disable Nagle so
> we have to fake our own without being able to do it right.
> 
> Index: http_protocol.c
> ===================================================================
> RCS file: /export/home/cvs//apache-1.3/src/main/http_protocol.c,v
> retrieving revision 1.194
> diff -u -r1.194 http_protocol.c
> --- http_protocol.c	1998/03/04 02:28:16	1.194
> +++ http_protocol.c	1998/03/06 02:28:47
> @@ -1658,11 +1658,27 @@
>              len = IOBUFSIZE;
>  
>          do {
> +	    struct timeval tv;
> +
>              n = bread(fb, buf, len);
>              if (n >= 0 || r->connection->aborted)
>                  break;
>              if (n < 0 && errno != EAGAIN)
>                  break;
> +
> +	    /*
> +	     * we really don't want to be shoving lots of small data out
> +	     * to the network, so hang around for 100ms to see if we can
> +	     * grab anything else.
> +	     */
> +	    tv.tv_sec = 0;
> +	    tv.tv_usec = 100000;
> +	    FD_SET(fd, &fds);
> +	    if (ap_select(fd + 1, &fds, NULL, &fds, &tv) > 0) {
> +		/* something more to read, lets give it a shot */
> +		continue;
> +	    }
> +
>              /* we need to block, so flush the output first */
>              bflush(r->connection->client);
>              if (r->connection->aborted)
> 
>