Posted to dev@httpd.apache.org by Dean Gaudet <dg...@arctic.org> on 1999/05/17 18:30:37 UTC

[PATCH] select-thread-hybrid-01.patch

http://www.arctic.org/~dgaudet/apache/2.0/select-thread-hybrid-01.patch

OK I've managed to serve a few requests with this, so it's time to post ;) 

Building: 

Get and build bind-8.x (I'm using bind-8.2).  You want the wonderful
eventlib which the ISC folks have written for us, and released under a
BSD-style license.  This is a convenient select/poll wrapper which
provides timer and fd events.  Paul's comments say he had inputs from lots
of folks -- including the X folks... so I'm guessing Jim Gettys' comments
were taken into account... and so I'm just trusting the code rather than
looking at it.

(Redhat users:  I tried using the installed bind-devel kit, but it doesn't
work, something is bogus about their library.)

Get apache-apr... apply the patch. 

Use some variant of the config.status below to configure the server. 

Notes: 

- I tore up http_main.c.  Just completely gutted it.  I got tired of
stubbing out things.  We can add things back incrementally. 

- no signals, no restart, no multiple processes... 

- no scoreboard -- major rework needed here.  We need to divorce ourselves
from the concept of a one-to-one mapping between threads/processes and
requests.  We'll have connections which are being handled by the event
loop.  I think the more appropriate thing is to split the scoreboard into
a "worker" section, and a "connection" section.  But even still -- we can
potentially have thousands of connections in progress... that's a lot of
shared memory to chew up (shared mem is not pageable on many systems).  A
better solution is required -- such as building the scoreboard only when
serving /server-status. 

- It's similar to one of manoj/ryan's older servers with the fdqueue at
the moment, but using eventlib instead... and it should show how I think
we should pass events back and forth between workers and event thread.
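
To make the last note concrete, here is a minimal sketch of a queue that
workers and an event thread could use to hand messages to each other.  It is
not taken from the patch: msg_t, msgq_t, MSGQ_CAP and the msgq_* functions are
hypothetical names, and the patch's real fdqueue may look quite different.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MSGQ_CAP 64            /* hypothetical fixed capacity */

typedef struct {
    int type;                  /* hypothetical message type code */
    int fd;                    /* descriptor the message refers to */
} msg_t;

typedef struct {
    msg_t slots[MSGQ_CAP];     /* ring buffer storage */
    size_t head, count;        /* index of oldest entry, entries in use */
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
} msgq_t;

void msgq_init(msgq_t *q)
{
    assert(q != NULL);
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Append a message; returns 0 on success, -1 if the queue is full. */
int msgq_push(msgq_t *q, msg_t m)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    if (q->count < MSGQ_CAP) {
        q->slots[(q->head + q->count) % MSGQ_CAP] = m;
        q->count++;
        pthread_cond_signal(&q->nonempty);
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Remove the oldest message, blocking until one is available. */
msg_t msgq_pop(msgq_t *q)
{
    msg_t m;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    m = q->slots[q->head];
    q->head = (q->head + 1) % MSGQ_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return m;
}
```

A condition variable only wakes threads that are waiting on it, so a thread
blocked in select() would need some other wake-up, such as a pipe it also
selects on.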

Dean

#!/bin/sh
##
##  config.status -- APACI auto-generated configuration restore script
##
##  Use this shell script to re-run the APACI configure script for
##  restoring your configuration. Additional parameters can be supplied.
##

CFLAGS="-g -O2 -Wall" \
LIBS="/home/dgaudet/ap/bind/src/lib/libbind_r.a" \
INCLUDES="-I/home/dgaudet/ap/bind/src/include" \
./configure \
"--with-layout=Apache" \
"--prefix=/home/dgaudet/ev" \
"--disable-module=status" \
"$@"



Re: [PATCH] select-thread-hybrid-01.patch

Posted by Dean Gaudet <dg...@arctic.org>.

On Thu, 20 May 1999, Tony Finch wrote:

> Dean Gaudet <dg...@arctic.org> wrote:
> >
> >FWIW, here's the list of responseq messages I expect:
> >
> >RESPONSEQ_SEND_MMAP,		/* copy from memory to BUFF */
> >RESPONSEQ_SEND_FILE,		/* copy from file to BUFF */
> >RESPONSEQ_SEND_PIPE,		/* copy from pipe to BUFF */
> >RESPONSEQ_LINGERING_CLOSE,	/* handle lingering close */
> >RESPONSEQ_WAIT_FOR_READ,	/* handle persistent connections */
> 
> If you have an async request handling thread would it help to use it
> for DNS lookups too? or does that clash too much with the existing
> programming model?

I hadn't really thought about DNS lookups... 

> Hmm, how do you handle logging when you hand off the response handling
> to a separate thread? do you log right after the hand-off?

The plan is to hand it back to a worker thread for logging -- essentially
the primitives which the event thread supports are almost non-protocol
specific.  Anything to do with the protocol decision happens in the worker
threads. 

Dean



Re: [PATCH] select-thread-hybrid-01.patch

Posted by Tony Finch <do...@dotat.at>.
Dean Gaudet <dg...@arctic.org> wrote:
>
>FWIW, here's the list of responseq messages I expect:
>
>RESPONSEQ_SEND_MMAP,		/* copy from memory to BUFF */
>RESPONSEQ_SEND_FILE,		/* copy from file to BUFF */
>RESPONSEQ_SEND_PIPE,		/* copy from pipe to BUFF */
>RESPONSEQ_LINGERING_CLOSE,	/* handle lingering close */
>RESPONSEQ_WAIT_FOR_READ,	/* handle persistent connections */

If you have an async request handling thread would it help to use it
for DNS lookups too? or does that clash too much with the existing
programming model?

Hmm, how do you handle logging when you hand off the response handling
to a separate thread? do you log right after the hand-off?

Tony.
-- 
f.a.n.finch   dot@dotat.at   fanf@demon.net   black dog

Re: [PATCH] select-thread-hybrid-01.patch

Posted by Manoj Kasichainula <ma...@io.com>.
On Tue, May 18, 1999 at 09:52:38AM -0700, Dean Gaudet wrote:
> I'm attempting to make use of the response queue right now.  Except I'm
> running into the total mess that buff.c is

Random mumblings...

There are a lot of changes that would be good to make to buff.c, like
sendfile/TransmitFile API support, which is a basic structural change.
It's probably easier to add this support at the same time as any other
rewrite of buff.

As a first cut, the event thread could just use the standard I/O calls
instead of going through buff. This would make the event thread usable
only for static content, but we'd be able to delay biting off
the buff rewrite until later. Is there any reason we need to use buff for
static content?

> I'll probably end up ripping out the non-unix stuff from buff.c -- yet
> another thing which we can fix later (given my current main only supports
> unix this isn't a problem). 

Everything needed by buff.c should be abstracted out by APR, so
this is a good thing.

> FWIW, here's the list of responseq messages I expect:
> 
> RESPONSEQ_SEND_MMAP,		/* copy from memory to BUFF */
> RESPONSEQ_SEND_FILE,		/* copy from file to BUFF */
> RESPONSEQ_SEND_PIPE,		/* copy from pipe to BUFF */
> RESPONSEQ_LINGERING_CLOSE,	/* handle lingering close */
> RESPONSEQ_WAIT_FOR_READ,	/* handle persistent connections */
> 
> There will be corresponding WORKQ messages to indicate those are complete.
> And then we'll probably realise that we should have only one message
> type and combine the two.

To solve the blocking disk problem, we could add a WORKQ message to
read/mmap from a file not cached and pass back the results, making the
server look more like Flash. Of course, if we did this, there's no
reason that the worker thread couldn't also write the block out if the
block size < SO_SNDBUF.
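
The size test mentioned above can be sketched with getsockopt().  The function
name fits_in_sndbuf is hypothetical, and comparing against SO_SNDBUF alone is
a simplification: it ignores any data already queued in the socket buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: returns 1 if len is smaller than the socket's
 * SO_SNDBUF, 0 if it is not, and -1 if the option cannot be read.
 */
int fits_in_sndbuf(int sd, size_t len)
{
    int sndbuf = 0;
    socklen_t optlen = sizeof sndbuf;

    if (getsockopt(sd, SOL_SOCKET, SO_SNDBUF, &sndbuf, &optlen) != 0)
        return -1;                     /* errno set by getsockopt() */
    assert(sndbuf >= 0);
    return len < (size_t)sndbuf ? 1 : 0;
}
```

A worker could call this before deciding whether to write a block itself or
queue it for the event thread.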

> The difficulty that I'm running into is that I need to make all fds
> non-blocking, and the BUFF using the fd needs to know if the caller
> expects blocking or non-blocking behaviour... and that totally interferes
> with the {send,recv}withtimeout stuff... which is what led me into the
> mess of buff.c :)

I need to dip into buff.c more than I have, but a little change to
{send,recv}withtimeout should give it both blocking and non-blocking
behavior, right? Just let sec == -1 mean no timeout instead of sec ==
0.  Then, {send,recv}withtimeout can be made non-blocking on a whim.

In fact, if buff can assume that the socket descriptor is always
nonblocking, this should actually make the code simpler.

-- 
Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
"How can one live in this age and not be curious?" -- Charles Krauthammer

Re: [PATCH] select-thread-hybrid-01.patch

Posted by Dean Gaudet <dg...@arctic.org>.
On Tue, 18 May 1999, Ryan Bloom wrote:

> 
> > I don't think the abstraction is worth keeping -- you had only one
> > "message type" essentially.
> 
> The abstraction is necessary.  There are some operating systems that don't
> want to use a complex messaging system like the one you are implementing.  I
> am thinking of one platform that has already expressed an interest
> in using another method for accepting connections.  Plus, we most
> definitely don't want the server on a non-threaded system to use the same
> accept model as a threaded platform.  Everything I have heard from the
> group is that the hybrid server must be able to run in both thread-only
> and process-only mode.  The accept model abstraction makes this possible.

The abstraction is still at the wrong level... it certainly shouldn't
be in http_main.c.  Yes I know about those other systems.  It should be
easy to handle them by not sending the messages.  I was hoping to have
an example for you by now.

Just think of those other systems as systems where SO_SNDBUF is
infinite... remember I said that I wasn't going to punt back to the event
loop when the response fits in the kernel buffers?  That means that the
response handler will run in a "worker thread".  The code should be pretty
similar.

For example, sending a memory mapped region (mmap or otherwise), here's
the interface I'm currently playing with:

    typedef struct {
	conn_rec *conn;
	void *mm;
	size_t length;
	void *uap;
	void (*complete)(void *uap, int bytes_sent);
    } ap_async_send_mmap_data;

    API_EXPORT(void) ap_async_send_mmap(ap_async_send_mmap_data *);

The handler allocates the structure, sets up the parameters, and
calls ap_async_send_mmap.  Then it returns OK.

The complete function is called in the future when the operation has
completed.  At this point, another ap_async_send_mmap() could be performed
(for example, handling range requests).

Folks familiar with async programming should recognize this... it's
a pretty standard method.  It doesn't require anything special other
than a few conventions.
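
As an illustration of the convention, here is a sketch of a handler that sends
several byte ranges by chaining completions.  Only the
ap_async_send_mmap_data layout comes from the message above.  Everything else
is assumed: conn_rec is reduced to a bare descriptor, ap_async_send_mmap is a
synchronous stand-in that writes and then calls complete straight away, and
byte_range, range_state and send_ranges are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

typedef struct { int fd; } conn_rec;       /* hypothetical stand-in */

typedef struct {                           /* layout as posted above */
    conn_rec *conn;
    void *mm;
    size_t length;
    void *uap;
    void (*complete)(void *uap, int bytes_sent);
} ap_async_send_mmap_data;

/* Stand-in only: a real version would queue the work for the event thread. */
void ap_async_send_mmap(ap_async_send_mmap_data *d)
{
    ssize_t n;

    assert(d != NULL && d->complete != NULL);
    n = write(d->conn->fd, d->mm, d->length);
    d->complete(d->uap, (int)n);
}

typedef struct { size_t off, len; } byte_range;   /* hypothetical */

typedef struct {
    ap_async_send_mmap_data op;   /* reused for every range */
    char *base;                   /* start of the mapped region */
    const byte_range *ranges;
    int nranges, next, total;
} range_state;

/* Completion callback: account for the last send, then start the next range. */
static void range_complete(void *uap, int bytes_sent)
{
    range_state *rs = uap;

    if (bytes_sent < 0)
        return;                   /* give up on error */
    rs->total += bytes_sent;
    if (rs->next < rs->nranges) {
        rs->op.mm = rs->base + rs->ranges[rs->next].off;
        rs->op.length = rs->ranges[rs->next].len;
        rs->next++;
        ap_async_send_mmap(&rs->op);
    }
}

/* Keeping state on the stack is safe only because the stand-in is synchronous. */
int send_ranges(conn_rec *c, char *base, const byte_range *ranges, int nranges)
{
    range_state rs;

    rs.op.conn = c;
    rs.op.uap = &rs;
    rs.op.complete = range_complete;
    rs.base = base;
    rs.ranges = ranges;
    rs.nranges = nranges;
    rs.next = 0;
    rs.total = 0;
    range_complete(&rs, 0);       /* kick off the first range */
    return rs.total;
}
```

In a truly asynchronous server the state would have to outlive the handler,
for example by being allocated from a pool, since the handler returns before
complete is called.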

> I really hope that when you say fds, you mean sds.  If you require all fds
> to be non-blocking, including those actually referring to files on disk,
> we have a problem.  There are too many platforms that do not support
> non-blocking fds for files on disk.

No I don't require all fds to be non-blocking.  Not even unix supports
non-blocking disk file fds.  That's why I gave separate cases for sending
a file versus sending a pipe (i.e. handling a cgi or other external
process through a pipe/socket). 

Dean


Re: [PATCH] select-thread-hybrid-01.patch

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.
> I don't think the abstraction is worth keeping -- you had only one
> "message type" essentially.

The abstraction is necessary.  There are some operating systems that don't
want to use a complex messaging system like the one you are implementing.  I
am thinking of one platform that has already expressed an interest
in using another method for accepting connections.  Plus, we most
definitely don't want the server on a non-threaded system to use the same
accept model as a threaded platform.  Everything I have heard from the
group is that the hybrid server must be able to run in both thread-only
and process-only mode.  The accept model abstraction makes this possible.

> 
> I'm attempting to make use of the response queue right now.  Except I'm
> running into the total mess that buff.c is ... damn this was a LOT more
> clean in apache-nspr.  You guys just wedged in timeouts in a haphazard,
> incomplete way.  Now I'm debating on looking at APR or just hacking
> things up more...  my aesthetic sense has me puking at the moment though,
> so I'm derailed.

Those timeouts were put in because we needed them to be able to test.
When APR is included in the server, the nspr code can be moved over.  The
ap_(read|write) code already has timeouts, so the buff functions can be
cleaned.  I would suggest just hacking away until we get apr implemented
in the server.  :) 

> 
> I'll probably end up ripping out the non-unix stuff from buff.c -- yet
> another thing which we can fix later (given my current main only supports
> unix this isn't a problem). 

That is the plan.

> 
> FWIW, here's the list of responseq messages I expect:
> 
> RESPONSEQ_SEND_MMAP,		/* copy from memory to BUFF */
> RESPONSEQ_SEND_FILE,		/* copy from file to BUFF */
> RESPONSEQ_SEND_PIPE,		/* copy from pipe to BUFF */
> RESPONSEQ_LINGERING_CLOSE,	/* handle lingering close */
> RESPONSEQ_WAIT_FOR_READ,	/* handle persistent connections */
> 
> There will be corresponding WORKQ messages to indicate those are complete.
> And then we'll probably realise that we should have only one message
> type and combine the two.
> 
> The difficulty that I'm running into is that I need to make all fds
> non-blocking, and the BUFF using the fd needs to know if the caller
> expects blocking or non-blocking behaviour... and that totally interferes
> with the {send,recv}withtimeout stuff... which is what led me into the
> mess of buff.c :)

I really hope that when you say fds, you mean sds.  If you require all fds
to be non-blocking, including those actually referring to files on disk,
we have a problem.  There are too many platforms that do not support
non-blocking fds for files on disk.

Ryan

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	


Re: [PATCH] select-thread-hybrid-01.patch

Posted by Dean Gaudet <dg...@arctic.org>.

On Mon, 17 May 1999, Manoj Kasichainula wrote:

> With the changes so far, I think this code can fit back into the
> accept abstraction stuff that's in the apache-apr repository right
> now.  Making use of the response queue will require changing the
> abstraction, though.

I don't think the abstraction is worth keeping -- you had only one
"message type" essentially.

I'm attempting to make use of the response queue right now.  Except I'm
running into the total mess that buff.c is ... damn this was a LOT more
clean in apache-nspr.  You guys just wedged in timeouts in a haphazard,
incomplete way.  Now I'm debating on looking at APR or just hacking
things up more...  my aesthetic sense has me puking at the moment though,
so I'm derailed.

I'll probably end up ripping out the non-unix stuff from buff.c -- yet
another thing which we can fix later (given my current main only supports
unix this isn't a problem). 

FWIW, here's the list of responseq messages I expect:

RESPONSEQ_SEND_MMAP,		/* copy from memory to BUFF */
RESPONSEQ_SEND_FILE,		/* copy from file to BUFF */
RESPONSEQ_SEND_PIPE,		/* copy from pipe to BUFF */
RESPONSEQ_LINGERING_CLOSE,	/* handle lingering close */
RESPONSEQ_WAIT_FOR_READ,	/* handle persistent connections */

There will be corresponding WORKQ messages to indicate those are complete.
And then we'll probably realise that we should have only one message
type and combine the two.
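
One way the combined message could look, as a sketch only.  The enum values
and their comments are the ones listed above; the hybrid_msg record, its
fields and the two helper functions are hypothetical.

```c
#include <assert.h>

typedef enum {
    RESPONSEQ_SEND_MMAP,        /* copy from memory to BUFF */
    RESPONSEQ_SEND_FILE,        /* copy from file to BUFF */
    RESPONSEQ_SEND_PIPE,        /* copy from pipe to BUFF */
    RESPONSEQ_LINGERING_CLOSE,  /* handle lingering close */
    RESPONSEQ_WAIT_FOR_READ     /* handle persistent connections */
} responseq_type;

/*
 * Hypothetical single message type: the same record travels from a worker
 * to the event thread (complete == 0) and back again (complete == 1).
 */
typedef struct {
    responseq_type type;
    int fd;                     /* hypothetical: descriptor of the connection */
    int complete;               /* 0 = request, 1 = completion notice */
    long bytes_sent;            /* hypothetical result field */
} hybrid_msg;

/* Build a request for the event thread. */
hybrid_msg hybrid_msg_request(responseq_type type, int fd)
{
    hybrid_msg m;

    m.type = type;
    m.fd = fd;
    m.complete = 0;
    m.bytes_sent = 0;
    return m;
}

/* Turn a request into the matching completion for a worker. */
hybrid_msg hybrid_msg_complete(hybrid_msg m, long bytes_sent)
{
    assert(!m.complete);        /* a message is completed only once */
    m.complete = 1;
    m.bytes_sent = bytes_sent;
    return m;
}
```

With one record type, the response queue and the work queue can share the
same queue code and differ only in who reads from them.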

The difficulty that I'm running into is that I need to make all fds
non-blocking, and the BUFF using the fd needs to know if the caller
expects blocking or non-blocking behaviour... and that totally interferes
with the {send,recv}withtimeout stuff... which is what led me into the
mess of buff.c :)

Dean


Re: [PATCH] select-thread-hybrid-01.patch

Posted by Manoj Kasichainula <ma...@io.com>.
On Mon, May 17, 1999 at 09:30:37AM -0700, Dean Gaudet wrote:
> (Redhat users:  I tried using the installed bind-devel kit, but it doesn't
> work, something is bogus about their library.)

They stopped stripping libbind.a in Red Hat 6.0, so it has a working
bind-devel. The code is standing up well to ab -c 100 -n 10000 run a
few times.

> - I tore up http_main.c.  Just completely gutted it.  I got tired of
> stubbing out things.  We can add things back incrementally. 

With the changes so far, I think this code can fit back into the
accept abstraction stuff that's in the apache-apr repository right
now.  Making use of the response queue will require changing the
abstraction, though.

On Mon, May 17, 1999 at 10:57:52AM -0700, Dean Gaudet wrote:
> Damnit I hate GNU patch versions after 2.1.  I can't even apply the patch
> I made.  Stupid stupid stupid.  I'm not even going to attempt to fix it. 

How about diff -r'ing the pristine repository with the changed one?
I've had much better luck with patches created only by diff. Also, cvs
1.10.6 is out, which may fix some of the malformed patch errors from
the CVS extract.

-- 
Manoj Kasichainula - manojk at io dot com - http://www.io.com/~manojk/
"Firewalls don't know the difference between a virus, a Trojan Horse, or
Windows NT." -- An advertising brochure from DataLynx, Inc. ("SECURITY THROUGH
STRENGTH")

Re: [PATCH] select-thread-hybrid-01.patch

Posted by Jim Gettys <jg...@pa.dec.com>.
> Sender: new-httpd-owner@apache.org
> From: Dean Gaudet <dg...@arctic.org>
> Date: Mon, 17 May 1999 09:30:37 -0700 (PDT)
> To: new-httpd@apache.org
> Subject: [PATCH] select-thread-hybrid-01.patch
> -----
> http://www.arctic.org/~dgaudet/apache/2.0/select-thread-hybrid-01.patch
> 
> OK I've managed to serve a few requests with this, so it's time to post ;)
> 
> Building:
> 
> Get and build bind-8.x (I'm using bind-8.2).  You want the wonderful
> eventlib which the ISC folks have written for us, and released under a
> BSD-style license.  This is a convenient select/poll wrapper which
> provides timer and fd events.  Paul's comments say he had inputs from lots
> of folks -- including the X folks... so I'm guessing Jim Gettys' comments
> were taken into account... and so I'm just trusting the code rather than
> looking at it.
> 

Dunno the genealogy of this code (and haven't looked at it).  The X server
internally has always been select driven, and has also taken advantage 
of select timeouts (for example, this is how it knows to do screensaver 
operations).  AF, an audio server I built with some other folks added 
a simple tasking package (each task, or thread, promised not to block
or run too long).

I know that John Ousterhout built such a thing for Tk as well, reputedly
quite clean (John writes good code!), and not X specific.

The basic loop of an X server is roughly as follows:

loop:
   call select (fds of clients + fd for accept, mask showing work to do, timeout)
	if (new connection) accept it
	if (old connection went away) close it
	if (timeout) do what is pending (screensaver, or other things like
		synchronization).
	foreach (fd of open connections with data available) {
		read as much data as is available non-blocking
		foreach (request in data read) {
			if (full request present)
				call request routine
		}
		flush output buffer of fd
	}
go to loop
		
		
This is not completely fair for a scheduler: in particular, a single
client can execute quite a few requests before something else gets run;
but it has been fair enough for X's use (you don't want to keep changing
graphics contexts between clients; this can be expensive on a lot of hardware).
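
For readers who want something compilable, one pass of a loop like the one
above might look roughly like this in C.  This is a sketch, not X server
code: select_loop_once, request_fn, count_bytes and bytes_seen are
hypothetical names, each chunk read is handed to the callback without any
request parsing, and MSG_DONTWAIT is assumed to be available (it is on Linux
and the BSDs).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

typedef void (*request_fn)(int fd, const char *data, size_t len);

/* Example request routine: just counts the bytes it is given. */
static size_t bytes_seen;
static void count_bytes(int fd, const char *data, size_t len)
{
    (void)fd;
    (void)data;
    bytes_seen += len;
}

/*
 * One pass of the loop.  listen_fd may be -1 if there is nothing to accept
 * on.  Returns select()'s result: 0 on timeout (the caller then does its
 * pending work), -1 on error, otherwise the number of ready descriptors.
 */
int select_loop_once(int listen_fd, fd_set *clients, int *maxfd,
                     long timeout_sec, request_fn handle)
{
    fd_set readable;
    struct timeval tv;
    char buf[4096];
    int fd, nready, top;

    assert(clients != NULL && maxfd != NULL && handle != NULL);
    readable = *clients;
    top = *maxfd;
    if (listen_fd >= 0) {
        FD_SET(listen_fd, &readable);
        if (listen_fd > top)
            top = listen_fd;
    }
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    nready = select(top + 1, &readable, NULL, NULL, &tv);
    if (nready <= 0)
        return nready;

    if (listen_fd >= 0 && FD_ISSET(listen_fd, &readable)) {
        int c = accept(listen_fd, NULL, NULL);      /* new connection */
        if (c >= 0 && c < FD_SETSIZE) {
            FD_SET(c, clients);
            if (c > *maxfd)
                *maxfd = c;
        } else if (c >= 0) {
            close(c);                               /* too many descriptors */
        }
    }
    for (fd = 0; fd <= *maxfd; fd++) {
        ssize_t n;

        if (fd == listen_fd || !FD_ISSET(fd, &readable))
            continue;
        n = recv(fd, buf, sizeof buf, MSG_DONTWAIT);
        if (n > 0) {
            handle(fd, buf, (size_t)n);             /* call request routine */
        } else if (n == 0) {
            FD_CLR(fd, clients);                    /* old connection went away */
            close(fd);
        }
        /* n < 0 (e.g. EAGAIN) is ignored in this sketch */
    }
    return nready;
}
```

A server would call select_loop_once() repeatedly and run its timer work
whenever it returns 0.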

On average, X executes on the order of 100 instructions/request of overhead in total,
including TCP overhead of moving the data.  The bottom line is that X 
is most efficient when most loaded: as one client gets the server busy, 
other clients will have more chance to send more requests, and so the 
system call/request keeps dropping as the server is loaded, to WAY under 
1 system call/request on average (since the frame buffer is generally 
mapped to allow futzing with the device registers directly).  

My pentium 400 at home, for example, gets around 2,000,000 no-ops/second; 
I get something like 3,300,000 single pixel dots/second (poly request).  
The protocol, however, is very much more compact and simpler than HTTP...
I don't mean to make a direct comparison, but shall we just say that
HTTP looks nothing like what I believe a simple, clean wire protocol should
look like.  In X's case, however, we were lucky enough to stamp out the
first generation of the protocol, a luxury that will be difficult with
the Web.
					- Jim


--
Jim Gettys
Industry Standards and Consortia
Compaq Computer Corporation
Visiting Scientist, World Wide Web Consortium, M.I.T.
http://www.w3.org/People/Gettys/
jg@w3.org, jg@pa.dec.com


Re: [PATCH] select-thread-hybrid-01.patch

Posted by Dean Gaudet <dg...@arctic.org>.
Yeah, I tried that too -- it didn't help... it screwed up on all the new
files.

Dean

On Tue, 18 May 1999, Rodent of Unusual Size wrote:

> Dean Gaudet wrote:
> > 
> > Damnit I hate GNU patch versions after 2.1.  I can't even apply the
> > patch I made.
> 
> I've had problems with it too, and recently discovered that most
> of them went away with a combination of POSIXLY_CORRECT and "-p 0"
> on the patch command line.  Go figure.
> -- 
> #ken    P-)}
> 
> Ken Coar                    <http://Web.Golux.Com/coar/>
> Apache Software Foundation  <http://www.apache.org/>
> "Apache Server for Dummies" <http://Web.Golux.Com/coar/ASFD/>
> 


Re: [PATCH] select-thread-hybrid-01.patch

Posted by Rodent of Unusual Size <Ke...@Golux.Com>.
Dean Gaudet wrote:
> 
> Damnit I hate GNU patch versions after 2.1.  I can't even apply the
> patch I made.

I've had problems with it too, and recently discovered that most
of them went away with a combination of POSIXLY_CORRECT and "-p 0"
on the patch command line.  Go figure.
-- 
#ken    P-)}

Ken Coar                    <http://Web.Golux.Com/coar/>
Apache Software Foundation  <http://www.apache.org/>
"Apache Server for Dummies" <http://Web.Golux.Com/coar/ASFD/>

Re: [PATCH] select-thread-hybrid-01.patch

Posted by Dean Gaudet <dg...@arctic.org>.
Damnit I hate GNU patch versions after 2.1.  I can't even apply the patch
I made.  Stupid stupid stupid.  I'm not even going to attempt to fix it. 

Instead there's a
http://www.arctic.org/~dgaudet/apache/2.0/select-thread-hybrid-02.tar.gz
which includes the entire source tree. 

And it also spawns worker threads dynamically... which was pretty easy to
add with an eventlib timer event. 

Dean