You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Johannes Erdfelt <jo...@erdfelt.com> on 2002/10/18 03:53:33 UTC

[patch] perchild MPM bug fixes (+ open problem)

Ryan, I've CC'd you on this just to let you see the patch. If you don't
want me to involve you in this, please accept my apologies and let me
know and I won't CC you in any further patches.

I spent some time debugging the perchild MPM since it wasn't working for
me, nor anyone else it seems. I've found a few problems:

Problem 1:
In worker_thread, there is a variable called csd that is used to get
the new socket from lr->accept_func(). If that variable is NULL, then
the memory for the new socket is allocated in the per-transaction pool.
Unfortunately, the code neglected to reset the variable to NULL after
servicing a request. The result is that the first request for each
thread worked fine, but subsequent request may have the memory
overwritten and resulting in an invalid FD.

Solution:
Set it to NULL before calling lr->accept_func(). Implemented in the patch
below.

Problem 2:
pass_request fills out an iovec with the headers and the body of the
request it wants to pass to another process. It unfortunately uses the
wrong variable for the length causing it to always be 0.

Solution:
Use the correct variable name "l". len is never set to anything so I removed
that and used 0 in the one reference to it. Implemented in the patch below.

Problem 3:
receive_from_other_child assumes that the iovec is the same on read as
it is on write. This isn't true and readmsg() follows readv() semantics.
iovec is a scatter/gather list and as a result, the 2 send buffers are
merged into one received buffer with the second always being untouched.
It also trusted the lengths in iov.iov_len which will be the size of the
original buffer, not the size of the data actually received.

Solution:
Merge the 2 buffer's into 1 and find the null terminators for the 2 strings.
Implemented in the patch below.

Problem 4:
Occasionally, when a request needs to be passed to another process, it will
hang until another connection is accepted. perchild uses the same process
accept mutex that the other MPM's seem to need. This results in
serialization of all accept() calls as is the intent of the mutex. The
problem is that another process might have acquired the mutex from the
process that actually needs the mutex to hit accept() and see the passed
request. This isn't normally a problem because all processes accept
connections on all listening ports. The problem in this case is the fact
not all processes accept connections on the per process unix domain socket
to pass connections around, for obvious reasons.

Solution:
I don't have one :) I'm not sure what the best way of solving this is. I
don't fully understand the semantics of the process accept mutex yet. Any
suggestions?

Anyway, this patch has been tested lightly and makes things work MUCH better
for the perchild MPM now, atleast for me :)

If this patch is acceptable by the Apache development team, can it be
applied?

JE

Index: perchild.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/mpm/experimental/perchild/perchild.c,v
retrieving revision 1.135
diff -u -r1.135 perchild.c
--- perchild.c	11 Oct 2002 15:41:52 -0000	1.135
+++ perchild.c	18 Oct 2002 01:31:22 -0000
@@ -679,9 +679,9 @@
 {
     struct msghdr msg;
     struct cmsghdr *cmsg;
-    char headers[HUGE_STRING_LEN];
-    char request_body[HUGE_STRING_LEN];
-    struct iovec iov[2];
+    char buffer[HUGE_STRING_LEN * 2], *headers, *body;
+    int headerslen, bodylen;
+    struct iovec iov;
     int ret, dp;
     apr_os_sock_t sd;
     apr_bucket_alloc_t *alloc = apr_bucket_alloc_create(ptrans);
@@ -690,15 +690,13 @@
 
     apr_os_sock_get(&sd, lr->sd);
 
-    iov[0].iov_base = headers;
-    iov[0].iov_len = HUGE_STRING_LEN;
-    iov[1].iov_base = request_body;
-    iov[1].iov_len = HUGE_STRING_LEN;
+    iov.iov_base = buffer;
+    iov.iov_len = sizeof(buffer);
 
     msg.msg_name = NULL;
     msg.msg_namelen = 0;
-    msg.msg_iov = iov;
-    msg.msg_iovlen = 2;
+    msg.msg_iov = &iov;
+    msg.msg_iovlen = 1;
 
     cmsg = apr_palloc(ptrans, sizeof(*cmsg) + sizeof(sd));
     cmsg->cmsg_len = sizeof(*cmsg) + sizeof(sd);
@@ -715,14 +713,25 @@
     APR_BRIGADE_INSERT_HEAD(bb, bucket);
     bucket = apr_bucket_socket_create(*csd, alloc);
     APR_BRIGADE_INSERT_HEAD(bb, bucket);
-    bucket = apr_bucket_heap_create(iov[1].iov_base, 
-                                    iov[1].iov_len, NULL, alloc);
+
+    body = strchr(iov.iov_base, 0);
+    if (!body) {
+        return 1;
+    }
+
+    body++;
+    bodylen = strlen(body);
+
+    headers = iov.iov_base;
+    headerslen = body - headers;
+
+    bucket = apr_bucket_heap_create(body, bodylen, NULL, alloc);
     APR_BRIGADE_INSERT_HEAD(bb, bucket);
-    bucket = apr_bucket_heap_create(iov[0].iov_base, 
-                                    iov[0].iov_len, NULL, alloc);
+    bucket = apr_bucket_heap_create(headers, headerslen, NULL, alloc);
     APR_BRIGADE_INSERT_HEAD(bb, bucket);
 
     apr_pool_userdata_set(bb, "PERCHILD_SOCKETS", NULL, ptrans);
+
     return 0;
 }
 
@@ -730,7 +739,7 @@
 
 static void *worker_thread(apr_thread_t *thd, void *arg)
 {
-    void *csd = NULL;
+    void *csd;
     apr_pool_t *tpool;      /* Pool for this thread           */
     apr_pool_t *ptrans;     /* Pool for per-transaction stuff */
     volatile int thread_just_started = 1;
@@ -832,6 +841,7 @@
         }
     got_fd:
         if (!workers_may_exit) {
+            csd = NULL;
             rv = lr->accept_func(&csd, lr, ptrans);
             if (rv == APR_EGENERAL) {
                 /* E[NM]FILE, ENOMEM, etc */
@@ -1635,7 +1645,6 @@
     apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
     apr_bucket_brigade *sockbb;
     char request_body[HUGE_STRING_LEN] = "\0";
-    apr_off_t len = 0;
     apr_size_t l = 0;
     perchild_header h;
     apr_bucket *sockbuck;
@@ -1647,7 +1656,7 @@
                  "passing request to another child.  Vhost: %s, child %d %d",
                  apr_table_get(r->headers_in, "Host"), child_num, sconf->output);
     ap_get_brigade(r->connection->input_filters, bb, AP_MODE_EXHAUSTIVE, APR_NONBLOCK_READ,
-                   len);
+                   0);
 
     for (sockbuck = APR_BRIGADE_FIRST(bb); sockbuck != APR_BRIGADE_SENTINEL(bb);
          sockbuck = APR_BUCKET_NEXT(sockbuck)) {
@@ -1678,7 +1687,7 @@
     iov[0].iov_base = h.headers;
     iov[0].iov_len = strlen(h.headers) + 1;
     iov[1].iov_base = request_body;
-    iov[1].iov_len = len + 1;
+    iov[1].iov_len = l + 1;
 
     msg.msg_name = NULL;
     msg.msg_namelen = 0;

Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Jeff Trawick <tr...@attglobal.net>.
Johannes Erdfelt <jo...@erdfelt.com> writes:

> I spent some time debugging the perchild MPM since it wasn't working for
> me, nor anyone else it seems. I've found a few problems:

just FYI so you don't feel ignored...

if nobody beats me to it, I plan to take a more detailed look at this
with the intent to commit...  nag me directly in a few days if I
haven't found the time to play with it now...

no need to rework this now, but patches that are smaller and solve
exactly one problem at a time get committed sooner...

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Johannes Erdfelt <jo...@erdfelt.com>.
On Thu, Oct 24, 2002, Jeff Trawick <tr...@attglobal.net> wrote:
> Johannes Erdfelt <jo...@erdfelt.com> writes:
> > Problem 2:
> > pass_request fills out an iovec with the headers and the body of the
> > request it wants to pass to another process. It unfortunately uses the
> > wrong variable for the length causing it to always be 0.
> > 
> > Solution:
> > Use the correct variable name "l". len is never set to anything so I removed
> > that and used 0 in the one reference to it. Implemented in the patch below.
> 
> applied, but I made another fix (?) too... see my note below, and let
> me know if it is bad :)
> 
> > Index: perchild.c
> > @@ -1635,7 +1645,6 @@
> >      apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
> >      apr_bucket_brigade *sockbb;
> >      char request_body[HUGE_STRING_LEN] = "\0";
> > -    apr_off_t len = 0;
> 
> looks to me like len should be initialized to sizeof(request_body)
> since on input to apr_brigade_flatten it should have the maximum
> length of the char array

I'll have to defer to you here. I'm not that familiar with the brigade
implementation to know what the appropriate fix for this is.

However, it most certainly is better than the current behaviour so I'm
not against it. If it's wrong, I'll find out pretty soon :)

Thanks for applying these patches!

JE


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Jeff Trawick <tr...@attglobal.net>.
Johannes Erdfelt <jo...@erdfelt.com> writes:

> Problem 1:
> In worker_thread, there is a variable called csd that is used to get
> the new socket from lr->accept_func(). If that variable is NULL, then
> the memory for the new socket is allocated in the per-transaction pool.
> Unfortunately, the code neglected to reset the variable to NULL after
> servicing a request. The result is that the first request for each
> thread worked fine, but subsequent request may have the memory
> overwritten and resulting in an invalid FD.
> 
> Solution:
> Set it to NULL before calling lr->accept_func(). Implemented in the patch
> below.

I cleared that storage just before the call to apr_os_sock_put() so
that the reason is more obvious.

> Problem 2:
> pass_request fills out an iovec with the headers and the body of the
> request it wants to pass to another process. It unfortunately uses the
> wrong variable for the length causing it to always be 0.
> 
> Solution:
> Use the correct variable name "l". len is never set to anything so I removed
> that and used 0 in the one reference to it. Implemented in the patch below.

applied, but I made another fix (?) too... see my note below, and let
me know if it is bad :)

> Problem 3:
> receive_from_other_child assumes that the iovec is the same on read as
> it is on write. This isn't true and readmsg() follows readv() semantics.
> iovec is a scatter/gather list and as a result, the 2 send buffers are
> merged into one received buffer with the second always being untouched.
> It also trusted the lengths in iov.iov_len which will be the size of the
> original buffer, not the size of the data actually received.
> 
> Solution:
> Merge the 2 buffer's into 1 and find the null terminators for the 2 strings.
> Implemented in the patch below.

fix applied as-is

> Index: perchild.c
> @@ -1635,7 +1645,6 @@
>      apr_bucket_brigade *bb = apr_brigade_create(r->pool, c->bucket_alloc);
>      apr_bucket_brigade *sockbb;
>      char request_body[HUGE_STRING_LEN] = "\0";
> -    apr_off_t len = 0;

looks to me like len should be initialized to sizeof(request_body)
since on input to apr_brigade_flatten it should have the maximum
length of the char array

Thanks!

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by rb...@apache.org.
On Mon, 21 Oct 2002, Johannes Erdfelt wrote:

> On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> > > > > What I was thinking was to add an artificial limitation that you can't
> > > > > share an IP:port pair across two different uid/gid's since that's the
> > > > > only case you want to pass a connection.
> > > > 
> > > > That limitation should already exist, although it may not.
> > > 
> > > That limitation doesn't exist either in the documentation or in the
> > > implementation.
> > > 
> > > I figured that was part of the point of connection passing. I know it's
> > > also useful for passing between the siblings for a particular uid/gid
> > > also, but why not just require 1 child per uid/gid and then not worry
> > > about connection passing at all?
> > 
> > A couple of reasons.  Remember first of all that the more threads per
> > process, the less stable your server.  This is because if a module
> > seg-faults, the whole process dies.  For that reason, many admins will
> > want to have multiple child processes with the same uid/gid to create some
> > failover.
> > 
> > The second reason, however, is that the file descriptor passing is meant
> > to solve the case of multiple Name-based Vhosts on the same ip:port all
> > with different uid/gid requirements.  In that case, you will have multiple
> > child processes all listening to the same ip:port, but you will need to do
> > the file descriptor passing.
> 
> Well, that's what I said before, but you implied that the limitations
> should prevent this. :)

yeah, you're right, I wasn't thinking when I wrote that.  Sorry.

> > Closing the file descriptors of ip:port combinations that can't be served
> > is a great optimization, but that is all it is, an optimization for the
> > non-Name-based vhost case.
> 
> Yes.
> 
> This brings it back to what I was saying before, you can't pass SSL
> connections, so as a result, you can't share an SSL port across two
> uid/gid's. So this should be documented somewhere, or enforced in the
> server (which may be difficult because of the layering), so people don't
> do that :)

You don't need that doc.  In order to share an ip:port combination across
child processes, you need to be using name-based vhosts.  SSL doesn't
support name-based vhosts today, so there is no new docs required.  Of
course, if my patch to provide Upgrade support goes in, then you will be
able to share ip:port combinations that support SSL across child
processes.  So, the docs really aren't necessary.

> Anyway, I think I understand the path to follow now. I'm going to
> implement the necessary fixes to make perchild finally usable.

Great!

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Johannes Erdfelt <jo...@erdfelt.com>.
On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> > > > What I was thinking was to add an artificial limitation that you can't
> > > > share an IP:port pair across two different uid/gid's since that's the
> > > > only case you want to pass a connection.
> > > 
> > > That limitation should already exist, although it may not.
> > 
> > That limitation doesn't exist either in the documentation or in the
> > implementation.
> > 
> > I figured that was part of the point of connection passing. I know it's
> > also useful for passing between the siblings for a particular uid/gid
> > also, but why not just require 1 child per uid/gid and then not worry
> > about connection passing at all?
> 
> A couple of reasons.  Remember first of all that the more threads per
> process, the less stable your server.  This is because if a module
> seg-faults, the whole process dies.  For that reason, many admins will
> want to have multiple child processes with the same uid/gid to create some
> failover.
> 
> The second reason, however, is that the file descriptor passing is meant
> to solve the case of multiple Name-based Vhosts on the same ip:port all
> with different uid/gid requirements.  In that case, you will have multiple
> child processes all listening to the same ip:port, but you will need to do
> the file descriptor passing.

Well, that's what I said before, but you implied that the limitations
should prevent this. :)

> Closing the file descriptors of ip:port combinations that can't be served
> is a great optimization, but that is all it is, an optimization for the
> non-Name-based vhost case.

Yes.

This brings it back to what I was saying before, you can't pass SSL
connections, so as a result, you can't share an SSL port across two
uid/gid's. So this should be documented somewhere, or enforced in the
server (which may be difficult because of the layering), so people don't
do that :)

Anyway, I think I understand the path to follow now. I'm going to
implement the necessary fixes to make perchild finally usable.

JE


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by rb...@apache.org.
> > Consider the case where an admin configures the server to listen on
> > www.foo.com:8080, but he never assigns a child process to listen to that
> > port.  If you just don't accept the connections, the user will hang
> > forever.  If every child process, however, actively closes the sockets
> > that it can't serve, then the client won't even be able to connect.
> 
> That's a good point too.
> 
> > > What I was thinking was to add an artificial limitation that you can't
> > > share an IP:port pair across two different uid/gid's since that's the
> > > only case you want to pass a connection.
> > 
> > That limitation should already exist, although it may not.
> 
> That limitation doesn't exist either in the documentation or in the
> implementation.
> 
> I figured that was part of the point of connection passing. I know it's
> also useful for passing between the siblings for a particular uid/gid
> also, but why not just require 1 child per uid/gid and then not worry
> about connection passing at all?

A couple of reasons.  Remember first of all that the more threads per
process, the less stable your server.  This is because if a module
seg-faults, the whole process dies.  For that reason, many admins will
want to have multiple child processes with the same uid/gid to create some
failover.

The second reason, however, is that the file descriptor passing is meant
to solve the case of multiple Name-based Vhosts on the same ip:port all
with different uid/gid requirements.  In that case, you will have multiple
child processes all listening to the same ip:port, but you will need to do
the file descriptor passing.

Closing the file descriptors of ip:port combinations that can't be served
is a great optimization, but that is all it is, an optimization for the
non-Name-based vhost case.

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Johannes Erdfelt <jo...@erdfelt.com>.
On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> On Mon, 21 Oct 2002, Johannes Erdfelt wrote:
> > Perhaps I misunderstood. The patch I had developed (which is broken
> > because of the problems with the accept lock) just didn't listen on the
> > socket if it has no chance of answering the query itself.
> > 
> > That's what I thought you meant, but reading what you said again I may
> > have misunderstood.
> 
> Nope, that's exactly what I meant.  (see below)
> 
> > Did you mean closing the client socket or the listening socket? Wouldn't
> > closing the client socket just cause the client to treat it as an error?
> 
> Just close the listening socket in the child_init phase of the
> module.  You want to do this before you start accepting requests.  You
> could also do this by just removing the socket from the accept array.  The
> reason I like having the child close the socket, is that it eliminates
> a potential bad config.
> 
> Consider the case where an admin configures the server to listen on
> www.foo.com:8080, but he never assigns a child process to listen to that
> port.  If you just don't accept the connections, the user will hang
> forever.  If every child process, however, actively closes the sockets
> that it can't serve, then the client won't even be able to connect.

That's a good point too.

> > What I was thinking was to add an artificial limitation that you can't
> > share an IP:port pair across two different uid/gid's since that's the
> > only case you want to pass a connection.
> 
> That limitation should already exist, although it may not.

That limitation doesn't exist either in the documentation or in the
implementation.

I figured that was part of the point of connection passing. I know it's
also useful for passing between the siblings for a particular uid/gid
also, but why not just require 1 child per uid/gid and then not worry
about connection passing at all?

JE


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by rb...@apache.org.
On Mon, 21 Oct 2002, Johannes Erdfelt wrote:

> On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> > 
> > > > As long as you are doing all this work, there is one more thought that I
> > > > have been meaning to implement, but that I never got around to.  Currently
> > > > perchild doesn't work with SSL, because of when the request is passed off,
> > > > and how SSL works.  The easy solution to this, is to have the child
> > > > processes close the sockets for any requests that they cannot
> > > > handle.  This will also improve the chance that a request won't be passed
> > > > if you have vhosts with different ports.  Consider the following:
> > > > 
> > > > <VHost www.foo.com:80>
> > > >     AssignChildPerUidGid  rbb rbb
> > > > </Vhost>
> > > > 
> > > > <Vhost www.foo.com:81>
> > > >     AssignChildPerUidGid  foo foo
> > > > </VHost>
> > > > 
> > > > There is no reason for the foo/foo child process to be listening on port
> > > > 80.
> > > > 
> > > > Just a thought for how to get SSL to work.
> > > 
> > > I actually have a patch for this already :) Although I implemented it
> > > only as an optimization and not because of the issue with SSL. I hadn't
> > > tried to do SSL yet.
> > > 
> > > I'd imagine this SSL limitation will have to be clearly documented since
> > > it may not be obvious to everyone.
> > 
> > Actually, as long as you have a patch to do this, then SSL should just
> > work, and no docs should be necessary.   :-)
> 
> Perhaps I misunderstood. The patch I had developed (which is broken
> because of the problems with the accept lock) just didn't listen on the
> socket if it has no chance of answering the query itself.
> 
> That's what I thought you meant, but reading what you said again I may
> have misunderstood.

Nope, that's exactly what I meant.  (see below)

> Did you mean closing the client socket or the listening socket? Wouldn't
> closing the client socket just cause the client to treat it as an error?

Just close the listening socket in the child_init phase of the
module.  You want to do this before you start accepting requests.  You
could also do this by just removing the socket from the accept array.  The
reason I like having the child close the socket, is that it eliminates
a potential bad config.

Consider the case where an admin configures the server to listen on
www.foo.com:8080, but he never assigns a child process to listen to that
port.  If you just don't accept the connections, the user will hang
forever.  If every child process, however, actively closes the sockets
that it can't serve, then the client won't even be able to connect.

> What I was thinking was to add an artificial limitation that you can't
> share an IP:port pair across two different uid/gid's since that's the
> only case you want to pass a connection.

That limitation should already exist, although it may not.

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Johannes Erdfelt <jo...@erdfelt.com>.
On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> 
> > > As long as you are doing all this work, there is one more thought that I
> > > have been meaning to implement, but that I never got around to.  Currently
> > > perchild doesn't work with SSL, because of when the request is passed off,
> > > and how SSL works.  The easy solution to this, is to have the child
> > > processes close the sockets for any requests that they cannot
> > > handle.  This will also improve the chance that a request won't be passed
> > > if you have vhosts with different ports.  Consider the following:
> > > 
> > > <VHost www.foo.com:80>
> > >     AssignChildPerUidGid  rbb rbb
> > > </Vhost>
> > > 
> > > <Vhost www.foo.com:81>
> > >     AssignChildPerUidGid  foo foo
> > > </VHost>
> > > 
> > > There is no reason for the foo/foo child process to be listening on port
> > > 80.
> > > 
> > > Just a thought for how to get SSL to work.
> > 
> > I actually have a patch for this already :) Although I implemented it
> > only as an optimization and not because of the issue with SSL. I hadn't
> > tried to do SSL yet.
> > 
> > I'd imagine this SSL limitation will have to be clearly documented since
> > it may not be obvious to everyone.
> 
> Actually, as long as you have a patch to do this, then SSL should just
> work, and no docs should be necessary.   :-)

Perhaps I misunderstood. The patch I had developed (which is broken
because of the problems with the accept lock) just didn't listen on the
socket if it has no chance of answering the query itself.

That's what I thought you meant, but reading what you said again I may
have misunderstood.

Did you mean closing the client socket or the listening socket? Wouldn't
closing the client socket just cause the client to treat it as an error?

What I was thinking was to add an artificial limitation that you can't
share an IP:port pair across two different uid/gid's since that's the
only case you want to pass a connection.

JE


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by rb...@apache.org.
> > As long as you are doing all this work, there is one more thought that I
> > have been meaning to implement, but that I never got around to.  Currently
> > perchild doesn't work with SSL, because of when the request is passed off,
> > and how SSL works.  The easy solution to this, is to have the child
> > processes close the sockets for any requests that they cannot
> > handle.  This will also improve the chance that a request won't be passed
> > if you have vhosts with different ports.  Consider the following:
> > 
> > <VHost www.foo.com:80>
> >     AssignChildPerUidGid  rbb rbb
> > </Vhost>
> > 
> > <Vhost www.foo.com:81>
> >     AssignChildPerUidGid  foo foo
> > </VHost>
> > 
> > There is no reason for the foo/foo child process to be listening on port
> > 80.
> > 
> > Just a thought for how to get SSL to work.
> 
> I actually have a patch for this already :) Although I implemented it
> only as an optimization and not because of the issue with SSL. I hadn't
> tried to do SSL yet.
> 
> I'd imagine this SSL limitation will have to be clearly documented since
> it may not be obvious to everyone.

Actually, as long as you have a patch to do this, then SSL should just
work, and no docs should be necessary.   :-)

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Johannes Erdfelt <jo...@erdfelt.com>.
On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> On Mon, 21 Oct 2002, Johannes Erdfelt wrote:
> > If there is only one thread listening on the unix domain socket (because
> > each of those is specific to a child) is a lock even needed?
> > 
> > The current lock around the accept() logic is mostly to prevent the
> > thundering hurd syndrome, right?
> 
> Yes, it is needed.  The thing is, that you can have multiple child
> processes running as a single user/group combination.  All of those child
> processes will use the same unix domain socket for their
> communication.  So, the thundering herd problem still exists.  If you
> really wanted to optimize this, you could do only lock the file if it was
> configured for multiple child processes, but that is not going to be
> trivial, because that information doesn't really exist at the point that
> you are doing the lock.

Ahh, good point, I forgot about that.

> As long as you are doing all this work, there is one more thought that I
> have been meaning to implement, but that I never got around to.  Currently
> perchild doesn't work with SSL, because of when the request is passed off,
> and how SSL works.  The easy solution to this, is to have the child
> processes close the sockets for any requests that they cannot
> handle.  This will also improve the chance that a request won't be passed
> if you have vhosts with different ports.  Consider the following:
> 
> <VHost www.foo.com:80>
>     AssignChildPerUidGid  rbb rbb
> </Vhost>
> 
> <Vhost www.foo.com:81>
>     AssignChildPerUidGid  foo foo
> </VHost>
> 
> There is no reason for the foo/foo child process to be listening on port
> 80.
> 
> Just a thought for how to get SSL to work.

I actually have a patch for this already :) Although I implemented it
only as an optimization and not because of the issue with SSL. I hadn't
tried to do SSL yet.

I'd imagine this SSL limitation will have to be clearly documented since
it may not be obvious to everyone.

JE


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by rb...@apache.org.
On Mon, 21 Oct 2002, Johannes Erdfelt wrote:

> > > Problem 4:
> > > Occasionally, when a request needs to be passed to another process, it will
> > > hang until another connection is accepted. perchild uses the same process
> > > accept mutex that the other MPM's seem to need. This results in
> > > serialization of all accept() calls as is the intent of the mutex. The
> > > problem is that another process might have acquired the mutex from the
> > > process that actually needs the mutex to hit accept() and see the passed
> > > request. This isn't normally a problem because all processes accept
> > > connections on all listening ports. The problem in this case is the fact
> > > not all processes accept connections on the per process unix domain socket
> > > to pass connections around, for obvious reasons.
> > > 
> > > Solution:
> > > I don't have one :) I'm not sure what the best way of solving this is. I
> > > don't fully understand the semantics of the process accept mutex yet. Any
> > > suggestions?
> > 
> > There is only one solution that I can see for this, unfortunately, it is a
> > rather big change.  :-(  The first step, is to migrate the perchild MPM to
> > the same model as the worker MPM, where one thread accepts requests and
> > puts them in a queue, and the rest of the threads remove the requests from
> > the queue, and actually serve them.  I've been meaning to do this for a
> > long time, but I never got around to it.
> 
> Ahh, that's a good idea. I can implement this as well.
> 
> > Once that is done, the solution to this problem is easy.  Every child
> > process needs to have two acceptor threads.  The first accepts on the TCP
> > socket, and the second accepts on the Unix Domain Socket.  For the TCP
> > socket, the thread should just use the same accept_mutex that it is using
> > now.  However for the Unix Domain Socket, each thread that is accepting on
> > the socket should just use apr_file_lock to lock the socket before
> > accepting.  This should remove the starvation that you are seeing, because
> > each socket will have it's own lock.
> 
> If there is only one thread listening on the unix domain socket (because
> each of those is specific to a child) is a lock even needed?
> 
> The current lock around the accept() logic is mostly to prevent the
> thundering hurd syndrome, right?

Yes, it is needed.  The thing is, that you can have multiple child
processes running as a single user/group combination.  All of those child
processes will use the same unix domain socket for their
communication.  So, the thundering herd problem still exists.  If you
really wanted to optimize this, you could do only lock the file if it was
configured for multiple child processes, but that is not going to be
trivial, because that information doesn't really exist at the point that
you are doing the lock.

As long as you are doing all this work, there is one more thought that I
have been meaning to implement, but that I never got around to.  Currently
perchild doesn't work with SSL, because of when the request is passed off,
and how SSL works.  The easy solution to this, is to have the child
processes close the sockets for any requests that they cannot
handle.  This will also improve the chance that a request won't be passed
if you have vhosts with different ports.  Consider the following:

<VHost www.foo.com:80>
    AssignChildPerUidGid  rbb rbb
</Vhost>

<Vhost www.foo.com:81>
    AssignChildPerUidGid  foo foo
</VHost>

There is no reason for the foo/foo child process to be listening on port
80.

Just a thought for how to get SSL to work.

Ryan
_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
550 Jean St
Oakland CA 94610
-------------------------------------------------------------------------------


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by Johannes Erdfelt <jo...@erdfelt.com>.
On Mon, Oct 21, 2002, rbb@apache.org <rb...@apache.org> wrote:
> On Thu, 17 Oct 2002, Johannes Erdfelt wrote:
> 
> > Ryan, I've CC'd you on this just to let you see the patch. If you don't
> > want me to involve you in this, please accept my apologies and let me
> > know and I won't CC you in any further patches.
> 
> I have no problem being CC'ed on patches, although for the most part, I am
> unlikely to respond.  In this case however, I will.  I want to say that
> this is great.  I have reviewed the patch and the logic, and it all looks
> fantastic.  I'm really glad that the MPM is making progress.  I will also
> provide a possible solution to the one problem that you were unable to
> solve.  :-)

Thanks for taking the time to look at the patch. I'll make sure to CC
you on any further patches but focus on the rest of the development team
to merge it :)

> > Problem 4:
> > Occasionally, when a request needs to be passed to another process, it will
> > hang until another connection is accepted. perchild uses the same process
> > accept mutex that the other MPM's seem to need. This results in
> > serialization of all accept() calls as is the intent of the mutex. The
> > problem is that another process might have acquired the mutex from the
> > process that actually needs the mutex to hit accept() and see the passed
> > request. This isn't normally a problem because all processes accept
> > connections on all listening ports. The problem in this case is the fact
> > not all processes accept connections on the per process unix domain socket
> > to pass connections around, for obvious reasons.
> > 
> > Solution:
> > I don't have one :) I'm not sure what the best way of solving this is. I
> > don't fully understand the semantics of the process accept mutex yet. Any
> > suggestions?
> 
> There is only one solution that I can see for this, unfortunately, it is a
> rather big change.  :-(  The first step, is to migrate the perchild MPM to
> the same model as the worker MPM, where one thread accepts requests and
> puts them in a queue, and the rest of the threads remove the requests from
> the queue, and actually serve them.  I've been meaning to do this for a
> long time, but I never got around to it.

Ahh, that's a good idea. I can implement this as well.

> Once that is done, the solution to this problem is easy.  Every child
> process needs to have two acceptor threads.  The first accepts on the TCP
> socket, and the second accepts on the Unix Domain Socket.  For the TCP
> socket, the thread should just use the same accept_mutex that it is using
> now.  However for the Unix Domain Socket, each thread that is accepting on
> the socket should just use apr_file_lock to lock the socket before
> accepting.  This should remove the starvation that you are seeing, because
> each socket will have it's own lock.

If there is only one thread listening on the unix domain socket (because
each of those is specific to a child) is a lock even needed?

The current lock around the accept() logic is mostly to prevent the
thundering hurd syndrome, right?

Other than that, this makes perfect sense too, I can implement this as
well.

Thanks for the reply!

JE


Re: [patch] perchild MPM bug fixes (+ open problem)

Posted by rb...@apache.org.
On Thu, 17 Oct 2002, Johannes Erdfelt wrote:

> Ryan, I've CC'd you on this just to let you see the patch. If you don't
> want me to involve you in this, please accept my apologies and let me
> know and I won't CC you in any further patches.

I have no problem being CC'ed on patches, although for the most part, I am
unlikely to respond.  In this case however, I will.  I want to say that
this is great.  I have reviewed the patch and the logic, and it all looks
fantastic.  I'm really glad that the MPM is making progress.  I will also
provide a possible solution to the one problem that you were unable to
solve.  :-)

> Problem 4:
> Occasionally, when a request needs to be passed to another process, it will
> hang until another connection is accepted. perchild uses the same process
> accept mutex that the other MPM's seem to need. This results in
> serialization of all accept() calls as is the intent of the mutex. The
> problem is that another process might have acquired the mutex from the
> process that actually needs the mutex to hit accept() and see the passed
> request. This isn't normally a problem because all processes accept
> connections on all listening ports. The problem in this case is the fact
> not all processes accept connections on the per process unix domain socket
> to pass connections around, for obvious reasons.
> 
> Solution:
> I don't have one :) I'm not sure what the best way of solving this is. I
> don't fully understand the semantics of the process accept mutex yet. Any
> suggestions?

There is only one solution that I can see for this, unfortunately, it is a
rather big change.  :-(  The first step, is to migrate the perchild MPM to
the same model as the worker MPM, where one thread accepts requests and
puts them in a queue, and the rest of the threads remove the requests from
the queue, and actually serve them.  I've been meaning to do this for a
long time, but I never got around to it.

Once that is done, the solution to this problem is easy.  Every child
process needs to have two acceptor threads.  The first accepts on the TCP
socket, and the second accepts on the Unix Domain Socket.  For the TCP
socket, the thread should just use the same accept_mutex that it is using
now.  However for the Unix Domain Socket, each thread that is accepting on
the socket should just use apr_file_lock to lock the socket before
accepting.  This should remove the starvation that you are seeing, because
each socket will have it's own lock.

This is really great work, I hope that it gets applied ASAP.

Ryan