Posted to dev@httpd.apache.org by "Marc G. Fournier" <sc...@hub.org> on 2002/05/21 15:09:41 UTC

Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Can't describe it much better than that ... I've got it compiled
WITH_THREADS and MPM=worker ... starts fine, get a line in
/var/log/httpd-error.log:

[Tue May 21 10:01:47 2002] [notice] Digest: generating secret for digest authentication ...
[Tue May 21 10:01:52 2002] [notice] Digest: done
[Tue May 21 10:01:53 2002] [notice] Apache/2.0.36 (Unix) DAV/2 PHP/4.2.1 configured -- resuming normal operations

But when I try to connect to the site in question, the connection just
hangs there ... as soon as I do a 'stop' of the server, though, the page
is sent and then httpd terminates ...

I'm running with settings of:

<IfModule worker.c>
StartServers         1
MaxClients         128
MinSpareThreads      5
MaxSpareThreads     15
ThreadsPerChild     64
MaxRequestsPerChild  0
</IfModule>

And am getting no errors that I can find ...

ps shows:

earth# ps aux | grep http
root    78796  0.0  0.1  9764 6012  ??  SsJ  10:06AM   0:00.11 /usr/local/sbin/httpd
nobody  78799  0.0  0.1  9268 4944  ??  SJ   10:06AM   0:00.01 /usr/local/sbin/httpd
nobody  78825  0.0  0.2 15044 6640  ??  IJ   10:06AM   0:00.03 /usr/local/sbin/httpd

Which I find kinda odd since I'm not getting that many hits, and told it
to only start with one :(

Thoughts as to what I should look at here?  I've even enabled the
'apache_runtime_status' file, and that appears to make no difference ...




RE: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Marc G. Fournier [mailto:scrappy@hub.org]
> 
> 
> Okay, but this is the scenario that I do want (one worker, many threads)
> ... so I set up my httpd.conf as:
> 
> <IfModule worker.c>
> StartServers         1
> MaxClients         150
> MinSpareThreads     25
> MaxSpareThreads     75
> ThreadsPerChild     25
> MaxRequestsPerChild  0
> </IfModule>
> 
> But as soon as I start up, it starts 3 servers (I would expect 2 ... one
> root, one nobody):
> 
> atelier# ps aux | grep http
> root   59418  0.0  0.1  1416 1052  ph  RV+   1:41PM   0:00.00 grep http (csh)
> root   59405  0.0  0.4  5316 3708  ??  Ss    1:39PM   0:00.03 /usr/local/sbin/httpd
> www    59406  0.0  0.3  5096 3480  ??  I     1:39PM   0:00.00 /usr/local/sbin/httpd
> www    59409  0.0  0.4  7420 4000  ??  I     1:39PM   0:00.00 /usr/local/sbin/httpd
> 
> With the second one being a good 50% larger than the other two ...
> 
> So, am I mis-understanding below, *or*, mis-understanding the conf file
> ...

Did you configure cgi?  If so, one of those two processes owned by "www"
is the CGId process.

Ryan



Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by "Marc G. Fournier" <sc...@hub.org>.

Okay, but this is the scenario that I do want (one worker, many threads)
... so I set up my httpd.conf as:

<IfModule worker.c>
StartServers         1
MaxClients         150
MinSpareThreads     25
MaxSpareThreads     75
ThreadsPerChild     25
MaxRequestsPerChild  0
</IfModule>

But as soon as I start up, it starts 3 servers (I would expect 2 ... one
root, one nobody):

atelier# ps aux | grep http
root   59418  0.0  0.1  1416 1052  ph  RV+   1:41PM   0:00.00 grep http (csh)
root   59405  0.0  0.4  5316 3708  ??  Ss    1:39PM   0:00.03 /usr/local/sbin/httpd
www    59406  0.0  0.3  5096 3480  ??  I     1:39PM   0:00.00 /usr/local/sbin/httpd
www    59409  0.0  0.4  7420 4000  ??  I     1:39PM   0:00.00 /usr/local/sbin/httpd

With the second one being a good 50% larger than the other two ...

So, am I mis-understanding below, *or*, mis-understanding the conf file
...
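
As a rough cross-check of what these directives ask for, here is a small
Python sketch of the arithmetic. It assumes the documented worker MPM rules
that MaxClients divided by ThreadsPerChild bounds the number of child
processes, and that MaxSpareThreads is raised to at least MinSpareThreads +
ThreadsPerChild. The function name and dictionary keys are made up for
illustration and are not part of httpd.

```python
def worker_plan(start_servers, max_clients, min_spare, max_spare, threads_per_child):
    """Summarise what a worker MPM config asks for (illustrative helper, not an httpd API)."""
    # Upper bound on child processes: MaxClients / ThreadsPerChild.
    max_children = max_clients // threads_per_child
    # Threads available right after startup.
    initial_threads = start_servers * threads_per_child
    # Assumed rule: MaxSpareThreads is corrected upward to
    # MinSpareThreads + ThreadsPerChild when it is set lower than that.
    effective_max_spare = max(max_spare, min_spare + threads_per_child)
    return {
        "max_children": max_children,
        "initial_threads": initial_threads,
        "effective_max_spare": effective_max_spare,
        # True when the idle threads at startup already satisfy MinSpareThreads.
        "one_child_enough_when_idle": initial_threads >= min_spare,
    }


# Values from the <IfModule worker.c> block above.
plan = worker_plan(1, 150, 25, 75, 25)
print(plan)
```

With these values a single child already supplies the 25 spare threads, so
on an idle server the spare-thread settings alone should not call for a
second worker child.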


On Tue, 21 May 2002, Aaron Bannert wrote:

> On Tue, May 21, 2002 at 01:14:00PM -0300, Marc G. Fournier wrote:
> > Just to confirm, would it be the following that I'm looking at:
> ...
>
> Nope, further down:
>
>     * FreeBSD, threads, and worker MPM.  All seems to work fine
>       if you only have one worker process with many threads.  Add
>       a second worker process and the accept lock seems to be
>       lost.  This might be an APR issue with how it deals with
>       the child_init hook (i.e. the fcntl lock needs to be resynced).
>       More examination and analysis is required.
>         Status: This has also been reported on Cygwin.
>         Message-ID: <3C...@wapme-systems.de> (cygnus)
>
>       Justin says: So, FreeBSD-CURRENT and Cygwin have the same
>                    problem.  Yum.  If another platform has this
>                    with worker, this becomes a showstopper.
>       Aaron says: I spent some time dissecting this and have come to
>               the conclusion that it is not a problem in the worker MPM
>               (or at least, it is not isolated to a problem in worker).
>               I'll list some of the problems I'm seeing in case someone
>               else wants to pick up where I've left off:
>                - Delivery of just about any signal to one of the child
>                  processes will send it into an infinite loop as well.
>                - Even though the parent is spinning out of control,
>                  at first the child or children will appear to work
>                  properly. At times it is possible to get it into a state,
>                  however, where a request will hang until another concurrent
>                  request "kicks" the first, at which point the second will
>                  hang. My theory is that this has to do with the
>                  pthread_cond_*() implementation in FreeBSD, but it's still
>                  possible that it is in APR.
>
>       Justin adds: Oh, FreeBSD threads are implemented entirely with
>                    select()/poll()/longjmp().  Welcome to the nightmare.
>                    So, that means a ktrace output also has the thread
>                    scheduling internals in it (since it is all the same to
>                    the kernel).  Which makes it hard to distinguish between
>                    our select() calls and their select() calls.
>                    *bangs head on wall repeatedly*  But, some of the libc_r
>                    files have a DBG_MSG #define.  This is moderately helpful
>                    when used with -DNO_DETACH.  The kernel scheduler isn't
>                    waking up the threads on a select().  Yum.  And, I bet
>                    those decrementing select calls have to do with the
>                    scheduler.  Time to brush up on our OS fundamentals.
>
> -aaron
>


Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by Aaron Bannert <aa...@clove.org>.
On Tue, May 21, 2002 at 01:14:00PM -0300, Marc G. Fournier wrote:
> Just to confirm, would it be the following that I'm looking at:
...

Nope, further down:

    * FreeBSD, threads, and worker MPM.  All seems to work fine 
      if you only have one worker process with many threads.  Add
      a second worker process and the accept lock seems to be
      lost.  This might be an APR issue with how it deals with
      the child_init hook (i.e. the fcntl lock needs to be resynced).
      More examination and analysis is required.
        Status: This has also been reported on Cygwin.  
        Message-ID: <3C...@wapme-systems.de> (cygnus)

      Justin says: So, FreeBSD-CURRENT and Cygwin have the same
                   problem.  Yum.  If another platform has this
                   with worker, this becomes a showstopper.
      Aaron says: I spent some time dissecting this and have come to
              the conclusion that it is not a problem in the worker MPM
              (or at least, it is not isolated to a problem in worker).
              I'll list some of the problems I'm seeing in case someone
              else wants to pick up where I've left off:
               - Delivery of just about any signal to one of the child
                 processes will send it into an infinite loop as well.
               - Even though the parent is spinning out of control,
                 at first the child or children will appear to work
                 properly. At times it is possible to get it into a state,
                 however, where a request will hang until another concurrent
                 request "kicks" the first, at which point the second will
                 hang. My theory is that this has to do with the
                 pthread_cond_*() implementation in FreeBSD, but it's still
                 possible that it is in APR.

      Justin adds: Oh, FreeBSD threads are implemented entirely with
                   select()/poll()/longjmp().  Welcome to the nightmare.
                   So, that means a ktrace output also has the thread
                   scheduling internals in it (since it is all the same to
                   the kernel).  Which makes it hard to distinguish between
                   our select() calls and their select() calls.
                   *bangs head on wall repeatedly*  But, some of the libc_r
                   files have a DBG_MSG #define.  This is moderately helpful
                   when used with -DNO_DETACH.  The kernel scheduler isn't
                   waking up the threads on a select().  Yum.  And, I bet
                   those decrementing select calls have to do with the
                   scheduler.  Time to brush up on our OS fundamentals.

-aaron
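
To make the "accept lock" in the STATUS notes concrete, here is a minimal
Python sketch of the general technique: several processes share a lock file
and take an exclusive fcntl lock around accept(), so only one of them waits
in accept() at a time. This is an illustration under assumptions, not
httpd's or APR's code; the lock file, the function name, and the loopback
socket are invented for the example, and it runs only on Unix-like systems.

```python
import fcntl
import os
import socket
import tempfile


def serialized_accept(listener, lock_fd):
    """Accept one connection while holding an exclusive fcntl lock (illustrative)."""
    fcntl.lockf(lock_fd, fcntl.LOCK_EX)      # blocks until this process owns the lock
    try:
        conn, _addr = listener.accept()
    finally:
        fcntl.lockf(lock_fd, fcntl.LOCK_UN)  # let another process take its turn
    return conn


# Demo: one listener on loopback, one client, one locked accept().
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))              # port 0: let the OS pick a free port
listener.listen(1)

lock_file = tempfile.NamedTemporaryFile()    # hypothetical lock file
# fcntl locks are per process: a forked child does not inherit its parent's lock.
lock_fd = os.open(lock_file.name, os.O_RDWR)

client = socket.create_connection(listener.getsockname())
client.sendall(b"ping")
conn = serialized_accept(listener, lock_fd)
received = conn.recv(4)
print(received)

for s in (conn, client, listener):
    s.close()
os.close(lock_fd)
lock_file.close()
```

In a multi-process server each child would call serialized_accept() in a
loop; the STATUS notes suspect that this hand-off between processes is where
things go wrong on FreeBSD.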

Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by "Marc G. Fournier" <sc...@hub.org>.

Just to confirm, would it be the following that I'm looking at:

================================================
    * Generate a good bug report to send to the FreeBSD hackers that details
      the problems we have seen with threads and system calls (specifically
      sendfile data is corrupted).  From our analysis so far, we don't think
      that this is an APR issue, but rather a FreeBSD kernel issue.  Our
      current solution is to just disable threads across the board on
      FreeBSD.

      MsgID: <20...@ebuilt.com>
        Status: Fixed in -CURRENT.  MFC in about a week.  Continuing
                testing with threads on FreeBSD.

                FreeBSD PR kern/32684:
                  http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/32684
================================================


On Tue, 21 May 2002, Aaron Bannert wrote:

> > > Just tried the same config on a separate FreeBSD machine ... both are
> > > running 4.6-PRERELEASE right now, and the problem(s) are the same ...
> > > connect, hang, kill server and page gets sent across then server goes down
> ...
> > AFAIK it is still unclear whether APR is the reason, or rather the
> > somewhat broken pthreads on FreeBSD....  Most developers think it's the
> > latter.
>
> A few of us spent some time diagnosing this a while back, and you
> may find our notes in the STATUS file to be of some use.
>
> -aaron
>


Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by Aaron Bannert <aa...@clove.org>.
> > Just tried the same config on a separate FreeBSD machine ... both are
> > running 4.6-PRERELEASE right now, and the problem(s) are the same ...
> > connect, hang, kill server and page gets sent across then server goes down
...
> AFAIK it is still unclear whether APR is the reason, or rather the
> somewhat broken pthreads on FreeBSD....  Most developers think it's the
> latter.

A few of us spent some time diagnosing this a while back, and you
may find our notes in the STATUS file to be of some use.

-aaron

Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by Justin Erenkrantz <je...@apache.org>.
On Tue, May 21, 2002 at 01:45:25PM -0400, Cliff Woolley wrote:
> It's unclear from
> http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc_r/uthread/uthread_sendfile.c
> whether or not the problem is fixed in 4.6PRERELEASE, unless that is
> strictly based off of HEAD.  There's no tag in that file for it.  Changes
> definitely went in after 4.5 that would affect this problem.

The sendfile libc_r problem was fixed before 4.5 was released and
made it into 4.5.  That allows files to get served without sending
corrupt data.  However, threads are only partially supported with
-CURRENT (5.0) and are not supported with 4.x at all.

BUT, I've heard of some success from people using linuxthreads on
FreeBSD.  I certainly don't have the time (or the platforms) to
research this more.  -- justin

Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by Cliff Woolley <jw...@virginia.edu>.
On Tue, 21 May 2002, Marc G. Fournier wrote:

> Okay, I had heard that the 'broken pthreads' was something that was fixed
> since 4.5 ... are you saying that this isn't the case? :(  Going to
> threads was one of my key reasons for moving to Apache2 :(

It's unclear from
http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc_r/uthread/uthread_sendfile.c
whether or not the problem is fixed in 4.6PRERELEASE, unless that is
strictly based off of HEAD.  There's no tag in that file for it.  Changes
definitely went in after 4.5 that would affect this problem.

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by "Marc G. Fournier" <sc...@hub.org>.
On Tue, 21 May 2002, Martin Kraemer wrote:

> On Tue, May 21, 2002 at 10:12:17AM -0300, Marc G. Fournier wrote:
> >
> > Just tried the same config on a separate FreeBSD machine ... both are
> > running 4.6-PRERELEASE right now, and the problem(s) are the same ...
> > connect, hang, kill server and page gets sent across then server goes down
> > ...
>
> Same here. And the same probably on every FreeBSD.
> Try --with-mpm=prefork which works better.
>
> AFAIK it is still unclear whether APR is the reason, or rather the
> somewhat broken pthreads on FreeBSD....  Most developers think it's the
> latter.

Okay, I had heard that the 'broken pthreads' was something that was fixed
since 4.5 ... are you saying that this isn't the case? :(  Going to
threads was one of my key reasons for moving to Apache2 :(



Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by Martin Kraemer <Ma...@Fujitsu-Siemens.com>.
On Tue, May 21, 2002 at 10:12:17AM -0300, Marc G. Fournier wrote:
> 
> Just tried the same config on a separate FreeBSD machine ... both are
> running 4.6-PRERELEASE right now, and the problem(s) are the same ...
> connect, hang, kill server and page gets sent across then server goes down
> ...

Same here. And the same probably on every FreeBSD.
Try --with-mpm=prefork which works better.

AFAIK it is still unclear whether APR is the reason, or rather the
somewhat broken pthreads on FreeBSD....  Most developers think it's the
latter.

   Martin
-- 
<Ma...@Fujitsu-Siemens.com>         |     Fujitsu Siemens
Fon: +49-89-636-46021, FAX: +49-89-636-47655 | 81730  Munich,  Germany

Re: Apache 2.0.36 w/ FreeBSD ... 'hangs' ...

Posted by "Marc G. Fournier" <sc...@hub.org>.
Just tried the same config on a separate FreeBSD machine ... both are
running 4.6-PRERELEASE right now, and the problem(s) are the same ...
connect, hang, kill server and page gets sent across then server goes down
...



On Tue, 21 May 2002, Marc G. Fournier wrote:

>
> Can't describe it much better than that ... I've got it compiled
> WITH_THREADS and MPM=worker ... starts fine, get a line in
> /var/log/httpd-error.log:
>
> [Tue May 21 10:01:47 2002] [notice] Digest: generating secret for digest authentication ...
> [Tue May 21 10:01:52 2002] [notice] Digest: done
> [Tue May 21 10:01:53 2002] [notice] Apache/2.0.36 (Unix) DAV/2 PHP/4.2.1 configured -- resuming normal operations
>
> But when I try to connect to the site in question, the connection just
> hangs there ... as soon as I do a 'stop' of the server, though, the page
> is sent and then httpd terminates ...
>
> I'm running with settings of:
>
> <IfModule worker.c>
> StartServers         1
> MaxClients         128
> MinSpareThreads      5
> MaxSpareThreads     15
> ThreadsPerChild     64
> MaxRequestsPerChild  0
> </IfModule>
>
> And am getting no errors that I can find ...
>
> ps shows:
>
> earth# ps aux | grep http
> root    78796  0.0  0.1  9764 6012  ??  SsJ  10:06AM   0:00.11 /usr/local/sbin/httpd
> nobody  78799  0.0  0.1  9268 4944  ??  SJ   10:06AM   0:00.01 /usr/local/sbin/httpd
> nobody  78825  0.0  0.2 15044 6640  ??  IJ   10:06AM   0:00.03 /usr/local/sbin/httpd
>
> Which I find kinda odd since I'm not getting that many hits, and told it
> to only start with one :(
>
> Thoughts as to what I should look at here?  I've even enabled the
> 'apache_runtime_status' file, and that appears to make no difference ...
>
>
>
>