Posted to dev@httpd.apache.org by Ed Korthof <ed...@organic.com> on 1997/01/20 04:06:44 UTC

[BUG]s : ErrorDocument and kill -HUP behavior

Both of these bugs occur on Sparcs running Solaris 2.5.  They are both at
least moderately serious; the second is probably more serious, though the
first does cause a strange and unexplained server death (during startup, so
it doesn't run at all under the wrong conditions).

The first bug is not easily reproduced.  It appears to be a memory
corruption bug, and it has only appeared on a server with many virtual
hosts (done through both multiple IPs and multiple ports).   With the
proper conditions, it is perfectly repeatable.  I can easily provide core
files if anyone wants one.

The bug occurs during startup, and kills the server.  The error message
occurs after some of the virtual hosts have been configured (because not
all of the named document roots exist, there are generally about 8
messages about non-extant DocumentRoots; four go by when this occurs). 

The error message is 'http_main.c:1874: failed assertion !nr->used' (the
line number is probably not standard, since I'm running a patched 1.2b2,
but it's in the function copy_listeners).  From within gdb, the value of
listeners->used (same as nr->used) is 613152; I believe it is consistently
that.  listeners->next is not readable, suggesting some sort of memory
corruption. 

Finally, the really weird part.  If I fully specify the AccessConfig and
ResourceConfig files (rather than using relative names), the problem
appears to go away.  If I move all the files (from /local/etc/httpd) the
problem also goes away, even without fully specifying the above two files.
And finally, if one of them is fully specified (whichever it is), that
will solve the problem.

The problem first occurred when I added an ErrorDocument (503, w/ quoted
text) to httpd.conf.

Does anyone have any idea what might be causing this?

-----

The second bug involves signal handling for SIGHUP on heavily loaded
servers.  It's always been the case that once in a while, one of the
children doesn't die when the parent sends a SIGHUP to its process group;
the parent waits to reap the child, but that takes a while (the child
won't die until it's served up its remaining MaxRequestsPerChild -- we
generally have to restart before this occurs).

However, on a heavily loaded server it appears that this problem is easily
reproducible -- as many as 19 children will be left alive after a SIGHUP;
and a SIGHUP will almost never restart the server cleanly.  This is on a
server working more or less at capacity... I can provide core files from
this as well.
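
For anyone who wants to poke at this outside the server, here is a minimal
C sketch of the pattern described above: signal a process group with SIGHUP,
reap what exits, and count what is left alive. It is not Apache's code. The
function names (spawn_group, hup_and_reap, kill_stragglers) are hypothetical,
the children just sit in pause() instead of serving requests, and the
children are placed in their own process group (an assumption made so that
the sketch cannot signal the shell or the caller; per the description above,
the real parent signals its own group).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork n (>= 1) idle children into one new process
 * group. Returns the group id (pid of the first child), or -1 if no child
 * could be started. */
static pid_t spawn_group(int n)
{
    pid_t pgid = 0;
    /* Make sure the children inherit the default SIGHUP action
     * (terminate), even if the caller was started with SIGHUP ignored. */
    void (*old)(int) = signal(SIGHUP, SIG_DFL);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            for (;;)
                pause();        /* stand-in for a child serving requests */
        if (pgid == 0)
            pgid = pid;         /* first child becomes the group leader */
        setpgid(pid, pgid);
    }
    signal(SIGHUP, old);
    return pgid > 0 ? pgid : -1;
}

/* Send SIGHUP to every process in the group, then poll for exits for at
 * most max_polls * 10 ms. Returns how many children were reaped. */
static int hup_and_reap(pid_t pgid, int expected, int max_polls)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };
    int reaped = 0;

    if (kill(-pgid, SIGHUP) != 0)
        return 0;
    for (int i = 0; i < max_polls && reaped < expected; i++) {
        while (waitpid(-pgid, NULL, WNOHANG) > 0)
            reaped++;
        if (reaped < expected)
            nanosleep(&tick, NULL);
    }
    return reaped;
}

/* Force-kill and reap anything still alive in the group. Returns the
 * number of stragglers, i.e. children that survived the SIGHUP. */
static int kill_stragglers(pid_t pgid)
{
    int n = 0;

    if (kill(-pgid, SIGKILL) != 0)
        return 0;               /* nobody left in the group */
    while (waitpid(-pgid, NULL, 0) > 0)
        n++;
    return n;
}
```

Calling spawn_group(4), then hup_and_reap(pgid, 4, 500), then
kill_stragglers(pgid) gives the reaped count and the straggler count. With
idle children the straggler count should be zero; the report above is that
on a loaded server the equivalent count is often well above zero.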

I don't know the signal handling code all that well, or I'd try to deal
with this; anyone else have a good idea what might be the problem?  If
not, I'll work on this, but it'll probably be a few weeks (at the
earliest) before I'm likely to have a solution.

     -- Ed Korthof        |  Web Server Engineer --
     -- ed@organic.com    |  Organic Online, Inc --
     -- (415) 278-5676    |  Fax: (415) 284-6891 --




Re: [BUG]s : ErrorDocument and kill -HUP behavior

Posted by Ed Korthof <ed...@organic.com>.
<snicker self>

The behavior I described earlier, with ErrorDocument and Apache dying
during startup was due to a bug which was fixed by 1.2b4 (I was working
from the 1.2b2 base, which I'm going to stop doing) -- an uninitialized
variable.  <rueful> Sorry for the spam...

     -- Ed Korthof        |  Web Server Engineer --
     -- ed@organic.com    |  Organic Online, Inc --
     -- (415) 278-5676    |  Fax: (415) 284-6891 --


Re: [BUG]s : ErrorDocument and kill -HUP behavior

Posted by Ed Korthof <ed...@organic.com>.
Umm -- the problem with ErrorDocument also seems dependent on some stuff
I'm working on, which is not part of the code base.  I'm not sure how the
stuff I wrote would trigger that problem (esp. only under such specialized
conditions -- I have two additional directives, and if either one is not 
in the .conf file, things seem to work fine.... whatever....), but clearly
it's doing something.

Anyway, if I do find something wrong in the code base, I'll mention it;
and the bug with SIGHUPs being ignored is definitely present w/o the stuff
I'm working on (we see it on our live servers)...

     -- Ed Korthof        |  Web Server Engineer --
     -- ed@organic.com    |  Organic Online, Inc --
     -- (415) 278-5676    |  Fax: (415) 284-6891 --