Posted to dev@httpd.apache.org by Brian Behlendorf <br...@hyperreal.org> on 1998/05/07 03:15:04 UTC

Re: os-solaris/2185: 'apachectl restart' or 'apachectl graceful' causes httpd to die.

Hmm, possible bug in 'rotatelogs' - is the fact that it doesn't implement a
handler for SIGTERM a problem on this platform, maybe?
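
For reference, a minimal sketch of what a SIGTERM handler in a piped-log
style program could look like. This is not the rotatelogs source; the names
on_sigterm, install_sigterm_handler and copy_until_term are made up for
illustration. Note that the default action for SIGTERM is already to
terminate the process, so a missing handler should not by itself keep a
process alive; a handler mostly buys a clean shutdown.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler; the copy loop checks it between reads. */
static volatile sig_atomic_t got_sigterm = 0;

/* Hypothetical handler: only sets a flag (async-signal-safe). */
void on_sigterm(int sig)
{
    (void)sig;
    got_sigterm = 1;
}

/* Install the handler. SA_RESTART is left off on purpose so that a
 * blocking read() fails with EINTR and the loop can notice the flag. */
int install_sigterm_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGTERM, &sa, NULL);
}

/* Hypothetical copy loop: in_fd -> out_fd until EOF or SIGTERM.
 * Returns 0 on a clean stop, -1 on an I/O error. */
int copy_until_term(int in_fd, int out_fd)
{
    char buf[4096];
    while (!got_sigterm) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            break;                  /* EOF: writer closed the pipe */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: re-check the flag */
            return -1;
        }
        if (write(out_fd, buf, (size_t)n) != n)
            return -1;              /* short or failed write */
    }
    return 0;
}
```

A logger built this way would call install_sigterm_handler() once at startup
and then run copy_until_term(0, log_fd), closing the log file after it returns.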

>Delivered-To: brian@hyperreal.org
>Date: Wed, 06 May 1998 08:33:26 -0700
>From: "C. R. Oldham" <cr...@nca.asu.edu>
>Organization: NCA Commission on Schools
>X-Mailer: Mozilla 4.05 [en] (WinNT; U)
>To: brian@hyperreal.org
>CC: apache-bugdb@apache.org, brian@apache.org, apbugs@Apache.Org
>Subject: Re: os-solaris/2185: 'apachectl restart' or 'apachectl graceful' causes httpd to die.
>
>
>
>brian@hyperreal.org wrote:
>
>> [In order for any reply to be added to the PR database, ]
>> [you need to include <ap...@Apache.Org> in the Cc line ]
>> [and leave the subject line UNCHANGED.  This is not done]
>> [automatically because of the potential for mail loops. ]
>>
>> Synopsis: 'apachectl restart' or 'apachectl graceful' causes httpd to die.
>>
>> State-Changed-From-To: open-analyzed
>> State-Changed-By: brian
>> State-Changed-When: Tue May  5 18:10:45 PDT 1998
>> State-Changed-Why:
>> Could you run "truss", "strace", or some other type of system
>> call tracking program on it, so we could see where it dies or
>> becomes unresponsive?  You could also use "gcore" to get a core
>> file and see where it might be hung.
>
>I did a 'gcore' on the httpd that remained after 'apachectl restart'.  Then I
>ran gdb on the core file and obtained this backtrace.
>
>(gdb) backtrace
>#0  0x8018255b in ?? () from /usr/lib/libc.so.1
>#1  0x8019904f in ?? () from /usr/lib/libc.so.1
>#2  0x80a6dca in reclaim_child_processes ()
>#3  0x80a8c10 in standalone_main ()
>#4  0x80a8f44 in main ()
>#5  0x805baa7 in _start ()
>
>Here is 'truss1.out' from 'truss -o /tmp/truss1.out -p 3110'
>(3110 was the pid of the master httpd).
>
>[...]
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467787
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467788
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467789
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467790
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467791
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467792
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467793
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467794
>poll(0x08045B00, 1, 0)    = 1
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>poll(0x08045B98, 0, 1000)   = 0
>time()      = 894467795
>waitid(P_ALL, 0, 0x08047B50, WEXITED|WTRAPPED|WNOHANG) = 0
>    Received signal #1, SIGHUP, in poll() [caught]
>      siginfo: SIGHUP pid=3175 uid=0
>poll(0x08045B98, 0, 1000)   Err#4 EINTR
>setcontext(0x0804597C)
>time()      = 894467795
>sigaction(SIGHUP, 0x08047B68, 0x08047BC4) = 0
>sigaction(SIGUSR1, 0x08047B60, 0x08047BBC) = 0
>kill(-3110, SIGHUP)    = 0
>    Received signal #1, SIGHUP [ignored]
>      siginfo: SIGHUP pid=3110 uid=0
>    Received signal #18, SIGCLD [default]
>      siginfo: SIGCLD CLD_EXITED pid=3128 status=0x0000
>poll(0x08045B80, 0, 17)    = 0
>waitid(P_PID, 3127, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>waitid(P_PID, 3128, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>waitid(P_PID, 3129, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>waitid(P_PID, 3131, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>waitid(P_PID, 3133, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>waitid(P_PID, 3157, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>kill(3123, SIGTERM)    = 0
>poll(0x08045B80, 0, 66)    = 0
>waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>kill(3123, SIGTERM)    = 0
>poll(0x08045B80, 0, 263)   = 0
>waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>kill(3123, SIGTERM)    = 0
>poll(0x08045B80, 0, 1049)   = 0
>waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>kill(3123, SIGTERM)    = 0
>poll(0x08045B80, 0, 4195) (sleeping...)
>poll(0x08045B80, 0, 4195)   = 0
>waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>kill(3123, SIGTERM)    = 0
>poll(0x08045B80, 0, 16778) (sleeping...)
>poll(0x08045B80, 0, 16778)   = 0
>waitid(P_PID, 3123, 0x08047B38, WEXITED|WTRAPPED|WNOHANG) = 0
>kill(3123, SIGTERM)    = 0
>poll(0x08045B80, 0, 67109) (sleeping...)
> *** process killed ***
>
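
The tail of the trace above repeats one pattern: sleep, a non-blocking
wait on pid 3123, then another SIGTERM, with sleeps of 17, 66, 263, 1049,
4195, 16778 and 67109 ms. Those values are consistent with a delay that
starts at 16384 microseconds and quadruples each round. A rough sketch of
that pattern follows. It is not the source of reclaim_child_processes();
backoff_ms and term_and_reap are hypothetical names, and waitpid() stands
in for the waitid() calls truss shows.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Delay in milliseconds before check number `attempt` (0-based):
 * 16384 microseconds, times 4 per attempt, rounded up to whole ms.
 * Limited to 8 attempts so the value fits in a 32-bit long. */
long backoff_ms(int attempt)
{
    long usec = 16384;
    assert(attempt >= 0 && attempt < 8);
    for (int i = 0; i < attempt; i++)
        usec *= 4;
    return (usec + 999) / 1000;
}

/* Sleep, poll the child without blocking, and send SIGTERM again if it
 * has not been reaped yet. max_attempts must be at most 8.
 * Returns 0 once `pid` is reaped, -1 if it never exits or waitpid fails. */
int term_and_reap(pid_t pid, int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int status;
        pid_t r;

        poll(NULL, 0, (int)backoff_ms(attempt)); /* plain timed sleep */
        r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 0;           /* child exited and was reaped */
        if (r < 0)
            return -1;          /* e.g. ECHILD: not a child of ours */
        kill(pid, SIGTERM);     /* no status yet: ask again */
    }
    return -1;
}
```

With this sketch, a child that dies on the first SIGTERM is reaped on the
second pass; one that never reports an exit status keeps the caller in the
loop for the full series of sleeps, much as the trace shows.
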
>And below is the list of running httpds at the time of restart.  Note that
>process 3123 is not in the list.
>
>pts/2 socrates[16]# ps -ef | grep http
>~/src/apache_1.3b6/src/main
>  nobody  3128  3110  3 08:14:25 ?        0:01 /usr/local/apache/sbin/httpd
>  nobody  3127  3110  2 08:14:25 ?        0:01 /usr/local/apache/sbin/httpd
>  nobody  3129  3110  0 08:14:25 ?        0:01 /usr/local/apache/sbin/httpd
>    root  3110     1  0 08:14:21 ?        0:00 /usr/local/apache/sbin/httpd
>  nobody  3131  3110  1 08:14:25 ?        0:01 /usr/local/apache/sbin/httpd
>  nobody  3133  3110  0 08:14:25 ?        0:00 /usr/local/apache/sbin/httpd
>  nobody  3157  3110  0 08:15:02 ?        0:00 /usr/local/apache/sbin/httpd
>
>Further investigation revealed that it belongs to 'rotatelogs', which I use
>for all my logging.  Rotatelogs does not install a signal handler for
>SIGTERM--is this the problem?
>
>
>--
>| Charles R. (C. R.) Oldham     | NCA Commission on Schools        |
>| cro@nca.asu.edu               | Arizona St. Univ., PO Box 873011,|
>| V:602/965-8700 F:602/965-9423 | Tempe, AZ 85287-3011           _ |
>| "I like it!"--Citizen G'Kar   | #include <disclaimer.h>       X_>|
>
>
>
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
pure chewing satisfaction                                  brian@apache.org
                                                        brian@hyperreal.org