Posted to apache-bugdb@apache.org by dg...@apache.org on 1999/05/01 19:39:02 UTC
Re: os-linux/3312: Children die. Parent stops serving requests
[In order for any reply to be added to the PR database, ]
[you need to include <ap...@Apache.Org> in the Cc line ]
[and leave the subject line UNCHANGED. This is not done]
[automatically because of the potential for mail loops. ]
[If you do not include this Cc, your reply may be ig- ]
[nored unless you are responding to an explicit request ]
[from a developer. ]
[Reply only with text; DO NOT SEND ATTACHMENTS! ]
Synopsis: Children die. Parent stops serving requests
State-Changed-From-To: feedback-analyzed
State-Changed-By: dgaudet
State-Changed-When: Sat May 1 10:39:02 PDT 1999
State-Changed-Why:
I examined the straces a while ago, but forgot to comment.
Here's a portion of the parent's trace:
time(NULL) = 909702870
wait4(-1, 0xbffffe64, WNOHANG, NULL) = 0
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
time(NULL) = 909702871
fork() = 26032
wait4(-1, [WIFEXITED(s) && WEXITSTATUS(s) == 0], WNOHANG, NULL) = 26032
--- SIGCHLD (Child exited) ---
wait4(-1, 0xbffffe64, WNOHANG, NULL) = -1 ECHILD (No child processes)
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
time(NULL) = 909703113
Somehow 242 seconds passed between the two time() calls... the parent does
nothing cpu intensive, so I doubt it's that. It's possible the guy's box
is swapping to hell... but we've got about a dozen similar reports. The
reports are against 2.0.30, 2.0.32, and 2.0.33.
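For reference, the gap can be checked directly from the trace. This is only a sketch of the arithmetic: the two timestamps are copied from the time() calls above, the variable names are made up for illustration, and the 1-second figure is the select() timeout shown in the trace.

```python
# Timestamps copied from the two time() calls around the fork() in the trace.
before_fork = 909702871
after_select = 909703113

gap = after_select - before_fork
minutes, seconds = divmod(gap, 60)

# The only deliberate sleep between the two calls is select() with a
# {1, 0} timeout, i.e. one second.
expected_sleep = 1
unexplained = gap - expected_sleep

print(f"gap: {gap} s ({minutes} min {seconds} s), unexplained: {unexplained} s")
```

So roughly four minutes are unaccounted for in a stretch that should have taken about one second.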
Oh, then there's the odd SIGCHLD followed by ECHILD... there are a few other
instances of that -- SIGCHLDs happening and wait4() not reporting
anything.
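To make the three wait4() outcomes in the trace easier to read, here is a minimal sketch of a non-blocking reap loop. It is not Apache's code (Apache is written in C); it uses Python's os.waitpid() for brevity, and the function names reap_exited_children and demo are hypothetical. A return of pid 0 corresponds to the "= 0" lines in the trace, a real pid to the "= 26032" line, and ChildProcessError to ECHILD.

```python
import os
import time


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns (reaped, no_children): `reaped` is a list of (pid, exit_code)
    pairs, and `no_children` is True when the kernel answered ECHILD.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            # ECHILD: this process has no children left to wait for.
            return reaped, True
        if pid == 0:
            # Children exist, but none of them has exited yet.
            return reaped, False
        # Exit code if the child exited normally, None if a signal killed it.
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
        reaped.append((pid, code))


def demo(timeout=5.0):
    """Fork one child that exits with status 0, then poll until it is reaped."""
    child = os.fork()
    if child == 0:
        os._exit(0)  # in the child: exit immediately
    reaped = []
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        batch, no_children = reap_exited_children()
        reaped.extend(batch)
        if no_children or any(pid == child for pid, _ in reaped):
            break
        time.sleep(0.01)  # child not finished yet, poll again shortly
    return child, reaped


if __name__ == "__main__":
    child, reaped = demo()
    print("forked", child, "reaped", reaped)
```

In the trace the child is reaped before the SIGCHLD line appears, so an ECHILD on the following wait4() is what a loop like this would also see at that point. The 242-second gap is the part that stays unexplained.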
The short answer: kernel problem. Alan Cox hasn't heard of
this problem before, so it's probably an unknown problem.
Dean
Re: os-linux/3312: Children die. Parent stops serving requests
Posted by Ole Tange <ta...@tange.dk>.
On 1 May 1999 dgaudet@apache.org wrote:
> Somehow 242 seconds passed between the two time() calls... the parent does
> nothing cpu intensive, so I doubt it's that. It's possible the guy's box
> is swapping to hell... but we've got about a dozen similar reports.
Nope. In that case the load ought to rise, which it did not. The problem
was worked around by disabling keep-alives.
> The reports are against 2.0.30, 2.0.32, and 2.0.33.
After upgrading to kernel 2.0.36 and apache 1.3.4 I have been able to
re-enable keepalives with no problems so far.
> The short answer: kernel problem. Alan Cox hasn't heard of
> this problem before, so it's probably an unknown problem.
The short comment: Case appears solved by upgrading.
/Ole