You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Jeff Trawick <tr...@gmail.com> on 2005/05/15 15:22:57 UTC

[PATCH] PR 26467 child process can hang when piped logger goes away

Apache keeps the read handle to the pipe open in the child process. 
Thus, if the piped logger goes away and a new piped logger doesn't
inherit the same handles, a child process can hang writing to the
logger since the child process itself is keeping the read side of the
pipe open.

The scenario where this leads to a hang is during graceful restart. 
The child process is busy writing to the logging pipe (enough to cause
them to block due to kernel pipe flow control).  Then the parent
process whacks the piped logger and closes its handles to that pipe as
part of restart processing.  The child process is stuck forever trying
to write to the pipe.

Comments/concerns?  Is there a cleaner way to accomplish this?

In the discussion of PR 26467 there is a reference to an older dev@
thread, including this text:

>Joe Orton says, "To fix this properly, I suppose piped loggers should not get
>SIGTERMedduring a graceful restart, they should read till EOF then
exit(0): then
>when the last child attached to the piped logger for a particular generation
>quits, the pipe is closed and the piped logger terminates gracefully too,
>without losing log messages."

In that scenario, the patch I've attached here would not normally be
needed, but would be a backup if that doesn't work.  It is not clear
to me that we can change that aspect of piped loggers anyway, since it
can break third-party code.

A couple of tangents:

a) Wouldn't it be nice if there was a way to indicate (when
registering an other-child) that the other-child process has to be
maintained until all of the current generation of children goes
away...

Thus, mod_cgid requests wouldn't fail during graceful restart, and
this piped logging hang would potentially be avoided as well.  We'd
still hit the logger with SIGTERM, but only after  the logger is no
longer needed.

i. I still like changing the server to avoid the piped logging hang
regardless of what happens elsewhere.
ii. mod_cgid would need some tweaks to use a new Unix socket for new
generations, so that any last requests issued by the old generation
would be serviced by the old generation cgid daemon but requests
issued by the new generation would be serviced by the new generation
cgid daemon

b) wouldn't it be nice to have a hook similar to child-init which
would be run in non-MPM child processes (e.g., mod_cgid's child or
mod_fastcgi's child or children created in various other
non-Apache-distributed modules)

The module would need to add the ap_run_non-mpm-child-init() call, but
then core would be able to close the listeners in the child and close
the piped logger read handles in the child, and potentially other
unsanitary conditions could be fixed.  For now we have to assume that
such non-mpm children really do exit at the right time.

It is disturbing to see child processes with file descriptors they
don't need, whether or not I have a use-case for when it breaks
something.

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Tom Hoefakker <tw...@iw.net>.
good point. I suspected there was a reason for the difference in 
behavior between this and mod_log_config.



Jeff Trawick wrote:
> On 5/15/05, Tom Hoefakker <tw...@iw.net> wrote:
> 
>>Perhaps while we're dealing with piped log process/pipe management
>>perhaps we can take care of this one at the same time:
>>
>>http://issues.apache.org/bugzilla/show_bug.cgi?id=34537
> 
> 
> My own rationalization for current behavior (admittedly weak):
> If the access log piped logger keeps crashing, at least there is an
> error log for reporting the problem, so it isn't the worst thing in
> the world to keep trying to start the darn thing over and over and
> over (fork-a-matic) since we report what we're trying to do.  Is it
> okay to try over and over to restart an error log piped logger if
> there is nowhere to report the issue?
> 

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Jeff Trawick <tr...@gmail.com>.
On 5/15/05, Tom Hoefakker <tw...@iw.net> wrote:
> Perhaps while we're dealing with piped log process/pipe management
> perhaps we can take care of this one at the same time:
> 
> http://issues.apache.org/bugzilla/show_bug.cgi?id=34537

My own rationalization for current behavior (admittedly weak):
If the access log piped logger keeps crashing, at least there is an
error log for reporting the problem, so it isn't the worst thing in
the world to keep trying to start the darn thing over and over and
over (fork-a-matic) since we report what we're trying to do.  Is it
okay to try over and over to restart an error log piped logger if
there is nowhere to report the issue?

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Tom Hoefakker <tw...@iw.net>.
Perhaps while we're dealing with piped log process/pipe management 
perhaps we can take care of this one at the same time:

http://issues.apache.org/bugzilla/show_bug.cgi?id=34537

Thanks!
Tom

Jeff Trawick wrote:
> Apache keeps the read handle to the pipe open in the child process. 
> Thus, if the piped logger goes away and a new piped logger doesn't
> inherit the same handles, a child process can hang writing to the
> logger since the child process itself is keeping the read side of the
> pipe open.
> 
> The scenario where this leads to a hang is during graceful restart. 
> The child process is busy writing to the logging pipe (enough to cause
> them to block due to kernel pipe flow control).  Then the parent
> process whacks the piped logger and closes its handles to that pipe as
> part of restart processing.  The child process is stuck forever trying
> to write to the pipe.
> 
> Comments/concerns?  Is there a cleaner way to accomplish this?
> 
> In the discussion of PR 26467 there is a reference to an older dev@
> thread, including this text:
> 
> 
>>Joe Orton says, "To fix this properly, I suppose piped loggers should not get
>>SIGTERMedduring a graceful restart, they should read till EOF then
> 
> exit(0): then
> 
>>when the last child attached to the piped logger for a particular generation
>>quits, the pipe is closed and the piped logger terminates gracefully too,
>>without losing log messages."
> 
> 
> In that scenario, the patch I've attached here would not normally be
> needed, but would be a backup if that doesn't work.  It is not clear
> to me that we can change that aspect of piped loggers anyway, since it
> can break third-party code.
> 
> A couple of tangents:
> 
> a) Wouldn't it be nice if there was a way to indicate (when
> registering an other-child) that the other-child process has to be
> maintained until all of the current generation of children goes
> away...
> 
> Thus, mod_cgid requests wouldn't fail during graceful restart, and
> this piped logging hang would potentially be avoided as well.  We'd
> still hit the logger with SIGTERM, but only after  the logger is no
> longer needed.
> 
> i. I still like changing the server to avoid the piped logging hang
> regardless of what happens elsewhere.
> ii. mod_cgid would need some tweaks to use a new Unix socket for new
> generations, so that any last requests issued by the old generation
> would be serviced by the old generation cgid daemon but requests
> issued by the new generation would be serviced by the new generation
> cgid daemon
> 
> b) wouldn't it be nice to have a hook similar to child-init which
> would be run in non-MPM child processes (e.g., mod_cgid's child or
> mod_fastcgi's child or children created in various other
> non-Apache-distributed modules)
> 
> The module would need to add the ap_run_non-mpm-child-init() call, but
> then core would be able to close the listeners in the child and close
> the piped logger read handles in the child, and potentially other
> unsanitary conditions could be fixed.  For now we have to assume that
> such non-mpm children really do exit at the right time.
> 
> It is disturbing to see child processes with file descriptors they
> don't need, whether or not I have a use-case for when it breaks
> something.

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Jeff Trawick <tr...@gmail.com>.
On 5/15/05, Jeff Trawick <tr...@gmail.com> wrote:
> Apache keeps the read handle to the pipe open in the child process.
> Thus, if the piped logger goes away and a new piped logger doesn't
> inherit the same handles, a child process can hang writing to the
> logger since the child process itself is keeping the read side of the
> pipe open.
> 
> The scenario where this leads to a hang is during graceful restart.
> The child process is busy writing to the logging pipe (enough to cause
> them to block due to kernel pipe flow control).  Then the parent
> process whacks the piped logger and closes its handles to that pipe as
> part of restart processing.  The child process is stuck forever trying
> to write to the pipe.

I stuck a patch for 1.3 at http://people.apache.org/~trawick/pr26467_13.txt

I need to test a little more and convince myself that it is okay to
represent pipe handles with int (e.g., because other platforms don't
use that code??).  I guess the new code needs to be conditional based
on platform.  Anybody want to play with it?

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Joe Orton <jo...@redhat.com>.
On Sun, May 15, 2005 at 08:22:32PM -0400, Jeff Trawick wrote:
> On 5/15/05, Joe Orton <jo...@redhat.com> wrote:
> > I think it would be OK to make the behaviour change for 2.2 at least,
> > since the current behaviour of losing log entries during graceful
> > restart if using piped loggers is definitely Surprising Behaviour (tm).
> 
> That's probably MUCH more practical than waiting until the current
> generation is finished to send SIGTERM, but the latter could be put to
> use in other circumstances.  (But no round tuits in sight for that.)

Actually the hard part of changing to "piped loggers are allowed to
exist until the pipe closes" is just doing the pipe-read-side-closure
cleanup which you've just committed.  (and works great, btw, just did
some testing)

So I think that now we could just remove the apr_proc_kill() call from
piped_log_maintenance() and it would all Just Work, right?  I'll try
this out later...

joe

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Jeff Trawick <tr...@gmail.com>.
On 5/15/05, Joe Orton <jo...@redhat.com> wrote:
> On Sun, May 15, 2005 at 09:22:57AM -0400, Jeff Trawick wrote:
> ...
> > >Joe Orton says, "To fix this properly, I suppose piped loggers should not get
> > >SIGTERMedduring a graceful restart, they should read till EOF then
> > exit(0): then
> > >when the last child attached to the piped logger for a particular generation
> > >quits, the pipe is closed and the piped logger terminates gracefully too,
> > >without losing log messages."
> >
> > In that scenario, the patch I've attached here would not normally be
> > needed, but would be a backup if that doesn't work.  It is not clear
> > to me that we can change that aspect of piped loggers anyway, since it
> > can break third-party code.
> 
> I have a patch in-progress somewhere which does attempt the alternative
> approach, but I stalled on this exactly because of the concern for
> breaking third-party loggers.  (I think I checked cronolog and found it
> would just spin forever calling read() if stdin was closed)
> 
> I think it would be OK to make the behaviour change for 2.2 at least,
> since the current behaviour of losing log entries during graceful
> restart if using piped loggers is definitely Surprising Behaviour (tm).

That's probably MUCH more practical than waiting until the current
generation is finished to send SIGTERM, but the latter could be put to
use in other circumstances.  (But no round tuits in sight for that.)

> > b) wouldn't it be nice to have a hook similar to child-init which
> > would be run in non-MPM child processes (e.g., mod_cgid's child or
> > mod_fastcgi's child or children created in various other
> > non-Apache-distributed modules)
> 
> I suppose there is a better argument to start setting FD_CLOEXEC
> everywhere...

no exec() in sight though (or registering cleanup-on-exec would have worked)

Re: [PATCH] PR 26467 child process can hang when piped logger goes away

Posted by Joe Orton <jo...@redhat.com>.
On Sun, May 15, 2005 at 09:22:57AM -0400, Jeff Trawick wrote:
...
> >Joe Orton says, "To fix this properly, I suppose piped loggers should not get
> >SIGTERMedduring a graceful restart, they should read till EOF then
> exit(0): then
> >when the last child attached to the piped logger for a particular generation
> >quits, the pipe is closed and the piped logger terminates gracefully too,
> >without losing log messages."
> 
> In that scenario, the patch I've attached here would not normally be
> needed, but would be a backup if that doesn't work.  It is not clear
> to me that we can change that aspect of piped loggers anyway, since it
> can break third-party code.

I have a patch in-progress somewhere which does attempt the alternative
approach, but I stalled on this exactly because of the concern for
breaking third-party loggers.  (I think I checked cronolog and found it
would just spin forever calling read() if stdin was closed)

I think it would be OK to make the behaviour change for 2.2 at least,
since the current behaviour of losing log entries during graceful
restart if using piped loggers is definitely Surprising Behaviour (tm).

But +1 on the alternative appropach in lieu of any round tuits to get
that done.  Definitely +1 on this approach for 2.0.x too.  Thanks Jeff!

> b) wouldn't it be nice to have a hook similar to child-init which
> would be run in non-MPM child processes (e.g., mod_cgid's child or
> mod_fastcgi's child or children created in various other
> non-Apache-distributed modules)

I suppose there is a better argument to start setting FD_CLOEXEC 
everywhere...

joe