Posted to dev@httpd.apache.org by Yann Ylavic <yl...@gmail.com> on 2021/08/25 11:23:24 UTC

Re: Late(r) stop of children processes on restart

On Tue, Jun 29, 2021 at 3:00 PM Rainer Jung <ra...@kippdata.de> wrote:
>
> On 29.06.2021 at 14:31, Stefan Eissing wrote:
> > Can't really comment on the diff, but totally agree on the goal to minimize the unresponsive time and make graceful less disruptive.
> >
> > So +1 for that.
>
> +1 on the intention as well.

Checked in trunk (r1892587 + r1892595).

>
> Not sure, whether that means people would need more headroom in the
> scoreboard (which would probably warrant a sentence in CHANGES or docs
> about that) or whether it just means the duration during which that
> headroom is used changes (which I wouldn't care about).

The restart delay between stop and start is now minimal (no reload in
between), but the headroom needed does not change AIUI.
We still have the situation where connections (worker threads) are
active in both the new and old generations of child processes, and
how long that overlap lasts depends mainly on the actual lifetime of
the connections.
So the current tunings still hold, I think.
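
To make the headroom notion concrete, here is a rough sketch (Python,
only for the arithmetic). It assumes the usual threaded-MPM directives
ServerLimit, MaxRequestWorkers and ThreadsPerChild; the function name
and the numbers are made up for illustration, and this is not how httpd
computes anything internally:

```python
import math

def spare_child_slots(server_limit, max_request_workers, threads_per_child):
    """Child slots left over once one generation runs at full size.

    Hypothetical helper: those spare slots are what old-generation
    children can keep occupying while they finish their connections.
    """
    full_generation = math.ceil(max_request_workers / threads_per_child)
    return server_limit - full_generation

# Made-up tunings, not recommendations.
print(spare_child_slots(16, 400, 25))  # 0
print(spare_child_slots(24, 400, 25))  # 8
```

With no spare slots, a new child can only start once an old one has
exited; that was already the case before this change.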

What changes now is that for both graceful and ungraceful restarts the
main process fully consumes one CPU (to reload) while the children are
actively running (the old generation keeps accepting/processing
connections during the reload), whereas before the children were
tearing down, thus easing the load on the CPUs (but filling the socket
backlogs, potentially until exhaustion..).
So there might be a greater load spike (overall) than before on reload.

A note on the headroom while at it:
mpm_event possibly consumes fewer children (hence scoreboard slots)
on restart: when a child is dying, it stops (and thus no longer
accounts for) the worker threads in excess of its remaining
connections, so the children of the new generation get created
accurately, to scale. mpm_worker never stops threads (this improvement
never made it there AFAICT), so inactive threads are accounted as
active and it ends up creating more children of the new generation
as connections arrive. It thus either reaches the limits earlier, or
blocks waiting for worker threads in new generation children that are
overflowed by incoming connections, which the main process thinks are
evenly distributed across all the children, including the old
generation's.
I don't know how hard/worthwhile it would be to align mpm_worker with
mpm_event on this, just a note..
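
The accounting difference can be illustrated with a toy model (Python,
numbers made up). It is a deliberate simplification and not the actual
maintenance logic of either MPM: the function names are hypothetical,
and it assumes that a dying child with no connection left has already
exited.

```python
import math

def accounted_threads(remaining_conns, threads_per_child, stops_idle_threads):
    """Threads that dying old-generation children still account for.

    remaining_conns: connections left in each dying child.
    Children with no connection left are assumed to have exited.
    """
    total = 0
    for conns in remaining_conns:
        if conns == 0:
            continue  # child already gone, nothing to account for
        # event-like: idle threads were stopped, only busy ones count;
        # worker-like: every thread of the child still counts.
        total += conns if stops_idle_threads else threads_per_child
    return total

def children_implied(accounted, incoming, threads_per_child):
    """Child-sized units needed to cover accounted threads plus new connections."""
    return math.ceil((accounted + incoming) / threads_per_child)

old = [3, 1, 0, 5]                               # per dying child
event_like = accounted_threads(old, 25, True)    # 9
worker_like = accounted_threads(old, 25, False)  # 75
print(children_implied(event_like, 40, 25))      # 2
print(children_implied(worker_like, 40, 25))     # 5
```

Same connections in both cases, but the worker-like accounting sees
several times more busy threads, which is why it would hit the limits
sooner.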


Cheers;
Yann.

Re: Late(r) stop of children processes on restart

Posted by Rainer Jung <ra...@kippdata.de>.
Thanks for the headroom explanation Yann, good reading!

Rainer
