Posted to dev@httpd.apache.org by Stefan Eissing <st...@eissing.org> on 2021/12/02 09:40:48 UTC

2.4.51 scoreboard

Friends of the scoreboard,

trying to improve the mod_http2 information available on our scoreboard, I am not sure I can interpret correctly what I see. Maybe someone can help me?

I run 2.4.51 and can see what the workers are busy with. That's fine. But I also see slots where the connection was closed a while ago and the request is still listed.

The overall stats are

                       Connections       Threads     Async connections
Slot  PID    Stopping  total  accepting  busy  idle  writing  keep-alive  closing
0     69336  no        1      yes        1     24    0        0           0
1     69338  no        0      yes        0     25    0        0           0
Sum   2      0         1                 1     49    0        0           0

which means only the connection that reads server-status is active. But I also see listings as in:

Srv  PID   Acc    M CPU  SS  Req Dur Conn Child Slot Client        Protocol VHost         Request
...
1-1  69338 0/0/10 _ 0.00 948 0   5   0.0  0.00  0.04 195.133.18.60 http/1.1 domain.com:80 GET /up.php HTTP/1.1

The "SS" of 948 keeps on climbing, but the connection is long gone. How is one supposed to read that? Is that a side effect of an MPM event change that switches slots on re-activation?
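The reading of the summary table above (one connection, one busy thread, nothing async) can be checked with a few lines of Python. This is only a sketch: the rows are copied by hand from the table, and the field names are labels chosen here for illustration, not identifiers from mod_status.

```python
# Per-process rows copied by hand from the summary table in this message.
# The dictionary keys are illustrative labels, not mod_status identifiers.
ROWS = [
    {"slot": 0, "pid": 69336, "connections": 1, "busy": 1, "idle": 24,
     "writing": 0, "keepalive": 0, "closing": 0},
    {"slot": 1, "pid": 69338, "connections": 0, "busy": 0, "idle": 25,
     "writing": 0, "keepalive": 0, "closing": 0},
]

NUMERIC = ["connections", "busy", "idle", "writing", "keepalive", "closing"]


def column_sums(rows):
    """Add up each numeric column across all child processes."""
    return {name: sum(row[name] for row in rows) for name in NUMERIC}


sums = column_sums(ROWS)
print(sums)
```

The sums agree with the "Sum" row: a single connection and a single busy thread, which is the server-status request itself, so any other request shown in the per-slot listing cannot belong to a live connection.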

Any help appreciated.

Cheers,
Stefan


Re: 2.4.51 scoreboard

Posted by Stefan Eissing <st...@eissing.org>.

> On 03.12.2021 at 13:18, Niklas Edmundsson <ni...@acc.umu.se> wrote:
> 
> On Thu, 2 Dec 2021, Stefan Eissing wrote:
> 
>> Thinking out loud here. Reading the source of mod_status and event some more:
>> 
>> - mpm_event: process_socket runs on whichever worker happens to be available, which means
>> that connections walk all over the server slots during processing.
>> - Columns in the Extended status really tell what the slot was doing and are not
>> about a connection as such. The same connection will appear in different snapshots
>> on many slots during its lifetime.
>> 
>> This works nicely for connections that rarely enter KEEPALIVE, i.e. where process_connection() rarely returns to the MPM.
>> 
>> The new mod_http2 implementation now *often* returns to the MPM, which means that connections "walk" across scoreboard slots in "server-status" and are a bit hard to follow.
> 
> FWIW, we see a lot of this on servers mainly transferring large files as well, async transfers also walk across the scoreboard so it generally shows the current file(s) transferred all over the place. The slower the clients the more all-over-the-place it gets.
> 
>> hmmm...what to do?
> 
> What I'd like is some way to tell which *transfers* are currently in progress, which is essentially what /server-status provides with good old prefork (and I think worker) mpm. That function has kinda gotten lost with the event mpm as /server-status has kept the "what are the workers/processes doing" scope which is now decoupled from the user/admin interest in the transfer scope of things.

Agreed. If the processing of a connection stays in one slot, as with prefork and worker, one can follow server-status to see what a connection does. That matters especially, I guess, if something strange is going on and one wants to track down whether it is related to a particular vhost or proxy setup.

With event, the current design makes it very hard to track that.

> I'd suggest going drastic and just rip out the current (IMHO broken) "what are workers doing" scope (perhaps add a separate config knob to enable it for those who want it) and think up some way to get back the transfer status when in ExtendedStatus On mode.
> 
> The overview table with slot/pid/stopping/connection/thread summary is good, but the per-worker breakdown is more of a debugging tool than useful, IMHO.

We like our debugging tools, of course, but for an admin there should be better tools to track down and analyze a problem. I'd imagine that getting extended info for just a specific PID would also be useful on a large installation.

Thanks for your feedback. I am currently trying to improve the HTTP/2 implementation as best as possible within the current scoreboard. But this is definitely an area where the server can improve.

Kind Regards,
Stefan



Re: 2.4.51 scoreboard

Posted by Niklas Edmundsson <ni...@acc.umu.se>.
On Thu, 2 Dec 2021, Stefan Eissing wrote:

> Thinking out loud here. Reading the source of mod_status and event some more:
>
> - mpm_event: process_socket runs on whichever worker happens to be available,
>  which means that connections walk all over the server slots during processing.
> - Columns in the Extended status really tell what the slot was doing and are not
>  about a connection as such. The same connection will appear in different snapshots
>  on many slots during its lifetime.
>
> This works nicely for connections that rarely enter KEEPALIVE, i.e.
> where process_connection() rarely returns to the MPM.
>
> The new mod_http2 implementation now *often* returns to the MPM,
> which means that connections "walk" across scoreboard slots in
> "server-status" and are a bit hard to follow.

FWIW, we see a lot of this on servers mainly transferring large files 
as well, async transfers also walk across the scoreboard so it 
generally shows the current file(s) transferred all over the place. 
The slower the clients the more all-over-the-place it gets.

> hmmm...what to do?

What I'd like is some way to tell which *transfers* are currently in 
progress, which is essentially what /server-status provides with good 
old prefork (and I think worker) mpm. That function has kinda gotten 
lost with the event mpm as /server-status has kept the "what are the 
workers/processes doing" scope which is now decoupled from the 
user/admin interest in the transfer scope of things.

I'd suggest going drastic and just rip out the current (IMHO broken) 
"what are workers doing" scope (perhaps add a separate config knob to 
enable it for those who want it) and think up some way to get back the 
transfer status when in ExtendedStatus On mode.

The overview table with slot/pid/stopping/connection/thread summary is 
good, but the per-worker breakdown is more of a debugging tool than 
useful, IMHO.




/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  There is no man so blind as he who will not see.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Re: 2.4.51 scoreboard

Posted by Stefan Eissing <st...@eissing.org>.
Thinking out loud here. Reading the source of mod_status and event some more:

- mpm_event: process_socket runs on whichever worker happens to be available, which means
  that connections walk all over the server slots during processing.
- Columns in the Extended status really tell what the slot was doing and are not
  about a connection as such. The same connection will appear in different snapshots
  on many slots during its lifetime.

This works nicely for connections that rarely enter KEEPALIVE, i.e. where process_connection() rarely returns to the MPM.

The new mod_http2 implementation now *often* returns to the MPM, which means that connections "walk" across scoreboard slots in "server-status" and are a bit hard to follow.

hmmm...what to do?
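
The "walking" described above can be illustrated with a toy model in Python. This is a sketch under loud assumptions, not a description of how mpm_event picks workers: the slot choice is simply random here, and the names `simulate`, `conn-1` and `burst` are made up for the example. The only point is that every time a connection returns to the MPM and is picked up again, a different slot may end up holding its details.

```python
import random


def simulate(num_slots, rounds, seed=0):
    """Toy model: one connection is processed in `rounds` separate bursts.

    Between bursts the connection goes back to the MPM. Each burst is
    handled by some worker slot (chosen at random here, which is an
    assumption), and that slot keeps showing the connection's details
    until another request overwrites them.
    """
    rng = random.Random(seed)
    slots = [None] * num_slots            # what each slot last worked on
    for burst in range(rounds):
        slot = rng.randrange(num_slots)   # stand-in for "any available worker"
        slots[slot] = ("conn-1", burst)   # slot now displays this connection
    return slots


slots = simulate(num_slots=5, rounds=8)
touched = [i for i, entry in enumerate(slots) if entry is not None]
print(f"conn-1 left traces in slots {touched}")
```

With more bursts per connection, as with the new mod_http2 or slow async transfers, a single connection leaves entries in more slots, and those entries stay visible after the connection is gone.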





Re: 2.4.51 scoreboard

Posted by Stefan Eissing <st...@eissing.org>.

> On 03.12.2021 at 13:27, Eric Covener <co...@gmail.com> wrote:
> 
> Sorry, I read this on mobile and lost track. An "M[ode]" of "_" is idle
> and "SS" in this case tells you how long the slot has been idle. The
> request details are just the details from "SS" ago that used this slot
> and completed.

Thanks. I have now had a closer look at the code and at how the stats are collected. As Niklas said, the scoreboard has lost usefulness under mpm_event, with lots of distracting "ghost" entries displayed when slots are switched. At least I now know that it is not a fluke in my recent H2 implementation.

Cheers,
Stefan

Re: 2.4.51 scoreboard

Posted by Eric Covener <co...@gmail.com>.
On Thu, Dec 2, 2021 at 4:41 AM Stefan Eissing <st...@eissing.org> wrote:
>
> Friends of the scoreboard,
>
> trying to improve the mod_http2 information available on our scoreboard, I am not sure I can interpret correctly what I see. Maybe someone can help me?
>
> I run 2.4.51 and can see what the workers are busy with. That's fine. But I also see slots where the connection was closed a while ago and the request is still listed.
>
> The overall stats are
>
>                        Connections       Threads     Async connections
> Slot  PID    Stopping  total  accepting  busy  idle  writing  keep-alive  closing
> 0     69336  no        1      yes        1     24    0        0           0
> 1     69338  no        0      yes        0     25    0        0           0
> Sum   2      0         1                 1     49    0        0           0
>
> which means only the connection that reads server-status is active. But I also see listings as in:
>
> Srv  PID   Acc    M CPU  SS  Req Dur Conn Child Slot Client        Protocol VHost         Request
> ...
> 1-1  69338 0/0/10 _ 0.00 948 0   5   0.0  0.00  0.04 195.133.18.60 http/1.1 domain.com:80 GET /up.php HTTP/1.1
>
> The "SS" of 948 keeps on climbing, but the connection is long gone. How is one supposed to read that? Is that a side effect of an MPM event change that switches slots on re-activation?

Sorry, I read this on mobile and lost track. An "M[ode]" of "_" is idle
and "SS" in this case tells you how long the slot has been idle. The
request details are just the details from "SS" ago that used this slot
and completed.
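
The reading described above can be sketched in Python against the row quoted in this thread. The column list is copied from the quoted header; `parse_row` and `describe` are hypothetical helpers written for this example, and the parser assumes every column is present and that only the request line contains spaces, which may not hold for every row mod_status prints.

```python
# Column order copied from the header row quoted in this thread.
COLUMNS = ["Srv", "PID", "Acc", "M", "CPU", "SS", "Req", "Dur",
           "Conn", "Child", "Slot", "Client", "Protocol", "VHost", "Request"]


def parse_row(line):
    """Split one extended-status row into a dict keyed by column name."""
    # The first 14 fields contain no spaces; the request line is the rest.
    fields = line.split(None, len(COLUMNS) - 1)
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def describe(row):
    """Return a one-line reading of a parsed row."""
    if row["M"] == "_":
        # Idle slot: the request shown is left over from the last use.
        return (f"slot {row['Srv']} idle for {row['SS']}s; last request was "
                f"{row['Request']!r} from {row['Client']}")
    return (f"slot {row['Srv']} in mode {row['M']!r}; "
            f"{row['SS']}s since its most recent request began: "
            f"{row['Request']!r}")


row = parse_row("1-1  69338 0/0/10 _ 0.00 948 0   5   0.0  0.00  0.04 "
                "195.133.18.60 http/1.1 domain.com:80 GET /up.php HTTP/1.1")
print(describe(row))
```

For the quoted row this reports an idle slot whose request details are 948 seconds old, rather than a connection that is still open.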