Posted to bugs@httpd.apache.org by bu...@apache.org on 2017/04/05 13:20:56 UTC

[Bug 60956] New: Event MPM listener thread may get blocked by SSL shutdowns

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

            Bug ID: 60956
           Summary: Event MPM listener thread may get blocked by SSL
                    shutdowns
           Product: Apache httpd-2
           Version: 2.4.23
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: major
          Priority: P2
         Component: mpm_event
          Assignee: bugs@httpd.apache.org
          Reporter: frank.meier@ergon.ch
  Target Milestone: ---

I have analyzed an Apache httpd 2.4.23 server that did not handle new
connections anymore. I found this stack trace:

#0  0x00007f996d44f283 in poll () from /lib64/libc.so.6
#1  0x00007f996df8764f in apr_poll () from /opt/apache/bin/libapr-1.so.0
#2  0x00007f996eacb485 in ap_core_output_filter ()
#3  0x00007f996cf46488 in bio_filter_out_pass () from /opt/apache/bin/mod_ssl.so
#4  0x00007f996cf483bf in bio_filter_out_ctrl () from /opt/apache/bin/mod_ssl.so
#5  0x00007f996cf5803b in modssl_smart_shutdown () from /opt/apache/bin/mod_ssl.so
#6  0x00007f996cf4856e in ssl_filter_io_shutdown.isra.2 () from /opt/apache/bin/mod_ssl.so
#7  0x00007f996cf49c10 in ssl_io_filter_output () from /opt/apache/bin/mod_ssl.so
#8  0x00007f996cf46b4e in ssl_io_filter_coalesce () from /opt/apache/bin/mod_ssl.so
#9  0x00007f996ead9f93 in ap_shutdown_conn ()
#10 0x00007f996a505702 in start_lingering_close_nonblocking () from /opt/apache/bin/mod_mpm_event.so
#11 0x00007f996a5040ac in process_timeout_queue () from /opt/apache/bin/mod_mpm_event.so
#12 0x00007f996a5063b0 in listener_thread () from /opt/apache/bin/mod_mpm_event.so
#13 0x00007f996d90faa1 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f996d458aad in clone () from /lib64/libc.so.6


The function start_lingering_close_nonblocking() is blocked by a call to
poll(), which must not happen. Because the listener thread is blocked, this
process does not accept new connections anymore.

The line numbers are missing in the stack, but I think this happens:
- ap_shutdown_conn() creates an "End Of Connection" bucket (EOC)
- mod_ssl detects this in ssl_io_filter_output() and calls
modssl_smart_shutdown()
- modssl_smart_shutdown() sends an SSL "close notify" shutdown alert to the
peer and then flushes the data - this may block

The clean SSL shutdown has been implemented in Apache httpd 2.4.12 (see bug
54998). Previous versions of Apache httpd 2.4 are not affected.
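
The blocking step described above can be illustrated outside of httpd. The
following is only a minimal sketch, assuming a Linux socket whose peer has
stopped reading; set_nonblocking(), fill_send_buffer() and try_send_alert()
are hypothetical helpers, not mod_ssl or APR functions, and the single byte
sent is a placeholder rather than a real TLS close-notify record. It shows
that once the send buffer is full, even a tiny write cannot complete, so a
blocking flush at this point would wait in poll().

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Queue filler bytes until the kernel refuses more, standing in for a
 * response the peer does not read. Expects a non-blocking fd.
 * Returns the number of bytes queued, or -1 on an unexpected error. */
static long fill_send_buffer(int fd)
{
    char chunk[4096];
    long total = 0;

    memset(chunk, 'x', sizeof(chunk));
    for (;;) {
        ssize_t n = send(fd, chunk, sizeof(chunk), 0);
        if (n > 0)
            total += n;
        else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return total;   /* buffer full: a blocking socket would hang here */
        else if (n == -1 && errno == EINTR)
            continue;
        else
            return -1;
    }
}

/* Hypothetical stand-in for flushing a shutdown alert.
 * Returns 1 if the byte was queued, 0 if the write would block,
 * -1 on any other error. */
static int try_send_alert(int fd)
{
    const char alert = 0;   /* placeholder byte, not a real TLS record */
    ssize_t n = send(fd, &alert, 1, 0);

    if (n == 1)
        return 1;
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}
```

With a socketpair() whose receiving end is never read, fill_send_buffer()
followed by try_send_alert() reports "would block" on the sending end; the
same write on a blocking socket would not return until the peer drains data
or a timeout fires.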

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Stefan Priebe <st...@priebe.ws> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefan@priebe.ws

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #23 from Yann Ylavic <yl...@gmail.com> ---
Thanks Frank for testing; backport to 2.4.x proposed.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35158|0                           |1
        is obsolete|                            |

--- Comment #15 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35159
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35159&action=edit
Defer nonblocking lingering close to workers (v4)

Fixes is_idle in worker threads woken up for a deferred close.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #25 from Yann Ylavic <yl...@gmail.com> ---
Backported to 2.4.28 (r1809299).

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #10 from Luca Toscano <to...@gmail.com> ---
Hi Frank,

for reference, here are some interesting commits to the modssl smart shutdown
function:

http://svn.apache.org/viewvc?view=revision&revision=1651077
https://bz.apache.org/bugzilla/show_bug.cgi?id=54998

So I'd say that it is really important for us to keep flushing the close-notify
to the client.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #12 from Luca Toscano <to...@gmail.com> ---
FYI, this bug generated a discussion on dev@:

https://lists.apache.org/thread.html/0daa4c40a4396cbb411f7657b476e2524add73cbc8ed99e67264578c@%3Cdev.httpd.apache.org%3E

It seems that there is an agreement on the safest way to proceed, namely
forcing start_lingering_close_nonblocking to be executed on a worker thread.

Note for the readers: in the email thread it was pointed out that
ap_prep_lingering_close (called by start_lingering_close_nonblocking before
ap_shutdown_conn) could block as well if a module X hooks to
pre_close_connection and blocks.
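
The agreed direction (the listener only records the connection, a worker
performs the close) can be sketched as a simple hand-off queue. This is an
illustration under stated assumptions, not the event MPM's actual data
structure: struct defer_queue, defer_close() and take_deferred() are
hypothetical names, connections are reduced to integer ids, and the locking
and wake-up that a real multi-threaded MPM needs are deliberately left out.

```c
#include <stddef.h>

#define DEFER_CAPACITY 64

/* Hypothetical fixed-size FIFO of connection ids awaiting a close. */
struct defer_queue {
    int items[DEFER_CAPACITY];
    size_t head;    /* index of the oldest entry */
    size_t count;   /* number of queued entries */
};

/* Listener side: record the connection and return at once.
 * Returns 0 on success, -1 if the queue is full.
 * It never touches the socket, so it cannot block on the peer. */
static int defer_close(struct defer_queue *q, int conn_id)
{
    if (q->count == DEFER_CAPACITY)
        return -1;
    q->items[(q->head + q->count) % DEFER_CAPACITY] = conn_id;
    q->count++;
    return 0;
}

/* Worker side: take the oldest deferred connection, or -1 if none.
 * The caller then runs the (possibly blocking) shutdown itself. */
static int take_deferred(struct defer_queue *q)
{
    int conn_id;

    if (q->count == 0)
        return -1;
    conn_id = q->items[q->head];
    q->head = (q->head + 1) % DEFER_CAPACITY;
    q->count--;
    return conn_id;
}
```

The point of the split is that any blocking hook or flush runs in whichever
thread calls take_deferred(), while the thread calling defer_close() only
does constant-time bookkeeping.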

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #19 from Luca Toscano <to...@gmail.com> ---
Hi Frank, have you had the chance to test Yann's patch to see if it fixes the
blocking issue that you reported in your testing environment?

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #1 from Frank Meier <fr...@ergon.ch> ---
I was finally able to reproduce the phenomenon. It occurs if a request
handler triggers the asynchronous write completion feature, which gives the
listener thread the opportunity to send the final bytes of a response to the
client asynchronously, without blocking a worker thread. But if the client
refuses to read the data, the connection gets stalled (TCP Window FULL message
in wireshark). This does not block the listener thread, since it does the
writing asynchronously, and a stalled connection is not a problem. But then,
after the timeout ([1] default 60s), the listener thread wants to close the
connection and triggers start_lingering_close_nonblocking(), and the listener
thread gets blocked as described above. After another timeout interval [1] the
listener thread recovers from its misery.

The tricky part of reproducing this is getting the right amount of data stuck
in the TCP pipeline (receive buffer of the client, and the send buffer of the
server).
1) If the client blocks too early, the pipeline fills up, but if the module has
more than 64k of data to send, the asynchronous write completion feature is not
triggered.
2) If the client blocks too late, there is enough "space" in the TCP pipeline to
accommodate all the remaining bytes including the SSL shutdown alert, in which
case the start_lingering_close_nonblocking() function does not block.

I've written some test code to simplify reproduction:
* an httpd module (mod_gendata) which generates a given amount of body data,
where the last 60k are not flushed; this should trigger the asynchronous write
completion in the listener thread.
* a special HTTPS client that reads a given amount of data from the server and
then stops reading completely.
* an httpd.conf file that starts only one single httpd process with 2 worker
threads, which makes it easy to see what's happening by looking at the stack of
the process.

On my test system the TCP pipeline was full at around ~800k. So I request 850k
of data, read ~1000 bytes so the headers are received, and then stop receiving.
I can tell that the write completion was triggered if all the worker threads
are idle (check with gstack). And I can tell that the TCP pipeline is full if
the TCP connection does *not* enter the FIN_WAIT1 state after the configured
KeepAliveTimeout [2] (check with netstat). If both conditions are met, the
listener thread calls start_lingering_close_nonblocking() after 60s, and
blocks. It may take some tries to figure out the right amount of data that has
to be requested to get it right.
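
The client behaviour described above (consume a fixed number of bytes, then
never read again) can be sketched as follows. This is not the attached test
client: read_then_stall() is a hypothetical helper that works on an already
connected plain socket and leaves out the TLS layer entirely, so it only
illustrates the "read N bytes, then stall" part.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Read up to `limit` bytes from fd, then stop reading for good.
 * The caller keeps the descriptor open so the peer's send path backs up.
 * Returns the number of bytes consumed, or -1 on a receive error. */
static long read_then_stall(int fd, size_t limit)
{
    char buf[1024];
    size_t consumed = 0;

    while (consumed < limit) {
        size_t want = limit - consumed;
        ssize_t n;

        if (want > sizeof(buf))
            want = sizeof(buf);

        n = recv(fd, buf, want, 0);
        if (n > 0)
            consumed += (size_t)n;
        else if (n == 0)
            break;              /* peer closed before the limit was reached */
        else if (errno != EINTR)
            return -1;
    }
    return (long)consumed;
}
```

After the function returns, the caller would simply keep the socket open and
idle (for example by sleeping); everything the server sends from then on
accumulates in the client's receive buffer and the server's send buffer.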


[1] https://httpd.apache.org/docs/2.4/mod/core.html#timeout
[2] https://httpd.apache.org/docs/2.4/mod/core.html#keepalivetimeout

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #8 from Luca Toscano <to...@gmail.com> ---
Hi Frank,

everybody is a bit busy with the upcoming 2.4.26 release but this bug will be
addressed, I don't have the necessary skills but I'll find somebody soon enough
:)

I tried to check event's code and everything seems to originate from the
periodic call to the following snippet of the listener (as was stated
previously):

            /* Step 2: write completion timeouts */
            process_timeout_queue(write_completion_q, timeout_time,
                                  start_lingering_close_nonblocking);

This is now done periodically by the listener, and when Timeout expires,
start_lingering_close_nonblocking ends up blocking as you described.
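
To make the failure mode concrete, here is a minimal sketch of a timeout
scan with a per-entry callback. It is not the event MPM's
process_timeout_queue(); struct conn, process_expired() and count_close()
are hypothetical names, and the queue is assumed to be a singly linked list
ordered by expiry time. What it illustrates is that the callback runs on the
scanning thread, so a callback that blocks stalls the whole scan.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical connection record; field names are illustrative only. */
struct conn {
    int id;
    time_t expires_at;   /* absolute time at which the entry times out */
    struct conn *next;   /* queue is kept ordered by expires_at */
};

typedef void (*expire_fn)(struct conn *c);

/* Example callback: only counts how many connections were handed over. */
static int closed_count;

static void count_close(struct conn *c)
{
    (void)c;
    closed_count++;
}

/* Pop every entry whose deadline has passed and hand it to on_expire.
 * Returns the number of expired entries. The callback runs on the
 * caller's thread, so one blocking callback delays all later entries. */
static size_t process_expired(struct conn **head, time_t now,
                              expire_fn on_expire)
{
    size_t count = 0;

    while (*head != NULL && (*head)->expires_at <= now) {
        struct conn *c = *head;

        *head = c->next;
        c->next = NULL;
        on_expire(c);
        count++;
    }
    return count;
}
```

If on_expire performed a socket write that waits for the peer, every other
expired entry, and in the listener's case every new connection, would wait
behind it.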

Since in this case mod_ssl will have to send a close-notify to gracefully close
the TLS connection, I'd think that the listener should not have the chance to
even attempt to do any work (risking to block), but just offload it to a spare
worker.  

An alternative would be to figure out whether modssl_smart_shutdown could avoid
blocking, or block only for a very brief amount of time, so that the listener
is not held up.

We'll see what others think about it! I'll dig a bit more into this issue and
report my findings.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

nada <ap...@valgronda.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |apache_bugzilla@valgronda.c
                   |                            |om

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #13 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35156
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35156&action=edit
Defer nonblocking lingering close to workers

Patch resulting from the above discussion.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |FixedInTrunk

--- Comment #17 from Yann Ylavic <yl...@gmail.com> ---
Committed to trunk in r1802875 (v6).

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #3 from Frank Meier <fr...@ergon.ch> ---
Created attachment 34987
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34987&action=edit
special https client

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #9 from Frank Meier <fr...@ergon.ch> ---
Hi Luca,

Thanks again for pushing this forward. My understanding of the code is
unfortunately also not good enough to propose a patch.

I completely agree. Either the process of closing should be offloaded to a
worker thread, or the function "start_lingering_close_nonblocking" should
really guarantee that it is 'nonblocking', like its name suggests. I slightly
favor fixing the start_lingering_close_nonblocking function, but I think this
might be more difficult to do. The question is what should happen if the TLS
"close notify" could not be sent. Maybe we could just close the connection
uncleanly in this case. It would not be so nice, but on the other hand, at
some point you want to get rid of the stalled connection, and a 'hard' close
might be the only way to achieve that.
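
The "try briefly, otherwise close hard" idea can be sketched like this. It is
only an illustration of the policy, not a proposed patch:
close_with_bounded_wait() is a hypothetical helper, the message bytes stand
in for a TLS close-notify alert, and MSG_DONTWAIT is assumed to be available
(it is on Linux).

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to send `len` bytes of a farewell message, waiting at most
 * `timeout_ms` for the socket to become writable. Closes fd either way.
 * Returns 1 if the whole message was queued before the close, 0 if the
 * connection was closed without it (the "hard" close), and -1 on an
 * unexpected error. */
static int close_with_bounded_wait(int fd, const void *msg, size_t len,
                                   int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
    int result = 0;
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready > 0 && (pfd.revents & POLLOUT)) {
        ssize_t n = send(fd, msg, len, MSG_DONTWAIT);

        if (n == (ssize_t)len)
            result = 1;
        else if (n == -1 && errno != EAGAIN && errno != EWOULDBLOCK)
            result = -1;
    } else if (ready < 0 && errno != EINTR) {
        result = -1;
    }
    close(fd);
    return result;
}
```

The upper bound on the wait is the only thing that changes compared with a
blocking flush: a stalled peer costs at most timeout_ms, at the price of an
unclean shutdown from the peer's point of view.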

Cheers

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #2 from Frank Meier <fr...@ergon.ch> ---
Created attachment 34986
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34986&action=edit
module generating data without flushing to trigger write completion feature

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #24 from vsamel <vi...@samel.cz> ---
(In reply to Yann Ylavic from comment #21)
> Created attachment 35332 [details]
> Backport of r1802875 to 2.4.x

My random freezes are resolved with this patch too.

Thanks.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #21 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35332
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35332&action=edit
Backport of r1802875 to 2.4.x

Harmless in CHANGES resolved.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Michael Kaufmann <ap...@michael-kaufmann.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |apache-bugzilla@michael-kau
                   |                            |fmann.ch

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #20 from Frank Meier <fr...@ergon.ch> ---
Hi Luca,

sorry, I was not able to apply Yann's v6 patch (r1802875) to the 2.4.x branch.
There were some merge conflicts I could not resolve without understanding the
code better. Is there a version of the patch that is applicable to the 2.4
branch?

cheers, Frank

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #22 from Frank Meier <fr...@ergon.ch> ---
Hi Yann,

I tested your patch against the current 2.4.x head and I was *not* able to
reproduce the issue. Whereas it is still reproducible without the patch of
course.

Great work!

cheers, Frank

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35156|0                           |1
        is obsolete|                            |

--- Comment #14 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35158
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35158&action=edit
Defer nonblocking lingering close to workers (v3)

An update to:
- close pending lingering sockets (deferred) on ungraceful restart,
- call the usual process_socket() in the worker for deferred lingering closes;
  the previous patch missed updating the scoreboard (SERVER_CLOSING)
  by calling start_lingering_close_blocking() directly (note that this change
  required putting the clogging_input_filters case at the right place to
  avoid reentering process_connection for any state),
- set the socket timeout to SECONDS_TO_LINGER (2s) for deferred lingering
  closes, since they most likely result from a timeout already.

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #4 from Frank Meier <fr...@ergon.ch> ---
Created attachment 34988
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=34988&action=edit
simple httpd config

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

vsamel <vi...@samel.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vitezslav@samel.cz

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #7 from Frank Meier <fr...@ergon.ch> ---
Hi Luca,

Thank you for looking into this issue. Unfortunately the behavior, as far as I
can tell, was exactly the same with your patch, as before. Sorry.

cheers, Frank

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 30.06.2017 um 13:33 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>> 
>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>> 
>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>> ssl_filter_io_shutdown().
>> 
>> I think the issue starts before that.
>> ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
>> to this hook can perform all sort of long lasting blocking operations there.
>> While it can be argued that this would be a bug in the module I think the only safe way
>> is to have the whole start_lingering_close_nonblocking being executed by a worker thread.
> 
> Correct, that'd be much simpler/safer indeed.
> We need a new SHUTDOWN state then, right?

+1

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Luca Toscano <to...@gmail.com>.

2017-07-17 9:33 GMT+02:00 Stefan Eissing <st...@greenbytes.de>:

>
> > Am 14.07.2017 um 21:52 schrieb Yann Ylavic <yl...@gmail.com>:
> >
> > On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <yl...@gmail.com>
> wrote:
> >> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org>
> wrote:
> >>>
> >>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
> >>>>
> >>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
> >>>> modssl_smart_shutdown(), only in the "abortive" mode of
> >>>> ssl_filter_io_shutdown().
> >>>
> >>> I think the issue starts before that.
> >>> ap_prep_lingering_close calls the pre_close_connection hook and
> modules that are registered
> >>> to this hook can perform all sort of long lasting blocking operations
> there.
> >>> While it can be argued that this would be a bug in the module I think
> the only safe way
> >>> is to have the whole start_lingering_close_nonblocking being executed
> by a worker thread.
> >>
> >> Correct, that'd be much simpler/safer indeed.
> >> We need a new SHUTDOWN state then, right?
> >
> > Actually it was less simple than expected, and it has some caveats
> obviously...
> >
> > The attached patch does not introduce a new state but reuses the
> > existing CONN_STATE_LINGER since it was not really considered by the
> > listener thread (which uses CONN_STATE_LINGER_NORMAL and
> > CONN_STATE_LINGER_SHORT instead), but that's a detail.
> >
> > Mainly, start_lingering_close_nonblocking() now simply schedules a
> > shutdown (i.e. pre_close_connection() followed by immediate close)
> > that will be run by a worker thread.
> > A new shutdown_linger_q is created/handled (with the same timeout as
> > the short_linger_q, namely 2 seconds) to hold connections to be
> > shutdown.
> >
> > So now when a connection times out in the write_completion or
> > keepalive queues, it needs (i.e. the listener may wait for) an
> > available worker to process its shutdown/close.
> > This means we can *not* close kept alive connections immediately like
> > before when becoming short of workers, which will favor active KA
> > connections over new ones in this case (I don't think it's that
> > serious but the previous was taking care of that. For me it's up to
> > the admin to size the workers appropriately...).
> >
> > Same when a connection in the shutdown_linger_q itself times out, the
> > patch will require a worker immediately to do the job (see
> > shutdown_lingering_close() callback).
> >
> > So overall, this patch may introduce the need for more workers than
> > before, what was (wrongly) done by the listener thread has to be done
> > somewhere anyway...
> >
> > Finally, I think there is room for improvements like batching
> > shutdowns in the same worker if there is no objection on the approach
> > so far.
> >
> > WDYT?
>
> I will test the patch, most likely today. A lot of +1s for the initiative!
>
>
Thanks a lot for this work Yann, really nice! I will try to test it as well
over the next few days. I was not convinced in the beginning that triggering
a (potential) increase in worker usage was super great for our users, but
it is definitely the most correct and safe thing to do. Even if we find a
way to fix the ssl-lingering-close-block issue we might encounter a similar
one in the future, so it is better imho to fix it at the source.

Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

Threw it into my test suite and it works nicely.

> Am 17.07.2017 um 14:02 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Mon, Jul 17, 2017 at 9:33 AM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> 
>> I will test the patch, most likely today. A lot of +1s for the initiative!
> 
> Thanks Stefan, as I said the proposed patch currently reuses the
> existing CONN_STATE_LINGER state to shutdown connections, but if it
> needs to be set from outside mpm_event (eg. mod_h2 ;) we could add a
> new state...

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Mon, Jul 17, 2017 at 9:33 AM, Stefan Eissing
<st...@greenbytes.de> wrote:
>
> I will test the patch, most likely today. A lot of +1s for the initiative!

Thanks Stefan, as I said the proposed patch currently reuses the
existing CONN_STATE_LINGER state to shutdown connections, but if it
needs to be set from outside mpm_event (eg. mod_h2 ;) we could add a
new state...

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 14.07.2017 um 21:52 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>>> 
>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>> 
>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>> ssl_filter_io_shutdown().
>>> 
>>> I think the issue starts before that.
>>> ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
>>> to this hook can perform all sort of long lasting blocking operations there.
>>> While it can be argued that this would be a bug in the module I think the only safe way
>>> is to have the whole start_lingering_close_nonblocking being executed by a worker thread.
>> 
>> Correct, that'd be much simpler/safer indeed.
>> We need a new SHUTDOWN state then, right?
> 
> Actually it was less simple than expected, and it has some caveats obviously...
> 
> The attached patch does not introduce a new state but reuses the
> existing CONN_STATE_LINGER since it was not really considered by the
> listener thread (which uses CONN_STATE_LINGER_NORMAL and
> CONN_STATE_LINGER_SHORT instead), but that's a detail.
> 
> Mainly, start_lingering_close_nonblocking() now simply schedules a
> shutdown (i.e. pre_close_connection() followed by immediate close)
> that will be run by a worker thread.
> A new shutdown_linger_q is created/handled (with the same timeout as
> the short_linger_q, namely 2 seconds) to hold connections to be
> shutdown.
> 
> So now when a connection times out in the write_completion or
> keepalive queues, it needs (i.e. the listener may wait for) an
> available worker to process its shutdown/close.
> This means we can *not* close kept alive connections immediately like
> before when becoming short of workers, which will favor active KA
> connections over new ones in this case (I don't think it's that
> serious but the previous was taking care of that. For me it's up to
> the admin to size the workers appropriately...).
> 
> Same when a connection in the shutdown_linger_q itself times out, the
> patch will require a worker immediately to do the job (see
> shutdown_lingering_close() callback).
> 
> So overall, this patch may introduce the need for more workers than
> before, what was (wrongly) done by the listener thread has to be done
> somewhere anyway...
> 
> Finally, I think there is room for improvements like batching
> shutdowns in the same worker if there is no objection on the approach
> so far.
> 
> WDYT?
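
For readers following the design, the hand-off described above can be sketched roughly as follows. This is only an illustration in Python, not the patch: the names (shutdown_and_close, run_demo, SLOW_HOOK_SECONDS) are made up here, a sleep stands in for a slow pre_close_connection hook, and the queue is unbounded, so unlike the real listener this one never has to wait for a free worker.

```python
import queue
import threading
import time

# Assumption for the demo: stand-in for a blocking pre_close_connection hook.
SLOW_HOOK_SECONDS = 0.05


def shutdown_and_close(conn_id, closed, lock):
    """Worker-side job: run the (possibly slow) shutdown, then close."""
    time.sleep(SLOW_HOOK_SECONDS)
    with lock:
        closed.append(conn_id)


def worker(jobs, closed, lock):
    """Take scheduled shutdowns off the queue until a None sentinel arrives."""
    while True:
        conn_id = jobs.get()
        if conn_id is None:
            break
        shutdown_and_close(conn_id, closed, lock)


def run_demo(n_conns=8, n_workers=4):
    jobs = queue.Queue()  # unbounded: a simplification, see the note above
    closed, lock = [], threading.Lock()
    workers = [threading.Thread(target=worker, args=(jobs, closed, lock))
               for _ in range(n_workers)]
    for t in workers:
        t.start()

    # "Listener": it only schedules the shutdowns, it does not run them.
    start = time.monotonic()
    for conn_id in range(n_conns):
        jobs.put(conn_id)
    listener_seconds = time.monotonic() - start

    for _ in workers:
        jobs.put(None)
    for t in workers:
        t.join()
    return listener_seconds, sorted(closed)


if __name__ == "__main__":
    elapsed, closed = run_demo()
    print(f"listener spent {elapsed:.4f}s scheduling {len(closed)} closes")
```

The point of the sketch is only that the time spent in the listener no longer depends on how slow the shutdown hook is; the cost moves to the workers, which is the trade-off discussed above.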

I will test the patch, most likely today. A lot of +1s for the initiative!

-Stefan

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello,

fullstatus says:
   Slot  PID    Stopping  Connections       Threads     Async connections
                          total  accepting  busy  idle  writing  keep-alive  closing
   0     25042  no        0      no         2     198   0        0           4294966698
   1     4347   no        0      no         0     200   0        0           4294966700
   2     26273  no        0      no         1     199   0        0           4294966698
   3     4348   no        0      no         0     200   0        0           4294966699
   4     10224  no        0      no         0     200   0        0           4294966697
   5     12157  no        0      no         0     200   0        0           4294966700
   6     23027  no        0      no         0     200   0        0           4294966698
   7     28597  no        0      no         0     200   0        0           4294966698
   8     7519   no        0      no         0     200   0        0           4294966697
   9     18609  no        0      no         2     198   0        0           4294966698
   10    3183   no        0      no         0     200   0        0           4294966698
   11    14704  no        0      no         0     200   0        0           4294966698
   12    26237  no        0      no         0     200   0        0           4294966700
   13    32070  no        0      no         0     200   0        0           4294966697
   14    12070  no        1      no         0     200   0        0           4294966699
   15    16627  no        0      no         0     200   0        0           4294966698
   16    29413  no        0      no         0     200   0        0           4294966699
   17    435    no        0      no         0     200   0        0           4294966699
   18    24808  no        0      no         0     200   0        0           4294966700
   19    1822   no        0      no         0     200   0        0           4294966699
   20    1721   no        0      no         0     200   0        0           4294966698
   21    2875   no        0      no         0     200   0        0           4294966698
   22    25879  no        0      no         0     200   0        0           4294966698
   23    28091  no        0      no         0     200   0        0           4294966696
   24    31452  no        0      no         0     200   0        0           4294966698
   25    32706  no        0      no         0     200   0        0           4294966698
   26    8858   no        14     yes        3     197   0        6           4294966783
   27    10203  no        5      yes        2     198   0        2           4294966949
   Sum   28     0         20                10    5590  0        8           -16400
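
A note on reading the closing column: the values just below 2^32 are consistent with a negative count printed as an unsigned 32-bit number. A small Python sketch (the helper name as_signed32 is made up for this note and is not part of httpd or mod_status) reinterprets the 28 per-process values above as signed and compares their total with the Sum row.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer.

    Hypothetical helper for this note; not part of httpd or mod_status.
    """
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


# "closing" column from the fullstatus output above, slots 0..27.
closing = [
    4294966698, 4294966700, 4294966698, 4294966699, 4294966697,
    4294966700, 4294966698, 4294966698, 4294966697, 4294966698,
    4294966698, 4294966698, 4294966700, 4294966697, 4294966699,
    4294966698, 4294966699, 4294966699, 4294966700, 4294966699,
    4294966698, 4294966698, 4294966698, 4294966696, 4294966698,
    4294966698, 4294966783, 4294966949,
]

signed = [as_signed32(v) for v in closing]
print(signed[0])    # -598 for slot 0
print(sum(signed))  # -16400, the same value as the Sum row
```

So each process appears to report roughly -600 closing connections, which suggests a counter that is decremented more often than it is incremented rather than four billion connections actually in the closing state.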

Greets,
Stefan

On 19.07.2017 at 17:05, Stefan Priebe - Profihost AG wrote:
> 
> On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
>> Hello Yann,
>>
>> I'm observing some deadlocks again.
>>
>> I'm using
>> httpd 2.4.27
>> + mod_h2
>> + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
>> + your ssl linger fix patch from this thread
>>
>> What kind of information do you need? If you need a full stack backtrace
>>  - from which pid? Or from all httpd pids?
> 
> Something I forgot to mention:
> 
> it seems httpd is running at max threads:
> awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
>   25050 *:http(LISTEN)
>   25050 *:https(LISTEN)
> 
> Stefan
> 
>>
>> Thanks!
>>
>> Greets,
>> Stefan
>>
>> On 14.07.2017 at 21:52, Yann Ylavic wrote:
>>> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>>>>>
>>>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>>>
>>>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>>>> ssl_filter_io_shutdown().
>>>>>
>>>>> I think the issue starts before that.
>>>>> ap_prep_lingering_close calls the pre_close_connection hook, and modules that are registered
>>>>> to this hook can perform all sorts of long-lasting blocking operations there.
>>>>> While it can be argued that this would be a bug in the module, I think the only safe way
>>>>> is to have the whole start_lingering_close_nonblocking executed by a worker thread.
>>>>
>>>> Correct, that'd be much simpler/safer indeed.
>>>> We need a new SHUTDOWN state then, right?
>>>
>>> Actually it was less simple than expected, and it has some caveats obviously...
>>>
>>> The attached patch does not introduce a new state but reuses the
>>> existing CONN_STATE_LINGER since it was not really considered by the
>>> listener thread (which uses CONN_STATE_LINGER_NORMAL and
>>> CONN_STATE_LINGER_SHORT instead), but that's a detail.
>>>
>>> Mainly, start_lingering_close_nonblocking() now simply schedules a
>>> shutdown (i.e. pre_close_connection() followed by immediate close)
>>> that will be run by a worker thread.
>>> A new shutdown_linger_q is created/handled (with the same timeout as
>>> the short_linger_q, namely 2 seconds) to hold connections to be
>>> shut down.
>>>
>>> So now when a connection times out in the write_completion or
>>> keepalive queues, it needs (i.e. the listener may wait for) an
>>> available worker to process its shutdown/close.
>>> This means we can *not* close kept-alive connections immediately like
>>> before when becoming short of workers, which will favor active KA
>>> connections over new ones in this case (I don't think it's that
>>> serious, but the previous code was taking care of that. For me it's up
>>> to the admin to size the workers appropriately...).
>>>
>>> Likewise, when a connection in the shutdown_linger_q itself times out,
>>> the patch will require a worker immediately to do the job (see the
>>> shutdown_lingering_close() callback).
>>>
>>> So overall, this patch may introduce the need for more workers than
>>> before, but what was (wrongly) done by the listener thread has to be
>>> done somewhere anyway...
>>>
>>> Finally, I think there is room for improvements like batching
>>> shutdowns in the same worker if there is no objection on the approach
>>> so far.
>>>
>>> WDYT?
>>>
>>> Regards,
>>> Yann.
>>>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

On 20.07.2017 at 01:26, Yann Ylavic wrote:
> On Wed, Jul 19, 2017 at 11:14 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> On 19.07.2017 at 22:46, Yann Ylavic wrote:
>>>
>>> Attached is a v2 if you feel confident enough, still ;)
>>
>> Thanks, yes i will.
> 
> If you managed to install v2 already you may want to ignore this new
> v3, which only addresses a very unlikely error case where
> lingering_count++ is missing (plus some non-functional changes, a bit
> of renaming and the factorization which would have avoided this
> mistake in the first place).
> 
> Otherwise, you could try this one instead.

Thanks, switched to V3.

> 
> Thanks,
> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Yes:
Slot  PID    Stopping  Connections       Threads     Async connections
                       total  accepting  busy  idle  writing  keep-alive  closing
0     3614   no        1      no         4     196   0        0           4294966701
1     3615   no        0      no         5     195   0        0           4294966697
2     10228  no        0      no         6     194   0        0           4294966698
3     12030  no        0      no         4     196   0        0           4294966697
4     19495  no        3      yes        6     194   0        0           4294966801
5     22098  no        6      yes        5     195   0        5           4294966706
6     30071  no        15     yes        8     192   0        5           4294967283
Sum   7      0         25                38    1362  0        10          -3489

Stefan

Excuse my typo sent from my mobile phone.

> On 20.07.2017 at 15:18, Yann Ylavic <yl...@gmail.com> wrote:
> 
> On Thu, Jul 20, 2017 at 2:58 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> Yes it looks the same but I can't tell if it is.
>> 
>> Here's a backtrace from V3:
>> https://apaste.info/Aw0r
> 
> Thanks Stefan, how about mod_status, still some strange entries?
> 
> Regards,
> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Thu, Jul 20, 2017 at 2:58 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
> Yes it looks the same but I can't tell if it is.
>
> Here's a backtrace from V3:
> https://apaste.info/Aw0r

Thanks Stefan, how about mod_status, still some strange entries?

Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Yes, it looks the same, but I can't tell if it is.

Here's a backtrace from V3:
https://apaste.info/Aw0r

Greets,
Stefan

Excuse my typo sent from my mobile phone.

> On 20.07.2017 at 13:01, Yann Ylavic <yl...@gmail.com> wrote:
> 
> On Thu, Jul 20, 2017 at 12:48 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> V3 didn't help. Will post a new gdb backtrace soon; it takes some time as I'm
>> on holiday.
> 
> Thanks Stefan, still some/the same issue in status?
> 
> Regards,
> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Thu, Jul 20, 2017 at 12:48 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
> V3 didn't help. Will post a new gdb backtrace soon; it takes some time as I'm
> on holiday.

Thanks Stefan, still some/the same issue in status?

Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello Yann,

I downloaded V3. I can't guarantee when I can test; maybe today or on Monday.

Greets,
Stefan

On 21.07.2017 at 01:08, Yann Ylavic wrote:
> On Thu, Jul 20, 2017 at 12:48 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> V3 didn't help.
> 
> I just posted a new patch in this thread, with a new approach which I
> think is better anyway.
> 
> Would you mind testing it in your environment?
> 
> 
> Regards,
> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jul 21, 2017 at 1:08 AM, Yann Ylavic <yl...@gmail.com> wrote:
> On Thu, Jul 20, 2017 at 12:48 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> V3 didn't help.
>
> I just posted a new patch in this thread, with a new approach which I
> think is better anyway.
>
> Would you mind testing it in your environment?

Latest (defer_linger_chain-v2.patch, minor changes) attached to PR 60956.

>
>
> Regards,
> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

Thanks for testing and verifying the fix, Stefan!

> On 31.07.2017 at 11:32, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
> 
> For the record, I was able to fix this with mod_h2 v1.10.10
> 
> Greets,
> Stefan
> 
> On 25.07.2017 at 15:40, Stefan Eissing wrote:
>> Well, if the customer could reproduce this at a 
>> 
>>  LogLevel http2:trace2
>> 
>> that would help.
>> 
>>> On 25.07.2017 at 15:38, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>> 
>>> Hello Stefan,
>>> 
>>> thanks for the patch. No it does not solve the problem our customer is
>>> seeing.
>>> 
>>> What kind of details / logs do you need?
>>> 
>>> Greets,
>>> Stefan
>>> 
>>> On 25.07.2017 at 11:59, Stefan Eissing wrote:
>>>> The issue was opened here: https://github.com/icing/mod_h2/issues/143
>>>> 
>>>> I made a patch that i hope addresses the problem. The 2.4.x version I attach to this mail.
>>>> 
>>>> Thanks!
>>>> 
>>>> Stefan
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On 25.07.2017 at 08:13, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>> 
>>>>> 
>>>>> On 24.07.2017 at 23:06, Stefan Eissing wrote:
>>>>>> I have another report of request getting stuck, resulting in the error you noticed. Will look tomorrow and report back here what I find.
>>>>> 
>>>>> Thanks. If you need any logs, please ask.
>>>>> 
>>>>> Stefan
>>>>> 
>>>>>> 
>>>>>>> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>>>> 
>>>>>>> Hello all,
>>>>>>> 
>>>>>>> currently 8 hours of testing without any issues.
>>>>>>> 
>>>>>>> @Stefan
>>>>>>> i've most probably another issue with http2 where some elements of the
>>>>>>> page are sometimes missing and the connection results in
>>>>>>> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
>>>>>>> 
>>>>>>> Greets,
>>>>>>> Stefan
>>>>>>>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>>>>>>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>>>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>>>>>>>> <s....@profihost.ag> wrote:
>>>>>>>>>> 
>>>>>>>>>> your new defer linger V3 deadlocked as well.
>>>>>>>>>> 
>>>>>>>>>> GDB traces:
>>>>>>>>>> https://www.apaste.info/LMfJ
>>>>>>>>> 
>>>>>>>>> This shows the listener thread waiting for a worker while there are
>>>>>>>>> many available.
>>>>>>>>> My mistake, the worker threads failed to rearm their idle state for
>>>>>>>>> the deferred close case.
>>>>>>>>> 
>>>>>>>>> V4 available, thanks Stefan!
>>>>>>>> 
>>>>>>>> Since I didn't want you to test again something that fails quickly, I
>>>>>>>> did some stress tests myself this time with several tools, with V5
>>>>>>>> (same as V4 from a 2.4.x POV).
>>>>>>>> 
>>>>>>>> It seems to work well here...
>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yann.
>>>>>> 
>>>> 
>>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

For the record, I was able to fix this with mod_h2 v1.10.10

Greets,
Stefan

On 25.07.2017 at 15:40, Stefan Eissing wrote:
> Well, if the customer could reproduce this at a 
> 
>   LogLevel http2:trace2
> 
> that would help.
> 
>> On 25.07.2017 at 15:38, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>
>> Hello Stefan,
>>
>> thanks for the patch. No it does not solve the problem our customer is
>> seeing.
>>
>> What kind of details / logs do you need?
>>
>> Greets,
>> Stefan
>>
>> On 25.07.2017 at 11:59, Stefan Eissing wrote:
>>> The issue was opened here: https://github.com/icing/mod_h2/issues/143
>>>
>>> I made a patch that i hope addresses the problem. The 2.4.x version I attach to this mail.
>>>
>>> Thanks!
>>>
>>> Stefan
>>>
>>>
>>>
>>>
>>>
>>>> On 25.07.2017 at 08:13, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>
>>>>
>>>> On 24.07.2017 at 23:06, Stefan Eissing wrote:
>>>>> I have another report of request getting stuck, resulting in the error you noticed. Will look tomorrow and report back here what I find.
>>>>
>>>> Thanks. If you need any logs, please ask.
>>>>
>>>> Stefan
>>>>
>>>>>
>>>>>> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> currently 8 hours of testing without any issues.
>>>>>>
>>>>>> @Stefan
>>>>>> i've most probably another issue with http2 where some elements of the
>>>>>> page are sometimes missing and the connection results in
>>>>>> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>>>>>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>>>>>>> <s....@profihost.ag> wrote:
>>>>>>>>>
>>>>>>>>> your new defer linger V3 deadlocked as well.
>>>>>>>>>
>>>>>>>>> GDB traces:
>>>>>>>>> https://www.apaste.info/LMfJ
>>>>>>>>
>>>>>>>> This shows the listener thread waiting for a worker while there are
>>>>>>>> many available.
>>>>>>>> My mistake, the worker threads failed to rearm their idle state for
>>>>>>>> the deferred close case.
>>>>>>>>
>>>>>>>> V4 available, thanks Stefan!
>>>>>>>
>>>>>>> Since I didn't want you to test again something that fails quickly, I
>>>>>>> did some stress tests myself this time with several tools, with V5
>>>>>>> (same as V4 from a 2.4.x POV).
>>>>>>>
>>>>>>> It seems to work well here...
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Yann.
>>>>>
>>>
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

I am waiting to hear back from the peeps that opened the github issue. From how I read their logs, the patch should help them. Will report what they say. 

-Stefan

> On 25.07.2017 at 15:40, Stefan Eissing <st...@greenbytes.de> wrote:
> 
> Well, if the customer could reproduce this at a 
> 
>  LogLevel http2:trace2
> 
> that would help.
> 
>> On 25.07.2017 at 15:38, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>> 
>> Hello Stefan,
>> 
>> thanks for the patch. No it does not solve the problem our customer is
>> seeing.
>> 
>> What kind of details / logs do you need?
>> 
>> Greets,
>> Stefan
>> 
>>> On 25.07.2017 at 11:59, Stefan Eissing wrote:
>>> The issue was opened here: https://github.com/icing/mod_h2/issues/143
>>> 
>>> I made a patch that i hope addresses the problem. The 2.4.x version I attach to this mail.
>>> 
>>> Thanks!
>>> 
>>> Stefan
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On 25.07.2017 at 08:13, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>> 
>>>> 
>>>>> On 24.07.2017 at 23:06, Stefan Eissing wrote:
>>>>> I have another report of request getting stuck, resulting in the error you noticed. Will look tomorrow and report back here what I find.
>>>> 
>>>> Thanks. If you need any logs, please ask.
>>>> 
>>>> Stefan
>>>> 
>>>>> 
>>>>>> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>>> 
>>>>>> Hello all,
>>>>>> 
>>>>>> currently 8 hours of testing without any issues.
>>>>>> 
>>>>>> @Stefan
>>>>>> i've most probably another issue with http2 where some elements of the
>>>>>> page are sometimes missing and the connection results in
>>>>>> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
>>>>>> 
>>>>>> Greets,
>>>>>> Stefan
>>>>>>>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>>>>>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>>>>>>> <s....@profihost.ag> wrote:
>>>>>>>>> 
>>>>>>>>> your new defer linger V3 deadlocked as well.
>>>>>>>>> 
>>>>>>>>> GDB traces:
>>>>>>>>> https://www.apaste.info/LMfJ
>>>>>>>> 
>>>>>>>> This shows the listener thread waiting for a worker while there are
>>>>>>>> many available.
>>>>>>>> My mistake, the worker threads failed to rearm their idle state for
>>>>>>>> the deferred close case.
>>>>>>>> 
>>>>>>>> V4 available, thanks Stefan!
>>>>>>> 
>>>>>>> Since I didn't want you to test again something that fails quickly, I
>>>>>>> did some stress tests myself this time with several tools, with V5
>>>>>>> (same as V4 from a 2.4.x POV).
>>>>>>> 
>>>>>>> It seems to work well here...
>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Yann.
>>>>> 
>>> 
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

Well, if the customer could reproduce this at a 

  LogLevel http2:trace2

that would help.

> On 25.07.2017 at 15:38, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
> 
> Hello Stefan,
> 
> thanks for the patch. No it does not solve the problem our customer is
> seeing.
> 
> What kind of details / logs do you need?
> 
> Greets,
> Stefan
> 
> On 25.07.2017 at 11:59, Stefan Eissing wrote:
>> The issue was opened here: https://github.com/icing/mod_h2/issues/143
>> 
>> I made a patch that i hope addresses the problem. The 2.4.x version I attach to this mail.
>> 
>> Thanks!
>> 
>> Stefan
>> 
>> 
>> 
>> 
>> 
>>> On 25.07.2017 at 08:13, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>> 
>>> 
>>> On 24.07.2017 at 23:06, Stefan Eissing wrote:
>>>> I have another report of request getting stuck, resulting in the error you noticed. Will look tomorrow and report back here what I find.
>>> 
>>> Thanks. If you need any logs, please ask.
>>> 
>>> Stefan
>>> 
>>>> 
>>>>> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>> 
>>>>> Hello all,
>>>>> 
>>>>> currently 8 hours of testing without any issues.
>>>>> 
>>>>> @Stefan
>>>>> i've most probably another issue with http2 where some elements of the
>>>>> page are sometimes missing and the connection results in
>>>>> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
>>>>> 
>>>>> Greets,
>>>>> Stefan
>>>>>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>>>>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>>>>>> <s....@profihost.ag> wrote:
>>>>>>>> 
>>>>>>>> your new defer linger V3 deadlocked as well.
>>>>>>>> 
>>>>>>>> GDB traces:
>>>>>>>> https://www.apaste.info/LMfJ
>>>>>>> 
>>>>>>> This shows the listener thread waiting for a worker while there are
>>>>>>> many available.
>>>>>>> My mistake, the worker threads failed to rearm their idle state for
>>>>>>> the deferred close case.
>>>>>>> 
>>>>>>> V4 available, thanks Stefan!
>>>>>> 
>>>>>> Since I didn't want you to test again something that fails quickly, I
>>>>>> did some stress tests myself this time with several tools, with V5
>>>>>> (same as V4 from a 2.4.x POV).
>>>>>> 
>>>>>> It seems to work well here...
>>>>>> 
>>>>>>> Regards,
>>>>>>> Yann.
>>>> 
>>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello Stefan,

Thanks for the patch. No, it does not solve the problem our customer is
seeing.

What kind of details / logs do you need?

Greets,
Stefan

On 25.07.2017 at 11:59, Stefan Eissing wrote:
> The issue was opened here: https://github.com/icing/mod_h2/issues/143
> 
> I made a patch that i hope addresses the problem. The 2.4.x version I attach to this mail.
> 
> Thanks!
> 
> Stefan
> 
> 
> 
> 
> 
>> On 25.07.2017 at 08:13, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>
>>
>> On 24.07.2017 at 23:06, Stefan Eissing wrote:
>>> I have another report of request getting stuck, resulting in the error you noticed. Will look tomorrow and report back here what I find.
>>
>> Thanks. If you need any logs, please ask.
>>
>> Stefan
>>
>>>
>>>> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>>>
>>>> Hello all,
>>>>
>>>> currently 8 hours of testing without any issues.
>>>>
>>>> @Stefan
>>>> i've most probably another issue with http2 where some elements of the
>>>> page are sometimes missing and the connection results in
>>>> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
>>>>
>>>> Greets,
>>>> Stefan
>>>>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>>>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>>>>> <s....@profihost.ag> wrote:
>>>>>>>
>>>>>>> your new defer linger V3 deadlocked as well.
>>>>>>>
>>>>>>> GDB traces:
>>>>>>> https://www.apaste.info/LMfJ
>>>>>>
>>>>>> This shows the listener thread waiting for a worker while there are
>>>>>> many available.
>>>>>> My mistake, the worker threads failed to rearm their idle state for
>>>>>> the deferred close case.
>>>>>>
>>>>>> V4 available, thanks Stefan!
>>>>>
>>>>> Since I didn't want you to test again something that fails quickly, I
>>>>> did some stress tests myself this time with several tools, with V5
>>>>> (same as V4 from a 2.4.x POV).
>>>>>
>>>>> It seems to work well here...
>>>>>
>>>>>> Regards,
>>>>>> Yann.
>>>
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

The issue was opened here: https://github.com/icing/mod_h2/issues/143

I made a patch that I hope addresses the problem. The 2.4.x version is attached to this mail.

Thanks!

Stefan

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

On 24.07.2017 at 23:06, Stefan Eissing wrote:
> I have another report of request getting stuck, resulting in the error you noticed. Will look tomorrow and report back here what I find.

Thanks. If you need any logs, please ask.

Stefan

> 
>> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
>>
>> Hello all,
>>
>> currently 8 hours of testing without any issues.
>>
>> @Stefan
>> i've most probably another issue with http2 where some elements of the
>> page are sometimes missing and the connection results in
>> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
>>
>> Greets,
>> Stefan
>>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>>> <s....@profihost.ag> wrote:
>>>>>
>>>>> your new defer linger V3 deadlocked as well.
>>>>>
>>>>> GDB traces:
>>>>> https://www.apaste.info/LMfJ
>>>>
>>>> This shows the listener thread waiting for a worker while there are
>>>> many available.
>>>> My mistake, the worker threads failed to rearm their idle state for
>>>> the deferred close case.
>>>>
>>>> V4 available, thanks Stefan!
>>>
>>> Since I didn't want you to test again something that fails quickly, I
>>> did some stress tests myself this time with several tools, with V5
>>> (same as V4 from a 2.4.x POV).
>>>
>>> It seems to work well here...
>>>
>>>> Regards,
>>>> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

I have another report of requests getting stuck, resulting in the error you noticed. I will look tomorrow and report back here what I find.

> On 24.07.2017 at 22:20, Stefan Priebe - Profihost AG <s....@profihost.ag> wrote:
> 
> Hello all,
> 
> currently 8 hours of testing without any issues.
> 
> @Stefan
> i've most probably another issue with http2 where some elements of the
> page are sometimes missing and the connection results in
> ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?
> 
> Greets,
> Stefan
>> On 22.07.2017 at 13:35, Yann Ylavic wrote:
>>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>>> <s....@profihost.ag> wrote:
>>>> 
>>>> your new defer linger V3 deadlocked as well.
>>>> 
>>>> GDB traces:
>>>> https://www.apaste.info/LMfJ
>>> 
>>> This shows the listener thread waiting for a worker while there are
>>> many available.
>>> My mistake, the worker threads failed to rearm their idle state for
>>> the deferred close case.
>>> 
>>> V4 available, thanks Stefan!
>> 
>> Since I didn't want you to test again something that fails quickly, I
>> did some stress tests myself this time with several tools, with V5
>> (same as V4 from a 2.4.x POV).
>> 
>> It seems to work well here...
>> 
>>> Regards,
>>> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello all,

Currently 8 hours of testing without any issues.

@Stefan
I most probably have another issue with http2, where some elements of the
page are sometimes missing and the connection results in
ERR_CONNECTION_CLOSED after 60s. What kind of details do you need?

Greets,
Stefan
On 22.07.2017 at 13:35, Yann Ylavic wrote:
> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>> <s....@profihost.ag> wrote:
>>>
>>> your new defer linger V3 deadlocked as well.
>>>
>>> GDB traces:
>>> https://www.apaste.info/LMfJ
>>
>> This shows the listener thread waiting for a worker while there are
>> many available.
>> My mistake, the worker threads failed to rearm their idle state for
>> the deferred close case.
>>
>> V4 available, thanks Stefan!
> 
> Since I didn't want you to test again something that fails quickly, I
> did some stress tests myself this time with several tools, with V5
> (same as V4 from a 2.4.x POV).
> 
> It seems to work well here...
> 
>> Regards,
>> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

First test with version five looks good so far. I will continue extensive testing tomorrow.

Greets,
Stefan

Excuse my typo sent from my mobile phone.

> On 22.07.2017 at 13:35, Yann Ylavic <yl...@gmail.com> wrote:
> 
>> On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>> <s....@profihost.ag> wrote:
>>> 
>>> your new defer linger V3 deadlocked as well.
>>> 
>>> GDB traces:
>>> https://www.apaste.info/LMfJ
>> 
>> This shows the listener thread waiting for a worker while there are
>> many available.
>> My mistake, the worker threads failed to rearm their idle state for
>> the deferred close case.
>> 
>> V4 available, thanks Stefan!
> 
> Since I didn't want you to test again something that fails quickly, I
> did some stress tests myself this time with several tools, with V5
> (same as V4 from a 2.4.x POV).
> 
> It seems to work well here...
> 
>> Regards,
>> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Sat, Jul 22, 2017 at 2:18 AM, Yann Ylavic <yl...@gmail.com> wrote:
> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>>
>> your new defer linger V3 deadlocked as well.
>>
>> GDB traces:
>> https://www.apaste.info/LMfJ
>
> This shows the listener thread waiting for a worker while there are
> many available.
> My mistake, the worker threads failed to rearm their idle state for
> the deferred close case.
>
> V4 available, thanks Stefan!

Since I didn't want you to test again something that fails quickly, I
did some stress tests myself this time with several tools, with V5
(same as V4 from a 2.4.x POV).

It seems to work well here...

> Regards,
> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

And to answer myself: no, the v3 patch does not expose anything when running in h2fuzz.

> On 22.07.2017 at 07:17, Stefan Eissing <st...@greenbytes.de> wrote:
> 
> Profihost, where bugs come to die!
> 
> I am currently fully overloaded, but it would be interesting to check how the previous versions of the patch fare in a h2fuzz setup.
> 
> -Stefan
> 
>> On 22.07.2017 at 02:18, Yann Ylavic <yl...@gmail.com> wrote:
>> 
>> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
>> <s....@profihost.ag> wrote:
>>> 
>>> your new defer linger V3 deadlocked as well.
>>> 
>>> GDB traces:
>>> https://www.apaste.info/LMfJ
>> 
>> This shows the listener thread waiting for a worker while there are
>> many available.
>> My mistake, the worker threads failed to rearm their idle state for
>> the deferred close case.
>> 
>> V4 available, thanks Stefan!
>> 
>> Regards,
>> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Eissing <st...@greenbytes.de>.

Profihost, where bugs come to die!

I am currently fully overloaded, but it would be interesting to check how the previous versions of the patch fare in a h2fuzz setup.

-Stefan

> On 22.07.2017 at 02:18, Yann Ylavic <yl...@gmail.com> wrote:
> 
> On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> 
>> your new defer linger V3 deadlocked as well.
>> 
>> GDB traces:
>> https://www.apaste.info/LMfJ
> 
> This shows the listener thread waiting for a worker while there are
> many available.
> My mistake, the worker threads failed to rearm their idle state for
> the deferred close case.
> 
> V4 available, thanks Stefan!
> 
> Regards,
> Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jul 21, 2017 at 10:31 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
>
> your new defer linger V3 deadlocked as well.
>
> GDB traces:
> https://www.apaste.info/LMfJ

This shows the listener thread waiting for a worker while there are
many available.
My mistake, the worker threads failed to rearm their idle state for
the deferred close case.

V4 available, thanks Stefan!

Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello Yann,

your new defer linger V3 deadlocked as well.

GDB traces:
https://www.apaste.info/LMfJ

This time I have no fullstatus output for you, as Apache no longer
served any connections at all. Even before that, though, I did NOT see
those strange values for closing connections.

Thanks!

Greets,
Stefan

On 21.07.2017 at 01:08, Yann Ylavic wrote:
> On Thu, Jul 20, 2017 at 12:48 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> V3 didn't help.
> 
> I just posted a new patch in this thread, with a new approach which I
> think is better anyway.
> 
> Would you mind testing it in your environment?
> 
> 
> Regards,
> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Thu, Jul 20, 2017 at 12:48 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
> V3 didn't help.

I just posted a new patch in this thread, with a new approach which I
think is better anyway.

Would you mind testing it in your environment?


Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

V3 didn't help. I will post a new gdb backtrace soon; it will take some time as I'm on holiday.

Stefan

Excuse my typo sent from my mobile phone.

> On 20.07.2017 at 01:26, Yann Ylavic <yl...@gmail.com> wrote:
> 
> On Wed, Jul 19, 2017 at 11:14 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>> On 19.07.2017 at 22:46, Yann Ylavic wrote:
>>> 
>>> Attached is a v2 if you feel confident enough, still ;)
>> 
>> Thanks, yes i will.
> 
> If you managed to install v2 already you may want to ignore this new
> v3, which only addresses a very unlikely error case where
> lingering_count++ is missing (plus some non-functional changes, a bit
> of renaming and the factorization which would have avoided this
> mistake in the first place).
> 
> Otherwise, you could try this one instead.
> 
> Thanks,
> Yann.
> <shutdown_linger_q-v3.patch>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Wed, Jul 19, 2017 at 11:14 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
> On 19.07.2017 at 22:46, Yann Ylavic wrote:
>>
>> Attached is a v2 if you feel confident enough, still ;)
>
> Thanks, yes i will.

If you managed to install v2 already you may want to ignore this new
v3, which only addresses a very unlikely error case where
lingering_count++ is missing (plus some non-functional changes, a bit
of renaming and the factorization which would have avoided this
mistake in the first place).

Otherwise, you could try this one instead.

Thanks,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

On 19.07.2017 at 22:46, Yann Ylavic wrote:
> Hi Stefan,
> 
> thanks for testing again!
> 
> On Wed, Jul 19, 2017 at 7:42 PM, Stefan Priebe - Profihost AG
> <s....@profihost.ag> wrote:
>>
>> What looks strange
>> from a first view is that async connections closing has very high and
>> strange values:
>> 4294967211
> 
> Indeed, I messed up with mpm_event's lingering_count in the first patch.
> And it can lead to disabling the listener, which I think is what you observe.
> 
> Attached is a v2 if you feel confident enough, still ;)

Thanks, yes i will.

> 
> Regards,
> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

Hi Stefan,

thanks for testing again!

On Wed, Jul 19, 2017 at 7:42 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
>
> What looks strange
> from a first view is that async connections closing has very high and
> strange values:
> 4294967211

Indeed, I messed up with mpm_event's lingering_count in the first patch.
And it can lead to disabling the listener, which I think is what you observe.
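
The huge "closing" values are what an unsigned 32-bit counter shows once
it has been decremented more often than incremented. A minimal sketch of
that arithmetic follows. It is not mpm_event's code: the helper names are
hypothetical, and it assumes the counter is a 32-bit unsigned integer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not from httpd): apply `incs` increments and `decs`
 * decrements to a 32-bit unsigned counter, as an unbalanced
 * lingering_count++ / lingering_count-- pair would. */
uint32_t counter_after(uint32_t start, uint32_t incs, uint32_t decs)
{
    return start + incs - decs;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* Reinterpret a wrapped value as a signed surplus/deficit, for display. */
int64_t as_deficit(uint32_t value)
{
    return (value > UINT32_MAX / 2) ? (int64_t)value - 4294967296LL
                                    : (int64_t)value;
}

/* Decode the two per-process values reported in this thread. */
void show_examples(void)
{
    /* 85 more decrements than increments yields the reported value. */
    assert(counter_after(0, 0, 85) == 4294967211u);
    printf("%lld\n", (long long)as_deficit(4294967231u)); /* -65 */
    printf("%lld\n", (long long)as_deficit(4294967211u)); /* -85 */
}
```

Read this way, the two values are -65 and -85, which is consistent with
the Sum of -150 in Stefan's status output.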

Attached is a v2 if you feel confident enough, still ;)

Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello Luca,

I need to wait until a machine crashes again. What looks strange at
first glance is that the async connections "closing" column has very
high and strange values:
4294967211

Even a system that has not crashed yet shows them:

   Slot  PID  Stopping   Connections    Threads       Async connections
                       total accepting busy idle writing keep-alive  closing
   0    25157 no       12    yes       4    196  0       9           4294967231 <== HERE ==
   1    25159 no       22    yes       8    192  0       13          4294967211 <== HERE ==
   Sum  2     0        34              12   388  0       22         -150

Greets,
Stefan

On 19.07.2017 at 17:48, Luca Toscano wrote:
> Hello Stefan,
> 
> 2017-07-19 17:05 GMT+02:00 Stefan Priebe - Profihost AG
> <s.priebe@profihost.ag>:
> 
> 
>     On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
>     > Hello Yann,
>     >
>     > i'm observing some deadlocks again.
>     >
>     > I'm using
>     > httpd 2.4.27
>     > + mod_h2
>     > + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
>     > + your ssl linger fix patch from this thread
>     >
>     > What kind of information do you need? If you need a full stack backtrace
>     >  - from which pid? Or from all httpd pids?
> 
>     Something i forgot to tell:
> 
>     it seems httpd is running at max threads:
>     awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
>       25050 *:http(LISTEN)
>       25050 *:https(LISTEN)
> 
> 
> First of all let me tell you how awesome is your regular testing, thank
> you! It is helping a ton to deliver stable code :)
> 
> From my point of view I think that you can attach gdb to one or more
> httpd processes and do the usual "thread apply all", IIUC your httpd
> ends up having all of its processes in more or less the same state right?
> 
> Let's use https://apaste.info/ or similar though otherwise we'll need to
> exchange super long emails.
> 
> Thanks again!
> 
> Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Eric Covener <co...@gmail.com>.

On Wed, Jul 19, 2017 at 2:25 PM, Stefan Priebe - Profihost AG
<s....@profihost.ag> wrote:
> Hello,
>
> here we go:
>
> This one is from a server where the first httpd process got stuck:
>
>    Slot  PID  Stopping   Connections    Threads       Async connections
>                        total accepting busy idle writing keep-alive  closing
>    0    31675 no       0     no        0    200  0       0           4294966700
>
> gdb thread apply all from this pid:
> https://apaste.info/cBT5
>

summary from a script I use:

1: read>read>ap>child_main>make_child>server_main_loop>event>ap_run_mpm>main
300: pthread_cond_wait@@GLIBC_2.3.2>apr_thread_cond_wait>get_next>slot>start_thread>clone
200: pthread_cond_wait@@GLIBC_2.3.2>apr_thread_cond_wait>ap_queue_pop_something>worker_thread>start_thread>clone
1: epoll_wait>impl_pollset_poll>listener_thread>start_thread>clone

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello,

here we go:

This one is from a server where the first httpd process got stuck:

   Slot  PID  Stopping   Connections    Threads       Async connections
                       total accepting busy idle writing keep-alive  closing
   0    31675 no       0     no        0    200  0       0           4294966700

gdb thread apply all from this pid:
https://apaste.info/cBT5

Greets,
Stefan

On 19.07.2017 at 17:48, Luca Toscano wrote:
> Hello Stefan,
> 
> 2017-07-19 17:05 GMT+02:00 Stefan Priebe - Profihost AG
> <s.priebe@profihost.ag>:
> 
> 
>     On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
>     > Hello Yann,
>     >
>     > i'm observing some deadlocks again.
>     >
>     > I'm using
>     > httpd 2.4.27
>     > + mod_h2
>     > + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
>     > + your ssl linger fix patch from this thread
>     >
>     > What kind of information do you need? If you need a full stack backtrace
>     >  - from which pid? Or from all httpd pids?
> 
>     Something i forgot to tell:
> 
>     it seems httpd is running at max threads:
>     awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
>       25050 *:http(LISTEN)
>       25050 *:https(LISTEN)
> 
> 
> First of all let me tell you how awesome is your regular testing, thank
> you! It is helping a ton to deliver stable code :)
> 
> From my point of view I think that you can attach gdb to one or more
> httpd processes and do the usual "thread apply all", IIUC your httpd
> ends up having all of its processes in more or less the same state right?
> 
> Let's use https://apaste.info/ or similar though otherwise we'll need to
> exchange super long emails.
> 
> Thanks again!
> 
> Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Luca Toscano <to...@gmail.com>.

Hello Stefan,

2017-07-19 17:05 GMT+02:00 Stefan Priebe - Profihost AG <s.priebe@profihost.ag>:

>
> On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
> > Hello Yann,
> >
> > i'm observing some deadlocks again.
> >
> > I'm using
> > httpd 2.4.27
> > + mod_h2
> > + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
> > + your ssl linger fix patch from this thread
> >
> > What kind of information do you need? If you need a full stack backtrace
> >  - from which pid? Or from all httpd pids?
>
> Something i forgot to tell:
>
> it seems httpd is running at max threads:
> awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
>   25050 *:http(LISTEN)
>   25050 *:https(LISTEN)
>
>
First of all let me tell you how awesome is your regular testing, thank
you! It is helping a ton to deliver stable code :)

From my point of view I think that you can attach gdb to one or more httpd
processes and do the usual "thread apply all", IIUC your httpd ends up
having all of its processes in more or less the same state right?

Let's use https://apaste.info/ or similar though otherwise we'll need to
exchange super long emails.

Thanks again!

Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

On 19.07.2017 at 16:59, Stefan Priebe - Profihost AG wrote:
> Hello Yann,
> 
> i'm observing some deadlocks again.
> 
> I'm using
> httpd 2.4.27
> + mod_h2
> + httpd-2.4.x-mpm_event-wakeup-v7.1.patch
> + your ssl linger fix patch from this thread
> 
> What kind of information do you need? If you need a full stack backtrace
>  - from which pid? Or from all httpd pids?

Something I forgot to mention:

It seems httpd is running at max threads:
awk '{print $10 $11}' lsof.txt | sort | uniq -c | grep LISTEN
  25050 *:http(LISTEN)
  25050 *:https(LISTEN)

Stefan

> 
> Thanks!
> 
> Greets,
> Stefan
> 
> On 14.07.2017 at 21:52, Yann Ylavic wrote:
>> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <yl...@gmail.com> wrote:
>>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>>>>
>>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>>
>>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>>> ssl_filter_io_shutdown().
>>>>
>>>> I think the issue starts before that.
>>>> ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
>>>> to this hook can perform all sort of long lasting blocking operations there.
>>>> While it can be argued that this would be a bug in the module I think the only safe way
>>>> is to have the whole start_lingering_close_nonblocking being executed by a worker thread.
>>>
>>> Correct, that'd be much simpler/safer indeed.
>>> We need a new SHUTDOWN state then, right?
>>
>> Actually it was less simple than expected, and it has some caveats obviously...
>>
>> The attached patch does not introduce a new state but reuses the
>> existing CONN_STATE_LINGER since it was not really considered by the
>> listener thread (which uses CONN_STATE_LINGER_NORMAL and
>> CONN_STATE_LINGER_SHORT instead), but that's a detail.
>>
>> Mainly, start_lingering_close_nonblocking() now simply schedules a
>> shutdown (i.e. pre_close_connection() followed by immediate close)
>> that will be run by a worker thread.
>> A new shutdown_linger_q is created/handled (with the same timeout as
>> the short_linger_q, namely 2 seconds) to hold connections to be
>> shutdown.
>>
>> So now when a connection times out in the write_completion or
>> keepalive queues, it needs (i.e. the listener may wait for) an
>> available worker to process its shutdown/close.
>> This means we can *not* close kept alive connections immediately like
>> before when becoming short of workers, which will favor active KA
>> connections over new ones in this case (I don't think it's that
>> serious but the previous was taking care of that. For me it's up to
>> the admin to size the workers appropriately...).
>>
>> Same when a connection in the shutdown_linger_q itself times out, the
>> patch will require a worker immediately to do the job (see
>> shutdown_lingering_close() callback).
>>
>> So overall, this patch may introduce the need for more workers than
>> before, what was (wrongly) done by the listener thread has to be done
>> somewhere anyway...
>>
>> Finally, I think there is room for improvements like batching
>> shutdowns in the same worker if there is no objection on the approach
>> so far.
>>
>> WDYT?
>>
>> Regards,
>> Yann.
>>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Stefan Priebe - Profihost AG <s....@profihost.ag>.

Hello Yann,

i'm observing some deadlocks again.

I'm using
httpd 2.4.27
+ mod_h2
+ httpd-2.4.x-mpm_event-wakeup-v7.1.patch
+ your ssl linger fix patch from this thread

What kind of information do you need? If you need a full stack backtrace
 - from which pid? Or from all httpd pids?

Thanks!

Greets,
Stefan

On 14.07.2017 at 21:52, Yann Ylavic wrote:
> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <yl...@gmail.com> wrote:
>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>>>
>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>
>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>> ssl_filter_io_shutdown().
>>>
>>> I think the issue starts before that.
>>> ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
>>> to this hook can perform all sort of long lasting blocking operations there.
>>> While it can be argued that this would be a bug in the module I think the only safe way
>>> is to have the whole start_lingering_close_nonblocking being executed by a worker thread.
>>
>> Correct, that'd be much simpler/safer indeed.
>> We need a new SHUTDOWN state then, right?
> 
> Actually it was less simple than expected, and it has some caveats obviously...
> 
> The attached patch does not introduce a new state but reuses the
> existing CONN_STATE_LINGER since it was not really considered by the
> listener thread (which uses CONN_STATE_LINGER_NORMAL and
> CONN_STATE_LINGER_SHORT instead), but that's a detail.
> 
> Mainly, start_lingering_close_nonblocking() now simply schedules a
> shutdown (i.e. pre_close_connection() followed by immediate close)
> that will be run by a worker thread.
> A new shutdown_linger_q is created/handled (with the same timeout as
> the short_linger_q, namely 2 seconds) to hold connections to be
> shutdown.
> 
> So now when a connection times out in the write_completion or
> keepalive queues, it needs (i.e. the listener may wait for) an
> available worker to process its shutdown/close.
> This means we can *not* close kept alive connections immediately like
> before when becoming short of workers, which will favor active KA
> connections over new ones in this case (I don't think it's that
> serious but the previous was taking care of that. For me it's up to
> the admin to size the workers appropriately...).
> 
> Same when a connection in the shutdown_linger_q itself times out, the
> patch will require a worker immediately to do the job (see
> shutdown_lingering_close() callback).
> 
> So overall, this patch may introduce the need for more workers than
> before, what was (wrongly) done by the listener thread has to be done
> somewhere anyway...
> 
> Finally, I think there is room for improvements like batching
> shutdowns in the same worker if there is no objection on the approach
> so far.
> 
> WDYT?
> 
> Regards,
> Yann.
>

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Luca Toscano <to...@gmail.com>.

2017-07-21 1:16 GMT+02:00 Yann Ylavic <yl...@gmail.com>:

> On Fri, Jul 21, 2017 at 1:05 AM, Postmaster
> <po...@cienacorp.onmicrosoft.com> wrote:
> > This message was created automatically by mail delivery software. Your email message was not delivered as is to the intended recipients because malware was detected in one or more attachments included with it. All attachments were deleted.
> >
> > --- Additional Information ---:
> []
> >
> > Detections found:
> > defer_linger_chain.patch         JS/Jasobfus.A!ml
>
> wtf?
>

Was that a -1 for the patch? :D

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jul 21, 2017 at 1:05 AM, Postmaster
<po...@cienacorp.onmicrosoft.com> wrote:
> This message was created automatically by mail delivery software. Your email message was not delivered as is to the intended recipients because malware was detected in one or more attachments included with it. All attachments were deleted.
>
> --- Additional Information ---:
[]
>
> Detections found:
> defer_linger_chain.patch         JS/Jasobfus.A!ml

wtf?

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jul 14, 2017 at 9:52 PM, Yann Ylavic <yl...@gmail.com> wrote:
>
> So overall, this patch may introduce the need for more workers than
> before, what was (wrongly) done by the listener thread has to be done
> somewhere anyway...

That patch didn't work (as reported by Stefan Priebe) and I now don't
feel the need to debug it further, see below.

>
> Finally, I think there is room for improvements like batching
> shutdowns in the same worker if there is no objection on the approach
> so far.

That's the way to go IMO, here is a new patch which is much simpler
and more effective, I think.

The idea is that when nonblocking is required (i.e. in the listener),
connections to flush and close are atomically pushed/popped to/from a
chain (linked list) by the listener/some worker.

So start_lingering_close_nonblocking() simply fills the chain (this is
atomic/nonblocking), and any worker thread which is done with its
current connection will empty the chain while calling
start_lingering_close_blocking() for each connection.
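
The push/pop scheme described above can be sketched as follows. This is
only an illustration under stated assumptions, not the patch itself: it
uses C11 atomics rather than APR's atomic API, and struct conn,
chain_push(), chain_pop_all(), chain_drain() and mark_closed() are
hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection awaiting its deferred close. */
struct conn {
    int id;
    struct conn *next;
};

/* Head of the deferred chain; NULL means empty. */
static _Atomic(struct conn *) defer_chain = NULL;

/* Listener side: push without blocking (lock-free compare-and-swap loop). */
void chain_push(struct conn *c)
{
    struct conn *head = atomic_load(&defer_chain);
    do {
        c->next = head;   /* on CAS failure, head holds the current value */
    } while (!atomic_compare_exchange_weak(&defer_chain, &head, c));
}

/* Worker side: detach the whole chain with a single atomic exchange. */
struct conn *chain_pop_all(void)
{
    return atomic_exchange(&defer_chain, NULL);
}

/* Worker side: run the (possibly blocking) close on every detached
 * connection and return how many were handled. */
int chain_drain(void (*close_blocking)(struct conn *))
{
    int n = 0;
    struct conn *c = chain_pop_all();
    while (c != NULL) {
        struct conn *next = c->next;  /* read before the callback may free c */
        close_blocking(c);
        c = next;
        n++;
    }
    return n;
}

/* Hypothetical callback standing in for the blocking lingering close. */
void mark_closed(struct conn *c)
{
    c->id = -c->id;
}
```

Detaching the whole list at once keeps the worker side to one atomic
operation, and the listener never waits on a lock held by a worker that
is blocked in a flush.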

To prevent starvation of deferred lingering closes, the listener may
create a worker at the end of its loop, when/if the chain is (fully)
filled.

While the previous patch potentially induced some overhead in the
number of workers and thread context switches, I think this new one is
much better in this regard.

What do you think of it?

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Eric Covener <co...@gmail.com>.

> Also, it seems that in the deferred lingering case we should probably
> shorten the socket timeout before calling (and possibly blocking on)
> ap_start_lingering_close()'s hooks/flush, since we likely come from a
> time-up already...

+1

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Eric Covener <co...@gmail.com>.

> So, should we favor the draining of defer_linger_chain as much workers
> as necessary like the current patch, or should we have as few workers
> as possible and not start new workers in loops with no effect on
> defer_linger_chain?

I think the fewer workers option could lead to hard to debug (from an
end user POV) intermittent problems with clients far back in the queue
who see a FIN delayed by someone ahead in the queue.  They may send
subsequent requests in the meantime that will spin.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jul 21, 2017 at 3:07 PM, Luca Toscano <to...@gmail.com> wrote:
>>
>> To prevent starvation of deferred lingering closes, the listener may
>> create a worker at the end of its loop, when/if the chain is (fully)
>> filled.
>
> IIUC the trick is to run "(have_idle_worker && push2worker(NULL) ==
> APR_SUCCESS)" that reserves a worker that in turn eventually checks the
> defer_linger_chain as part of its new code.

That will be a dedicated worker: it won't handle a connection first, it
goes straight to flushing/closing the ones in defer_linger_chain
(thanks to the NULL arg).

> This also seems to leverage
> workers_were_busy that will prioritize lingering closes over ka connections
> right? (or at least try to)

workers_were_busy is per loop (zeroed), so the new (late) get_worker
which updates it won't affect the next loop (and since there will be a
poll() in between this get_worker and the next time we'll have to
decide whether to kill KA connections or not, it looks reasonable to
not depend on the previous loop anyway).

Regarding this part, though :

        if (defer_linger_chain) {
            get_worker(&have_idle_worker, 0, &workers_were_busy);
            if (have_idle_worker && push2worker(NULL) == APR_SUCCESS) {
                have_idle_worker = 0;
            }
        }

I think we could be smarter and possibly not create a worker if no
lingering close was chained in *this* loop.
With the current patch we would create one if, for example, a new
connection is accepted before the worker in charge of the lingering
closes (from the previous loop) has finished its work.
Here defer_linger_chain was not filled by the current loop, but not
completely emptied either because the worker is blocking on its first
connections to shut down, hence != NULL.

So, should we favor the draining of defer_linger_chain with as many
workers as necessary, like the current patch does, or should we have as
few workers as possible and not start new workers in loops with no
effect on defer_linger_chain?
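
The two options can be put side by side in a small sketch. The struct,
enum and function names are hypothetical, and the flags stand in for
state the listener loop would have to track; none of this is taken from
the patch.

```c
#include <stdbool.h>

/* Hypothetical per-iteration state of the listener loop. */
struct loop_state {
    bool chain_nonempty;     /* defer_linger_chain != NULL at end of loop */
    bool chained_this_loop;  /* this iteration pushed at least one close */
};

enum linger_policy {
    DRAIN_EAGERLY,   /* current patch: ask for a worker whenever chain != NULL */
    FEWER_WORKERS    /* alternative: only if this loop added to the chain */
};

/* Should the listener request a dedicated worker at the end of this loop? */
bool want_linger_worker(const struct loop_state *s, enum linger_policy p)
{
    if (!s->chain_nonempty) {
        return false;            /* nothing to flush/close */
    }
    if (p == DRAIN_EAGERLY) {
        return true;
    }
    return s->chained_this_loop; /* a previous worker may still be draining */
}
```

The only case where the two policies differ is a non-empty chain that
the current loop did not add to, which is exactly the scenario described
above.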

Also, it seems that in the deferred lingering case we should probably
shorten the socket timeout before calling (and possibly blocking on)
ap_start_lingering_close()'s hooks/flush, since we likely come from a
time-up already...

Thoughts?

Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Luca Toscano <to...@gmail.com>.

Hi Yann,

2017-07-21 1:05 GMT+02:00 Yann Ylavic <yl...@gmail.com>:

> On Fri, Jul 14, 2017 at 9:52 PM, Yann Ylavic <yl...@gmail.com> wrote:
> >
> > So overall, this patch may introduce the need for more workers than
> > before, what was (wrongly) done by the listener thread has to be done
> > somewhere anyway...
>
> That patch didn't work (as reported by Stefan Priebe) and I now don't
> feel the need to debug it further, see below.
>
> >
> > Finally, I think there is room for improvements like batching
> > shutdowns in the same worker if there is no objection on the approach
> > so far.
>
> That's the way to go IMO, here is a new patch which is much simpler
> and effective I think.
>

It is indeed much simpler to follow even for people like me with not enough
experience in mpm-event's code :)


> The idea is that when nonblocking is required (i.e. in the listener),
> connections to flush and close are atomically pushed/popped to/from a
> chain (linked list) by the listener/some worker.
>
> So start_lingering_close_nonblocking() simply fills the chain (this is
> atomic/nonblocking), and any worker thread which is done with its
> current connection will empty the chain while calling
> start_lingering_close_blocking() for each connection.
>
> To prevent starvation of deferred lingering closes, the listener may
> create a worker at the end of its loop, when/if the chain is (fully)
> filled.
>

IIUC the trick is to run "(have_idle_worker && push2worker(NULL) ==
APR_SUCCESS)" that reserves a worker that in turn eventually checks the
defer_linger_chain as part of its new code. This also seems to leverage
workers_were_busy that will prioritize lingering closes over ka connections
right? (or at least try to)



> While the previous patch potentially induced some overhead in the
> number of workers and thread context switches, I think this new one is
> much better in this regard.
>
> What do you think of it?
>

I like it, from a first look I'd +1 it but as said above I am not really
authoritative for mpm-event's code :)

Let's see how Stefan's tests go!

Thanks for this work.

Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jul 14, 2017 at 9:52 PM, Yann Ylavic <yl...@gmail.com> wrote:
>
> So overall, this patch may introduce the need for more workers than
> before, what was (wrongly) done by the listener thread has to be done
> somewhere anyway...

That patch didn't work (as reported by Stefan Priebe) and I now don't
feel the need to debug it further, see below.

>
> Finally, I think there is room for improvements like batching
> shutdowns in the same worker if there is no objection on the approach
> so far.

That's the way to go IMO, here is a new patch which is much simpler
and more effective, I think.

The idea is that when nonblocking is required (i.e. in the listener),
connections to flush and close are atomically pushed/popped to/from a
chain (linked list) by the listener/some worker.

So start_lingering_close_nonblocking() simply fills the chain (this is
atomic/nonblocking), and any worker thread which is done with its
current connection will empty the chain while calling
start_lingering_close_blocking() for each connection.

To prevent starvation of deferred lingering closes, the listener may
create a worker at the end of its loop, when/if the chain is (fully)
filled.

While the previous patch potentially induced some overhead in the
number of workers and thread context switches, I think this new one is
much better in this regard.

What do you think of it?

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <yl...@gmail.com> wrote:
> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>>
>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>
>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>> ssl_filter_io_shutdown().
>>
>> I think the issue starts before that.
>> ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
>> to this hook can perform all sort of long lasting blocking operations there.
>> While it can be argued that this would be a bug in the module I think the only safe way
>> is to have the whole start_lingering_close_nonblocking being executed by a worker thread.
>
> Correct, that'd be much simpler/safer indeed.
> We need a new SHUTDOWN state then, right?

Actually it was less simple than expected, and it has some caveats obviously...

The attached patch does not introduce a new state but reuses the
existing CONN_STATE_LINGER since it was not really considered by the
listener thread (which uses CONN_STATE_LINGER_NORMAL and
CONN_STATE_LINGER_SHORT instead), but that's a detail.

Mainly, start_lingering_close_nonblocking() now simply schedules a
shutdown (i.e. pre_close_connection() followed by immediate close)
that will be run by a worker thread.
A new shutdown_linger_q is created/handled (with the same timeout as
the short_linger_q, namely 2 seconds) to hold connections to be
shut down.
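
How such a queue behaves can be sketched as below. This is an
illustration only, not the patch: the types and function names are
hypothetical, time is a plain seconds counter, and it assumes a queue
with a single timeout so that insertion order equals expiry order.

```c
#include <stddef.h>

#define SHUTDOWN_LINGER_TIMEOUT_SEC 2   /* same as short_linger_q, per above */

/* Hypothetical queued connection. */
struct entry {
    long queued_at;          /* seconds on a hypothetical clock */
    struct entry *next;
};

/* Hypothetical FIFO queue with one timeout for all entries. */
struct timeout_queue {
    struct entry *head;
    struct entry *tail;
    long timeout;            /* seconds */
};

/* Append at the tail, stamping the entry with the current time. */
void tq_append(struct timeout_queue *q, struct entry *e, long now)
{
    e->queued_at = now;
    e->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = e;
    } else {
        q->head = e;
    }
    q->tail = e;
}

/* Detach every expired entry from the head and return how many expired.
 * Stops at the first entry that is still within its timeout. */
int tq_expire(struct timeout_queue *q, long now)
{
    int n = 0;
    while (q->head != NULL && q->head->queued_at + q->timeout <= now) {
        struct entry *e = q->head;
        q->head = e->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        e->next = NULL;      /* here the real code would hand e to a worker */
        n++;
    }
    return n;
}
```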

So now when a connection times out in the write_completion or
keepalive queues, it needs (i.e. the listener may wait for) an
available worker to process its shutdown/close.
This means we can *not* close kept-alive connections immediately like
before when running short of workers, which will favor active KA
connections over new ones in this case (I don't think it's that
serious, but the previous code was taking care of that. For me it's up
to the admin to size the workers appropriately...).

Same when a connection in the shutdown_linger_q itself times out: the
patch will require a worker immediately to do the job (see the
shutdown_lingering_close() callback).

So overall, this patch may introduce the need for more workers than
before, but what was (wrongly) done by the listener thread has to be
done somewhere anyway...

Finally, I think there is room for improvements like batching
shutdowns in the same worker if there is no objection on the approach
so far.

WDYT?

Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Luca Toscano <to...@gmail.com>.

Hi Yann and Ruediger,

2c from a mpm-event newbie inline:

2017-06-30 13:33 GMT+02:00 Yann Ylavic <yl...@gmail.com>:

> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
> >
> > On 06/30/2017 12:18 PM, Yann Ylavic wrote:
> >>
> >> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
> >> modssl_smart_shutdown(), only in the "abortive" mode of
> >> ssl_filter_io_shutdown().
> >
> > I think the issue starts before that.
> > ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
> > to this hook can perform all sort of long lasting blocking operations there.
> > While it can be argued that this would be a bug in the module I think the only safe way
> > is to have the whole start_lingering_close_nonblocking being executed by a worker thread.
>

This makes a lot of sense and I agree, but at the same time I feel that it
would be really great not to move lingering close responsibilities away
from the listener. As far as we know mod_ssl is the only one that shows
this "buggy" behavior, would it make sense to attempt to "fix it" now and
postpone the decision about pushing start_lingering_close_nonblocking to a
worker thread?

> Correct, that'd be much simpler/safer indeed.
> We need a new SHUTDOWN state then, right?
>

IIUC in each case that the listener calls start_lingering_close_nonblocking
we'd need to set the connection to SHUTDOWN, check if there is a free
worker and push2worker the connection to it right?

Thanks!

Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <rp...@apache.org> wrote:
>
> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>
>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>> modssl_smart_shutdown(), only in the "abortive" mode of
>> ssl_filter_io_shutdown().
>
> I think the issue starts before that.
> ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
> to this hook can perform all sort of long lasting blocking operations there.
> While it can be argued that this would be a bug in the module I think the only safe way
> is to have the whole start_lingering_close_nonblocking being executed by a worker thread.

Correct, that'd be much simpler/safer indeed.
We need a new SHUTDOWN state then, right?


Regards,
Yann.

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Ruediger Pluem <rp...@apache.org>.


On 06/30/2017 12:18 PM, Yann Ylavic wrote:
> Hi Luca,
> 
> [better/easier to talk about details on dev@]
> 
> On Fri, Jun 30, 2017 at 11:05 AM,  <bu...@apache.org> wrote:
>> https://bz.apache.org/bugzilla/show_bug.cgi?id=60956
>>
>> --- Comment #11 from Luca Toscano <to...@gmail.com> ---
>> Other two interesting trunk improvements that have not been backported yet:
>>
>> http://svn.apache.org/viewvc?view=revision&revision=1706669
>> http://svn.apache.org/viewvc?view=revision&revision=1734656
>>
>> IIUC these ones are meant to provide a more async behavior to most of the
>> output filters, namely setting aside buckets (on the heap) to avoid blocking.
> 
> These are quite orthogonal I think, and don't seem to fix this particular issue.
> 
>>
>> After a bit of thinking it seems to me that we'd need to find a solution that
>> prevents the mod_ssl_output filter to block, but in a safe way.
> 
> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
> modssl_smart_shutdown(), only in the "abortive" mode of
> ssl_filter_io_shutdown().

I think the issue starts before that.
ap_prep_lingering_close calls the pre_close_connection hook and modules that are registered
to this hook can perform all sort of long lasting blocking operations there.
While it can be argued that this would be a bug in the module I think the only safe way
is to have the whole start_lingering_close_nonblocking being executed by a worker thread.

Regards

Rüdiger

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

On Fri, Jun 30, 2017 at 12:52 PM, Luca Toscano <to...@gmail.com> wrote:
>
> 2017-06-30 12:18 GMT+02:00 Yann Ylavic <yl...@gmail.com>:
>> >
>> > http://svn.apache.org/viewvc?view=revision&revision=1706669
>> > http://svn.apache.org/viewvc?view=revision&revision=1734656
>> >
>> > IIUC these ones are meant to provide a more async behavior to most of
>> > the
>> > output filters, namely setting aside buckets (on the heap) to avoid
>> > blocking.
>>
>> These are quite orthogonal I think, and don't seem to fix this particular
>> issue.
>
> Sorry for the noise, I am still reviewing those to understand what they
> really do and added them as reference to the task.

NP, it's not really noise either, because with my proposed change
they'd imply a different fix for 2.4 and trunk.

>>
>> With a possibly non-blocking modssl_smart_shutdown(), I think we could
>> make ap_shutdown_conn(c, 0) return something like APR_INCOMPLETE for
>> the case the shutdown was "buffered" in the output filter stack (e.g.
>> core output filter).
>>
>> In mpm_event, we would then go to (or stay in) the WRITE_COMPLETION
>> state instead of LINGER, until every remaining piece data is flushed
>> successfully.
>
> IIUC in this case the listener is calling
> process_timeout_queue(write_completion_q, ..), so the conn has already been
> in the WRITE_COMPLETION state for Timeout seconds and needs to be closed.
> Your suggestion would be to still force the listener to call
> ap_shutdown_conn(c, 0), but a "smarter" version that eventually returns
> APR_INCOMPLETE rather than blocking. Then the listener could put the
> connection again in the WRITE_COMPLETION queue, and wait for the client to
> unblock and read the missing bytes about close-notify (or just let Timeout
> seconds to pass, forcing the listener to shutdown the socket and be done
> with it).

We'd use a shorter timeout but that was the idea yes.

But never mind, I actually like Rüdiger's idea better: it is less invasive
(I think) and more bulletproof. Continued there...

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Luca Toscano <to...@gmail.com>.

Hi Yann!

2017-06-30 12:18 GMT+02:00 Yann Ylavic <yl...@gmail.com>:

> Hi Luca,
>
> [better/easier to talk about details on dev@]
>
> On Fri, Jun 30, 2017 at 11:05 AM,  <bu...@apache.org> wrote:
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=60956
> >
> > --- Comment #11 from Luca Toscano <to...@gmail.com> ---
> > Other two interesting trunk improvements that have not been backported
> yet:
> >
> > http://svn.apache.org/viewvc?view=revision&revision=1706669
> > http://svn.apache.org/viewvc?view=revision&revision=1734656
> >
> > IIUC these ones are meant to provide a more async behavior to most of the
> > output filters, namely setting aside buckets (on the heap) to avoid
> blocking.
>
> These are quite orthogonal I think, and don't seem to fix this particular
> issue.
>

Sorry for the noise, I am still reviewing those to understand what they
really do and added them as reference to the task.
I thought that they might have been useful while thinking about a solution,
since from my (ignorant :) point of view the fact that ap_core_output_filter
was blocked by mod_ssl's BIO_flush was somehow pointing me to those
commits.

Will skip these notes in the future, they might be confusing!

>
> [..]
>

> With a possibly non-blocking modssl_smart_shutdown(), I think we could
> make ap_shutdown_conn(c, 0) return something like APR_INCOMPLETE for
> the case the shutdown was "buffered" in the output filter stack (e.g.
> core output filter).
>
> In mpm_event, we would then go to (or stay in) the WRITE_COMPLETION
> state instead of LINGER, until every remaining piece data is flushed
> successfully.
>

IIUC in this case the listener is calling
process_timeout_queue(write_completion_q, ..), so the conn has already been
in the WRITE_COMPLETION state for Timeout seconds and needs to be closed.
Your suggestion would be to still force the listener to
call ap_shutdown_conn(c, 0), but a "smarter" version that eventually
returns APR_INCOMPLETE rather than blocking. Then the listener could put
the connection again in the WRITE_COMPLETION queue, and wait for the client
to unblock and read the missing bytes about close-notify (or just let
Timeout seconds to pass, forcing the listener to shutdown the socket and be
done with it).

You are free to trash the email if it doesn't make sense, I am probably
missing too many important pieces of the puzzle :)

Thanks for the help!

Luca

Re: [Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by Yann Ylavic <yl...@gmail.com>.

Hi Luca,

[better/easier to talk about details on dev@]

On Fri, Jun 30, 2017 at 11:05 AM,  <bu...@apache.org> wrote:
> https://bz.apache.org/bugzilla/show_bug.cgi?id=60956
>
> --- Comment #11 from Luca Toscano <to...@gmail.com> ---
> Other two interesting trunk improvements that have not been backported yet:
>
> http://svn.apache.org/viewvc?view=revision&revision=1706669
> http://svn.apache.org/viewvc?view=revision&revision=1734656
>
> IIUC these ones are meant to provide a more async behavior to most of the
> output filters, namely setting aside buckets (on the heap) to avoid blocking.

These are quite orthogonal I think, and don't seem to fix this particular issue.

>
> After a bit of thinking it seems to me that we'd need to find a solution that
> prevents the mod_ssl_output filter to block, but in a safe way.

IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
modssl_smart_shutdown(), only in the "abortive" mode of
ssl_filter_io_shutdown().

The caller of the ssl output filters should decide when to flush, so
http://svn.apache.org/r1651077 was not the right fix in this regard.

Even if we don't BIO_flush, openssl shouldn't retain the close-notify
by itself, so at least it should go down to the next/core output filter
(and stay there until the socket is writable or a flush is requested).

>
> In this particular case we assume this about start_lingering_close_nonblocking:
>
> """
> /*
>  * Close our side of the connection, NOT flushing data to the client.
>  * This should only be called if there has been an error or if we know
>  * that our send buffers are empty.
>  * Pre-condition: cs is not in any timeout queue and not in the pollset,
>  *                timeout_mutex is not locked
>  * return: 0 if connection is fully closed,
>  *         1 if connection is lingering
>  * may be called by listener thread
>  */
> """
>
> I tried the following patch:
>
> """
> Index: server/mpm/event/event.c
> ===================================================================
> --- server/mpm/event/event.c    (revision 1800362)
> +++ server/mpm/event/event.c    (working copy)
> @@ -744,10 +744,7 @@
>      conn_rec *c = cs->c;
>      apr_socket_t *csd = cs->pfd.desc.s;
>
> -    if (ap_prep_lingering_close(c)
> -        || c->aborted
> -        || ap_shutdown_conn(c, 0) != APR_SUCCESS || c->aborted
> -        || apr_socket_shutdown(csd, APR_SHUTDOWN_WRITE) != APR_SUCCESS) {
> +    if (ap_prep_lingering_close(c) || c->aborted) {
>          apr_socket_close(csd);
>          ap_push_pool(worker_queue_info, cs->p);
>          if (dying)
> """
>
> So the idea was to brutally close the connection only if
> ap_prep_lingering_close(c) is not 0 or if the client has already aborted, but
> to leave all the other cases to the start_lingering_close_common. This is
> probably not enough/correct because the connection would go into the
> lingering_close queue, to be picked up again by
> process_timeout_queue(linger_q,..) after the timeout that would call
> stop_lingering_close, that would in turn simply close the socket without giving
> the possibility to mod_ssl to flush its close-notify (because no
> ap_shutdown_conn would be called).

With a possibly non-blocking modssl_smart_shutdown(), I think we could
make ap_shutdown_conn(c, 0) return something like APR_INCOMPLETE for
the case the shutdown was "buffered" in the output filter stack (e.g.
core output filter).

In mpm_event, we would then go to (or stay in) the WRITE_COMPLETION
state instead of LINGER, until every remaining piece of data is flushed
successfully.
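
To illustrate the intended transition only (this is not httpd code; the
enum and function names are hypothetical stand-ins, with INC_INCOMPLETE
playing the role an APR_INCOMPLETE return from ap_shutdown_conn() would
have, and the "close on any other error" branch being my assumption):

```c
/* Hypothetical status codes returned by a non-blocking shutdown attempt. */
typedef enum {
    INC_SUCCESS,     /* close-notify fully written to the socket */
    INC_INCOMPLETE,  /* close-notify buffered in the output filter stack */
    INC_ERROR        /* anything else, e.g. the client aborted */
} inc_status_e;

/* Hypothetical connection states after the shutdown attempt. */
typedef enum {
    INC_WRITE_COMPLETION,  /* keep polling for writability */
    INC_LINGER,            /* shutdown sent, start the lingering close */
    INC_CLOSED             /* give up and close the socket */
} inc_state_e;

/* Pick the next connection state from the shutdown status. */
static inc_state_e inc_next_state(inc_status_e rv)
{
    switch (rv) {
    case INC_SUCCESS:
        return INC_LINGER;
    case INC_INCOMPLETE:
        return INC_WRITE_COMPLETION;
    default:
        return INC_CLOSED;
    }
}
```

The listener would never wait in this scheme: a buffered shutdown just
sends the connection back to the pollset until the socket is writable.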

Does this sound OK?

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #11 from Luca Toscano <to...@gmail.com> ---
Two other interesting trunk improvements that have not been backported yet:

http://svn.apache.org/viewvc?view=revision&revision=1706669
http://svn.apache.org/viewvc?view=revision&revision=1734656

IIUC these ones are meant to provide a more async behavior to most of the
output filters, namely setting aside buckets (on the heap) to avoid blocking.

After a bit of thinking it seems to me that we'd need to find a solution that
prevents the mod_ssl output filter from blocking, but in a safe way.

In this particular case we assume this about start_lingering_close_nonblocking:

"""
/*
 * Close our side of the connection, NOT flushing data to the client.
 * This should only be called if there has been an error or if we know
 * that our send buffers are empty.
 * Pre-condition: cs is not in any timeout queue and not in the pollset,
 *                timeout_mutex is not locked
 * return: 0 if connection is fully closed,
 *         1 if connection is lingering
 * may be called by listener thread
 */
"""

I tried the following patch:

"""
Index: server/mpm/event/event.c
===================================================================
--- server/mpm/event/event.c    (revision 1800362)
+++ server/mpm/event/event.c    (working copy)
@@ -744,10 +744,7 @@
     conn_rec *c = cs->c;
     apr_socket_t *csd = cs->pfd.desc.s;

-    if (ap_prep_lingering_close(c)
-        || c->aborted
-        || ap_shutdown_conn(c, 0) != APR_SUCCESS || c->aborted
-        || apr_socket_shutdown(csd, APR_SHUTDOWN_WRITE) != APR_SUCCESS) {
+    if (ap_prep_lingering_close(c) || c->aborted) {
         apr_socket_close(csd);
         ap_push_pool(worker_queue_info, cs->p);
         if (dying)
"""

So the idea was to brutally close the connection only if
ap_prep_lingering_close(c) returns non-zero or if the client has already
aborted, and to leave all other cases to start_lingering_close_common. This is
probably not enough/correct: the connection would go into the
lingering_close queue and be picked up again by
process_timeout_queue(linger_q,..) after the timeout, which would call
stop_lingering_close. That in turn would simply close the socket without
giving mod_ssl the chance to flush its close-notify (because
ap_shutdown_conn would never be called).

Still looking for a better solution :)

-- 
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-unsubscribe@httpd.apache.org
For additional commands, e-mail: bugs-help@httpd.apache.org

[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Luca Toscano <to...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |toscano.luca@gmail.com


[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35159|0                           |1
        is obsolete|                            |

--- Comment #16 from Yann Ylavic <yl...@gmail.com> ---
Created attachment 35160
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35160&action=edit
Defer nonblocking lingering close to workers (v5)

Same as v4, plus allowing timer threads (trunk only) to handle deferred
lingering closes after their work.


[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #5 from Frank Meier <fr...@ergon.ch> ---
I've attached my test code and httpd.conf file, in case anyone would like to
reproduce the issue.

Steps to reproduce:
* start httpd with attached mod_gendata and httpd.conf

* start a shell script loop showing the stacks of the httpd process and its TCP
connections:
 -----------------------------
 $ while true; do gstack "${HTTPD_PID}"; netstat -ntp | grep 127.0.0.1:10443; echo ----------; sleep 1; done
 -----------------------------

* start the client with the right amount of data to trigger write completion
and fill up the TCP pipeline
 -----------------------------
 $ ./openssl-test 127.0.0.1 10443 '/gendata/?nBytes=850000' 0 1000
 -----------------------------

* after 60s you should see an output like this:
----------------
Thread 4 (Thread 0x7fde1d151700 (LWP 6960)):
#0  0x00007fde1fb4e4ed in poll () from /lib64/libc.so.6
#1  0x00007fde20890dc3 in poll (__timeout=<optimized out>, __nfds=<optimized
out>, __fds=0x7fde1d150a90) at /usr/include/bits/poll2.h:46
#2  apr_poll (aprset=aprset@entry=0x7fde1d150b30, num=num@entry=1,
nsds=nsds@entry=0x7fde1d150b24, timeout=<optimized out>) at
poll/unix/poll.c:120
#3  0x000055780168d8f5 in send_brigade_blocking (c=0x7fde180077b8,
bytes_written=0x7fde100057e8, bb=0x7fde18007c60, s=0x7fde18007520) at
core_filters.c:747
#4  ap_core_output_filter (f=0x7fde100056d8, new_bb=0x7fde18007c60) at
core_filters.c:542
#5  0x00007fde1f639648 in bio_filter_out_pass (outctx=0x7fde18007c40) at
ssl_engine_io.c:139
#6  0x00007fde1f63b558 in bio_filter_out_flush (bio=<optimized out>) at
ssl_engine_io.c:160
#7  0x00007fde1f63b58f in bio_filter_out_ctrl (bio=<optimized out>,
cmd=<optimized out>, num=<optimized out>, ptr=<optimized out>) at
ssl_engine_io.c:266
#8  0x00007fde1f64b14b in modssl_smart_shutdown (ssl=ssl@entry=0x7fde1000cfe0)
at ssl_util_ssl.c:145
#9  0x00007fde1f63b741 in ssl_filter_io_shutdown (c=0x7fde180077b8,
abortive=abortive@entry=0, filter_ctx=0x7fde18007bc0) at ssl_engine_io.c:1023
#10 0x00007fde1f63cdf1 in ssl_io_filter_output (f=0x7fde18007c18,
bb=0x7fde10006110) at ssl_engine_io.c:1691
#11 0x00007fde1f639d6a in ssl_io_filter_coalesce (f=0x7fde18007bf0,
bb=0x7fde10006110) at ssl_engine_io.c:1648
#12 0x000055780169c3d3 in ap_shutdown_conn (c=c@entry=0x7fde180077b8,
flush=flush@entry=0) at connection.c:88
#13 0x00007fde1f866d62 in start_lingering_close_nonblocking (cs=0x7fde18007728)
at event.c:910
#14 0x00007fde1f86618c in process_timeout_queue (q=0x55780311ae98,
timeout_time=timeout_time@entry=1494322580641883,
func=func@entry=0x7fde1f866d10 <start_lingering_close_nonblocking>) at
event.c:1509
#15 0x00007fde1f86851f in listener_thread (thd=0x5578030c9e08, dummy=<optimized
out>) at event.c:1834
#16 0x00007fde20014444 in start_thread () from /lib64/libpthread.so.0
#17 0x00007fde1fb575ed in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7fde1d952700 (LWP 6959)):
#0  0x00007fde2001a02f in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x00007fde208887cd in apr_thread_cond_wait (cond=<optimized out>,
mutex=<optimized out>) at locks/unix/thread_cond.c:68
#2  0x00007fde1f86b005 in ap_queue_pop_something (queue=0x5578030c99a0,
sd=0x7fde1d951e70, ecs=0x7fde1d951e78, p=0x7fde1d951e80, te_out=0x7fde1d951e88)
at fdqueue.c:438
#3  0x00007fde1f86700f in worker_thread (thd=<optimized out>, dummy=<optimized
out>) at event.c:1921
#4  0x00007fde20014444 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fde1fb575ed in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7fde1e153700 (LWP 6958)):
#0  0x00007fde2001a02f in pthread_cond_wait () from /lib64/libpthread.so.0
#1  0x00007fde208887cd in apr_thread_cond_wait (cond=<optimized out>,
mutex=<optimized out>) at locks/unix/thread_cond.c:68
#2  0x00007fde1f86b005 in ap_queue_pop_something (queue=0x5578030c99a0,
sd=0x7fde1e152e70, ecs=0x7fde1e152e78, p=0x7fde1e152e80, te_out=0x7fde1e152e88)
at fdqueue.c:438
#3  0x00007fde1f86700f in worker_thread (thd=<optimized out>, dummy=<optimized
out>) at event.c:1921
#4  0x00007fde20014444 in start_thread () from /lib64/libpthread.so.0
#5  0x00007fde1fb575ed in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7fde2131f780 (LWP 6956)):
#0  0x00007fde2001d0cd in read () from /lib64/libpthread.so.0
#1  0x000055780169f737 in read (__nbytes=1, __buf=0x7ffcb3cc3563,
__fd=<optimized out>) at /usr/include/bits/unistd.h:44
#2  ap_mpm_podx_check (pod=<optimized out>) at mpm_unix.c:535
#3  0x00007fde1f864abb in child_main (child_num_arg=child_num_arg@entry=0,
child_bucket=child_bucket@entry=0) at event.c:2368
#4  0x00007fde1f8698e5 in make_child (s=0x5578030c82d8, slot=slot@entry=0,
bucket=0) at event.c:2461
#5  0x00007fde1f86997c in startup_children (number_to_start=1) at event.c:2490
#6  0x00007fde1f86a63f in event_run (_pconf=<optimized out>,
plog=0x5578030cd4a8, s=0x5578030c82d8) at event.c:2857
#7  0x0000557801675f1e in ap_run_mpm (pconf=0x55780309e138,
plog=0x5578030cd4a8, s=0x5578030c82d8) at mpm_common.c:94
#8  0x000055780166ef25 in main (argc=3, argv=0x7ffcb3cc3998) at main.c:783
tcp      111      0 127.0.0.1:43112         127.0.0.1:10443         ESTABLISHED 28563/./Debug/opens
tcp6       0 815616 127.0.0.1:10443         127.0.0.1:43112         ESTABLISHED 6956/httpd
----------------

* the listener_thread stays in this state for another 60s, and does not handle
HTTP requests anymore
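
The underlying effect (a poll() for writability that can only time out
because the peer never drains its receive buffer) can be shown in isolation.
The sketch below is not httpd code and makes several simplifying
assumptions: it uses a local UNIX socketpair instead of TCP/TLS, the
function name is made up, and the timeout is passed in rather than taken
from the Timeout directive.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill a socket whose peer never reads, then wait for it to become
 * writable. Returns poll()'s result (0 means the wait timed out), or -1
 * if the setup failed. */
static int wait_writable_on_full_socket(int timeout_ms)
{
    int sv[2];
    char buf[4096] = {0};
    struct pollfd pfd;
    int flags, rc, err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Non-blocking mode so the fill loop ends with EAGAIN, not a hang. */
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* sv[1] plays the stalled client: nothing ever reads from it. */
    for (;;) {
        ssize_t n = write(sv[0], buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            err = errno;
            break;
        }
    }
    if (err != EAGAIN && err != EWOULDBLOCK) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Same kind of wait as in the trace above: it can only time out. */
    pfd.fd = sv[0];
    pfd.events = POLLOUT;
    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);

    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

A thread that makes such a call is unavailable for the whole timeout, which
is why doing it from the listener stalls every other connection.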


[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

Yann Ylavic <yl...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #35160|0                           |1
        is obsolete|                            |

--- Comment #18 from Yann Ylavic <yl...@gmail.com> ---
Comment on attachment 35160
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35160
Defer nonblocking lingering close to workers (v5)

Obsoleted by r1802875 (v6).


[Bug 60956] Event MPM listener thread may get blocked by SSL shutdowns

Posted by bu...@apache.org.

https://bz.apache.org/bugzilla/show_bug.cgi?id=60956

--- Comment #6 from Luca Toscano <to...@gmail.com> ---
Hi Frank,

thanks a lot for the super detailed report. I am wondering if you'd have the
patience to re-run your tests with httpd 2.4.25 with the following patch:

http://home.apache.org/~ylavic/patches/httpd-2.4.x-mpm_event-wakeup-v7.1.patch

It might not solve this particular problem but I'd be curious to know if
anything changes.

Thanks!
