You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@httpd.apache.org by Stefan Eissing <st...@greenbytes.de> on 2018/04/30 07:28:41 UTC

http2 connection keepalive broken

Yann,

with the change http://svn.apache.org/r1819214 the keepalive handling in HTTP/2 was broken, as reported here

https://github.com/icing/mod_h2/issues/160

If the mpm_event connection state handling is not suitable to honoring a keepalive setting, we need to fix this. Keeping idle connections parked in fdqueue without occupying a thread seems to be one of event's advantages. mod_http2 did enable this by keeping the connection in CONN_STATE_WRITE_COMPLETION. The change places it in CONN_STATE_LINGER which means it is closed after 1 second, always.

Since you know event's connection state handling much better than me: what can we do to fix this?

Cheers,

Stefan

Re: http2 connection keepalive broken

Posted by Stefan Eissing <st...@greenbytes.de>.


> Am 30.04.2018 um 12:15 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Mon, Apr 30, 2018 at 12:06 PM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> 
>>> Am 30.04.2018 um 12:00 schrieb Yann Ylavic <yl...@gmail.com>:
>>> 
>>> There nothing wrong with returning CONN_STATE_WRITE_COMPLETION here
>>> like before, CONN_STATE_HANDLER is wrong though.
>>> So the previous code with s/CONN_STATE_HANDLER/CONN_STATE_LINGER/
>>> should restore the behaviour I think.
>> 
>> Will give it a try. However, I had already keepalive tests in the http2
>> test suite, but those do not fail. So, mpm_event's behaviour on when
>> it ditches idle connections is not really clear to me and other tests I
>> do not know of.
> 
> Do the tests check for e.g. 1s (linger) vs 5s (keepalive) timeout?
> 
>> 
>> Basically, I am a bit frustrated by mpm_event and its non-documentation.
>> Maybe I missed the documents that explain how its different states actually work?
> 
> I can try a formal documentation with a bit of my own idle time :)
> Meanwhile, this comment in process_socket() is the only one I'm aware of:
> 
>    /*
>     * The process_connection hooks above should set the connection state
>     * appropriately upon return, for event MPM to either:
>     * - do lingering close (CONN_STATE_LINGER),
>     * - wait for readability of the next request with respect to the keepalive
>     *   timeout (state CONN_STATE_CHECK_REQUEST_LINE_READABLE),
>     * - wait for read/write-ability of the underlying socket with respect to
>     *   its timeout by setting c->clogging_input_filters to 1 and the sense
>     *   to CONN_SENSE_WANT_READ/WRITE (state CONN_STATE_WRITE_COMPLETION),
>     * - keep flushing the output filters stack in nonblocking mode, and then
>     *   if required wait for read/write-ability of the underlying socket with
>     *   respect to its own timeout (state CONN_STATE_WRITE_COMPLETION); since
>     *   completion at some point may require reads (e.g. SSL_ERROR_WANT_READ),
>     *   an output filter can also set the sense to CONN_SENSE_WANT_READ at any
>     *   time for event MPM to do the right thing,
>     * - suspend the connection (SUSPENDED) such that it now interracts with
>     *   the MPM through suspend/resume_connection() hooks, and/or registered
>     *   poll callbacks (PT_USER), and/or registered timed callbacks triggered
>     *   by timer events.
>     * If a process_connection hook returns an error or no hook sets the state
>     * to one of the above expected value, we forcibly close the connection w/
>     * CONN_STATE_LINGER.  This covers the cases where no process_connection
>     * hook executes (DECLINED), or one returns OK w/o touching the state (i.e.
>     * CONN_STATE_READ_REQUEST_LINE remains after the call) which can happen
>     * with third-party modules not updated to work specifically with event MPM
>     * while this was expected to do lingering close unconditionally with
>     * worker or prefork MPMs for instance.
>     */
> 
> We can do better...

Thanks. I still keep wondering why my test case does not trigger the behaviour. Is
there a condition/event (like a new connection) that triggers the cleanup/linger/connection close 
handling? Because my client opens one, finishes the request and then checks that the
GOAWAY does not arrive for n seconds...but nothing else is happening in the server
in the meantime.

-Stefan

Re: http2 connection keepalive broken

Posted by Yann Ylavic <yl...@gmail.com>.

On Mon, Apr 30, 2018 at 12:06 PM, Stefan Eissing
<st...@greenbytes.de> wrote:
>
>> Am 30.04.2018 um 12:00 schrieb Yann Ylavic <yl...@gmail.com>:
>>
>> There nothing wrong with returning CONN_STATE_WRITE_COMPLETION here
>> like before, CONN_STATE_HANDLER is wrong though.
>> So the previous code with s/CONN_STATE_HANDLER/CONN_STATE_LINGER/
>> should restore the behaviour I think.
>
> Will give it a try. However, I had already keepalive tests in the http2
> test suite, but those do not fail. So, mpm_event's behaviour on when
> it ditches idle connections is not really clear to me and other tests I
> do not know of.

Do the tests check for e.g. 1s (linger) vs 5s (keepalive) timeout?

>
> Basically, I am a bit frustrated by mpm_event and its non-documentation.
> Maybe I missed the documents that explain how its different states actually work?

I can try a formal documentation with a bit of my own idle time :)
Meanwhile, this comment in process_socket() is the only one I'm aware of:

    /*
     * The process_connection hooks above should set the connection state
     * appropriately upon return, for event MPM to either:
     * - do lingering close (CONN_STATE_LINGER),
     * - wait for readability of the next request with respect to the keepalive
     *   timeout (state CONN_STATE_CHECK_REQUEST_LINE_READABLE),
     * - wait for read/write-ability of the underlying socket with respect to
     *   its timeout by setting c->clogging_input_filters to 1 and the sense
     *   to CONN_SENSE_WANT_READ/WRITE (state CONN_STATE_WRITE_COMPLETION),
     * - keep flushing the output filters stack in nonblocking mode, and then
     *   if required wait for read/write-ability of the underlying socket with
     *   respect to its own timeout (state CONN_STATE_WRITE_COMPLETION); since
     *   completion at some point may require reads (e.g. SSL_ERROR_WANT_READ),
     *   an output filter can also set the sense to CONN_SENSE_WANT_READ at any
     *   time for event MPM to do the right thing,
     * - suspend the connection (SUSPENDED) such that it now interracts with
     *   the MPM through suspend/resume_connection() hooks, and/or registered
     *   poll callbacks (PT_USER), and/or registered timed callbacks triggered
     *   by timer events.
     * If a process_connection hook returns an error or no hook sets the state
     * to one of the above expected value, we forcibly close the connection w/
     * CONN_STATE_LINGER.  This covers the cases where no process_connection
     * hook executes (DECLINED), or one returns OK w/o touching the state (i.e.
     * CONN_STATE_READ_REQUEST_LINE remains after the call) which can happen
     * with third-party modules not updated to work specifically with event MPM
     * while this was expected to do lingering close unconditionally with
     * worker or prefork MPMs for instance.
     */

We can do better...

Re: http2 connection keepalive broken

Posted by Stefan Eissing <st...@greenbytes.de>.


> Am 30.04.2018 um 12:00 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Mon, Apr 30, 2018 at 9:28 AM, Stefan Eissing <stefan.eissing@greenbytes.de
>> wrote:
>> 
>> with the change http://svn.apache.org/r1819214 the keepalive handling
>> in HTTP/2 was broken, as reported here
>> 
>> https://github.com/icing/mod_h2/issues/160
> 
> Oh crap, I thougth that the h2 connection was over at this point (with
> its own keepalive handling regardless of MPM).
> Sorry about that.

> 
>> 
>> If the mpm_event connection state handling is not suitable to
>> honoring a keepalive setting, we need to fix this. Keeping idle
>> connections parked in fdqueue without occupying a thread seems to be
>> one of event's advantages. mod_http2 did enable this by keeping the
>> connection in CONN_STATE_WRITE_COMPLETION. The change places it in
>> CONN_STATE_LINGER which means it is closed after 1 second, always.
>> 
>> Since you know event's connection state handling much better than me:
>> what can we do to fix this?
> 
> There nothing wrong with returning CONN_STATE_WRITE_COMPLETION here
> like before, CONN_STATE_HANDLER is wrong though.
> So the previous code with s/CONN_STATE_HANDLER/CONN_STATE_LINGER/
> should restore the behaviour I think.

Will give it a try. However, I had already keepalive tests in the http2 
test suite, but those do not fail. So, mpm_event's behaviour on when 
it ditches idle connections is not really clear to me and other tests I 
do not know of.

Basically, I am a bit frustrated by mpm_event and its non-documentation.
Maybe I missed the documents that explain how its different states actually work?

-Stefan

Re: http2 connection keepalive broken

Posted by Yann Ylavic <yl...@gmail.com>.

On Mon, Apr 30, 2018 at 9:28 AM, Stefan Eissing <stefan.eissing@greenbytes.de
> wrote:
>
> with the change http://svn.apache.org/r1819214 the keepalive handling
> in HTTP/2 was broken, as reported here
>
> https://github.com/icing/mod_h2/issues/160

Oh crap, I thougth that the h2 connection was over at this point (with
its own keepalive handling regardless of MPM).
Sorry about that.

>
> If the mpm_event connection state handling is not suitable to
> honoring a keepalive setting, we need to fix this. Keeping idle
> connections parked in fdqueue without occupying a thread seems to be
> one of event's advantages. mod_http2 did enable this by keeping the
> connection in CONN_STATE_WRITE_COMPLETION. The change places it in
> CONN_STATE_LINGER which means it is closed after 1 second, always.
>
> Since you know event's connection state handling much better than me:
> what can we do to fix this?

There nothing wrong with returning CONN_STATE_WRITE_COMPLETION here
like before, CONN_STATE_HANDLER is wrong though.
So the previous code with s/CONN_STATE_HANDLER/CONN_STATE_LINGER/
should restore the behaviour I think.

Regards,
Yann.