You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Stefan Eissing <st...@greenbytes.de> on 2018/09/19 14:10:49 UTC

trunk proxy_http failures

Hi,

the h2 test suite has tests with http/2 in the front and a standard proxy_http to localhost. When running the test suite against trunk, the tests run into a failed GET on a proxy. The request hangs 5 seconds and the fails reading the status line, as seen below. The proxy is defined as:

    <Proxy "balancer://https-local">
        BalancerMember "https://127.0.0.1:SUBST_PORT_HTTPS_SUBST" hcmethod=GET hcuri=/
    </Proxy>
    ...
    ProxyPass "/proxy" "balancer://http-local"
    ProxyPassReverse "/proxy" "balancer://http-local"

Typical log:
[Wed Sep 19 14:02:58.038602 2018] [http2:trace1] [pid 35806:tid 123145377832960] h2_task.c(675): [client 127.0.0.1:64231] h2_task(200-13): start process_request
[Wed Sep 19 14:02:58.038654 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1326): AH10122: proxy: Entering byrequests for BALANCER (balancer://https-local)
[Wed Sep 19 14:02:58.038679 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1418): AH10123: proxy: byrequests selected worker "https://127.0.0.1:12346" : busy 0 : lbstatus 100
[Wed Sep 19 14:02:58.038694 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1971): AH00924: worker https://127.0.0.1:12346 shared already initialized
[Wed Sep 19 14:02:58.038712 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2028): AH00926: worker https://127.0.0.1:12346 local already initialized
[Wed Sep 19 14:02:58.038736 2018] [proxy:debug] [pid 35806:tid 123145377832960] mod_proxy.c(1258): [client 127.0.0.1:64231] AH01143: Running scheme balancer handler (attempt 0)
[Wed Sep 19 14:02:58.038756 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2367): AH00942: HTTPS: has acquired connection for (127.0.0.1)
[Wed Sep 19 14:02:58.038775 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2421): [client 127.0.0.1:64231] AH00944: connecting https://127.0.0.1:12346/files/data-1m to 127.0.0.1:12346
[Wed Sep 19 14:02:58.038790 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2630): [client 127.0.0.1:64231] AH00947: connected /files/data-1m to 127.0.0.1:12346
[Wed Sep 19 14:03:03.041506 2018] [proxy_http:error] [pid 35806:tid 123145377832960] (70014)End of file found: [client 127.0.0.1:64231] AH01102: error reading status line from remote server 127.0.0.1:12346
[Wed Sep 19 14:03:03.041541 2018] [proxy_http:debug] [pid 35806:tid 123145377832960] mod_proxy_http.c(1386): [client 127.0.0.1:64231] AH01104: Closing connection to client because reading from backend server 127.0.0.1:12346 failed. Number of keepalives 1

What can I best do to further debug this?

Should I disable the balancer and see if the problem persists?

Help appreciated, thanks!

-Stefan

Re: trunk proxy_http failures

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 19.09.2018 um 17:12 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Wed, Sep 19, 2018 at 4:46 PM Stefan Eissing
> <st...@greenbytes.de> wrote:
>> 
>> Thanks, Yann, this helped me pin the problem down further:
>> 
>> - with disablereuse=on everything works fine
>> - with ttl=1 the problem is still there
> 
> Is the KeepAliveTimeout on the backend side above 1 (second)?

Yes, upped it to 20, no difference.

>> 
>> and then:
>> - with mpm_worker, the problem also disappears (no disable/ttl needed)
> 
> Hmm, something (new?) which does not respect KA timeout on MPM event...
> Can you confirm this (for instance with tcpdump or alike)?

I will debug more tomorrow. As usual, timing seems to play a role. Basically there
is a sequence of 3 requests in play which repeat with different content:
1. POST (no expect), small request body
2. POST (expect-100-cont), upload.py body is 1k/10k/100k/ file
3. GET on files/data-1k etc. 

The requests are done in this order, not parallel. All on a new connection.

If a request fails, it is always 3 and always with proxy re-using a dead connection.
So, assuming it is the same proxy connection that gets re-used, what may cause the
connection to close after request 2?

That is runs on mpm_worker *may* point to mpm_event. But with "LogLevel core:trace8"
it seems to disappear in event, also. With core:trace6 it still happens...

Strange thing.

Btw. Solving this is not urgent for me. I see this only in trunk.

Let's see if I gain more insight tomorrow on this.

Cheers,

Stefan

> 
>> 
>> These tests were running since the dawn of h2 time and are still running in 2.4.x. Since the problem also goes away on worker, this looks like a new problem with mpm_event and connection closing (keepalive)?
> 
> I'm having a look at it, will come back if something pops up.


Re: trunk proxy_http failures

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Sep 19, 2018 at 4:46 PM Stefan Eissing
<st...@greenbytes.de> wrote:
>
> Thanks, Yann, this helped me pin the problem down further:
>
> - with disablereuse=on everything works fine
> - with ttl=1 the problem is still there

Is the KeepAliveTimeout on the backend side above 1 (second)?

>
> and then:
> - with mpm_worker, the problem also disappears (no disable/ttl needed)

Hmm, something (new?) which does not respect KA timeout on MPM event...
Can you confirm this (for instance with tcpdump or alike)?

>
> These tests were running since the dawn of h2 time and are still running in 2.4.x. Since the problem also goes away on worker, this looks like a new problem with mpm_event and connection closing (keepalive)?

I'm having a look at it, will come back if something pops up.

Re: trunk proxy_http failures

Posted by Stefan Eissing <st...@greenbytes.de>.
Thanks, Yann, this helped me pin the problem down further:

- with disablereuse=on everything works fine
- with ttl=1 the problem is still there

and then:
- with mpm_worker, the problem also disappears (no disable/ttl needed)

These tests were running since the dawn of h2 time and are still running in 2.4.x. Since the problem also goes away on worker, this looks like a new problem with mpm_event and connection closing (keepalive)?

Anything I can do to give you good data? Some preferred log settings?

-Stefan

> Am 19.09.2018 um 16:32 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Wed, Sep 19, 2018 at 4:10 PM Stefan Eissing
> <st...@greenbytes.de> wrote:
>> 
>>    <Proxy "balancer://https-local">
>>        BalancerMember "https://127.0.0.1:SUBST_PORT_HTTPS_SUBST" hcmethod=GET hcuri=/
> 
> Does disablereuse=on help here?
> 
>>    </Proxy>
>>    ...
>>    ProxyPass "/proxy" "balancer://http-local"
>>    ProxyPassReverse "/proxy" "balancer://http-local"
>> 
>> Typical log:
> ...
>> [Wed Sep 19 14:02:58.038790 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2630): [client 127.0.0.1:64231] AH00947: connected /files/data-1m to 127.0.0.1:12346
>> [Wed Sep 19 14:03:03.041506 2018] [proxy_http:error] [pid 35806:tid 123145377832960] (70014)End of file found: [client 127.0.0.1:64231] AH01102: error reading status line from remote server 127.0.0.1:12346
> 
> I suspect mod_proxy is reusing a connection already close by 127.0.0.1:12346.
> If you know the KeepAliveTimeout configured there, setting a ttl= to a
> lower value on the BalancerMember could be less radical than
> disablereuse...
> 
> HTH,
> Yann.


Re: trunk proxy_http failures

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Sep 19, 2018 at 4:10 PM Stefan Eissing
<st...@greenbytes.de> wrote:
>
>     <Proxy "balancer://https-local">
>         BalancerMember "https://127.0.0.1:SUBST_PORT_HTTPS_SUBST" hcmethod=GET hcuri=/

Does disablereuse=on help here?

>     </Proxy>
>     ...
>     ProxyPass "/proxy" "balancer://http-local"
>     ProxyPassReverse "/proxy" "balancer://http-local"
>
> Typical log:
...
> [Wed Sep 19 14:02:58.038790 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2630): [client 127.0.0.1:64231] AH00947: connected /files/data-1m to 127.0.0.1:12346
> [Wed Sep 19 14:03:03.041506 2018] [proxy_http:error] [pid 35806:tid 123145377832960] (70014)End of file found: [client 127.0.0.1:64231] AH01102: error reading status line from remote server 127.0.0.1:12346

I suspect mod_proxy is reusing a connection already close by 127.0.0.1:12346.
If you know the KeepAliveTimeout configured there, setting a ttl= to a
lower value on the BalancerMember could be less radical than
disablereuse...

HTH,
Yann.

Re: trunk proxy_http failures

Posted by Stefan Eissing <st...@greenbytes.de>.
That one runs fine for me.

In fact, I did a total clean build and restart this morning and all the problems have disappeared. So, sorry about the witch hunt! It may have been an unclean build due to mixed header files or there was another rogue process listening.

Btw. how are people handling changes in header files? I install my httpd via "make install" and that also copies the header files, which then lead to inconsistencies when I build a new revision. Is there an install target that only installs the binaries?

-Stefan

> Am 19.09.2018 um 17:49 schrieb Jim Jagielski <ji...@jaguNET.com>:
> 
> Does folding http://home.apache.org/~ylavic/patches/2.4.x-mpms_async_objects_lifetime.patch
> into 2.4.x make any difference??
> 
>> On Sep 19, 2018, at 10:10 AM, Stefan Eissing <st...@greenbytes.de> wrote:
>> 
>> Hi,
>> 
>> the h2 test suite has tests with http/2 in the front and a standard proxy_http to localhost. When running the test suite against trunk, the tests run into a failed GET on a proxy. The request hangs 5 seconds and the fails reading the status line, as seen below. The proxy is defined as:
>> 
>>   <Proxy "balancer://https-local">
>>       BalancerMember "https://127.0.0.1:SUBST_PORT_HTTPS_SUBST" hcmethod=GET hcuri=/
>>   </Proxy>
>>   ...
>>   ProxyPass "/proxy" "balancer://http-local"
>>   ProxyPassReverse "/proxy" "balancer://http-local"
>> 
>> Typical log:
>> [Wed Sep 19 14:02:58.038602 2018] [http2:trace1] [pid 35806:tid 123145377832960] h2_task.c(675): [client 127.0.0.1:64231] h2_task(200-13): start process_request
>> [Wed Sep 19 14:02:58.038654 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1326): AH10122: proxy: Entering byrequests for BALANCER (balancer://https-local)
>> [Wed Sep 19 14:02:58.038679 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1418): AH10123: proxy: byrequests selected worker "https://127.0.0.1:12346" : busy 0 : lbstatus 100
>> [Wed Sep 19 14:02:58.038694 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1971): AH00924: worker https://127.0.0.1:12346 shared already initialized
>> [Wed Sep 19 14:02:58.038712 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2028): AH00926: worker https://127.0.0.1:12346 local already initialized
>> [Wed Sep 19 14:02:58.038736 2018] [proxy:debug] [pid 35806:tid 123145377832960] mod_proxy.c(1258): [client 127.0.0.1:64231] AH01143: Running scheme balancer handler (attempt 0)
>> [Wed Sep 19 14:02:58.038756 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2367): AH00942: HTTPS: has acquired connection for (127.0.0.1)
>> [Wed Sep 19 14:02:58.038775 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2421): [client 127.0.0.1:64231] AH00944: connecting https://127.0.0.1:12346/files/data-1m to 127.0.0.1:12346
>> [Wed Sep 19 14:02:58.038790 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2630): [client 127.0.0.1:64231] AH00947: connected /files/data-1m to 127.0.0.1:12346
>> [Wed Sep 19 14:03:03.041506 2018] [proxy_http:error] [pid 35806:tid 123145377832960] (70014)End of file found: [client 127.0.0.1:64231] AH01102: error reading status line from remote server 127.0.0.1:12346
>> [Wed Sep 19 14:03:03.041541 2018] [proxy_http:debug] [pid 35806:tid 123145377832960] mod_proxy_http.c(1386): [client 127.0.0.1:64231] AH01104: Closing connection to client because reading from backend server 127.0.0.1:12346 failed. Number of keepalives 1
>> 
>> What can I best do to further debug this?
>> 
>> Should I disable the balancer and see if the problem persists?
>> 
>> Help appreciated, thanks!
>> 
>> -Stefan
> 


Re: trunk proxy_http failures

Posted by Stefan Eissing <st...@greenbytes.de>.
Can try tomorrow.

> Am 19.09.2018 um 17:49 schrieb Jim Jagielski <ji...@jaguNET.com>:
> 
> Does folding http://home.apache.org/~ylavic/patches/2.4.x-mpms_async_objects_lifetime.patch
> into 2.4.x make any difference??
> 
>> On Sep 19, 2018, at 10:10 AM, Stefan Eissing <st...@greenbytes.de> wrote:
>> 
>> Hi,
>> 
>> the h2 test suite has tests with http/2 in the front and a standard proxy_http to localhost. When running the test suite against trunk, the tests run into a failed GET on a proxy. The request hangs 5 seconds and the fails reading the status line, as seen below. The proxy is defined as:
>> 
>>   <Proxy "balancer://https-local">
>>       BalancerMember "https://127.0.0.1:SUBST_PORT_HTTPS_SUBST" hcmethod=GET hcuri=/
>>   </Proxy>
>>   ...
>>   ProxyPass "/proxy" "balancer://http-local"
>>   ProxyPassReverse "/proxy" "balancer://http-local"
>> 
>> Typical log:
>> [Wed Sep 19 14:02:58.038602 2018] [http2:trace1] [pid 35806:tid 123145377832960] h2_task.c(675): [client 127.0.0.1:64231] h2_task(200-13): start process_request
>> [Wed Sep 19 14:02:58.038654 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1326): AH10122: proxy: Entering byrequests for BALANCER (balancer://https-local)
>> [Wed Sep 19 14:02:58.038679 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1418): AH10123: proxy: byrequests selected worker "https://127.0.0.1:12346" : busy 0 : lbstatus 100
>> [Wed Sep 19 14:02:58.038694 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1971): AH00924: worker https://127.0.0.1:12346 shared already initialized
>> [Wed Sep 19 14:02:58.038712 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2028): AH00926: worker https://127.0.0.1:12346 local already initialized
>> [Wed Sep 19 14:02:58.038736 2018] [proxy:debug] [pid 35806:tid 123145377832960] mod_proxy.c(1258): [client 127.0.0.1:64231] AH01143: Running scheme balancer handler (attempt 0)
>> [Wed Sep 19 14:02:58.038756 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2367): AH00942: HTTPS: has acquired connection for (127.0.0.1)
>> [Wed Sep 19 14:02:58.038775 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2421): [client 127.0.0.1:64231] AH00944: connecting https://127.0.0.1:12346/files/data-1m to 127.0.0.1:12346
>> [Wed Sep 19 14:02:58.038790 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2630): [client 127.0.0.1:64231] AH00947: connected /files/data-1m to 127.0.0.1:12346
>> [Wed Sep 19 14:03:03.041506 2018] [proxy_http:error] [pid 35806:tid 123145377832960] (70014)End of file found: [client 127.0.0.1:64231] AH01102: error reading status line from remote server 127.0.0.1:12346
>> [Wed Sep 19 14:03:03.041541 2018] [proxy_http:debug] [pid 35806:tid 123145377832960] mod_proxy_http.c(1386): [client 127.0.0.1:64231] AH01104: Closing connection to client because reading from backend server 127.0.0.1:12346 failed. Number of keepalives 1
>> 
>> What can I best do to further debug this?
>> 
>> Should I disable the balancer and see if the problem persists?
>> 
>> Help appreciated, thanks!
>> 
>> -Stefan
> 


Re: trunk proxy_http failures

Posted by Jim Jagielski <ji...@jaguNET.com>.
Does folding http://home.apache.org/~ylavic/patches/2.4.x-mpms_async_objects_lifetime.patch
into 2.4.x make any difference??

> On Sep 19, 2018, at 10:10 AM, Stefan Eissing <st...@greenbytes.de> wrote:
> 
> Hi,
> 
> the h2 test suite has tests with http/2 in the front and a standard proxy_http to localhost. When running the test suite against trunk, the tests run into a failed GET on a proxy. The request hangs 5 seconds and the fails reading the status line, as seen below. The proxy is defined as:
> 
>    <Proxy "balancer://https-local">
>        BalancerMember "https://127.0.0.1:SUBST_PORT_HTTPS_SUBST" hcmethod=GET hcuri=/
>    </Proxy>
>    ...
>    ProxyPass "/proxy" "balancer://http-local"
>    ProxyPassReverse "/proxy" "balancer://http-local"
> 
> Typical log:
> [Wed Sep 19 14:02:58.038602 2018] [http2:trace1] [pid 35806:tid 123145377832960] h2_task.c(675): [client 127.0.0.1:64231] h2_task(200-13): start process_request
> [Wed Sep 19 14:02:58.038654 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1326): AH10122: proxy: Entering byrequests for BALANCER (balancer://https-local)
> [Wed Sep 19 14:02:58.038679 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1418): AH10123: proxy: byrequests selected worker "https://127.0.0.1:12346" : busy 0 : lbstatus 100
> [Wed Sep 19 14:02:58.038694 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(1971): AH00924: worker https://127.0.0.1:12346 shared already initialized
> [Wed Sep 19 14:02:58.038712 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2028): AH00926: worker https://127.0.0.1:12346 local already initialized
> [Wed Sep 19 14:02:58.038736 2018] [proxy:debug] [pid 35806:tid 123145377832960] mod_proxy.c(1258): [client 127.0.0.1:64231] AH01143: Running scheme balancer handler (attempt 0)
> [Wed Sep 19 14:02:58.038756 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2367): AH00942: HTTPS: has acquired connection for (127.0.0.1)
> [Wed Sep 19 14:02:58.038775 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2421): [client 127.0.0.1:64231] AH00944: connecting https://127.0.0.1:12346/files/data-1m to 127.0.0.1:12346
> [Wed Sep 19 14:02:58.038790 2018] [proxy:debug] [pid 35806:tid 123145377832960] proxy_util.c(2630): [client 127.0.0.1:64231] AH00947: connected /files/data-1m to 127.0.0.1:12346
> [Wed Sep 19 14:03:03.041506 2018] [proxy_http:error] [pid 35806:tid 123145377832960] (70014)End of file found: [client 127.0.0.1:64231] AH01102: error reading status line from remote server 127.0.0.1:12346
> [Wed Sep 19 14:03:03.041541 2018] [proxy_http:debug] [pid 35806:tid 123145377832960] mod_proxy_http.c(1386): [client 127.0.0.1:64231] AH01104: Closing connection to client because reading from backend server 127.0.0.1:12346 failed. Number of keepalives 1
> 
> What can I best do to further debug this?
> 
> Should I disable the balancer and see if the problem persists?
> 
> Help appreciated, thanks!
> 
> -Stefan