You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Joe Orton <jo...@redhat.com> on 2020/01/07 11:02:34 UTC

worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

On Mon, Jan 06, 2020 at 06:38:29PM +0000, Travis CI wrote:
> Build Update for apache/httpd
> -------------------------------------
> 
> Build: #190
> Status: Still Failing
> 
> Duration: 7 mins and 11 secs
> Commit: 894b6a1 (trunk)
> Author: Luca Toscano
> Message: travis: add verbose config to perl test suite for Ubuntu Worker All Modules
> 
> For some reason we get sporadic failures only in Ubuntu Worker All Modules' test:
> 
> t/apache/rwrite.t ................... 53/? # Failed test 113 in /home/travis/build/apache/httpd/test/perl-framework/blib/lib/Apache/TestCommon.pm at line 56 fail #113
> t/apache/rwrite.t ................... Failed 1/114 subtests 

It is quite easy to reproduce this for me under worker as well, if you run

./t/TEST t/apache/rwrite.t t/apache/pass_brigade.t 

in a loop, one of them will fail quite quickly.  It looks like the 
client gets a socket closure instead of the very last block of the 
response, here is the client output:

#server response:
#HTTP/1.1 200 OK
#Connection: close
#Date: Tue, 07 Jan 2020 10:55:28 GMT
#Server: Apache/2.5.1-dev (Unix) OpenSSL/1.1.1d
#Vary: In-If1
#Content-Length: 0
#Client-Aborted: die
#Client-Date: Tue, 07 Jan 2020 10:55:28 GMT
#Client-Peer: 127.0.0.1:8529
#Client-Response-Num: 1
#Client-Transfer-Encoding: chunked
#DMMATCH1: 1
#X-Content-Length-Note: added by Apache::TestRequest
#X-Died: EOF when chunk header expected at /usr/share/perl5/vendor_perl/Net/HTTP/Methods.pm line 532.
#
# testing : bytes in body
# expected: 10190848
# received: 10240000
not ok 114
Failed 1/114 subtests 

The tail end of error_log it looks like this:

[Tue Jan 07 10:55:28.414864 2020] [core:trace8] [pid 37542:tid 140091146098432] util_filter.c(1129): [client 127.0.0.1:49470] brigade contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets: 0
[Tue Jan 07 10:55:28.414866 2020] [core:trace6] [pid 37542:tid 140091146098432] util_filter.c(942): [client 127.0.0.1:49470] setaside full brigade to empty brigade in 'req_core' output filter
[Tue Jan 07 10:55:28.414872 2020] [core:trace6] [pid 37542:tid 140091146098432] util_filter.c(1015): [client 127.0.0.1:49470] reinstate full brigade to full brigade in 'core' output filter
[Tue Jan 07 10:55:28.414876 2020] [core:trace6] [pid 37542:tid 140091146098432] core_filters.c(670): (11)Resource temporarily unavailable: [client 127.0.0.1:49470] writev_nonblocking: 0/1656
[Tue Jan 07 10:55:28.414879 2020] [core:trace6] [pid 37542:tid 140091146098432] util_filter.c(1015): [client 127.0.0.1:49470] reinstate empty brigade to full brigade in 'core' output filter
[Tue Jan 07 10:55:28.414881 2020] [core:trace6] [pid 37542:tid 140091146098432] util_filter.c(1104): [client 127.0.0.1:49470] will flush because of FLUSH bucket
[Tue Jan 07 10:55:28.414883 2020] [core:trace8] [pid 37542:tid 140091146098432] util_filter.c(1106): [client 127.0.0.1:49470] seen in brigade so far: bytes: 1656, non-file bytes: 1656, eor buckets: 0, morphing buckets: 0
[Tue Jan 07 10:55:28.414888 2020] [core:trace8] [pid 37542:tid 140091146098432] util_filter.c(1129): [client 127.0.0.1:49470] brigade contains: bytes: 0, non-file bytes: 0, eor buckets: 0, morphing buckets: 0
[Tue Jan 07 10:55:28.418754 2020] [core:trace6] [pid 37542:tid 140091146098432] core_filters.c(670): [client 127.0.0.1:49470] writev_nonblocking: 1656/1656
[Tue Jan 07 10:55:28.418764 2020] [core:trace6] [pid 37542:tid 140091146098432] util_filter.c(942): [client 127.0.0.1:49470] setaside empty brigade to empty brigade in 'core' output filter
[Tue Jan 07 10:55:28.427067 2020] [optional_hook_import:error] [pid 37542:tid 140091146098432] AH01866: Optional hook test said: GET /test_rwrite?8192,10240000 HTTP/1.1
[Tue Jan 07 10:55:28.427072 2020] [optional_fn_export:error] [pid 37542:tid 140091146098432] AH01871: Optional function test said: GET /test_rwrite?8192,10240000 HTTP/1.1

and nothing else is logged for the connection.

In both cases, with an explicit FLUSH at the end of the response I 
cannot reproduce this any more.

I added one #if 0'ed out here:

http://svn.apache.org/viewvc?view=revision&revision=1872431

and it's a simple ap_rflush(r) in mod_test_rwrite.c as well.

This seems like a bug to me (not obviously in the tests) but I'm not 
sure exactly where.  The "writev_nonblocking: 1656/1656" line *appears* 
to suggest the last block was successfully written to the socket.

Regards, Joe

> Recent examples:
> https://travis-ci.org/apache/httpd/jobs/632425202
> https://travis-ci.org/apache/httpd/jobs/633250739
> 
> Add "-verbose" as test option to capture more data about the failure
> when it happens.
> 
> 
> 
> git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1872389 13f79535-47bb-0310-9956-ffa450edef68
> 
> View the changeset: https://github.com/apache/httpd/compare/0bd4a3d88cba...894b6a185041
> 
> View the full build log and details: https://travis-ci.org/apache/httpd/builds/633409517?utm_medium=notification&utm_source=email
> 
> --
> 
> You can unsubscribe from build emails from the apache/httpd repository going to https://travis-ci.org/account/preferences/unsubscribe?repository=69847&utm_medium=notification&utm_source=email.
> Or unsubscribe from *all* email updating your settings at https://travis-ci.org/account/preferences/unsubscribe?utm_medium=notification&utm_source=email.
> Or configure specific recipients for build notifications in your .travis.yml file. See https://docs.travis-ci.com/user/notifications.
> 


Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Joe Orton <jo...@redhat.com>.
On Tue, Jan 07, 2020 at 07:31:42PM +0100, Yann Ylavic wrote:
> Could you please try the attached patch?
> The goal is to make ap_request_core_filter() a connection filter, so
> that it remains in c->output_filters until the EOR is handled.
> The patch has subtleties, but ap_request_core_filter() is really
> special. It has to do its best to use r->pool for setasides and
> morphing buckets until the EOR, and depending on network congestion
> there may be multiple ap_request_core_filter()s stacked (one per
> request not fully flushed) so I had to introduce
> AP_FILTER_PROTO_INSERT_BEFORE to stack them in the right order...

This has passed 50 iterations with no failures, so LGTM, thanks for 
looking at this!  Without this I could reliably trigger a failure within 
~10 iterations.

BTW attached the repro script I'm using, different N for stress -cN may 
help. -c2 works well for my 4 core laptop.

Regards, Joe

Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Ruediger Pluem <rp...@apache.org>.

On 3/22/20 6:43 PM, Yann Ylavic wrote:
> +        APR_BRIGADE_INSERT_TAIL(ctx->tmp_bb, bucket);
> +        if (do_pass
> +                || APR_BRIGADE_EMPTY(ctx->bb)
> +                || (bucket->length == (apr_size_t)-1)

How can we have a bucket length of -1 here?
What about flush buckets? Shouldn't they cause a passing here as well?

> +                || (tmp_bb_len += bucket->length) >=
> +                        conf->flush_max_threshold) {
> +            rv = ap_pass_brigade(f->next, ctx->tmp_bb);
> +            apr_brigade_cleanup(ctx->tmp_bb);
> +            if (status == APR_SUCCESS) {
> +                status = rv;
>              }

Regards

Rüdiger

Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Jan 23, 2020 at 8:23 AM Pluem, Ruediger, Vodafone Group
<ru...@vodafone.com> wrote:
> >
> > https://github.com/apache/httpd/pull/88
>
> Anyone had a look?

If I understand the patch correctly, ap_request_core_filter() would
let everything through on EOR, regardless of potential morphing
buckets set aside?
So if ap_request_core_filter() yields because of network contention,
setting aside all the remaining buckets (say including a morphing
one), and then ap_process_request_after_handler() sends the EOR (the
handler has finished sending everything on its side), it seems that
ap_request_core_filter() won't play its role anymore?
This plus the fact that ap_request_core_filter() filters "stack up"
for each request, so the oldest request's filter may setaside data for
newer ones, with its own (wrong) request pool.

How about we use a single ap_request_core_filter() for all the
requests, at AP_FTYPE_CONNECTION level, and have it "detect" requests
based on SOR/EOR ranges, where the SOR (Start Of Request) bucket is to
be inserted by the AP_FTYPE_PROTOCOL module (e.g.
ap_http_header_filter for http) before sending any data pertaining to
its request?

It tried that in the attached patch, it passes the test framework and
Joe's reproducer.
Thoughts?

Regards,
Yann.

AW: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by "Pluem, Ruediger, Vodafone Group" <ru...@vodafone.com>.

> -----Ursprüngliche Nachricht-----
> Von: Pluem, Ruediger, Vodafone Group <ru...@vodafone.com>
> Gesendet: Donnerstag, 23. Januar 2020 08:23
> An: dev@httpd.apache.org
> Betreff: AW: worker MPM test failures (was Re: Still Failing:
> apache/httpd#190 (trunk - 894b6a1))
> 
> 
> 
> > -----Ursprüngliche Nachricht-----
> > Von: Pluem, Ruediger, Vodafone Group <ru...@vodafone.com>
> > Gesendet: Donnerstag, 9. Januar 2020 11:32
> > An: dev@httpd.apache.org
> > Betreff: AW: worker MPM test failures (was Re: Still Failing:
> > apache/httpd#190 (trunk - 894b6a1))
> >
> >
> >
> >
> > C2 General
> >
> > > -----Ursprüngliche Nachricht-----
> > > Von: Yann Ylavic <yl...@gmail.com>
> > > Gesendet: Mittwoch, 8. Januar 2020 16:48
> > > An: httpd-dev <de...@httpd.apache.org>
> > > Betreff: Re: worker MPM test failures (was Re: Still Failing:
> > > apache/httpd#190 (trunk - 894b6a1))
> > >
> > > On Wed, Jan 8, 2020 at 8:36 AM Ruediger Pluem <rp...@apache.org>
> > wrote:
> > >
> > >
> > > Regarding my patch, I'm afraid it's not the right fix, we can't
> front
> > > push several req_core output filters like this.
> > > If an old/outer req_core filter has to setaside buckets (network
> > > EAGAIN) while the passed brigade contains buckets from the new
> > > request, then it will setaside the buckets from a newer request pool
> > > to an oldest's.
> > > Even though it can (possibly) work, it does not look quite optimal,
> > > we'd need some kind of Start Of Request (SOR) bucket to delimit
> > > buckets from/for successive requests. SOR would be passed through
> > > c->output_filters when a new request is created (e.g. in
> > > http_create_request()), since anything sent through the request
> > > filters from that time should be about this last request.
> > > Then a single req_core connection filter could be responsible for
> > > setting aside buckets on the relevant request pool (should some
> > > congestion occur).
> > >
> > > Dunno if we want to go that route, though...
> >
> > As this looks like to be a larger amount of work I come back to my
> > simple patch idea
> > that would possibly cause more blocking writes, but would ensure that
> > the data is sent
> > completely at least. Basically it pushes all down the chain once it
> > notices an EOR bucket.
> > See
> >
> > https://github.com/apache/httpd/pull/88
> 
> Anyone had a look?
> 
> Regards
> 
> Rüdiger


Anyone had a look?
 
Regards
 
Rüdiger


AW: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by "Pluem, Ruediger, Vodafone Group" <ru...@vodafone.com>.

> -----Ursprüngliche Nachricht-----
> Von: Pluem, Ruediger, Vodafone Group <ru...@vodafone.com>
> Gesendet: Donnerstag, 9. Januar 2020 11:32
> An: dev@httpd.apache.org
> Betreff: AW: worker MPM test failures (was Re: Still Failing:
> apache/httpd#190 (trunk - 894b6a1))
> 
> 
> 
> 
> C2 General
> 
> > -----Ursprüngliche Nachricht-----
> > Von: Yann Ylavic <yl...@gmail.com>
> > Gesendet: Mittwoch, 8. Januar 2020 16:48
> > An: httpd-dev <de...@httpd.apache.org>
> > Betreff: Re: worker MPM test failures (was Re: Still Failing:
> > apache/httpd#190 (trunk - 894b6a1))
> >
> > On Wed, Jan 8, 2020 at 8:36 AM Ruediger Pluem <rp...@apache.org>
> wrote:
> >
> >
> > Regarding my patch, I'm afraid it's not the right fix, we can't front
> > push several req_core output filters like this.
> > If an old/outer req_core filter has to setaside buckets (network
> > EAGAIN) while the passed brigade contains buckets from the new
> > request, then it will setaside the buckets from a newer request pool
> > to an oldest's.
> > Even though it can (possibly) work, it does not look quite optimal,
> > we'd need some kind of Start Of Request (SOR) bucket to delimit
> > buckets from/for successive requests. SOR would be passed through
> > c->output_filters when a new request is created (e.g. in
> > http_create_request()), since anything sent through the request
> > filters from that time should be about this last request.
> > Then a single req_core connection filter could be responsible for
> > setting aside buckets on the relevant request pool (should some
> > congestion occur).
> >
> > Dunno if we want to go that route, though...
> 
> As this looks like to be a larger amount of work I come back to my
> simple patch idea
> that would possibly cause more blocking writes, but would ensure that
> the data is sent
> completely at least. Basically it pushes all down the chain once it
> notices an EOR bucket.
> See
> 
> https://github.com/apache/httpd/pull/88

Anyone had a look?

Regards

Rüdiger


Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 09.01.2020 um 12:39 schrieb Pluem, Ruediger, Vodafone Group <ru...@vodafone.com>:
> 
> 
> 
>> -----Ursprüngliche Nachricht-----
>> Von: Stefan Eissing <st...@greenbytes.de>
>> Gesendet: Donnerstag, 9. Januar 2020 11:51
>> An: dev@httpd.apache.org
>> Betreff: Re: worker MPM test failures (was Re: Still Failing:
>> apache/httpd#190 (trunk - 894b6a1))
>> 
>> 
>> 
>>> Am 09.01.2020 um 11:32 schrieb Pluem, Ruediger, Vodafone Group
>> <ru...@vodafone.com>:
>>> 
>>> 
>>> BTW: Can we really have morphing buckets in ap_request_core_filter?
>> ap_request_core_filter runs
>>> after a possible HTTP chunking filter that I guess needs to be in
>> place when we have a morphing
>>> bucket which makes it impossible to determine the content length
>> upfront. Hence I guess the
>>> HTTP chunking filter will transform all these morphing buckets
>> already.
>>> Or is this just a safety measure to support other protocols but HTTP?
>> 
>> What exactly do you mean by a "morphing" bucket? If that include file
>> buckets, then I guess "yes" is the answer.
>> 
> 
> A file bucket is not a the morphing bucket I am talking here about
> although it morphs as well when read, because it has a known length.
> A morphing bucket has a length of -1 and when it is read it "morphs"
> into a bucket of known length (mostly in memory) with a subset of the
> content of the morphing bucket and the morphing bucket is added back
> in the brigade after the morphed bucket sans the content left in the morphed bucket.
> Examples are
> 
> CGI buckets
> PIPE buckets
> Socket buckets

Thanks for the clarification. I agree that all those should have been resolved to a known length in the core filters before a setaside.

> 
> Regards
> 
> Rüdiger


AW: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by "Pluem, Ruediger, Vodafone Group" <ru...@vodafone.com>.

> -----Ursprüngliche Nachricht-----
> Von: Stefan Eissing <st...@greenbytes.de>
> Gesendet: Donnerstag, 9. Januar 2020 11:51
> An: dev@httpd.apache.org
> Betreff: Re: worker MPM test failures (was Re: Still Failing:
> apache/httpd#190 (trunk - 894b6a1))
> 
> 
> 
> > Am 09.01.2020 um 11:32 schrieb Pluem, Ruediger, Vodafone Group
> <ru...@vodafone.com>:
> >
> >
> > BTW: Can we really have morphing buckets in ap_request_core_filter?
> ap_request_core_filter runs
> > after a possible HTTP chunking filter that I guess needs to be in
> place when we have a morphing
> > bucket which makes it impossible to determine the content length
> upfront. Hence I guess the
> > HTTP chunking filter will transform all these morphing buckets
> already.
> > Or is this just a safety measure to support other protocols but HTTP?
> 
> What exactly do you mean by a "morphing" bucket? If that include file
> buckets, then I guess "yes" is the answer.
> 

A file bucket is not a the morphing bucket I am talking here about
although it morphs as well when read, because it has a known length.
A morphing bucket has a length of -1 and when it is read it "morphs"
into a bucket of known length (mostly in memory) with a subset of the
content of the morphing bucket and the morphing bucket is added back
in the brigade after the morphed bucket sans the content left in the morphed bucket.
Examples are

CGI buckets
PIPE buckets
Socket buckets

Regards

Rüdiger



Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 09.01.2020 um 11:32 schrieb Pluem, Ruediger, Vodafone Group <ru...@vodafone.com>:
> 
> 
> 
> 
> C2 General
> 
>> -----Ursprüngliche Nachricht-----
>> Von: Yann Ylavic <yl...@gmail.com>
>> Gesendet: Mittwoch, 8. Januar 2020 16:48
>> An: httpd-dev <de...@httpd.apache.org>
>> Betreff: Re: worker MPM test failures (was Re: Still Failing:
>> apache/httpd#190 (trunk - 894b6a1))
>> 
>> On Wed, Jan 8, 2020 at 8:36 AM Ruediger Pluem <rp...@apache.org> wrote:
>>> 
>>> Just for my further analysis let me try to get the purpose of the
>>> ap_request_core_filter:
>>> 
>>> It should leverage at least parts of the benefits of the filter
>> setaside / yielding / brigade reinstate framework for
>>> filters that do not use this, because e.g.
>>> 
>>> - they are not aware of this framework
>> 
>> Yes, the goal (I think, Graham could confirm) is to let request
>> filters work as before but have a catch all (last) filter to not let
>> morphing buckets go through connection filters (and consume c->pool).
>> Same for setasides.
>> 
>>> - it seems to complex for them to use it and adjust their filter
>>> 
>>> Of course this is not perfect as some resource and content filters
>> need to handle this on their own e.g. deflate when
>>> compressing contents of file buckets in order to avoid blocking
>> writes.
>> 
>> Correct.
>> 
>>> 
>>> Thinking out loudly a bit here: Except for ap_request_core_filter all
>> other resource and content filters should process
>>> all the data they might have setaside when they see an EOS in a
>> brigade, correct?
>> 
>> I think so, because otherwise they will never be called again since
>> the core/mpm only cares about connection filters, so the data must be
>> there after EOS. In any case, request filters shall not be called when
>> EOR is in the place.
>> 
>> 
>> Regarding my patch, I'm afraid it's not the right fix, we can't front
>> push several req_core output filters like this.
>> If an old/outer req_core filter has to setaside buckets (network
>> EAGAIN) while the passed brigade contains buckets from the new
>> request, then it will setaside the buckets from a newer request pool
>> to an oldest's.
>> Even though it can (possibly) work, it does not look quite optimal,
>> we'd need some kind of Start Of Request (SOR) bucket to delimit
>> buckets from/for successive requests. SOR would be passed through
>> c->output_filters when a new request is created (e.g. in
>> http_create_request()), since anything sent through the request
>> filters from that time should be about this last request.
>> Then a single req_core connection filter could be responsible for
>> setting aside buckets on the relevant request pool (should some
>> congestion occur).
>> 
>> Dunno if we want to go that route, though...
> 
> As this looks like to be a larger amount of work I come back to my simple patch idea
> that would possibly cause more blocking writes, but would ensure that the data is sent
> completely at least. Basically it pushes all down the chain once it notices an EOR bucket.
> See
> 
> https://github.com/apache/httpd/pull/88
> 
> BTW: Can we really have morphing buckets in ap_request_core_filter? ap_request_core_filter runs
> after a possible HTTP chunking filter that I guess needs to be in place when we have a morphing
> bucket which makes it impossible to determine the content length upfront. Hence I guess the
> HTTP chunking filter will transform all these morphing buckets already.
> Or is this just a safety measure to support other protocols but HTTP?

What exactly do you mean by a "morphing" bucket? If that include file buckets, then I guess "yes" is the answer.



AW: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by "Pluem, Ruediger, Vodafone Group" <ru...@vodafone.com>.


C2 General

> -----Ursprüngliche Nachricht-----
> Von: Yann Ylavic <yl...@gmail.com>
> Gesendet: Mittwoch, 8. Januar 2020 16:48
> An: httpd-dev <de...@httpd.apache.org>
> Betreff: Re: worker MPM test failures (was Re: Still Failing:
> apache/httpd#190 (trunk - 894b6a1))
> 
> On Wed, Jan 8, 2020 at 8:36 AM Ruediger Pluem <rp...@apache.org> wrote:
> >
> > Just for my further analysis let me try to get the purpose of the
> > ap_request_core_filter:
> >
> > It should leverage at least parts of the benefits of the filter
> setaside / yielding / brigade reinstate framework for
> > filters that do not use this, because e.g.
> >
> > - they are not aware of this framework
> 
> Yes, the goal (I think, Graham could confirm) is to let request
> filters work as before but have a catch all (last) filter to not let
> morphing buckets go through connection filters (and consume c->pool).
> Same for setasides.
> 
> > - it seems to complex for them to use it and adjust their filter
> >
> > Of course this is not perfect as some resource and content filters
> need to handle this on their own e.g. deflate when
> > compressing contents of file buckets in order to avoid blocking
> writes.
> 
> Correct.
> 
> >
> > Thinking out loudly a bit here: Except for ap_request_core_filter all
> other resource and content filters should process
> > all the data they might have setaside when they see an EOS in a
> brigade, correct?
> 
> I think so, because otherwise they will never be called again since
> the core/mpm only cares about connection filters, so the data must be
> there after EOS. In any case, request filters shall not be called when
> EOR is in the place.
> 
> 
> Regarding my patch, I'm afraid it's not the right fix, we can't front
> push several req_core output filters like this.
> If an old/outer req_core filter has to setaside buckets (network
> EAGAIN) while the passed brigade contains buckets from the new
> request, then it will setaside the buckets from a newer request pool
> to an oldest's.
> Even though it can (possibly) work, it does not look quite optimal,
> we'd need some kind of Start Of Request (SOR) bucket to delimit
> buckets from/for successive requests. SOR would be passed through
> c->output_filters when a new request is created (e.g. in
> http_create_request()), since anything sent through the request
> filters from that time should be about this last request.
> Then a single req_core connection filter could be responsible for
> setting aside buckets on the relevant request pool (should some
> congestion occur).
> 
> Dunno if we want to go that route, though...

As this looks like to be a larger amount of work I come back to my simple patch idea
that would possibly cause more blocking writes, but would ensure that the data is sent
completely at least. Basically it pushes all down the chain once it notices an EOR bucket.
See

https://github.com/apache/httpd/pull/88

BTW: Can we really have morphing buckets in ap_request_core_filter? ap_request_core_filter runs
after a possible HTTP chunking filter that I guess needs to be in place when we have a morphing
bucket which makes it impossible to determine the content length upfront. Hence I guess the
HTTP chunking filter will transform all these morphing buckets already.
Or is this just a safety measure to support other protocols but HTTP?

Regards

Rüdiger

Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Jan 8, 2020 at 8:36 AM Ruediger Pluem <rp...@apache.org> wrote:
>
> Just for my further analysis let me try to get the purpose of the
> ap_request_core_filter:
>
> It should leverage at least parts of the benefits of the filter setaside / yielding / brigade reinstate framework for
> filters that do not use this, because e.g.
>
> - they are not aware of this framework

Yes, the goal (I think, Graham could confirm) is to let request
filters work as before but have a catch all (last) filter to not let
morphing buckets go through connection filters (and consume c->pool).
Same for setasides.

> - it seems to complex for them to use it and adjust their filter
>
> Of course this is not perfect as some resource and content filters need to handle this on their own e.g. deflate when
> compressing contents of file buckets in order to avoid blocking writes.

Correct.

>
> Thinking out loudly a bit here: Except for ap_request_core_filter all other resource and content filters should process
> all the data they might have setaside when they see an EOS in a brigade, correct?

I think so, because otherwise they will never be called again since
the core/mpm only cares about connection filters, so the data must be
there after EOS. In any case, request filters shall not be called when
EOR is in the place.


Regarding my patch, I'm afraid it's not the right fix, we can't front
push several req_core output filters like this.
If an old/outer req_core filter has to setaside buckets (network
EAGAIN) while the passed brigade contains buckets from the new
request, then it will setaside the buckets from a newer request pool
to an oldest's.
Even though it can (possibly) work, it does not look quite optimal,
we'd need some kind of Start Of Request (SOR) bucket to delimit
buckets from/for successive requests. SOR would be passed through
c->output_filters when a new request is created (e.g. in
http_create_request()), since anything sent through the request
filters from that time should be about this last request.
Then a single req_core connection filter could be responsible for
setting aside buckets on the relevant request pool (should some
congestion occur).

Dunno if we want to go that route, though...

Regards,
Yann.

Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Ruediger Pluem <rp...@apache.org>.

On 01/07/2020 07:31 PM, Yann Ylavic wrote:
> On Tue, Jan 7, 2020 at 5:06 PM Joe Orton <jo...@redhat.com> wrote:
>>
>> This is a classic heisenbug since it is timing related, when you add
>> more debugging it slows the server down enough that the client can keep
>> up, write never returns EAGAIN, and the bug disappears, I think...
> 
> OK, that explains why I can't reproduce..
> Based on your traces, I think I figured out how it happens though, see below.
> 
>>
>> I'm afraid I don't understand the dance being done with setaside and
>> reinstate in the filters.  Following the debugging through from the last
>> ap_pass_brigade from mod_test_pass_brigade -
>>
>> [Tue Jan 07 13:09:32.090818 2020] [:info] [pid 117876:tid 140235421763328] [client 127.0.0.1:41112] [mod_test_pass_brigade] wrote 8192 of 8192 bytes
>> [Tue Jan 07 13:09:32.090821 2020] [:info] [pid 117876:tid 140235421763328] [client 127.0.0.1:41112] [mod_test_pass_brigade] done writing 10240000 of 10240000 bytes
>> [Tue Jan 07 13:09:32.090824 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full brigade to full brigade in 'req_core' output filter
>> [Tue Jan 07 13:09:32.090827 2020] [core:trace8] [pid 117876:tid 140235421763328] util_filter.c(1133): [client 127.0.0.1:41112] brigade contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets
>> : 0
>> [Tue Jan 07 13:09:32.090829 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(942): [client 127.0.0.1:41112] setaside full brigade to empty brigade in 'req_core' output filter
> 
> Here, the call stack is:
>   ap_process_request()
>   => ap_process_async_request()
>      => ap_process_request_after_handler()
>        => ap_pass_brigade()
>          => ap_request_core_filter()
> so ap_request_core_filter() is called with the EOR bucket, but since
> there are still pending/setaside/unsent data in the core filter then
> ap_request_core_filter() returns without passing all of its own
> pending data (including the EOR).
> 
>> [Tue Jan 07 13:09:32.090835 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full brigade to full brigade in 'core' output filter
> 
> Here, we are now at:
>   ap_process_request()
>   => ap_pass_brigade(c->output_filters, ...)
> but ap_request_core_filter() is not a connection filter (i.e. not part
> of c->output_filters), so it will never be called again.
> Since it's not called from ap_run_output_pending() either, I think
> mpm_event has the bug too.
> 
> 
> Could you please try the attached patch?
> The goal is to make ap_request_core_filter() a connection filter, so
> that it remains in c->output_filters until the EOR is handled.
> The patch has subtleties, but ap_request_core_filter() is really
> special. It has to do its best to use r->pool for setasides and
> morphing buckets until the EOR, and depending on network congestion
> there may be multiple ap_request_core_filter()s stacked (one per
> request not fully flushed) so I had to introduce
> AP_FILTER_PROTO_INSERT_BEFORE to stack them in the right order...

Thanks for the analysis. I am still hoping for something easier, but my initial idea would possibly cause too much
blocking writes. So I need to dig in further. Just for my further analysis let me try to get the purpose of the
ap_request_core_filter:

It should leverage at least parts of the benefits of the filter setaside / yielding / brigade reinstate framework for
filters that do not use this, because e.g.

- they are not aware of this framework
- it seems to complex for them to use it and adjust their filter

Of course this is not perfect as some resource and content filters need to handle this on their own e.g. deflate when
compressing contents of file buckets in order to avoid blocking writes.

Thinking out loudly a bit here: Except for ap_request_core_filter all other resource and content filters should process
all the data they might have setaside when they see an EOS in a brigade, correct?

Regards

Rüdiger

Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Yann Ylavic <yl...@gmail.com>.
On Tue, Jan 7, 2020 at 5:06 PM Joe Orton <jo...@redhat.com> wrote:
>
> This is a classic heisenbug since it is timing related, when you add
> more debugging it slows the server down enough that the client can keep
> up, write never returns EAGAIN, and the bug disappears, I think...

OK, that explains why I can't reproduce..
Based on your traces, I think I figured out how it happens though, see below.

>
> I'm afraid I don't understand the dance being done with setaside and
> reinstate in the filters.  Following the debugging through from the last
> ap_pass_brigade from mod_test_pass_brigade -
>
> [Tue Jan 07 13:09:32.090818 2020] [:info] [pid 117876:tid 140235421763328] [client 127.0.0.1:41112] [mod_test_pass_brigade] wrote 8192 of 8192 bytes
> [Tue Jan 07 13:09:32.090821 2020] [:info] [pid 117876:tid 140235421763328] [client 127.0.0.1:41112] [mod_test_pass_brigade] done writing 10240000 of 10240000 bytes
> [Tue Jan 07 13:09:32.090824 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full brigade to full brigade in 'req_core' output filter
> [Tue Jan 07 13:09:32.090827 2020] [core:trace8] [pid 117876:tid 140235421763328] util_filter.c(1133): [client 127.0.0.1:41112] brigade contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets
> : 0
> [Tue Jan 07 13:09:32.090829 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(942): [client 127.0.0.1:41112] setaside full brigade to empty brigade in 'req_core' output filter

Here, the call stack is:
  ap_process_request()
  => ap_process_async_request()
     => ap_process_request_after_handler()
       => ap_pass_brigade()
         => ap_request_core_filter()
so ap_request_core_filter() is called with the EOR bucket, but since
there are still pending/setaside/unsent data in the core filter then
ap_request_core_filter() returns without passing all of its own
pending data (including the EOR).

> [Tue Jan 07 13:09:32.090835 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full brigade to full brigade in 'core' output filter

Here, we are now at:
  ap_process_request()
  => ap_pass_brigade(c->output_filters, ...)
but ap_request_core_filter() is not a connection filter (i.e. not part
of c->output_filters), so it will never be called again.
Since it's not called from ap_run_output_pending() either, I think
mpm_event has the bug too.


Could you please try the attached patch?
The goal is to make ap_request_core_filter() a connection filter, so
that it remains in c->output_filters until the EOR is handled.
The patch has subtleties, but ap_request_core_filter() is really
special. It has to do its best to use r->pool for setasides and
morphing buckets until the EOR, and depending on network congestion
there may be multiple ap_request_core_filter()s stacked (one per
request not fully flushed) so I had to introduce
AP_FILTER_PROTO_INSERT_BEFORE to stack them in the right order...

Regards,
Yann.

Re: worker MPM test failures (was Re: Still Failing: apache/httpd#190 (trunk - 894b6a1))

Posted by Joe Orton <jo...@redhat.com>.
This is a classic heisenbug since it is timing related, when you add 
more debugging it slows the server down enough that the client can keep 
up, write never returns EAGAIN, and the bug disappears, I think...

I see from 2016 people have reported similar failures before ("Subject: 
Some test failures in trunk") and they have been seen with prefork too.

I'm afraid I don't understand the dance being done with setaside and 
reinstate in the filters.  Following the debugging through from the last 
ap_pass_brigade from mod_test_pass_brigade -

[Tue Jan 07 13:09:32.090818 2020] [:info] [pid 117876:tid 140235421763328] [client 127.0.0.1:41112] [mod_test_pass_brigade] wrote 8192 of 8192 bytes
[Tue Jan 07 13:09:32.090821 2020] [:info] [pid 117876:tid 140235421763328] [client 127.0.0.1:41112] [mod_test_pass_brigade] done writing 10240000 of 10240000 bytes
[Tue Jan 07 13:09:32.090824 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full brigade to full brigade in 'req_core' output filter
[Tue Jan 07 13:09:32.090827 2020] [core:trace8] [pid 117876:tid 140235421763328] util_filter.c(1133): [client 127.0.0.1:41112] brigade contains: bytes: 49205, non-file bytes: 49205, eor buckets: 1, morphing buckets
: 0
[Tue Jan 07 13:09:32.090829 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(942): [client 127.0.0.1:41112] setaside full brigade to empty brigade in 'req_core' output filter
[Tue Jan 07 13:09:32.090835 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate full brigade to full brigade in 'core' output filter

From the above we can see there is 48K of data buffered in one of the 
filters.  At this point we've passed through:

    /* Prepend buckets set aside, if any. */
    ap_filter_reinstate_brigade(f, bb, NULL);
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }

I then dumped the contents of bb to the error log directly after:

[Tue Jan 07 13:09:32.090837 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - bucket 0 HEAP length 1654
[Tue Jan 07 13:09:32.090839 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - bucket 1 IMMORTAL length 2
[Tue Jan 07 13:09:32.090842 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - bucket 2 FLUSH length 0
[Tue Jan 07 13:09:32.090844 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(373): [client 127.0.0.1:41112] sbb: cof - bucket 3 EOC length 0

so the 48K does not make it back to the core output filter and is never 
sent.

[Tue Jan 07 13:09:32.090846 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(655): [client 127.0.0.1:41112] sbb: 2 vectors, bytes to write 1656
[Tue Jan 07 13:09:32.090849 2020] [core:trace6] [pid 117876:tid 140235421763328] core_filters.c(710): (11)Resource temporarily unavailable: [client 127.0.0.1:41112] writev_nonblocking: 0/1656
[Tue Jan 07 13:09:32.090852 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(431): (11)Resource temporarily unavailable: [client 127.0.0.1:41112] sbb: cof - send_brigade_nonblocking RETURNED
[Tue Jan 07 13:09:32.090856 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1015): [client 127.0.0.1:41112] reinstate empty brigade to full brigade in 'core' output filter
[Tue Jan 07 13:09:32.090858 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(1108): [client 127.0.0.1:41112] will flush because of FLUSH bucket
[Tue Jan 07 13:09:32.090861 2020] [core:trace8] [pid 117876:tid 140235421763328] util_filter.c(1110): [client 127.0.0.1:41112] seen in brigade so far: bytes: 1656, non-file bytes: 1656, eor buckets: 0, morphing buc
kets: 0
[Tue Jan 07 13:09:32.090863 2020] [core:trace8] [pid 117876:tid 140235421763328] util_filter.c(1133): [client 127.0.0.1:41112] brigade contains: bytes: 0, non-file bytes: 0, eor buckets: 0, morphing buckets: 0
[Tue Jan 07 13:09:32.090865 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(446): [client 127.0.0.1:41112] sbb: BLOCKED EAGAIN, polling
[Tue Jan 07 13:09:32.094662 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(655): [client 127.0.0.1:41112] sbb: 2 vectors, bytes to write 1656
[Tue Jan 07 13:09:32.094678 2020] [core:trace6] [pid 117876:tid 140235421763328] core_filters.c(710): [client 127.0.0.1:41112] writev_nonblocking: 1656/1656
[Tue Jan 07 13:09:32.094681 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(431): [client 127.0.0.1:41112] sbb: cof - send_brigade_nonblocking RETURNED
[Tue Jan 07 13:09:32.094683 2020] [core:trace1] [pid 117876:tid 140235421763328] core_filters.c(431): [client 127.0.0.1:41112] sbb: cof - send_brigade_nonblocking RETURNED
[Tue Jan 07 13:09:32.094686 2020] [core:trace6] [pid 117876:tid 140235421763328] util_filter.c(942): [client 127.0.0.1:41112] setaside empty brigade to empty brigade in 'core' output filter