Posted to dev@httpd.apache.org by Stefan Eissing <st...@greenbytes.de> on 2016/04/26 16:49:50 UTC

TIL

Today I Learned the difference between writing 
  DATA + META 
and 
  DATA + META + FLUSH
to the core output filters. I am astonished that
my code ever worked.

Seriously, what is the reason for this kind of
implementation? I would like to pass META buckets
in non-blocking way and *not* lose the order of 
bucket destruction. Any explanation or advice is
appreciated!
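For illustration, the observed difference can be modeled with a toy brigade (a simplified sketch; the types, the `pass_brigade()` signature, and the write `budget` are invented here for illustration and are not the real APR/httpd API):

```c
#include <stddef.h>

/* Toy model of the two write patterns.  The simulated core output
 * filter writes non-blocking up to `budget` bytes; without a trailing
 * FLUSH it may return early and set the remainder aside, so META
 * buckets behind unwritten DATA are *not* consumed during the call.
 * A trailing FLUSH forces a blocking write of everything, consuming
 * all buckets in order.  (The real filter can also split DATA
 * buckets; that is omitted here.) */
typedef enum { DATA, META, FLUSH } kind_t;

typedef struct {
    kind_t kind;
    size_t len;        /* payload length, DATA only */
} bucket_t;

/* Returns the number of buckets consumed (destroyed) by this call. */
static size_t pass_brigade(const bucket_t *bb, size_t n, size_t budget)
{
    size_t i, written = 0;
    int must_flush = (n > 0 && bb[n - 1].kind == FLUSH);

    for (i = 0; i < n; i++) {
        if (bb[i].kind == DATA) {
            if (!must_flush && written + bb[i].len > budget)
                return i;          /* set bb[i..n) aside: deferred */
            written += bb[i].len;  /* blocking write when flushing */
        }
    }
    return n;                      /* everything consumed, in order */
}
```

In this model, `{DATA(10), META}` with a 4-byte budget consumes nothing now (both buckets are set aside, META destruction is deferred), while `{DATA(10), META, FLUSH}` consumes all three immediately.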

Cheers,

Stefan

Re: TIL

Posted by Stefan Eissing <st...@greenbytes.de>.
The change was made in http://svn.apache.org/viewvc?view=revision&revision=327872 by brianp in 2005, with the message: 

"New version of ap_core_output_filter that does nonblocking writes
(backport from async-dev branch to 2.3 trunk)"

Since Brian put his name in CHANGES, it was probably him...

-Stefan


> Am 27.04.2016 um 14:52 schrieb Graham Leggett <mi...@sharp.fm>:
> 
> On 27 Apr 2016, at 2:49 PM, Stefan Eissing <st...@greenbytes.de> wrote:
> 
>> I had a look into 2.4.x this morning and while there are changes in filter brigade setaside code, the basic "cleanup" of empty and meta buckets before the setaside is there already.
>> 
>> I think this has not been noticed before as:
>> 1. EOR is always followed by a FLUSH
>> 2. most bucket types survive the destruction of r->pool quite well, even pool buckets silently morph
>> 3. even if transient buckets were to reference pool memory, setting those aside after destruction of r->pool, but before the filter returns, would access freed but not yet re-used memory from the connection allocator - I think.
>> 
>> So far, I have seen no reason why meta buckets should not just be set aside in the core filter. 0-length data buckets should also stay, IMHO, and just be ignored in the actual write.
>> 
>> I can only guess the original intent: 0-length data buckets sometimes happen during some brigade read modes, and if there are several FLUSH buckets in the brigade, it makes sense to get rid of them too. But I think the space savings are minimal.
>> 
>> But there could be reasons for the current behavior that I overlooked. So I am asking around.
> 
> I don’t see any obvious reason for the current behaviour either - is it possible to go back in history and confirm the log entry when it was introduced?
> 
> Regards,
> Graham
> —
> 


Re: TIL

Posted by Graham Leggett <mi...@sharp.fm>.
On 27 Apr 2016, at 2:49 PM, Stefan Eissing <st...@greenbytes.de> wrote:

> I had a look into 2.4.x this morning and while there are changes in filter brigade setaside code, the basic "cleanup" of empty and meta buckets before the setaside is there already.
> 
> I think this has not been noticed before as:
> 1. EOR is always followed by a FLUSH
> 2. most bucket types survive the destruction of r->pool quite well, even pool buckets silently morph
> 3. even if transient buckets were to reference pool memory, setting those aside after destruction of r->pool, but before the filter returns, would access freed but not yet re-used memory from the connection allocator - I think.
> 
> So far, I have seen no reason why meta buckets should not just be set aside in the core filter. 0-length data buckets should also stay, IMHO, and just be ignored in the actual write.
> 
> I can only guess the original intent: 0-length data buckets sometimes happen during some brigade read modes, and if there are several FLUSH buckets in the brigade, it makes sense to get rid of them too. But I think the space savings are minimal.
> 
> But there could be reasons for the current behavior that I overlooked. So I am asking around.

I don’t see any obvious reason for the current behaviour either - is it possible to go back in history and confirm the log entry when it was introduced?

Regards,
Graham
—


Re: TIL

Posted by Stefan Eissing <st...@greenbytes.de>.
> Am 27.04.2016 um 14:10 schrieb Jim Jagielski <ji...@jaguNET.com>:
> 
> Grrr... w/o looking too deeply into this, this seems very
> wrong. Is that a long-standing bug or something recently
> "optimized" away?

I had a look into 2.4.x this morning and while there are changes in filter brigade setaside code, the basic "cleanup" of empty and meta buckets before the setaside is there already.

I think this has not been noticed before as:
1. EOR is always followed by a FLUSH
2. most bucket types survive the destruction of r->pool quite well, even pool buckets silently morph
3. even if transient buckets were to reference pool memory, setting those aside after destruction of r->pool, but before the filter returns, would access freed but not yet re-used memory from the connection allocator - I think.

So far, I have seen no reason why meta buckets should not just be set aside in the core filter. 0-length data buckets should also stay, IMHO, and just be ignored in the actual write.

I can only guess the original intent: 0-length data buckets sometimes happen during some brigade read modes, and if there are several FLUSH buckets in the brigade, it makes sense to get rid of them too. But I think the space savings are minimal.

But there could be reasons for the current behavior that I overlooked. So I am asking around.

-Stefan

> 
>> On Apr 26, 2016, at 10:49 AM, Stefan Eissing <st...@greenbytes.de> wrote:
>> 
>> Today I Learned the difference between writing 
>> DATA + META 
>> and 
>> DATA + META + FLUSH
>> to the core output filters. I am astonished that
>> my code ever worked.
>> 
>> Seriously, what is the reason for this kind of
>> implementation? I would like to pass META buckets
>> in non-blocking way and *not* lose the order of 
>> bucket destruction. Any explanation or advice is
>> appreciated!
>> 
>> Cheers,
>> 
>> Stefan
> 


Re: TIL

Posted by Jim Jagielski <ji...@jaguNET.com>.
Grrr... w/o looking too deeply into this, this seems very
wrong. Is that a long-standing bug or something recently
"optimized" away?

> On Apr 26, 2016, at 10:49 AM, Stefan Eissing <st...@greenbytes.de> wrote:
> 
> Today I Learned the difference between writing 
>  DATA + META 
> and 
>  DATA + META + FLUSH
> to the core output filters. I am astonished that
> my code ever worked.
> 
> Seriously, what is the reason for this kind of
> implementation? I would like to pass META buckets
> in non-blocking way and *not* lose the order of 
> bucket destruction. Any explanation or advice is
> appreciated!
> 
> Cheers,
> 
> Stefan


Re: TIL

Posted by Ruediger Pluem <rp...@apache.org>.

On 04/28/2016 02:40 AM, Yann Ylavic wrote:
> On Wed, Apr 27, 2016 at 7:28 AM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>>
>>
>>> Am 26.04.2016 um 23:57 schrieb Yann Ylavic <yl...@gmail.com>:
>>>
>>> On Tue, Apr 26, 2016 at 4:49 PM, Stefan Eissing
>>> <st...@greenbytes.de> wrote:
>>>> Today I Learned the difference between writing
>>>>  DATA + META
>>>> and
>>>>  DATA + META + FLUSH
>>>> to the core output filters. I am astonished that
>>>> my code ever worked.
>>>>
>>>> Seriously, what is the reason for this kind of
>>>> implementation?
>>>
>>> Do you mean why META could be destroyed before (setaside-)DATA, in
>>> remove_empty_buckets()?
>>
>> Yes. That was unexpected. My understanding of the pass_brigade contract was that buckets get destroyed in order of appearance (might be after the call).
> 
> Actually after re-reading the code, the core output filter looks good to me.
> remove_empty_buckets() will only remove buckets up to DATA buckets,
> and indeed httpd couldn't work otherwise.
> 

I agree with Yann here. I think there is no out of order destruction. Keep in mind that remove_empty_buckets does not
clean the whole brigade of metadata and empty data buckets, but only all the ones that precede a non zero length data
bucket. This seems correct. So as I read it actually the following happens:

1. The core output filter tries to send as much data as it can non-blocking.
2. We possibly have buckets left in the brigade after this.
3. We process all META buckets and empty data buckets until we reach a non-empty data bucket.
4. We set aside the remainder of the brigade for further processing in the future.

So 3. doesn't do anything out of order. We can do this processing here since we do not need a writable socket at this point in time.
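The head-only removal described above can be sketched with a toy brigade (a simplified model for illustration; the types and the `remove_empty_head_buckets()` name are invented here and are not the actual remove_empty_buckets() code):

```c
#include <stddef.h>

/* Toy model of remove_empty_buckets(): META buckets and zero-length
 * DATA buckets are destroyed only while they sit at the *head* of
 * the brigade; the scan stops at the first non-empty DATA bucket,
 * and everything from there on is left intact for setaside.  So no
 * bucket is ever destroyed ahead of data that precedes it. */
typedef enum { DATA, META } kind_t;

typedef struct bucket {
    kind_t kind;
    size_t len;              /* 0 for META and for empty DATA buckets */
    struct bucket *next;
} bucket_t;

/* Returns the new head; *removed counts the buckets deleted. */
static bucket_t *remove_empty_head_buckets(bucket_t *head, size_t *removed)
{
    *removed = 0;
    while (head != NULL && (head->kind == META || head->len == 0)) {
        bucket_t *next = head->next;
        /* real code would call apr_bucket_delete(head) here */
        (*removed)++;
        head = next;
    }
    return head;
}
```

A brigade META -> DATA(0) -> DATA(5) -> META loses only its two leading buckets; the trailing META stays behind the data it follows.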

But looking further at the code, we do have some sort of out-of-order processing, in the sense that we read data
from data buckets that have unprocessed meta buckets before them in the brigade
(send_brigade_nonblocking). But once we actually start sending the data, we delete the appropriate buckets in the correct
order (writev_nonblocking).

Regards

Rüdiger

Re: TIL

Posted by Yann Ylavic <yl...@gmail.com>.
On Thu, Apr 28, 2016 at 2:40 AM, Yann Ylavic <yl...@gmail.com> wrote:
>>
>> What I saw was memory nullified, presumably by new apr_pcallocs, that should not have been freed already. More likely with more load and less logging, of course.
>
> So I'm rather thinking of a pool lifetime issue in h2 buckets
> handling, did you see the comment in ap_bucket_eor_create(), the case
> does not seem to be handled in h2_beam_bucket_make() (if that's the
> bucket you are talking about).

Hmm, the beam bucket is DATA (not META), so the issue would instead be
in h2_bucket_{eos,eoc}_destroy(), since h2_bucket_{eos,eoc}_make()
don't seem to take care of the h2_{stream,session} being destroyed
outside the bucket handling...
Is that possible?

Re: TIL

Posted by Stefan Eissing <st...@greenbytes.de>.
tl;dr

It seems that the core filter does things correctly and my reading of the code was wrong. Thanks to Yann for analyzing this.

Time to find my own mistakes...thanks for your help!

> Am 28.04.2016 um 02:40 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Wed, Apr 27, 2016 at 7:28 AM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> 
>> 
>>> Am 26.04.2016 um 23:57 schrieb Yann Ylavic <yl...@gmail.com>:
>>> 
>>> On Tue, Apr 26, 2016 at 4:49 PM, Stefan Eissing
>>> <st...@greenbytes.de> wrote:
>>>> Today I Learned the difference between writing
>>>> DATA + META
>>>> and
>>>> DATA + META + FLUSH
>>>> to the core output filters. I am astonished that
>>>> my code ever worked.
>>>> 
>>>> Seriously, what is the reason for this kind of
>>>> implementation?
>>> 
>>> Do you mean why META could be destroyed before (setaside-)DATA, in
>>> remove_empty_buckets()?
>> 
>> Yes. That was unexpected. My understanding of the pass_brigade contract was that buckets get destroyed in order of appearance (might be after the call).
> 
> Actually after re-reading the code, the core output filter looks good to me.
> remove_empty_buckets() will only remove buckets up to DATA buckets,
> and indeed httpd couldn't work otherwise.

Ah! I see it now. So "remove_empty_head_buckets" would be a more telling name. And since any buffered buckets are prepended before this is called, the buckets should only be destroyed in the order they were passed.

Hmmm. Seems I jumped to conclusions. Glad that you took the time to review this.

>> The h2 use case is to pass a meta bucket that, when destroyed, signals the end of life for data belonging to one stream. So all buckets before that one should have been either destroyed or setaside already.
> 
> That looks very similar to how EOR works.

Steal with pride.

>> With http/1.x and EOR, per my current reading of the code, this does not happen since EORs are always FLUSHed.
> 
> No, EOR is never flushed explicitly (sent alone in the brigade by
> ap_process_request_after_handler), the flush occurs if/when pipelining
> stops (or MAX_REQUESTS_IN_PIPELINE is reached).
> While requests are pipelined, so are responses (the usual case is not
> pipelining though, so EOR may look flushed, but not always...).

ap_process_request() does flush after (almost) each request, as I read it (http_request.c#458). That is why the http2 processing saw that pattern at the end of a request. But EOR and FLUSH are not passed in the same call, it seems.

>> And even if it did, releasing the request pool early will probably go unnoticed, as WRITE_COMPLETION will read all memory *before* a new pool is opened on the conn pool.
> 
> I debugged pipelining a while ago and would have noticed it; accurate
> ap_run_log_transaction() depends on this too.
> This is not something that crashes httpd at least :)
> 
>> 
>> This (again, if my reading is correct) no longer holds in h2: requests (streams) start and end all the time, and conn child pools are created/destroyed all the time.
>> 
>> What I saw was memory nullified, presumably by new apr_pcallocs, that should not have been freed already. More likely with more load and less logging, of course.
> 
> So I'm rather thinking of a pool lifetime issue in h2 buckets
> handling, did you see the comment in ap_bucket_eor_create(), the case
> does not seem to be handled in h2_beam_bucket_make() (if that's the
> bucket you are talking about).

As you noted in a later mail, h2_bucket_{eos,eoc}_destroy() are the ones cleaning up. One for streams, one for the whole session. And I think I took care of different lifetimes of bucket and stream/session pools. But I will go over that once again.

> I'm not familiar with your recent Beam changes, and don't know what
> lifetime the bucket is supposed to have, but couldn't your issue be
> that beam_bucket_destroy() is called too late?

It could be. The whole beam thing is easy to set up and use, but hard to tear down safely. I am still struggling to find the best way to set it up so that pools being cleared in various orders do not make it crash.


Re: TIL

Posted by Yann Ylavic <yl...@gmail.com>.
On Wed, Apr 27, 2016 at 7:28 AM, Stefan Eissing
<st...@greenbytes.de> wrote:
>
>
>> Am 26.04.2016 um 23:57 schrieb Yann Ylavic <yl...@gmail.com>:
>>
>> On Tue, Apr 26, 2016 at 4:49 PM, Stefan Eissing
>> <st...@greenbytes.de> wrote:
>>> Today I Learned the difference between writing
>>>  DATA + META
>>> and
>>>  DATA + META + FLUSH
>>> to the core output filters. I am astonished that
>>> my code ever worked.
>>>
>>> Seriously, what is the reason for this kind of
>>> implementation?
>>
>> Do you mean why META could be destroyed before (setaside-)DATA, in
>> remove_empty_buckets()?
>
> Yes. That was unexpected. My understanding of the pass_brigade contract was that buckets get destroyed in order of appearance (might be after the call).

Actually after re-reading the code, the core output filter looks good to me.
remove_empty_buckets() will only remove buckets up to DATA buckets,
and indeed httpd couldn't work otherwise.

>
> The h2 use case is to pass a meta bucket that, when destroyed, signals the end of life for data belonging to one stream. So all buckets before that one should have been either destroyed or setaside already.

That looks very similar to how EOR works.

>
> With http/1.x and EOR, per my current reading of the code, this does not happen since EORs are always FLUSHed.

No, EOR is never flushed explicitly (sent alone in the brigade by
ap_process_request_after_handler), the flush occurs if/when pipelining
stops (or MAX_REQUESTS_IN_PIPELINE is reached).
While requests are pipelined, so are responses (the usual case is not
pipelining though, so EOR may look flushed, but not always...).

> And even if it did, releasing the request pool early will probably go unnoticed, as WRITE_COMPLETION will read all memory *before* a new pool is opened on the conn pool.

I debugged pipelining a while ago and would have noticed it; accurate
ap_run_log_transaction() depends on this too.
This is not something that crashes httpd at least :)

>
> This (again, if my reading is correct) no longer holds in h2: requests (streams) start and end all the time, and conn child pools are created/destroyed all the time.
>
> What I saw was memory nullified, presumably by new apr_pcallocs, that should not have been freed already. More likely with more load and less logging, of course.

So I'm rather thinking of a pool lifetime issue in h2 buckets
handling, did you see the comment in ap_bucket_eor_create(), the case
does not seem to be handled in h2_beam_bucket_make() (if that's the
bucket you are talking about).
I'm not familiar with your recent Beam changes, and don't know what
lifetime the bucket is supposed to have, but couldn't your issue be
that beam_bucket_destroy() is called too late?

Regards,
Yann.

Re: TIL

Posted by Niklas Edmundsson <ni...@acc.umu.se>.
On Wed, 27 Apr 2016, Stefan Eissing wrote:

<snip>

> The h2 use case is to pass a meta bucket that, when destroyed, 
> signals the end of life for data belonging to one stream. So all 
> buckets before that one should have been either destroyed or 
> setaside already.

One workaround could be a custom (data) bucket type: pass a 0-length 
bucket of it on the brigade where appropriate...

However, I don't know if the remove-empty-buckets logic here and there 
wrecks this method as well?


/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  The paper is always strongest at the perforations.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Re: TIL

Posted by Stefan Eissing <st...@greenbytes.de>.

> Am 26.04.2016 um 23:57 schrieb Yann Ylavic <yl...@gmail.com>:
> 
> On Tue, Apr 26, 2016 at 4:49 PM, Stefan Eissing
> <st...@greenbytes.de> wrote:
>> Today I Learned the difference between writing
>>  DATA + META
>> and
>>  DATA + META + FLUSH
>> to the core output filters. I am astonished that
>> my code ever worked.
>> 
>> Seriously, what is the reason for this kind of
>> implementation?
> 
> Do you mean why META could be destroyed before (setaside-)DATA, in
> remove_empty_buckets()?

Yes. That was unexpected. My understanding of the pass_brigade contract was that buckets get destroyed in order of appearance (might be after the call).
> 
>> I would like to pass META buckets
>> in non-blocking way and *not* lose the order of
>> bucket destruction. Any explanation or advice is
>> appreciated!
> 
> You'd want to setaside (some) META buckets with DATA (in order), right?
> At least for EOR buckets, that would keep requests in memory longer
> than necessary, but maybe some other META are worth it...

It's all fine if users of the filters can be aware of what the conditions and rules are. It is part of the API for all modules that pass buckets through the filters.

The h2 use case is to pass a meta bucket that, when destroyed, signals the end of life for data belonging to one stream. So all buckets before that one should have been either destroyed or setaside already.

With http/1.x and EOR, per my current reading of the code, this does not happen since EORs are always FLUSHed. And even if it did, releasing the request pool early will probably go unnoticed, as WRITE_COMPLETION will read all memory *before* a new pool is opened on the conn pool.

This (again, if my reading is correct) no longer holds in h2: requests (streams) start and end all the time, and conn child pools are created/destroyed all the time. 

What I saw was memory nullified, presumably by new apr_pcallocs, that should not have been freed already. More likely with more load and less logging, of course. 

-Stefan

Re: TIL

Posted by Yann Ylavic <yl...@gmail.com>.
On Tue, Apr 26, 2016 at 4:49 PM, Stefan Eissing
<st...@greenbytes.de> wrote:
> Today I Learned the difference between writing
>   DATA + META
> and
>   DATA + META + FLUSH
> to the core output filters. I am astonished that
> my code ever worked.
>
> Seriously, what is the reason for this kind of
> implementation?

Do you mean why META could be destroyed before (setaside-)DATA, in
remove_empty_buckets()?

> I would like to pass META buckets
> in non-blocking way and *not* lose the order of
> bucket destruction. Any explanation or advice is
> appreciated!

You'd want to setaside (some) META buckets with DATA (in order), right?
At least for EOR buckets, that would keep requests in memory longer
than necessary, but maybe some other META are worth it...

Regards,
Yann.

Re: TIL

Posted by Graham Leggett <mi...@sharp.fm>.
On 26 Apr 2016, at 4:49 PM, Stefan Eissing <st...@greenbytes.de> wrote:

> Today I Learned the difference between writing 
>  DATA + META 
> and 
>  DATA + META + FLUSH
> to the core output filters. I am astonished that
> my code ever worked.
> 
> Seriously, what is the reason for this kind of
> implementation? I would like to pass META buckets
> in non-blocking way and *not* lose the order of 
> bucket destruction. Any explanation or advice is
> appreciated!

Definitely looks like a bug - if we’re deleting buckets out of order things will definitely break. I say we must fix the core.

Regards,
Graham
—