You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Ruediger Pluem <rp...@apache.org> on 2008/04/10 23:52:59 UTC

Re: svn commit: r592951 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/filters/ modules/http/ server/

On 02.12.2007 02:10, Graham Leggett wrote:
> Nick Kew wrote:
> 
>> You created a new kept_body input filter, which seems exactly right
>> to me.  But you're storing the kept body on the request_rec itself,
>> which looks like extra complexity.
>>
>> Why not make the new filter into its own module, and keep the kept_body
>> brigade on the filter's own config?  That way the complexity of dealing
>> with top-level vs other requests is kept in one place.  It's not clear
>> to me from the changes in http_filter.h that this patch deals with
>> that distinction (or is kept_body supposed to be able to originate
>> in non-origin requests)?
> 
> The kept body stuff is formed of two parts, the bit that sets aside the 
> body (in ap_discard_request_body()), and the filter that puts it back 
> again. It didn't make sense to separate the two functions, which is why 
> they are both in the same source file, the http module http_filters.c.
> 
>> On principle, we should be reducing the request_rec, not adding to it!
> 
> If we can find a way to avoid it, certainly.
> 
> The kept body filter is only added on subrequests, and only where 
> r->kept_body is not-NULL, and subrequests as I recall have their own 
> filter stack separate from the main filter stack. So as I understand it, 
> linking it to the subrequest filter stack may not work, as the 
> subrequest filter stack doesn't exist when ap_discard_request_body() is 
> called.

First of all sorry for commenting on this that late. I should have paid more 
attention back in December.

While it is true that the filter stack branches under certain conditions for
output filters, the same is not true for input filters. Given Joe's recent true
comment that ap_discard_request_body pulls all *through* the input filter chain
the correct (or more correct) solution for the problem seems to be to

1. Move the saving of the request body from ap_discard_request_body to a
    separate input filter that gets implemented by a module.
2. Use the insert filter_hook to insert this filter whenever configured.
3. I guess the filter can decide on its own what to do: If it is a subrequest
    pulling from it, then deliver from the buffered brigade otherwise buffer it
    and pass it on. Of course it has to paid attention in this case for
    incomplete request bodies and how to handle this.

IMHO this would make it possible to remove this stuff from the core and from
the request_rec where it does not really belong.

Regards

RĂ¼diger

Re: svn commit: r592951 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/filters/ modules/http/ server/

Posted by Graham Leggett <mi...@sharp.fm>.
Paul Querna wrote:

> +1, this revision r592951, and the derivate r647263 need to be 
> rethought.  IMO is that this doesn't belong in the reqeust_rec.
> 
> I'm only realizing this went in 10 months after the first commit, and 5 
> months after it was rewritten in r647263, a sign of my inactivity in 
> httpd land, so I'm not gonna 'veto' it, but damn I really don't like 
> having this stuff on the reqeust_rec.

Having had the time to think this through, to be honest I cannot think 
of a different place for it to go other than request_rec.

At the core of the problem is a flaw that has been repeated over and 
over inside lots of modules within the server, and which this code 
partially fixes.

The flaw is that by design, the core will allow a request to spawn a 
subrequest, but the core offers no functionality for the subrequest to 
read a request body.

So you get the same repeated code, over and over again in various 
modules, incarnations of this:

/* what we have now */
if ( ! we_are_a_subrequest) {
    read_request_body();
}
else {
    ignore_body_somehow();
}

Now whether it is relevant for a subrequest to read a request body isn't 
important, what's important is that the subrequest isn't even allowed to 
try, and the subrequest has to go out of its way to make sure it doesn't 
read from the network twice. The subrequest shouldn't have to care about 
this stuff.

With the kept_body code, two new possibilities are available that have 
not been available before:

- Where the admin wants to set aside the body to be re-read by 
subrequests, the kept_body filter makes this possible, by allowing a 
kept body to be reread by a subrequest safely. The subrequest now 
becomes a fully fledged subrequest, able to access everything the 
request can.

- Where the admin doesn't want to set aside the body, the kept_body 
filter can be inserted anyway, with an empty brigade. This "caps" the 
input filter stack, guaranteeing that the network can never be read 
twice, regardless of how many times it tries to read from the network.

These two possibilities together mean that any module is now free to try 
and read a request body if it wants to, and the kept_body filter 
guarantees that the underlying network is never read from twice.

This in turn means that all the different variations of "detect 
subrequest and try to to read the body a second time" that are scattered 
around the code become obsolete and can be removed, resulting in a much 
simpler contract for module code, and a significantly cleaner server:

/* a cleaner alternative */
read_request_body();

In order for the partial fix to become a full fix, the kept_body filter 
should be unconditionally added to all subrequests (with an empty 
brigade in the simplest case), and all the instances of the "if ( ! 
we_are_a_subrequest)" code should be collapsed and removed, and modules 
should be free to try and read the body as many times as they like, safe 
in the knowledge that the kept_body filter guarantees the network will 
only ever be read from once.

Regards,
Graham
--

Re: svn commit: r592951 - in /httpd/httpd/trunk: ./ docs/manual/mod/ include/ modules/filters/ modules/http/ server/

Posted by Paul Querna <ch...@force-elite.com>.
Ruediger Pluem wrote:
> On 02.12.2007 02:10, Graham Leggett wrote:
>> Nick Kew wrote:
>>
>>> You created a new kept_body input filter, which seems exactly right
>>> to me.  But you're storing the kept body on the request_rec itself,
>>> which looks like extra complexity.
>>>
>>> Why not make the new filter into its own module, and keep the kept_body
>>> brigade on the filter's own config?  That way the complexity of dealing
>>> with top-level vs other requests is kept in one place.  It's not clear
>>> to me from the changes in http_filter.h that this patch deals with
>>> that distinction (or is kept_body supposed to be able to originate
>>> in non-origin requests)?
>>
>> The kept body stuff is formed of two parts, the bit that sets aside 
>> the body (in ap_discard_request_body()), and the filter that puts it 
>> back again. It didn't make sense to separate the two functions, which 
>> is why they are both in the same source file, the http module 
>> http_filters.c.
>>
>>> On principle, we should be reducing the request_rec, not adding to it!
>>
>> If we can find a way to avoid it, certainly.
>>
>> The kept body filter is only added on subrequests, and only where 
>> r->kept_body is not-NULL, and subrequests as I recall have their own 
>> filter stack separate from the main filter stack. So as I understand 
>> it, linking it to the subrequest filter stack may not work, as the 
>> subrequest filter stack doesn't exist when ap_discard_request_body() 
>> is called.
> 
> First of all sorry for commenting on this that late. I should have paid 
> more attention back in December.
> 
> While it is true that the filter stack branches under certain conditions 
> for
> output filters, the same is not true for input filters. Given Joe's 
> recent true
> comment that ap_discard_request_body pulls all *through* the input 
> filter chain
> the correct (or more correct) solution for the problem seems to be to
> 
> 1. Move the saving of the request body from ap_discard_request_body to a
>    separate input filter that gets implemented by a module.
> 2. Use the insert filter_hook to insert this filter whenever configured.
> 3. I guess the filter can decide on its own what to do: If it is a 
> subrequest
>    pulling from it, then deliver from the buffered brigade otherwise 
> buffer it
>    and pass it on. Of course it has to paid attention in this case for
>    incomplete request bodies and how to handle this.
> 
> IMHO this would make it possible to remove this stuff from the core and 
> from
> the request_rec where it does not really belong.

+1, this revision r592951, and the derivate r647263 need to be 
rethought.  IMO is that this doesn't belong in the reqeust_rec.

I'm only realizing this went in 10 months after the first commit, and 5 
months after it was rewritten in r647263, a sign of my inactivity in 
httpd land, so I'm not gonna 'veto' it, but damn I really don't like 
having this stuff on the reqeust_rec.

-Paul