Posted to dev@httpd.apache.org by Matthieu Estrade <me...@axiliance.com> on 2002/09/23 15:28:10 UTC

Patch mod_proxy: mod_proxy + mod_cache problem

Hi again :)

I wrote a patch that modifies mod_proxy to pass the entire data (the response
from the backend server) to the output filter at once, instead of brigade by brigade.

It seems to work well...

Matthieu

Matthieu Estrade wrote:

>
> Hi again,
>
> The problem seems to be in the proxy.
>
> When the proxy reads the data and passes it to the output filter, it does:
>
> while (ap_get_brigade){
>
> ap_pass_brigade(output_filter)
>
> }
>
> So if the data isn't read in a single brigade, mod_cache can't work,
> because it will try to cache only the first brigade, which holds just part of the data.
>
> Do you think it's better to modify how the proxy passes data to the 
> output filter,
> or to modify the way mod_cache gets its data from the bucket brigade?
>
> Matthieu



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Graham Leggett <mi...@sharp.fm>.
Matthieu Estrade wrote:

> I wrote a patch that modifies mod_proxy to pass the entire data (the response
> from the backend server) to the output filter at once, instead of brigade by brigade.

AFAIK the problem stems from a limitation in mod_cache where it can only 
cache stuff in a single brigade at the moment.

Proxy works correctly already - there is no such thing as "pass the 
entire data", because there is no way for proxy to know just how much 
data there is in advance, and there is no guarantee that data will all 
fit in memory at once.

Regards,
Graham
-- 
-----------------------------------------
minfrin@sharp.fm		"There's a moon
					over Bourbon Street
						tonight..."



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Graham Leggett <mi...@sharp.fm>.
Ian Holsman wrote:

>> The cache filter is supposed to run after all the filters for maximum 
>> caching advantage.
>>
> I disagree on this. The cache filter should be able to be placed where the 
> admin wants it.

In the advanced case, yes - but in the simplest 
turn-it-on-and-it-should-just-work case the cache should be as optimised 
as possible.

> What the code should do is remember which filters were placed above the 
> cache_input filter, and restore these filters on cache_output.

This starts getting complex.

The original idea for the cache was to be a transparent public proxy cache 
layered between Apache and the browser. Because RFC2616 dictates how 
such a proxy cache should work, determining whether we have done the 
"right" thing should be relatively easy.

As we move the caching filters further towards the core of Apache, 
suddenly things start to get complicated, as there is no longer a 
guarantee that the point into which cache has been inserted is compliant 
with RFC2616 in the first place, which in turn makes things less likely 
to work predictably or correctly.

Yes, being able to put the cache wherever the admin wants is in theory a 
nice thing to have, but actually achieving this in practice is something 
that's going to have to be looked at pretty carefully.

Regards,
Graham



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Ian Holsman <ia...@apache.org>.
Graham Leggett wrote:
> Matthieu Estrade wrote:
> 
>> I agree with you about the proxy...
>> Do you think it's possible to force the cache filter to run after 
>> all the proxy filters?
> 
> 
> The cache filter is supposed to run after all the filters for maximum 
> caching advantage.
> 
I disagree on this. The cache filter should be able to be placed where the 
admin wants it.
What the code should do is remember which filters were placed above the 
cache_input filter, and restore these filters on cache_output.

This would allow people to have pages designed for a specific user, while 
still caching most of the content.

> Regards,
> Graham



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Graham Leggett <mi...@sharp.fm>.
Matthieu Estrade wrote:

> My problem is that I can't set up the cache_in_filter filter to be executed 
> after mod_proxy passes its "last" brigade to the filter chain

That's because there is never any such thing as the "last" brigade. 
Responses are not limited by length anywhere within the filter 
subsystem, this is the whole point behind using brigades - long 
neverending chains of data that can be of any length, even bigger than 
available RAM.

The cache filter works (should work) along the idea that each brigade is 
simply added to the one before. If this gets too big, we throw it away 
and the object is no longer cached.
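
A minimal sketch of the accumulate-or-abandon idea described above, written
as plain C rather than against the real APR bucket API. The names
cache_accum, cache_accum_init, and cache_accum_add are hypothetical and do
not come from mod_cache; each call to cache_accum_add stands in for one
brigade reaching the cache filter, and the size cap is an assumed parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-response state for a cache filter (not mod_cache code). */
typedef struct {
    char  *data;      /* bytes accumulated so far */
    size_t len;       /* number of valid bytes in data */
    size_t max_len;   /* assumed cap: stop caching beyond this size */
    int    abandoned; /* nonzero once the object has been thrown away */
} cache_accum;

void cache_accum_init(cache_accum *c, size_t max_len)
{
    assert(c != NULL);
    c->data = NULL;
    c->len = 0;
    c->max_len = max_len;
    c->abandoned = 0;
}

/* Append one brigade's worth of bytes to what was saved before.
 * Returns 1 while the response is still being cached, 0 once it has
 * been abandoned. The caller passes the data downstream either way. */
int cache_accum_add(cache_accum *c, const char *chunk, size_t n)
{
    char *grown;

    assert(c != NULL && (chunk != NULL || n == 0));

    if (c->abandoned)
        return 0;

    /* len never exceeds max_len, so this subtraction cannot wrap. */
    if (n > c->max_len - c->len) {
        /* Too big: throw the copy away and stop caching this object. */
        free(c->data);
        c->data = NULL;
        c->len = 0;
        c->abandoned = 1;
        return 0;
    }

    if (n == 0)
        return 1;

    grown = realloc(c->data, c->len + n);
    if (grown == NULL) {
        /* Out of memory is treated like "too big": give up on caching. */
        free(c->data);
        c->data = NULL;
        c->len = 0;
        c->abandoned = 1;
        return 0;
    }

    memcpy(grown + c->len, chunk, n);
    c->data = grown;
    c->len += n;
    return 1;
}
```

The point of the design is that giving up on caching never interrupts the
response: the filter only discards its own copy, and the client still gets
every brigade.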

Regards,
Graham



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by "Paul J. Reder" <re...@remulak.net>.

Matthieu Estrade wrote:

In a previous note you explained:

> For the file that is not cached because of the MaxStreamingBuffer error, the code above never detects an EOS bucket, so it returns a size to the following code:

> if (!all_bucket_here) {
>     if (unresolved_length || size > MaxStreamingBuffer)
>         exit with MaxStreamingBuffer Error.
> }



If it entered this chunk of code, then the cache code either:
1) encountered a bucket with an unresolved length (like a socket bucket), or
2) found that the cumulative length of all buckets/brigades encountered so far
      was greater than the value of MaxStreamingBuffer.

If the program goes into this branch of code, it will remove the cache filter
from the filter chain. This is why the cache code is not looking at any of
the subsequent brigades. Cache has already determined that it has no reason
to look at them.

If this is happening due to exceeding MaxStreamingBuffer, try setting the
value to something higher and see if it works for you. If it is encountering
a bucket with an unresolved length then there is nothing that can be done.
That type of bucket cannot be cached.
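
The check described above can be sketched in plain C as follows. This is an
illustration under assumptions, not the mod_cache source: sketch_bucket,
cache_decision, and inspect_brigade are hypothetical names, a length of -1 is
used here as the marker for "unresolved length", and max_streaming_buffer
stands in for the MaxStreamingBuffer setting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified bucket: just a length and an EOS flag. */
typedef struct {
    long length;  /* byte count, or -1 when the length is unresolved */
    int  is_eos;  /* nonzero for the end-of-stream marker */
} sketch_bucket;

typedef enum {
    CACHE_COMPLETE,        /* EOS seen: the whole response is in hand */
    CACHE_KEEP_BUFFERING,  /* no EOS yet, still within the limit */
    CACHE_DECLINE          /* give up: the filter would remove itself */
} cache_decision;

/* Inspect one brigade. *buffered carries the cumulative size across calls. */
cache_decision inspect_brigade(const sketch_bucket *buckets, size_t count,
                               size_t *buffered, size_t max_streaming_buffer)
{
    int all_buckets_here = 0;
    int unresolved_length = 0;
    size_t i;

    assert(buckets != NULL && buffered != NULL);

    for (i = 0; i < count; i++) {
        if (buckets[i].is_eos) {
            all_buckets_here = 1;
            break;
        }
        if (buckets[i].length < 0)
            unresolved_length = 1;
        else
            *buffered += (size_t)buckets[i].length;
    }

    /* The quoted test only applies when no EOS bucket has been seen. */
    if (all_buckets_here)
        return CACHE_COMPLETE;

    if (unresolved_length || *buffered > max_streaming_buffer)
        return CACHE_DECLINE;

    return CACHE_KEEP_BUFFERING;
}
```

Once CACHE_DECLINE is returned, nothing later in the response is examined,
which matches the observation that the cache never sees the subsequent
brigades.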

Proxy is working the way it should.


> Hi Paul,
> 
> I know that mod_cache only works on one brigade, but that is not my 
> problem.
> My problem is that I can't set up the cache_in_filter filter to be executed 
> after mod_proxy passes its "last" brigade to the filter chain.
> Currently, mod_cache_filter_in is executed only once, when mod_proxy 
> has passed the "first" brigade to the output filters,
> so the other brigades containing data are not available.
> 
> So, do I have to assume that cache_filter_in must be able to 
> cache a document over several invocations,
> or is it possible for the cache filter to be run once 
> mod_proxy has finished passing the "last" brigade,
> so that it reads all the brigades and caches them?
> 
> regards,
> 
> Matthieu
> 


-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Matthieu Estrade <me...@axiliance.com>.
Hi Paul,

I know that mod_cache only works on one brigade, but that is not my 
problem.
My problem is that I can't set up the cache_in_filter filter to be executed 
after mod_proxy passes its "last" brigade to the filter chain.
Currently, mod_cache_filter_in is executed only once, when mod_proxy 
has passed the "first" brigade to the output filters,
so the other brigades containing data are not available.

So, do I have to assume that cache_filter_in must be able to 
cache a document over several invocations,
or is it possible for the cache filter to be run once 
mod_proxy has finished passing the "last" brigade,
so that it reads all the brigades and caches them?

regards,

Matthieu

Paul J. Reder wrote:

> Actually, the problem is in the fact that there is more than one
> brigade and the cache code can't currently handle that.
>
> Proxy is working the way it is supposed to. This allows Apache
> to process large responses without having to buffer the whole
> thing inside Apache. Apache uses less memory, and the user starts
> seeing results sooner. If proxy were to process all of the
> brigades of a large response before the next filter were allowed
> access, then proxy could potentially buffer a *huge* amount of
> data.
>
> The answer is that the cache code currently only caches responses
> that arrive in one brigade. Proxy isn't the problem.




Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by "Paul J. Reder" <re...@remulak.net>.
Actually, the problem is in the fact that there is more than one
brigade and the cache code can't currently handle that.

Proxy is working the way it is supposed to. This allows Apache
to process large responses without having to buffer the whole
thing inside Apache. Apache uses less memory, and the user starts
seeing results sooner. If proxy were to process all of the
brigades of a large response before the next filter were allowed
access, then proxy could potentially buffer a *huge* amount of
data.

The answer is that the cache code currently only caches responses
that arrive in one brigade. Proxy isn't the problem.
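
To illustrate why streaming keeps memory use bounded, here is a small sketch
in plain C. It does not use the real ap_get_brigade or ap_pass_brigade calls;
read_fn, pass_fn, stream_response, and the demo_* helpers are hypothetical
names, and the 8192-byte buffer is an arbitrary assumption. Each loop
iteration plays the role of one brigade being read from the backend and
passed to the output filters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callbacks standing in for "get brigade" and "pass brigade". */
typedef size_t (*read_fn)(void *src, char *buf, size_t cap); /* 0 = end */
typedef int (*pass_fn)(void *ctx, const char *buf, size_t len, int eos);

/* Stream a response chunk by chunk; returns how many times data or the
 * end-of-stream marker was passed downstream. Only one fixed-size buffer
 * is ever held, however large the response is. */
size_t stream_response(read_fn read_chunk, void *src,
                       pass_fn pass_chunk, void *ctx)
{
    char buf[8192];
    size_t n;
    size_t passes = 0;

    assert(read_chunk != NULL && pass_chunk != NULL);

    while ((n = read_chunk(src, buf, sizeof buf)) > 0) {
        if (pass_chunk(ctx, buf, n, 0) != 0)
            return passes;          /* downstream error: stop early */
        passes++;
    }
    pass_chunk(ctx, buf, 0, 1);     /* final, empty pass carrying EOS */
    return passes + 1;
}

/* Demo backend: produces a fixed number of filler bytes. */
typedef struct { size_t remaining; } demo_source;

size_t demo_read(void *src, char *buf, size_t cap)
{
    demo_source *s = src;
    size_t n = s->remaining < cap ? s->remaining : cap;
    memset(buf, 'x', n);
    s->remaining -= n;
    return n;
}

/* Demo downstream filter: records what it was handed on each call. */
typedef struct {
    size_t calls;    /* how many times the filter was invoked */
    size_t bytes;    /* total bytes seen */
    size_t largest;  /* largest single chunk seen */
    int    saw_eos;  /* set when the end-of-stream pass arrives */
} demo_filter;

int demo_pass(void *ctx, const char *buf, size_t len, int eos)
{
    demo_filter *f = ctx;
    (void)buf;
    f->calls++;
    f->bytes += len;
    if (len > f->largest)
        f->largest = len;
    if (eos)
        f->saw_eos = 1;
    return 0;
}
```

With a 20000-byte demo_source, demo_pass is invoked several times and never
sees more than one buffer's worth of data at once, so a downstream filter
that wants the whole response has to accumulate it across invocations.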

Matthieu Estrade wrote:

> Hi Graham,
> 
> The problem is that the filter is called between two ap_pass_brigade calls 
> made by the reverse proxy, like this:
> 
> proxy:
> get_brigade (data from backend)
> pass_brigade(pass data to outputfilter)
> cache_filter
> get_brigade(data from backend)
> pass_brigade(pass data to outputfilter)
> 
> all for a single response from the backend server.
> 
> regards,
> Matthieu





Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Matthieu Estrade <me...@axiliance.com>.
Hi Graham,

The problem is that the filter is called between two ap_pass_brigade calls 
made by the reverse proxy, like this:

proxy:
get_brigade (data from backend)
pass_brigade(pass data to outputfilter)
cache_filter
get_brigade(data from backend)
pass_brigade(pass data to outputfilter)

all for a single response from the backend server.

regards,
Matthieu

Graham Leggett wrote:

> Matthieu Estrade wrote:
>
>> I agree with you about the proxy...
>> Do you think it's possible to force the cache filter to run after 
>> all the proxy filters?
>
>
> The cache filter is supposed to run after all the filters for maximum 
> caching advantage.
>
> Regards,
> Graham





Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Graham Leggett <mi...@sharp.fm>.
Matthieu Estrade wrote:

> I agree with you about the proxy...
> Do you think it's possible to force the cache filter to run after 
> all the proxy filters?

The cache filter is supposed to run after all the filters for maximum 
caching advantage.

Regards,
Graham



Re: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Matthieu Estrade <me...@axiliance.com>.
Hi Bill,

I agree with you about the proxy...
Do you think it's possible to force the cache filter to run after 
all the proxy filters?

Matthieu

Bill Stoddard wrote:

>> Hi again :)
>>
>> I wrote a patch that modifies mod_proxy to pass the entire data (the response
>> from the backend server) to the output filter at once, instead of brigade by brigade.
>>
> One comment (having not reviewed the patch): Proxy should stream bytes from
> the backend server to the client as those bytes arrive. Proxy should not
> require that the entire response from the backend be received before writing
> to the client.
>
> Bill




RE: Patch mod_proxy: mod_proxy + mod_cache problem

Posted by Bill Stoddard <bi...@wstoddard.com>.
> Hi again :)
>
> I wrote a patch that modifies mod_proxy to pass the entire data (the response
> from the backend server) to the output filter at once, instead of brigade by brigade.
>
One comment (having not reviewed the patch): Proxy should stream bytes from
the backend server to the client as those bytes arrive. Proxy should not
require that the entire response from the backend be received before writing
to the client.

Bill