You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cocoon.apache.org by Gianugo Rabellino <gi...@apache.org> on 2003/02/24 15:06:59 UTC

CachingProcessingPipeline: new Expires code

I finally sat down and cleaned up my first implementation of the 
"expires" pipeline parameter for internal use too. With this change the 
CachingPipeline would check first if an "expires" attribute is present 
and still fresh: upon success it will give up any further checks and 
serve the response straight away from cache: this means that expensive 
and possibly non-cachable Cocoon pipelines can now be controlloed from 
the outside and used internally (say, as an example, for aggregation).

So far my tests have been successful (and showing impressive performance 
gains), but I'm a bit reluctant to commit this code right away, since it 
touches a core part of Cocoon, where I had no previous experience, and 
definitely I don't want to break your setups. :-) What would you think 
is the best way to proceed? Commit right away? Post a patch to Bugzilla? 
Have someone (Carsten, AYT? :-)) review the code?

LMK,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


RE: CachingProcessingPipeline: new Expires code

Posted by Carsten Ziegeler <cz...@s-und-n.de>.
Gianugo Rabellino wrote:
> > Ah ok, does this mean that if an expires date is set, you want to
> > cache the pipeline, regardless if all components in this 
> pipeline support
> > the Cacheable... interface?
> 
> Exactly. Meaning that the "expires" setting is a way to tell the 
> pipeline engine that you know better if and how long a resource is valid.
> 
> > I think, it is much better to completly remove this expires handling
> > from the current caching implementation and great an own 
> expires pipeline
> > implementation, that simply checks for the expire parameter and don't
> > care about the key/validity handling. This is why we now have 
> the pipelines
> > as own components, so we can make different implementations and don't
> > need to put everything into one implementation.
> 
> I've been thinking about it, and indeed it might make sense. My point, 
> however, is that there are two sides to consider:
> 
> 1. this ExpiresEnabledCachingProcessingPipeline would rely *a lot* on 
> the CachingProcessingPipeline, meaning that it would just be an 
> extension built on top of it and "falling back" to the default 
> implementation if an expires is not (anymore?) provided, so I'm not dead 
> sure that having a separate implementation is the best way to go, also 
> because of usability concerns;
> 
I see your point, but I'm not quiet sure if this is the way it should be.
If an expires date is set, the complete response is cached for this
duration - this is a simple 5 line implementation ( :) just kidding) where
you don't need much of the current caching implementation.
If no expires is set, well...then don't cache, perhaps.

> 2. HTTP expires: it should be in theory possible to set them from any 
> pipeline (and definitely this is what happens, since the code is in 
> AbstractProcessingPipeline), but I'm afraid it would be somehow 
> misleading to have a different behaviour (no internal use) depending on 
> the actual caching pipeline implementation.
Yes, I think it was me who put it into the AbstractProcessingPipeline, but
perhaps it's not correct at that place.

> 
> Agreed, anyhow: it would be cleaner, and I'll think about it. It would 
> also be the case to understand if and how this code might be integrated 
> in the CachingPointProcessingPipeline. Suggestions?
> 
No, hmm...really complicated...to be honest, I have currently no clue.

Carsten

Re: CachingProcessingPipeline: new Expires code

Posted by Gianugo Rabellino <gi...@apache.org>.
Carsten Ziegeler wrote:
> 
>>Carsten Ziegeler wrote:
>>
>>>>, then some
>>>>more changes are needed to take into account that an expires header is
>>>>present: if so, there should be a way to both generate the key
>>
>>and cache
>>
>>>>the output and to update/remove entries from the cache if the expires
>>>>header is changed or removed from the pipeline. What would be the best
>>>>way to do that? I'm thinking about having an object that generates key
>>>>on behalf of non cachable components, would that be enough?
>>>
>>>Hmm, I haven't looked yet into your code. What happens now if an expires
>>>date is set, but the pipeline does not have a cacheable generator?
>>
>>Actually nothing. My code relies and performs expires logic on the
>>CachedResponse object, so if no one will be created, there is nothing I
>>can do ATM.
>>
>>What happens, then, is that my patches are good for already cachable
>>pipelines, where there is a cachedresponse that is used right away
>>without any further checking. The next step would be to modify
>>AbstractCachingProcessingPipeline#generateCachingKey(): as of now
>>(roughly) it checks if components are cacheable and if not it gives up;
>>it should instead check for an explicit expires setting and, if that is
>>the case, provide a key by itself for the whole pipeline and cache the
>>whole response right away. But it should also take into account the case
>>where the expires header is removed (modifications wouldn't be a problem
>>since they're already handled). And well, I must confess that I still
>>have some troubles figuring out exactly the caching process. Will work
>>on that, and any help/suggestion is most welcome. :-)
>>
> 
> Ah ok, does this mean that if an expires date is set, you want to
> cache the pipeline, regardless if all components in this pipeline support
> the Cacheable... interface?

Exactly. Meaning that the "expires" setting is a way to tell the 
pipeline engine that you know better if and how long a resource is valid.

> I think, it is much better to completly remove this expires handling
> from the current caching implementation and great an own expires pipeline
> implementation, that simply checks for the expire parameter and don't
> care about the key/validity handling. This is why we now have the pipelines
> as own components, so we can make different implementations and don't
> need to put everything into one implementation.

I've been thinking about it, and indeed it might make sense. My point, 
however, is that there are two sides to consider:

1. this ExpiresEnabledCachingProcessingPipeline would rely *a lot* on 
the CachingProcessingPipeline, meaning that it would just be an 
extension built on top of it and "falling back" to the default 
implementation if an expires is not (anymore?) provided, so I'm not dead 
sure that having a separate implementation is the best way to go, also 
because of usability concerns;

2. HTTP expires: it should be in theory possible to set them from any 
pipeline (and definitely this is what happens, since the code is in 
AbstractProcessingPipeline), but I'm afraid it would be somehow 
misleading to have a different behaviour (no internal use) depending on 
the actual caching pipeline implementation.

Agreed, anyhow: it would be cleaner, and I'll think about it. It would 
also be the case to understand if and how this code might be integrated 
in the CachingPointProcessingPipeline. Suggestions?

Ciao,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


RE: CachingProcessingPipeline: new Expires code

Posted by Carsten Ziegeler <cz...@s-und-n.de>.
Gianugo Rabellino wrote:
>
> Carsten Ziegeler wrote:
> >>, then some
> >>more changes are needed to take into account that an expires header is
> >>present: if so, there should be a way to both generate the key
> and cache
> >>the output and to update/remove entries from the cache if the expires
> >>header is changed or removed from the pipeline. What would be the best
> >>way to do that? I'm thinking about having an object that generates key
> >>on behalf of non cachable components, would that be enough?
> >
> > Hmm, I haven't looked yet into your code. What happens now if an expires
> > date is set, but the pipeline does not have a cacheable generator?
>
> Actually nothing. My code relies and performs expires logic on the
> CachedResponse object, so if no one will be created, there is nothing I
> can do ATM.
>
> What happens, then, is that my patches are good for already cachable
> pipelines, where there is a cachedresponse that is used right away
> without any further checking. The next step would be to modify
> AbstractCachingProcessingPipeline#generateCachingKey(): as of now
> (roughly) it checks if components are cacheable and if not it gives up;
> it should instead check for an explicit expires setting and, if that is
> the case, provide a key by itself for the whole pipeline and cache the
> whole response right away. But it should also take into account the case
> where the expires header is removed (modifications wouldn't be a problem
> since they're already handled). And well, I must confess that I still
> have some troubles figuring out exactly the caching process. Will work
> on that, and any help/suggestion is most welcome. :-)
>
Ah ok, does this mean that if an expires date is set, you want to
cache the pipeline, regardless if all components in this pipeline support
the Cacheable... interface?

I think, it is much better to completly remove this expires handling
from the current caching implementation and great an own expires pipeline
implementation, that simply checks for the expire parameter and don't
care about the key/validity handling. This is why we now have the pipelines
as own components, so we can make different implementations and don't
need to put everything into one implementation.
Does this make sense?

Carsten


Re: CachingProcessingPipeline: new Expires code

Posted by Gianugo Rabellino <gi...@apache.org>.
Carsten Ziegeler wrote:
>>, then some 
>>more changes are needed to take into account that an expires header is 
>>present: if so, there should be a way to both generate the key and cache 
>>the output and to update/remove entries from the cache if the expires 
>>header is changed or removed from the pipeline. What would be the best 
>>way to do that? I'm thinking about having an object that generates key 
>>on behalf of non cachable components, would that be enough?
> 
> Hmm, I haven't looked yet into your code. What happens now if an expires
> date is set, but the pipeline does not have a cacheable generator?

Actually nothing. My code relies and performs expires logic on the 
CachedResponse object, so if no one will be created, there is nothing I 
can do ATM.

What happens, then, is that my patches are good for already cachable 
pipelines, where there is a cachedresponse that is used right away 
without any further checking. The next step would be to modify 
AbstractCachingProcessingPipeline#generateCachingKey(): as of now 
(roughly) it checks if components are cacheable and if not it gives up; 
it should instead check for an explicit expires setting and, if that is 
the case, provide a key by itself for the whole pipeline and cache the 
whole response right away. But it should also take into account the case 
where the expires header is removed (modifications wouldn't be a problem 
since they're already handled). And well, I must confess that I still 
have some troubles figuring out exactly the caching process. Will work 
on that, and any help/suggestion is most welcome. :-)

Thanks,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


RE: CachingProcessingPipeline: new Expires code

Posted by Carsten Ziegeler <cz...@s-und-n.de>.

> -----Original Message-----
> From: Gianugo Rabellino [mailto:gianugo@apache.org]
> Sent: Monday, February 24, 2003 5:08 PM
> To: cocoon-dev@xml.apache.org
> Subject: Re: CachingProcessingPipeline: new Expires code
> 
> 
> Carsten Ziegeler wrote:
> 
> > I would say: commit it and we (I?) can review it.
> 
> Done. Fire at will. :-)
> 
> Two things to consider and still on my TODO:
> 
> 1. I understand from AbstractCachingProcessingPipeline that a cache 
> entry is not going to be built if the generator is not cacheable. Also, 
> the cache process stops at the first non cachable component in the 
> pipeline. If that is the case (and indeed it makes sense)

Yes, exactly, this is the way it works.

>, then some 
> more changes are needed to take into account that an expires header is 
> present: if so, there should be a way to both generate the key and cache 
> the output and to update/remove entries from the cache if the expires 
> header is changed or removed from the pipeline. What would be the best 
> way to do that? I'm thinking about having an object that generates key 
> on behalf of non cachable components, would that be enough?
Hmm, I haven't looked yet into your code. What happens now if an expires
date is set, but the pipeline does not have a cacheable generator?

Carsten

Re: CachingProcessingPipeline: new Expires code

Posted by Gianugo Rabellino <gi...@apache.org>.
Carsten Ziegeler wrote:

> I would say: commit it and we (I?) can review it.

Done. Fire at will. :-)

Two things to consider and still on my TODO:

1. I understand from AbstractCachingProcessingPipeline that a cache 
entry is not going to be built if the generator is not cacheable. Also, 
the cache process stops at the first non cachable component in the 
pipeline. If that is the case (and indeed it makes sense), then some 
more changes are needed to take into account that an expires header is 
present: if so, there should be a way to both generate the key and cache 
the output and to update/remove entries from the cache if the expires 
header is changed or removed from the pipeline. What would be the best 
way to do that? I'm thinking about having an object that generates key 
on behalf of non cachable components, would that be enough?

2. Some work needs to be done on the HTTP response abstraction. As of 
now it might well be that a component in the pipeline sets the HTTP 
Expires header in an arbitrary way, thus overriding the pipeline 
setting. This is a more general problem, it shows up with these 
modifications but it could happen with every pipeline: the value sent to 
the browser/cache would be the last one set in the pipeline. And it's 
not limited to the Expires header, actually: every component has access 
to the response object, so it's free to fsck up even the most carefully 
crafted pipeline. This is a real PITA, expecially if we are to continue 
on the path of proxy-friendly HTTP headers, where the resulting response 
should come from a very careful and thought out logic-

I'm thinking of the best way to intercept the setHeader() calls and put 
some logic in there in order to use the setting that makes more sense: 
under a "non expires" pipeline, it should be the minimum value, while if 
an "expires" setting is present, its value shouldn't be overwritten. 
Sylvain had the clever idea of adding listeners to the response 
environment, but I'm wondering if this wouldn't be too heavy since for 
every request a new set of listeners would have to be built.

Suggestions?

Ciao,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


Re: CachingProcessingPipeline: new Expires code

Posted by Jeremy Quinn <je...@media.demon.co.uk>.
On Monday, February 24, 2003, at 02:09 PM, Carsten Ziegeler wrote:

> I would say: commit it and we (I?) can review it.
>
> Sounds great!
>

+1 !!!!

well done, SQL caching .... here we come!

regards Jeremy


RE: CachingProcessingPipeline: new Expires code

Posted by Carsten Ziegeler <cz...@s-und-n.de>.
I would say: commit it and we (I?) can review it.

Sounds great!
Carsten


> -----Original Message-----
> From: Gianugo Rabellino [mailto:gianugo@apache.org]
> Sent: Monday, February 24, 2003 3:07 PM
> To: cocoon-dev@xml.apache.org
> Subject: CachingProcessingPipeline: new Expires code
> 
> 
> I finally sat down and cleaned up my first implementation of the 
> "expires" pipeline parameter for internal use too. With this change the 
> CachingPipeline would check first if an "expires" attribute is present 
> and still fresh: upon success it will give up any further checks and 
> serve the response straight away from cache: this means that expensive 
> and possibly non-cachable Cocoon pipelines can now be controlloed from 
> the outside and used internally (say, as an example, for aggregation).
> 
> So far my tests have been successful (and showing impressive performance 
> gains), but I'm a bit reluctant to commit this code right away, since it 
> touches a core part of Cocoon, where I had no previous experience, and 
> definitely I don't want to break your setups. :-) What would you think 
> is the best way to proceed? Commit right away? Post a patch to Bugzilla? 
> Have someone (Carsten, AYT? :-)) review the code?
> 
> LMK,
> 
> -- 
> Gianugo Rabellino
> Pro-netics s.r.l.
> http://www.pro-netics.com
>