Posted to dev@cocoon.apache.org by Gianugo Rabellino <gi...@apache.org> on 2003/02/11 11:57:31 UTC

[RT] More on caching, expires, and proxy-friendly headers

This RT integrates the one done more than one year ago and available at 
http://marc.theaimsgroup.com/?t=101074439900001&r=1&w=2.

As of now you know that we have a basic HTTP header control that mimics 
at a pipeline level the mod_expires functionality of the Apache HTTPD 
server. This was a good start, but now I feel it's time to refine it and 
make it better. Work is needed on two sides:

Proxy handling
==============

The approach to full proxy compliance should be done, once again :-), in 
microsteps. I've been reading the HTTP/1.1 specs and the proxy-related 
RFCs, and boy, it's not easy at all to implement a fully proxy compliant 
system. It can be done, but it requires serious thinking and a major 
rework of the request handling phase.

Full proxy compliance depends on the ability to deal with conditional 
requests, handling a bunch of request headers that are all in some way 
interdependent and tricky to say the least. I'm not saying that we 
shouldn't do that sooner or later, but I'd rather plan this activity 
carefully, and possibly together with someone (Chuck?) from the httpd 
group working on the proxy part, in order to ensure that things work 
smoothly.

So, the first microstep is an easy one, just as a start. The companion 
to the Expires header is the "Cache-Control" header 
(http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9): this 
header allows for finer-grained control over the response, suggesting 
to proxies what to do with the results.

While Expires uses an HTTP date header, built in Cocoon by adding the 
result of the pipeline@expires attribute to the current system time, 
Cache-Control is somewhat smarter, since it gives caches a hint on what 
is cacheable, how it should be cached (revalidated or not) and for how 
long in seconds. To make it short, my proposal is to add a Cache-Control 
header to any response coming from a pipeline with the "expires" 
attribute set, using the following template:

Cache-Control: max-age={expires value in seconds}, public

The "public" keyword instructs the proxy to store a resource in its 
cache even if it would otherwise not be considered cacheable. This can 
be somewhat dangerous, since the proxy will serve responses for 
"protected" resources without performing authentication on the origin 
server, but in the end I think it's safe to assume that if a 
pipeline is marked with an "expires" attribute, then the user is 
perfectly aware that such a resource can, and will, be cached.
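
To make the relationship between the two headers concrete, here is a minimal sketch in Java. The class and method names (ExpiresHeaders, expires, cacheControl) are made up for illustration and are not part of Cocoon; the only assumption is that the pipeline's expires value is available as a relative offset in milliseconds.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper, not part of Cocoon: builds the two header values. */
final class ExpiresHeaders {

    /** Expires value: an absolute HTTP date, i.e. "now" plus the relative offset. */
    static String expires(long nowMillis, long expiresMillis) {
        SimpleDateFormat fmt =
            new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
        return fmt.format(new Date(nowMillis + expiresMillis));
    }

    /** Cache-Control value: the same offset, kept relative and in whole seconds. */
    static String cacheControl(long expiresMillis) {
        return "max-age=" + (expiresMillis / 1000) + ", public";
    }
}
```

For a five minute offset, cacheControl(300000L) yields "max-age=300, public", while expires() turns the same offset into an absolute date that depends on the server clock.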

The patch is a no-brainer, such as:

Index: src/java/org/apache/cocoon/components/pipeline/AbstractProcessingPipeline.java
===================================================================
RCS file: /home/cvs/xml-cocoon2/src/java/org/apache/cocoon/components/pipeline/AbstractProcessingPipeline.java,v
retrieving revision 1.33
diff -r1.33 AbstractProcessingPipeline.java
468a469
 >
472c473,474
<             res.setDateHeader("Expires", expires);
---
 >             res.setDateHeader("Expires", System.currentTimeMillis() + expires);
 >             res.setHeader("Cache-Control", "max-age=" + expires/1000 + ", public");
474c476
<                  new Long(expires));
---
 >                  new Long(expires + System.currentTimeMillis()));
760c762
<         return System.currentTimeMillis() + expires;
---
 >         return expires;

The only problem I see is that this header is not set under Tomcat 
(*argh*, Jetty works just OK!)  so I have to investigate what's going 
wrong, but for the rest I'm ready to commit it if you agree on the idea 
(I'm reluctant to commit it right away since it somehow touches the 
pipeline core, where I almost never worked). Now for the second (and 
more interesting) point: Cocoon integration.

Cocoon integration
==================

The above approach works perfectly for communication with the external 
world, be it a reverse proxy or just a browser cache. Sometimes, 
however, you might want to use this concept internally: imagine an 
aggregation of different Cocoon pipelines, where you have some 
resources for which you want to check validity strictly and some others 
that are pretty heavy to generate and uncacheable, because the 
components you are using are not cacheable by themselves, but over 
whose expiration time you have full control. In this case, having an 
internal use of the expires attribute would be pretty useful, e.g.:

<pipeline internal-only="true">
   <parameter name="expires" value="now plus 5 minutes"/>
   <match pattern="my-heavy-resource">
     <generate src="xmldb:xindice:///db/not/changing/frequently"/>
     <serialize/>
   </match>
</pipeline>

<pipeline internal-only="true">
   <match pattern="my-dynamic-resource">
     <generate src="/content/that/might/change"/>
     <serialize/>
   </match>
</pipeline>

<pipeline>
   <match pattern="mybeautifulportal.html">
     <aggregate element="portal">
       <part src="cocoon://my-heavy-resource" element="news"/>
       <part src="cocoon://my-dynamic-resource" element="data"/>
     </aggregate>
     <transform src="myportal2html.xsl"/>
     <serialize type="html"/>
   </match>
</pipeline>


If we agree that this is useful, let's see the actual implementation. 
First, let's get back to the general principle: if a user sets an 
"expires" attribute on a pipeline, what she wants to say is "I know 
better than the Cocoon cache for how long this resource has to be 
considered fresh". This is by all means a configuration imposed by the 
user, which the caching system should obey blindly. My opinion, 
then, wrt the caching pipeline, is that if an expires was set, all the 
pipeline engine should do is check whether the given resource has 
already been generated and whether the expiration time has not passed 
yet. If so, the resource should be considered fresh, disregarding any 
Validity objects or Cacheable components.

This, AFAIU, would boost the performance even for internal pipelines and 
aggregation, and would let us use internal pipelines in a smarter and 
faster way. Not only that: if we are to use the expires feature even 
internally, Cocoon's performance will get a boost even without using a 
reverse proxy in front of the application server, since all the 
(potentially heavy) algorithms to check the resource's validity would be 
skipped.

Now for the implementation. I wish I knew the Cocoon caching 
internals better, but from a quick read it seems to me that there 
should be:

- some logic in CachedResponse to store and get expires (easy);

- appropriate logic in the proper points to obtain the expires object 
from the environment and set a CachedResponse accordingly (is it enough 
to change CachingProcessingPipeline#cacheResults?);

- more logic in the validatePipeline() method in 
AbstractCachingProcessingPipeline.java to take into account the expires 
object configured, if present;

- in all cases, all the algorithms that check if a cached entry is still 
valid, i.e. every place where a cache entry is built, validated or 
invalidated, should take into account the expires configuration.
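
The points above can be sketched roughly as follows. This is only an illustration of the "expires wins over validity" idea: ExpiresAwareCache and its Entry class are hypothetical stand-ins, not the real CachedResponse or caching pipeline classes, and the validity check is passed in as a plain predicate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical cache, not Cocoon code: an entry with an expires time skips validation. */
final class ExpiresAwareCache {

    /** Stand-in for a cached response that also remembers its expiry. */
    static final class Entry {
        final byte[] content;
        final long expiresAt; // absolute millis; 0 means "no expires configured"

        Entry(byte[] content, long expiresAt) {
            this.content = content;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    void put(String key, byte[] content, long expiresAt) {
        store.put(key, new Entry(content, expiresAt));
    }

    /** Returns the cached content, or null if the resource must be regenerated. */
    byte[] lookup(String key, long now, Predicate<Entry> validityCheck) {
        Entry entry = store.get(key);
        if (entry == null) {
            return null;
        }
        if (entry.expiresAt > 0) {
            if (now < entry.expiresAt) {
                return entry.content; // still fresh: validity is never consulted
            }
            store.remove(key);        // expired: drop it and regenerate
            return null;
        }
        // No expires configured: fall back to the (potentially heavy) validity check.
        if (validityCheck.test(entry)) {
            return entry.content;
        }
        store.remove(key);
        return null;
    }
}
```

The design choice worth noting is that the validity predicate is only invoked on the fallback path, which is where the hoped-for performance gain would come from.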

I have started to play with this too, but I am wondering if I'm 
following the right path or if I'm missing something. Also, it might be 
worth considering a different CachingPipeline implementation 
(ExpiresEnabledCachingPipeline? Yuck ;-)), at least for a first start.

Comments and questions?

Ciao,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

Re: [RT] More on caching, expires, and proxy-friendly headers

Posted by Gianugo Rabellino <gi...@apache.org>.

Vadim Gritsenko wrote:
>>> ObjectModelHelper.getResponse(objectModel).addHeader("Vary", 
>>> "User-Agent");
>>
>> Not quite what I meant. First, I'm not sure, from the HTTP specs, that 
>> you can have more than one Vary header, actually I think it's quite 
>> the opposite (but I might well be wrong), so an addHeader might be 
>> dangerous (well, a setHeader might be even worse...). Second (and more 
>> important): in order to build a consistent system these headers should 
>> be set not by the sitemap components directly but by Cocoon itself,
> 
> 
> 
> I beleive this is not possible, Cocoon itself simply will not know. 
> Response can be generated using multiple things in many places, and thus 
> it will vary based upon values of those things and only those places 
> know what it was... 

I'm not saying that Cocoon should know about every possible header 
value. But it might be worth considering, in our environment 
abstraction, to have some kind of "policy" wrt headers.

Take the Vary case: if I'm right, only one Vary header is allowed in a 
response, so now you simply don't know what's going to happen if you 
just "pipe" the set|addHeader calls to the real response object.

If such an object were wrapped, we would have a way, in Cocoon, to 
decide whether and which headers to set. I'm still not sure that this 
is the way to go, but I'm starting to feel that it might be worth a 
thought, especially if we are to start implementing conditional 
requests.
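
A minimal sketch of what such a policy could look like for the Vary case, in Java. VaryPolicy is a hypothetical name and the class does not touch the real response object. It relies on Vary being defined in RFC 2616 as a comma-separated list of field names, so merging every contribution into one field value should be safe whichever reading of the "more than one Vary header" question turns out to be right.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical policy object: collects Vary contributions, emits one field value. */
final class VaryPolicy {

    // Lower-cased field name -> spelling used by the first component that added it.
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Components call this instead of addHeader/setHeader on the real response. */
    void add(String value) {
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                fields.putIfAbsent(name.toLowerCase(Locale.ROOT), name);
            }
        }
    }

    /** The single value to send as "Vary", or null if nothing was registered. */
    String headerValue() {
        return fields.isEmpty() ? null : String.join(", ", fields.values());
    }
}
```

With this, a selector adding "User-Agent" and a reader adding "Host" end up as one "Vary: User-Agent, Host" line, and duplicates are dropped regardless of case.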

Ciao,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


Re: [RT] More on caching, expires, and proxy-friendly headers

Posted by Vadim Gritsenko <va...@verizon.net>.

Gianugo Rabellino wrote:

> Vadim Gritsenko wrote:
>
>>> 3. I dare you to find a clean solution from the sitemap POV to 
>>> express this kind of conditions. :-)
>>
>>
>>
>>
>>
>> Ahem...
>> http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/src/java/org/apache/cocoon/selection/BrowserSelector.java?rev=1.7&content-type=text/vnd.viewcvs-markup
>>
>> ObjectModelHelper.getResponse(objectModel).addHeader("Vary", 
>> "User-Agent");
>
>
>
> Not quite what I meant. First, I'm not sure, from the HTTP specs, that 
> you can have more than one Vary header, actually I think it's quite 
> the opposite (but I might well be wrong), so an addHeader might be 
> dangerous (well, a setHeader might be even worse...). Second (and more 
> important): in order to build a consistent system these headers should 
> be set not by the sitemap components directly but by Cocoon itself,

I believe this is not possible: Cocoon itself simply will not know. 
The response can be generated using multiple things in many places, and 
thus it will vary based upon the values of those things, and only those 
places know what they were... Just an example:

* Browser selector allows varying the response based on browser,
* Host selector allows varying the response based on virtual host,
* XSP page can inject the current username into the page, making it 
vary on the "user" header,
...

And on top of all this, the current ResourceReader for some reason also 
adds a "Vary: Host" header... Who knows why?

Vadim

> and there should be a clean way to declarate in the sitemap what kind 
> of headers you want to be sent (and no, using an Action would not be a 
> clean way :-)).
>
> Ciao,


Re: [RT] More on caching, expires, and proxy-friendly headers

Posted by Gianugo Rabellino <gi...@apache.org>.

Vadim Gritsenko wrote:

>> 3. I dare you to find a clean solution from the sitemap POV to express 
>> this kind of conditions. :-)
> 
> 
> 
> 
> Ahem...
> http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/src/java/org/apache/cocoon/selection/BrowserSelector.java?rev=1.7&content-type=text/vnd.viewcvs-markup 
> 
> 
> ObjectModelHelper.getResponse(objectModel).addHeader("Vary", "User-Agent");

Not quite what I meant. First, I'm not sure, from the HTTP specs, that 
you can have more than one Vary header; actually I think it's quite the 
opposite (but I might well be wrong), so an addHeader might be dangerous 
(well, a setHeader might be even worse...). Second (and more important): 
in order to build a consistent system these headers should be set not by 
the sitemap components directly but by Cocoon itself, and there should 
be a clean way to declare in the sitemap what kind of headers you want 
to be sent (and no, using an Action would not be a clean way :-)).

Ciao,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


Re: [RT] More on caching, expires, and proxy-friendly headers

Posted by Vadim Gritsenko <va...@verizon.net>.

Gianugo Rabellino wrote:

> Stefano Mazzocchi wrote:

...

>> Question: (and an important one)
>>
>> Suppose you have a resource like
>>
>>  /images/logo
>>
>> that you hit with two different user agent and that a pipeline 
>> renders differently depending on the user agent, how can a proxy 
>> behave friendly to this? do we have a way to specify that a specific 
>> request has to be matched not only against a URI but also against the 
>> user-agent that requested it?
>
>
> This can be done in a theory, once (and if) we'll have real proxy 
> support in place. The first thing that can be done is starting to have 
> real "Etag" support: an "Etag" is an unique identifier (think 
> checksum) that represents the given URI from a server side view. This 
> means that the same URI can lead to different etags being generated, 
> according to internal server logic (request parameters, user agents, 
> you name it).
>
> Since an Etag is just a string, I guess that the cache key could do 
> the job, but I'm not sure that it might be the better way security 
> wise (you'll end up handing to the world your private cache keys, I'm 
> not sure that this is a good thing). Also, Etags are useless if there 
> is no conditional request handling: the proxy will ask Cocoon "give me 
> that URI, identified by this Etag, if not modified since this 
> timestamp": Cocoon must be able to analyze that query and reply either 
> with an 304 status code (not modified) or with a 200 with the actual 
> content.
>
> In this particular case, however, the cleanest solution is using the 
> HTTP/1.1  "Vary:" header. This header is supposed to contain all the 
> request fields that must match for a proxy to decide that a resource 
> is still valid even if requested by different users. Here you can 
> specify that the user-agent is one of such headers, but:
>
> 0. this is one of the most undocumented headers around;
>
> 1. you still have to support Etags too (don't really know why);
>
> 2. there is no advanced feature like regular expressions or substring 
> (proxies might implement it, but it's outside your control), so 
> slightly different user agents (language, version, operating systems) 
> will lead to regeneration;
>
> 3. I dare you to find a clean solution from the sitemap POV to express 
> this kind of conditions. :-)



Ahem...
http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/src/java/org/apache/cocoon/selection/BrowserSelector.java?rev=1.7&content-type=text/vnd.viewcvs-markup

ObjectModelHelper.getResponse(objectModel).addHeader("Vary", "User-Agent");


Vadim




Re: [RT] More on caching, expires, and proxy-friendly headers

Posted by Gianugo Rabellino <gi...@apache.org>.

Stefano Mazzocchi wrote:

>> Full proxy compliance depends on the ability of dealing with 
>> conditional requests, handling a bunch of request headers all in some 
>> way interdependant and tricky to say the least. I'm not saying that we 
>> shouldn't do that sooner or later, but I'd rather plan this activity 
>> carefully, and possibily together with someone (Chuck?) from the httpd 
>> group working on the proxy part, in order to ensure that things work 
>> smoothly.
> 
> 
> At ApacheCon I set with Chuck who forwarded me to another guy who's the 
> one doing the work nowadays, but I forgot who he was. But I can ask him 
> again.

I was there too, and the guy was the one who had just given the 
presentation on mod_proxy. According to the ApacheCon site he should be 
Graham Legget, does that ring a bell?

> Question: (and an important one)
> 
> Suppose you have a resource like
> 
>  /images/logo
> 
> that you hit with two different user agent and that a pipeline renders 
> differently depending on the user agent, how can a proxy behave friendly 
> to this? do we have a way to specify that a specific request has to be 
> matched not only against a URI but also against the user-agent that 
> requested it?

This can be done in theory, once (and if) we have real proxy 
support in place. The first thing that can be done is to start having 
real "Etag" support: an "Etag" is a unique identifier (think checksum) 
that represents the given URI from a server-side view. This means that 
the same URI can lead to different etags being generated, according to 
internal server logic (request parameters, user agents, you name it).

Since an Etag is just a string, I guess that the cache key could do the 
job, but I'm not sure that it would be the best way security-wise 
(you'll end up handing your private cache keys to the world, and I'm 
not sure that this is a good thing). Also, Etags are useless if there is 
no conditional request handling: the proxy will ask Cocoon "give me that 
URI, identified by this Etag, if not modified since this timestamp": 
Cocoon must be able to analyze that query and reply either with a 304 
status code (not modified) or with a 200 with the actual content.
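
As a rough illustration of the two concerns above, here is a sketch in Java. All names are hypothetical and nothing here is existing Cocoon code. The assumptions are that the Etag is derived by hashing the cache key (so the key itself never leaves the server) and that only the If-None-Match request header is considered; the timestamp side (If-Modified-Since) is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: opaque Etags plus a bare-bones 304/200 decision. */
final class ConditionalRequests {

    /** Derives a quoted, opaque Etag from the private cache key. */
    static String etagFor(String cacheKey) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(cacheKey.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder("\"");
            for (int i = 0; i < 8; i++) { // first 8 bytes keep the tag short
                sb.append(String.format("%02x", digest[i] & 0xff));
            }
            return sb.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /**
     * Returns 304 if the If-None-Match value lists the current Etag (or "*"),
     * 200 otherwise. Splitting on commas is fine for the hex tags produced
     * above, but not for arbitrary Etags that contain commas themselves.
     */
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200;
        }
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = candidate.trim();
            if (tag.equals("*") || tag.equals(currentEtag)) {
                return 304;
            }
        }
        return 200;
    }
}
```

A 304 answer is only worth sending if it can be computed from the cache metadata alone, without regenerating the pipeline output.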

In this particular case, however, the cleanest solution is using the 
HTTP/1.1 "Vary:" header. This header is supposed to list all the 
request fields that must match for a proxy to decide that a cached 
resource can still be served even if requested by different users. Here 
you can specify that User-Agent is one such header, but:

0. this is one of the least documented headers around;

1. you still have to support Etags too (don't really know why);

2. there is no advanced feature like regular expressions or substring 
matching (proxies might implement it, but it's outside your control), 
so slightly different user agents (language, version, operating system) 
will lead to regeneration;

3. I dare you to find a clean solution from the sitemap POV to express 
this kind of condition. :-)
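
Point 2 is easier to see with a small sketch of how a cache might build its lookup key from a Vary value. This is Java with made-up names (VaryKey, keyFor), not the behaviour of any specific proxy; it assumes the request headers are handed over in a map keyed by lower-cased header name.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical proxy-side helper: URI plus the request headers named in Vary. */
final class VaryKey {

    /** requestHeaders is assumed to be keyed by lower-cased header name. */
    static String keyFor(String uri, String vary, Map<String, String> requestHeaders) {
        StringBuilder key = new StringBuilder(uri);
        for (String part : vary.split(",")) {
            String name = part.trim().toLowerCase(Locale.ROOT);
            if (name.isEmpty()) {
                continue;
            }
            String value = requestHeaders.get(name);
            // Exact string comparison only: no regexp, no substring matching.
            key.append('|').append(name).append('=').append(value == null ? "" : value);
        }
        return key.toString();
    }
}
```

Because the header value goes into the key verbatim, two browsers that differ only in language or minor version get separate cache entries, and each one triggers its own generation on the origin server.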

>> <pipeline internal-only="true">
>>   <parameter name="expires" value="now plus 5 minutes"/>
> 
> 
> isn't something like
> 
>  <parameter name="TTL" value="5 minutes"/>
> 
> simpler to understand? we don't expect people to write
> 
>  <parameter name="expires" value="tomorrow plus 25 hours"/>
> 
> don't we?

Actually the "grammar" for that "production" is

expires := (access|modification|now)
            plus*
            (DIGIT (months|weeks|days|hours|minutes|seconds))+

(BNF experts, please bear with me ;-))

We discussed that in the past, and the reason for this syntax was 
compatibility with the syntax of HTTPd mod_expires. But I don't see 
any problem in changing that.
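
For what it's worth, the production above is simple enough to parse in a few lines. The following Java sketch is not the actual Cocoon parser: the class name is made up, "plus" is treated as optional, singular unit names are accepted, and a month is assumed to be 30 days. It only computes the relative offset; what the base keyword (access, modification, now) is measured from is left to the caller.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical parser for "now plus 5 minutes"-style expressions. */
final class ExpiresSyntax {

    private static final Map<String, Long> UNIT_MILLIS = new HashMap<>();
    static {
        UNIT_MILLIS.put("second", 1000L);
        UNIT_MILLIS.put("minute", 60_000L);
        UNIT_MILLIS.put("hour", 3_600_000L);
        UNIT_MILLIS.put("day", 86_400_000L);
        UNIT_MILLIS.put("week", 7 * 86_400_000L);
        UNIT_MILLIS.put("month", 30 * 86_400_000L); // assumption: 30-day months
    }

    /** Returns the offset in milliseconds described by the expression. */
    static long offsetMillis(String expr) {
        String[] tokens = expr.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (tokens.length < 3) {
            throw new IllegalArgumentException("Too short: " + expr);
        }
        String base = tokens[0];
        if (!base.equals("access") && !base.equals("modification") && !base.equals("now")) {
            throw new IllegalArgumentException("Unknown base: " + base);
        }
        int i = tokens[1].equals("plus") ? 2 : 1;
        int remaining = tokens.length - i;
        if (remaining < 2 || remaining % 2 != 0) {
            throw new IllegalArgumentException("Expected <number> <unit> pairs: " + expr);
        }
        long total = 0;
        for (; i < tokens.length; i += 2) {
            long amount = Long.parseLong(tokens[i]); // NumberFormatException on bad input
            String unit = tokens[i + 1];
            if (unit.endsWith("s")) {
                unit = unit.substring(0, unit.length() - 1);
            }
            Long factor = UNIT_MILLIS.get(unit);
            if (factor == null) {
                throw new IllegalArgumentException("Unknown unit: " + tokens[i + 1]);
            }
            total += amount * factor;
        }
        return total;
    }
}
```

With this sketch, "now plus 5 minutes" gives 300000, and something like "tomorrow plus 25 hours" is rejected because "tomorrow" is not a valid base.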

> Sounds like a good idea but I think Carsten is the one that knows the 
> caching internals better.

Definitely, I'm waiting for his Word to stop playing and start doing 
real work. :-) I have some toy code working here, but I'm still 
wondering what is the best way to refresh the expires information...

> One thing that you didn't describe is the ability for cocoon to reply to 
> proxy requests with the 'body hasn't changed' error code just by using 
> the pipeline caching logic but without having to regenerate the whole 
> thing. Is this another microstep or you have reasons against this?

As you can understand from my first comment, this is actually a hell of 
a lot of microsteps. :-) The next one will be etag support, followed by 
conditional request handling and full 304 support. From there, we'll be 
almost halfway (almost!) to full proxy-friendliness, but it won't be an 
easy task... it's no rocket science, but the spec is awkward, and it 
requires careful planning and coding.

Ciao,

-- 
Gianugo Rabellino
Pro-netics s.r.l.
http://www.pro-netics.com


Re: [RT] More on caching, expires, and proxy-friendly headers

Posted by Stefano Mazzocchi <st...@apache.org>.

Gianugo Rabellino wrote:
> 
> This RT integrates the one done more than one year ago and available at 
> http://marc.theaimsgroup.com/?t=101074439900001&r=1&w=2.
 >
> As of now you know that we have a basic HTTP header control that mimics 
> at a pipeline level the mod_expires functionality of the Apache HTTPD 
> server. This was a good start, but now I feel it's time to refine it and 
> make it better. Work is needed on two sides:
> 
> Proxy handling
> ==============
> 
> The approach to full proxy compliance should be done, once again :-), in 
> microsteps. I've been reading the HTTP/1.1 specs and the proxy-related 
> RFCs, and boy, it's not easy at all to implement a fully proxy compliant 
> system. It can be done, but it requires serious thinking and a major 
> rework of the request handling phase.
> 
> Full proxy compliance depends on the ability of dealing with conditional 
> requests, handling a bunch of request headers all in some way 
> interdependant and tricky to say the least. I'm not saying that we 
> shouldn't do that sooner or later, but I'd rather plan this activity 
> carefully, and possibily together with someone (Chuck?) from the httpd 
> group working on the proxy part, in order to ensure that things work 
> smoothly.

At ApacheCon I sat with Chuck, who forwarded me to another guy who's 
the one doing the work nowadays, but I forgot who he was. But I can ask 
him again.

> So, the first microstep is an easy one, just as a start. The companion 
> to the expires header is the "Cache-Control" header 
> (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9): this 
> header allows for a finer grained control over the request, suggesting 
> proxies what to with the results.
> 
> While Expires uses an HTTP date header, built in Cocoon by adding the 
> result of the pipeline@expires attribute to the current system time, 
> Cache-Control is somehow smarter, since it gives caches an hint on what 
> is cacheable, how it should be cached (revalidated or not) and for how 
> long in seconds. To make it short, my proposal is to add a Cache-Control 
> header to any request coming from a pipeline with the "expires" 
> attribute set with the following template:
> 
> Cache-Control: max-age={expires value in seconds}, public
> 
> The "public" keyword instructs the proxy to store a resource in its 
> cache even if it should not be considered cacheable. This can be 
> dangerous somehow, since the proxy will serve requests coming from 
> "protected" resources without performing authentication on the origin 
> server, but in the end I think that it's safe to assume that if a 
> pipeline is marked with an "expires" header, than the user is perfectly 
> aware that such resource can, and will, be cached.

Question: (and an important one)

Suppose you have a resource like

  /images/logo

that you hit with two different user agents and that a pipeline renders 
differently depending on the user agent: how can a proxy behave 
friendly to this? Do we have a way to specify that a specific request 
has to be matched not only against a URI but also against the user 
agent that requested it?

I'm perfectly aware of the fact that we could have the resource 
/images/logo redirect to /images/logo.png or /images/logo.gif depending 
on the user agent, and that this would route around the proxy problem, 
but that's a hack and involves another round-trip to the client for the 
redirect.

Pier and I came up with this question and we think it might be an HTTP 
architectural fault, but before asking Roy, what do you people think?

> The patch is a no-brainer, such as:
> 
> Index: 
> src/java/org/apache/cocoon/components/pipeline/AbstractProcessingPipeline.java 
> 
> ===================================================================
> RCS file: 
> /home/cvs/xml-cocoon2/src/java/org/apache/cocoon/components/pipeline/AbstractProcessingPipeline.java,v 
> 
> retrieving revision 1.33
> diff -r1.33 AbstractProcessingPipeline.java
> 468a469
>  >
> 472c473,474
> <             res.setDateHeader("Expires", expires);
> ---
>  >             res.setDateHeader("Expires", System.currentTimeMillis() + 
> expires);
>  >             res.setHeader("Cache-Control", "max-age=" + expires/1000 
> + ", public");
> 474c476
> <                  new Long(expires));
> ---
>  >                  new Long(expires + System.currentTimeMillis()));
> 760c762
> <         return System.currentTimeMillis() + expires;
> ---
>  >         return expires;
> 
> The only problem I see is that this header is not set under Tomcat 
> (*argh*, Jetty works just OK!)

Why am I not surprised?

> so I have to investigate what's going 
> wrong, but for the rest I'm ready to commit it if you agree on the idea 
> (I'm reluctant to commit it right away since it somehow touches the 
> pipeline core, where I almost never worked). Now for the second (and 
> more interesting) point: Cocoon integration.
> 
> Cocoon integration
> ==================
> 
> The above approach works perfectly for communication with the external 
> world, be it a reverse proxy or just a browser cache. Sometimes, 
> however, there might be a case where you might want to use this concept 
> internally: imagine to have an aggregation of different cocoon 
> pipelines, where you have some resources for which you want to check 
> validity strictly and some others that are pretty heavy to generate, 
> uncacheable because the components you are using are not cacheable by 
> themselves but on which you have full control on the expiration time. In 
> this case, having an internal use of the expires attribute would be 
> pretty useful, i.e.:
> 
> <pipeline internal-only="true">
>   <parameter name="expires" value="now plus 5 minutes"/>

isn't something like

  <parameter name="TTL" value="5 minutes"/>

simpler to understand? we don't expect people to write

  <parameter name="expires" value="tomorrow plus 25 hours"/>

don't we?

>   <match pattern="my-heavy-resource">
>     <generate src="xmldb:xindice:///db/not/changing/frequently"/>
>     <serialize/>
>   </match>
> </pipeline>
> 
> <pipeline internal-only="true">
>   <match pattern="my-dynamic-resource">
>     <generate src="/content/that/might/change"/>
>     <serialize/>
>   </match>
> </pipeline>
> 
> <pipeline>
>   <match pattern="mybeautifulportal.html">
>     <aggregate element="portal">
>       <part src="cocoon://my-heavy-resource" element="news"/>
>       <part src="cocoon://my-dynamic-resource" element="data"/>
>     </aggregate>
>     <tranform src="myportal2html.xsl"/>
>     <serialize type="html"/>
>   </match>
> </pipeline>
> 
> 
> If we agree that this is useful, let's see the actual implementation. 
> First, let's get back to the general principle: if a user sets an 
> "expires" attribute on a pipeline, what she want's to say is "I know 
> better than the Cocoon cache for how long this resource has to be 
> considered fresh". This is by all means a configuration imposed by the 
> user, to which the caching system should obey blindly. My opinion
> then, wrt the caching pipeline, is that if an expires was set, all the 
> pipeline engine should do is to check if the given resource has already 
> been generated, and if the expiration time has not passed yet. If so, 
> the resource should be considered fresh disregarding any Validity 
> objects or Cacheable components.
> 
> This, AFAIU, would boost the performance even for internal pipelines and 
> aggregation, and would let us use internal pipelines in a smarter and 
> faster way. Not only that: if we are to use the expires feature even 
> internally, Cocoon's performance will get a boost even without using a 
> reverse proxy in front of the application server, since all the 
> (potentially heavy) algorithms to check the resource's validity would be 
> skipped.
> 
> Now for the implementation I wish I knew better the Cocoon caching 
> internals, but from a quick read it seems to me that there should be:
> 
> - some logic in CachedResponse to store and get expires (easy);
> 
> - appropriate logic in the proper points to obtain the expires object 
> from the environment and set a CachedResponse accordingly (is it enough 
> to change CachingProcessingPipeline#cacheResults?
> 
> - more logic in the validatePipeline() method in 
> AbstractCachingProcessingPipeline.java to take into account the expires 
> object configured, if present.
> 
> - in all cases, all the algorithms that check if a cached entry is still 
> valid, i.e. every place where a cache entry is built, validated or 
> invalidated, should take into account the expires configuration.
> 
> I have started to play on this too, but I am wondering if I'm following 
> the right path or if I'm missing something. Also, it might be worth 
> considering to have a different CachingPipeline implementation 
> (ExpiresEnabledCachingPipeline? Yuck ;-)), at least for a first start.
> 
> Comments and questions?

Sounds like a good idea but I think Carsten is the one that knows the 
caching internals better.

One thing that you didn't describe is the ability for Cocoon to reply 
to proxy requests with the 'body hasn't changed' status code just by 
using the pipeline caching logic, without having to regenerate the 
whole thing. Is this another microstep, or do you have reasons against 
it?

-- 
Stefano Mazzocchi                               <st...@apache.org>
    Pluralitas non est ponenda sine necessitate [William of Ockham]
--------------------------------------------------------------------


