Posted to dev@cocoon.apache.org by Joerg Heinicke <jo...@gmx.de> on 2004/03/30 02:58:36 UTC

Re: Cache and HTMLGenerator

On 30.03.2004 02:41, Gustavo Nalle Fernandes wrote:
> Thanks for the code! It is indeed very simple! That's why I like Cocoon :)
> Regarding the Last-Modified header, getLastModified() does work for GET
> requests, but a GET request also brings the whole document and not just
> the headers. That's why I was observing the whole document being
> transferred all the time.

Ah, of course. Now it's obvious :) getLastModified() is only used for 
Cocoon's pipeline caching, since it is assumed that pipeline processing 
is the most time-consuming part. Of course that assumption no longer 
holds if you fetch the content from a remote host.

> So what is the best scenario for the
> HTMLGenerator? Always do a HEAD request to see if the remote document is
> modified and, if it is, make a subsequent GET request, OR always make a GET
> on every request? It depends on the size of the document and the
> modification frequency. If the remote document is too large, it is
> inefficient to make a GET all the time, as the HTMLGenerator does today. On
> the other hand, if the document is modified frequently, it would be
> inefficient to make both a HEAD and a GET request, since that means making
> two connections to the remote site. Using a sitemap parameter specifying the
> interval at which the HTMLGenerator fetches data would address both issues.
> Do you think it is worthwhile to change the current HTMLGenerator to
> include this extra parameter?

Definitely not, as this problem is not HTMLGenerator-specific but 
URLSource-specific. So I will raise this question on the dev list as 
well; maybe someone has a clever proposal for this.
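
To make the interval idea from the quoted question concrete, here is a 
minimal sketch in plain Java. It is not Cocoon code and not a proposal for 
the actual URLSource API: the class name IntervalFetcher, its constructor 
arguments and getFetchCount() are all made up for illustration. The 
Supplier stands in for the expensive remote GET, and the clock is passed in 
so the behaviour can be checked without waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of Cocoon): returns a cached copy of the
 * remote content and refetches it at most once per interval.
 */
final class IntervalFetcher {
    private final long intervalMillis;        // assumed "interval" parameter
    private final Supplier<String> remoteGet; // stands in for the remote GET
    private final LongSupplier clock;         // current time in milliseconds

    private String cached;
    private boolean fetched;
    private long lastFetch;
    private int fetchCount;

    IntervalFetcher(long intervalMillis,
                    Supplier<String> remoteGet,
                    LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.remoteGet = remoteGet;
        this.clock = clock;
    }

    /** Returns the content, contacting the remote site only if the interval has elapsed. */
    synchronized String get() {
        long now = clock.getAsLong();
        if (!fetched || now - lastFetch >= intervalMillis) {
            cached = remoteGet.get();
            fetched = true;
            lastFetch = now;
            fetchCount++;
        }
        return cached;
    }

    /** Number of times the remote site was actually contacted. */
    synchronized int getFetchCount() {
        return fetchCount;
    }
}
```

With an interval of, say, 60000 ms and System::currentTimeMillis as the 
clock, repeated calls to get() within one minute would hit the remote site 
only once. The trade-off is the obvious one: a change on the remote side 
can go unnoticed for up to one interval.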

For the devs with clever ideas, here's the thread (unfortunately RES 
breaks the thread view at marc.theaimsgroup.com, so switching to gmane.org):
http://thread.gmane.org/gmane.text.xml.cocoon.user/34445

Joerg