Posted to dev@forrest.apache.org by Marshall Roch <ma...@exclupen.com> on 2003/12/18 04:03:11 UTC

Document last modified

I finally got forrestbot2 working (so cool!), so I will probably set it 
up to rebuild the site every 6 to 12 hours.  This will mean that the 
last modified time in the HTTP headers and at the bottom of the HTML 
will always be within 6 hours.

I'd like to have a <last-modified /> tag in the header of my xdocs (or 
something functionally equivalent), which could contain the CVS Date 
tag, or a manually-updated date for people not using CVS.  This should 
all be server-side so as to not rely on Javascript or the users' 
system's date.

Another issue with constantly rebuilding the site is the Last-Modified 
header.  A lot of cache servers use something called "conditional GET," 
where they ask Apache whether the Last-Modified and ETag headers are 
still what they remember.  If they are different, the cache is updated.
The majority of my pages will not change much.  When the Last-Modified 
header changes all the time, the page does not stay cached for long 
periods of time.  This problem might relate to FOR-19.
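
A minimal sketch of the conditional GET check described above, written in Python rather than taken from Apache. The function name `conditional_get_status` and its arguments are hypothetical; it only illustrates why a Last-Modified value that moves on every rebuild forces a full response.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(request_headers, last_modified, etag):
    """Return 304 if the cache's copy is still valid, otherwise 200.

    request_headers: dict of headers sent by the cache (hypothetical input).
    last_modified:   timezone-aware datetime of the file on the server.
    etag:            the server's current ETag string for the file.
    """
    # When the cache sends If-None-Match, the ETag decides and the date is ignored.
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in candidates or "*" in candidates else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            cached = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: fall back to a full response
        if cached.tzinfo is None:
            cached = cached.replace(tzinfo=timezone.utc)
        # HTTP dates only have one-second resolution.
        if last_modified.replace(microsecond=0) <= cached:
            return 304
    return 200


# The cache remembers the page as of 1 Dec; a rebuild on 18 Dec touched the file.
cache_headers = {"If-Modified-Since": "Mon, 01 Dec 2003 12:00:00 GMT"}
unchanged = datetime(2003, 12, 1, 12, 0, 0, tzinfo=timezone.utc)
rebuilt = datetime(2003, 12, 18, 4, 3, 11, tzinfo=timezone.utc)
print(conditional_get_status(cache_headers, unchanged, '"v1"'))
print(conditional_get_status(cache_headers, rebuilt, '"v1"'))
```

If the file's timestamp only moved when the content changed, the first case would apply to most requests and the cached copy would keep being reused.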

Hopefully this isn't /too/ hard to pull off... :)

--
Marshall Roch

p.s. I'm shooting to launch my Forrest-powered redesign by Jan 1.  If 
you want to check it out early, see 
http://whs.winnacunnet.k12.nh.us:81/.  Browser tests/bug reports are 
appreciated.

Re: Document last modified

Posted by Marshall Roch <ma...@exclupen.com>.
Dave Brondsema wrote:

<snip>

> 2.  Skins could put the current date into the page when forrest is run.
> 
> In #1, the date corresponds to the document content.  In #2, the date corresponds
> to the end result seen by the user (theoretically like the javascript does now).
> #2 would be more direct to implement because it wouldn't require any changes to
> DTDs or documents that want to take advantage of this.

This is definitely better than requiring a Javascript, but it won't 
solve the conditional GET problem.  The Last-Modified header needs to be 
set to the date/time the xdoc was last touched, instead of the date/time 
forrest was run, since forrest will run every few hours w/ forrestbot, 
even though the pages aren't modified.

>>p.s. I'm shooting to launch my Forrest-powered redesign by Jan 1.  If 
>>you want to check it out early, see 
>>http://whs.winnacunnet.k12.nh.us:81/.  Browser tests/bug reports are 
>>appreciated.
> 
> Looks pretty nice.  Planning on making the skin available?  It's a nice one.

Thanks.  Forrest is great, so I wouldn't mind contributing the skin. 
There is definitely some project-specific stuff in there (navigation is 
a huge one), but that could be cleaned up.  Hopefully I will have some 
time to do that over Christmas vacation.

--
Marshall Roch

Re: Document last modified

Posted by Dave Brondsema <da...@brondsema.net>.
Quoting Marshall Roch <ma...@exclupen.com>:


> I'd like to have a <last-modified /> tag in the header of my xdocs (or 
> something functionally equivalent), which could contain the CVS Date 
> tag, or a manually-updated date for people not using CVS.  This should 
> all be server-side so as to not rely on Javascript or the users' 
> system's date.
> 
> Another issue with constantly rebuilding the site is the Last-Modified 
> header.  A lot of cache servers use something called "conditional GET," 
> where they ask Apache whether the Last-Modified and ETag headers are 
> still what they remember.  If they are different, the cache is updated. 
>   The majority of my pages will not change much.  When the Last-Modified 
> header changes all the time, the page does not stay cached for long 
> periods of time.  This problem might relate to FOR-19.
> 
> Hopefully this isn't /too/ hard to pull off... :)
> 

Neither javascript nor Last-Modified headers should be used to achieve this. 
Both rely on the environment of the website (browser, server) behaving
appropriately.  The date should be placed directly into the document (random
thought: potential unnecessary 'diff' results?).

Two options come to mind:

1.  The HOWTO DTD supports a last-modified-content-date element in the header. 
We could put this in the document DTD also.  v20 perhaps?  As a tangent, what's
the status on that?

2.  Skins could put the current date into the page when forrest is run.

In #1, the date corresponds to the document content.  In #2, the date corresponds
to the end result seen by the user (theoretically like the javascript does now).
#2 would be more direct to implement because it wouldn't require any changes to
DTDs or documents that want to take advantage of this.


> p.s. I'm shooting to launch my Forrest-powered redesign by Jan 1.  If 
> you want to check it out early, see 
> http://whs.winnacunnet.k12.nh.us:81/.  Browser tests/bug reports are 
> appreciated.
> 

Looks pretty nice.  Planning on making the skin available?  It's a nice one.


-- 
Dave Brondsema 
dave@brondsema.net 
http://www.brondsema.net - personal 
http://www.splike.com - programming 
http://csx.calvin.edu - student org 

Re: Document last modified

Posted by David Crossley <cr...@indexgeo.com.au>.
Marshall Roch wrote:
> David Crossley wrote:
> >
> > BTW, i get a Timeout from your development site below.
> 
> Sorry, I think I was moving the virtualhost right then.  It should be 
> working now.

Du'oh ... trapped inside my own firewall!
Sorry for the noise.

--David




Re: Document last modified

Posted by Marshall Roch <ma...@exclupen.com>.
David Crossley wrote:
> The document-v12 DTD has an optional <version> tag in the header,
> which might be used for this purpose. See xdocs/linking.xml
> for one example (except that you want the CVS $Date$).
> It is rendered at the bottom-right.

Thanks, that worked! Just what I was looking for. :-)

> BTW, i get a Timeout from your development site below.

Sorry, I think I was moving the virtualhost right then.  It should be 
working now.

--
Marshall Roch

Re: Document last modified

Posted by David Crossley <cr...@indexgeo.com.au>.
Marshall Roch wrote:
> I finally got forrestbot2 working (so cool!),

Excellent news.

> so I will probably set it 
> up to rebuild the site every 6 to 12 hours.  This will mean that the 
> last modified time in the HTTP headers and at the bottom of the HTML 
> will always be within 6 hours.
>
> I'd like to have a <last-modified /> tag in the header of my xdocs (or 
> something functionally equivalent), which could contain the CVS Date 
> tag, or a manually-updated date for people not using CVS.  This should 
> all be server-side so as to not rely on Javascript or the users' 
> system's date.

The document-v12 DTD has an optional <version> tag in the header,
which might be used for this purpose. See xdocs/linking.xml
for one example (except that you want the CVS $Date$).
It is rendered at the bottom-right.
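
For illustration, a small Python sketch of what the expanded CVS keyword looks like inside a `<version>` tag and how the date could be pulled back out of it. The sample header, the `SAMPLE_HEADER` name and the `cvs_date` helper are assumptions made for this example, not part of Forrest; the sketch assumes CVS expands `$Date$` to the `yyyy/mm/dd hh:mm:ss` form.

```python
import re
from datetime import datetime

# Hypothetical xdoc header after CVS has expanded the $Date$ keyword.
SAMPLE_HEADER = """
<header>
  <title>Linking</title>
  <version>$Date: 2003/12/18 04:03:11 $</version>
</header>
"""

# Matches an expanded keyword such as "$Date: 2003/12/18 04:03:11 $".
CVS_DATE = re.compile(
    r"\$Date:\s*(\d{4})[/-](\d{2})[/-](\d{2})\s+(\d{2}):(\d{2}):(\d{2})[^$]*\$"
)


def cvs_date(text):
    """Return the datetime from an expanded $Date$ keyword, or None if absent."""
    match = CVS_DATE.search(text)
    if match is None:
        return None  # keyword missing or not yet expanded (plain "$Date$")
    return datetime(*map(int, match.groups()))


print(cvs_date(SAMPLE_HEADER))
```

A document that is not under CVS could carry a hand-written date in the same tag; the helper above would simply return None for it.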

I hope that someone else will help with the rest of your issue.

BTW, i get a Timeout from your development site below.

--David

> Another issue with constantly rebuilding the site is the Last-Modified 
> header.  A lot of cache servers use something called "conditional GET," 
> where they ask Apache whether the Last-Modified and ETag headers are 
> still what they remember.  If they are different, the cache is updated. 
>   The majority of my pages will not change much.  When the Last-Modified 
> header changes all the time, the page does not stay cached for long 
> periods of time.  This problem might relate to FOR-19.
> 
> Hopefully this isn't /too/ hard to pull off... :)
> 
> --
> Marshall Roch
> 
> p.s. I'm shooting to launch my Forrest-powered redesign by Jan 1.  If 
> you want to check it out early, see 
> http://whs.winnacunnet.k12.nh.us:81/.  Browser tests/bug reports are 
> appreciated.


Re: Document last modified

Posted by Upayavira <uv...@upaya.co.uk>.
Juan Jose Pablos wrote:

> Hi,
>
> I think it is a bit more complex than that.
>
> We cannot rely on the last time a document's content changed, because:
>
> 1) What if the skin changes?
>
> 2) What if the tabs or menus change?
>
> In both cases the page must be regenerated.

No, no, you miss what it does. It checks the final generated page, not 
the source doc. So if you change a skin, then the page has changed and 
will be regenerated. So there's no problem with it.

> Wouldn't it be easier to use rsync to do the MD5 comparison for us?

Could. But the code is already written, and what if you're running on 
the box that is also serving the pages?

Upayavira




Re: Document last modified

Posted by David Crossley <cr...@indexgeo.com.au>.
Juan Jose Pablos wrote:
> 
> Wouldn't it be easier to use rsync to do the MD5 comparison for us?

There are two different concerns.

Generation of documents with proper date information and only when
needed. The Cocoon CLI, and maybe cache facilities, seem appropriate.

Publishing the finished product. Forrestbot2 of course. Each workstage
implementation is configurable with properties, so the "deploy" stage
could have a deploy.rsync config ... I presume. If it cannot, then we
need to make it so. Rsync is my choice. (Yes, it can do a local-to-local copy.)

--David



Re: Document last modified

Posted by Juan Jose Pablos <ch...@che-che.com>.
Hi,

I think it is a bit more complex than that.

We cannot rely on the last time a document's content changed, because:

1) What if the skin changes?

2) What if the tabs or menus change?

In both cases the page must be regenerated.

Wouldn't it be easier to use rsync to do the MD5 comparison for us?

Cheers,
Cheche

Upayavira wrote:
> Marshall Roch wrote:
> 
>> I finally got forrestbot2 working (so cool!), so I will probably set 
>> it up to rebuild the site every 6 to 12 hours.  This will mean that 
>> the last modified time in the HTTP headers and at the bottom of the 
>> HTML will always be within 6 hours.
>>
>> I'd like to have a <last-modified /> tag in the header of my xdocs (or 
>> something functionally equivalent), which could contain the CVS Date 
>> tag, or a manually-updated date for people not using CVS.  This should 
>> all be server-side so as to not rely on Javascript or the users' 
>> system's date.
>>
>> Another issue with constantly rebuilding the site is the Last-Modified 
>> header.  A lot of cache servers use something called "conditional 
>> GET," where they ask Apache whether the Last-Modified and ETag headers 
>> are still what they remember.  If they are different, the cache is 
>> updated.  The majority of my pages will not change much.  When the 
>> Last-Modified header changes all the time, the page does not stay 
>> cached for long periods of time.  This problem might relate to FOR-19.
>>
>> Hopefully this isn't /too/ hard to pull off... :)
> 
> 
> I implemented some code in the CLI some time ago so that, as it builds a 
> site, it maintains checksums for each page in a checksum file, and only 
> writes the generated file back to disc if the file has changed (it still 
> has to generate it though). This way, the timestamps on files will 
> represent when the page actually last changed.
> 
> If this is useful, let me know and I'll dig out details.
> 
> Upayavira
> 
>>
>> -- 
>> Marshall Roch
>>
>> p.s. I'm shooting to launch my Forrest-powered redesign by Jan 1.  If 
>> you want to check it out early, see 
>> http://whs.winnacunnet.k12.nh.us:81/.  Browser tests/bug reports are 
>> appreciated.
>>
> 



Re: Document last modified

Posted by Upayavira <uv...@upaya.co.uk>.
Marshall Roch wrote:

> I finally got forrestbot2 working (so cool!), so I will probably set 
> it up to rebuild the site every 6 to 12 hours.  This will mean that 
> the last modified time in the HTTP headers and at the bottom of the 
> HTML will always be within 6 hours.
>
> I'd like to have a <last-modified /> tag in the header of my xdocs (or 
> something functionally equivalent), which could contain the CVS Date 
> tag, or a manually-updated date for people not using CVS.  This should 
> all be server-side so as to not rely on Javascript or the users' 
> system's date.
>
> Another issue with constantly rebuilding the site is the Last-Modified 
> header.  A lot of cache servers use something called "conditional 
> GET," where they ask Apache whether the Last-Modified and ETag headers 
> are still what they remember.  If they are different, the cache is 
> updated.  The majority of my pages will not change much.  When the 
> Last-Modified header changes all the time, the page does not stay 
> cached for long periods of time.  This problem might relate to FOR-19.
>
> Hopefully this isn't /too/ hard to pull off... :)

I implemented some code in the CLI some time ago so that, as it builds a 
site, it maintains checksums for each page in a checksum file, and only 
writes the generated file back to disc if the file has changed (it still 
has to generate it though). This way, the timestamps on files will 
represent when the page actually last changed.
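
The idea can be sketched roughly as follows. This is a Python illustration, not the actual Cocoon CLI code: the function names, the JSON checksum file and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def load_checksums(checksum_file):
    """Read the page -> checksum map, or start empty if the file is missing."""
    path = Path(checksum_file)
    return json.loads(path.read_text()) if path.exists() else {}


def save_checksums(checksum_file, checksums):
    """Persist the page -> checksum map for the next build."""
    Path(checksum_file).write_text(json.dumps(checksums, indent=2, sort_keys=True))


def write_if_changed(path, content, checksums):
    """Write freshly generated bytes only if they differ from the last build.

    Returns True when the file was written, False when it was left untouched
    (so its modification time still reflects the last real change).
    """
    path = Path(path)
    digest = hashlib.sha256(content).hexdigest()
    key = str(path)
    if checksums.get(key) == digest and path.exists():
        return False  # same output as last time: keep the old timestamp
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    checksums[key] = digest
    return True
```

Because the comparison is made on the generated page rather than on the source document, a change to the skin or the menus changes the checksum and the page is written again.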

If this is useful, let me know and I'll dig out details.

Upayavira
