Posted to dev@forrest.apache.org by David Crossley <cr...@apache.org> on 2005/10/28 09:32:16 UTC

speeding up the static build (Was: Roadmap for v2)

Gav.... wrote:
> 
> Which shows I think I mentioned before the only downside to static site
> generation.
> To make that one correction and add one word to the index.xml file meant I
> had to rebuild the entire site again with 'forrest site' and then re-upload.
> I chose to override my editors complaints that all files had been changed
> and do I really want to upload the entire site again - no, just index.*
> please !

What do you mean by "editors complaints"?
Are you using an editor to do the upload to
your website? That is not Forrest's fault then.

For the project website we use Subversion to store
the generated content. In that way only the changed
files get uploaded. We use forrestbot too of course.

You could use 'scp' to specifically copy certain files.
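
As a rough sketch of that idea (not something Forrest provides), the
helper below lists the files whose checksum differs from a manifest saved
at the last upload, so that only those need copying. The names
changed_files, build/site, .last-upload.cksum and user@example.org are
assumptions for illustration, and it only helps if unchanged pages come
out byte-identical from one build to the next.

```shell
# changed_files SITE_DIR MANIFEST
# Writes a fresh checksum snapshot to MANIFEST.new and prints the paths
# (relative to SITE_DIR) that are new or differ from MANIFEST.
# MANIFEST should live outside SITE_DIR. Hypothetical helper, not Forrest.
changed_files() {
  site=$1
  manifest=$2
  # One "checksum size path" line per file, sorted so comm can compare.
  ( cd "$site" && find . -type f -exec cksum {} + ) | LC_ALL=C sort > "$manifest.new"
  if [ -f "$manifest" ]; then
    # Keep only the lines that appear in the new snapshot alone.
    LC_ALL=C comm -13 "$manifest" "$manifest.new"
  else
    # No earlier upload recorded: every file counts as changed.
    cat "$manifest.new"
  fi | cut -d' ' -f3-
}

# Possible use (the host and remote path are placeholders):
#   changed_files build/site .last-upload.cksum | while read -r f; do
#     scp "build/site/$f" "user@example.org:/var/www/$f"
#   done
#   mv .last-upload.cksum.new .last-upload.cksum
```

The manifest is only replaced after the copy, so a failed upload is
retried on the next run.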

Someone suggested forrestbot by ftp too.

Sure, we know that there are ways to speed the site
build process. Cocoon CLI checksums. There is probably
a Jira issue registered for that.

There is a trick that can cut down your turnaround time
with building. In forrest.properties ...

# The URL to start crawling from
#project.start-uri=linkmap.html

Uncomment that and set it to the specific page that
you want. That will build that page, then of course
it will keep crawling links from there. It may be
confined to a sub-directory, but depending on links
could end up generating the whole site.

The main thing is that your page of interest is built
first.
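
A minimal sketch of making that change from the command line, assuming
the property line looks like the one shown above. set_start_uri is a
hypothetical helper, not a Forrest command, and index.html is just an
example page.

```shell
# set_start_uri PROPERTIES_FILE PAGE
# Rewrites the project.start-uri line, whether it is still commented out
# or already set, and leaves every other line untouched.
set_start_uri() {
  props=$1   # path to forrest.properties
  page=$2    # page to build first, e.g. index.html
  tmp="$props.tmp"
  sed "s|^#\{0,1\}project\.start-uri=.*|project.start-uri=$page|" "$props" > "$tmp" &&
    mv "$tmp" "$props"
}

# Possible use:
#   set_start_uri forrest.properties index.html
#   forrest site
```

Remember to comment the line out again (or set it back) before a full
build, otherwise pages not reachable from that start page may be missed.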

-David

Re: speeding up the static build (Was: Roadmap for v2)

Posted by David Crossley <cr...@apache.org>.
CFAS Webmaster wrote:
> Gav / David,
> 
>  Where does cli.xconf go for a site outside the forrest tree?  I have 
> it in {project_root}/src/documentation/conf/cli.xconf.  The uncommented 
> area for checksums reads like this:
> 
>   <checksums-uri>build/tmp/checksums</checksums-uri>
> 
> I changed it from build/work because I don't have one (yet).  Forrestbot 
> isn't behaving for me, but that's another message... 
> 
> I don't see checksums in build/tmp.  Do I need to "touch checksums" 
> there for it to be found?  This site is being generated by Forrest v0.7, 
> does that make a difference?

http://forrest.apache.org/docs/faq.html#cli-xconf

Declare it in forrest.properties, otherwise Forrest uses
the default one in $FORREST_HOME/main/webapp/WEB-INF

See our site-author for an example.
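
Roughly, that could look like the sketch below. The property name
project.configfile is an assumption here (check the FAQ entry above for
the exact name), and use_project_cli_xconf is a hypothetical helper, not
part of Forrest.

```shell
# use_project_cli_xconf FORREST_HOME PROJECT_HOME
# Copies the default cli.xconf into the project if it has none yet, then
# declares it in the project's forrest.properties.
use_project_cli_xconf() {
  forrest_home=$1
  project_home=$2
  conf_dir="$project_home/src/documentation/conf"
  mkdir -p "$conf_dir" || return 1
  # Never overwrite a cli.xconf that has already been customised.
  [ -f "$conf_dir/cli.xconf" ] ||
    cp "$forrest_home/main/webapp/WEB-INF/cli.xconf" "$conf_dir/cli.xconf" || return 1
  # Assumed property name; ${project.home} is written literally on purpose.
  printf '%s\n' 'project.configfile=${project.home}/src/documentation/conf/cli.xconf' \
    >> "$project_home/forrest.properties"
}

# Possible use:
#   use_project_cli_xconf "$FORREST_HOME" /path/to/project
```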

-David

Re: speeding up the static build (Was: Roadmap for v2)

Posted by David Crossley <cr...@apache.org>.
David Crossley wrote:
> Gav.... wrote:
> > Gav.... wrote:
> > | David Crossley wrote:
> > |
> > || Sure, we know that there are ways to speed the site
> > || build process. Cocoon CLI checksums. There is probably
> > || a Jira issue registered for that.
> > |
> > | Nothing recent that I can see,
> 
> I found it. Not in Jira, but an old discussion in the
> mail archives. It was a broader topic, checksums was
> just part of it.
>  http://marc.theaimsgroup.com/?t=112357127600001
>  Re: Reducing Forrest build time
> 
> > | I will uncomment out the line
> > | that reads <checksums-uri>build/work/checksums</checksums-uri>
> > | in the cli.xconf (site-author ?) , I guess I don't need to do anything
> > | else ?
> >
> > Nothing else needs doing, this works fine.
> > 
> > I did a 'forrest site' to an unchanged site and it took 3:55
> > 
> > I then enabled the checksums and did 'forrest site' again
> > it took over 4 minutes - during this time I guess the checksums
> > are at work.
> > 
> > I then did a 'forrest site' again and it took 2:50 and correctly
> > skipped all files.
> 
> Thanks for getting this clarified Gav.
> Yes, similar numbers for me, very impressive.
> That is a 20% speedup.

Grrr, got too excited and spoke too soon. I now reckon
that this speedup might be due to the Cocoon cache.

Try this ...
cd forrest-trunk/site-author
forrest clean
forrest
... took 4:41
forrest
... took 3:35
now enable checksums in site-author/conf/cli.xconf
forrest
... took 3:38 ... all files were generated.
forrest
... took 3:37 ... only changed files were written.
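
To put numbers on those runs, here is a small arithmetic sketch. The
helper names to_seconds and percent_saved are made up for illustration,
and the results are whole percentages, rounded toward zero.

```shell
# to_seconds M:SS -> total seconds
to_seconds() {
  min=${1%%:*}
  sec=${1##*:}
  sec=${sec#0}   # drop one leading zero so "05" is not read as octal
  echo $(( min * 60 + ${sec:-0} ))
}

# percent_saved BEFORE AFTER -> whole-number percentage of time saved
percent_saved() {
  before=$(to_seconds "$1")
  after=$(to_seconds "$2")
  echo $(( (before - after) * 100 / before ))
}

percent_saved 4:41 3:35   # clean build vs. second build: prints 23
percent_saved 3:35 3:37   # without vs. with checksums:    prints 0
```

On these figures nearly all of the saving appears between the clean
build and the second build, before checksums were enabled.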

See Ferdinand's discussion about checksums and then caching:

-David

> I also tested that it worked on retrieving dynamic content,
> e.g. in forrest/site-author we have the forrest-issues.html
> which gets the top priority issues from Jira. It worked fine.
> 
> > This should be enabled by default for a 'forrest site' I'd have thought ?
> 
> The trouble is that it writes the checksums file
> relative to the Cocoon context, i.e. $FORREST_HOME/main/webapp.
> So that will break it for multi-user installations
> because all projects would try to write to the same file.
> 
> So probably an FAQ for now.
> 
> > Anyway, will play with Forrestbot and see if it takes advantage of this.
> 
> That would be good to know too.
> 
> -David

Re: speeding up the static build (Was: Roadmap for v2)

Posted by David Crossley <cr...@apache.org>.
Gav.... wrote:
> Gav.... wrote:
> | David Crossley wrote:
> |
> || Sure, we know that there are ways to speed the site
> || build process. Cocoon CLI checksums. There is probably
> || a Jira issue registered for that.
> |
> | Nothing recent that I can see,

I found it. Not in Jira, but an old discussion in the
mail archives. It was a broader topic, checksums was
just part of it.
 http://marc.theaimsgroup.com/?t=112357127600001
 Re: Reducing Forrest build time

> | I will uncomment out the line
> | that reads <checksums-uri>build/work/checksums</checksums-uri>
> | in the cli.xconf (site-author ?) , I guess I don't need to do anything
> | else ?
>
> Nothing else needs doing, this works fine.
> 
> I did a 'forrest site' to an unchanged site and it took 3:55
> 
> I then enabled the checksums and did 'forrest site' again
> it took over 4 minutes - during this time I guess the checksums
> are at work.
> 
> I then did a 'forrest site' again and it took 2:50 and correctly
> skipped all files.

Thanks for getting this clarified Gav.
Yes, similar numbers for me, very impressive.
That is a 20% speedup.

I also tested that it worked on retrieving dynamic content,
e.g. in forrest/site-author we have the forrest-issues.html
which gets the top priority issues from Jira. It worked fine.

> This should be enabled by default for a 'forrest site' I'd have thought ?

The trouble is that it writes the checksums file
relative to the Cocoon context, i.e. $FORREST_HOME/main/webapp.
So that will break it for multi-user installations
because all projects would try to write to the same file.

So probably an FAQ for now.

> Anyway, will play with Forrestbot and see if it takes advantage of this.

That would be good to know too.

-David

Re: speeding up the static build (Was: Roadmap for v2)

Posted by CFAS Webmaster <we...@cfas.org>.
Gav / David,

  Where does cli.xconf go for a site outside the forrest tree?  I have 
it in {project_root}/src/documentation/conf/cli.xconf.  The uncommented 
area for checksums reads like this:

   <checksums-uri>build/tmp/checksums</checksums-uri>

I changed it from build/work because I don't have one (yet).  Forrestbot 
isn't behaving for me, but that's another message... 

I don't see checksums in build/tmp.  Do I need to "touch checksums" 
there for it to be found?  This site is being generated by Forrest v0.7, 
does that make a difference?

Thanks!
-Paul

Gav.... wrote:

>----- Original Message ----- 
>From: "Gav...." <br...@brightontown.com.au>
>To: <de...@forrest.apache.org>
>Sent: Friday, October 28, 2005 7:35 PM
>
>Subject: Re: speeding up the static build (Was: Roadmap for v2)
>
>|| Sure, we know that there are ways to speed the site
>|| build process. Cocoon CLI checksums. There is probably
>|| a Jira issue registered for that.
>|
>| Nothing recent that I can see, I will uncomment out the line
>| that reads <checksums-uri>build/work/checksums</checksums-uri>
>| in the cli.xconf (site-author ?) , I guess I don't need to do anything
>| else ?
>
>Nothing else needs doing, this works fine.
>
>I did a 'forrest site' to an unchanged site and it took 3:55
>
>I then enabled the checksums and did 'forrest site' again
>it took over 4 minutes - during this time I guess the checksums
>are at work.
>
>I then did a 'forrest site' again and it took 2:50 and correctly
>skipped all files.
>
>This should be enabled by default for a 'forrest site' I'd have thought ?
>
>Anyway, will play with Forrestbot and see if it takes advantage of this.
>
>Gav...

Re: speeding up the static build (Was: Roadmap for v2)

Posted by "Gav...." <br...@brightontown.com.au>.
----- Original Message ----- 
From: "Gav...." <br...@brightontown.com.au>
To: <de...@forrest.apache.org>
Sent: Friday, October 28, 2005 7:35 PM

Subject: Re: speeding up the static build (Was: Roadmap for v2)

|| Sure, we know that there are ways to speed the site
|| build process. Cocoon CLI checksums. There is probably
|| a Jira issue registered for that.
|
| Nothing recent that I can see, I will uncomment out the line
| that reads <checksums-uri>build/work/checksums</checksums-uri>
| in the cli.xconf (site-author ?) , I guess I don't need to do anything
| else ?

Nothing else needs doing, this works fine.

I did a 'forrest site' to an unchanged site and it took 3:55

I then enabled the checksums and did 'forrest site' again
it took over 4 minutes - during this time I guess the checksums
are at work.

I then did a 'forrest site' again and it took 2:50 and correctly
skipped all files.

This should be enabled by default for a 'forrest site' I'd have thought ?

Anyway, will play with Forrestbot and see if it takes advantage of this.

Gav...







Re: speeding up the static build (Was: Roadmap for v2)

Posted by "Gav...." <br...@brightontown.com.au>.
----- Original Message ----- 
From: "David Crossley" <cr...@apache.org>
To: <de...@forrest.apache.org>
Sent: Friday, October 28, 2005 3:32 PM
Subject: speeding up the static build (Was: Roadmap for v2)


| Gav.... wrote:
| >
| > Which shows I think I mentioned before the only downside to static site
| > generation.
| > To make that one correction and add one word to the index.xml file meant
| > I had to rebuild the entire site again with 'forrest site' and then
| > re-upload. I chose to override my editors complaints that all files had
| > been changed and do I really want to upload the entire site again - no,
| > just index.* please !
|
| What do you mean by "editors complaints"?
| Are you using an editor to do the upload to
| your website? That is not Forrest's fault then.

Sometimes Dreamweaver MX, other times a dedicated FTP client; either
way, the files have changed as far as they are concerned. No, not Forrest's
fault, and apologies if it sounded that way.

|
| For the project website we use Subversion to store
| the generated content. In that way only the changed
| files get uploaded. We use forrestbot too of course.

I think I am going to give Forrestbot a try. This will not
solve the above, but it is one less thing to do myself.

|
| You could use 'scp' to specifically copy certain files.
|
| Someone suggested forrestbot by ftp too.

That's the method I will try.

|
| Sure, we know that there are ways to speed the site
| build process. Cocoon CLI checksums. There is probably
| a Jira issue registered for that.

Nothing recent that I can see. I will uncomment the line
that reads <checksums-uri>build/work/checksums</checksums-uri>
in the cli.xconf (site-author?). I guess I don't need to do anything else?


|
| There is a trick that can cut down your turnaround time
| with building. In forrest.properties ...
|
| # The URL to start crawling from
| #project.start-uri=linkmap.html
|
| Uncomment that and set it to the specific page that
| you want. That will build that page, then of course
| it will keep crawling links from there. It may be
| confined to a sub-directory, but depending on links
| could end up generating the whole site.
|
| The main thing is that your page of interest is built
| first.

Thanks for that. I may be way out here as I don't know the
specifics, but I wonder if this process can be copied and then enhanced
somehow to create a 'build one file' tool.
Where you say, 'will build that page, and then ....' stop right there, don't
crawl, we are done. Is that even feasible, and do you think it is worth it?
Of course, if the CLI checksums thing works then there is no need.

Gav...



