Posted to dev@forrest.apache.org by Nicola Ken Barozzi <ni...@apache.org> on 2004/09/07 13:48:39 UTC

[Proposal] Linkmap as Forrest site generation entry point

I propose that instead of crawling from index.html we do so from the 
linkmap.html page, which is generated from site.xml.

Genesis
-------
All this started with plain-dev: it's a skin that outputs each page in 
pure html, without any navigation, menu, header, footer... just the 
page. In this way the result can be used as a new version of the site 
sources, but all in html (which now Forrest can render).

After the first cut I ran the generation and... it outputted only 
index.html. Of course, it does not have navigation!

So I created a linkmap.html match that creates a linkmap TOC page (which 
IIRC was also a feature request) and made the crawling start from there. 
Now it works.
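
A minimal sketch of the idea, not the actual sitemap match: it walks a simplified, hypothetical site.xml and emits a flat list of links that a crawler could start from. The sample XML, the function names, and the assumption that href values accumulate from parent to child are illustrative; real site.xml files use a namespace and more attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified site.xml (no namespace, made-up entries).
SITE_XML = """
<site label="MyProj" href="">
  <about label="About">
    <index label="Index" href="index.html"/>
    <faq label="FAQ" href="faq.html"/>
  </about>
  <docs label="Docs" href="docs/">
    <howto label="How-to" href="howto.html"/>
  </docs>
</site>
"""


def collect_links(element, prefix=""):
    """Return (label, path) pairs for every node that names a page."""
    own = element.get("href", "")
    path = prefix + own  # assumption: hrefs accumulate down the tree
    links = []
    # A node counts as a page only if its own href names a file,
    # not when it is empty or a directory prefix such as "docs/".
    if own and not own.endswith("/"):
        links.append((element.get("label", element.tag), path))
    for child in element:
        links.extend(collect_links(child, path))
    return links


def render_linkmap(links):
    """Render the pairs as a bare HTML list, one anchor per page."""
    items = "\n".join(
        f'  <li><a href="{path}">{label}</a></li>' for label, path in links
    )
    return f"<ul>\n{items}\n</ul>"


links = collect_links(ET.fromstring(SITE_XML.strip()))
print(render_linkmap(links))
```

Because every page listed in site.xml appears as an anchor on this one page, a crawler that starts here reaches all of them even when the skin emits no navigation.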

Why use it always?
------------------
First of all it's not necessarily true that our users want index.html as 
the main page (if they do a non-html site), and it creates a clearer 
contract WRT what gets generated first by Forrest (all the site.xml links).

Note
----
This *will* be backwards incompatible, as it will not generate the 
index.html page if it's not inserted in the navigation or if it's not 
linked.

Think about it, then tell me WYT.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Ross Gardler <rg...@apache.org>.
Nicola Ken Barozzi wrote:
> 
> I propose that instead of crawling from index.html we do so from the 
> linkmap.html page, that is generated from site.xml.
>
> Genesis
> -------
> All this started with plain-dev: it's a skin that outputs each page in 
> pure html, without any navigation, menu, header, footer... just the 
> page. In this way the result can be used as a new version of the site 
> sources, but all in html (which now Forrest can render).

I have a use case for this: I need to generate HTML pages without the 
navigation for inclusion in Learning Objects that are then imported into 
Virtual Learning Environments (which provide the navigation). I have 
hacked a solution that generates a uris.txt file from the 
imsmanifest.xml file (the Content Package equivalent of site.xml and 
tabs.xml). However, this is done in the ant build script not within 
Forrest itself.

I think that this proposal of using linkmap.html is much better than my 
ugly hack and would enable me to remove another of the "extensions" I 
have had to build - got to be a good sign for the Forrest.

I'm +1 for this.

Ross

Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Dave Brondsema <da...@brondsema.net>.
On Wed, 8 Sep 2004, Clay Leeds wrote:

> On Sep 7, 2004, at 11:34 PM, Nicola Ken Barozzi wrote:
> > Dave Brondsema wrote:
> > ...
> >> Will `forrest -Dproject.start-uri=myfile.html` still work?
> >
> > Yes, I hadn't thought of that! :-)
>
> Does this mean one could process a single page? If so, sounds nice. I
> don't think it would work for adding a new page to a web site (the
> other pages wouldn't have a link to this file), but for modifying a
> page, that might be nifty! If that's not what this is for, then... what
> is the benefit and... uh... Never mind!
>

I think there's something like project.follow-links=false which can get
passed to the CLI crawler.  Not sure offhand if that's the exact name,
you'd have to check.

-- 
Dave Brondsema : dave@brondsema.net
http://www.brondsema.net : personal
http://www.splike.com : programming
http://csx.calvin.edu : student org

Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Dave Brondsema <da...@brondsema.net>.
Quoting Clay Leeds <cl...@medata.com>:

> David Crossley said:
> > Clay Leeds wrote:
> >> Nicola Ken Barozzi wrote:
> >> > Dave Brondsema wrote:
> >> > ...
> >> >> Will `forrest -Dproject.start-uri=myfile.html` still work?
> >> >
> >> > Yes, I hadn't thought of that! :-)
> >>
> >> Does this mean one could process a single page? If so, sounds nice. I
> >> don't think it would work for adding a new page to a web site (the
> >> other pages wouldn't have a link to this file), but for modifying a
> >> page, that might be nifty! If that's not what this is for, then... what
> >> is the benefit and... uh... Never mind!
> >>
> >> Web Maestro Clay
> >
> > I cannot quite parse you there Clay.
> > Using the Cocoon cli.xconf files you can add extra
> > files for processing. For an example with the extra
> > "mirrors.html" See the ./forrest.properties file which
> > is for Forrest's own site documentation.
> >
> > --
> > David Crossley
> 
> Sorry to confuse... I had just never seen nor used the command `forrest
> -Dproject.start-uri=myfile.html` before. I don't know what it's for.
> Having seen it, it *looks* like it is a way to process a single file
> (although it may be more like begin at file number 15, and skip 1-14 or some
> such--I don't know what it's for, so I tend to make up what I think it's
> for... I tried to rtfm but couldn't find anything...).
> 
> I guess my main question is:
> 
> What does `forrest -Dproject.start-uri=myfile.html` do?
> 
> But that got me to think, "Wouldn't it be nice if forrest could process a
> single changed file?"
> 
> Hope that clears it up!
> 
> Web Maestro Clay
> 

The way files are generated is that the Cocoon CLI (command line interface) acts
like a web crawler.  It starts at ${project.start-uri} and follows all the links
it gets (by this time, the linkrewriting has happened so site: and ext: links
are translated to what they appear as in HTML).  AFAIK, there is no pattern in
the order of links followed.  If you don't want to follow links (i.e., generate
just the ${project.start-uri} page and no others), you have to override
cli.xconf and set follow-links="false" in it.  I don't think there is a way to
set that follow-links parameter on the commandline.

Note that generating a single page like this still takes a long time because
all the normal Forrest startup work still happens (including copying raw
files).
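
The crawling behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Cocoon CLI: the `crawl` function, its `follow_links` flag, and the sample site are hypothetical, and the breadth-first order is an artifact of the sketch, since the real crawler promises no particular order.

```python
from collections import deque


def crawl(pages, start_uri, follow_links=True):
    """Return the URIs generated when crawling from start_uri.

    pages maps each URI to the list of URIs that page links to.
    """
    generated = []
    seen = {start_uri}
    queue = deque([start_uri])
    while queue:
        uri = queue.popleft()
        if uri not in pages:
            continue  # broken link: nothing to generate
        generated.append(uri)
        if not follow_links:
            break  # generate only the start page
        for target in pages[uri]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return generated


# Hypothetical site rendered with a skin that emits no navigation,
# so only the linkmap page links to the other pages.
plain_site = {
    "linkmap.html": ["index.html", "faq.html", "docs/howto.html"],
    "index.html": [],
    "faq.html": [],
    "docs/howto.html": [],
}

print(crawl(plain_site, "index.html"))    # start page has no links
print(crawl(plain_site, "linkmap.html"))  # every page is reachable
print(crawl(plain_site, "linkmap.html", follow_links=False))
```

Starting from index.html yields only that page, starting from linkmap.html reaches the whole site, and turning link following off stops after the start page.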


-- 
Dave Brondsema : dave@brondsema.net 
http://www.brondsema.net : personal 
http://www.splike.com : programming 
http://csx.calvin.edu : student org 

Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Clay Leeds <cl...@medata.com>.
David Crossley said:
> Clay Leeds wrote:
>> Nicola Ken Barozzi wrote:
>> > Dave Brondsema wrote:
>> > ...
>> >> Will `forrest -Dproject.start-uri=myfile.html` still work?
>> >
>> > Yes, I hadn't thought of that! :-)
>>
>> Does this mean one could process a single page? If so, sounds nice. I
>> don't think it would work for adding a new page to a web site (the
>> other pages wouldn't have a link to this file), but for modifying a
>> page, that might be nifty! If that's not what this is for, then... what
>> is the benefit and... uh... Never mind!
>>
>> Web Maestro Clay
>
> I cannot quite parse you there Clay.
> Using the Cocoon cli.xconf files you can add extra
> files for processing. For an example with the extra
> "mirrors.html" See the ./forrest.properties file which
> is for Forrest's own site documentation.
>
> --
> David Crossley

Sorry to confuse... I had just never seen nor used the command `forrest
-Dproject.start-uri=myfile.html` before. I don't know what it's for.
Having seen it, it *looks* like it is a way to process a single file
(although it may be more like begin at file number 15, and skip 1-14 or some
such--I don't know what it's for, so I tend to make up what I think it's
for... I tried to rtfm but couldn't find anything...).

I guess my main question is:

What does `forrest -Dproject.start-uri=myfile.html` do?

But that got me to think, "Wouldn't it be nice if forrest could process a
single changed file?"

Hope that clears it up!

Web Maestro Clay

-- 
Clay Leeds - cleeds@medata.com
Web Developer - Medata, Inc. - http://www.medata.com
PGP Public Key: https://mail.medata.com/pgp/cleeds.asc



Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by David Crossley <cr...@apache.org>.
Clay Leeds wrote:
> Nicola Ken Barozzi wrote:
> > Dave Brondsema wrote:
> > ...
> >> Will `forrest -Dproject.start-uri=myfile.html` still work?
> >
> > Yes, I hadn't thought of that! :-)
> 
> Does this mean one could process a single page? If so, sounds nice. I 
> don't think it would work for adding a new page to a web site (the 
> other pages wouldn't have a link to this file), but for modifying a 
> page, that might be nifty! If that's not what this is for, then... what 
> is the benefit and... uh... Never mind!
> 
> Web Maestro Clay

I cannot quite parse you there, Clay.
Using the Cocoon cli.xconf files you can add extra
files for processing. For an example with the extra
"mirrors.html", see the ./forrest.properties file which
is for Forrest's own site documentation.

-- 
David Crossley


Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Clay Leeds <cl...@medata.com>.
On Sep 7, 2004, at 11:34 PM, Nicola Ken Barozzi wrote:
> Dave Brondsema wrote:
> ...
>> Will `forrest -Dproject.start-uri=myfile.html` still work?
>
> Yes, I hadn't thought of that! :-)

Does this mean one could process a single page? If so, sounds nice. I 
don't think it would work for adding a new page to a web site (the 
other pages wouldn't have a link to this file), but for modifying a 
page, that might be nifty! If that's not what this is for, then... what 
is the benefit and... uh... Never mind!

Web Maestro Clay


Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Dave Brondsema wrote:
...
> Will `forrest -Dproject.start-uri=myfile.html` still work? 

Yes, I hadn't thought of that! :-)

> If so, the 
> easy workaround for compatibility is to put project.start-uri=index.html in 
> the project's forrest.properties.

Good idea, +1 all over the place :-)

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Dave Brondsema <da...@brondsema.net>.
Nicola Ken Barozzi wrote:
> 
> I propose that instead of crawling from index.html we do so from the 
> linkmap.html page, that is generated from site.xml.
> 
> Genesis
> -------
> All this started with plain-dev: it's a skin that outputs each page in 
> pure html, without any navigation, menu, header, footer... just the 
> page. In this way the result can be used as a new version of the site 
> sources, but all in html (which now Forrest can render).
> 
> After the first cut I ran the generation and... it outputted only 
> index.html. Of course, it does not have navigation!
> 
> So I created a linkmap.html match that creates a linkmap TOC page (which 
> IIRC was also a feature request) and made the crawling start from there. 
> Now it works.
> 
> Why use it always?
> ------------------
> First of all it's not necessarily true that our users want index.html as 
> the main page (if they do a non-html site), and it creates a clearer 
> contract WRT what gets generated first by Forrest (all the site.xml links).
> 
> Note
> ----
> This *will* be backwards incompatible, as it will not generate the 
> index.html page if it's not inserted in the navigation or if it's not 
> linked.
> 
> Think about it, then tell me WYT.
> 

As long as we're still version < 1.0, I'm +1 for anything good even if 
it breaks backwards compatibility.  And this one looks like 2 good 
things (single-page plain html will be useful for the test suite).

Will `forrest -Dproject.start-uri=myfile.html` still work?  If so, the 
easy workaround for compatibility is to put project.start-uri=index.html in 
the project's forrest.properties.

-- 
Dave Brondsema : dave@brondsema.net
http://www.splike.com : programming
http://csx.calvin.edu : student org
http://www.brondsema.net : personal

Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Jason End wrote:

> I'd find the pure html output extremely useful. I'm
> currently going through hell with my project's
> publication workflow because I'm the only one capable
> of editing our webpage. 
> 
> With this solution I could plug all the document htmls
> into a CMS and allow the rest of the staff to work on
> structure, news, RSS...
> 
> Nice work,

Thanks, I'm happy you find it useful :-)

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: [Proposal] Linkmap as Forrest site generation entry point

Posted by Jason End <be...@yahoo.com>.
I'd find the pure html output extremely useful. I'm
currently going through hell with my project's
publication workflow because I'm the only one capable
of editing our webpage. 

With this solution I could plug all the document htmls
into a CMS and allow the rest of the staff to work on
structure, news, RSS...

Nice work,

Jay

--- Nicola Ken Barozzi <ni...@apache.org> wrote:

> 
> I propose that instead of crawling from index.html
> we do so from the 
> linkmap.html page, that is generated from site.xml.
> 
> Genesis
> -------
> All this started with plain-dev: it's a skin that
> outputs each page in 
> pure html, without any navigation, menu, header,
> footer... just the 
> page. In this way the result can be used as a new
> version of the site 
> sources, but all in html (which now Forrest can
> render).
> 
> After the first cut I ran the generation and... it
> outputted only 
> index.html. Of course, it does not have navigation!
> 
> So I created a linkmap.html match that creates a
> linkmap TOC page (which 
> IIRC was also a feature request) and made the
> crawling start from there. 
> Now it works.
> 
> Why use it always?
> ------------------
> First of all it's not necessarily true that our
> users want index.html as 
> the main page (if they do a non-html site), and it
> creates a clearer 
> contract WRT what gets generated first by Forrest
> (all the site.xml links).
> 
> Note
> ----
> This *will* be backwards incompatible, as it will
> not generate the 
> index.html page if it's not inserted in the
> navigation or if it's not 
> linked.
> 
> Think about it, then tell me WYT.
> 
> -- 
> Nicola Ken Barozzi                   nicolaken@apache.org
>              - verba volant, scripta manent -
>     (discussions get forgotten, just code remains)
> ---------------------------------------------------------------------
> 
> 



		