Posted to dev@forrest.apache.org by Ferdinand Soethe <sa...@soethe.net> on 2005/05/03 05:21:10 UTC

Inconsistency in implementation of default file and site.xml and what to do about it

Sorry, this needs to be long (and dirty)

In a freshly seeded site, change site.xml to:

<site label="Demo Site" xmlns="http://apache.org/forrest/linkmap/1.0" tab="">
  <menu1 label="Menu 1">
    <page1 label="Page 1" href="page1.html"/> 
    <page2 label="Page 2" href="page2.html"/> 
  </menu1>
</site>

Place these files in the xdocs dir

index.xml
page1.xml
page2.xml
site.xml
tabs.xml

Now run forrest.
In build/site you will get the following files, which is perfect
because the first file in site.xml becomes the start page of our
statically rendered site.

linkmap.html
linkmap.pdf
page1.html
page1.pdf
page2.html
page2.pdf

Now do a forrest run

Unfortunately that is where the trouble begins.
When you request just localhost:8888, Forrest will load index.html or,
if you remove it, give you an ugly error message.

Same site, two different results. Shouldn't be, should it?

Now looking at ways to fix this I'll consider Ross's suggestions one
by one:

> From cli.xconf:
> 
>     <!--+
>         |  Specifies the filename to be appended to URIs that
>         |  refer to a directory (i.e. end with a forward slash).
>         +-->
>     <default-filename>index.html</default-filename>

This changes the default name for all directory-only URIs, so changing
it to page1.html will technically solve our problem. But who wants to
change that for all the pages just to be able to have a
non-standard start-up page? And what about the confusion when the
buyer of a documentation CD finds a start-me.html in every
subdirectory?

> Also from sitemap.xmap:
> 
>        <map:match pattern="">
>          <map:redirect-to uri="index.html" />
>        </map:match>
>        <map:match type="regexp" pattern="^.+/$">
>            <map:redirect-to uri="index.html"/>
>        </map:match>

This is more promising since pattern="" will only apply to
http://myserver and not to http://myserver/mydir (I think).

Problem is the second pattern for stepping back up directories by
entering a '..'-URI. It would have to be replaced by two different
patterns for first level subdirs (use same setting as pattern="") and
other levels (keep index.html).


(My) Conclusion:

None of this is nice or easy to do or explain. So until somebody
writes a patch that will derive the name of the first project page
from the first file URL in site.xml (or an attribute 'startpage' in
the site element), I suggest we pretend that it is rigid and keep the
note in site.xml as it is.

--
Ferdinand Soethe


Re: Inconsistency in implementation of default file and site.xml and what to do about it

Posted by Ferdinand Soethe <sa...@soethe.net>.
Ross Gardler wrote:

RG> Good point, but this need not be changed unless you want to change the
RG> default in every page. One of your use cases was that a non-English site
RG> might not want to use index.html. If that were the case, it would
RG> apply to *every* directory, wouldn't it?

No, it wouldn't. One common way to make sure an inexperienced user
finds and opens the right file to start with is to give it a unique and
self-explanatory name such as 'start_by_opening_this_file.html'.

Having that name repeated in every directory might work, because
people tend not to go deeper down the directory tree than they have to,
but it is certainly not intended.



>>>Also from sitemap.xmap:
>>>
>>>       <map:match pattern="">
>>>         <map:redirect-to uri="index.html" />
>>>       </map:match>
>>>       <map:match type="regexp" pattern="^.+/$">
>>>           <map:redirect-to uri="index.html"/>
>>>       </map:match>
>> 
>> 
>> This is more promising since pattern="" will only apply to
>> http://myserver and not to http://myserver/mydir (I think).

RG> Correct

>> Problem is the second pattern for stepping back up directories by
>> entering a '..'-URI. It would have to be replaced by two different
>> patterns for first level subdirs (use same setting as pattern="") and
>> other levels (keep index.html).

RG> That's not what the second pattern means.

Never draw conclusions in the middle of the night :-)
Thanks for explaining that.

RG> This will result in the site index page becoming "start-here.html" but
RG> all subdirectories defaulting to index.html.

>> None of this is nice or easy to do or explain.

RG> Is it easier now I have explained it more completely?

It is. Thanks for taking the time. I will try to put that into a short
comment and a longer FAQ entry.

--
Ferdinand Soethe


Re: Inconsistency in implementation of default file and site.xml and what to do about it

Posted by Ross Gardler <rg...@apache.org>.
Ferdinand Soethe wrote:
> Sorry, this needs to be long (and dirty)
> 
> In a freshly seeded site, change site.xml to:
> 
> <site label="Demo Site" xmlns="http://apache.org/forrest/linkmap/1.0" tab="">
>   <menu1 label="Menu 1">
>     <page1 label="Page 1" href="page1.html"/> 
>     <page2 label="Page 2" href="page2.html"/> 
>   </menu1>
> </site>
> 
> Place these files in the xdocs dir
> 
> index.xml
> page1.xml
> page2.xml
> site.xml
> tabs.xml
> 
> Now run forrest.
> In build/site you will get the following files, which is perfect
> because the first file in site.xml becomes the start page of our
> statically rendered site.
> 
> linkmap.html
> linkmap.pdf
> page1.html
> page1.pdf
> page2.html
> page2.pdf
> 
> Now do a forrest run
> 
> Unfortunately that is where the trouble begins.
> When you request just localhost:8888, Forrest will load index.html or,
> if you remove it, give you an ugly error message.
> 
> Same site, two different results. Shouldn't be, should it?
> 
> Now looking at ways to fix this I'll consider Ross's suggestions one
> by one:
> 
> 
>>From cli.xconf:
>>
>>    <!--+
>>        |  Specifies the filename to be appended to URIs that
>>        |  refer to a directory (i.e. end with a forward slash).
>>        +-->
>>    <default-filename>index.html</default-filename>
> 
> 
> This changes the default name for all directory-only URIs, so changing
> it to page1.html will technically solve our problem. But who wants to
> change that for all the pages just to be able to have a
> non-standard start-up page? And what about the confusion when the
> buyer of a documentation CD finds a start-me.html in every
> subdirectory?

Good point, but this need not be changed unless you want to change the
default in every page. One of your use cases was that a non-English site
might not want to use index.html. If that were the case, it would
apply to *every* directory, wouldn't it?

>>Also from sitemap.xmap:
>>
>>       <map:match pattern="">
>>         <map:redirect-to uri="index.html" />
>>       </map:match>
>>       <map:match type="regexp" pattern="^.+/$">
>>           <map:redirect-to uri="index.html"/>
>>       </map:match>
> 
> 
> This is more promising since pattern="" will only apply to
> http://myserver and not to http://myserver/mydir (I think).

Correct

> Problem is the second pattern for stepping back up directories by
> entering a '..'-URI. It would have to be replaced by two different
> patterns for first level subdirs (use same setting as pattern="") and
> other levels (keep index.html).

That's not what the second pattern means.

If you request "subdir/../xyz.html", the request that Cocoon sees is
"xyz.html". That is, the ".." is resolved before the match is attempted.
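
As a rough illustration of that normalisation step, here is a small
Python sketch. It is only an analogy: normalize_request is a
hypothetical helper, and Python's posixpath.normpath is assumed to
collapse ".." segments in the same way as whatever component resolves
the URI before Cocoon sees it.

```python
import posixpath


def normalize_request(path: str) -> str:
    """Collapse '.' and '..' segments, as an analogy for URI normalisation.

    Hypothetical helper for illustration; not part of Forrest or Cocoon.
    """
    return posixpath.normpath(path)


# The '..' cancels out 'subdir', so only the file name is left to match.
print(normalize_request("subdir/../xyz.html"))
```

The point is simply that no pattern in the sitemap ever needs to deal
with ".." itself.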

The pattern "^.+/$" is a regular expression (see type="regexp"). It means:

'^' - beginning of the request URI
'.' - any single character (except a newline)
'+' - one or more of the preceding item
'/' - a '/' character
'$' - end of the request URI

So we match any request URI that has one or more characters of any type
*and* that ends in a '/' character.
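
To check which requests that expression catches, a minimal Python
sketch follows. Python's re module is used here only as a stand-in for
Cocoon's regexp matcher, on the assumption that both treat this simple
pattern the same way; is_directory_request is a hypothetical name.

```python
import re

# Same expression as in the sitemap: at least one character, then a final '/'.
DIRECTORY_PATTERN = re.compile(r"^.+/$")


def is_directory_request(uri: str) -> bool:
    """Return True if the URI would be caught by the '^.+/$' matcher."""
    return DIRECTORY_PATTERN.match(uri) is not None


# Try the site root (empty URI), two directories, a page and a bare slash.
for uri in ["", "mydir/", "mydir/sub/", "page1.html", "/"]:
    print(repr(uri), is_directory_request(uri))
```

Note that the empty URI is not matched by this expression, which is why
the site root needs its own pattern="" matcher.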

So, to achieve what you need simply create a project sitemap with:

        <map:match pattern="">
          <map:redirect-to uri="start-here.html" />
        </map:match>

This will result in the site index page becoming "start-here.html" but 
all subdirectories defaulting to index.html.
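
The combined effect of the two matchers can be modelled in a few lines
of Python. This is a sketch under assumptions, not Cocoon code: it
assumes the project sitemap's pattern="" match is tried before the
default directory match, and that the relative redirect to
"index.html" is resolved against the requested directory.
redirect_target is a hypothetical name.

```python
import re
from typing import Optional

# The default directory matcher quoted from sitemap.xmap.
DIRECTORY_PATTERN = re.compile(r"^.+/$")


def redirect_target(uri: str, start_page: str = "start-here.html") -> Optional[str]:
    """Model which page a request is redirected to, or None for no redirect."""
    if uri == "":
        # Project sitemap: only the site root goes to the custom start page.
        return start_page
    if DIRECTORY_PATTERN.match(uri):
        # Default rule: any other directory still gets index.html.
        return uri + "index.html"
    return None


for uri in ["", "mydir/", "page1.html"]:
    print(repr(uri), "->", redirect_target(uri))
```

Only the root request changes; directory requests keep their usual
default.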

> (My) Conclusion:
> 
> None of this is nice or easy to do or explain. 

Is it easier now I have explained it more completely?

> So until somebody
> writes a patch that will derive the name of the first project page
> from the first file url in site.xml (or an attribute 'startpage' in
> the site-element).

I very much doubt that patch will appear, since we cannot assume that
site.xml will be present. It is possible to do it with a fallback to
using index.html, but since the customisation is simple, why do we need it?
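
For reference, the lookup such a patch would need could be sketched as
below. This is purely illustrative Python, not a proposal for Forrest
itself: first_page_href is a hypothetical function, 'startpage' is the
hypothetical attribute suggested in this thread, and the sketch ignores
how hrefs of nested elements may combine in a real site.xml.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def first_page_href(site_xml: Optional[str], fallback: str = "index.html") -> str:
    """Pick a start page from site.xml text, falling back to index.html."""
    if not site_xml:
        # site.xml may be absent, so a fallback is always needed.
        return fallback
    root = ET.fromstring(site_xml)
    explicit = root.get("startpage")  # hypothetical attribute from this thread
    if explicit:
        return explicit
    # Walk the elements in document order and take the first non-empty href.
    for element in root.iter():
        href = element.get("href")
        if href:
            return href
    return fallback


SITE_XML = """<site label="Demo Site" xmlns="http://apache.org/forrest/linkmap/1.0" tab="">
  <menu1 label="Menu 1">
    <page1 label="Page 1" href="page1.html"/>
    <page2 label="Page 2" href="page2.html"/>
  </menu1>
</site>"""

print(first_page_href(SITE_XML))
print(first_page_href(None))
```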

 > I suggest we pretend that it is rigid and keep the
 > note in site.xml as it is.

I am -1 on such documentation. If we document that something is not 
possible it means that people who need that feature do not pursue it. As 
you see above, the Open Source way is to discuss an issue to find a 
solution. Generally the community will come up with a good solution and 
this issue is a great example of how that works.

If you read back over your original thread you will find that your 
questions have prompted us to discover a solution (yet to be tested, but 
I am sure you will do that). In addition, some readers of this thread 
may have learnt a little more about how Forrest works (I certainly 
didn't know how to do this until you asked the question). Furthermore, a 
problem in the way the skins work has been highlighted since they use a 
hard coded version of the "index.html" string (this can easily be 
changed if my suggestion above actually works).

It is possible that if the documentation had explicitly stated that it 
*has* to be index.html this discussion, and the resulting improvement of 
Forrest, may never have happened. Forrest was designed from the ground 
up to be configurable. There is nearly always a way to configure it to 
work how you need it. It's just that we may not know how yet.

Ross