Posted to dev@forrest.apache.org by Steven Noels <st...@outerthought.org> on 2002/07/26 23:15:09 UTC
URI namespace management & the sitemap
Hi all,
happy as I am with the current progress we made with the new forrestbot,
I'm planning to convert some private sites to Forrest now, and was
immediately stumped by the fact that we haven't thoroughly discussed or
analyzed the URI namespace management. So I want to regroup a number of
thoughts I have and throw this in the group, hopefully ending up with
some checklist and a nice todo to refactor the sitemap and the assorted
skins and xdocs.
1) Index documents
Currently, there is no matcher set up for URIs ending with a trailing
slash, which means quite a few broken links were reported by the
CLI during my initial trial of outerthought.net. I believe we should add
those matchers so that we support book.xml links like this:
<menu label="Document Samples">
  <menu-item label="DTD documentation" href="dtd-docs.html"/>
  <menu-item label="document-v11 (HTML)" href="document-v11.html"/>
  <menu-item label="document-v11 (PDF)" href="document-v11.pdf"/>
  <menu-item label="How-Tos" href="community/howto/"/>
                                               ^^^^
  <menu-item label="xml.apache.org" href="xml-site/"/>
                                               ^^^^
</menu>
and in normal <link> links also, of course.
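As a rough, untested sketch of such a matcher: it mirrors the existing *.html aggregation and maps a trailing-slash URI onto the index document of that directory. The book-/tab-/body- part URIs below are my assumption and would have to match whatever the internal pipelines really expect; the components section is omitted.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- "community/howto/" is handled like "community/howto/index.html";
           {1} holds everything before the trailing slash -->
      <map:match pattern="**/">
        <map:aggregate element="site">
          <!-- hypothetical internal URIs, modelled on the *.html matcher -->
          <map:part src="cocoon:/book-{1}/index.xml"/>
          <map:part src="cocoon:/tab-{1}/index.xml"/>
          <map:part src="cocoon:/body-{1}/index.xml" label="content"/>
        </map:aggregate>
        <map:call resource="skinit">
          <map:parameter name="type" value="site2xhtml"/>
        </map:call>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Whether the CLI then writes the result to community/howto/index.html on disk is a separate question that this sketch does not answer.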
2) Filename extensions
I gather some people will want to generate filenames with other
extensions than the default .html ones, e.g. if they want to have
further serverside include behaviour triggered on the webserver level.
I'm +0 on whether we really should support this, but maybe some people
creating their own skins will want to generate .php3, .shtml or similar
files. I believe we could support using an XSLT library template doing
file extension rewriting for link elements, and having configurable
matchers, but I'm not sure whether this is supported with Cocoon:
<map:match pattern="*.{ext-parameter}">
  <map:aggregate element="site">
    <map:part src="cocoon:/book-{1}.xml"/>
    <map:part src="cocoon:/tab-{1}.xml"/>
    <map:part src="cocoon:/body-{1}.xml" label="content"/>
  </map:aggregate>
  <map:call resource="skinit">
    <map:parameter name="type" value="site2xhtml"/>
  </map:call>
</map:match>
and then some general sitemap parameter {ext-parameter} being set to
"html" to start with (the pattern already supplies the dot), hopefully
configurable through some CLI inputmodule that could override this
parameter from the command line.
Is this feasible?
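For the XSLT side of this, here is a minimal sketch of the kind of library template I have in mind. It assumes links are <link href="..."/> elements and that the target extension arrives as a stylesheet parameter; the parameter name "ext" is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: the extension to generate, including the dot -->
  <xsl:param name="ext" select="'.html'"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite href values that end in ".html" (XSLT 1.0 has no ends-with) -->
  <xsl:template match="link/@href[substring(., string-length(.) - 4) = '.html']">
    <xsl:attribute name="href">
      <xsl:value-of select="concat(substring(., 1, string-length(.) - 5), $ext)"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

With ext set to ".shtml", <link href="dtd-docs.html"/> would come out as <link href="dtd-docs.shtml"/>. Links carrying a fragment (foo.html#bar) are left alone by this sketch and would need extra handling.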
3) host and project location
With the current issue of the tab-link prefixes being hardcoded in the
XSLT in mind, I thought we should set those 'host' and
'project-location' links like this:
http://{host}/{project-location}/foobaruri
- host address links created in the XSLT that makes up the skin,
perhaps also configured from the outside with the aforementioned sitemap
parameters being fed into the XSLT as XSLT <param>s and some CLI
inputmodule setting those sitemap parameters.
- project location can be inherited from the forrestbot.conf.xml (what
about sites generated using "./build.{sh|bat} docs" then, without the
forrestbot) and primarily used in links created in the menu and tabs
pipeline - also fed in using the same mechanism or a dash of Ant
filtering ;-)
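To make the idea concrete, a small sketch of the skin side, assuming both values are passed in as XSLT parameters. The parameter names, their default values, and the <tab label="..." href="..."/> input shape are assumptions for illustration, not what the current skins do.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameters, expected to be set from the sitemap -->
  <xsl:param name="host" select="'localhost'"/>
  <xsl:param name="project-location" select="'myproject'"/>

  <!-- Build http://{host}/{project-location}/foobaruri for each tab -->
  <xsl:template match="tab">
    <a href="{concat('http://', $host, '/', $project-location, '/', @href)}">
      <xsl:value-of select="@label"/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

This keeps the prefixes out of the stylesheet body, so a different host or project location only means different parameter values.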
4) 'static', pregenerated resources, like downloads, Javadoc et al.
For outerthought.net, I have some PDFs and binary downloads, and also
the XMLSpy generated Schema documentation, basically a bunch of
resources that should not pass the Forrest pipeline.
Putting them in place on the server can be done from outside the
forrestbot/forrestprocess, or we could have those handled with readers
and store them inside the {project:}src/documentation tree, preferably
in the src/documentation/resources/ directory.
But we cannot enforce this, nor do I want to check in the 100+ files of
generated XMLSpy docs into my outerthought.net CVS - I'm happy to manage
those directly as files on my webserver. If I want to link to them
however, the build fails since it cannot process the link to those files
(<link href="sitemap.html">). What can we do about this...?
- having a pipeline set up for static resources, i.e.
<map:match pattern="static/**.extension">
in the Forrest sitemap - which means we will have to manage some list of
reader matchers for each and every mimetype (something which should
already be taken care of by the webserver).
- make the CLI fail gracefully for unresolvable links
- have some special link (or attribute set for the <link> element)
indicating to the CLI that it shouldn't try to traverse that link, even
though there is a pipeline set up for it
For people who want to import their static resources into CVS, we could
make sure they are moved over to the published site if they are not
processed by the crawler - perhaps configuring some copy-over task in
forrestbot.conf.xml
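To illustrate the first option, a sketch of what such reader matchers could look like. The resources/static/ source path and the two extensions are only examples; the components section is omitted.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <!-- One reader matcher per MIME type: this is the list we would
           have to maintain ourselves -->
      <map:match pattern="static/**.pdf">
        <map:read src="resources/static/{1}.pdf" mime-type="application/pdf"/>
      </map:match>
      <map:match pattern="static/**.zip">
        <map:read src="resources/static/{1}.zip" mime-type="application/zip"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Every additional file type means another matcher, which is exactly the maintenance burden mentioned above.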
5) pipelines
(maybe we are lucky and Sylvain comes up with that DocTypeMatcher right
away, but I don't think so ;-)
in general, we have externally accessible pipelines set up for:
- "" (entry page)
- apachestats (not used currently)
- *.html
- **/*.html
- *.pdf
- **/*.pdf
- libre (testing purposes)
and a whole lot of skin- and project-related matchers for css, js, and
images
the other ones are internal to Forrest (and should be set so IMO):
- **tab-**.xml
- **book-**/*.xml
- **book-**.xml
- body-todo.xml
- body-changes.xml
- body-faq.xml
- body-community/*/index.xml
- body-community**revision-*.xml
- body-community/*/*/**.xml
- revisions-community/*/*/**
- doclist/content/xdocs/**book.xml
- body-doclist.xml
- body-**.dtdx.xml
- body-**.xml
body-todo/changes/faq could possibly be handled by the virtual
DocTypeMatcher. For the community/revision stuff, I'm not so sure we should
keep it as-is (or isolate it in its own sitemap). The nekodtd matcher is
a Forrest specialty, and the doclist is perhaps a candidate for
replacement by libre.
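As for marking the internal matchers as internal, a sketch using the internal-only attribute on a pipeline, so that they can only be reached through cocoon:/ URIs and not from the outside. The content/xdocs/ source path is an assumption, and the real body-**.xml pipeline does more than this.

```xml
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>

    <!-- Reachable only via cocoon:/ requests from other pipelines -->
    <map:pipeline internal-only="true">
      <map:match pattern="body-**.xml">
        <!-- hypothetical source location -->
        <map:generate src="content/xdocs/{1}.xml"/>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>

    <!-- Externally accessible matchers (*.html, *.pdf, ...) stay in a
         normal pipeline -->
    <map:pipeline>
      <map:match pattern="*.html">
        <map:aggregate element="site">
          <map:part src="cocoon:/book-{1}.xml"/>
          <map:part src="cocoon:/tab-{1}.xml"/>
          <map:part src="cocoon:/body-{1}.xml" label="content"/>
        </map:aggregate>
        <map:call resource="skinit">
          <map:parameter name="type" value="site2xhtml"/>
        </map:call>
      </map:match>
    </map:pipeline>

  </map:pipelines>
</map:sitemap>
```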
Anyway, we don't like too lengthy mails, and my baby daughter wants some
attention. So here it is, please comment and discuss until these issues
are solved. We need a todo for next week ;-)
</Steven>
--
Steven Noels http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
stevenn@outerthought.org stevenn@apache.org