Posted to user@lenya.apache.org by dr...@sdf-eu.org on 2006/09/20 18:26:39 UTC
Virtual index page generation based on xpathdirectorygenerator results
Hi,
We have about 750 items in our core collection. Each item has a unique
catalogue code which begins with two or three capital letters, followed by
a 1 to 3 digit number, optionally followed by one or more dot-, dash- or
underscore-separated subcodes (e.g., ALR001, BB050_1, PQ42-2).
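To make the code format concrete, here is a minimal Python sketch of a parser for it. The function name parse_code is hypothetical, and the sketch assumes that subcodes are alphanumeric, which the description above does not actually state.

```python
import re

# Two or three capital letters, a 1 to 3 digit number, then zero or more
# subcodes, each introduced by a dot, dash or underscore.
# Assumption: subcodes consist of letters and digits only.
CODE_RE = re.compile(r"^([A-Z]{2,3})(\d{1,3})((?:[._-][A-Za-z0-9]+)*)$")

def parse_code(code):
    """Split a catalogue code into (prefix, number, subcodes), or return None."""
    m = CODE_RE.match(code)
    if m is None:
        return None
    prefix, number, tail = m.groups()
    # The tail looks like "_1" or "-2.3"; drop the empty piece before the
    # first separator.
    subcodes = [s for s in re.split(r"[._-]", tail) if s]
    return prefix, number, subcodes
```

For example, parse_code("BB050_1") yields the prefix "BB", the number "050" and a single subcode "1", while a string that does not fit the pattern yields None.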
Each item has an xml file based on a custom doctype (which includes the Lenya
metadata). All the items have been imported under the
pub/content/authoring/collections directory in standard Lenya fashion (i.e.,
the collections directory has 750 sub-directories named after the items'
catalogue code and each of those directories contains six index_nn.xml files
for the six languages we need).
BXE has been configured to work with this and works (including asset /
image management). So far so good.
We have created an index page which:
1) Using xpathdirectorygenerator, extracts the alphabetical prefix of each code
(i.e., ALR, BB, PQ etc.).
2) Uses i18n translation as a hack to look up the code's meaning (and as
a bonus returns it in the correct language ;-)
3) Uses XSLT to generate a unique list of letter codes (i.e., ALR001,
ALR002 etc. become ALR in the index)
4) Generates links with the code meaning as link text to the virtual pages
for each letter code (i.e., to ALR.html, BB.html which will index all
the ALRnnn and BBnnn codes respectively etc.)
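Outside Cocoon, steps 2 and 4 amount to something like the Python sketch below. It is only an illustration: index_links is a hypothetical name, and the meanings dictionary stands in for the i18n catalogue of the current language (the real lookup is done by the i18n transformer, not by code like this).

```python
def index_links(prefixes, meanings):
    """Return (href, link text) pairs for the virtual index pages.

    prefixes: unique letter codes, e.g. ["ALR", "BB"]
    meanings: hypothetical stand-in for the i18n catalogue, mapping a
              letter code to its meaning in the current language
    """
    links = []
    for prefix in prefixes:
        # Fall back to the bare code when no translation is available.
        text = meanings.get(prefix, prefix)
        links.append((prefix + ".html", text))
    return links
```

A prefix with no entry in the catalogue simply gets its own code as link text, so a missing translation still produces a usable link.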
Issues so far were:
1) Initially we tried recursive XSLT to generate the unique letter code
list, but a java.lang.StackOverflowError occurred at a recursion depth of
572 (even when just making the recursive calls and nothing else!). In
the end we pre-sorted the codes and had the XSLT check whether the code
for the previous node differs from that of the current node and, if so,
emit the current code.
2) Because the codes are not known in advance (and we didn't want 92
new pipelines - i.e., one for each code prefix) we need(ed) some way to
structure the URI-space to avoid collisions between items and the virtual
indexes. Luckily Lenya didn't seem to complain about the doctype in the
pipeline match.
3) We hit the broken links problem mentioned elsewhere here which we've
fixed by adjusting the publication-sitemap even though it's not entirely
clear to us exactly how the cocoon://navigation calls work. Getting
breadcrumbs to work properly is still proving problematic.
4) Firefox takes an age to open up or move around in siteview.
5) Lenya seems to check whether a document really exists too early, so
we've had to create dummy documents for the index pages, which is messy
and further complicates the matching process.
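For reference, the workaround for issue 1 (sort first, then compare each entry with its predecessor) looks roughly like this in Python. This is a sketch of the idea rather than our actual XSLT; unique_prefixes is a hypothetical name, and it assumes every code starts with capital letters as described above.

```python
import re

def unique_prefixes(codes):
    """Return the distinct letter prefixes of the codes, in sorted order."""
    # Assumption: every code is well formed and starts with capital letters.
    prefixes = sorted(re.match(r"[A-Z]+", code).group(0) for code in codes)
    result = []
    previous = None
    for prefix in prefixes:
        # After sorting, equal prefixes are adjacent, so comparing with the
        # previous entry is enough; no recursion is involved.
        if prefix != previous:
            result.append(prefix)
        previous = prefix
    return result
```

The loop uses constant stack space however many codes there are, which is why this approach avoids the overflow we saw with the recursive template.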
Questions:
Is there a better approach to this, i.e., one that is cleaner and more
scalable if we increased the item count to, say, 1 million?
What's the best way of adjusting Lenya's error checking code so we don't
need dummy documents? Obviously we don't want to break Lenya handling
XHTML pages. Do we need some kind of flag in the URL to indicate
that a page is virtual? Could we use usecases? What doctype does a virtual
page have and how could it interact with doctype.xmap?
Is there any documentation on how to go about creating a good URI-space,
bearing in mind that we'll be serving multi-lingual documents in a variety
of output formats?
Thanks,
--
drseuk@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdf-eu.org