Posted to dev@cocoon.apache.org by Upayavira <uv...@upaya.co.uk> on 2004/07/08 23:00:33 UTC

[RT] Document based I18n sites with Cocoon

I'm building a 'document centric' internationalised site which, to my 
mind, Cocoon can't 'quite' do yet. Cocoon's i18n functionality works 
well on 'webapps', where you have snippets of text to be translated, but 
not when the content is whole pages.

The most complex part of this is identifying the most appropriate 
language content, given the combination of the user's desired 
locales/languages, and the available translations.

The site will accept a locale supplied as a request parameter, as one 
of the acceptable locales configured within the browser, or as a site 
default.

When a page is requested, Cocoon will look for a page with the preferred 
locale (the request parameter, if provided). If that is not found, it 
will look for a page using each of the remaining locales in turn. If 
none are found, the default page is used.

So, say we have three locales to try: pt, es, en. We have resources:
content/pl/foo.xml
content/es/foo.xml
content/en/foo.xml
When the user requests foo.html, Cocoon will look to see if 
content/pt/foo.xml exists. It doesn't, so it will look for 
content/es/foo.xml. That it finds, so that is what it uses as the 
source for the pipeline.

Similarly, this system would be able to handle a file structure such as:
content/foo_pl.xml
content/foo_es.xml
content/foo_en.xml
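
The lookup described above can be sketched in Java as follows. This is 
only an illustration under stated assumptions, not existing Cocoon 
code: the class LocaleFallback, the method firstExisting and the 
"{locale}" placeholder are hypothetical names, and the file system is 
replaced by an in-memory set of paths.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical helper: picks the first localised resource that exists. */
final class LocaleFallback {

    /**
     * Tries each locale in order, substituting it for "{locale}" in the
     * template, and returns the first path accepted by the existence check.
     */
    static Optional<String> firstExisting(String template,
                                          List<String> locales,
                                          Predicate<String> exists) {
        for (String locale : locales) {
            String candidate = template.replace("{locale}", locale);
            if (exists.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        // Nothing matched: the caller falls back to the default page.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the file system: the resources from both layouts above.
        Set<String> available = Set.of(
                "content/pl/foo.xml", "content/es/foo.xml", "content/en/foo.xml",
                "content/foo_pl.xml", "content/foo_es.xml", "content/foo_en.xml");
        List<String> locales = List.of("pt", "es", "en");

        // Directory-per-locale layout.
        System.out.println(firstExisting(
                "content/{locale}/foo.xml", locales, available::contains));
        // Locale-suffix layout.
        System.out.println(firstExisting(
                "content/foo_{locale}.xml", locales, available::contains));
    }
}
```

Passing the existence check in as a predicate keeps the sketch 
independent of how sources are actually resolved. A real component 
would presumably ask Cocoon's source resolver instead of a set.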

Now, handling this functionality within a Cocoon component really isn't 
that easy to work out. To achieve it, the component needs to take a 
configurable path, e.g. content/{locale}/{1}.xml, and needs to be told 
what to use for finding the locale (request parameter, Accept-Language 
header, default locale for the site). Once it has made a decision, it 
might also want to make its choice of locale available to other 
components (e.g. the i18nTransformer) so that they can localise any 
other bits of text on the page, e.g. navigation.
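
One way to build the ordered list of locales to try from those three 
inputs is sketched below. The names LocaleCandidates, 
parseAcceptLanguage and candidates are hypothetical, and the 
Accept-Language parsing is deliberately simplified: it honours q-values 
but does not, for instance, also try "pt" when the browser asks for 
"pt-BR".

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: builds the ordered list of locales to try. */
final class LocaleCandidates {

    /** Parses an Accept-Language value into tags, highest q-value first. */
    static List<String> parseAcceptLanguage(String header) {
        List<String> tags = new ArrayList<>();
        List<Double> weights = new ArrayList<>();
        if (header == null) {
            return tags;
        }
        for (String part : header.split(",")) {
            String[] pieces = part.trim().split(";");
            String tag = pieces[0].trim();
            if (tag.isEmpty() || tag.equals("*")) {
                continue; // the wildcard names no concrete locale
            }
            double q = 1.0; // default weight when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String p = pieces[i].trim();
                if (p.startsWith("q=")) {
                    try {
                        q = Double.parseDouble(p.substring(2));
                    } catch (NumberFormatException e) {
                        q = 0.0; // treat a malformed weight as unacceptable
                    }
                }
            }
            if (q <= 0.0) {
                continue; // q=0 means "not acceptable"
            }
            // Insert by descending weight; equal weights keep header order.
            int pos = tags.size();
            while (pos > 0 && weights.get(pos - 1) < q) {
                pos--;
            }
            tags.add(pos, tag);
            weights.add(pos, q);
        }
        return tags;
    }

    /** Request parameter first, then browser preferences, then site default. */
    static List<String> candidates(String requestParam,
                                   String acceptLanguage,
                                   String siteDefault) {
        Set<String> ordered = new LinkedHashSet<>(); // drops duplicates, keeps order
        if (requestParam != null && !requestParam.isEmpty()) {
            ordered.add(requestParam);
        }
        ordered.addAll(parseAcceptLanguage(acceptLanguage));
        ordered.add(siteDefault);
        return new ArrayList<>(ordered);
    }
}
```

For example, candidates("pt", "es;q=0.8, en;q=0.5", "en") yields the 
list pt, es, en used in the example earlier in this mail.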

I have mulled over whether an input module, a generator or maybe an 
action would do the job. In fact, I think it is a job for an I18n matcher.

Introducing the I18NMatcher
---------------------------
Here's a sample sitemap snippet:

<map:match pattern="**.html">
  <map:match type="i18n" src="content/*/{1}.xml">
    <map:generate src="{source}"/>
    <map:transform src="foo.xsl"/>
    <map:transform type="i18n">
      <map:parameter name="locale" value="{locale}"/>
    </map:transform>
    <map:serialize type="html"/>
  </map:match>
</map:match>

Once an ordinary wildcard matcher has done its job, in comes the i18n 
matcher. Its job is to see whether it can find a suitable source 
document for the requested page. The * marks the place where the locale 
is to be inserted. If a match is successful, it will make sitemap 
variables available for the source that was found ({source}) and the 
locale that matched ({locale}).
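
The matching step itself might look roughly like the standalone Java 
sketch below. It does not implement Cocoon's Matcher interface; the 
class I18nMatchSketch and its match method are hypothetical. It also 
assumes that {1} has already been substituted by the time the pattern 
arrives, and that returning null stands for "no match".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Standalone sketch of the proposed i18n matcher (hypothetical names). */
final class I18nMatchSketch {

    /**
     * @param pattern source pattern with '*' marking the locale slot,
     *                e.g. "content/&#42;/foo.xml" once {1} has been resolved
     * @param locales locales to try, most preferred first
     * @param exists  check for whether a candidate source exists
     * @return the variables "source" and "locale", or null if nothing matched
     */
    static Map<String, String> match(String pattern,
                                     List<String> locales,
                                     Predicate<String> exists) {
        int star = pattern.indexOf('*');
        if (star < 0) {
            return null; // no locale slot in the pattern
        }
        String prefix = pattern.substring(0, star);
        String suffix = pattern.substring(star + 1);
        for (String locale : locales) {
            String source = prefix + locale + suffix;
            if (exists.test(source)) {
                Map<String, String> vars = new LinkedHashMap<>();
                vars.put("source", source); // would back {source} in the sitemap
                vars.put("locale", locale); // would back {locale} in the sitemap
                return vars;
            }
        }
        return null; // no translation found: the match fails
    }
}
```

With the resources from the earlier example and the locales pt, es, en, 
the pattern content/*/foo.xml would give source = content/es/foo.xml and 
locale = es.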

Now, this seems to be quite in keeping with the Cocoon sitemap model, 
and gives some rather nice, flexible functionality.

What do you all think?

When I've finished implementing this, I'll go on to extend the CLI to be 
able to work effectively with this kind of i18n site, enabling it to 
crawl a site for each of a range of locales. But that's for another time.
   
Regards, Upayavira



Re: [RT] Document based I18n sites with Cocoon

Posted by Juan Jose Pablos <ch...@che-che.com>.
Upayavira wrote:
> 
> What do you all think?
> 
I am not sure if this is right, but I like it. I am happy to help you so 
that it can be integrated within Forrest.

> When I've finished implementing this, I'll go on to extend the CLI to be 
> able to work effectively with this kind of i18n site, enabling it to 
> crawl a site for each of a range of locales. But that's for another time.
> 
> Regards, Upayavira
> 
same as above