Posted to dev@forrest.apache.org by Sylvain Wallez <sy...@anyware-tech.com> on 2002/06/14 10:35:37 UTC

Multilingual sites

Hi forresters,

I'm currently considering the use of Forrest to rebuild the static part 
of my company's website. However, our site needs to be multilingual (at 
least French and English), and I'd like to share some thoughts and get 
your opinion.

An important hypothesis is that the site will provide the same content 
for all languages. So my idea is to have each xdoc file contain the same 
content in the different languages in order to both ease translation 
work and ensure structural consistency.

Here's an example :
<document>
  <section>
    <title xml:lang="en">A multilingual document</title>
    <title xml:lang="fr">Un site multilingue</title>
    <p xml:lang="en">Here's a paragraph</p>
    <p xml:lang="fr">Voici un paragraphe</p>
  </section>
</document>

Note that the above would not have been possible with the section title 
as an attribute (referring to the recent vote on this).

Building the site for a particular language will then be a simple matter 
of filtering elements belonging to that language before formatting by a 
language-agnostic skin.
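
As a rough illustration of that filtering step, here is a minimal Python
sketch. It is not Forrest or Cocoon code: the function name filter_language
is made up for this example, and it assumes that elements without an
xml:lang attribute are shared by all languages.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def filter_language(element, lang):
    """Remove, in place, every descendant whose xml:lang differs from lang.

    Assumption: elements that carry no xml:lang belong to every language.
    """
    for child in list(element):  # iterate over a copy, we remove while looping
        child_lang = child.get(XML_LANG)
        if child_lang is not None and child_lang != lang:
            element.remove(child)
        else:
            filter_language(child, lang)
    return element


SOURCE = """<document>
  <section>
    <title xml:lang="en">A multilingual document</title>
    <title xml:lang="fr">Un site multilingue</title>
    <p xml:lang="en">Here's a paragraph</p>
    <p xml:lang="fr">Voici un paragraphe</p>
  </section>
</document>"""

french = filter_language(ET.fromstring(SOURCE), "fr")
titles = [title.text for title in french.iter("title")]
```

After the call, only the French title and paragraph remain, and the result
can be handed to a skin that knows nothing about languages.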

Any comments and criticisms are welcome.

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org



Re: Multilingual sites

Posted by Stefano Mazzocchi <st...@apache.org>.
Sylvain Wallez wrote:

> > Please inform us about your experiences!
> 
> Sure ! I may even come with a solution that handles both approaches.

I think you'll hit two big walls with the single document approach:

 1) concern overlap: you are forcing more than one concern to overlap
in a single document. Saying that the person who writes them is the same
person will only lead to human scalability troubles later on.

 2) validation: if you use xml:lang as a choice label, the documents
can't be validated anymore with the same DTDs. So, either you lose
validation of your content (which is another weakness), or you write
your own DTDs and keep them in sync with ours (another big hassle).

Keeping different languages in different documents will automatically
solve both, without infecting your URI space and with minimal impact on
the work of your writers, even if there is only one.
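
The validation point can be illustrated with a small Python sketch. It is
not a DTD validator: it only checks one rule, assumed here for illustration,
that each section carries exactly one title. The function name and the two
sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def sections_breaking_title_rule(root):
    """Return the sections that do not have exactly one direct <title> child."""
    return [
        section
        for section in root.iter("section")
        if len(section.findall("title")) != 1
    ]


# Single-document approach: one section, two titles labelled by xml:lang.
SINGLE_DOCUMENT = """<document>
  <section>
    <title xml:lang="en">A multilingual document</title>
    <title xml:lang="fr">Un site multilingue</title>
  </section>
</document>"""

# One document per language: one section, one title.
FRENCH_ONLY = """<document xml:lang="fr">
  <section>
    <title>Un site multilingue</title>
  </section>
</document>"""

invalid_in_single = len(sections_breaking_title_rule(ET.fromstring(SINGLE_DOCUMENT)))
invalid_in_french = len(sections_breaking_title_rule(ET.fromstring(FRENCH_ONLY)))
```

The single multilingual document breaks the rule, while the per-language
document passes unchanged.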

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



Re: Cocoon docs transition trial

Posted by David Crossley <cr...@indexgeo.com.au>.
Diana Shannon wrote:
> I just wanted to let you know -- in case we're about to dup efforts -- 
> that I'm doing a trial run of a Cocoon xdoc -> forrest conversion.

Good on you. I think that this really is a one person effort,
up to a certain stage. Then you can get it into cocoon CVS and
we can all dive in to help. We can cope with improperly rendered
docs in 2.1 for a while during the transition, then move it over
to 2.0.3

During your initial stage, if we can assist then just ask.
--David



Re: Cocoon docs transition trial

Posted by Steven Noels <st...@outerthought.org>.
Diana Shannon wrote:

 > Please understand that I'm *not* trying to rush the effort of anyone
 > else, or bypass any collaborative, group process.

Of course not!

Good luck for the next few days, we'll be more than happy to hear about 
your adventures on Monday. I have no time currently to help you out if 
you get stuck, at least until the end of next week, but urgent issues 
can of course always get solved ;-)

</Steven>


Cocoon docs transition trial

Posted by Diana Shannon <sh...@apache.org>.
I just wanted to let you know -- in case we're about to dup efforts -- 
that I'm doing a trial run of a Cocoon xdoc -> forrest conversion. I'm a 
little concerned about meeting the June 30 planning deadline (related to 
a 2.0.3 release), so I'm doing this primarily for my own benefit. Please 
understand that I'm *not* trying to rush the effort of anyone else, or 
bypass any collaborative, group process.  If anyone is doing the same, 
and you'd like to compare notes/strategies, let me know. I'll report 
here on Monday as to how it went as well as the steps I followed. I'll 
also try to have a static docs build of the result up on my web site as 
well.

So far, so good.

Diana


Re: Multilingual sites

Posted by Sylvain Wallez <sy...@anyware-tech.com>.
Steven Noels wrote:

> Sylvain Wallez wrote:
>
> > Diana Shannon wrote:
> >
> >> I would keep separate docs for separate languages, especially if I
> >> had to deal with more than two languages (i.e., more than what
> >> Sylvain is proposing).
>
> strong +1


<snip/>

> > Interesting point. My proposal requires that all translations are in
> >  sync before publishing, which is a self-imposed requirement for the
> > case I'm considering.
>
> and it will be difficult to keep up (says a guy who has been managing a
> typical Belgian multi-lingual (dutch/french/english) website project
> once comprising of 1500 documents * 3 different languages) - try to
> avoid translation interdependency as much as possible.


Wow, that's another level than what I'm currently planning...

> <snip/>
>
> > I agree that languages differ in the way they express things.
> > However, a paragraph is a structural unit that identifies the
> > different "messages" that are to be delivered by the document. So I
> > have the impression that a document delivering the same content will
> > have the same paragraph structure. I'm also sure that sections will
> > be the same.
>
> For sections, assumably yes. For paragraphs, I *tend* to disagree, and
> translators *will* disagree ;-) 


Well, in my case, the writer for both french and english versions is 
likely to be the same person.

> > Thank you all for your answers. I'll make some experiments with a
> > single document approach as it seems best suited to my current
> > problem and report back to the list.
>
> It seems we cannot withhold you from going the single-document road ;-) 


It's called trial-and-error development cycle ;)

But I'd like to give it a try as the site will be only a few dozen files 
and I will prototype this on a few of them. Yes, it may grow (meaning the 
company also grows !), but when that time comes splitting the files will 
be easy.

> Please inform us about your experiences!


Sure ! I may even come up with a solution that handles both approaches.

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org




Re: Multilingual sites

Posted by Steven Noels <st...@outerthought.org>.
Sylvain Wallez wrote:

 > Diana Shannon wrote:
 >
 >> I would keep separate docs for separate languages, especially if I
 >> had to deal with more than two languages (i.e., more than what
 >> Sylvain is proposing).

strong +1

 >> - I'd worry about multiple translators dealing with the same file.
 >>
indeed, for this and other issues

<snip/>

 > That's also something I thought about : splitting a single file into
 >  several language-specific files is fairly easy, but merging them
 > back again may require some ID attributes as suggested by Michael.
 >
 >> - In my experience, some language versions typically lag others.
 >> I'd worry about syncing translation efforts if multiple languages
 >> (and thus versions) are stored in the same file.
 >
 >
 >
 > Interesting point. My proposal requires that all translations are in
 >  sync before publishing, which is a self-imposed requirement for the
 > case I'm considering.

and it will be difficult to keep up (says a guy who has been managing a
typical Belgian multilingual (Dutch/French/English) website project
that once comprised 1500 documents * 3 different languages) - try to
avoid translation interdependency as much as possible.

<snip/>

 > I agree that languages differ in the way they express things.
 > However, a paragraph is a structural unit that identifies the
 > different "messages" that are to be delivered by the document. So I
 > have the impression that a document delivering the same content will
 > have the same paragraph structure. I'm also sure that sections will
 > be the same.

For sections, presumably yes. For paragraphs, I *tend* to disagree, and
translators *will* disagree ;-)

 > Thank you all for your answers. I'll make some experiments with a
 > single document approach as it seems best suited to my current
 > problem and report back to the list.

It seems we cannot hold you back from going the single-document road ;-)
Please inform us about your experiences!

 > However, don't expect this soon as I'm in the early stages of this
 > stuff.
 >
 > Thanks, Sylvain

Thank *you* for checking us out :-)

</Steven>



Re: Multilingual sites

Posted by Sylvain Wallez <sy...@anyware-tech.com>.
Diana Shannon wrote:

> I would keep separate docs for separate languages, especially if I had 
> to deal with more than two languages (i.e., more than what Sylvain is 
> proposing).
> - I'd worry about multiple translators dealing with the same file. 
> It's a smallish SoC issue, but I think it's still a factor. If a 
> particular translator needed to work on a special version which showed 
> two languages at a time, then I would generate a set of files for that 
> need and then transform the result back to a single language per file 
> format. 


That's also something I thought about : splitting a single file into 
several language-specific files is fairly easy, but merging them back 
again may require some ID attributes as suggested by Michael.
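
To show why the splitting direction is the easy one, here is a minimal
Python sketch. It is an illustration only: the function names are invented,
elements without xml:lang are assumed to be common to all languages, and
the choice to move the language label up to the root of each output
document is mine.

```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def prune_other_languages(element, lang):
    """Drop children labelled with another language, unlabel the ones kept."""
    for child in list(element):
        if child.get(XML_LANG) not in (None, lang):
            element.remove(child)
        else:
            child.attrib.pop(XML_LANG, None)
            prune_other_languages(child, lang)


def split_by_language(root):
    """Return {language: copy of the document holding only that language}."""
    languages = sorted({e.get(XML_LANG) for e in root.iter() if e.get(XML_LANG)})
    result = {}
    for lang in languages:
        tree = copy.deepcopy(root)  # leave the source document untouched
        prune_other_languages(tree, lang)
        tree.set(XML_LANG, lang)  # label the whole output document once
        result[lang] = tree
    return result


SOURCE = """<document>
  <section>
    <title xml:lang="en">A multilingual document</title>
    <title xml:lang="fr">Un site multilingue</title>
    <p xml:lang="en">Here's a paragraph</p>
    <p xml:lang="fr">Voici un paragraphe</p>
  </section>
</document>"""

docs = split_by_language(ET.fromstring(SOURCE))
```

Going the other way is harder because nothing in the split files says which
French paragraph corresponds to which English one, hence the ID attributes.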

> - In my experience, some language versions typically lag others. I'd 
> worry about syncing translation efforts if multiple languages (and 
> thus versions) are stored in the same file.


Interesting point. My proposal requires that all translations are in 
sync before publishing, which is a self-imposed requirement for the case 
I'm considering.

> - I agree with what has been said about *not* using i18n transformer 
> approach for everything. I find the dictionary-oriented approach 
> creates an additional burden for translators. I also find it hard to 
> keep one language (used as the key) ahead of all others. This *never* 
> happens in my work. 


I think we all (seems even Konstantin) agree on this.

> - I don't like the assumption that every language version has a 
> one-to-one mapping of expression to all other languages as implied by 
> a single doc approach. Perhaps this is heretical, but I like to give 
> local translators/interpreters a bit more flexibility, especially when 
> I'm still in an iterative product development cycle.


I agree that languages differ in the way they express things. However, a 
paragraph is a structural unit that identifies the different "messages" 
that are to be delivered by the document. So I have the impression that 
a document delivering the same content will have the same paragraph 
structure. I'm also sure that sections will be the same.


Thank you all for your answers. I'll make some experiments with a single 
document approach as it seems best suited to my current problem and 
report back to the list. However, don't expect this soon as I'm in the 
early stages of this stuff.

Thanks,
Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org




Re: Multilingual sites

Posted by Diana Shannon <sh...@apache.org>.
I would keep separate docs for separate languages, especially if I had 
to deal with more than two languages (i.e., more than what Sylvain is 
proposing).
- I'd worry about multiple translators dealing with the same file. It's 
a smallish SoC issue, but I think it's still a factor. If a particular 
translator needed to work on a special version which showed two 
languages at a time, then I would generate a set of files for that need 
and then transform the result back to a single language per file format.
- In my experience, some language versions typically lag others. I'd 
worry about syncing translation efforts if multiple languages (and thus 
versions) are stored in the same file.
- I agree with what has been said about *not* using i18n transformer 
approach for everything. I find the dictionary-oriented approach creates 
an additional burden for translators. I also find it hard to keep one 
language (used as the key) ahead of all others. This *never* happens in 
my work.
- I don't like the assumption that every language version has a 
one-to-one mapping of expression to all other languages as implied by a 
single doc approach. Perhaps this is heretical, but I like to give local 
translators/interpreters a bit more flexibility, especially when I'm 
still in an iterative product development cycle.

Diana


Re: Multilingual sites

Posted by Michael Wechner <mi...@wyona.org>.
Hi

I don't know if I count as a forrester, but here is my opinion anyway:-):

I would keep the various language versions as separate documents, i.e.

doc_en.xml:

<document xml:lang="en">
   <section>
     <title>A multilingual document</title>
     <p>Here's a paragraph</p>
   </section>
</document>

doc_fr.xml:

<document xml:lang="fr">
   <section>
     <title>Un site multilingue</title>
     <p>Voici un paragraphe</p>
   </section>
</document>

and then relate them to each other by another document (like an envelope).

doc.xml:

<NewsComponent EquivalentsList="yes">
   <ContentItem href="doc_en.xml"/>
   <ContentItem href="doc_fr.xml"/>
</NewsComponent>

For this you might check out NewsML: http://www.newsml.org

To ensure consistency which can be validated by a machine you might have 
to introduce "ids" for each element!
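
A machine check of that kind could look like the following Python sketch.
Everything in it is hypothetical: the id values, the second English
paragraph and the function names are invented for the example, and the
check only compares which ids exist and which element each id sits on.

```python
import xml.etree.ElementTree as ET


def ids_by_element(root):
    """Map every id attribute in the document to the name of its element."""
    return {e.get("id"): e.tag for e in root.iter() if e.get("id") is not None}


def compare_structure(reference, translation):
    """Return (ids missing from the translation, ids on a different element)."""
    ref_ids = ids_by_element(reference)
    other_ids = ids_by_element(translation)
    missing = sorted(set(ref_ids) - set(other_ids))
    mismatched = sorted(
        i for i in ref_ids if i in other_ids and ref_ids[i] != other_ids[i]
    )
    return missing, mismatched


DOC_EN = """<document xml:lang="en">
  <section id="s1">
    <title id="s1-title">A multilingual document</title>
    <p id="s1-p1">Here's a paragraph</p>
    <p id="s1-p2">A second paragraph</p>
  </section>
</document>"""

DOC_FR = """<document xml:lang="fr">
  <section id="s1">
    <title id="s1-title">Un site multilingue</title>
    <p id="s1-p1">Voici un paragraphe</p>
  </section>
</document>"""

missing, mismatched = compare_structure(ET.fromstring(DOC_EN), ET.fromstring(DOC_FR))
```

Here the French version is reported as lagging behind by one paragraph
(s1-p2), which is the kind of drift the ids are meant to expose.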

BTW: I never really followed the discussion on having the title as an 
attribute, but I also think that it would be better to use it as an 
element, because maybe you want to tag a person, organization or a quote
within the title!

All the best

Michael






Sylvain Wallez wrote:

> Hi forresters,
> 
> I'm currently considering the use of Forrest to rebuild the static part 
> of my company's website. However, our site needs to be multilingual (at 
> least french and english), and I'd like to share some thoughts and have 
> your opinion.
> 
> An important hypothesis is that the site will provide the same content 
> for all languages. So my idea is to have each xdoc file contain the same 
> content in the different languages in order to both ease translation 
> work and ensure structural consistency.
> 
> Here's an example :
> <document>
>  <section>
>    <title xml:lang="en">A multilingual document</title>
>    <title xml:lang="fr">Un site multilingue</title>
>    <p xml:lang="en">Here's a paragraph</p>
>    <p xml:lang="fr">Voici un paragraphe</p>
>  </section>
> </document>
> 
> Note that the above couldn't have been possible with the section title 
> as an attribute (referring to the recent vote on this).
> 
> Building the site for a particular language will then be a simple matter 
> of filtering elements belonging to that language before formatting by a 
> language-agnostic skin.
> 
> Any comments and criticisms are welcome.
> 
> Sylvain
> 



Re: Multilingual sites

Posted by Sylvain Wallez <sy...@anyware-tech.com>.
Stefano Mazzocchi wrote:

>Nicola Ken Barozzi wrote:
>  
>
>>Konstantin Piroumian wrote:
>> >
>>    
>>
>>>One small problem with the below snippet: how would you validate your
>>>content against the DTD that permits only one title element? I don't know
>>>how XML specification interpret elements with xml:lang attribute.
>>>
>>>      
>>>
>>>>>From: "Sylvain Wallez" <sy...@anyware-tech.com>
>>>>>          
>>>>>
>>>>>>Here's an example :
>>>>>><document>
>>>>>><section>
>>>>>>  <title xml:lang="en">A multilingual document</title>
>>>>>>  <title xml:lang="fr">Un site multilingue</title>
>>>>>>  <p xml:lang="en">Here's a paragraph</p>
>>>>>>  <p xml:lang="fr">Voici un paragraphe</p>
>>>>>></section>
>>>>>></document>
>>>>>>
>>>>>>            
>>>>>>
>>Dang! Houston, we have a problem.
>>    
>>
>
>Yep, a semantic one.
>
>With the above document snippet, Sylvain tried to use xml:lang as an RDF
>attribute, but this is conceptually wrong because it collides with the
>intrinsic XML structure.
>
>The xml:lang attribute (which might admittedly be totally useless IRL)
>was added to allow you to have an xml-specific way to associate some
>text with some natural language.
>
>Why? I don't know. It seems like a useless thing to me, just like
>ids/idrefs and Processing instructions... how different would XML be if
>they started with namespaces instead than with SGML compatibility!
>
>Anyway, we corrently have it and I made the Document DTD aware of
>xml:lang *just in case*... but this attribute is a label, not a choice.
>This is intended to be used in multilingual *pages*, not sites.
>  
>

But isn't a multilingual site composed of multilingual pages (or more 
precisely documents) ? Or do we have a misunderstanding on "multilingual 
page" : is it a page with the same content in different languages, or a 
page containing some texts in different languages that should _all_ be 
displayed to the reader ?

In the latter case, I agree that xml:lang is just a label and in fact 
isn't suitable for my problem. I also agree that with the current DTD, 
there can be one and only one title, meaning two titles with a different 
xml:lang make the document invalid.

Mmmh... I have to think more about all this, because from a translator's 
point of view, having all languages in the same document seems more 
practical (for my particular case, translators will be some of my 
colleagues and me).

>*this* is the difference and using one for the other is a big design
>bug.
>
>So, you have a page like this
>
> <document>
>  <section xml:lang="en">
>   <title xml:lang="it">Ciao Mamma!</title>
>   <p>This is an english paragraph</p>
>   <p xml:lang="fr">Et ceci c'est a paragraphe en francais</p>
>  </section>
> </document>
>
>which also makes a great argument for having <title> as an element: you
>would not be able to specify which language you are using if we were
>using an attribute!
>

Sure.

>So, translated content must reside in different pages and the system
>must access it transparently.
>
>For example, using this sitemap
>
> <match type="language">
>  <match pattern="*.html">
>   <read src="docs/{../1}/{1}.html"/>
>  </match>
> </match>
>
>which also keeps the URI space clean and mimics Apache HTTPD's behavior
>with language negotiation.
>

?? You're surprising me here ! The mono or multi-lingual nature of a 
document doesn't have any influence on the URI space, and it's the 
sitemap's job to do that. The filter based on xml:lang that I mentioned 
allows this without effort :

<match pattern="*.html">
  <generate src="{1}.xml"/>
  <transform type="lang-filter"/>
  <transform src="doc2html.xsl"/>
  <serialize/>
</match>

Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:sylvain@apache.org




Re: Multilingual sites

Posted by Stefano Mazzocchi <st...@apache.org>.
Nicola Ken Barozzi wrote:
> 
> Konstantin Piroumian wrote:
>  >
> > One small problem with the below snippet: how would you validate your
> > content against the DTD that permits only one title element? I don't know
> > how XML specification interpret elements with xml:lang attribute.
> >
> >>>From: "Sylvain Wallez" <sy...@anyware-tech.com>
> >>>>Here's an example :
> >>>><document>
> >>>> <section>
> >>>>   <title xml:lang="en">A multilingual document</title>
> >>>>   <title xml:lang="fr">Un site multilingue</title>
> >>>>   <p xml:lang="en">Here's a paragraph</p>
> >>>>   <p xml:lang="fr">Voici un paragraphe</p>
> >>>> </section>
> >>>></document>
> >>>>
> 
> Dang! Houston, we have a problem.

Yep, a semantic one.

With the above document snippet, Sylvain tried to use xml:lang as an RDF
attribute, but this is conceptually wrong because it collides with the
intrinsic XML structure.

The xml:lang attribute (which might admittedly be totally useless IRL)
was added to allow you to have an xml-specific way to associate some
text with some natural language.

Why? I don't know. It seems like a useless thing to me, just like
ids/idrefs and Processing instructions... how different would XML be if
they started with namespaces instead of with SGML compatibility!

Anyway, we currently have it and I made the Document DTD aware of
xml:lang *just in case*... but this attribute is a label, not a choice.
This is intended to be used in multilingual *pages*, not sites.

*this* is the difference and using one for the other is a big design
bug.

So, you have a page like this

 <document>
  <section xml:lang="en">
   <title xml:lang="it">Ciao Mamma!</title>
   <p>This is an english paragraph</p>
   <p xml:lang="fr">Et ceci est un paragraphe en français</p>
  </section>
 </document>

which also makes a great argument for having <title> as an element: you
would not be able to specify which language you are using if we were
using an attribute!

So, translated content must reside in different pages and the system
must access it transparently.

For example, using this sitemap

 <match type="language">
  <match pattern="*.html">
   <read src="docs/{../1}/{1}.html"/>
  </match>
 </match>

which also keeps the URI space clean and mimics Apache HTTPD's behavior
with language negotiation.
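
To make the idea concrete, here is a rough Python sketch of language
negotiation over a clean URI space. It is not how Cocoon's matcher or
Apache HTTPD is implemented: the function names, the default language and
the simplified Accept-Language parsing are assumptions, and only the
docs/{language}/{page}.html layout is taken from the sitemap snippet above.

```python
def preferred_languages(accept_language):
    """Parse an Accept-Language header into language tags, best first."""
    weighted = []
    for position, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without q= counts as fully preferred
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def resolve(page, accept_language, available, default="en"):
    """Map a language-free request such as 'index' to a per-language file."""
    for tag in preferred_languages(accept_language):
        primary = tag.split("-")[0]  # 'fr-fr' falls back to 'fr'
        if primary in available:
            return f"docs/{primary}/{page}.html"
    return f"docs/{default}/{page}.html"


french_path = resolve("index", "fr-FR,fr;q=0.9,en;q=0.8", {"en", "fr"})
fallback_path = resolve("index", "de,en;q=0.5", {"en", "fr"})
```

The requested URI stays index.html in both cases; only the file read
behind it changes with the reader's language preferences.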

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



Re: Multilingual sites

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Konstantin Piroumian wrote:
 >
> One small problem with the below snippet: how would you validate your
> content against the DTD that permits only one title element? I don't know
> how XML specification interpret elements with xml:lang attribute.
> 
>>>From: "Sylvain Wallez" <sy...@anyware-tech.com>
>>>>Here's an example :
>>>><document>
>>>> <section>
>>>>   <title xml:lang="en">A multilingual document</title>
>>>>   <title xml:lang="fr">Un site multilingue</title>
>>>>   <p xml:lang="en">Here's a paragraph</p>
>>>>   <p xml:lang="fr">Voici un paragraphe</p>
>>>> </section>
>>>></document>
>>>>

Dang! Houston, we have a problem.

Could this solve it?

<p>
     This goes with all languages.
   <i:part>
    <i:n lang="en">Here's a paragraph</i:n>
    <i:n lang="fr">Voici un paragraphe</i:n>
   </i:part>
</p>

Not that it's pretty 8-S
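
For what it's worth, resolving such inline alternatives could be sketched
in Python as below. This is only an illustration: the snippet above does
not declare the i: prefix, so the namespace URI used here is made up, and
the function name is hypothetical too.

```python
import xml.etree.ElementTree as ET

NS = "urn:example:inline-i18n"  # made-up namespace for the i: prefix
PART = f"{{{NS}}}part"
ALTERNATIVE = f"{{{NS}}}n"


def resolve_parts(parent, lang):
    """Replace each i:part by the text of its i:n alternative for lang."""
    previous = None  # last sibling kept, it receives the spliced text
    for child in list(parent):
        if child.tag == PART:
            chosen = ""
            for alternative in child.findall(ALTERNATIVE):
                if alternative.get("lang") == lang:
                    chosen = alternative.text or ""
                    break
            replacement = chosen + (child.tail or "")
            if previous is None:
                parent.text = (parent.text or "") + replacement
            else:
                previous.tail = (previous.tail or "") + replacement
            parent.remove(child)
        else:
            resolve_parts(child, lang)
            previous = child
    return parent


SOURCE = """<p xmlns:i="urn:example:inline-i18n">
  This goes with all languages.
  <i:part>
    <i:n lang="en">Here's a paragraph</i:n>
    <i:n lang="fr">Voici un paragraphe</i:n>
  </i:part>
</p>"""

paragraph = resolve_parts(ET.fromstring(SOURCE), "fr")
text = " ".join("".join(paragraph.itertext()).split())
```

The shared text is kept and the i:part wrapper disappears, so the result
is an ordinary paragraph again.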


> What do you think, if one will need current i18n transformer features and
> filtering based on xml:lang at the same time?

I think that we would need both approaches, one for small reused 
snippets, and another for "meat" content.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: Multilingual sites

Posted by Konstantin Piroumian <kp...@apache.org>.
From: "Sylvain Wallez" <sy...@anyware-tech.com>
> Konstantin Piroumian wrote:
> >From: "Sylvain Wallez" <sy...@anyware-tech.com>
> >
> >>Hi forresters,
> >>
> >>I'm currently considering the use of Forrest to rebuild the static part
> >>of my company's website. However, our site needs to be multilingual (at
> >>least french and english), and I'd like to share some thoughts and have
> >>your opinion.
> >>
> >>An important hypothesis is that the site will provide the same content
> >>for all languages. So my idea is to have each xdoc file contain the same
> >>content in the different languages in order to both ease translation
> >>work and ensure structural consistency.
> >>
> >>
> >
> >Why not use the i18n transformer and have translations in separate
> >dictionary files? How would having all translations in content ease the
> >translation work? Your translators would have to go through all the
> >xdocs and duplicate all the elements. And how would you translate
> >attributes (sometimes you'll need that too)?
> >
> >
>
> I use a lot i18n transformer, but for data-oriented pages which contain
> text elements that do not constitute a document (button, form input
> labels, table column title, error messages, etc).
>
> >Of course, if the number of pages is not big then your approach is Ok,
> >but I wouldn't recommend to mix translations and the content.
> >
> >
>
> But translations *are* the content !
>

One small problem with the below snippet: how would you validate your
content against the DTD that permits only one title element? I don't know
how the XML specification interprets elements with an xml:lang attribute.

> >>Here's an example :
> >><document>
> >>  <section>
> >>    <title xml:lang="en">A multilingual document</title>
> >>    <title xml:lang="fr">Un site multilingue</title>
> >>    <p xml:lang="en">Here's a paragraph</p>
> >>    <p xml:lang="fr">Voici un paragraphe</p>
> >>  </section>
> >></document>
> >>
> >>
> >
> >i18n transformer version of the same doc:
> >
> ><document>
> >  <section>
> >    <title><i18n:text>A multilingual document</i18n:text></title>
> >    <p><i18n:text>Here's a paragraph</i18n:text></p>
> >  </section>
> ></document>
> >
> >Note, that you can use shorter keys instead of the text itself.
> >
> >
>
> That's exacly what I'd like to avoid : the document structure is defined
> in one place, and the text in another one. Maintaining documents this
> way seems very difficult to me.

Yes, maybe you are right. I won't argue much, because I myself usually advise
using the i18n transformer for the things that you've mentioned above (forms,
error messages, labels) and something else for large texts and
information-based sites ;)

>
> >>Note that the above couldn't have been possible with the section title
> >>as an attribute (referring to the recent vote on this).
> >>
> >>
> >
> >It could be translated by the i18n transformer, though:
> ><section title="mydoc.title" i18n:attr="title">...</section>
> >
> >
>
> Granted. But there were some comments (from Stefano, IIRC) during the
> vote explaining that titles can contain inline elements. Moreover, I
> don't think it's good to have text that gets displayed to the user
> modelled as attributes (dico keys are perfectly ok, however).

You could adjust your page layout using some translated attributes depending
on the language, but I don't think that something like that would be needed
for an English/French site (no long words as in German, no right-to-left
direction as in Arabic, etc.).

>
> >>Building the site for a particular language will then be a simple matter
> >>of filtering elements belonging to that language before formatting by a
> >>language-agnostic skin.
> >>
> >>
> >
> >Yes, quite simple. I have to think if such functionality will be useful
> >in i18n transformer too.
> >
> >
> >
> >>Any comments and criticisms are welcome.
> >>
> >>
> >
> >Just take a look at the i18n samples and please tell me if they suit
> >your needs or you still prefer your approach. If you choose my version,
> >then I have a new, more advanced implementation of the i18n transformer
> >that I have not had time to test a little more and commit.
> >
> >
>
> As said above I18n transformer seems to me more suited to small
> autonomous text snippets that also can be reused in several places (e.g.
> translation of the "Cancel" button label), but inadequate to documents
> where each block appears only once.

The 'advanced' implementation of the transformer mentioned above has an
inline translation capability that could be used instead, but I like the
xml:lang syntax more, so I have to think a little about whether it would be
better to change the i18n transformer to act as a filter or leave it as is
and wait for your filtering transformer.

What do you think: what if one needs the current i18n transformer features
and filtering based on xml:lang at the same time?

Konstantin

>
> Sylvain
>
> --
> Sylvain Wallez
>   Anyware Technologies                  Apache Cocoon
>   http://www.anyware-tech.com           mailto:sylvain@apache.org
>
>
>


Re: Multilingual sites

Posted by Nicola Ken Barozzi <ni...@apache.org>.

Sylvain Wallez wrote:

...
> I18n transformer seems to me more suited to small 
> autonomous text snippets that also can be reused in several places (e.g. 
> translation of the "Cancel" button label), but inadequate to documents 
> where each block appears only once.

I tend to agree with Sylvain.

Translations must remain as in sync as possible. If we scatter the 
*same* content in different places, it becomes unmanageable.

My sister is a professional translator, and she said the same thing:
common text parts -> i18n transformer
translations of the same page -> in the page itself

It's also much easier for a translator to see translations of the same 
content side by side.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------


Re: Multilingual sites

Posted by Sylvain Wallez <sy...@anyware-tech.com>.
Konstantin Piroumian wrote:

>From: "Sylvain Wallez" <sy...@anyware-tech.com>
>
>  
>
>>Hi forresters,
>>
>>I'm currently considering the use of Forrest to rebuild the static part
>>of my company's website. However, our site needs to be multilingual (at
>>least french and english), and I'd like to share some thoughts and have
>>your opinion.
>>
>>An important hypothesis is that the site will provide the same content
>>for all languages. So my idea is to have each xdoc file contain the same
>>content in the different languages in order to both ease translation
>>work and ensure structural consistency.
>>    
>>
>
>Why not to use i18n transformer and have translations in separate dictionary
>files? How would having all translations in content ease the translation
>work? Your translators should go through all the xdocs and duplicate all the
>elements. And how would you translate attributes (sometimes you'll need it
>too)?
>  
>

I use the i18n transformer a lot, but for data-oriented pages which contain 
text elements that do not constitute a document (button, form input 
labels, table column title, error messages, etc).

>Of course, if the number of pages is not big then your approach is Ok, but I
>wouldn't recommend to mix translations and the content.
>  
>

But translations *are* the content !

>>Here's an example :
>><document>
>>  <section>
>>    <title xml:lang="en">A multilingual document</title>
>>    <title xml:lang="fr">Un site multilingue</title>
>>    <p xml:lang="en">Here's a paragraph</p>
>>    <p xml:lang="fr">Voici un paragraphe</p>
>>  </section>
>></document>
>>    
>>
>
>i18n transformer version of the same doc:
>
><document>
>  <section>
>    <title><i18n:text>A multilingual document</i18n:text></title>
>    <p><i18n:text>Here's a paragraph</i18n:text></p>
>  </section>
></document>
>
>Note, that you can use shorter keys instead of the text itself.
>  
>

That's exactly what I'd like to avoid : the document structure is defined 
in one place, and the text in another one. Maintaining documents this 
way seems very difficult to me.

>>Note that the above couldn't have been possible with the section title
>>as an attribute (referring to the recent vote on this).
>>    
>>
>
>It could be translated by the i18n transformer, though:
><section title="mydoc.title" i18n:attr="title">...</section>
>  
>

Granted. But there were some comments (from Stefano, IIRC) during the 
vote explaining that titles can contain inline elements. Moreover, I 
don't think it's good to have text that gets displayed to the user 
modelled as attributes (dico keys are perfectly ok, however).

>>Building the site for a particular language will then be a simple matter
>>of filtering elements belonging to that language before formatting by a
>>language-agnostic skin.
>>    
>>
>
>Yes, quite simple. I have to think if such functionality will be useful in
>i18n transformer too.
>
>  
>
>>Any comments and criticisms are welcome.
>>    
>>
>
>Just take a look at i18n samples and please tell me if it suits your needs
>or your still prefer your approach. If you choose my version than I have a
>new more advanced implementation of i18n transformer that I have no time to
>test a little more and commit.
>  
>

As said above I18n transformer seems to me more suited to small 
autonomous text snippets that also can be reused in several places (e.g. 
translation of the "Cancel" button label), but inadequate to documents 
where each block appears only once.

Sylvain

-- 
Sylvain Wallez
  Anyware Technologies                  Apache Cocoon
  http://www.anyware-tech.com           mailto:sylvain@apache.org




Re: Multilingual sites

Posted by Konstantin Piroumian <kp...@apache.org>.
From: "Sylvain Wallez" <sy...@anyware-tech.com>


> Hi forresters,
>
> I'm currently considering the use of Forrest to rebuild the static part
> of my company's website. However, our site needs to be multilingual (at
> least french and english), and I'd like to share some thoughts and have
> your opinion.
>
> An important hypothesis is that the site will provide the same content
> for all languages. So my idea is to have each xdoc file contain the same
> content in the different languages in order to both ease translation
> work and ensure structural consistency.

Why not use the i18n transformer and have translations in separate dictionary
files? How would having all translations in content ease the translation
work? Your translators would have to go through all the xdocs and duplicate
all the elements. And how would you translate attributes (sometimes you'll
need that too)?

Of course, if the number of pages is not big then your approach is OK, but I
wouldn't recommend mixing translations and the content.

>
> Here's an example :
> <document>
>   <section>
>     <title xml:lang="en">A multilingual document</title>
>     <title xml:lang="fr">Un site multilingue</title>
>     <p xml:lang="en">Here's a paragraph</p>
>     <p xml:lang="fr">Voici un paragraphe</p>
>   </section>
> </document>

i18n transformer version of the same doc:

<document>
  <section>
    <title><i18n:text>A multilingual document</i18n:text></title>
    <p><i18n:text>Here's a paragraph</i18n:text></p>
  </section>
</document>

Note, that you can use shorter keys instead of the text itself.
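
To illustrate the dictionary idea outside of Cocoon, here is a minimal
Python sketch. It does not use the real i18n transformer or its catalogue
format: the namespace URI is a placeholder, the dictionary is a plain
Python dict, and the function name is hypothetical. Unlike a real
transformer it leaves the i18n:text wrapper in place and only swaps the
text inside it.

```python
import xml.etree.ElementTree as ET

I18N_NS = "urn:example:i18n"  # placeholder, not the real Cocoon namespace
TEXT = f"{{{I18N_NS}}}text"


def translate(root, catalogue):
    """Replace the content of every i18n:text element using the catalogue.

    The original text is used as the lookup key; unknown keys are kept as is.
    """
    for element in root.iter(TEXT):
        key = (element.text or "").strip()
        element.text = catalogue.get(key, key)
    return root


SOURCE = """<document xmlns:i18n="urn:example:i18n">
  <section>
    <title><i18n:text>A multilingual document</i18n:text></title>
    <p><i18n:text>Here's a paragraph</i18n:text></p>
  </section>
</document>"""

# One dictionary per language, kept apart from the document structure.
FRENCH = {
    "A multilingual document": "Un site multilingue",
    "Here's a paragraph": "Voici un paragraphe",
}

translated = translate(ET.fromstring(SOURCE), FRENCH)
title = "".join(translated.find("section/title").itertext())
```

The structure lives in the xdoc and the words live in the dictionary, so
every new paragraph needs an entry in each language's dictionary.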

>
> Note that the above couldn't have been possible with the section title
> as an attribute (referring to the recent vote on this).

It could be translated by the i18n transformer, though:
<section title="mydoc.title" i18n:attr="title">...</section>

>
> Building the site for a particular language will then be a simple matter
> of filtering elements belonging to that language before formatting by a
> language-agnostic skin.

Yes, quite simple. I have to think if such functionality will be useful in
i18n transformer too.

>
> Any comments and criticisms are welcome.

Just take a look at the i18n samples and please tell me if they suit your needs
or you still prefer your approach. If you choose my version, then I have a
new, more advanced implementation of the i18n transformer that I have not had
time to test a little more and commit.

Konstantin

>
> Sylvain
>
> --
> Sylvain Wallez
>   Anyware Technologies                  Apache Cocoon
>   http://www.anyware-tech.com           mailto:sylvain@apache.org
>
>