Posted to dev@cocoon.apache.org by Conny Krappatsch <co...@smb-tec.com> on 2000/10/06 09:36:53 UTC

Internationalization

Hello,

we are about to develop some internationalization 'tool' for cocoon.
You can take this mail as a proposal and every comment is very welcome.

By internationalization I mean marking character data in an XML file for
translation into different languages.

The requirements are:
1. A very short syntax: internationalization shouldn't require much additional
writing (the translation itself requires some additional work, of course).
2. Flexible handling of dynamically generated content: replacing a static text
with a corresponding translated text is easy. But if the text contains dynamic
content, that content may appear in a different order in different languages
(example below).
3. Support for multiple languages, selected by a parameter passed at runtime.

The markup could look like the following:

<tr>this text will be translated</tr>

[<tr> stands for 'translate'; I omitted any namespace info]

The text between the <tr> tags will be taken as key in a translation table.
For a given number of documents the keys of this table can be created
automatically by searching the documents for the <tr> tags.
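
A minimal sketch of how such a key collection could work, assuming the
un-namespaced <tr> element from the example above. The class and method names
are made up for illustration and are not part of Cocoon:

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the text of every <tr> element as a key.
class KeyExtractor {

    static Set<String> extractKeys(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("tr");
            // A LinkedHashSet keeps document order and drops repeated keys.
            Set<String> keys = new LinkedHashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                keys.add(nodes.item(i).getTextContent().trim());
            }
            return keys;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<page><tr>Hello</tr><p><tr>Goodbye</tr></p><tr>Hello</tr></page>";
        System.out.println(extractKeys(xml)); // prints [Hello, Goodbye]
    }
}
```

Running this over all documents of a site and merging the sets would give the
initial (untranslated) key column of the table.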

Some character data may contain dynamically created content whose position
(within the localized text) can differ between languages. Here's
an example (dynamic content in curly brackets):
In English one would say: We are a {team} of {4}.
In German you say: Wir sind ein {4}er-{team}.

[It's a silly example, I know. Anyone who understands what I mean is free to
provide a better one.]

The example shows that we need some sort of parameters:
<tr>
    <text>We are a $1 of $2.</text>
    <param>dynamic_content_generation_for_the_first_parameter</param>
    <param>dynamic_content_generation_for_the_second_parameter</param>
</tr>

The characters between the <text> tags are the key in the translation table. The
replacement text must contain the same number of parameter references ($1, $2)
as the key, but the order of the references can be different (e.g. $2 before
$1).
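
The substitution of $1, $2 could be sketched in Java roughly as follows. This
is only an illustration under the assumption that references are written as
"$" followed by a 1-based number; the class name is hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: fills $n references in a (translated) text.
class ParamSubstitution {

    private static final Pattern REF = Pattern.compile("\\$(\\d+)");

    // Replaces each $n with the n-th parameter value (1-based),
    // regardless of the order in which the references appear.
    static String substitute(String template, String... params) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int index = Integer.parseInt(m.group(1));
            if (index < 1 || index > params.length) {
                throw new IllegalArgumentException("No parameter for $" + index);
            }
            // quoteReplacement keeps '$' and '\' in parameter values literal.
            m.appendReplacement(out, Matcher.quoteReplacement(params[index - 1]));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(substitute("We are a $1 of $2.", "team", "4"));
        System.out.println(substitute("Wir sind ein $2er-$1.", "team", "4"));
    }
}
```

The two calls print "We are a team of 4." and "Wir sind ein 4er-team.", so the
same parameter list serves both word orders.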

So the internationalization tool has to perform the following actions (for a
given XML document):

1. Find the tags belonging to its grammar (using namespaces).
2. Determine the target language.
3. Find the given key in the translation table and replace it with the
appropriate target language text.
4. Replace parameter references with the given parameter values.
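
Steps 1 to 3 can be sketched with plain DOM as below. This is not Cocoon code:
the namespace URI "urn:example:i18n", the class name and the nested Map used
as translation table are all assumptions made for the illustration, and only
the simple <tr>text</tr> form is handled (step 4 is left out).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class I18nSketch {

    // Hypothetical namespace URI; the proposal does not define one.
    static final String NS = "urn:example:i18n";

    // tables maps a language code to its key -> translation table.
    static String translate(String xml, String lang,
                            Map<String, Map<String, String>> tables) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Step 1: find the elements of the i18n grammar by namespace.
            // Copy them first, because the NodeList is live.
            NodeList live = doc.getElementsByTagNameNS(NS, "tr");
            List<Node> found = new ArrayList<>();
            for (int i = 0; i < live.getLength(); i++) {
                found.add(live.item(i));
            }

            // Step 2: the target language arrives as a parameter.
            Map<String, String> table = tables.getOrDefault(lang, Map.of());

            // Step 3: replace each element by the translated text,
            // falling back to the key itself when no translation exists.
            for (Node tr : found) {
                String key = tr.getTextContent();
                String text = table.getOrDefault(key, key);
                tr.getParentNode().replaceChild(doc.createTextNode(text), tr);
            }

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Translation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<page xmlns:i18n=\"urn:example:i18n\">"
                + "<title><i18n:tr>Hello world</i18n:tr></title></page>";
        Map<String, Map<String, String>> tables =
                Map.of("de", Map.of("Hello world", "Hallo Welt"));
        System.out.println(translate(xml, "de", tables));
    }
}
```

Falling back to the key for a missing entry means an untranslated page still
renders in the source language instead of failing.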

Our intention is to write a processor (or, for C2, a transformer) which does
the job.
The target language could be passed as a parameter to the processor/transformer.

Does this sound reasonable? Did we overlook something? Or is anybody doing
something similar (maybe with a different approach)?

regards,
Conny

(I'm not a native English speaker. If something sounds odd, don't hesitate to
correct me.)

-- 
______________________________________________________________________
Conny Krappatsch                              mailto:conny@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com




RE: Internationalization

Posted by Colin Britton <cb...@freefoto.com>.
I know it is rather late to throw this into the discussion, and I have not
gone back and read all the old posts on this subject, but has anyone looked
at TMX?

<quote>The purpose of TMX is to allow easier exchange of translation memory
data between tools and/or translation vendors with little or no loss of
critical data during the process.</quote> http://www.lisa.org/tmx/

Here is an article that is specifically about using TMX with Java but includes some
good general info: http://www.lisa.org/tmx/m_itagaki.html

CB






Re: Internationalization

Posted by Conny Krappatsch <co...@smb-tec.com>.
On Sat, 09 Dec 2000 23:00:49 +0100
Giacomo Pati <gi...@apache.org> wrote:

> Hi Conny
> 
> Could you please post the proposal to this list, too.
> 
> Thanks
> 
> Giacomo
> 
> Conny Krappatsch wrote:

Here it comes (original message by Gerd Müller):

> Hi all,
>
> I've revised the i18n proposal again and also included the suggestions
> made by Lassi Immonen and Konstantin Piroumian. I think the proposal
> is now almost complete. Comments are always welcome.

> Best Regards,
> Gerd



-- 
______________________________________________________________________
Conny Krappatsch                              mailto:conny@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com




Re: Internationalization

Posted by Giacomo Pati <gi...@apache.org>.
Hi Conny

Could you please post the proposal to this list, too.

Thanks

Giacomo


Re: Internationalization

Posted by Sylvain Wallez <wa...@free.fr>.
I think the features of java.text.MessageFormat can handle most if not
all of these issues. The additional dynamic features that may be
required when translating to a new language are essentially related to
the expression of plurals.
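
For reference, here is a small Java sketch of how MessageFormat expresses the
plural with a choice format. It uses an English "files on disk" pattern in
which the last branch is written with the limit "1<", meaning "greater than
one". The class and method names are only illustrative:

```java
import java.text.MessageFormat;
import java.util.Locale;

class MessageFormatDemo {

    // {0} is the file count, {1} the disk name. The choice format picks
    // one of three sub-patterns depending on the value of {0}.
    static final String PATTERN =
            "There {0,choice,0#are no files|1#is one file"
            + "|1<are {0,number,integer} files} on disk \"{1}\"";

    static String format(long count, String disk) {
        // A fixed locale keeps the number grouping predictable.
        MessageFormat mf = new MessageFormat(PATTERN, Locale.ENGLISH);
        return mf.format(new Object[] { count, disk });
    }

    public static void main(String[] args) {
        System.out.println(format(0, "D:"));    // There are no files on disk "D:"
        System.out.println(format(1, "D:"));    // There is one file on disk "D:"
        System.out.println(format(2345, "D:")); // There are 2,345 files on disk "D:"
    }
}
```

Because the arguments are addressed by index, a translated pattern is free to
mention {1} before {0}.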

Maybe we should model the arguments of MessageFormat as XML with the
help of the xml:lang attribute?

Something like:
<translate>
  <text xml:lang="fr">Sur le disque "{1}", il {0,choice,0#n'y a aucun
fichier|1#y a un fichier|1&lt;y a {0,number,integer} fichiers}.</text>
  <text xml:lang="en">There {0,choice,0#are no files|1#is one file|1&lt;are
{0,number,integer} files} on disk "{1}"</text>
  <param>2345</param>
  <param>D:</param>
</translate>

This implementation puts all translations in a single source file. If
you prefer to have one dictionary file per language, maybe we could use
XLink references?

Another idea: with C2, we could pre-filter each multilingual
file into several monolingual files, thus avoiding the overhead of
applying the translation transformer at each request.

My .02 euro.

-Sylvain 


Re: Internationalization

Posted by Conny Krappatsch <co...@smb-tec.com>.
Peter Verhage wrote:
> Conny Krappatsch wrote:
> > In English one would say: We are a {team} of {4}.
> > In German you say: Wir sind ein {4}er-{team}.
> 
> The problem is that you can't always know this in advance... It may work
> for a few languages, but when you add a new language, it can still be
> different from the above, so then you have to make some more elements
> dynamic, for example... Then you still have to edit the original page to
> make it a little more dynamic... :/ What I mean to say is that you
> should not have to point out the dynamic elements of the text. But
> without that it's a lot more difficult to implement, I know... I don't
> know if there is a perfect way to do it.
> Peter

Hi Peter,

Could you explain this a little more? I can't really follow you.

Maybe I have to explain further:

'We are a {1} of {2}.' is the key for the translation table (syntax changed
according to Robin's hint). {1} and {2} are references to the parameters which
are really dynamic, i.e. created by XSP, a database query, whatever; the i18n
(short for internationalization) processor doesn't care.
This key leads us to the replacement text, e.g. 'Wir sind ein {2}er-{1}.' The
references in the replacement text are replaced with the value of the
corresponding parameter.
The replacement text could also be 'Ein {2}er-{1} wir sind.' (Even though
nobody would say that; it sounds somewhat poetic in German.) The translation
would work as well.
So we don't _make_ content dynamic for translation purposes. Some content _is_
(or may be) dynamic and thus can't be part of the translation key.

Oh wait, my example in the proposal looks like we want to translate word by
word. Maybe that was confusing. Actually we translate phrase by phrase.

It's still possible that I simply don't understand the problem you are describing.
:-|
An example could possibly enlighten me ;-)

regards,
Conny


-- 
______________________________________________________________________
Conny Krappatsch                              mailto:conny@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com




Re: Internationalization

Posted by Peter Verhage <pe...@ibuildings.nl>.
Conny Krappatsch wrote:
> In English one would say: We are a {team} of {4}.
> In German you say: Wir sind ein {4}er-{team}.

The problem is that you can't always know this in advance... It may work
for a few languages, but when you add a new language, it can still be
different from the above, so then you have to make some more elements
dynamic, for example... Then you still have to edit the original page to
make it a little more dynamic... :/ What I mean to say is that you
should not have to point out the dynamic elements of the text. But
without that it's a lot more difficult to implement, I know... I don't
know if there is a perfect way to do it.
Peter 


-- 
Peter Verhage       <pe...@ibuildings.nl>
ibuildings.nl BV - information technology
http://www.ibuildings.nl -  0118 41 50 54

Internationalization

Posted by Conny Krappatsch <co...@smb-tec.com>.
Hi Cocoon users,

Some days ago I posted the mail below to the cocoon-dev list. As there seemed
to be interest in an internationalization tool for Cocoon, I'd like to ask
some users (you) what they would expect from such a tool (handy syntax,
representation of the translation table, ...).

Please post plenty of requirements :-)

regards,
Conny


Conny Krappatsch wrote:
> Hello,
> 
> we are about to develop some internationalization 'tool' for cocoon.
> You can take this mail as a proposal and every comment is very welcome.
> 
> By internationalization I mean marking character data in an XML file for
> translation into different languages.
> 
> The requirements are:
> 1. A very short syntax: internationalization shouldn't require much additional
> writing (the translation itself requires some additional work, of course).
> 2. Flexible handling of dynamically generated content: replacing a static text
> with a corresponding translated text is easy. But if the text contains dynamic
> content, that content may appear in a different order in different languages
> (example below).
> 3. Support for multiple languages, selected by a parameter passed at runtime.
> 
> The markup could look like the following:
> 
> <tr>this text will be translated</tr>
> 
> [<tr> stands for 'translate'; I omitted any namespace info]
> 
> The text between the <tr> tags will be taken as key in a translation table.
> For a given number of documents the keys of this table can be created
> automatically by searching the documents for the <tr> tags.
> 
> Some character data may contain dynamically created content whose position
> (within the localized text) can differ between languages. Here's
> an example (dynamic content in curly brackets):
> In English one would say: We are a {team} of {4}.
> In German you say: Wir sind ein {4}er-{team}.
> 
> [It's a silly example, I know. Anyone who understands what I mean is free to
> provide a better one.]
> 
> The example shows that we need some sort of parameters:
> <tr>
>     <text>We are a $1 of $2.</text>
>     <param>dynamic_content_generation_for_the_first_parameter</param>
>     <param>dynamic_content_generation_for_the_second_parameter</param>
> </tr>
> 
> The characters between the <text> tags are the key in the translation table. The
> replacement text must contain the same number of parameter references ($1, $2)
> as the key, but the order of the references can be different (e.g. $2 before
> $1).
> 
> So the internationalization tool has to perform the following actions (for a
> given XML document):
> 
> 1. Find the tags belonging to its grammar (using namespaces).
> 2. Determine the target language.
> 3. Find the given key in the translation table and replace it with the
> appropriate target language text.
> 4. Replace parameter references with the given parameter values.
> 
> Our intention is to write a processor (or, for C2, a transformer) which does
> the job.
> The target language could be passed as a parameter to the processor/transformer.
> 
> Does this sound reasonable? Did we overlook something? Or is anybody doing
> something similar (maybe with a different approach)?
> 
> regards,
> Conny
> 
> (I'm not a native English speaker. If something sounds odd, don't hesitate to
> correct me.)
> 
> -- 
> ______________________________________________________________________
> Conny Krappatsch                              mailto:conny@smb-tec.com
> SMB GmbH                                        http://www.smb-tec.com
-- 
______________________________________________________________________
Conny Krappatsch                              mailto:conny@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com




Re: Internationalization

Posted by Lars Martin <la...@smb-tec.com>.
Hi,

On Fri, 06 Oct 2000, you wrote:
> <tr>
>  ....
> </tr>

To avoid confusion with the HTML table row element, one should try to
find a more distinctive element name than <tr>.

Just my $0.02,
Lars
-- 
___________________________________________________________________
Lars Martin                                 mailto:lars@smb-tec.com
SMB GmbH                                     http://www.smb-tec.com


Increase util:include by cache DOM.

Posted by Krzysztof Zieliński <KZ...@supermedia.pl>.
Hi.

I use the util:include-file tag to include a whole XML file, or part of one, as DOM.
util:include-file reads and parses the whole file each time it is called.
I think it would be good to keep parsed files in memory (a cache) and, when
util:include-file is called, only check whether the source file has changed.
That should speed up util:include-file.

I would have to write a cache class for util:include-file.
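
A minimal sketch of such a cache, independent of Cocoon's own classes: the
class name DomCache and the parseCount field are made up for illustration, and
the file's last-modified timestamp is assumed to be a good enough change
indicator.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical cache: keeps parsed documents keyed by canonical file path.
class DomCache {

    private static final class Entry {
        final long lastModified;
        final Document document;

        Entry(long lastModified, Document document) {
            this.lastModified = lastModified;
            this.document = document;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Counts real parses; only here to make the effect observable.
    int parseCount = 0;

    // Returns the cached DOM and re-parses only when the timestamp changed.
    synchronized Document get(File file) {
        try {
            String key = file.getCanonicalPath();
            long stamp = file.lastModified();
            Entry cached = entries.get(key);
            if (cached == null || cached.lastModified != stamp) {
                Document doc = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder()
                        .parse(file);
                parseCount++;
                cached = new Entry(stamp, doc);
                entries.put(key, cached);
            }
            return cached.document;
        } catch (Exception e) {
            throw new IllegalStateException("Could not load " + file, e);
        }
    }
}
```

Note that every caller receives the same Document instance, so a caller that
wants to modify the tree should work on a copy (for example via
Document.importNode) rather than on the cached object.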

I am not an expert in the Cocoon class structure and I don’t want to write
redundant code. Are there classes in Cocoon that are useful for this problem?
I looked at the Cocoon classes, especially XSPProcessor and
XSPProcessor$PageEntry, but I don’t know how I can use or change those classes to
cache files as DOM.

Who can help me with the Cocoon API, or maybe with the API of another project?

Does anyone have a suggestion?

Regards,
Krzysztof Zielinski
Web Application developer
kzielinski@supermedia.pl



Re: Internationalization

Posted by Berin Loritsch <bl...@infoplanning.com>.
There are the beginnings of an XML-based internationalization system in Avalon.
It was developed in the hope of having an easy way to use XML as a repository
for error messages as well as a repository for web development (its first use).



Re: Internationalization

Posted by Conny Krappatsch <co...@smb-tec.com>.
I wrote:
> > The text between the <tr> tags will be taken as key in a translation table.

Does anyone have an idea of what could be the best way to pass the location of
the translation table to the internationalization processor? Especially in C1 I
can't find a smart solution.
Keep in mind that someone may like to split the translation table, and I have to get the
right table for the current XML file.
Is adding a parameter to the processing instruction the only possible way?

Conny

-- 
______________________________________________________________________
Conny Krappatsch                              mailto:conny@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com




Re: Internationalization

Posted by Conny Krappatsch <co...@smb-tec.com>.
Let me add something to my own proposal.

Maybe we should have some attribute indicating that the parameter should be
translated, too:
<tr>
    <text>translate this text and the following parameter</text>
    <param translate="yes">the_parameter_value</param>
</tr>

-- 
______________________________________________________________________
Conny Krappatsch                              mailto:conny@smb-tec.com
SMB GmbH                                        http://www.smb-tec.com