Posted to users@tomcat.apache.org by "André Warnier (tomcat)" <aw...@ice-sa.com> on 2018/11/13 15:33:41 UTC

[OT] Re: Translation help wanted

Holy Smoke (Fumée Sacrée | Sagrado Humo | Heilige Rauch) !
How many messages are in that code ?
Seems time to add some AI-translate add-on to the code.

On 13.11.2018 14:50, Mark Thomas wrote:
> On 13/11/2018 12:32, Rémy Maucherat wrote:
>> On Mon, Nov 12, 2018 at 12:49 PM Mark Thomas <ma...@apache.org> wrote:
>>
>>> I'm aiming to export the translations on a regular basis to the Tomcat
>>> source code. How regularly will depend on the rate of new/updated
>>> translations but as a minimum, I'm aiming to get any updates into the
>>> next Tomcat 9 release.
>>>
>>
>> Ok. Could you remove "French (MC)" ? No idea where it comes from but in
>> Monaco they do standard French, and it's too small anyway :D
>
> Done. I wasn't sure about that one. Currently, any contributor can add a
> new language. I left it open to let the community set the direction.
>
> More generally...
>
> I don't have any fixed rules in mind for when I'll add a new language to the
> Tomcat source code but the bar was set pretty low for what we had before
> POEditor was being used. My gut feeling is 1-2% is sufficient.
>
> I've just completed the first set of updates. It might be worth checking
> the commits on the dev list (e.g. [1]) to make sure I haven't messed
> anything up.
>
> The community generated over 300 new translations yesterday which is
> fantastic - and new translators continue to join. If you haven't
> already, please spread the word.
>
> I'm planning on making any Tomcat committers that sign up, admins for
> the project.
>
> Mark
>
>
> [1] https://tomcat.markmail.org/thread/d5gt43lhgiw2miyc
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>




Re: Translation help wanted

Posted by Woonsan Ko <wo...@apache.org>.
On Tue, Nov 13, 2018 at 7:02 PM André Warnier (tomcat) <aw...@ice-sa.com> wrote:
>
> Ok, I take it back. I don't think there's an AI smart enough to translate this one :

+1. ;-)

Having translated only 3% of the Korean messages so far, I have already met some
challenging items where it was not easy to find the right words.
In those cases, it was helpful to look up the source code as well.

For example,

Unable to replicate out data for a AbstractReplicatedMap.get operation
java.org.apache.catalina.tribes.tipis.zzz.abstractReplicatedMap.unable.get

The first line is the English message, and the second is a hint for
the message key. Just remove everything up to and including "....zzz." and
search for "abstractReplicatedMap.unable.get" in the .java files from [1].
The message is then easier to understand, and we can come up with a better
translation than the default one.
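
A minimal sketch of that lookup, for anyone who prefers to script it. The
helper names message_key and find_usages, and the local checkout path
"tomcat", are illustrative assumptions and not part of POEditor or Tomcat:

```python
from pathlib import Path


def message_key(hint: str, marker: str = ".zzz.") -> str:
    """Return the text after the first marker, or the hint unchanged if absent."""
    _, found, key = hint.partition(marker)
    return key if found else hint


def find_usages(source_root: str, key: str) -> list[Path]:
    """List the .java files under source_root whose text contains the key."""
    matches = []
    for path in sorted(Path(source_root).rglob("*.java")):
        if key in path.read_text(encoding="utf-8", errors="ignore"):
            matches.append(path)
    return matches


if __name__ == "__main__":
    hint = "java.org.apache.catalina.tribes.tipis.zzz.abstractReplicatedMap.unable.get"
    key = message_key(hint)
    print(key)
    # "tomcat" is a hypothetical path to a local clone of the repository in [1].
    for path in find_usages("tomcat", key):
        print(path)
```

Reading the code around each hit usually shows when the message is logged
and what the placeholders stand for.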

AI is cheap at the moment. Cheap words are cheap to readers too.

Regards,

Woonsan

[1] https://github.com/apache/tomcat.git

>
> "The attribute directive (declared in line [{1}] and whose name attribute is [{0}], the
> value of this name-from-attribute attribute) must be of type java.lang.String, is
> "required" and not a "rtexprvalue"."
>



RE: Translation help wanted

Posted by "Caldarale, Charles R" <Ch...@unisys.com>.
> From: André Warnier (tomcat) [mailto:aw@ice-sa.com] 
> Subject: Re: Translation help wanted

> Ok, I take it back. I don't think there's an AI smart enough to translate this one :
>
> "The attribute directive (declared in line [{1}] and whose name attribute is [{0}], the
> value of this name-from-attribute attribute) must be of type java.lang.String, is
> "required" and not a "rtexprvalue"."

Maybe we should translate it to English first...

  - Chuck

"This is the sort of bloody nonsense up with which I will not put."
(probably Churchill, in The Strand magazine)




Re: Translation help wanted

Posted by "André Warnier (tomcat)" <aw...@ice-sa.com>.
Ok, I take it back. I don't think there's an AI smart enough to translate this one :

"The attribute directive (declared in line [{1}] and whose name attribute is [{0}], the 
value of this name-from-attribute attribute) must be of type java.lang.String, is 
"required" and not a "rtexprvalue"."





Re: Translation help wanted

Posted by "André Warnier (tomcat)" <aw...@ice-sa.com>.
On 13.11.2018 18:12, Mark Thomas wrote:
> Removing the [OT] marker as I think this is very much on topic.
>
> On 13/11/2018 15:33, André Warnier (tomcat) wrote:
>> Holy Smoke (Fumée Sacrée | Sagrado Humo | Heilige Rauch) !
>> How many messages are in that code ?
>
> Currently there are 2747 unique terms.
>
>> Seems time to add some AI-translate add-on to the code.
>
> That is supported but it has to be paid for. That was something I was
> thinking about. I have 10k characters of free translation (POEditor uses
> either Google translate or Microsoft Automatic Translation) with my
> POEditor account. The Tomcat messages average ~67.5 characters per
> message so those free credits should be able to translate just under 150
> messages.
>
> To put it another way, automatic translation of the 2000 untranslated
> French messages would cost less than $10 USD.
>
> Hmm. The Tomcat project has a little over GBP 800 in the bank to cover
> the up front costs of the next Tomcat conference.
>
> Here is a thought. I try automatic translation of as many French
> messages as I can with the 10k free characters. You review them (you can
> filter by automatic translation and then mark them as proof read). If
> you think the automatic translations are worthwhile, I get the PMC to
> vote on spending some of that money on automatic translation. For
> example, if we spent ~$55 we could do automatic translation for just
> over 10 complete languages.
>
> Are you up for that?
>

I was half-kidding, but what I was really thinking of, was a Valve which would use some AI 
to translate the messages going out, on-the-fly.

The vast majority of the messages which I've seen so far (and attempted to translate to 
French), are error messages, which either go to the logs (in majority I presume), or to 
the user as some kind of error response (of which the status codes should be identifiable).
A good number of terms in them (50% ?) are either untranslatable, or should not be 
translated because they point to Classnames and the like (so, "reserved words").
The rest looks like a limited vocabulary and "filler" words, such as "can", "cannot", 
"disallowed", "parameter", "directory", "request", "response", "committed" ..
The majority of the messages also look like they would make sense only to an audience of
programmers, who are used to dealing with English-only programming languages (only
Java in this case), and (I believe) are not so picky about the finer points of style or
syntax (well, at least not the ones I know) ;-).
The thing is also that one really needs such a translation only when things go wrong or 
during development/testing, so it could be turned off (default) and on only when needed, 
using some dynamic parameter e.g. (the Manager, anyone ?).
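
To make the "reserved words" point concrete, here is a minimal sketch of how
such terms could be shielded before a message is handed to any translator.
The regular expression, the __R0__ token format and the translate callable
are all assumptions for illustration; this is not an existing Tomcat or
POEditor feature, it only covers the masking step (not a Valve), and a real
translation service might well mangle the tokens:

```python
import re
from typing import Callable

# Assumption: placeholders such as {0} and dotted names such as
# java.lang.String are the "reserved words" that must not be translated.
RESERVED = re.compile(r"\{\d+\}|\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+\b")
TOKEN = re.compile(r"__R(\d+)__")


def translate_protected(message: str, translate: Callable[[str], str]) -> str:
    """Mask reserved terms, translate the rest, then restore the terms."""
    reserved = []

    def mask(match):
        # Remember the original term and leave a numbered token in its place.
        reserved.append(match.group(0))
        return f"__R{len(reserved) - 1}__"

    masked = RESERVED.sub(mask, message)
    translated = translate(masked)
    # Put each original term back where its token ended up.
    return TOKEN.sub(lambda m: reserved[int(m.group(1))], translated)


if __name__ == "__main__":
    # str.upper is only a stand-in for a real translation function.
    print(translate_protected("must be of type java.lang.String, see [{0}]", str.upper))
```

With the stand-in translator, everything is upper-cased except
java.lang.String and {0}, which come back exactly as they went in.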

That all looks to me like it may make sense, and it should not be so difficult, to apply 
some automated (and optional) translation to them on-the-fly.  And such a thing may save 
*a lot* of maintenance and contributed time over the years, don't you think ?

Note that this is not in any way meant to denigrate the enthusiasm and literary talent of 
the people having contributed so far. But let's face it : due to the very nature of the 
beast itself, to the length limit etc., most of what comes out looks like Denglish or 
Frenglish or Spanglish anyway (and has to be so, to be really helpful). So maybe we might 
as well bite the bullet..
Also, AI sounds hot again nowadays, and having the first Apache software which implements 
an automatic on-the-fly translator-assistant for messages should be a hit, no ?

As a final marketing spiel, I would add that the inevitable initial vagaries of the 
AI-assistant, would probably add much enjoyment to the arduous task of debugging one's 
code. And if one can switch the language on the fly, it may even fulfill some educational 
purpose.





Re: Translation help wanted

Posted by Mark Thomas <ma...@apache.org>.
On 13/11/2018 18:08, Olaf Kock wrote:
> 
> On 13.11.18 18:12, Mark Thomas wrote:
>>
>>> Seems time to add some AI-translate add-on to the code.
>> That is supported but it has to be paid for. That was something I was
>> thinking about. I have 10k characters of free translation (POEditor uses
>> either Google translate or Microsoft Automatic Translation) with my
>> POEditor account. The Tomcat messages average ~67.5 characters per
>> message so those free credits should be able to translate just under 150
>> messages.
> 
> 
> We've been using automatic translation in Liferay for a while, but IMHO
> this didn't work out well.
> 
> Machine translations lack the context, and are rarely right (and
> accurate), but typically confusing, sometimes hilarious, partly rude
> and rather unhelpful. More than that: If even new terms will be
> automatically translated in the future, translators will always have to
> go through *everything* and identify the newly automatically translated
> strings. Not fun.
> 
> I'd recommend to not even consider it. At Liferay, which has
> considerable user-facing UI, this was a quick start, that triggered
> corrections once there was a wonky automatic translation. Plus, after a
> while we introduced a mechanism to identify newly translated strings.
> Without that, it's hard to maintain.

We do have this. POEditor will filter to show only automated
translations and we can mark them as proof-read once they have been checked.

Is anyone willing to try this? If so, for which language?

Mark



Re: Translation help wanted

Posted by Olaf Kock <to...@olafkock.de>.
On 13.11.18 18:12, Mark Thomas wrote:
>
>> Seems time to add some AI-translate add-on to the code.
> That is supported but it has to be paid for. That was something I was
> thinking about. I have 10k characters of free translation (POEditor uses
> either Google translate or Microsoft Automatic Translation) with my
> POEditor account. The Tomcat messages average ~67.5 characters per
> message so those free credits should be able to translate just under 150
> messages.


We've been using automatic translation in Liferay for a while, but IMHO 
this didn't work out well.

Machine translations lack context and are rarely right (or
accurate); they are typically confusing, sometimes hilarious, partly rude
and rather unhelpful. More than that: if new terms are also
automatically translated in the future, translators will always have to
go through *everything* to identify the newly automatically translated
strings. Not fun.

I'd recommend not even considering it. At Liferay, which has 
considerable user-facing UI, this was a quick start, that triggered 
corrections once there was a wonky automatic translation. Plus, after a 
while we introduced a mechanism to identify newly translated strings. 
Without that, it's hard to maintain.

My 2 cents,

Olaf





Re: Translation help wanted

Posted by Mark Thomas <ma...@apache.org>.
Removing the [OT] marker as I think this is very much on topic.

On 13/11/2018 15:33, André Warnier (tomcat) wrote:
> Holy Smoke (Fumée Sacrée | Sagrado Humo | Heilige Rauch) !
> How many messages are in that code ?

Currently there are 2747 unique terms.

> Seems time to add some AI-translate add-on to the code.

That is supported but it has to be paid for. That was something I was
thinking about. I have 10k characters of free translation (POEditor uses
either Google translate or Microsoft Automatic Translation) with my
POEditor account. The Tomcat messages average ~67.5 characters per
message so those free credits should be able to translate just under 150
messages.

To put it another way, automatic translation of the 2000 untranslated
French messages would cost less than $10 USD.
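
Spelled out, the arithmetic behind those two figures looks roughly like
this. The inputs are the numbers quoted above; the per-character price is
not stated anywhere in this thread, so the last value is only the ceiling
implied by "less than $10", not an actual POEditor rate:

```python
import math

FREE_CHARACTERS = 10_000        # free automatic-translation credits
AVG_CHARS_PER_MESSAGE = 67.5    # average length of a Tomcat message
UNTRANSLATED_FRENCH = 2000      # untranslated French messages


def messages_covered(characters: float, avg_chars: float) -> int:
    """Whole messages that fit into a character budget."""
    return math.floor(characters / avg_chars)


def characters_needed(messages: int, avg_chars: float) -> float:
    """Characters to translate for a given number of average-sized messages."""
    return messages * avg_chars


def implied_max_price_per_million(total_cost: float, characters: float) -> float:
    """Upper bound on the price per million characters for a cost ceiling."""
    return total_cost / characters * 1_000_000


if __name__ == "__main__":
    # 148 messages, i.e. "just under 150"
    print(messages_covered(FREE_CHARACTERS, AVG_CHARS_PER_MESSAGE))
    french_chars = characters_needed(UNTRANSLATED_FRENCH, AVG_CHARS_PER_MESSAGE)
    print(french_chars)  # 135000.0 characters
    # "less than $10" for those characters implies at most about 74.07 per million
    print(round(implied_max_price_per_million(10, french_chars), 2))
```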

Hmm. The Tomcat project has a little over GBP 800 in the bank to cover
the up front costs of the next Tomcat conference.

Here is a thought. I try automatic translation of as many French
messages as I can with the 10k free characters. You review them (you can
filter by automatic translation and then mark them as proof read). If
you think the automatic translations are worthwhile, I get the PMC to
vote on spending some of that money on automatic translation. For
example, if we spent ~$55 we could do automatic translation for just
over 10 complete languages.

Are you up for that?

Mark
