Posted to c-dev@xerces.apache.org by Robert Hairgrove <ev...@hispeed.ch> on 2021/01/02 16:48:56 UTC

German translation of loadable message list

I have built the latest version of Xerces-C++ 3.2.3 on Linux Ubuntu 
18.04 LTS and would like to translate the loadable message file 
"XMLErrList_EN_US.Xml" located under "src/xercesc/NLS/EN_US/" into German.

At the top of the XML file are some instructions which don't play well 
with German:

"(...)  - All messages start with a lower-case letter (except where
    a name is used as the first word) and do not have a period
    at the end."

Leaving a period (AKA "full stop") off the end of the string is OK. But 
German capitalizes the first letter of every noun, not just of words at 
the beginning of a sentence. Also, many of the terms that refer to XML 
names should be left in English, and these should be enclosed in single 
or double quotes since they are not German.

Is this a problem? I don't want to go to the trouble of creating 
translations for hundreds of strings only to discover that they cannot 
be used due to the way they are processed after being loaded.

(Or perhaps someone has already done this chore?)


---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: German translation of loadable message list

Posted by Roger Leigh <rl...@codelibre.net>.
Hi Robert,


Yes, sorry I couldn’t be more positive about it.  If it was as simple as making and submitting a translation file, I’d have been all for it.

Part of the complexity here is due to the age of the project and the lack of standard, portable translation systems at the time it was originally written.  I don’t think attempting to support several platform-specific systems (Windows, catgets) and one portable system (ICU) was a good choice, particularly without any translations to actively test any of them!  The Qt system is very good, but would require taking on Qt dependencies.  If we had mandated ICU from the start so it could be relied upon to be present, then I think that would have been the way forward.

I think it could be made to work, but it will require significant effort, and all of the existing translation machinery is in a completely unknown state, since none of it has been exercised with anything but en_US.  I’m afraid that given the lack of maintenance for Xerces-C++, if I were to make a recommendation here, it would be to completely strip out all of the translation machinery, because right now it has zero value: it just adds complexity for no benefit.


Kind regards,
Roger


> On 3 Jan 2021, at 10:38, Robert Hairgrove <ev...@hispeed.ch> wrote:


Re: German translation of loadable message list

Posted by Robert Hairgrove <ev...@hispeed.ch>.
Thanks, Roger -- it's really too bad, because I would have liked to 
contribute this. But I suspected as much after browsing the source code 
a little more.

The XMLErrorReporter interface receives the numeric code used to look up 
the error string as the first argument to its virtual error() function, 
along with the error text itself. That function is overridden by 
SAX2XMLReaderImpl (et al.). However, the override does not use the error 
code; only the already formatted text is used.
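
To illustrate what I mean, here is a rough standalone sketch. It is not 
Xerces-C++ code: ErrorReporter, TextOnlyReporter and ForwardingReporter 
are hypothetical names, and the two-argument error() is a simplified 
stand-in for the real interface. The first implementation drops the code 
as described above; the second hands it on to the client.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a reporter interface.
// This is NOT the Xerces-C++ declaration.
struct ErrorReporter {
    virtual ~ErrorReporter() = default;
    virtual void error(unsigned int code, const std::string& text) = 0;
};

// Mirrors the behaviour described above: the numeric code arrives,
// but only the already formatted text is kept.
struct TextOnlyReporter : ErrorReporter {
    std::vector<std::string> seen;

    void error(unsigned int /*code*/, const std::string& text) override {
        seen.push_back(text);
    }
};

// Variant that passes both the code and the text on to a client callback.
struct ForwardingReporter : ErrorReporter {
    using Callback = std::function<void(unsigned int, const std::string&)>;

    explicit ForwardingReporter(Callback cb) : callback(std::move(cb)) {}

    void error(unsigned int code, const std::string& text) override {
        if (callback) {
            callback(code, text);
        }
    }

    Callback callback;
};
```

With the second variant, the client decides what to do with the code, 
including ignoring it.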

If there were some way to pass this numeric code on to the client, it 
would be trivial to set up a mapping object to return a localised 
version of the error string. After all, only a few alternate locales 
would usually need to be supported by any given application, and this is 
most easily managed by some kind of plugin mechanism (for example, see 
how the Qt framework does it -- IMHO, those people have it nailed down 
pretty well).
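
A minimal sketch of such a mapping object, assuming the numeric code did 
reach the client. MessageCatalog and the locale strings are hypothetical 
names of my own, not anything that exists in Xerces-C++. If no 
translation is registered, the lookup falls back to the text the parser 
supplied.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical client-side catalog: (locale, error code) -> translated text.
class MessageCatalog {
public:
    // Register a translation for one error code in one locale.
    void add(const std::string& locale, unsigned int code, std::string text) {
        table_[std::make_pair(locale, code)] = std::move(text);
    }

    // Return the translation if one is registered; otherwise return the
    // fallback, e.g. the English text that came with the error.
    std::string lookup(const std::string& locale, unsigned int code,
                       const std::string& fallback) const {
        auto it = table_.find(std::make_pair(locale, code));
        return it != table_.end() ? it->second : fallback;
    }

private:
    std::map<std::pair<std::string, unsigned int>, std::string> table_;
};
```

An application would fill one such catalog per supported locale at 
startup (or load it from a plugin) and call lookup() from its own error 
handler.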

I agree that all translations should be managed by a single interface 
instead of having all the different message loaders presently implemented.

Cheers,
Robert

--

On 02.01.21 18:02, Roger Leigh wrote:


Re: German translation of loadable message list

Posted by Roger Leigh <rl...@codelibre.net>.
Hi Robert,


While the Xerces-C++ codebase notionally supports translation, I’m unaware of any translations ever having been publicly submitted or used in the lifetime of the project to date.  So for the conventions indicated in the XML file, which were likely written many years ago, I wouldn’t treat them as hard and fast rules for a translation—there simply hasn’t been the need to revisit it or any experience with translations which influenced it.

It’s not likely that the translation will work without some additional support work being done on the build system side, both for the CMake build and the Autoconf/Automake build.  It’s currently set up to build the en_US translation, and anything new will need adding in.  It might possibly need logic writing to support selection of the language; it’s quite likely untested and possibly incomplete.

To complicate matters we currently support several alternative translation systems.  I did suggest last year we should drop some of them to make this more maintainable.


Kind regards,
Roger
