Posted to dev@directory.apache.org by Emmanuel Lecharny <el...@gmail.com> on 2010/03/17 12:24:57 UTC

About I18n

Hi guys,

Felix has done tremendous work extracting all the error messages from
the code and gathering them in two sub-projects (shared-i18n and
apacheds-i18n).

This is just great, but I think we should go a bit further. If we want
to add a new error message, we have to add a new number at the end of
the list. As all the numbers start from 1 and are incremented, it
quickly becomes difficult to group errors by their numbers (e.g., so
that all the errors between 450 and 460 relate to operation X).

What about defining a number that immediately tells us what kind of
message we are dealing with? We could, for instance, use hex numbers,
where the two highest bits indicate the log level:
DEBUG = 00XXXXXXXXX...
INFO  = 01XXXXXXXXX...
WARN  = 10XXXXXXXXX...
ERROR = 11XXXXXXXXX...

The idea is that if the number is < 0 (the highest bit being the sign
bit of a signed integer), then it's an error or a warning.
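
A minimal sketch of the layout described above, assuming a signed 32-bit
Java int whose top two bits carry the level and whose low 30 bits carry
the message id. The names LeveledCode, encode, levelOf, idOf and
isWarnOrError are hypothetical; nothing here is taken from shared-i18n
or apacheds-i18n.

```java
// Hypothetical sketch of the proposed numbering, not ApacheDS code.
final class LeveledCode {
    // Ordinals 0..3 match the two-bit patterns 00, 01, 10, 11.
    enum Level { DEBUG, INFO, WARN, ERROR }

    private static final int LEVEL_SHIFT = 30;                  // top two bits of an int
    private static final int ID_MASK = (1 << LEVEL_SHIFT) - 1;  // low 30 bits

    /** Packs a level and a 30-bit message id into one int. */
    static int encode(Level level, int id) {
        if (id < 0 || id > ID_MASK) {
            throw new IllegalArgumentException("id does not fit in 30 bits: " + id);
        }
        return (level.ordinal() << LEVEL_SHIFT) | id;
    }

    /** Reads the level back from the top two bits (unsigned shift). */
    static Level levelOf(int code) {
        return Level.values()[code >>> LEVEL_SHIFT];
    }

    /** Reads the message id back from the low 30 bits. */
    static int idOf(int code) {
        return code & ID_MASK;
    }

    /** WARN (10...) and ERROR (11...) both set the sign bit. */
    static boolean isWarnOrError(int code) {
        return code < 0;
    }
}
```

With this layout, LeveledCode.encode(Level.WARN, 122) is negative while
LeveledCode.encode(Level.INFO, 122) is not, which is the "< 0" test
mentioned above.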

In the same vein, we can also split the errors by family. As the number
will be an integer, that leaves 30 bits to store information. Assuming
that shared messages are indicated by bit number 29, we then have a
way to split again:
101xxxxxx = a warning in the shared module
1111xxxxx = an error in the shared module, asn1 subproject...

etc.
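
The family split can be sketched the same way. This is only an
illustration: bit 29 for "shared" comes from the text above, while
using bit 28 for the asn1 subproject is an assumption read off the
"1111" pattern, and the names FamilyCode, isShared, isAsn1 and prefix
are hypothetical.

```java
// Hypothetical sketch of the family bits, not ApacheDS code.
final class FamilyCode {
    static final int WARN   = 0b10 << 30;  // level bits 10
    static final int ERROR  = 0b11 << 30;  // level bits 11
    static final int SHARED = 1 << 29;     // "shared" module flag (from the proposal)
    static final int ASN1   = 1 << 28;     // asn1 subproject flag (assumed position)

    static boolean isShared(int code) {
        return (code & SHARED) != 0;
    }

    static boolean isAsn1(int code) {
        return (code & ASN1) != 0;
    }

    /** Returns the four highest bits as text, e.g. "1010" or "1111". */
    static String prefix(int code) {
        String bits = Integer.toBinaryString(code >>> 28);
        return String.format("%4s", bits).replace(' ', '0');
    }
}
```

For instance, FamilyCode.WARN | FamilyCode.SHARED | 5 has the prefix
"1010" (a warning in the shared module), and
FamilyCode.ERROR | FamilyCode.SHARED | FamilyCode.ASN1 | 1 has the
prefix "1111".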

wdyt ?

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.nextury.com

Re: About I18n

Posted by Pierre-Arnaud Marcelot <pa...@marcelot.net>.

+1

I like the idea of being able to identify some aspects of the error (origin, criticality, etc.) without having to open an external table of error descriptions and look up the error number in it.

That said, the various ranges need to be selected carefully to allow for the addition of new errors, and the numbering scheme must be clearly defined and documented on the website (and kept up to date).

Regards,
Pierre-Arnaud



Re: About I18n

Posted by Emmanuel Lecharny <el...@gmail.com>.

On 3/17/10 1:50 PM, Felix Knecht wrote:
>
>> What about defining a number that immediately tells us what kind of
>> message we are dealing with? We could, for instance, use hex numbers,
>> where the two highest bits indicate the log level:
>> DEBUG = 00XXXXXXXXX...
>> INFO  = 01XXXXXXXXX...
>> WARN  = 10XXXXXXXXX...
>> ERROR = 11XXXXXXXXX...
>>
> ? ATM only errors should be i18n-ified. This means only ERROR/FATAL
> may occur. None of the other messages are translated yet. Furthermore,
> a log message clearly indicates which level it is. Am I missing
> something?
>
No, you're right, this info is already provided by the logger. So please
forget about the first two bits. I still think that we can use bits to
discriminate between errors, though.

-- 

Regards,
Cordialement,
Emmanuel Lécharny
www.nextury.com

Re: About I18n

Posted by Felix Knecht <fe...@apache.org>.

> This is just great, but I think we should go a bit further. If we want
> to add a new error message, we have to add a new number at the end of
> the list. As all the numbers start from 1 and are incremented, it
> quickly becomes difficult to group errors by their numbers (e.g., so
> that all the errors between 450 and 460 relate to operation X).

For apacheds this is true, but for shared I started a new block of
numbers for each submodule (01xxx, 02xxx, ...). Maybe a refactoring of
apacheds is needed.

>
> What about defining a number that immediately tells us what kind of
> message we are dealing with? We could, for instance, use hex numbers,
> where the two highest bits indicate the log level:
> DEBUG = 00XXXXXXXXX...
> INFO  = 01XXXXXXXXX...
> WARN  = 10XXXXXXXXX...
> ERROR = 11XXXXXXXXX...

? ATM only errors should be i18n-ified. This means only ERROR/FATAL
may occur. None of the other messages are translated yet. Furthermore,
a log message clearly indicates which level it is. Am I missing
something?

>
> The idea is that if the number is < 0, then it's an error or a warning.
>
> In the same vein, we can also split the errors by family. As the number
> will be an integer, that leaves 30 bits to store information. Assuming
> that shared messages are indicated by bit number 29, we then have a
> way to split again:
> 101xxxxxx = a warning in the shared module
> 1111xxxxx = an error in the shared module, asn1 subproject...
>
> etc.
>
> wdyt ?
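
The per-submodule blocks Felix describes for shared (01xxx, 02xxx, ...)
could be handled along these lines. This is a sketch under assumptions:
a block size of 1000 is inferred from the "01xxx" pattern, and both the
"ERR_" key prefix with five digits and the names BlockCode, of,
submoduleOf, indexOf and key are hypothetical rather than taken from
shared-i18n.

```java
// Hypothetical helper for decimal block numbering, not ApacheDS code.
final class BlockCode {
    private static final int BLOCK_SIZE = 1000; // three digits per submodule (assumed)

    /** Builds a code such as 2007 for submodule 2, message 7. */
    static int of(int submodule, int index) {
        if (submodule < 0 || submodule > 99 || index < 0 || index >= BLOCK_SIZE) {
            throw new IllegalArgumentException(
                "out of range: submodule=" + submodule + ", index=" + index);
        }
        return submodule * BLOCK_SIZE + index;
    }

    /** The two leading digits identify the submodule block. */
    static int submoduleOf(int code) {
        return code / BLOCK_SIZE;
    }

    /** The three trailing digits identify the message within the block. */
    static int indexOf(int code) {
        return code % BLOCK_SIZE;
    }

    /** Zero-padded key, using an assumed "ERR_" prefix. */
    static String key(int code) {
        return String.format("ERR_%05d", code);
    }
}
```

New messages for a submodule can then be appended inside its own block
without disturbing the numbers of the other submodules, as long as no
block grows past 999 entries.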

Re: About I18n

Posted by Felix Knecht <fe...@apache.org>.

No problem, I don't feel blamed. But if I hadn't started the task, I
don't know when the i18n stuff would have been introduced ...

> I agree with Alan. However, I don't blame Felix for having chosen this
> solution: he had some very good reasons to do so:
> - having no knowledge about the context, he wasn't able to pick a
>   correct name for each error

That's true, but there's another point: "We have to add a number for each
kind of error, instead of a simple String message." [1]. Of course, a
number-based knowledge base would make it easier to find a specific
problem.

> Regarding the numbers, we may remove them, I also agree.

Looking back, we had a small discussion about whether or not to use
numbers, and there were pros and cons for both solutions [2].

BTW: whichever solution is finally chosen, it should also be the
preferable one for message translation. Up to now, only error messages
have been translated. 'Common' messages are still to be done.

Regards
Felix

[1] https://issues.apache.org/jira/browse/DIRSERVER-886
[2] http://directory.markmail.org/search/?q=[ApacheDS]%20I18n#query:[ApacheDS]%20I18n+page:1+mid:2h5ac5wqn4jt6ye7+state:results

Re: About I18n

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.

On Mar 25, 2010, at 7:11 AM, Emmanuel Lecharny wrote:

> On 3/25/10 8:03 AM, Alex Karasulu wrote:
>> Wow, you just described fully and immaculately why this proposal was
>> rubbing me the wrong way. I did not have the time to run through a
>> use case to see it clearly - thanks for doing this and commenting for
>> all our benefit.
>>
>> No general mechanical procedure makes up for actual thought for each
>> case. We have to watch for that here.
>>
>> I agree with Alan on this one. Let's not further obfuscate our
>> code. BTW it's time for a thorough audit of error messages and log
>> output, since these days many are complaining about false errors and
>> excessive verbosity without clear meaning.
> I agree with Alan. However, I don't blame Felix for having chosen
> this solution: he had some very good reasons to do so:
> - having no knowledge about the context, he wasn't able to pick a
>   correct name for each error
> - this was a very painful task, and he did it. It's now our turn to
>   complete the job

Yeah, I saw that.  Great work!

> So yes, we should move to enums, and pick correct names for them.
>
> This can be done step by step; I don't believe we could spend one
> full week in a row doing that.
>
> I remember years ago when we had thousands of string constants all
> over the code, and decided that we should gather all those constants
> in a few places: it's not completely done, but it took months to do
> it.

Definitely a good idea.


Regards,
Alan

Re: About I18n

Posted by Emmanuel Lecharny <el...@gmail.com>.

On 3/25/10 8:03 AM, Alex Karasulu wrote:
> Wow, you just described fully and immaculately why this proposal was
> rubbing me the wrong way. I did not have the time to run through a use
> case to see it clearly - thanks for doing this and commenting for all
> our benefit.
>
> No general mechanical procedure makes up for actual thought for each
> case. We have to watch for that here.
>
> I agree with Alan on this one. Let's not further obfuscate our code.
> BTW it's time for a thorough audit of error messages and log output,
> since these days many are complaining about false errors and excessive
> verbosity without clear meaning.

I agree with Alan. However, I don't blame Felix for having chosen this
solution: he had some very good reasons to do so:
- having no knowledge about the context, he wasn't able to pick a
  correct name for each error
- this was a very painful task, and he did it. It's now our turn to
  complete the job

So yes, we should move to enums, and pick correct names for them.

This can be done step by step; I don't believe we could spend one full
week in a row doing that.

I remember years ago when we had thousands of string constants all over
the code, and decided that we should gather all those constants in a few
places: it's not completely done, but it took months to do it.

Regarding the numbers, we may remove them, I also agree.

Let's discuss this.

-- 

Regards,
Cordialement,
Emmanuel Lécharny
www.nextury.com

Re: About I18n

Posted by Alex Karasulu <ak...@gmail.com>.

Wow, you just described fully and immaculately why this proposal was
rubbing me the wrong way. I did not have the time to run through a use
case to see it clearly - thanks for doing this and commenting for all
our benefit.

No general mechanical procedure makes up for actual thought for each
case. We have to watch for that here.

I agree with Alan on this one. Let's not further obfuscate our code.
BTW it's time for a thorough audit of error messages and log output,
since these days many are complaining about false errors and excessive
verbosity without clear meaning.


Re: About I18n

Posted by "Alan D. Cabrera" <li...@toolazydogs.com>.

I have a fair bit of experience in this area and would recommend
against using numbers.  It adds no value: one never needs to know the
log level of a message when translating, and it is an impediment to
translation and to understanding the code.  Here's an example that
proves my point:

    log.error( I18n.err( I18n.ERR_122 ), ioe );

That doesn't seem very helpful.  Would this not be better?

    log.error( I18n.err( I18n.errorEncodingEncryptionKey ), ioe );

If I am reviewing/translating the i18n files,

    ERR_122=Störungskodierung EncryptionKey

is not as easy to read and fact-check as

    errorEncodingEncryptionKey=Störungskodierung EncryptionKey

Finally, the set of static strings in the I18n.java file is redundant
and doesn't really add any safety per se.  As a matter of fact, things
become a bit more brittle, and it is morally equivalent to using ints
instead of enums.  I might make these enums whose values become
indexes to the resource bundle; this way you get type safety and can
use your IDE to see who uses the message.
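
The enum idea could look roughly like the sketch below. It is only an
illustration under assumptions: the names Msg and DemoBundle are
hypothetical, the English message text is invented for the demo, and an
in-memory ListResourceBundle stands in for the real .properties files
so that the example is self-contained.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

// Hypothetical enum: each constant carries its resource-bundle key.
enum Msg {
    ERROR_ENCODING_ENCRYPTION_KEY("errorEncodingEncryptionKey");

    private final String key;

    Msg(String key) {
        this.key = key;
    }

    String key() {
        return key;
    }

    /** Looks the message up in the given bundle and fills in any arguments. */
    String format(ResourceBundle bundle, Object... args) {
        return MessageFormat.format(bundle.getString(key), args);
    }
}

// Stand-in for a translated .properties file (demo text is an assumption).
final class DemoBundle extends ListResourceBundle {
    @Override
    protected Object[][] getContents() {
        return new Object[][] {
            { "errorEncodingEncryptionKey", "Error encoding EncryptionKey" },
        };
    }
}
```

A call site would then read something like
log.error( Msg.ERROR_ENCODING_ENCRYPTION_KEY.format( bundle ), ioe ),
and an IDE's "find usages" on the enum constant shows every place the
message is used.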

Just my 2 cents.

Regards,
Alan