Posted to dev@cocoon.apache.org by Jeremy Quinn <je...@apache.org> on 2008/07/30 14:11:20 UTC

RFC: Using icu4j for number formatting

Dear All

Background:

While working on validating number fields for CForms, I am finding a
large number of discrepancies between Dojo's localised number formats
and the ones built in to Java. These discrepancies break Dojo's
client-side validation for many locales.

@see http://blog.fiveone.org/2008/07/number-format-hell.html
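
To see what a particular JVM actually produces, something like the
following can be run on each platform and the output compared with
Dojo's. This is only a minimal sketch: the class name, the helper names
and the list of locales are made up for illustration, and the output
depends on the JVM's locale data, so none is shown here.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical diagnostic class, not part of Cocoon.
public class JvmNumberFormatDump {

    /** Describes how this JVM formats plain decimals for one locale. */
    static String describe(Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        if (!(nf instanceof DecimalFormat)) {
            // The JDK does not guarantee a DecimalFormat for every locale.
            return locale + ": not a DecimalFormat";
        }
        DecimalFormat df = (DecimalFormat) nf;
        DecimalFormatSymbols symbols = df.getDecimalFormatSymbols();
        return locale + ": pattern=" + df.toPattern()
                + " decimal=U+" + hex(symbols.getDecimalSeparator())
                + " grouping=U+" + hex(symbols.getGroupingSeparator())
                + " sample=" + df.format(1234567.891);
    }

    /** Prints a separator as a code point so invisible characters show up. */
    static String hex(char c) {
        return String.format("%04X", (int) c);
    }

    public static void main(String[] args) {
        // Arbitrary sample of locales, chosen for illustration only.
        String[] tags = { "en-US", "de-DE", "fr-FR", "ar-SA", "hi-IN" };
        for (String tag : tags) {
            System.out.println(describe(Locale.forLanguageTag(tag)));
        }
    }
}
```

Printing the separators as code points matters because some of the
differences are between characters that look alike on screen.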

I mention a few ideas for solutions in the comments, but I think I
have since come up with a better one ......

com.ibm.icu.* provides equivalents to java.text.DecimalFormat,
java.util.Currency etc. that are built using the same CLDR (Common
Locale Data Repository) dataset that Dojo is built from.
@see http://www.unicode.org/cldr/

Specifically, Dojo 1.1.1 (current release) uses CLDR 1.5.1, which comes
in icu4j version 3.8.1, and Dojo 1.2 will use CLDR 1.6, which comes in
icu4j 4.0, so there is a clear upgrade path.

If this works, the benefit would be that number formatting would be
consistent regardless of the JVM you are using (as long as it is 1.4
or later, the minimum icu4j needs to run).

Question:

Currently, o.a.c.forms.datatype.convertor.FormattingDecimalConvertor
(the base class for all number formatting convertors) uses
java.text.DecimalFormat internally, without exposing the class to the
outside (except for one protected method).

If I were to re-implement FormattingDecimalConvertor using icu4j,  
should I leave the old one alone and create a new  
icu4jFormattingDecimalConvertor, or work with the original class?

If this solves the problem, this would be the only decimal convertor  
that would work properly with Dojo, so it would seem pointless to  
leave the old one around, leading to confusion .....

I ask this because when it comes to Date Convertors, we do have  
separate ones for icu4j and the built-in date formatters.

Many thanks for any suggestions

regards Jeremy



Re: RFC: Using icu4j for number formatting

Posted by Antonio Gallardo <ag...@agssa.net>.
Jeremy Quinn wrote:
> Dear All
>
> Background:
>
> While working on validating number fields for CForms, I am finding 
> that there is a huge number of discrepancies between Dojo's localised 
> number formatting and the ones built-in to Java. These discrepancies 
> are breaking Dojo's ability to perform client-side validation for many 
> Locales.
>
> @see http://blog.fiveone.org/2008/07/number-format-hell.html
>
> I mention a few ideas for solutions in the comments, but I think I 
> came up with a better one ......
>
> com.ibm.icu.* provides equivalents to java.text.DecimalFormat, 
> java.util.Currency etc. that are built using the same CLDR (Common 
> Locale Data Repository) dataset that Dojo is built from. @see 
> http://www.unicode.org/cldr/ .
>
> Specifically, Dojo 1.1.1 (current release) uses CLDR 1.5.1 that comes 
> in icu4j version 3.8.1 and Dojo 1.2 will use CLDR 1.6 which comes in 
> icu4j 4.0 (clear upgrade path).
>
> If this works, the benefit would be that number formatting would be 
> consistent regardless of the JVM you are using (above 1.4 the minimum 
> icu needs to run).
>
> Question:
>
> Currently, o.a.c.forms.datatype.convertor.FormattingDecimalConvertor 
> (the baseclass for all Number Formatting convertors), uses 
> java.text.DecimalFormat internally, without exposing the class to the 
> outside (except for one protected Method).
>
> If I were to re-implement FormattingDecimalConvertor using icu4j, 
> should I leave the old one alone and create a new 
> icu4jFormattingDecimalConvertor, or work with the original class?
>
> If this solves the problem, this would be the only decimal convertor 
> that would work properly with Dojo, so it would seem pointless to 
> leave the old one around, leading to confusion .....
>
> I ask this because when it comes to Date Convertors, we do have 
> separate ones for icu4j and the built-in date formatters.
I agree, it is pointless to keep the old one around.

Thanks Jeremy for your effort!

Best Regards,

Antonio Gallardo.


Re: RFC: Using icu4j for number formatting

Posted by "Mr.Quinn" <su...@gmail.com>.
On 31 Jul 2008, at 12:46, Carsten Ziegeler wrote:

> Jeremy Quinn wrote:
>> Any suggestions about a clean way to have both Java and ICU  
>> NumberFormat ?
>
> If the ICU number formats are compatible with the current ones, I  
> think replacing the current implementation with ICU is the best way.  
> I don't think that we have to be compatible on a class level here.

It was very nearly a drop-in replacement :)

The main difference was that com.ibm.icu.text.DecimalFormat.parse
returns com.ibm.icu.math.* classes instead of java.math.* classes,
which was easily worked around.
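
For illustration, one way to work around this kind of mismatch is to
normalise whatever Number the parser returns into java.math.BigDecimal.
This is a sketch only, not necessarily the change that was made: the
class and method names are hypothetical, and the toString() fallback
assumes the foreign Number type prints a plain or scientific decimal
string that java.math.BigDecimal can parse.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Cocoon or icu4j.
public class ParsedNumbers {

    /** Converts any Number handed back by a parser to java.math.BigDecimal. */
    static BigDecimal toJavaBigDecimal(Number n) {
        if (n instanceof BigDecimal) {
            return (BigDecimal) n;
        }
        if (n instanceof BigInteger) {
            return new BigDecimal((BigInteger) n);
        }
        if (n instanceof Long || n instanceof Integer
                || n instanceof Short || n instanceof Byte) {
            return BigDecimal.valueOf(n.longValue());
        }
        // Fallback for Double, Float and third-party Number types such as
        // com.ibm.icu.math.BigDecimal. ASSUMPTION: toString() yields a
        // decimal string that java.math.BigDecimal accepts. NaN and
        // infinite values will throw NumberFormatException here.
        return new BigDecimal(n.toString());
    }

    public static void main(String[] args) {
        System.out.println(toJavaBigDecimal(Double.valueOf(1234.5)));
        System.out.println(toJavaBigDecimal(Long.valueOf(42)));
    }
}
```

As far as I know com.ibm.icu.math.BigDecimal also offers its own
toBigDecimal() conversion, which would be the more direct route when
icu4j is on the classpath; the fallback above just avoids a compile-time
dependency in the sketch.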

As for any subtle differences in the pattern formats, I have not  
noticed anything yet.

Thanks for your feedback.

regards Jeremy

Re: RFC: Using icu4j for number formatting

Posted by Carsten Ziegeler <cz...@apache.org>.
Jeremy Quinn wrote:
> Any suggestions about a clean way to have both Java and ICU NumberFormat ?

If the ICU number formats are compatible with the current ones, I think 
replacing the current implementation with ICU is the best way. I don't 
think that we have to be compatible on a class level here.

Carsten
-- 
Carsten Ziegeler
cziegeler@apache.org

Re: RFC: Using icu4j for number formatting

Posted by Jeremy Quinn <je...@apache.org>.
Hi Joerg

Yes, there are separate icu4jDateConvertor and FormattingDateConvertor  
(which uses the original java.text.DateFormat).

The problem I see with replicating the same separation for numbers is
that we would have to make near-duplicates of the many
Formatting[Number]Convertors. That feels like a mess ..... unless
someone can think of a better way of doing it .....

BTW, changing to icu4j immediately fixed the majority of formatting
problems between CForms and Dojo. The main category still not working
is languages (e.g. Arabic, Hindi) that use different characters for
number digits. There are also unexpected problems converting the
request params back to numbers again (hope to crack that today ... ).
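
To illustrate the digit problem (this is not the fix, just a sketch of
one possible pre-processing step), non-ASCII decimal digits in a request
parameter could be mapped to ASCII before parsing. The class and method
names are hypothetical; only java.lang.Character is used.

```java
// Hypothetical helper, not part of Cocoon or icu4j.
public class DigitNormalizer {

    /** Replaces every Unicode decimal digit with its ASCII equivalent. */
    static String toAsciiDigits(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            // Character.digit returns -1 for anything that is not a
            // decimal digit in radix 10, including letters.
            int value = Character.isDigit(cp) ? Character.digit(cp, 10) : -1;
            if (value >= 0) {
                out.append((char) ('0' + value));
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Arabic-Indic digits one, two, three.
        System.out.println(toAsciiDigits("\u0661\u0662\u0663"));
    }
}
```

This only touches the digits. Locale-specific decimal and grouping
separators are left as they are and still need a locale-aware parser.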

The problem icu4j is solving IMHO is more than Dojo compatibility.
The java.text.NumberFormat classes seem to have really stale data, and
switching JVM (or OS) would probably change the formats they output.

Number formats, currencies and currency symbols are cultural artefacts;
they change over time (countries change currency etc.), and IBM seem to
be more proactive in keeping their libraries up to date.

Any suggestions about a clean way to have both Java and ICU  
NumberFormat ?

Thanks

regards Jeremy


On 30 Jul 2008, at 16:35, Joerg Heinicke wrote:

> Jeremy Quinn <jeremy <at> apache.org> writes:
>
>> Currently, o.a.c.forms.datatype.convertor.FormattingDecimalConvertor
>> (the baseclass for all Number Formatting convertors), uses
>> java.text.DecimalFormat internally, without exposing the class to the
>> outside (except for one protected Method).
>>
>> If I were to re-implement FormattingDecimalConvertor using icu4j,
>> should I leave the old one alone and create a new
>> icu4jFormattingDecimalConvertor, or work with the original class?
>
> Please see [1].
>
>> If this solves the problem, this would be the only decimal convertor
>> that would work properly with Dojo, so it would seem pointless to
>> leave the old one around, leading to confusion .....
>
> But Dojo is not the only option. And considering the differences
> between icu4j and java.text people might want to have the option to
> switch. I don't know if Sylvain ever did what he wanted to do (last
> mail in mentioned thread).
>
> Joerg
>
> [1] http://marc.info/?t=110966545500001&r=1&w=4
>


Re: RFC: Using icu4j for number formatting

Posted by Joerg Heinicke <jo...@gmx.de>.
Jeremy Quinn <jeremy <at> apache.org> writes:

> Currently, o.a.c.forms.datatype.convertor.FormattingDecimalConvertor  
> (the baseclass for all Number Formatting convertors), uses  
> java.text.DecimalFormat internally, without exposing the class to the  
> outside (except for one protected Method).
> 
> If I were to re-implement FormattingDecimalConvertor using icu4j,  
> should I leave the old one alone and create a new  
> icu4jFormattingDecimalConvertor, or work with the original class?

Please see [1].

> If this solves the problem, this would be the only decimal convertor  
> that would work properly with Dojo, so it would seem pointless to  
> leave the old one around, leading to confusion .....

But Dojo is not the only option. And considering the differences between icu4j
and java.text people might want to have the option to switch. I don't know if
Sylvain ever did what he wanted to do (last mail in mentioned thread).
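
Purely as an illustration of what "the option to switch" could look
like without duplicating every convertor: a single small interface that
the convertors delegate to, with one implementation per library. All
names here are hypothetical and nothing like this exists in CForms as
far as this thread shows. Only the java.text backend is implemented so
that the sketch compiles without icu4j.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical design sketch, not existing Cocoon code.
public class SwitchableFormatting {

    /** The single seam a convertor would call instead of a concrete library. */
    interface DecimalFormatter {
        String format(Number value, Locale locale);
        Number parse(String text, Locale locale) throws ParseException;
    }

    /** Backend using the JVM's own java.text classes. */
    static final class JdkFormatter implements DecimalFormatter {
        public String format(Number value, Locale locale) {
            return NumberFormat.getNumberInstance(locale).format(value);
        }
        public Number parse(String text, Locale locale) throws ParseException {
            return NumberFormat.getNumberInstance(locale).parse(text);
        }
    }

    /** Picks a backend by name; an icu4j backend would be added here. */
    static DecimalFormatter select(String backend) {
        if ("jdk".equals(backend)) {
            return new JdkFormatter();
        }
        throw new IllegalArgumentException("unknown backend: " + backend);
    }

    public static void main(String[] args) {
        DecimalFormatter f = select("jdk");
        System.out.println(f.format(1234.5, Locale.US));
    }
}
```

The convertors would then stay as they are and only the backend choice
would be configurable.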

Joerg

[1] http://marc.info/?t=110966545500001&r=1&w=4