Posted to dev@harmony.apache.org by Tony Wu <wu...@gmail.com> on 2006/09/18 08:30:06 UTC

[classlib][luni]A difference between Unicode4.0 and Unicode4.1 affects our implementation of j.l.Character.

Hi all,
I encountered a problem while implementing the method isJavaIdentifierPart(int) in
j.l.Character. The character U+200B was reclassified[1] in Unicode 4.1, which
causes a test case[2] to fail.
Our implementation follows Unicode 4.1, whereas the RI follows
Unicode 4.0. Which one should we follow?
[1]
Unicode 4.0 200B;ZERO WIDTH SPACE;Zs;0;BN;;;;;N;;;;;
Unicode 4.1 200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;

[2]
assertFalse(Character.isJavaIdentifierPart(0x200B));
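
To see which behaviour a given runtime has, a small probe along these lines may help. It is only a sketch: the class name ZeroWidthSpaceCheck and the helper categoryName are made up for illustration, and the values printed depend on the Unicode version the running class library implements.

```java
public class ZeroWidthSpaceCheck {
    // U+200B ZERO WIDTH SPACE, the code point discussed in this thread.
    static final int ZWSP = 0x200B;

    /** Hypothetical helper: names the two general categories at issue here. */
    static String categoryName(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.SPACE_SEPARATOR:
                return "Zs"; // classification of U+200B in Unicode 4.0
            case Character.FORMAT:
                return "Cf"; // classification of U+200B in Unicode 4.1
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // No expected values are hard-coded: they vary with the Unicode data in use.
        System.out.println("category: " + categoryName(ZWSP));
        System.out.println("isIdentifierIgnorable: "
                + Character.isIdentifierIgnorable(ZWSP));
        System.out.println("isJavaIdentifierPart: "
                + Character.isJavaIdentifierPart(ZWSP));
    }
}
```

If the category comes out as Cf, the character counts as identifier-ignorable and isJavaIdentifierPart is expected to return true, which is why the assertion in [2] fails on a Unicode 4.1 implementation.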

-- 
Tony Wu
China Software Development Lab, IBM

Re: [classlib][luni]A difference between Unicode4.0 and Unicode4.1 affects our implementation of j.l.Character.

Posted by Tony Wu <wu...@gmail.com>.
I have raised this as a non-bug difference in JIRA:
https://issues.apache.org/jira/browse/HARMONY-1488



-- 
Tony Wu
China Software Development Lab, IBM

Re: [classlib][luni]A difference between Unicode4.0 and Unicode4.1 affects our implementation of j.l.Character.

Posted by Tim Ellison <t....@gmail.com>.
Looks like U+200B was modified/corrected from 'space separator' to
'format'.  Therefore I'd be inclined to follow the allowances in the
spec, i.e. modify the test to allow it, since it is an ignorable format
character.
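
One way the test could be relaxed so that it holds under either Unicode version is sketched below. This is an illustration under stated assumptions, not the actual Harmony test: the class IdentifierPartExpectation and its methods are hypothetical names, and the mapping relies on the documented rule that characters in the FORMAT category are identifier-ignorable and therefore valid identifier parts.

```java
public class IdentifierPartExpectation {
    /**
     * Hypothetical helper: the isJavaIdentifierPart result the spec implies
     * for a code point that is either a format character or a space separator.
     */
    static boolean expected(int codePoint) {
        int type = Character.getType(codePoint);
        if (type == Character.FORMAT) {
            return true;  // ignorable format character, allowed in identifiers
        }
        if (type == Character.SPACE_SEPARATOR) {
            return false; // spaces are never part of an identifier
        }
        throw new IllegalArgumentException(
                "not a format or space character: U+"
                + Integer.toHexString(codePoint).toUpperCase());
    }

    /** True when the runtime agrees with the category it reports itself. */
    static boolean matchesSpec(int codePoint) {
        return Character.isJavaIdentifierPart(codePoint) == expected(codePoint);
    }

    public static void main(String[] args) {
        System.out.println("U+200B consistent: " + matchesSpec(0x200B));
    }
}
```

A test written this way would assert matchesSpec(0x200B) instead of a fixed false, so it checks internal consistency rather than one particular Unicode version.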

Regards,
Tim


-- 

Tim Ellison (t.p.ellison@gmail.com)
IBM Java technology centre, UK.

---------------------------------------------------------------------
Terms of use : http://incubator.apache.org/harmony/mailing.html
To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
For additional commands, e-mail: harmony-dev-help@incubator.apache.org