Posted to oro-user@jakarta.apache.org by Todd Federman <to...@toddfederman.com> on 2002/05/07 02:00:13 UTC

processing non-english characters

The code below behaves correctly on Windows XP with ORO version 2.0.6: the
substitute method returns 0 and the string does not change, because the e
with the acute accent (é) in the input string does not match the e with the
grave accent (è) in the substitution pattern.

The same code on Solaris with 2.0.6 does perform one substitution, and the
resulting string is "abcdf". Is there a system library or something else I
need to upgrade so that ORO handles the special characters correctly? Or is
there a gap somewhere in my understanding? If I load the Java source code in
vim, I do see the two e's correctly, with their respective accents.

Thanks in advance for any help. I've browsed the archive and haven't seen
this discussed.

Todd

import org.apache.oro.text.perl.Perl5Util;

public class perltest {

    public static void main(String[] args) {

        Perl5Util util = new Perl5Util();

        String input = "abcdéf";
        StringBuffer result = new StringBuffer();

        int numSubs = util.substitute(result, "s#è##g", input);

        // prints 0 on Windows, 1 on Solaris
        System.out.println("num: " + numSubs);
        System.out.println(input);
        // prints the input string on Windows, "abcdf" on Solaris
        System.out.println(result);
    }
}
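
A possible explanation, offered as an assumption and not verified on either
machine: if the source file is saved as ISO-8859-1 but compiled on Solaris
under a default encoding such as US-ASCII, the bytes for both accented
letters may be decoded to the same replacement character, so the pattern and
the input would end up containing an identical character. The sketch below
uses only the standard library (no ORO) to show that effect on the two bytes
in question. The class name EncodingCheck and the decode helper are
hypothetical, and StandardCharsets needs a newer JDK than the one in use
here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Decode a single byte under the given charset. Bytes the charset cannot
    // map are replaced with the charset's default replacement string.
    static String decode(int b, Charset charset) {
        return new String(new byte[] { (byte) b }, charset);
    }

    public static void main(String[] args) {
        // In an ISO-8859-1 file, e-acute is byte 0xE9 and e-grave is byte 0xE8.
        String acuteLatin1 = decode(0xE9, StandardCharsets.ISO_8859_1);
        String graveLatin1 = decode(0xE8, StandardCharsets.ISO_8859_1);
        String acuteAscii = decode(0xE9, StandardCharsets.US_ASCII);
        String graveAscii = decode(0xE8, StandardCharsets.US_ASCII);

        // The platform default is what the compiler may fall back to
        // when no encoding is given explicitly.
        System.out.println("default encoding: " + System.getProperty("file.encoding"));
        System.out.println("ISO-8859-1 equal: " + acuteLatin1.equals(graveLatin1));
        System.out.println("US-ASCII equal:   " + acuteAscii.equals(graveAscii));
    }
}
```

If the two platforms report different default encodings, compiling with an
explicit javac -encoding ISO-8859-1 option, or writing the letters as the
Unicode escapes \u00e9 and \u00e8 in the source, should make the behaviour
independent of the platform.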

