Posted to dev@avro.apache.org by "James Aley (JIRA)" <ji...@apache.org> on 2015/10/02 17:37:26 UTC
[jira] [Commented] (AVRO-1493) Avoid the "Turkish Locale Problem"

    [ https://issues.apache.org/jira/browse/AVRO-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941273#comment-14941273 ] 

James Aley commented on AVRO-1493:
----------------------------------

We ship Avro to Android devices and use it for data serialisation. We have a lot of Turkish users, and have consequently run into a couple of issues relating to this. I thought perhaps it would be a good idea to mention the two ways we've seen this manifest itself so far:

* Schema fingerprint generation behaves differently (it produces different values for the same schemata) when the locale is set to Turkish.
* The "order: ignore" annotation causes schema parsing to fail: the Turkish dotted capital İ ends up in the word "ignore" when the generated Java code looks up the enum constant dynamically at runtime using Enum.valueOf().
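
To illustrate the second point, here is a minimal sketch of how an upper-cased name can miss its enum constant under a Turkish locale. The Order enum and both helper methods are hypothetical stand-ins written for this example; they are not Avro's actual code, and the real failing call site may differ.

```java
import java.util.Locale;

public class TurkishLocaleDemo {
    // Hypothetical stand-in for an enum of sort orders; not Avro's own type.
    enum Order { ASCENDING, DESCENDING, IGNORE }

    // Upper-cases with the given locale before the lookup. With a Turkish
    // locale, 'i' maps to U+0130 (dotted capital I), so no constant matches.
    static Order parseWithLocale(String name, Locale locale) {
        return Order.valueOf(name.toUpperCase(locale));
    }

    // Upper-cases with Locale.ROOT, so the result does not depend on the
    // device's default locale.
    static Order parseSafely(String name) {
        return Order.valueOf(name.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // First character of the upper-cased word, shown as a code point
        // to avoid console encoding issues.
        String upper = "ignore".toUpperCase(turkish);
        System.out.printf("U+%04X%n", (int) upper.charAt(0)); // U+0130

        try {
            parseWithLocale("ignore", turkish);
        } catch (IllegalArgumentException e) {
            System.out.println("lookup failed under tr-TR");
        }

        System.out.println(parseSafely("ignore")); // IGNORE
    }
}
```

Passing an explicit locale such as Locale.ROOT to toUpperCase() is the usual way to keep this kind of lookup independent of the user's settings.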



> Avoid the "Turkish Locale Problem"
> ----------------------------------
>
>                 Key: AVRO-1493
>                 URL: https://issues.apache.org/jira/browse/AVRO-1493
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.6
>         Environment: Hadoop trunk build error on mac-os with turkish locale.
>            Reporter: Serkan Taş
>             Fix For: 1.7.8
>
>
> Locale-dependent String.toUpperCase() and String.toLowerCase() cause unexpected behavior if the locale is Turkish.
> Not sure about String.equalsIgnoreCase(..).
> Here is the error:
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-common: Compilation failure
> [ERROR] /Users/serkan/programlar/dev/hadooptest/hadoop-trunk/hadoop-common-project/hadoop-common/target/generated-test-sources/java/org/apache/hadoop/io/serializer/avro/AvroRecord.java:[10,244] unmappable character for encoding UTF-8
> [ERROR] -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hadoop-common: Compilation failure
> /Users/serkan/programlar/dev/hadooptest/hadoop-trunk/hadoop-common-project/hadoop-common/target/generated-test-sources/java/org/apache/hadoop/io/serializer/avro/AvroRecord.java:[10,244] unmappable character for encoding UTF-8
> When I checked the generated code, I found the reason for the error:
>  public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse("{\"type\":\"record\",\"name\":\"AvroRecord\",\"namespace\":\"org.apache.hadoop.io.serializer.avro\",\"fields\":[{\"name\":\"intField\",\"type\":\"Ýnt\"}]}");
> In the code generated from the schema, locale-dependent capitalization turns the letter "i" into "Ý"; the same should apply to "I" becoming "ı".
> The same bug exists in OPENEJB-1071, OAK-260, IBATIS-218.
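
The capitalization behaviour described in the issue can be reproduced with a short sketch. The capitalize() helper below is hypothetical and only mimics what a code generator might do; it is not taken from Avro. On the open question about String.equalsIgnoreCase(): its Javadoc states that it does not take locale into account, so it should not be affected.

```java
import java.util.Locale;

public class CaseMappingDemo {
    // Hypothetical helper: upper-cases the first character of a name
    // using the supplied locale.
    static String capitalize(String s, Locale locale) {
        if (s.isEmpty()) {
            return s;
        }
        return s.substring(0, 1).toUpperCase(locale) + s.substring(1);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // Under Locale.ROOT, "int" becomes "Int" as expected.
        System.out.println(capitalize("int", Locale.ROOT)); // Int

        // Under a Turkish locale the first character is U+0130, not 'I'.
        String tr = capitalize("int", turkish);
        System.out.printf("U+%04X%n", (int) tr.charAt(0)); // U+0130

        // The reverse direction: 'I' lower-cases to dotless U+0131.
        String lower = "I".toLowerCase(turkish);
        System.out.printf("U+%04X%n", (int) lower.charAt(0)); // U+0131

        // equalsIgnoreCase() compares per character without using a locale.
        System.out.println("INT".equalsIgnoreCase("int")); // true
    }
}
```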



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)