Posted to dev@avro.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2012/05/08 23:41:49 UTC
[jira] [Commented] (AVRO-1072) The JSON encoder doesn't handle non-ASCII character properly
[ https://issues.apache.org/jira/browse/AVRO-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270857#comment-13270857 ]
Doug Cutting commented on AVRO-1072:
------------------------------------
Avro's JsonEncoder.java specifies the UTF-8 encoding, so I don't see how this is happening.
Can you please provide a test that fails in your environment? Thanks!
> The JSON encoder doesn't handle non-ASCII character properly
> ------------------------------------------------------------
>
> Key: AVRO-1072
> URL: https://issues.apache.org/jira/browse/AVRO-1072
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.6.3
> Environment: uname -a
> Darwin zmac 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386
> java -version
> java version "1.6.0_29"
> Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-10M3527)
> Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)
> Reporter: Zhihong Zhang
>
> The JSON encoder uses the platform's default encoding. It should always use UTF-8.
> This causes several problems for us:
> 1. The text is mangled if the sending and receiving machines use different encodings.
> 2. Some encodings (like Latin-1 or MacRoman) can't represent all characters (like Chinese), so we get ? in the text.
> 3. The binary encoder (ByteBuffer) doesn't work because of this problem.
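
To make problems 1 and 2 in the report concrete, here is a minimal JDK-only sketch. It does not touch Avro or JsonEncoder and is not a reproduction of the bug itself; the class name EncodingDemo and its helper methods are hypothetical, chosen only for illustration. It relies on String.getBytes(Charset) substituting the charset's replacement byte ('?' for ISO-8859-1) for characters the charset cannot represent.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of Avro.
class EncodingDemo {

    // Encode text with the given charset, then decode with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    // Encode with one charset and decode with another, as happens when the
    // sending and receiving machines have different default encodings.
    static String transcode(String text, Charset writer, Charset reader) {
        return new String(text.getBytes(writer), reader);
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, a space, then two Chinese characters.
        String text = "caf\u00e9 \u4e2d\u6587";
        Charset utf8 = Charset.forName("UTF-8");
        Charset latin1 = Charset.forName("ISO-8859-1");

        // What an encoder picks up if it never names a charset explicitly.
        System.out.println("platform default: " + Charset.defaultCharset());

        // UTF-8 can represent every character, so the round trip is lossless.
        System.out.println("UTF-8 lossless: " + roundTrip(text, utf8).equals(text));

        // Latin-1 has no Chinese characters; each becomes '?' (problem 2).
        System.out.println("Latin-1 lossless: " + roundTrip(text, latin1).equals(text));

        // Writer and reader disagree on the charset, so text is mangled (problem 1).
        System.out.println("mismatch intact: " + transcode(text, utf8, latin1).equals(text));
    }
}
```

Here the Latin-1 round trip returns "caf\u00e9 ??": the accented e survives because Latin-1 contains it, while both Chinese characters are replaced. Naming the charset explicitly wherever bytes and characters are converted avoids depending on Charset.defaultCharset().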