Posted to dev@avro.apache.org by "Rob Turner (JIRA)" <ji...@apache.org> on 2013/11/30 01:05:35 UTC

[jira] [Updated] (AVRO-1348) Improve Utf8 to String conversion

     [ https://issues.apache.org/jira/browse/AVRO-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rob Turner updated AVRO-1348:
-----------------------------

    Attachment: AVRO-1348v2.patch

Surprisingly, passing the character set name string "UTF-8" performs about 15% better (in my tests) than passing a single shared Charset instance. This is because the former caches the StringDecoder/StringEncoder in a ThreadLocal, whereas the latter creates a new instance on each call.
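
For reference, here is a minimal sketch of the two call styles being compared. The class and method names (DecodeVariants, decodeByName, decodeByCharset) are made up for illustration and are not taken from the patch; only the two String constructors are standard JDK API. The sketch does not measure anything, and the relative performance may differ between JDK versions.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only (not part of the patch).
class DecodeVariants {
    // A single shared Charset instance, looked up once.
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Variant 1: pass the charset *name*. This constructor declares the
    // checked UnsupportedEncodingException.
    static String decodeByName(byte[] bytes, int length)
            throws UnsupportedEncodingException {
        return new String(bytes, 0, length, "UTF-8");
    }

    // Variant 2: pass the Charset *instance*. No checked exception.
    static String decodeByCharset(byte[] bytes, int length) {
        return new String(bytes, 0, length, UTF8);
    }
}
```

Both variants return the same String for the same input; the difference discussed here is only in how the JDK obtains the decoder internally.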

In patch v2 I handle the UnsupportedEncodingException by throwing an unchecked java.nio.charset.UnsupportedCharsetException in the same way as Charset.forName.
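
A rough sketch of that exception handling, under the assumption that the conversion goes through the String constructor that takes a charset name. The class and method names (Utf8Decoding, decode) are hypothetical and the actual patch may be structured differently.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper class, for illustration only (not the patch itself).
class Utf8Decoding {
    // Decodes the first `length` bytes using the named charset. The checked
    // UnsupportedEncodingException is converted to the unchecked
    // UnsupportedCharsetException, which is also what Charset.forName throws
    // for an unsupported charset name.
    static String decode(byte[] bytes, int length, String charsetName) {
        try {
            return new String(bytes, 0, length, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new UnsupportedCharsetException(charsetName);
        }
    }
}
```

Callers then do not need a try/catch or a throws clause. For "UTF-8" the exception should never occur in practice, since every Java platform implementation is required to support that charset.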

Please review the patch and let me know what you think.

> Improve Utf8 to String conversion
> ---------------------------------
>
>                 Key: AVRO-1348
>                 URL: https://issues.apache.org/jira/browse/AVRO-1348
>             Project: Avro
>          Issue Type: Bug
>            Reporter: Mark Wagner
>            Assignee: Mohammad Kamrul Islam
>         Attachments: AVRO-1348v2.patch, AVRO1348v1.patch
>
>
> AVRO-1241 found that the existing method of creating Strings from Utf8 byte arrays could be made faster. The same method is used in Utf8.toString(), which could likely be sped up in the same way.


