Posted to dev@avro.apache.org by "Jeff Hodges (JIRA)" <ji...@apache.org> on 2010/02/04 00:27:28 UTC
[jira] Created: (AVRO-393) byte[] constructor for Utf8 desired
byte[] constructor for Utf8 desired
-----------------------------------
Key: AVRO-393
URL: https://issues.apache.org/jira/browse/AVRO-393
Project: Avro
Issue Type: New Feature
Reporter: Jeff Hodges
Priority: Minor
Attachments: byte_utf8.patch
We've come across a few use cases where we know that a given byte array is properly Utf8 encoded, but Utf8 has no constructor to take it. Instead, we have to turn it into a String first just to have it swapped back. This is sucky.
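To make the redundant round trip concrete, here is a minimal sketch that uses only the JDK. It does not touch Avro; the class name RoundTripSketch and the method viaString are hypothetical and exist only for illustration. It decodes already-encoded bytes into a String and immediately re-encodes them, which is the extra work a byte[] constructor would avoid.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of Avro.
final class RoundTripSketch {
  // Decode UTF-8 bytes to a String, then encode back to UTF-8 bytes.
  // For well-formed UTF-8 input the result equals the input, so the
  // intermediate String is pure overhead.
  static byte[] viaString(byte[] utf8Bytes) {
    String decoded = new String(utf8Bytes, StandardCharsets.UTF_8);
    return decoded.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] original = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
    byte[] roundTripped = viaString(original);
    // Compare the bytes before and after the round trip.
    System.out.println(Arrays.equals(original, roundTripped));
  }
}
```

Each call allocates a String and a second byte array only to arrive back at the same bytes.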
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (AVRO-393) byte[] constructor for Utf8 desired
Posted by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829362#action_12829362 ]
Thiruvalluvan M. G. commented on AVRO-393:
------------------------------------------
While we're at it, it would be useful to add another constructor that takes a sub-array:
Utf8(byte[] bytes, int start, int len);
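As a rough sketch of what such a pair of constructors could look like, the class below is a hypothetical stand-in written against the JDK only. Utf8Sketch, its field, and its accessors are assumptions made for illustration; this is not Avro's Utf8 and not the contents of the attached patch. It copies the requested range and, like the proposal, trusts the caller that the bytes are valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for illustration; this is NOT Avro's Utf8 class.
final class Utf8Sketch {
  private final byte[] bytes;

  // Copies the whole array, which the caller asserts is valid UTF-8.
  Utf8Sketch(byte[] bytes) {
    this(bytes, 0, bytes.length);
  }

  // Copies the sub-array [start, start + len) after checking bounds.
  Utf8Sketch(byte[] bytes, int start, int len) {
    if (start < 0 || len < 0 || start > bytes.length - len) {
      throw new IndexOutOfBoundsException("start=" + start + ", len=" + len);
    }
    this.bytes = Arrays.copyOfRange(bytes, start, start + len);
  }

  // Number of encoded bytes held, not the number of characters.
  int getByteLength() {
    return bytes.length;
  }

  @Override
  public String toString() {
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

Copying keeps the sketch safe if the caller later reuses the buffer; sharing the array instead would save the copy at the cost of aliasing. Which trade-off the real class should make is a separate design decision.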
[jira] Updated: (AVRO-393) byte[] constructor for Utf8 desired
Posted by "Jeff Hodges (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Hodges updated AVRO-393:
-----------------------------
Attachment: byte_utf8.patch
Here's a patch that adds a byte[] constructor to Utf8 and adds the beginnings of a TestUtf8 class.
[jira] Updated: (AVRO-393) byte[] constructor for Utf8 desired
Posted by "Jeff Hodges (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Hodges updated AVRO-393:
-----------------------------
Assignee: Jeff Hodges
Status: Patch Available (was: Open)
[jira] Updated: (AVRO-393) byte[] constructor for Utf8 desired
Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Doug Cutting updated AVRO-393:
------------------------------
Resolution: Fixed
Fix Version/s: 1.3.0
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
I just committed this. I made two minor changes:
- In the test, I specified "UTF-8", since String#getBytes() otherwise uses the platform's default encoding.
- I also used 2-space-per-level indentation.
Thanks, Jeff!
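For readers wondering why the explicit encoding in the test matters, here is a small JDK-only sketch. The class CharsetSketch and its method are hypothetical names. It uses StandardCharsets.UTF_8 rather than the "UTF-8" string mentioned above; that constant needs Java 7 or later and is an assumption of this example, not what the commit used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of Avro or its tests.
final class CharsetSketch {
  // Always encodes as UTF-8, regardless of platform settings.
  static byte[] utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    String s = "caf\u00e9";
    byte[] platform = s.getBytes(); // uses Charset.defaultCharset()
    byte[] explicit = utf8Bytes(s);
    System.out.println("default charset: " + Charset.defaultCharset());
    // The two arrays match only when the default charset encodes s as UTF-8 does.
    System.out.println("same bytes: " + Arrays.equals(platform, explicit));
  }
}
```

A test that relies on the no-argument getBytes() can pass on one machine and fail on another, so naming the charset keeps the expected bytes stable.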
[jira] Commented: (AVRO-393) byte[] constructor for Utf8 desired
Posted by "Kevin Oliver (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/AVRO-393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829319#action_12829319 ]
Kevin Oliver commented on AVRO-393:
-----------------------------------
While it's not obvious, the workaround is to use something like this:
{code}
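// myBytes is assumed to already hold valid UTF-8.
byte[] myBytes = ...;
Utf8 utf8 = new Utf8();
// Size the internal buffer, then copy the encoded bytes into it.
utf8.setLength(myBytes.length);
System.arraycopy(myBytes, 0, utf8.getBytes(), 0, myBytes.length);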
{code}
That said, I agree that a Utf8(byte[]) constructor would be useful.