Posted to common-dev@hadoop.apache.org by "Gordon Sommers (JIRA)" <ji...@apache.org> on 2010/07/27 16:05:16 UTC
[jira] Created: (HADOOP-6883) Text.toString violates its abstraction
Text.toString violates its abstraction
--------------------------------------
Key: HADOOP-6883
URL: https://issues.apache.org/jira/browse/HADOOP-6883
Project: Hadoop Common
Issue Type: Bug
Components: io
Affects Versions: 0.20.1
Environment: Linux
Reporter: Gordon Sommers
I stumbled upon this when encoding a Google protocol buffer in base64 and storing it in a Text object for serialization. Compare the following two lines:
byte[] decoded = b64.decode(val.getBytes());
// This does not return the same bytes as the call below. The base64 decodes successfully, but the result is a very mangled protocol buffer.
byte[] decoded = b64.decode(val.toString().getBytes());
// YES, toString() FIXES IT
Elsewhere in my code I also have:
Text curline = new Text(values.next().toString());
byte[] raw = base64.decode(curline.getBytes());
// This does work.
It looks like the Text object must be toString'd (just once, somewhere, even if it's later repacked in a Text) before it has the proper byte representation. I would classify this as a leaky abstraction. Please isolate the cause and fix the API so that other developers don't have to spend 3 days figuring out why Text.getBytes() isn't returning the right bytes even though Text.toString() prints exactly the right string representation and Text.toString().getBytes() does return the right bytes.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HADOOP-6883) Text.toString violates its abstraction
Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-6883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley resolved HADOOP-6883.
-----------------------------------
Resolution: Invalid
The proper call is:
{code}
b64.decode(val.getBytes(), 0, val.getLength());
{code}
Yes, it is confusing, but doing anything else would not perform acceptably. The javadoc for getBytes() explains why your call fails: it returns the raw backing array, and only the data up to getLength() is valid.
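
For readers who want to see the effect in isolation, here is a minimal sketch of how a reused buffer can produce exactly this symptom. It is an illustration under stated assumptions, not Hadoop's code: ReusableText is a hypothetical, simplified stand-in for Text, java.util.Base64 stands in for whatever b64 is in the report, and the sample strings are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class ReusedBufferDemo {

    // Hypothetical, simplified stand-in for a reusable byte container.
    // It is NOT Hadoop's Text; it only mimics the "raw buffer plus valid length" contract.
    static final class ReusableText {
        private byte[] bytes = new byte[0];
        private int length = 0;

        void set(String s) {
            byte[] encoded = s.getBytes(StandardCharsets.UTF_8);
            if (bytes.length < encoded.length) {
                bytes = new byte[encoded.length]; // grow only when needed, never shrink
            }
            System.arraycopy(encoded, 0, bytes, 0, encoded.length);
            length = encoded.length; // bytes past this index keep their old contents
        }

        byte[] getBytes() { return bytes; } // raw backing array, may be longer than length
        int getLength() { return length; }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    // Decodes the same object three ways: whole buffer, valid range only, via toString().
    static String[] demo() {
        ReusableText val = new ReusableText();
        val.set("QUJDREVGR0g="); // base64 of "ABCDEFGH", an earlier, longer record
        val.set("WFla");         // base64 of "XYZ", a shorter record in the same object

        Base64.Decoder b64 = Base64.getDecoder();
        byte[] whole = b64.decode(val.getBytes());
        byte[] valid = b64.decode(Arrays.copyOf(val.getBytes(), val.getLength()));
        byte[] viaString = b64.decode(val.toString().getBytes(StandardCharsets.UTF_8));

        return new String[] {
            new String(whole, StandardCharsets.UTF_8),
            new String(valid, StandardCharsets.UTF_8),
            new String(viaString, StandardCharsets.UTF_8)
        };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("whole buffer: " + r[0]); // XYZDEFGH (stale tail decoded too)
        System.out.println("valid range:  " + r[1]); // XYZ
        System.out.println("via toString: " + r[2]); // XYZ
    }
}
```

In this sketch the whole-buffer decode still succeeds, because the stale tail happens to be valid base64, but the payload carries leftover bytes from the earlier record. That matches the "decodes successfully, but mangled" behavior described in the report. Going through toString() works only because it respects the valid length, at the cost of an extra copy.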