Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2008/07/15 17:41:32 UTC

[jira] Commented: (PIG-312) Casting a byte array that contains a double value to an int results in a null pointer

    [ https://issues.apache.org/jira/browse/PIG-312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613639#action_12613639 ] 

Alan Gates commented on PIG-312:
--------------------------------

Rather than having (int)gpa return 0, we could change the cast to keep just the integer portion of the double, dropping the fractional part.  This seems better.
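
To make the proposed semantics concrete, here is a minimal Java sketch of a bytearray-to-int cast that keeps the integer portion of a double.  It is not Pig's actual cast code: the class name ByteArrayCastSketch and the method toInt are hypothetical.  It also assumes that the bytes hold UTF-8 text and that non-numeric text yields null, neither of which is settled in this issue.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Pig's real cast implementation.
final class ByteArrayCastSketch {

    // Casts a byte array holding numeric text to an Integer.
    // Returns null when the input is null or (by assumption) not numeric.
    static Integer toInt(byte[] raw) {
        if (raw == null) {
            return null; // a null underlying value stays null
        }
        // Assumption: the bytes are UTF-8 encoded text such as "21" or "3.9".
        String text = new String(raw, StandardCharsets.UTF_8).trim();
        try {
            // Fast path for the common case where the text is already an integer.
            return Integer.valueOf(text);
        } catch (NumberFormatException notAnInt) {
            // Not an integer literal; try it as a double below.
        }
        try {
            // Java's narrowing cast drops the fractional part (truncates toward zero).
            return (int) Double.parseDouble(text);
        } catch (NumberFormatException notANumber) {
            return null; // assumption: non-numeric text maps to null
        }
    }

    public static void main(String[] args) {
        System.out.println(toInt("3.9".getBytes(StandardCharsets.UTF_8))); // prints 3
        System.out.println(toInt("21".getBytes(StandardCharsets.UTF_8)));  // prints 21
    }
}
```

Trying the integer parse first keeps the common case cheap, so the double parse is only paid for values that actually contain a fraction or exponent.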

> Casting a byte array that contains a double value to an int results in a null pointer
> -------------------------------------------------------------------------------------
>
>                 Key: PIG-312
>                 URL: https://issues.apache.org/jira/browse/PIG-312
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Alan Gates
>             Fix For: types_branch
>
>
> {code}
> a = load 'myfile' as (name, age, gpa);
> c = foreach a generate age * 10, (int)gpa * 2;
> store c into 'outfile';
> {code}
> The values in gpa are doubles.  The issue is that they are read as byte arrays, and when the user casts them to an int, the system casts directly from byte array to int, which results in a null.  First of all, the cast should result in a zero, not a null (unless the underlying value is null).  Second, we have to clarify the semantics here.  gpa was never officially declared to be a double, so casting directly from bytearray to int is a reasonable thing for the system to do.  But users may not see it that way.  Do we want to always cast numbers to double first and then to the requested type to avoid this?  Or should we force users to write this as (int)(double)gpa * 2 so we know to cast first to double and then to int?  In the interest of speed (especially considering the rarity of doubles in most data) I'd vote for the latter.
