Posted to dev@pig.apache.org by "Bill Graham (Commented) (JIRA)" <ji...@apache.org> on 2012/03/23 16:21:32 UTC

[jira] [Commented] (PIG-2611) HBaseStorage not casting correctly

    [ https://issues.apache.org/jira/browse/PIG-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236662#comment-13236662 ] 

Bill Graham commented on PIG-2611:
----------------------------------

I believe that declaring a long as an int does not make it an int; it needs to be cast to one. Do both approaches work if you write the output to a file using {{PigStorage}}?
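
As a rough illustration of why the declaration alone is not enough: Pig runs on the JVM, where a value that is a Long cannot be cast to Integer. The sketch below is not HBaseStorage code. The class and method names are hypothetical, and it assumes that the storage function casts each field according to the type declared in the schema.

```java
public class SchemaCastSketch {

    // Hypothetical stand-in for a storage function that trusts the declared
    // schema type (int) and casts the field value to match it.
    public static int readAsDeclaredInt(Object field) {
        // Throws ClassCastException when the runtime value is a Long.
        return (Integer) field;
    }

    // Hypothetical stand-in for an explicit (int) cast in the script:
    // the value itself is converted, not just relabeled.
    public static Object explicitIntCast(Object field) {
        return ((Number) field).intValue();
    }

    public static void main(String[] args) {
        // COUNT yields a long, so the tuple holds a Long at runtime.
        Object count = Long.valueOf(3L);

        try {
            readAsDeclaredInt(count);
            System.out.println("no exception");
        } catch (ClassCastException e) {
            System.out.println("ClassCastException without an explicit cast");
        }

        // After an explicit conversion the value matches the declared type.
        System.out.println(readAsDeclaredInt(explicitIntCast(count)));
    }
}
```

Under these assumptions the first call fails and the second prints 3, which matches the behavior described in the report: the schema only labels the field, while the explicit cast changes the value.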
                
> HBaseStorage not casting correctly
> ----------------------------------
>
>                 Key: PIG-2611
>                 URL: https://issues.apache.org/jira/browse/PIG-2611
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.2
>         Environment: Ubuntu 11.10, Hadoop 0.20.2, HBase 0.92.0
>            Reporter: David Arthur
>            Priority: Minor
>              Labels: cast, hbase
>
> When loading data into HBase with HBaseStorage, there is unexpected behavior regarding record schema and casting.
> Here is the relevant code snippet:
> {code}
> B = group A by (time_tuple, some_scalar);
> C = foreach B {
> 	-- UDF to generate id (bytearray)
> 	generate id, flatten(group.$0), COUNT(A);
> }
> {code}
> At this point the schema for C is unknown, so I declare a schema with a foreach statement:
> {code}
> D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int, $3 as date:int, $4 as count:int;
> {code}
> Even though I've declared C.$4 as an int, it is still a long (from the COUNT). When I go to insert into HBase I get a ClassCastException since the schema (int) does not match the actual tuple value (long). I can fix this by explicitly casting when I declare the schema.
> {code}
> D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int, $3 as date:int, (int)$4 as count:int;
> {code}
> Is this expected behavior? If not, is this an HBaseStorage issue - not honoring the schema before going off casting things?
> Cheers,
> David
