Posted to dev@pig.apache.org by "Bill Graham (JIRA)" <ji...@apache.org> on 2011/07/19 00:20:57 UTC
[jira] [Commented] (PIG-2174) HBaseStorage column filters miss some fields
[ https://issues.apache.org/jira/browse/PIG-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067358#comment-13067358 ]
Bill Graham commented on PIG-2174:
----------------------------------
FYI, the HBase issue I mentioned was in fact a bug which has been fixed (HBASE-3550). The Pig bug is still valid though.
> HBaseStorage column filters miss some fields
> --------------------------------------------
>
> Key: PIG-2174
> URL: https://issues.apache.org/jira/browse/PIG-2174
> Project: Pig
> Issue Type: Bug
> Reporter: Bill Graham
> Assignee: Bill Graham
> Attachments: PIG-2174_1.patch
>
>
> When mixing static and dynamic column mappings, {{HBaseStorage}} sometimes fails to pick up the static column values and returns nulls instead. I believe this bug has been masked by HBase being too lenient in applying column filters (i.e. HBase returns more columns than it should).
> For example, this query returns nulls for the {{sc}} column, even when it contains data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage
> ('pig:sc pig:prefixed_col_*','-loadKey') AS
> (rowKey:chararray, sc:chararray, pig_cf_map:map[]);
> {noformat}
> What is very strange (about HBase) is that the same script returns values just fine if {{sc}} is replaced by {{col_a}}, assuming of course that both columns contain data:
> {noformat}
> a = LOAD 'hbase://pigtable_1' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage
> ('pig:col_a pig:prefixed_col_*','-loadKey') AS
> (rowKey:chararray, col_a:chararray, pig_cf_map:map[]);
> {noformat}
> Potential HBase issues aside, I think there is a bug in the logic on the Pig side. Patch to follow.
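
To make the expected behavior in the description concrete, here is a minimal Python sketch. It is not Pig or HBase code and does not reproduce the attached patch. It only models what a column spec such as 'pig:sc pig:prefixed_col_*' should yield for one row, and how a scan that kept only prefix matches would turn the static column into a null. All function names are hypothetical, and the prefix-only filter is an assumed failure mode chosen for illustration, not the confirmed cause of the bug.

```python
def parse_column_spec(spec):
    """Split a spec like 'pig:sc pig:prefixed_col_*' into (family, name, is_prefix) entries."""
    columns = []
    for token in spec.split():
        family, _, qualifier = token.partition(":")
        is_prefix = qualifier.endswith("*")
        name = qualifier[:-1] if is_prefix else qualifier
        columns.append((family, name, is_prefix))
    return columns


def expected_tuple(row_key, cells, spec):
    """Build the tuple a load with -loadKey should produce.

    cells maps (family, qualifier) -> value for a single row.
    A static column becomes one field (None models a Pig null);
    a prefixed column becomes a map of every matching qualifier.
    """
    fields = [row_key]
    for family, name, is_prefix in parse_column_spec(spec):
        if is_prefix:
            fields.append({q: v for (f, q), v in cells.items()
                           if f == family and q.startswith(name)})
        else:
            fields.append(cells.get((family, name)))
    return tuple(fields)


def prefix_only_filter(cells, spec):
    """Hypothetical faulty filter: keeps only cells that match a prefixed column."""
    prefixes = [(f, n) for f, n, is_prefix in parse_column_spec(spec) if is_prefix]
    return {(f, q): v for (f, q), v in cells.items()
            if any(f == pf and q.startswith(pn) for pf, pn in prefixes)}


# Example row with one static column, two prefixed columns and one unrelated column.
spec = "pig:sc pig:prefixed_col_*"
row = {
    ("pig", "sc"): "static value",
    ("pig", "prefixed_col_1"): "a",
    ("pig", "prefixed_col_2"): "b",
    ("pig", "other"): "x",
}

correct = expected_tuple("row1", row, spec)
broken = expected_tuple("row1", prefix_only_filter(row, spec), spec)
print(correct)
print(broken)
```

In this model the second tuple carries a null in the sc position even though the row holds data for it, which matches the symptom reported above.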