Posted to issues@hbase.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2011/09/10 01:06:08 UTC
[jira] [Commented] (HBASE-4364) Column family pruning incorrectly
prunes CFs referred to by filters
[ https://issues.apache.org/jira/browse/HBASE-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101629#comment-13101629 ]
Todd Lipcon commented on HBASE-4364:
------------------------------------
Example shell code to reproduce this:
{noformat}
create 't1', 'f1', 'f2'
put 't1', 'r1', 'f1:word', 'hello'
put 't1', 'r1', 'f2:word', 'bonjour'
put 't1', 'r2', 'f1:word', 'goodbye'
put 't1', 'r2', 'f2:word', 'au revoir'
# scan whole table, has 2 rows, each with 2 cols
scan 't1'
# scan selecting only one column - returns 2 distinct rows
scan 't1', { COLUMNS => ['f1:word'] }
# scan with a predicate of the french word > 'b', returns 1 row
scan 't1', { FILTER => "SingleColumnValueFilter('f2', 'word', >, 'binary:b')" }
# scan with a predicate of the french word > 'b', selecting only the english word
scan 't1', { COLUMNS => ['f1:word'], FILTER => "SingleColumnValueFilter('f2', 'word', >, 'binary:b')" }
{noformat}
The output is as follows; only the last scan returns an incorrect result:
{noformat}
hbase(main):008:0> scan 't1'
ROW COLUMN+CELL
r1 column=f1:word, timestamp=1315608975212, value=hello
r1 column=f2:word, timestamp=1315608975238, value=bonjour
r2 column=f1:word, timestamp=1315608975258, value=goodbye
r2 column=f2:word, timestamp=1315608975286, value=au revoir
2 row(s) in 0.0270 seconds
hbase(main):009:0> scan 't1', { COLUMNS => ['f1:word'] }
ROW COLUMN+CELL
r1 column=f1:word, timestamp=1315608975212, value=hello
r2 column=f1:word, timestamp=1315608975258, value=goodbye
2 row(s) in 0.0140 seconds
hbase(main):010:0> scan 't1', { FILTER => "SingleColumnValueFilter('f2', 'word', >, 'binary:b')" }
ROW COLUMN+CELL
r1 column=f1:word, timestamp=1315608975212, value=hello
r1 column=f2:word, timestamp=1315608975238, value=bonjour
1 row(s) in 0.0250 seconds
hbase(main):011:0> scan 't1', { COLUMNS => ['f1:word'], FILTER => "SingleColumnValueFilter('f2', 'word', >, 'binary:b')" }
ROW COLUMN+CELL
r1 column=f1:word, timestamp=1315608975212, value=hello
r2 column=f1:word, timestamp=1315608975258, value=goodbye
2 row(s) in 0.0270 seconds <---- SHOULD NOT HAVE RETURNED ANY VALUE FOR r2!
{noformat}
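To make the failure mode concrete, here is a minimal toy model in Python. It is a sketch, not HBase code: the TABLE dict, the family() helper, and the scan() function with its prune_before_filter flag are all hypothetical names made up for illustration. It assumes the wrong result comes from dropping unselected column families before the filter is evaluated, and that a row missing the tested column passes the filter (which I believe matches SingleColumnValueFilter's default of filterIfMissing=false).

```python
# Toy model only: plain dicts stand in for an HBase table; nothing here is HBase API.
TABLE = {
    "r1": {"f1:word": "hello", "f2:word": "bonjour"},
    "r2": {"f1:word": "goodbye", "f2:word": "au revoir"},
}


def family(column):
    """Return the column family part of a 'family:qualifier' name."""
    return column.split(":", 1)[0]


def scan(table, columns, filter_column, threshold, prune_before_filter):
    """Return rows whose filter_column value is > threshold, projected onto columns.

    With prune_before_filter=True, families that are not selected are dropped
    before the filter runs, so the filter never sees the column it tests.
    """
    selected_families = {family(c) for c in columns}
    result = {}
    for row_key, cells in sorted(table.items()):
        visible = cells
        if prune_before_filter:
            visible = {c: v for c, v in cells.items()
                       if family(c) in selected_families}
        value = visible.get(filter_column)
        # Assumption: a row that lacks the tested column passes the filter.
        if value is not None and not value > threshold:
            continue
        # Project the surviving row onto the selected columns only.
        result[row_key] = {c: cells[c] for c in columns if c in cells}
    return result


# Same query as the last shell scan: select f1:word, filter on f2:word > 'b'.
buggy = scan(TABLE, ["f1:word"], "f2:word", "b", prune_before_filter=True)
expected = scan(TABLE, ["f1:word"], "f2:word", "b", prune_before_filter=False)
print("pruned before filtering:", buggy)
print("filtered before pruning:", expected)
```

In this model, pruning first returns both r1 and r2, which matches the incorrect shell output, while filtering first returns only r1.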
> Column family pruning incorrectly prunes CFs referred to by filters
> -------------------------------------------------------------------
>
> Key: HBASE-4364
> URL: https://issues.apache.org/jira/browse/HBASE-4364
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.4, 0.92.0
> Reporter: Todd Lipcon
> Priority: Critical
>
> If a scan selects a set of columns using addColumns() and then applies a SingleColumnValueFilter on a column that is not selected and belongs to a different column family, the filter condition is ignored.