You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2011/04/01 19:05:05 UTC

[jira] [Commented] (HBASE-3562) ValueFilter is being evaluated before performing the column match

    [ https://issues.apache.org/jira/browse/HBASE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014748#comment-13014748 ] 

Jonathan Gray commented on HBASE-3562:
--------------------------------------

Thanks for looking into this Evert.  This is definitely some tricky stuff.

A few comments on your patch...

- Our convention in conditionals is to put the variable first.  I find it a little tricky to read the code when the constant is first.  For example:
{code}
if (MatchCode.INCLUDE == mc)
{code}
should be
{code}
if (mc == MatchCode.INCLUDE)
{code}
(And all the other places where you have this type of logic)

- The unit test {{TestColumnMatchAndFilterOrder}} is clever how you check correctness, but I think it would be good to actually do a read query and verify the results for a few different combinations of the query to prove correctness of the overall algorithm.  Other changes to SQM down the road might change more behavior / order of operations, so this test may no longer apply or give full coverage for correctness.  Having some tests which don't rely on the precise server-side interactions but rather confirm the end results will be more applicable as we move forward.

- You have some lines that are > 80 characters, especially in some of the javadoc.  Just wrap that so all lines are <= 80 chars.

- There was a comment in SQM that described why the filter was checked first.  Can you write some inline comments to describe how this works now?  There are a couple lines at the end but it will be useful to have some explanation on why this has changed and what the behavior is now.

- Is there any particular reason that you had includeLatestColumn take timestamp as a parameter?  The timestamp is passed in the check call, and we could just hang on to that.  It just feels a little strange to me since you should never pass a different timestamp, and the tracker can know which was the latest column.

Overall this is really solid!  Great work Evert!

> ValueFilter is being evaluated before performing the column match
> -----------------------------------------------------------------
>
>                 Key: HBASE-3562
>                 URL: https://issues.apache.org/jira/browse/HBASE-3562
>             Project: HBase
>          Issue Type: Bug
>          Components: filters
>    Affects Versions: 0.90.0
>            Reporter: Evert Arckens
>         Attachments: HBASE-3562.patch
>
>
> When performing a Get operation where a both a column is specified and a ValueFilter, the ValueFilter is evaluated before making the column match as is indicated in the javadoc of Get.setFilter()  : " {@link Filter#filterKeyValue(KeyValue)} is called AFTER all tests for ttl, column match, deletes and max versions have been run. "
> The is shown in the little test below, which uses a TestComparator extending a WritableByteArrayComparable.
> public void testFilter() throws Exception {
> 	byte[] cf = Bytes.toBytes("cf");
> 	byte[] row = Bytes.toBytes("row");
> 	byte[] col1 = Bytes.toBytes("col1");
> 	byte[] col2 = Bytes.toBytes("col2");
> 	Put put = new Put(row);
> 	put.add(cf, col1, new byte[]{(byte)1});
> 	put.add(cf, col2, new byte[]{(byte)2});
> 	table.put(put);
> 	Get get = new Get(row);
> 	get.addColumn(cf, col2); // We only want to retrieve col2
> 	TestComparator testComparator = new TestComparator();
> 	Filter filter = new ValueFilter(CompareOp.EQUAL, testComparator);
> 	get.setFilter(filter);
> 	Result result = table.get(get);
> }
> public class TestComparator extends WritableByteArrayComparable {
>     /**
>      * Nullary constructor, for Writable
>      */
>     public TestComparator() {
>         super();
>     }
>     
>     @Override
>     public int compareTo(byte[] theirValue) {
>         if (theirValue[0] == (byte)1) {
>             // If the column match was done before evaluating the filter, we should never get here.
>             throw new RuntimeException("I only expect (byte)2 in col2, not (byte)1 from col1");
>         }
>         if (theirValue[0] == (byte)2) {
>             return 0;
>         }
>         else return 1;
>     }
> }
> When only one column should be retrieved, this can be worked around by using a SingleColumnValueFilter instead of the ValueFilter.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira