You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2011/04/01 19:05:05 UTC
[jira] [Commented] (HBASE-3562) ValueFilter is being evaluated
before performing the column match
[ https://issues.apache.org/jira/browse/HBASE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014748#comment-13014748 ]
Jonathan Gray commented on HBASE-3562:
--------------------------------------
Thanks for looking into this Evert. This is definitely some tricky stuff.
A few comments on your patch...
- Our convention in conditionals is to put the variable first. I find it a little tricky to read the code when the constant is first. For example:
{code}
if (MatchCode.INCLUDE == mc)
{code}
should be
{code}
if (mc == MatchCode.INCLUDE)
{code}
(And all the other places where you have this type of logic)
- The unit test {{TestColumnMatchAndFilterOrder}} is clever how you check correctness, but I think it would be good to actually do a read query and verify the results for a few different combinations of the query to prove correctness of the overall algorithm. Other changes to SQM down the road might change more behavior / order of operations, so this test may no longer apply or give full coverage for correctness. Having some tests which don't rely on the precise server-side interactions but rather confirm the end results will be more applicable as we move forward.
- You have some lines that are > 80 characters, especially in some of the javadoc. Just wrap that so all lines are <= 80 chars.
- There was a comment in SQM that described why the filter was checked first. Can you write some inline comments to describe how this works now? There are a couple lines at the end but it will be useful to have some explanation on why this has changed and what the behavior is now.
- Is there any particular reason that you had includeLatestColumn take timestamp as a parameter? The timestamp is passed in the check call, and we could just hang on to that. It just feels a little strange to me since you should never pass a different timestamp, and the tracker can know which was the latest column.
Overall this is really solid! Great work Evert!
> ValueFilter is being evaluated before performing the column match
> -----------------------------------------------------------------
>
> Key: HBASE-3562
> URL: https://issues.apache.org/jira/browse/HBASE-3562
> Project: HBase
> Issue Type: Bug
> Components: filters
> Affects Versions: 0.90.0
> Reporter: Evert Arckens
> Attachments: HBASE-3562.patch
>
>
> When performing a Get operation where a both a column is specified and a ValueFilter, the ValueFilter is evaluated before making the column match as is indicated in the javadoc of Get.setFilter() : " {@link Filter#filterKeyValue(KeyValue)} is called AFTER all tests for ttl, column match, deletes and max versions have been run. "
> The is shown in the little test below, which uses a TestComparator extending a WritableByteArrayComparable.
> public void testFilter() throws Exception {
> byte[] cf = Bytes.toBytes("cf");
> byte[] row = Bytes.toBytes("row");
> byte[] col1 = Bytes.toBytes("col1");
> byte[] col2 = Bytes.toBytes("col2");
> Put put = new Put(row);
> put.add(cf, col1, new byte[]{(byte)1});
> put.add(cf, col2, new byte[]{(byte)2});
> table.put(put);
> Get get = new Get(row);
> get.addColumn(cf, col2); // We only want to retrieve col2
> TestComparator testComparator = new TestComparator();
> Filter filter = new ValueFilter(CompareOp.EQUAL, testComparator);
> get.setFilter(filter);
> Result result = table.get(get);
> }
> public class TestComparator extends WritableByteArrayComparable {
> /**
> * Nullary constructor, for Writable
> */
> public TestComparator() {
> super();
> }
>
> @Override
> public int compareTo(byte[] theirValue) {
> if (theirValue[0] == (byte)1) {
> // If the column match was done before evaluating the filter, we should never get here.
> throw new RuntimeException("I only expect (byte)2 in col2, not (byte)1 from col1");
> }
> if (theirValue[0] == (byte)2) {
> return 0;
> }
> else return 1;
> }
> }
> When only one column should be retrieved, this can be worked around by using a SingleColumnValueFilter instead of the ValueFilter.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira