Posted to user@hbase.apache.org by Lucas Bernardi <lu...@gmail.com> on 2012/08/14 23:51:02 UTC

Filtering values and Get.addColumn

Hello there, I'm struggling with a situation here.
I want to apply a value filter to a specific set of columns, like this:

Get get = new Get(keyBytes);
get.addColumn(Bytes.toBytes("test"), Bytes.toBytes(4));
get.addColumn(Bytes.toBytes("test"), Bytes.toBytes(5));
get.addColumn(Bytes.toBytes("test"), Bytes.toBytes(6));
get.setFilter(new MyValueFilter());

It looks like the filter is applied not only to the requested columns
(4, 5, 6) but also to the first column in the row (0 in my case).

This is bad for my case, because MyValueFilter skips the entire row as
soon as it finds an odd value. So if the first column has an odd value and
columns 4, 5, and 6 all have even values, the row is skipped even though
every requested column passes the filter.
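
To make the reported behaviour concrete, here is a minimal plain-Java model of it. It does not use HBase at all and says nothing about how HBase is implemented internally: the class FilterOrderSketch, its methods, and the sample values are all hypothetical, and the "filter" is just a predicate whose rejection drops the whole row (standing in for a NEXT_ROW-style return code).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.IntPredicate;

// Hypothetical model, not HBase code: a row is a sorted map of
// column number -> value, and a rejecting filter drops the whole row.
public class FilterOrderSketch {

    // Sample row from the description: column 0 is odd, columns 4-6 are even.
    static TreeMap<Integer, Integer> sampleRow() {
        TreeMap<Integer, Integer> row = new TreeMap<>();
        row.put(0, 1);
        row.put(4, 2);
        row.put(5, 4);
        row.put(6, 6);
        return row;
    }

    // filterSeesFirstColumn = true models the observed behaviour (the filter
    // is also evaluated on the first column of the row); false models the
    // expected behaviour (only requested columns reach the filter).
    static Map<Integer, Integer> get(TreeMap<Integer, Integer> row,
                                     Set<Integer> wanted,
                                     IntPredicate accept,
                                     boolean filterSeesFirstColumn) {
        Map<Integer, Integer> out = new TreeMap<>();
        if (row.isEmpty()) {
            return out;
        }
        int first = row.firstKey();
        for (Map.Entry<Integer, Integer> cell : row.entrySet()) {
            boolean requested = wanted.contains(cell.getKey());
            boolean seenByFilter =
                requested || (filterSeesFirstColumn && cell.getKey() == first);
            if (seenByFilter && !accept.test(cell.getValue())) {
                return new TreeMap<>(); // row-level rejection: nothing returned
            }
            if (requested) {
                out.put(cell.getKey(), cell.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<Integer> wanted = new HashSet<>(Arrays.asList(4, 5, 6));
        IntPredicate evenOnly = v -> v % 2 == 0;
        System.out.println(get(sampleRow(), wanted, evenOnly, true));  // {}
        System.out.println(get(sampleRow(), wanted, evenOnly, false)); // {4=2, 5=4, 6=6}
    }
}
```

The first call corresponds to what I am seeing (an empty result), the second to what I expected to get back.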

Is this a bug, or is this how it is supposed to work?
Are there any workarounds?

Thanks!
Lucas Bernardi

Re: Filtering values and Get.addColumn

Posted by Alex Baranau <al...@gmail.com>.
Indeed. I wrote a simple unit test [1] and it fails.

There is also a JIRA issue for this:
https://issues.apache.org/jira/browse/HBASE-4364. I attached a patch
containing the failing unit test to it.

Alex Baranau
------
Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
Solr

[1]

  @Test
  public void testNonSelectedColumnsAreSkippedFromFilterApplying() throws IOException {
    HTable hTable = TEST_UTIL.createTable(Bytes.toBytes("t"),
            new byte[][] {Bytes.toBytes("cf")});

    Put put = new Put(new byte[] {1});
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("3"), Bytes.toBytes("abc"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("4"), Bytes.toBytes("abcd"));
    hTable.put(put);

    put = new Put(new byte[] {2});
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("3"), Bytes.toBytes("a"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("4"), Bytes.toBytes("ab"));
    hTable.put(put);

    put = new Put(new byte[] {3});
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("1"), Bytes.toBytes("a"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("2"), Bytes.toBytes("ab"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("3"), Bytes.toBytes("abc"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("4"), Bytes.toBytes("abcd"));
    hTable.put(put);

    hTable.flushCommits();

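    // Row 1, no column selection: both values are longer than 2 bytes,
    // so the row is returned.
    Get get = new Get(new byte[] {1});
    get.setFilter(new ValueHasMoreThan2BytesFilter());
    Result res = hTable.get(get);
    Assert.assertTrue(res != null && !res.isEmpty());

    // Row 2, columns 3 and 4 selected: the selected values are too short,
    // so the filter skips the row.
    get = new Get(new byte[] {2});
    get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("3"));
    get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("4"));
    get.setFilter(new ValueHasMoreThan2BytesFilter());
    res = hTable.get(get);
    Assert.assertTrue(res == null || res.isEmpty());

    // Row 3, columns 3 and 4 selected: both selected values pass the filter.
    // Columns 1 and 2 would not pass, but they are not selected, so the row
    // should still be returned.
    get = new Get(new byte[] {3});
    get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("3"));
    get.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("4"));
    get.setFilter(new ValueHasMoreThan2BytesFilter());
    res = hTable.get(get);
    Assert.assertTrue(res != null && !res.isEmpty()); // <<<<< FAILS HERE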
  }

  private static class ValueHasMoreThan2BytesFilter extends FilterBase {
    @Override
    public ReturnCode filterKeyValue(KeyValue kv) {
      byte[] val = kv.getValue();
      if (val.length > 2) {
        return ReturnCode.INCLUDE;
      } else {
        return ReturnCode.NEXT_ROW;
      }
    }

    @Override
    public void write(DataOutput dataOutput) throws IOException {}

    @Override
    public void readFields(DataInput dataInput) throws IOException {}
  }
