Posted to user@hbase.apache.org by Amit Sela <am...@infolinks.com> on 2014/03/06 14:09:13 UTC

Is FuzzyRowFilter also a PrefixFilter

Hi all,

I have the following row structure:

yyyyMMdd_Country_CategoryId1_CategoryId2_CategoryId3

Any field except the date may be empty (each day there is one row that
looks like this: yyyyMMdd____).

I use FuzzyRowFilter like this:

// Mask: 1 = any byte is accepted at this position, 0 = byte must equal the key byte
Filter filter = new FuzzyRowFilter(
        Arrays.asList(
                new Pair<>(
                        Bytes.toBytesBinary("\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00____"),
                        new byte[] {1,1,1,1,1,1,1,1,0,0,0,0})));

In addition to getting back all rows of the form:
yyyyMMdd____
I also get rows of the form:
yyyyMMdd____13 - where 13 is a CategoryId3 value

It looks like FuzzyRowFilter acts as a PrefixFilter as well...

Is it supposed to behave like that? Am I using it wrong?

Thanks,
Amit.

(Using HBase 0.94.12 with Hadoop 1.0.4)

Re: Is FuzzyRowFilter also a PrefixFilter

Posted by Ted Yu <yu...@gmail.com>.
If you're only interested in the pattern given below, you can
use PrefixFilter directly.

Otherwise, an enhancement to FuzzyRowFilter is needed.


On Thu, Mar 6, 2014 at 8:45 AM, Amit Sela <am...@infolinks.com> wrote:

> So, correct me if I'm wrong: fuzzy acts as prefix as well, right?
>
> for key:   yyyyMMdd____
> with mask: 111111110000
> both 20140201____ and 20140201____13 will pass the filter.
>
> Is there a way to use FuzzyRowFilter as an "exact fuzzy match" so that
> prefix matches will not pass?

Re: Is FuzzyRowFilter also a PrefixFilter

Posted by Amit Sela <am...@infolinks.com>.
So, correct me if I'm wrong: fuzzy acts as prefix as well, right?

for key:   yyyyMMdd____
with mask: 111111110000
both 20140201____ and 20140201____13 will pass the filter.

Is there a way to use FuzzyRowFilter as an "exact fuzzy match" so that
prefix matches will not pass?
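
One possible workaround, sketched here in plain Java under stated assumptions: let the filter return its matches, then drop on the client side any row key whose length differs from the mask length (12 bytes for the key and mask above). The class and method names are hypothetical and are not part of HBase; in a real scan the keys would come from the returned results.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side helper; not an HBase class.
class ExactLengthPostFilter {

    // Keeps only the row keys whose length equals expectedLength.
    static List<byte[]> keepExactLength(List<byte[]> rowKeys, int expectedLength) {
        List<byte[]> kept = new ArrayList<>();
        for (byte[] rowKey : rowKeys) {
            if (rowKey.length == expectedLength) {
                kept.add(rowKey);
            }
        }
        return kept;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Stand-ins for the row keys that passed the fuzzy filter.
        List<byte[]> passed = Arrays.asList(
                ascii("20140201____"),
                ascii("20140201____13"));

        // The mask in this thread is 12 bytes long.
        for (byte[] rowKey : keepExactLength(passed, 12)) {
            System.out.println(new String(rowKey, StandardCharsets.US_ASCII));
        }
    }
}
```

This costs extra network transfer for the rows that are discarded, so it only makes sense when the longer keys are a small share of the matches.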


On Thu, Mar 6, 2014 at 5:02 PM, Ted Yu <yu...@gmail.com> wrote:

> Take a look at this method of FuzzyRowFilter:
>
>   private static SatisfiesCode satisfies(byte[] row, int offset, int length,
>                                          byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta) {
>
> The for loop is controlled by fuzzyKeyMeta.length
>
> Meaning it compares as many bytes as you give in the fuzzy info byte array.
>
>
> Cheers

Re: Is FuzzyRowFilter also a PrefixFilter

Posted by Ted Yu <yu...@gmail.com>.
Take a look at this method of FuzzyRowFilter:

  private static SatisfiesCode satisfies(byte[] row, int offset, int length,
                                         byte[] fuzzyKeyBytes, byte[] fuzzyKeyMeta) {

The for loop is controlled by fuzzyKeyMeta.length, meaning it compares only
as many bytes as you give in the fuzzy info byte array.
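
To illustrate why a longer row key still passes, here is a simplified model of a mask-bounded comparison. It is a sketch, not the actual HBase source: the class and method names are hypothetical, and it only mirrors the point above that the loop stops at the end of the mask.

```java
import java.nio.charset.StandardCharsets;

// Simplified model of a mask-bounded comparison; NOT the HBase implementation.
class FuzzyPrefixDemo {

    // mask[i] == 0: row[i] must equal key[i]; mask[i] == 1: any byte is accepted.
    static boolean fuzzyMatches(byte[] row, byte[] key, byte[] mask) {
        // The loop ends at the end of the mask, so bytes after it are never examined.
        for (int i = 0; i < mask.length && i < row.length; i++) {
            if (mask[i] == 0 && row[i] != key[i]) {
                return false;
            }
        }
        return true;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The first 8 key bytes are placeholders because the mask ignores them.
        byte[] key = ascii("????????____");
        byte[] mask = {1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0};

        System.out.println(fuzzyMatches(ascii("20140201____"), key, mask));   // true
        System.out.println(fuzzyMatches(ascii("20140201____13"), key, mask)); // true
        System.out.println(fuzzyMatches(ascii("20140201_US_"), key, mask));   // false
    }
}
```

In this model the trailing "13" lies beyond the 12-byte mask, so it cannot cause a mismatch.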


Cheers

