Posted to user@hbase.apache.org by Christophe Taton <ta...@wibidata.com> on 2013/04/05 20:19:31 UTC

Interactions between max versions and filters

Hi,

Is there an explicit specification of the behavior of max versions (set in
a get/scan) when combined with filters?
From my experiments (with 0.92 CDH4.1.2), the max versions is applied in a
way that is neither pre-filtering nor post-filtering.
In particular, I am currently playing with the ColumnPaginationFilter, and
I am not entirely certain I understand the intended/expected behavior.
I also did not find an explicit specification in the reference user guide
nor in the API javadoc.

Thanks,
C.

Re: Interactions between max versions and filters

Posted by Christophe Taton <ta...@wibidata.com>.
On Fri, Apr 5, 2013 at 3:05 PM, Stack <st...@duboce.net> wrote:

> On Fri, Apr 5, 2013 at 11:19 AM, Christophe Taton <taton@wibidata.com> wrote:
>
> > Is there an explicit specification of the behavior of max versions (set in
> > a get/scan) when combined with filters?
> > From my experiments (with 0.92 CDH4.1.2), the max versions is applied in a
> > way that is neither pre-filtering nor post-filtering.
> > In particular, I am currently playing with the ColumnPaginationFilter, and
> > I am not entirely certain I understand the intended/expected behavior.
> > I also did not find an explicit specification in the reference user guide
> > nor in the API javadoc.
> >
> Sounds like a bug.  My understanding is that, regardless of filters, max
> versions should be respected (Yes, we should have a specification but we do
> not have one here).
>

Should it be respected as:
 - a post-filtering max-versions: there should be max-versions cells
returned per column to the user (i.e. potentially a lot more cells processed
on the region server to fill in the allowed max-versions cells requested)
 - a pre-filtering max-versions: there should be up to max-versions cells
processed per column within the region server, and submitted to additional
custom filtering (i.e. potentially fewer or no cells returned to the user)
 - something else?
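
To make the two candidate readings concrete, here is a minimal sketch in plain
Java. It does not use the HBase client API: Cell, preFiltering, postFiltering
and the olderThanThree predicate are hypothetical names made up for
illustration, and the cells of one column are assumed to arrive newest first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class MaxVersionsSemantics {
    /** One version of one column; a hypothetical stand-in for an HBase cell. */
    static final class Cell {
        final long timestamp;
        final String value;
        Cell(long timestamp, String value) { this.timestamp = timestamp; this.value = value; }
        @Override public String toString() { return timestamp + "=" + value; }
    }

    /** Pre-filtering reading: look at the newest maxVersions cells only, then filter them. */
    static List<Cell> preFiltering(List<Cell> newestFirst, int maxVersions, Predicate<Cell> filter) {
        List<Cell> out = new ArrayList<>();
        int seen = 0;
        for (Cell c : newestFirst) {
            if (seen++ >= maxVersions) break;   // version limit counts every cell examined
            if (filter.test(c)) out.add(c);
        }
        return out;
    }

    /** Post-filtering reading: filter first, then return up to maxVersions survivors. */
    static List<Cell> postFiltering(List<Cell> newestFirst, int maxVersions, Predicate<Cell> filter) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : newestFirst) {
            if (out.size() >= maxVersions) break; // version limit counts only accepted cells
            if (filter.test(c)) out.add(c);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> column = Arrays.asList(
            new Cell(4, "d"), new Cell(3, "c"), new Cell(2, "b"), new Cell(1, "a"));
        Predicate<Cell> olderThanThree = c -> c.timestamp <= 2; // hypothetical custom filter
        System.out.println(preFiltering(column, 2, olderThanThree));  // prints []
        System.out.println(postFiltering(column, 2, olderThanThree)); // prints [2=b, 1=a]
    }
}
```

With max versions set to 2, the pre-filtering reading returns nothing here
because both of the newest cells are rejected by the filter, while the
post-filtering reading keeps scanning until it has two accepted cells.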

Thanks for your answer,
C.

Re: Interactions between max versions and filters

Posted by Christophe Taton <ta...@wibidata.com>.
Hi,
Thanks for all your answers, that was very helpful.
It appears we were relying on unintended behavior of the
ColumnPaginationFilter.
I now understand that:
 - max versions applies post-filtering.
 - ColumnPaginationFilter forces max-versions to 1 (and so does
ColumnCountGetFilter).

From some experiments, the offset of the ColumnPaginationFilter appears
fairly inefficient, and it seems I will always want it to be 0.
Instead, I'd combine the ColumnPaginationFilter with a ColumnRangeFilter
that sets a lower bound on columns.
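
The difference between paging by offset and paging by a lower bound can be
sketched in plain Java over a sorted set of column qualifiers. This is an
illustration only and does not call the HBase client API: ColumnPaging,
pageByOffset and pageAfter are hypothetical names, and the assumption is that
the offset variant must step over every skipped qualifier, which may be what
makes large offsets slow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class ColumnPaging {
    /** Offset-based page: walks past `offset` qualifiers before collecting up to `limit`. */
    static List<String> pageByOffset(NavigableSet<String> qualifiers, int offset, int limit) {
        List<String> page = new ArrayList<>();
        int index = 0;
        for (String q : qualifiers) {
            if (index++ < offset) continue;     // every skipped qualifier is still visited
            if (page.size() == limit) break;
            page.add(q);
        }
        return page;
    }

    /** Lower-bound page: starts strictly after the last qualifier of the previous page. */
    static List<String> pageAfter(NavigableSet<String> qualifiers, String lastSeen, int limit) {
        NavigableSet<String> tail =
            lastSeen == null ? qualifiers : qualifiers.tailSet(lastSeen, false);
        List<String> page = new ArrayList<>();
        for (String q : tail) {                 // offset is effectively always 0 here
            if (page.size() == limit) break;
            page.add(q);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<String> qualifiers =
            new TreeSet<>(Arrays.asList("a", "b", "c", "d", "e", "f", "g"));
        System.out.println(pageByOffset(qualifiers, 2, 2)); // prints [c, d]
        System.out.println(pageAfter(qualifiers, "b", 2));  // prints [c, d]
    }
}
```

Both calls return the same second page; the lower-bound variant needs the
caller to remember the last qualifier it received instead of a running offset.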

C.


On Sat, Apr 6, 2013 at 12:55 PM, Varun Sharma <va...@pinterest.com> wrote:

> Looking at the JIRA, I added a couple of tests to the 0.94 version for
> checking deleting versions etc. - if you look at the 0.94 patch.
>
> Perhaps, we should carry over those tests to trunk - my bad on not adding
> those tests...

Re: Interactions between max versions and filters

Posted by Varun Sharma <va...@pinterest.com>.
Looking at the JIRA, I added a couple of tests to the 0.94 version for
checking deleting versions etc. - if you look at the 0.94 patch.

Perhaps, we should carry over those tests to trunk - my bad on not adding
those tests...

On Sat, Apr 6, 2013 at 7:10 AM, Ted Yu <yu...@gmail.com> wrote:

> Looking at
> https://issues.apache.org/jira/secure/attachment/12550741/5257-trunk-v2.txt
> ,
> TestFilter#testColumnPaginationFilter() was modified but no new test was
> added.
>
> I think Christophe is in a better position to suggest what additional tests
> should be added based on his use case.
>
> Cheers

Re: Interactions between max versions and filters

Posted by Ted Yu <yu...@gmail.com>.
Looking at
https://issues.apache.org/jira/secure/attachment/12550741/5257-trunk-v2.txt,
TestFilter#testColumnPaginationFilter() was modified but no new test was
added.

I think Christophe is in a better position to suggest what additional tests
should be added based on his use case.

Cheers


On Sat, Apr 6, 2013 at 6:58 AM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Ted, should we have such a test in the test suite? If so, should a JIRA
> be opened for that?
>
> JM

Re: Interactions between max versions and filters

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Ted, should we have such a test in the test suite? If so, should a JIRA
be opened for that?

JM

2013/4/6 Ted Yu <yu...@gmail.com>:
> Christophe:
> HBASE-5257 has been integrated into 0.94
> Can you try 0.94.6.1 to see if the problem is solved?
>
> Writing a unit test probably is the easiest way for validation.
>
> Thanks

Re: Interactions between max versions and filters

Posted by Ted Yu <yu...@gmail.com>.
Christophe:
HBASE-5257 has been integrated into 0.94
Can you try 0.94.6.1 to see if the problem is solved?

Writing a unit test probably is the easiest way for validation.

Thanks


On Fri, Apr 5, 2013 at 5:25 PM, Varun Sharma <va...@pinterest.com> wrote:

> HBASE-5257 is probably what lars is talking about - that fixed a bug with
> version tracking on ColumnPaginationFilter - there is a patch for 0.92,
> 0.94 and 0.96 but not for the cdh versions...

Re: Interactions between max versions and filters

Posted by Varun Sharma <va...@pinterest.com>.
HBASE-5257 is probably what lars is talking about - that fixed a bug with
version tracking on ColumnPaginationFilter - there is a patch for 0.92,
0.94 and 0.96 but not for the cdh versions...

On Fri, Apr 5, 2013 at 3:28 PM, lars hofhansl <la...@apache.org> wrote:

> Normally Filters are evaluated before the version counting. There was some
> issue that was fixed recently that changed this behavior specifically for
> ColumnPaginationFilter and friends... Lemme see if I can find that.

Re: Interactions between max versions and filters

Posted by lars hofhansl <la...@apache.org>.
Normally Filters are evaluated before the version counting. There was some issue that was fixed recently that changed this behavior specifically for ColumnPaginationFilter and friends... Lemme see if I can find that.



________________________________
 From: Stack <st...@duboce.net>
To: Hbase-User <us...@hbase.apache.org> 
Sent: Friday, April 5, 2013 3:05 PM
Subject: Re: Interactions between max versions and filters
 
On Fri, Apr 5, 2013 at 11:19 AM, Christophe Taton <ta...@wibidata.com> wrote:

> Hi,
>
> Is there an explicit specification of the behavior of max versions (set in
> a get/scan) when combined with filters?
> From my experiments (with 0.92 CDH4.1.2), the max versions is applied in a
> way that is neither pre-filtering nor post-filtering.
> In particular, I am currently playing with the ColumnPaginationFilter, and
> I am not entirely certain I understand the intended/expected behavior.
> I also did not find an explicit specification in the reference user guide
> nor in the API javadoc.
>


Hey Christophe:

Sounds like a bug.  My understanding is that, regardless of filters, max
versions should be respected (Yes, we should have a specification but we do
not have one here).

Yours,
St.Ack

Re: Interactions between max versions and filters

Posted by Stack <st...@duboce.net>.
On Fri, Apr 5, 2013 at 11:19 AM, Christophe Taton <ta...@wibidata.com> wrote:

> Hi,
>
> Is there an explicit specification of the behavior of max versions (set in
> a get/scan) when combined with filters?
> From my experiments (with 0.92 CDH4.1.2), the max versions is applied in a
> way that is neither pre-filtering nor post-filtering.
> In particular, I am currently playing with the ColumnPaginationFilter, and
> I am not entirely certain I understand the intended/expected behavior.
> I also did not find an explicit specification in the reference user guide
> nor in the API javadoc.
>


Hey Christophe:

Sounds like a bug.  My understanding is that, regardless of filters, max
versions should be respected (Yes, we should have a specification but we do
not have one here).

Yours,
St.Ack