Posted to user@hbase.apache.org by Varun Sharma <va...@pinterest.com> on 2013/04/04 19:31:36 UTC

Adding String offset for ColumnPaginationFilter

Hi,

I am thinking of adding a string offset to ColumnPaginationFilter. There
are two reasons:

1) For deep pagination, you can seek using SEEK_NEXT_USING_HINT.
2) For correctness reasons, this approach is better if the list of columns
is mutating. Let's say you get the 1st 50 columns using the current approach.
In the meantime, some columns are inserted amongst the 1st 50 columns. Now you
request the 2nd set of 50 columns. Chances are that you will have
duplicates amongst the 2 sets (1st 50 and 2nd 50). If instead you used the
last column of the 1st 50 as a string offset for getting the 2nd set of
columns, the chances of getting dups are significantly lower.

This becomes important for user-facing interactive applications,
particularly best-effort services where consistency etc. are not as
important. Even there, showing duplicates across pages is pretty bad.

Please let me know if this makes sense and is feasible. Basically, I would
like a string offset passed to ColumnPaginationFilter through an alternative
constructor. If the string offset is supplied, I would like to seek to the
supplied column or, if that column has been deleted, to the column just
greater than the supplied column.
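
To make the duplicate problem and the proposed seek behaviour concrete, here
is a minimal sketch in plain Java. It does not use the HBase API: a sorted set
of strings stands in for one row's column qualifiers, and the class and method
names (PaginationSketch, pageByCount, pageFrom) are made up for illustration.
Appending "\0" to the last column seen is just one possible way for a client
to ask for "strictly after" when the seek itself is inclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical stand-in: a sorted set of strings plays the role of one row's
// column qualifiers. None of these names come from HBase.
final class PaginationSketch {

    // Count-based paging: skip `offset` columns, then return up to `limit`.
    static List<String> pageByCount(NavigableSet<String> columns, int offset, int limit) {
        List<String> page = new ArrayList<>();
        int skipped = 0;
        for (String column : columns) {
            if (skipped < offset) {
                skipped++;
                continue;
            }
            if (page.size() == limit) {
                break;
            }
            page.add(column);
        }
        return page;
    }

    // Column-based paging: start at `from` if it exists, otherwise at the
    // next greater column, then return up to `limit`.
    static List<String> pageFrom(NavigableSet<String> columns, String from, int limit) {
        List<String> page = new ArrayList<>();
        for (String column : columns.tailSet(from, true)) {
            if (page.size() == limit) {
                break;
            }
            page.add(column);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<String> columns = new TreeSet<>(List.of("b", "d", "f", "h", "j", "l"));
        List<String> first = pageByCount(columns, 0, 3);          // [b, d, f]
        columns.add("c");                                         // insert lands inside page 1

        System.out.println(pageByCount(columns, 3, 3));           // [f, h, j], "f" is repeated
        String last = first.get(first.size() - 1);
        System.out.println(pageFrom(columns, last + "\0", 3));    // [h, j, l], no repeat

        columns.remove("f");                                      // offset column deleted
        System.out.println(pageFrom(columns, "f", 3));            // [h, j, l], next greater column
    }
}
```

The sketch only covers inserts that land before the offset. Columns inserted
after the offset simply show up on a later page in either scheme.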

Thanks
Varun

Re: Adding String offset for ColumnPaginationFilter

Posted by Varun Sharma <va...@pinterest.com>.
I put a sample patch at HBASE-8284. I will let someone from the HBase
committee take a look before moving any further with it...

On Thu, Apr 4, 2013 at 12:13 PM, Nick Dimiduk <nd...@gmail.com> wrote:

> +1
>
> Wouldn't offset be a family:qualifier instead of a String?
>
> Please consider adding two interfaces: a version which exposes the state
> externally (as you've described) and another that encapsulates the state
> handling on the user's behalf. The former is useful for exposing over
> stateless protocols like REST while the latter is more convenient for other
> applications.
>
> -n
>
> On Thu, Apr 4, 2013 at 10:31 AM, Varun Sharma <va...@pinterest.com> wrote:
>
> > Hi,
> >
> > I am thinking of adding a string offset to ColumnPaginationFilter. There
> > are two reasons:
> >
> > 1) For deep pagination, you can seek using SEEK_NEXT_USING_HINT.
> > 2) For correctness reasons, this approach is better if the list of
> columns
> > is mutation. Lets say you get 1st 50 columns using the current approach.
> In
> > the mean time some columns are inserted amongst the 1st 50 columns. Now
> you
> > request the 2nd set of 50 columns. Chances are that you will have
> > duplicates amongst the 2 sets (1st 50 and 2nd 50). If instead you used
> the
> > last column of the 1st 50 as a string offset for getting the 2nd set of
> > columns, the chances of getting dups is significantly lower.
> >
> > This becomes important for user facing interactive applications.
> > Particularly where consistency etc. are not as important since those are
> > best effort services. But showing duplicates across pages is pretty bad.
> >
> > Please let me know if this makes sense and is feasible. Basically, I
> would
> > like a string offset passed to ColumnPaginationFilter as an alternative
> > constructor. If the string offset is supplied, then, I would like to seek
> > to either the column supplied or if the column is deleted, seek to the
> > column just greater than the supplied column.
> >
> > Thanks
> > Varun
> >
>

Re: Adding String offset for ColumnPaginationFilter

Posted by Nick Dimiduk <nd...@gmail.com>.
+1

Wouldn't offset be a family:qualifier instead of a String?

Please consider adding two interfaces: a version which exposes the state
externally (as you've described) and another that encapsulates the state
handling on the user's behalf. The former is useful for exposing over
stateless protocols like REST while the latter is more convenient for other
applications.
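
A rough sketch of the two shapes, in plain Java rather than against the HBase
API. A sorted set stands in for a row's columns, and ColumnPager, pageAfter
and nextPage are hypothetical names. For brevity the cursor here is exclusive
(the page starts strictly after it), which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; nothing here is an HBase class.
final class ColumnPager {
    private final NavigableSet<String> columns; // stand-in for one row's sorted columns
    private final int pageSize;
    private String lastSeen;                    // null until a page has been returned

    ColumnPager(NavigableSet<String> columns, int pageSize) {
        this.columns = columns;
        this.pageSize = pageSize;
    }

    // Stateless form: the caller keeps the cursor and passes it back in,
    // which suits request/response protocols such as REST.
    static List<String> pageAfter(NavigableSet<String> columns, String cursor, int limit) {
        NavigableSet<String> tail = (cursor == null) ? columns : columns.tailSet(cursor, false);
        List<String> page = new ArrayList<>();
        for (String column : tail) {
            if (page.size() == limit) {
                break;
            }
            page.add(column);
        }
        return page;
    }

    // Stateful form: the pager remembers the last column it handed out.
    List<String> nextPage() {
        List<String> page = pageAfter(columns, lastSeen, pageSize);
        if (!page.isEmpty()) {
            lastSeen = page.get(page.size() - 1);
        }
        return page;
    }

    public static void main(String[] args) {
        NavigableSet<String> columns = new TreeSet<>(List.of("a", "b", "c", "d", "e"));
        ColumnPager pager = new ColumnPager(columns, 2);
        System.out.println(pager.nextPage());              // [a, b]
        System.out.println(pager.nextPage());              // [c, d]
        System.out.println(pageAfter(columns, "d", 2));    // [e]
    }
}
```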

-n

On Thu, Apr 4, 2013 at 10:31 AM, Varun Sharma <va...@pinterest.com> wrote:

> Hi,
>
> I am thinking of adding a string offset to ColumnPaginationFilter. There
> are two reasons:
>
> 1) For deep pagination, you can seek using SEEK_NEXT_USING_HINT.
> 2) For correctness reasons, this approach is better if the list of columns
> is mutation. Lets say you get 1st 50 columns using the current approach. In
> the mean time some columns are inserted amongst the 1st 50 columns. Now you
> request the 2nd set of 50 columns. Chances are that you will have
> duplicates amongst the 2 sets (1st 50 and 2nd 50). If instead you used the
> last column of the 1st 50 as a string offset for getting the 2nd set of
> columns, the chances of getting dups is significantly lower.
>
> This becomes important for user facing interactive applications.
> Particularly where consistency etc. are not as important since those are
> best effort services. But showing duplicates across pages is pretty bad.
>
> Please let me know if this makes sense and is feasible. Basically, I would
> like a string offset passed to ColumnPaginationFilter as an alternative
> constructor. If the string offset is supplied, then, I would like to seek
> to either the column supplied or if the column is deleted, seek to the
> column just greater than the supplied column.
>
> Thanks
> Varun
>
