Posted to user@hbase.apache.org by jeremy p <at...@gmail.com> on 2015/06/05 20:18:27 UTC

How to do a fast range scan on a prefix

Assume that my keys look like this:
bar:0
bar:1
bar:2
baz:0
baz:1
foo:0
foo:1
foo:2

How do I do a fast range scan that returns all the rows that begin with
"baz:"?  Assume that I know nothing about any of the other rows in the
table.

Thanks for taking a look!

--Jeremy

Re: How to do a fast range scan on a prefix

Posted by jeremy p <at...@gmail.com>.
Excellent.  This is exactly what I need.  Thank you!

On Fri, Jun 5, 2015 at 3:07 PM, Ted Yu <yu...@gmail.com> wrote:

> Clarification: setRowPrefixFilter() doesn't use PrefixFilter. It calculates
> the scan range itself:
>
>   public Scan setRowPrefixFilter(byte[] rowPrefix) {
>     if (rowPrefix == null) {
>       setStartRow(HConstants.EMPTY_START_ROW);
>       setStopRow(HConstants.EMPTY_END_ROW);
>     } else {
>       this.setStartRow(rowPrefix);
>       this.setStopRow(calculateTheClosestNextRowKeyForPrefix(rowPrefix));
>     }
>     return this;
>   }
>
> FYI

Re: How to do a fast range scan on a prefix

Posted by Ted Yu <yu...@gmail.com>.
Clarification: setRowPrefixFilter() doesn't use PrefixFilter. It calculates
the scan range itself:

  public Scan setRowPrefixFilter(byte[] rowPrefix) {
    if (rowPrefix == null) {
      setStartRow(HConstants.EMPTY_START_ROW);
      setStopRow(HConstants.EMPTY_END_ROW);
    } else {
      this.setStartRow(rowPrefix);
      this.setStopRow(calculateTheClosestNextRowKeyForPrefix(rowPrefix));
    }
    return this;
  }

FYI
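
For readers wondering how a stop row can be derived from a prefix, here is a minimal sketch in plain Java with no HBase dependency. It is an assumption about what a helper like calculateTheClosestNextRowKeyForPrefix() needs to do, not a copy of the HBase source: drop any trailing 0xFF bytes, then add one to the last remaining byte. The class name PrefixStopRow is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixStopRow {

    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix, or an empty array when no such key exists
     * (empty prefix or all bytes 0xFF), meaning "scan to the end".
     * Illustrative sketch only; not taken from the HBase code base.
     */
    static byte[] closestNextRowKeyForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip past them.
        int offset = prefix.length;
        while (offset > 0 && prefix[offset - 1] == (byte) 0xFF) {
            offset--;
        }
        if (offset == 0) {
            return new byte[0];
        }
        // Copy the part before the trailing 0xFF run and add one to its last byte.
        byte[] stop = Arrays.copyOfRange(prefix, 0, offset);
        stop[stop.length - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "baz:".getBytes(StandardCharsets.UTF_8);
        byte[] stop = closestNextRowKeyForPrefix(start);
        // ':' is 0x3A, so the stop row ends in 0x3B, which is ';'.
        System.out.println(new String(stop, StandardCharsets.UTF_8)); // prints "baz;"
    }
}
```

With the keys from the original question, the scan range becomes ["baz:", "baz;"), which covers baz:0 and baz:1 and nothing else.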

On Fri, Jun 5, 2015 at 11:27 AM, jeremy p <at...@gmail.com>
wrote:

> I've heard that PrefixFilter does a full table scan, and that a range scan
> is faster.  Am I mistaken?

Re: How to do a fast range scan on a prefix

Posted by jeremy p <at...@gmail.com>.
I've heard that PrefixFilter does a full table scan, and that a range scan
is faster.  Am I mistaken?
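
To make the distinction in this question concrete, here is a toy model in plain Java. It uses a TreeMap as a stand-in for a table sorted by row key; it does not use HBase and says nothing about how PrefixFilter actually behaves in any HBase version. The class and method names are hypothetical, and the keys are assumed to be ASCII so that String ordering matches byte ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class RangeVsFilterSketch {

    // Hypothetical stand-in for a table: row keys kept in sorted order.
    static TreeMap<String, String> sampleTable() {
        TreeMap<String, String> table = new TreeMap<>();
        String[] keys = {"bar:0", "bar:1", "bar:2", "baz:0", "baz:1",
                         "foo:0", "foo:1", "foo:2"};
        for (String key : keys) {
            table.put(key, "value-of-" + key);
        }
        return table;
    }

    // Filter style: visit every key and keep the ones that match.
    static List<String> filterScan(TreeMap<String, String> table, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String key : table.keySet()) {
            if (key.startsWith(prefix)) {
                hits.add(key);
            }
        }
        return hits;
    }

    // Range style: seek to the prefix and stop at the first key past it.
    // Assumes the last character of the prefix can be incremented.
    static List<String> rangeScan(TreeMap<String, String> table, String prefix) {
        if (prefix.isEmpty()) {
            return new ArrayList<>(table.keySet());
        }
        char last = prefix.charAt(prefix.length() - 1);
        String stop = prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
        // Half-open range [prefix, stop): only matching keys are touched.
        return new ArrayList<>(table.subMap(prefix, true, stop, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = sampleTable();
        System.out.println(filterScan(table, "baz:")); // prints "[baz:0, baz:1]"
        System.out.println(rangeScan(table, "baz:"));  // prints "[baz:0, baz:1]"
    }
}
```

Both methods return the same rows; the difference is that the range version never looks at the bar: or foo: keys, which is why a start/stop row range is the usual way to express a prefix lookup over sorted keys.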

On Fri, Jun 5, 2015 at 2:22 PM, Ted Yu <yu...@gmail.com> wrote:

> You can utilize PrefixFilter.
>
> See example in http://hbase.apache.org/book.html#scan

Re: How to do a fast range scan on a prefix

Posted by Ted Yu <yu...@gmail.com>.
You can utilize PrefixFilter.

See example in http://hbase.apache.org/book.html#scan

On Fri, Jun 5, 2015 at 11:18 AM, jeremy p <at...@gmail.com>
wrote:

> Assume that my keys look like this:
> bar:0
> bar:1
> bar:2
> baz:0
> baz:1
> foo:0
> foo:1
> foo:2
>
> How do I do a fast range scan that returns all the rows that begin with
> "baz:"?  Assume that I know nothing about any of the other rows in the
> table.
>
> Thanks for taking a look!
>
> --Jeremy
>