Posted to user@hbase.apache.org by mu yu <wi...@gmail.com> on 2014/05/07 11:06:04 UTC

Hbase full scan

Hi,
We deployed an HBase/Hadoop cluster for log storage. As far as I know,
HBase has no index. I would like to know whether every scan, including a
scan with an HBase filter, is a full table scan, or whether there are other
kinds of scan.
For example, if I implement a row key scan with a row key filter, would
HBase execute a full table scan?

Any reply is appreciated. Thanks in advance.

Re: Hbase full scan

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Exactly.

When you specify start row, you jump directly to the right place. And you
stop at end row. Also, if you add a filter to that, you will skip (server
side) some other rows.

JM
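
To make the difference concrete, here is a minimal sketch in plain Java. It
is a toy model, not HBase client code: a TreeMap stands in for a table sorted
by row key, and the host names, the "|" separator and the method names are
all made up for illustration. In the HBase client the bounds would be set on
the Scan object (setStartRow / setStopRow in the API of that time, as far as
I remember), so please check the Javadoc of your version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.function.Predicate;

// Toy model of a table sorted by row key. This is not HBase client code.
final class RangeScanSketch {

    // Matching row keys plus the number of rows the scan had to look at.
    static final class Result {
        final List<String> rows = new ArrayList<>();
        int examined = 0;
    }

    // Visits only the keys in [startRow, stopRow) and applies the filter to each one.
    static Result scan(NavigableMap<String, String> table, String startRow,
                       String stopRow, Predicate<String> rowKeyFilter) {
        Result result = new Result();
        // subMap is a view on the sorted keys, so iteration begins at startRow.
        for (String key : table.subMap(startRow, true, stopRow, false).keySet()) {
            result.examined++;
            if (rowKeyFilter.test(key)) {
                result.rows.add(key);
            }
        }
        return result;
    }

    // Hypothetical log table: 3 hosts with 4 rows each.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String host : new String[] {"app1", "app2", "app3"}) {
            for (int i = 0; i < 4; i++) {
                table.put(host + "|" + i, "log line " + i);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        // Filter only: the range "" to "~" covers every key in this sample.
        Result full = scan(table, "", "~", key -> key.startsWith("app2|"));
        // Start/stop row: '}' sorts right after '|', so "app2}" ends the app2 rows.
        Result bounded = scan(table, "app2|", "app2}", key -> true);
        System.out.println("filter only: examined " + full.examined
                + ", returned " + full.rows.size());
        System.out.println("start/stop:  examined " + bounded.examined
                + ", returned " + bounded.rows.size());
    }
}
```

Both calls return the same four rows, but the filter-only scan looks at all
twelve rows of the sample while the bounded scan looks at four.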


2014-05-12 23:17 GMT-04:00 mu yu <wi...@gmail.com>:

> Hi JM,
> Thanks for your reply.
> OK, that means that when a filter or a start row and stop row are used, the
> scan skips the other rows.
> Thank you so much.

Re: Hbase full scan

Posted by Ted Yu <yu...@gmail.com>.
RegexFilter extends AbstractPatternFilter.

Among the filters shipped with HBase, here are the ones that do row
skipping:

ColumnPaginationFilter
ColumnPrefixFilter
ColumnRangeFilter
FuzzyRowFilter
MultipleColumnPrefixFilter
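
As I understand it, filters like these can skip data because they tell the
scanner where to seek next instead of being asked about every entry in turn.
The sketch below models that idea in plain Java under stated assumptions: a
TreeMap stands in for the sorted table, the "<host>|<day>" key layout is
hypothetical, and hintFor is a made-up stand-in for a filter's seek hint. It
is not the code of any of the filters listed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of a scan that honours seek hints. This is not HBase client code.
final class SeekHintSketch {

    // Matching row keys plus the number of rows the scan had to look at.
    static final class Result {
        final List<String> rows = new ArrayList<>();
        int examined = 0;
    }

    // Hypothetical filter: keeps rows "<host>|<day>" whose day equals wantedDay.
    // Returns null to include the row, otherwise the next key worth looking at.
    static String hintFor(String rowKey, String wantedDay) {
        int sep = rowKey.indexOf('|');
        String host = rowKey.substring(0, sep);
        int cmp = rowKey.substring(sep + 1).compareTo(wantedDay);
        if (cmp == 0) {
            return null;                      // match: include this row
        }
        if (cmp < 0) {
            return host + "|" + wantedDay;    // jump forward to the wanted day
        }
        return host + "}";                    // '}' sorts after '|': jump past this host
    }

    static Result scanWithHints(NavigableMap<String, String> table, String wantedDay) {
        Result result = new Result();
        String key = table.isEmpty() ? null : table.firstKey();
        while (key != null) {
            result.examined++;
            String hint = hintFor(key, wantedDay);
            if (hint == null) {
                result.rows.add(key);
                key = table.higherKey(key);   // step to the next row
            } else {
                key = table.ceilingKey(hint); // seek: first key >= hint, or null at the end
            }
        }
        return result;
    }

    // Hypothetical log table: 3 hosts with 9 daily rows each.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String host : new String[] {"app1", "app2", "app3"}) {
            for (int day = 1; day <= 9; day++) {
                table.put(host + "|d" + day, "log data");
            }
        }
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        Result r = scanWithHints(table, "d3");
        System.out.println("rows in table: " + table.size());
        System.out.println("examined " + r.examined + ", returned " + r.rows);
    }
}
```

In this sample the table holds 27 rows, and the scan touches 9 of them to
return the 3 matching ones. A filter that gives no hint would have to be
asked about all 27.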


On Wed, May 14, 2014 at 8:20 AM, Mike Axiak <mi...@axiak.net> wrote:

> Just to clarify - filters can only skip rows when the filter is
> operating on the row keys, and even then only some filters can take
> advantage of this. (Notably, FuzzyRowFilter and RegexFilter)
>
> Best,
> Mike

Re: Hbase full scan

Posted by Mike Axiak <mi...@axiak.net>.
Just to clarify - filters can only skip rows when the filter is
operating on the row keys, and even then only some filters can take
advantage of this. (Notably, FuzzyRowFilter and RegexFilter)

Best,
Mike

On Mon, May 12, 2014 at 11:17 PM, mu yu <wi...@gmail.com> wrote:
> Hi JM,
> Thanks for your reply.
> OK, that means that when a filter or a start row and stop row are used, the
> scan skips the other rows.
> Thank you so much.

Re: Hbase full scan

Posted by mu yu <wi...@gmail.com>.
Hi JM,
Thanks for your reply.
OK, that means that when a filter or a start row and stop row are used, the
scan skips the other rows.
Thank you so much.


On Tue, May 13, 2014 at 2:15 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Mu,
>
> For a scan you can give start row and stop row. If you do so, it's only a
> partial scan. Also, if you add filters, rows are skipped on the server
> side.
>
> So you need to design your key to match your access pattern to avoid huge
> scans.
>
> JM

Re: Hbase full scan

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Mu,

For a scan you can give start row and stop row. If you do so, it's only a
partial scan. Also, if you add filters, rows are skipped on the server side.

So you need to design your key to match your access pattern to avoid huge
scans.

JM
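
For a log storage use case, one possible key design is sketched below in
plain Java. Everything in it is an assumption for illustration: the
"<host>|<zero-padded epoch millis>" layout, the host names and the helper
names are hypothetical, and a TreeMap again stands in for the sorted table.
The point is only that, with the host first and the time second, "all lines
of one host in a time window" becomes a single start row / stop row range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of a row key layout for logs. This is not HBase client code.
final class LogRowKeySketch {

    // Hypothetical layout: "<host>|<13-digit zero-padded epoch millis>".
    // Zero padding makes lexicographic order agree with numeric order.
    static String rowKey(String host, long epochMillis) {
        return String.format(Locale.ROOT, "%s|%013d", host, epochMillis);
    }

    // Returns the log lines of one host with fromMillis <= time < toMillis.
    static List<String> linesFor(NavigableMap<String, String> table, String host,
                                 long fromMillis, long toMillis) {
        String startRow = rowKey(host, fromMillis);
        String stopRow = rowKey(host, toMillis);
        // Only the keys between startRow (inclusive) and stopRow (exclusive) are visited.
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).values());
    }

    // Two hosts with one event each at t = 1000, 2000, 3000 and 4000.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        for (String host : new String[] {"web1", "web2"}) {
            for (long t = 1000; t <= 4000; t += 1000) {
                table.put(rowKey(host, t), host + " event at " + t);
            }
        }
        return table;
    }

    public static void main(String[] args) {
        for (String line : linesFor(sampleTable(), "web1", 2000, 4000)) {
            System.out.println(line);
        }
    }
}
```

With this layout the query for web1 between 2000 and 4000 reads two rows and
never touches the web2 rows. If the most common query were "all hosts in a
time window" instead, a different key order would likely fit better, which is
why the key has to follow the access pattern.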


2014-05-07 5:06 GMT-04:00 mu yu <wi...@gmail.com>:

> Hi,
> We deployed an HBase/Hadoop cluster for log storage. As far as I know,
> HBase has no index. I would like to know whether every scan, including a
> scan with an HBase filter, is a full table scan, or whether there are other
> kinds of scan.
> For example, if I implement a row key scan with a row key filter, would
> HBase execute a full table scan?
>
> Any reply is appreciated. Thanks in advance.
>