Posted to user@hbase.apache.org by Jianshi Huang <ji...@gmail.com> on 2014/07/17 08:40:58 UTC

Scan columns of a row within a Range

Hi,

I looked through HBase's Scan API and couldn't find a way to scan a range
of columns within a row.

It seems I can only do scan(startRow, endRow), where both arguments are just
RowKeys.

What's the most efficient way to do it? Should I use a Filter? I heard that
filters are not as efficient as row key (RK) scans. How much slower are they?

(BTW, I was using Accumulo for the same thing and it has a really nice API
(Range, Key) for it. A Key is a combination of RK+CF+CQ+TS.)

Am I missing anything?

Cheers,
-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/

Re: Scan columns of a row within a Range

Posted by Jianshi Huang <ji...@gmail.com>.
Yes, I found the info in a nice blog article. Thanks, Ted!

Jianshi


On Thu, Jul 17, 2014 at 10:07 PM, Ted Yu <yu...@gmail.com> wrote:

> ColumnRangeFilter implements getNextCellHint(), which lets the scanner jump
> straight to minColumn.
> Once the current column is past maxColumn, it skips to the next row.
>
> So ColumnRangeFilter is very efficient.
>
> Cheers

Re: Scan columns of a row within a Range

Posted by Ted Yu <yu...@gmail.com>.
ColumnRangeFilter implements getNextCellHint(), which lets the scanner jump
straight to minColumn.
Once the current column is past maxColumn, it skips to the next row.

So ColumnRangeFilter is very efficient.

Cheers
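
To illustrate the two behaviors described above, here is a minimal sketch in
plain Java. It is a toy model, not HBase code: the class ColumnRangeSeekModel,
its methods, and the nested TreeMap standing in for sorted storage are all
assumptions made for illustration. It compares checking every cell against
seeking to the minimum column and leaving the row once the maximum is passed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of a column-range read (hypothetical; not HBase code). */
final class ColumnRangeSeekModel {

    /** Matching "row/qualifier" keys plus the number of cells that were examined. */
    static final class ReadResult {
        final List<String> matches = new ArrayList<>();
        int cellsExamined = 0;
    }

    /** Filter-only approach: examine every cell and keep those inside [min, max]. */
    static ReadResult readByCheckingEveryCell(
            NavigableMap<String, NavigableMap<String, String>> table, String min, String max) {
        ReadResult result = new ReadResult();
        for (Map.Entry<String, NavigableMap<String, String>> row : table.entrySet()) {
            for (String qualifier : row.getValue().keySet()) {
                result.cellsExamined++;
                if (qualifier.compareTo(min) >= 0 && qualifier.compareTo(max) <= 0) {
                    result.matches.add(row.getKey() + "/" + qualifier);
                }
            }
        }
        return result;
    }

    /** Hint-style approach: seek to min, read until past max, then move to the next row. */
    static ReadResult readWithSeekHints(
            NavigableMap<String, NavigableMap<String, String>> table, String min, String max) {
        ReadResult result = new ReadResult();
        for (Map.Entry<String, NavigableMap<String, String>> row : table.entrySet()) {
            // tailMap(min, true) plays the role of the jump to minColumn.
            for (String qualifier : row.getValue().tailMap(min, true).keySet()) {
                result.cellsExamined++;
                if (qualifier.compareTo(max) > 0) {
                    break; // past maxColumn: skip the rest of this row
                }
                result.matches.add(row.getKey() + "/" + qualifier);
            }
        }
        return result;
    }

    /** Builds rows row00, row01, ... each holding qualifiers c000, c001, ... */
    static NavigableMap<String, NavigableMap<String, String>> sampleTable(
            int rows, int columnsPerRow) {
        // Assumption: ASCII strings stand in for HBase's byte[] keys; both sort
        // lexicographically for this data.
        NavigableMap<String, NavigableMap<String, String>> table = new TreeMap<>();
        for (int r = 0; r < rows; r++) {
            NavigableMap<String, String> cells = new TreeMap<>();
            for (int c = 0; c < columnsPerRow; c++) {
                cells.put(String.format(Locale.ROOT, "c%03d", c), "v");
            }
            table.put(String.format(Locale.ROOT, "row%02d", r), cells);
        }
        return table;
    }

    public static void main(String[] args) {
        NavigableMap<String, NavigableMap<String, String>> table = sampleTable(2, 1000);
        ReadResult all = readByCheckingEveryCell(table, "c100", "c104");
        ReadResult hinted = readWithSeekHints(table, "c100", "c104");
        System.out.println("same matches: " + all.matches.equals(hinted.matches));
        System.out.println("cells examined without hints: " + all.cellsExamined);
        System.out.println("cells examined with hints: " + hinted.cellsExamined);
    }
}
```

In the real client, the filter is built with
new ColumnRangeFilter(minColumn, minColumnInclusive, maxColumn, maxColumnInclusive)
and attached with Scan.setFilter() or Get.setFilter().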


On Thu, Jul 17, 2014 at 12:45 AM, Jianshi Huang <ji...@gmail.com>
wrote:

> Hi Esteban,
>
> Yes, I found it moments ago. Is it as efficient as a row scan?
>
> And can I have millions of columns in a row with little or no performance
> impact? (This is the traditional tall vs. wide problem; the HBase manual
> recommends tall tables over wide tables.)
>
>
> Jianshi

Re: Scan columns of a row within a Range

Posted by Jianshi Huang <ji...@gmail.com>.
Hi Esteban,

Yes, I found it moments ago. Is it as efficient as a row scan?

And can I have millions of columns in a row with little or no performance
impact? (This is the traditional tall vs. wide problem; the HBase manual
recommends tall tables over wide tables.)


Jianshi
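
One way to picture the tall vs. wide question above: a range of columns inside
one wide row corresponds to a range of row keys in a tall table where the
column name is folded into the key. The sketch below is only a toy model in
plain Java. The class TallVersusWide, the "row/column" key format, and the
sample data are assumptions for illustration, and it says nothing about how
either layout actually performs in HBase.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy comparison of a wide and a tall layout (hypothetical; not HBase code). */
final class TallVersusWide {

    /** Wide layout: one row per entity. Returns that row's columns in [min, max]. */
    static NavigableMap<String, String> wideColumnRange(
            NavigableMap<String, NavigableMap<String, String>> wide,
            String row, String min, String max) {
        NavigableMap<String, String> cells = wide.get(row);
        if (cells == null) {
            return new TreeMap<>(); // unknown row: nothing to return
        }
        return cells.subMap(min, true, max, true);
    }

    /** Tall layout: the same read becomes a plain range over "row/column" keys. */
    static NavigableMap<String, String> tallRowRange(
            NavigableMap<String, String> tall, String row, String min, String max) {
        return tall.subMap(row + "/" + min, true, row + "/" + max, true);
    }

    /** Rewrites every (row, column) cell of the wide layout as its own tall row. */
    static NavigableMap<String, String> toTall(
            NavigableMap<String, NavigableMap<String, String>> wide) {
        NavigableMap<String, String> tall = new TreeMap<>();
        for (Map.Entry<String, NavigableMap<String, String>> row : wide.entrySet()) {
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                tall.put(row.getKey() + "/" + cell.getKey(), cell.getValue());
            }
        }
        return tall;
    }

    public static void main(String[] args) {
        NavigableMap<String, NavigableMap<String, String>> wide = new TreeMap<>();
        NavigableMap<String, String> cells = new TreeMap<>();
        for (String q : new String[] {"e001", "e002", "e003", "e004", "e005"}) {
            cells.put(q, "value-" + q);
        }
        wide.put("user1", cells);
        NavigableMap<String, String> tall = toTall(wide);
        // prints [e002, e003, e004]
        System.out.println(wideColumnRange(wide, "user1", "e002", "e004").keySet());
        // prints [user1/e002, user1/e003, user1/e004]
        System.out.println(tallRowRange(tall, "user1", "e002", "e004").keySet());
    }
}
```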


On Thu, Jul 17, 2014 at 3:01 PM, Esteban Gutierrez <es...@cloudera.com>
wrote:

> Hi Jianshi,
>
> Have you looked into the ColumnRangeFilter?
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnRangeFilter.html
>
> cheers,
> esteban.

Re: Scan columns of a row within a Range

Posted by Esteban Gutierrez <es...@cloudera.com>.
Hi Jianshi,

Have you looked into the ColumnRangeFilter?
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnRangeFilter.html

cheers,
esteban.


--
Cloudera, Inc.



On Wed, Jul 16, 2014 at 11:40 PM, Jianshi Huang <ji...@gmail.com>
wrote:

> Hi,
>
> I looked through HBase's Scan API and couldn't find a way to scan a range
> of columns within a row.
>
> It seems I can only do scan(startRow, endRow), where both arguments are just
> RowKeys.
>
> What's the most efficient way to do it? Should I use a Filter? I heard that
> filters are not as efficient as row key (RK) scans. How much slower are they?
>
> (BTW, I was using Accumulo for the same thing and it has a really nice API
> (Range, Key) for it. A Key is a combination of RK+CF+CQ+TS.)
>
> Am I missing anything?
>
> Cheers,
> --
> Jianshi Huang