Posted to user@hbase.apache.org by "Gan, Xiyun" <ga...@gmail.com> on 2011/04/20 14:23:16 UTC

HBase column wide scanning and fetching

My problem is as follows:

http://stackoverflow.com/questions/4790029/hbase-column-wide-scanning-and-fetching

Let's say I've created a table:

rowkey (attrId+attr_value) // compound key

column => doc:doc1, doc:doc2, ... [the qualifier is variable and depends
on the value]

When I use the scan feature, I fetch one row at a time inside the iterator.
What if the number of column qualifiers in a row reaches millions of entries?
How do you loop through them, and will there be a cache issue?

Thanks.


-- 
Best wishes
Gan, Xiyun

Re: HBase column wide scanning and fetching

Posted by Stack <st...@duboce.net>.

On Sun, Apr 24, 2011 at 8:22 PM, Gan, Xiyun <ga...@gmail.com> wrote:
> It works for java code, but I'm writing php scripts using thrift gateway.
> What is the solution?
>

Hack what you need into the Thrift IDL and regenerate your PHP bindings.
Thanks,
St.Ack

Re: HBase column wide scanning and fetching

Posted by "Gan, Xiyun" <ga...@gmail.com>.

It works for Java code, but I'm writing PHP scripts using the Thrift gateway.
What is the solution?

Thanks

On Thu, Apr 21, 2011 at 1:27 AM, Stack <st...@duboce.net> wrote:

> On Wed, Apr 20, 2011 at 5:23 AM, Gan, Xiyun <ga...@gmail.com> wrote:
> > when use scan feature, i would fetch 1 row every time inside iterator, what
> > if the column qualifier reach millions entries. how do you loop through
> > that, and will there be a cache issue?
> >
>
> You do in-row scan.  You set an upper bound on how many columns to
> return on each next invocation [1.].  In this way you can iterate
> through a large row incrementally.  Regards cache, you can set whether
> or not your scan goes via cache on the Scan object [2.].
>
> St.Ack
>
> 1. http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)
> 2. http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks(boolean)
>



-- 
Best wishes
Gan, Xiyun

Re: HBase column wide scanning and fetching

Posted by Stack <st...@duboce.net>.

On Wed, Apr 20, 2011 at 5:23 AM, Gan, Xiyun <ga...@gmail.com> wrote:
> when use scan feature, i would fetch 1 row every time inside iterator, what
> if the column qualifier reach millions entries. how do you loop through
> that, and will there be a cache issue?
>

You do an in-row scan: set an upper bound on how many columns are
returned on each next() invocation [1.].  In this way you can iterate
through a large row incrementally.  Regarding the cache, you can set whether
or not your scan goes via the block cache on the Scan object [2.].

St.Ack

1. http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setBatch(int)
2. http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setCacheBlocks(boolean)
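
To make the in-row scan concrete, here is a minimal sketch of what batching does to a single wide row. It is a plain-Java model with no HBase dependency, so it runs on its own. The class name InRowBatchSketch, the helper batches(), the table handle and the batch size of 1000 in the comment are illustrative assumptions, not part of HBase. Against a real cluster the equivalent knobs are Scan.setBatch(int) and Scan.setCacheBlocks(boolean), as linked above.

```java
import java.util.ArrayList;
import java.util.List;

public class InRowBatchSketch {

    // Against a real cluster the client code would look roughly like this
    // (the table handle and the value 1000 are assumptions for illustration):
    //
    //   Scan scan = new Scan();
    //   scan.setBatch(1000);         // at most 1000 columns per Result
    //   scan.setCacheBlocks(false);  // keep a one-off wide scan out of the block cache
    //   ResultScanner scanner = table.getScanner(scan);
    //   for (Result partial : scanner) { /* one slice of a row per iteration */ }
    //   scanner.close();

    /**
     * Hypothetical helper that models the batching: splits the qualifiers of
     * one wide row into consecutive chunks of at most batchSize columns, the
     * way successive next() calls would hand them back.
     */
    static List<List<String>> batches(List<String> qualifiers, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> out = new ArrayList<>();
        for (int start = 0; start < qualifiers.size(); start += batchSize) {
            int end = Math.min(start + batchSize, qualifiers.size());
            // Copy the slice so each chunk is independent of the source list.
            out.add(new ArrayList<>(qualifiers.subList(start, end)));
        }
        return out;
    }

    public static void main(String[] args) {
        // A small stand-in for a row with many doc:docN qualifiers.
        List<String> row = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            row.add("doc:doc" + i);
        }

        int call = 0;
        for (List<String> chunk : batches(row, 4)) {
            call++;
            System.out.println("next() #" + call + " returned " + chunk.size()
                    + " columns: " + chunk);
        }
    }
}
```

With ten qualifiers and a batch size of 4, the row comes back in three slices of 4, 4 and 2 columns. The client never holds the whole row in memory at once, which is the point when a row has millions of qualifiers.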