Posted to user@hbase.apache.org by Weishung Chung <we...@gmail.com> on 2011/01/18 17:30:31 UTC

Scan with Filter

I have some questions about the way HBase returns results. When we use
HTable.getScanner(Scan), it looks like it only retrieves 1 row per next()
call, which is different from how JDBC returns the result set. If I set the
Filter on the Scan, could it return a set of rows in one connection call? It
looks like the ClientScanner does not make use of the Filter. Only the Scan
uses the Filter in the readFields method. Please correct me if I am wrong
about this.

Thank you so much.

Re: Scan with Filter

Posted by Sean Bigdatafun <se...@gmail.com>.
If I expect the scan to return more than 10k rows (just a big number),
how should we set the caching value properly? If I set it to 1k, then
there will be at least 10 RPC round trips.
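
The arithmetic behind this question can be sketched in a few lines. This is only a back-of-the-envelope model, not HBase code: the class and method names (ScanRoundTrips, roundTrips) are made up for illustration, and it counts only the calls that carry rows. A real client makes a few additional calls, for example to open and close the scanner.

```java
class ScanRoundTrips {

    /**
     * Number of row-carrying RPCs needed to return `rows` rows when each
     * RPC carries at most `caching` rows (the value one would pass to
     * Scan.setCaching()). Hypothetical helper, not part of HBase.
     */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        // Ceiling division: a partially filled last batch still costs one RPC.
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long expectedRows = 10_000;
        for (int caching : new int[] {1, 100, 1_000, 10_000}) {
            System.out.printf("caching=%d -> %d round trips%n",
                    caching, roundTrips(expectedRows, caching));
        }
    }
}
```

A larger caching value means fewer round trips, but each response is bigger and has to be held in memory, so the value is a trade-off rather than something to push as high as possible.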




On Tue, Jan 18, 2011 at 1:24 PM, Weishung Chung <we...@gmail.com> wrote:

> Thank you :)
>
> On Tue, Jan 18, 2011 at 12:49 PM, Jonathan Gray <jg...@fb.com> wrote:
>
> > The API shows one row per next() call but the number of rows fetched per
> > RPC can be configured much higher with Scan.setCaching().
> >
> > Filters are basically just server-side predicates that will dictate which
> > rows/columns/values will be returned to the client.  This does not relate to
> > the number of rows sent per RPC.  See
> > http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/filter/package-summary.html
> > for more information about filters.
> >
> > JG



-- 
--Sean

Re: Scan with Filter

Posted by Weishung Chung <we...@gmail.com>.
Thank you :)

On Tue, Jan 18, 2011 at 12:49 PM, Jonathan Gray <jg...@fb.com> wrote:

> The API shows one row per next() call but the number of rows fetched per
> RPC can be configured much higher with Scan.setCaching().
>
> Filters are basically just server-side predicates that will dictate which
> rows/columns/values will be returned to the client.  This does not relate to
> the number of rows sent per RPC.  See
> http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/filter/package-summary.html
> for more information about filters.
>
> JG

RE: Scan with Filter

Posted by Jonathan Gray <jg...@fb.com>.
The API shows one row per next() call but the number of rows fetched per RPC can be configured much higher with Scan.setCaching().

Filters are basically just server-side predicates that will dictate which rows/columns/values will be returned to the client.  This does not relate to the number of rows sent per RPC.  See http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/filter/package-summary.html for more information about filters.
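
To illustrate the two points above, here is a toy model of the idea. It is not HBase's actual implementation, and every name in it (ToyServer, ToyClientScanner, nextBatch, run) is hypothetical. The "server" applies the filter and returns up to `caching` matching rows per simulated RPC; the "client" still hands out one row per next() call and only goes back to the server when its local cache is empty.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

class ToyScannerDemo {

    /** Result of one simulated RPC: matching rows plus where to resume. */
    static class Batch {
        final List<String> rows;
        final int nextPos;
        Batch(List<String> rows, int nextPos) { this.rows = rows; this.nextPos = nextPos; }
    }

    /** Stand-in for a region server holding sorted row keys. */
    static class ToyServer {
        private final List<String> rows;
        int rpcCount = 0;
        ToyServer(List<String> rows) { this.rows = rows; }

        /** One simulated RPC: the filter runs here, on the server side. */
        Batch nextBatch(int start, int caching, Predicate<String> filter) {
            rpcCount++;
            List<String> out = new ArrayList<>();
            int pos = start;
            while (pos < rows.size() && out.size() < caching) {
                String row = rows.get(pos++);
                if (filter.test(row)) out.add(row);
            }
            return new Batch(out, pos);
        }
    }

    /** Stand-in for the client scanner: one row per next(), refilled in batches. */
    static class ToyClientScanner {
        private final ToyServer server;
        private final int caching;
        private final Predicate<String> filter;
        private final Queue<String> cache = new ArrayDeque<>();
        private int pos = 0;
        private boolean exhausted = false;

        ToyClientScanner(ToyServer server, int caching, Predicate<String> filter) {
            this.server = server; this.caching = caching; this.filter = filter;
        }

        /** Returns the next matching row, or null when the scan is finished. */
        String next() {
            if (cache.isEmpty() && !exhausted) {
                Batch b = server.nextBatch(pos, caching, filter);
                cache.addAll(b.rows);
                pos = b.nextPos;
                // A short batch means the server ran out of rows.
                if (b.rows.size() < caching) exhausted = true;
            }
            return cache.poll();
        }
    }

    /** Scans totalRows rows, keeping even-numbered ones; returns {rowsSeen, rpcs}. */
    static int[] run(int totalRows, int caching) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) rows.add(String.format("row-%04d", i));
        ToyServer server = new ToyServer(rows);
        Predicate<String> evenOnly = r -> Integer.parseInt(r.substring(4)) % 2 == 0;
        ToyClientScanner scanner = new ToyClientScanner(server, caching, evenOnly);
        int seen = 0;
        while (scanner.next() != null) seen++;
        return new int[] {seen, server.rpcCount};
    }

    public static void main(String[] args) {
        for (int caching : new int[] {1, 100}) {
            int[] r = run(1000, caching);
            System.out.println("caching=" + caching + " rows=" + r[0] + " rpcs=" + r[1]);
        }
    }
}
```

In this model the filter changes which rows come back (500 of the 1000), while the caching value alone decides how many simulated RPCs it takes to deliver them.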

JG

> -----Original Message-----
> From: Weishung Chung [mailto:weishung@gmail.com]
> Sent: Tuesday, January 18, 2011 8:31 AM
> To: user@hbase.apache.org
> Subject: Scan with Filter
> 
> I have some questions about the way HBase returns results. When we use
> HTable.getScanner(Scan), it looks like it only retrieves 1 row per next() call,
> which is different from how JDBC returns the result set. If I set the Filter on
> the Scan, could it return a set of rows in one connection call? It looks like the
> ClientScanner does not make use of the Filter. Only the Scan uses the Filter in
> the readFields method. Please correct me if I am wrong about this.
> 
> Thank you so much.