Posted to user@hbase.apache.org by Mohit Anchlia <mo...@gmail.com> on 2012/07/31 02:56:31 UTC

Parallel scans

Is there a way to execute multiple scans in parallel like get?

Re: Parallel scans

Posted by Alex Baranau <al...@gmail.com>.
> Is there a way to execute multiple scans in parallel like get?

I guess the question is whether we can (and whether it makes sense to)
execute multiple scans in parallel, e.g. in multiple threads inside the
client. The answer is yes, you can do it and it makes sense: HBase can
likely handle many more requests in parallel than your clients issue (this
depends on how many clients you have, of course, but I assume you don't
have more than several, incl. MR jobs).
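
Fanning scans out over client threads boils down to submitting one scan
task per sub-range to an ExecutorService. A minimal sketch of that pattern
(the scanRange body below is a stand-in so the sketch runs without a
cluster; a real one would open its own HTable per thread, since HTable
instances are not thread-safe, build a Scan over the sub-range, and iterate
its ResultScanner):

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelScans {

    // Stand-in for a real scan over [startRow, stopRow): returns a fake
    // row count. Replace with HTable + Scan + ResultScanner iteration.
    static long scanRange(String startRow, String stopRow) {
        return stopRow.charAt(0) - startRow.charAt(0);
    }

    // Submits one scan task per sub-range and sums the per-range results.
    static long parallelScan(List<String[]> ranges, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            for (final String[] r : ranges) {
                futures.add(pool.submit(new Callable<Long>() {
                    public Long call() {
                        return scanRange(r[0], r[1]);
                    }
                }));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                try {
                    total += f.get(); // rethrows any per-scan failure
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size caps how many scans hit the cluster at once, so it can be
tuned independently of the number of sub-ranges.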

Alex Baranau
------
Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
Solr

On Tue, Jul 31, 2012 at 3:27 PM, Tom Brown <to...@gmail.com> wrote:

> I think you could do it manually by looking up all the different
> regions and starting a separate scan for each region. Not quite as
> handy as the built-in multi get, but essentially the same.
>
> Of course, that leaves the question of processing-- If you're
> processing it in a single-threaded environment, HBase is unlikely to
> be the bottleneck. If you're sending each scan to multiple processors,
> this could be a significant speedup.
>
> --Tom
>
> On Mon, Jul 30, 2012 at 11:34 PM, Bertrand Dechoux <de...@gmail.com>
> wrote:
> > Hi,
> >
> > Are you asking about a coprocessor or about MapReduce input? If it is
> > the first, then it is up to you (the client). If it is the latter, I am
> > not sure the whole job would be noticeably faster if the scans were made
> > parallel (assuming they are sequential now). But I am interested in an
> > answer too.
> >
> > Regards
> >
> > Bertrand
> >
> > On Tue, Jul 31, 2012 at 2:56 AM, Mohit Anchlia
> > <mohitanchlia@gmail.com> wrote:
> >
> >> Is there a way to execute multiple scans in parallel like get?
> >>
> >
> >
> >
> > --
> > Bertrand Dechoux
>




Re: Parallel scans

Posted by Tom Brown <to...@gmail.com>.
I think you could do it manually by looking up all the different
regions and starting a separate scan for each region. Not quite as
handy as the built-in multi get, but essentially the same.
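
That region lookup can be reduced to a pure range-splitting step, assuming
the table's region start keys have already been fetched (e.g. via something
like HTable.getStartEndKeys() in the client API of that era); only the
splitting logic itself is sketched here:

```java
import java.util.ArrayList;
import java.util.List;

public class RegionRangeSplitter {

    // Unsigned lexicographic comparison of row keys (what HBase's
    // Bytes.compareTo does).
    static int cmp(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Splits [startRow, stopRow) into one sub-range per region, given the
    // table's region start keys in sorted order (the first region's start
    // key is the empty byte[]). Each returned {start, stop} pair can then
    // be scanned by its own thread.
    static List<byte[][]> split(byte[][] regionStartKeys,
                                byte[] startRow, byte[] stopRow) {
        List<byte[][]> ranges = new ArrayList<byte[][]>();
        for (int i = 0; i < regionStartKeys.length; i++) {
            byte[] regionStart = regionStartKeys[i];
            // The last region has no upper bound.
            byte[] regionStop =
                (i + 1 < regionStartKeys.length) ? regionStartKeys[i + 1] : null;
            // Clip the region's boundaries to the requested scan range.
            byte[] lo = cmp(regionStart, startRow) > 0 ? regionStart : startRow;
            byte[] hi = (regionStop == null || cmp(stopRow, regionStop) < 0)
                ? stopRow : regionStop;
            if (cmp(lo, hi) < 0) {
                ranges.add(new byte[][] { lo, hi });
            }
        }
        return ranges;
    }
}
```

Regions whose key range falls entirely outside [startRow, stopRow) simply
produce no sub-range, so only the relevant regions get a scan.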

Of course, that leaves the question of processing-- If you're
processing it in a single-threaded environment, HBase is unlikely to
be the bottleneck. If you're sending each scan to multiple processors,
this could be a significant speedup.

--Tom

On Mon, Jul 30, 2012 at 11:34 PM, Bertrand Dechoux <de...@gmail.com> wrote:
> Hi,
>
> Are you asking about a coprocessor or about MapReduce input? If it is the
> first, then it is up to you (the client). If it is the latter, I am not
> sure the whole job would be noticeably faster if the scans were made
> parallel (assuming they are sequential now). But I am interested in an
> answer too.
>
> Regards
>
> Bertrand
>
> On Tue, Jul 31, 2012 at 2:56 AM, Mohit Anchlia <mo...@gmail.com> wrote:
>
>> Is there a way to execute multiple scans in parallel like get?
>>
>
>
>
> --
> Bertrand Dechoux

Re: Parallel scans

Posted by Bertrand Dechoux <de...@gmail.com>.
Hi,

Are you asking about a coprocessor or about MapReduce input? If it is the
first, then it is up to you (the client). If it is the latter, I am not
sure the whole job would be noticeably faster if the scans were made
parallel (assuming they are sequential now). But I am interested in an
answer too.

Regards

Bertrand

On Tue, Jul 31, 2012 at 2:56 AM, Mohit Anchlia <mo...@gmail.com> wrote:

> Is there a way to execute multiple scans in parallel like get?
>



-- 
Bertrand Dechoux