Posted to user@hbase.apache.org by Jean-Marc Spaggiari <je...@spaggiari.org> on 2014/02/01 02:46:14 UTC

Re: Full table scan from random starting point?

Hi Robert,

You can build a random start key, give it to a scanner, and scan until the
end of the table. Then open a new scanner from the beginning of the table
and give it that same key as its end key. Between the two scanners you
cover the whole table once, in the wrap-around order you are looking for.
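
To make the two passes concrete, here is a rough sketch. It does not use the
HBase client at all: a sorted in-memory TreeMap stands in for the table, and
the class and method names (WrapAroundScan, scanFrom) are made up for this
illustration. With the real client, the first pass would be a Scan with only
a start row set and the second a Scan with only a stop row set. The stop row
is exclusive, so no row should be read twice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration only: a TreeMap plays the role of the sorted table.
class WrapAroundScan {

    // Returns the row keys in the order a client would read them:
    // pass 1 from startKey (inclusive) to the end of the table,
    // pass 2 from the beginning of the table up to startKey (exclusive).
    static List<String> scanFrom(TreeMap<String, String> table, String startKey) {
        List<String> order = new ArrayList<>();
        // Pass 1: like a scanner with a start key and no end key.
        order.addAll(table.tailMap(startKey, true).keySet());
        // Pass 2: like a scanner with no start key and startKey as the end key.
        order.addAll(table.headMap(startKey, false).keySet());
        return order;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        for (int i = 1; i <= 6; i++) {
            table.put(String.valueOf(i), "value" + i);
        }
        // Same order as client2 in the example from the question.
        System.out.println(scanFrom(table, "3")); // prints [3, 4, 5, 6, 1, 2]
    }
}
```

The start key does not have to be an existing row. Any key works, since the
first pass simply begins at the next row at or after it.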

Also, this might interest you:
https://issues.apache.org/jira/browse/HBASE-9272

JM


2014-01-31 Robert Dyer <rd...@iastate.edu>:

> Let's say I have one client on each of my regionservers.  Each client needs
> to do a full scan on the same table.  The order in which the rows are
> scanned by clients does not matter.
>
> Is it possible to have each client start at a random (or better, the first
> row located on the local rs) point in the table so that if I start all of
> them at once they don't all peg the same rs for reads?
>
> Example (to keep it simple, assume 3 RS):
>
> RS1: rows 1-2
> RS2: rows 3-4
> RS3: rows 5-6
>
> client1 (on RS1) reads rows: 1, 2, 3, 4, 5, 6
> client2 (on RS2) reads rows: 3, 4, 5, 6, 1, 2
> client3 (on RS3) reads rows: 5, 6, 1, 2, 3, 4
>
> Obviously they may progress at different rates and still wind up hitting
> the same RSs, but at least we can start out a bit more distributed.
>
> Is this easily possible, without first obtaining a list of all rows and
> manually batching them up?
>