Posted to user@hbase.apache.org by Felix Sprick <fs...@gmail.com> on 2011/05/11 11:21:46 UTC

Scans on salted rowkeys

Hi guys,

I am using rowkeys with a pattern like [minute]_[timestamp]. My main use
case is reading time ranges that span a couple of hours, and I want to
read in parallel from as many nodes in the cluster as possible, so the
data is distributed across the cluster in minute buckets.

The problem now is that I am not sure how to do sequential reads (for
example, all records between 11:10 and 12:00), or how to define such
time frames as input to my MapReduce jobs.

Any ideas?

Thanks,
Felix

Re: Scans on salted rowkeys

Posted by Ted Yu <yu...@gmail.com>.

See the '[ANN]: HBaseWD: Distribute Sequential Writes in HBase' thread.

https://github.com/sematext/HBaseWD
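
A rough sketch of how reads over keys like [minute]_[timestamp] can work
in general: a time range becomes one scan per minute bucket, each bounded
by the same start and stop timestamps, and the per-bucket results are
merged back into time order. This is not HBaseWD's API. The key layout
(two-digit minute of the hour, then zero-padded epoch seconds) and the
helper names row_key, scan_ranges and merge_in_time_order are assumptions
made for illustration.

```python
import heapq
from datetime import datetime, timedelta, timezone


def row_key(ts):
    """Build a '[minute]_[timestamp]' key (assumed layout, not from the thread)."""
    return f"{ts.minute:02d}_{int(ts.timestamp()):010d}"


def scan_ranges(start, stop):
    """Return one (start_row, stop_row) pair per minute bucket touched by
    the half-open time range [start, stop)."""
    if stop <= start:
        return []
    lo, hi = int(start.timestamp()), int(stop.timestamp())
    # Walk the range minute by minute and collect the buckets it touches;
    # once all 60 buckets have been seen there is nothing left to add.
    buckets = set()
    cursor = start.replace(second=0, microsecond=0)
    while cursor < stop and len(buckets) < 60:
        buckets.add(cursor.minute)
        cursor += timedelta(minutes=1)
    return [(f"{m:02d}_{lo:010d}", f"{m:02d}_{hi:010d}") for m in sorted(buckets)]


def merge_in_time_order(per_bucket_keys):
    """Merge per-bucket key lists (each already sorted) into one stream
    ordered by the timestamp part of the key."""
    return heapq.merge(*per_bucket_keys, key=lambda k: k.split("_", 1)[1])


# All records between 11:10 and 12:00 (UTC assumed for the example).
start = datetime(2011, 5, 11, 11, 10, tzinfo=timezone.utc)
stop = datetime(2011, 5, 11, 12, 0, tzinfo=timezone.utc)
ranges = scan_ranges(start, stop)
print(len(ranges))  # 50: one range for each of the minute buckets 10..59
```

Each pair could back one HBase scan (start row inclusive, stop row
exclusive), which may also be a natural unit for a MapReduce input split.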
