Posted to user@hbase.apache.org by Tomas Tillery <to...@gmail.com> on 2012/04/30 16:00:26 UTC

Scan on compound key

Hello,

I have a key composed of two parts, a semi-unique value and the time it was
observed, pipe separated. I need to do scans on the data based on the
value. The value I will be scanning on is always at the front of the string
(I am not searching based on date). So far, I've been using
SubstringComparator, but it has to consider all parts of the key, and takes
a long time for keys that start near the end of the list. Is there a better
filter to use for substring scans at the beginning of the string?

A few keys might look like this:
"example key|2012-01-01 00:01:01.000000001"
"example key|2012-02-01 00:01:01.000000001"

And I would be looking for all keys that matched "example key".

Thanks,
Tomas

Re: Scan on compound key

Posted by Tomas Tillery <to...@gmail.com>.

No, that looks like exactly what I should have been doing. Thanks.

On Mon, Apr 30, 2012 at 10:34 AM, Alex Baranau <al...@gmail.com> wrote:

> Why not just define startRow & stopRow for Scan [1]? Am I missing something?
>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase
>
> [1]
>
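> Something like:
>
> // Needs java.util.Arrays, org.apache.hadoop.hbase.client.Scan and
> // org.apache.hadoop.hbase.util.Bytes.
> byte[] startRow = Bytes.toBytes("example key");
> byte[] stopRow = Arrays.copyOf(startRow, startRow.length);
> // The stop row is exclusive. Be careful when incrementing the last byte:
> // a value of 0xFF would overflow to 0x00.
> stopRow[stopRow.length - 1]++;
> Scan scan = new Scan(startRow, stopRow);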

Re: Scan on compound key

Posted by Alex Baranau <al...@gmail.com>.

Why not just define startRow & stopRow for Scan [1]? Am I missing something?

Alex Baranau
------
Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase

[1]

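Something like:

// Needs java.util.Arrays, org.apache.hadoop.hbase.client.Scan and
// org.apache.hadoop.hbase.util.Bytes.
byte[] startRow = Bytes.toBytes("example key");
byte[] stopRow = Arrays.copyOf(startRow, startRow.length);
// The stop row is exclusive. Be careful when incrementing the last byte:
// a value of 0xFF would overflow to 0x00.
stopRow[stopRow.length - 1]++;
Scan scan = new Scan(startRow, stopRow);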
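
The "be careful when incrementing the last byte" caveat can be sketched
in plain Java. This is a minimal illustration and not HBase code: a
TreeMap ordered by unsigned bytes stands in for the table, and the names
PrefixScanSketch, stopRowForPrefix, prefixScan and sampleTable, as well
as the "example keys|..." and "other key|..." rows, are hypothetical.
Appending the "|" separator to the prefix is a suggestion that does not
come from the thread; it keeps "example key" from also matching
"example keys". It assumes Java 9 or later for Arrays.compareUnsigned.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

final class PrefixScanSketch {

    // HBase orders row keys as unsigned bytes; mirror that ordering here.
    static final Comparator<byte[]> ROW_ORDER = Arrays::compareUnsigned;

    /**
     * Returns the smallest key that sorts after every key starting with the
     * prefix, for use as an exclusive stop row. Trailing 0xFF bytes cannot be
     * incremented, so they are dropped first. An empty result means there is
     * no upper bound (the prefix was empty or consisted only of 0xFF bytes).
     */
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stopRow = Arrays.copyOf(prefix, end);
        stopRow[end - 1]++;
        return stopRow;
    }

    /** Range lookup: start row inclusive, stop row exclusive. */
    static List<String> prefixScan(SortedMap<byte[], String> table, String prefix) {
        byte[] startRow = prefix.getBytes(StandardCharsets.UTF_8);
        byte[] stopRow = stopRowForPrefix(startRow);
        SortedMap<byte[], String> range = (stopRow.length == 0)
                ? table.tailMap(startRow)
                : table.subMap(startRow, stopRow);
        return new ArrayList<>(range.values());
    }

    /** Hypothetical sample rows; each value simply repeats its row key. */
    static SortedMap<byte[], String> sampleTable() {
        SortedMap<byte[], String> table = new TreeMap<>(ROW_ORDER);
        String[] keys = {
            "example key|2012-01-01 00:01:01.000000001",
            "example key|2012-02-01 00:01:01.000000001",
            "example keys|2012-01-01 00:01:01.000000001",
            "other key|2012-01-01 00:01:01.000000001"
        };
        for (String key : keys) {
            table.put(key.getBytes(StandardCharsets.UTF_8), key);
        }
        return table;
    }

    public static void main(String[] args) {
        SortedMap<byte[], String> table = sampleTable();
        // With the separator: only the two "example key|..." rows.
        System.out.println(prefixScan(table, "example key|"));
        // Without it: the "example keys|..." row is returned as well.
        System.out.println(prefixScan(table, "example key"));
    }
}
```

Against a real table, the two byte arrays would be passed to
new Scan(startRow, stopRow) as in the snippet above, so only the rows
inside the range are read instead of every row in the table.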
