Posted to user@hbase.apache.org by Something Something <ma...@gmail.com> on 2011/07/26 17:20:46 UTC

Retrieving last 100 rows by timestamp

Hello,

I need to create a report that shows the last 100 rows by timestamp.  This
query should return almost instantaneously.  Any recommendations regarding the
design?  I was thinking of creating a table with a sequence number as the key,
where the value would be the key of another table that contains the master
data.  Anyway, I will think some more, but if anyone has done something
similar, I would greatly appreciate it if they would share the knowledge.  Thanks.

- Ajay

RE: Retrieving last 100 rows by timestamp

Posted by "Srikanth P. Shreenivas" <Sr...@mindtree.com>.
There are a few ways to look at it:

1) Let's say your row key is a LONG value that keeps incrementing with every write. Then you can call the HTable.getRowOrBefore method repeatedly to get hold of the latest N entries.  For the first call, pass Long.MAX_VALUE as the row key; HBase will return the row whose key is at or just before Long.MAX_VALUE, which in this case is the latest row.  Let's call its key "latestRowKey".  For the next call to "getRowOrBefore", pass "latestRowKey - 1" as the row key, and so on.

    http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRowOrBefore(byte[], byte[])


2) A variant of the above approach is to use "<timestamp>" as the row key.  In this case, pass the current time to the first call to "getRowOrBefore" and (latestRowKey - 1) to subsequent calls.


3) Use a reverse timestamp.  This is discussed in the HBase chapter of "Hadoop: The Definitive Guide".
   If you use Long.MAX_VALUE - System.currentTimeMillis() as the row key, then the latest entries will show up first when you do a Scan on the table.
   This way, the first N entries will be your latest N entries.
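
The three approaches above can be sketched without a cluster.  The following is only an illustration, not HBase client code: a java.util.TreeMap stands in for a table's sorted row keys, floorEntry plays the role of getRowOrBefore, and iterating the map plays the role of a Scan.  The class and method names (LatestRowsSketch, reverseKey, latestByWalkingBack, latestByForwardScan) are made up for this example, and the row keys are assumed to be non-negative longs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: a TreeMap simulates a table sorted by row key. */
final class LatestRowsSketch {

    /** Approach 3 key: newer timestamps produce smaller keys. */
    static long reverseKey(long timestampMillis) {
        return Long.MAX_VALUE - timestampMillis;
    }

    /**
     * Approaches 1 and 2: keys grow over time, so start at Long.MAX_VALUE
     * and walk backwards with repeated "row or before" lookups.
     */
    static List<String> latestByWalkingBack(TreeMap<Long, String> table, int n) {
        List<String> out = new ArrayList<>();
        long probe = Long.MAX_VALUE;
        while (out.size() < n) {
            // floorEntry stands in for getRowOrBefore in this simulation.
            Map.Entry<Long, String> e = table.floorEntry(probe);
            if (e == null) {
                break; // no more rows
            }
            out.add(e.getValue());
            long key = e.getKey();
            if (key == 0) {
                break; // assumed smallest possible key; nothing before it
            }
            probe = key - 1; // "latestRowKey - 1" for the next call
        }
        return out;
    }

    /**
     * Approach 3: keys are reverse timestamps, so the first n rows of a
     * forward scan are the newest n rows.
     */
    static List<String> latestByForwardScan(TreeMap<Long, String> table, int n) {
        List<String> out = new ArrayList<>();
        for (String value : table.values()) {
            if (out.size() == n) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ascending = new TreeMap<>();
        TreeMap<Long, String> reversed = new TreeMap<>();
        for (long ts = 1000; ts <= 1005; ts++) {
            ascending.put(ts, "row@" + ts);
            reversed.put(reverseKey(ts), "row@" + ts);
        }
        System.out.println(latestByWalkingBack(ascending, 3));
        System.out.println(latestByForwardScan(reversed, 3));
    }
}
```

Both calls in main return the three newest rows, newest first.  The walk-back version needs one lookup per row, while the reverse-timestamp version needs a single forward scan.  Note that with a bare millisecond timestamp as the key, two rows written in the same millisecond would share a key, so a real design would probably need some extra suffix to keep keys unique.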


Regards,
Srikanth


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Tuesday, July 26, 2011 10:25 PM
To: user@hbase.apache.org
Subject: Re: Retrieving last 100 rows by timestamp

Hello Ajay,

See the recent mailing list discussion on using reverse timestamps in the row key. It should help your case, I think.




Re: Retrieving last 100 rows by timestamp

Posted by Harsh J <ha...@cloudera.com>.
Hello Ajay,

See the recent mailing list discussion on using reverse timestamps in the row key. It should help your case, I think.
