Posted to user@hbase.apache.org by T Vinod Gupta <tv...@readypulse.com> on 2012/01/28 22:09:54 UTC

using hbase key/column name to create and maintain realtime sorted list

Hi,
Can someone tell me whether this approach is inappropriate or inefficient
given how HBase works?
Let's say I want to maintain a list of URLs sorted by the number of visits
to each. I create a column whose name looks like "Long.MAX_VALUE - <num
visits>:<url>". I also have another column whose name is "<url>" itself
and whose value is "Long.MAX_VALUE - <num visits>:<url>", i.e. the name of
the sorted column. Whenever a visit happens on a URL, I look up the sorted
column name through the "<url>" column, increment the visit count, delete
the old sorted column, create a new one with the new visit count, and
update the reference in the "<url>" column.

This way, if I need the 10 most visited URLs, I can retrieve them in
real time.
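
To make the scheme above concrete, here is a minimal sketch in plain Java.
It does not use the HBase client API: a TreeSet stands in for the sorted
column names of one row, and a TreeMap for the "<url>" reference columns.
The class and method names (SortedVisitColumns, sortedName, recordVisit,
top) are hypothetical. Zero-padding the inverted count to 19 digits is an
assumption added here: column names are compared as bytes, so the number
needs a fixed width for lexicographic order to match numeric order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// In-memory model of one row; NOT the HBase API, only the naming scheme.
final class SortedVisitColumns {
    // Stand-in for the sorted columns "<Long.MAX_VALUE - visits>:<url>".
    private final TreeSet<String> sorted = new TreeSet<>();
    // Stand-in for the "<url>" columns: url -> current sorted column name.
    private final Map<String, String> refs = new TreeMap<>();

    // Assumption: pad to 19 digits (the width of Long.MAX_VALUE) so that
    // string order equals numeric order of the inverted count.
    static String sortedName(long visits, String url) {
        return String.format("%019d:%s", Long.MAX_VALUE - visits, url);
    }

    // Recover the visit count from a sorted column name.
    static long visitsOf(String sortedName) {
        return Long.MAX_VALUE - Long.parseLong(sortedName.substring(0, 19));
    }

    // One visit: look up the old sorted name, delete it, write the new one,
    // and update the reference. In HBase this is a read, a delete and puts.
    void recordVisit(String url) {
        String old = refs.get(url);
        long visits = (old == null) ? 0 : visitsOf(old);
        if (old != null) {
            sorted.remove(old);
        }
        String fresh = sortedName(visits + 1, url);
        sorted.add(fresh);
        refs.put(url, fresh);
    }

    // The first n names in sort order are the n most visited URLs.
    List<String> top(int n) {
        List<String> out = new ArrayList<>();
        for (String name : sorted) {
            if (out.size() == n) {
                break;
            }
            out.add(name.substring(20)); // skip 19 digits and the ':'
        }
        return out;
    }

    public static void main(String[] args) {
        SortedVisitColumns row = new SortedVisitColumns();
        String[] visits = {"a.com", "b.com", "b.com", "c.com", "c.com", "c.com"};
        for (String url : visits) {
            row.recordVisit(url);
        }
        System.out.println(row.top(2)); // prints [c.com, b.com]
    }
}
```

Note that the sketch is single-threaded. Against a real table, the
read-delete-put sequence in recordVisit is not atomic, so two concurrent
visits to the same URL could lose an update unless the writes are
coordinated somehow.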

I can see that this approach can create a burst of deletes/puts and
potentially hotspot a region. But if the access pattern is not going to be
bursty, is this OK?

What are the recommendations for maintaining a sorted list in HBase?

Thanks