Posted to user@hbase.apache.org by Robertis Tongbram <rt...@gmail.com> on 2012/09/24 20:40:48 UTC

A different sort of BitComparator

Hi all,

We have a scenario where we pre-calculate a condition based on 7 or 8
dimensions (such as age, gender, and occupation). We then use this calculated
value as the column qualifier (not as the row key).
To optimize storage, we use a bit set to represent the dimensions. For
example, for gender we use 0 to represent female and 1 to represent male,
so gender takes only 1 bit. When we insert data into HBase,
we do this calculation (using BigInteger, since we use Java 6). During
lookup, we do a similar bitwise calculation to come up with the right
column qualifier. The calculation is a simple shift+OR operation. We also had
to re-implement the compareTo() method of the BitComparator class to mask and
compare: for example, if we want to look up occupation, we mask out the other
dimensions and compare the result with the value from the KeyValue. With this
change, response times dropped to almost half of what we saw with our previous
RegEx-based filter. (Earlier, the conditions were concatenated in the column
qualifier with a delimiter, and we used a regex to match against it.)
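
To make the scheme concrete, here is a minimal sketch of the shift+OR packing
and the mask-and-compare check in plain Java. It is an illustration only, not
the code we run: the class name, the bit layout (1 bit for gender, 7 for an
age value, 8 for an occupation code) and the helper names are all assumptions,
and it leaves out the HBase comparator wiring entirely.

```java
import java.math.BigInteger;

public class DimensionBits {
    // Hypothetical layout: bit 0 = gender, bits 1-7 = age, bits 8-15 = occupation.
    static final int GENDER_SHIFT = 0;
    static final int AGE_SHIFT = 1;
    static final int OCCUPATION_SHIFT = 8;
    static final int OCCUPATION_BITS = 8;

    // Place a dimension value at its bit offset.
    static BigInteger field(long value, int shift) {
        return BigInteger.valueOf(value).shiftLeft(shift);
    }

    // Build a mask with `bits` ones starting at bit offset `shift`.
    static BigInteger mask(int shift, int bits) {
        return BigInteger.ONE.shiftLeft(bits).subtract(BigInteger.ONE).shiftLeft(shift);
    }

    // Shift+OR: combine the dimension values into one packed value.
    static BigInteger pack(int gender, int age, int occupation) {
        return field(gender, GENDER_SHIFT)
                .or(field(age, AGE_SHIFT))
                .or(field(occupation, OCCUPATION_SHIFT));
    }

    // Mask-and-compare: only the bits selected by the mask take part.
    static boolean matches(BigInteger qualifier, BigInteger mask, BigInteger expected) {
        return qualifier.and(mask).equals(expected.and(mask));
    }

    public static void main(String[] args) {
        BigInteger qualifier = pack(1, 34, 12);
        BigInteger occupationMask = mask(OCCUPATION_SHIFT, OCCUPATION_BITS);
        // Look up occupation 12 while ignoring gender and age.
        System.out.println(matches(qualifier, occupationMask, field(12, OCCUPATION_SHIFT)));
        System.out.println(matches(qualifier, occupationMask, field(13, OCCUPATION_SHIFT)));
    }
}
```

Note that BigInteger.toByteArray() returns a minimal-length two's-complement
array, so a sketch like this would need padding to a fixed width before the
bytes are used as a qualifier and compared byte by byte.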

I wanted to ask the HBase experts here whether this is the right way of doing
it, or whether there is a better, more scalable and maintainable way.



Thanks,
Robertis