Posted to user@hbase.apache.org by Nishanth S <ni...@gmail.com> on 2014/09/22 22:43:37 UTC
Restructuring Hbase Table
Hi folks,
We have an HBase table with 4 column families that stores log data. The
columns and the content stored in each of these column families are the
same. We use multiple families because we needed 4 retention buckets for
messages and rely on the TTL feature of HBase to achieve this. Each of our
HBase rows has a predefined set of metadata fields and a large blob message.
I am considering restructuring the table into 2 column families: one
column family (cf1) for the metadata and the other (cf2) for the blob
message, which is the meatier chunk. The reasoning is that most of the
analytics queries would be directed at the metadata in cf1 and only a few
at the blob message in cf2. There will be a few use cases where you would
need to query the data in both cf1 and cf2, but that is not the dominant
use case. We would then devise some method to purge the data manually,
using the retention bucket + timestamp in the row key. How does this look
so far? Is there a better way to implement this?
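To make the row key idea concrete, here is a minimal sketch of one possible layout, not a tested design. It assumes a 1-byte retention bucket followed by an 8-byte big-endian timestamp and a unique suffix. The bucket durations, the function names, and the key layout itself are all hypothetical. It is plain Python with no HBase client calls, and only shows how the expired rows of a bucket would fall into one contiguous key range.

```python
import struct

# Hypothetical retention periods in days for the 4 buckets
# (the actual values are not stated above).
RETENTION_DAYS = {0: 7, 1: 30, 2: 90, 3: 365}
MS_PER_DAY = 24 * 60 * 60 * 1000
PREFIX_LEN = 9  # 1 byte bucket + 8 bytes timestamp


def make_row_key(bucket: int, timestamp_ms: int, message_id: bytes) -> bytes:
    """Build a key: bucket byte, big-endian timestamp, then a unique suffix."""
    if bucket not in RETENTION_DAYS:
        raise ValueError(f"unknown retention bucket: {bucket}")
    return struct.pack(">BQ", bucket, timestamp_ms) + message_id


def parse_row_key(row_key: bytes):
    """Split a key back into (bucket, timestamp_ms, message_id)."""
    bucket, timestamp_ms = struct.unpack(">BQ", row_key[:PREFIX_LEN])
    return bucket, timestamp_ms, row_key[PREFIX_LEN:]


def purge_range(bucket: int, now_ms: int):
    """Return the [start, stop) key-prefix range of expired rows in a bucket."""
    cutoff_ms = max(now_ms - RETENTION_DAYS[bucket] * MS_PER_DAY, 0)
    start = struct.pack(">BQ", bucket, 0)
    stop = struct.pack(">BQ", bucket, cutoff_ms)
    return start, stop


def is_expired(row_key: bytes, now_ms: int) -> bool:
    """True if the key sorts inside the purge range of its own bucket."""
    start, stop = purge_range(row_key[0], now_ms)
    return start <= row_key[:PREFIX_LEN] < stop
```

Because the timestamp is big-endian, byte order matches time order within a bucket, so a purge job could scan from start to stop and delete what it finds. One caveat with this layout: keys that grow with time tend to concentrate writes on a single region, which the sketch does not address.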
Thanks,
Nishanth