Posted to user@hbase.apache.org by Pradheep Shanmugam <Pr...@infor.com> on 2016/11/18 16:42:02 UTC

HBase Schema

Hi,

I have a table in HBase that stores multiple versions of a document as separate rows.
The row key is of the form <orgid><doctype><docid><timestamp>; the timestamp differs between versions of the same document.
Orgs are skewed: one org may have 1 billion docs, while others may have only 100K docs.
So I decided to salt the row key to spread the writes across all region servers and improve write throughput.
Another reason for considering salting is that these docs will not be referenced after about 6 months; only the newer ones will be queried often.
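
For illustration, here is a minimal sketch of one way such a salted key could be built. It is not the actual implementation: the bucket count, the MD5-based salt, the one-byte salt prefix, the zero-byte field separator and the 8-byte big-endian timestamp are all assumptions made for the example. The salt is derived only from <orgid><doctype><docid>, so every version of a document lands in the same bucket.

```python
import hashlib
import struct

NUM_BUCKETS = 16  # hypothetical bucket count, e.g. a small multiple of the region server count


def doc_identity(org_id: str, doc_type: str, doc_id: str) -> bytes:
    """Join the identifying fields with a zero byte (assumed never to occur in the fields)."""
    return "\x00".join((org_id, doc_type, doc_id)).encode("utf-8")


def salt_for(org_id: str, doc_type: str, doc_id: str) -> int:
    """Derive a stable bucket from the document identity, never from the timestamp."""
    digest = hashlib.md5(doc_identity(org_id, doc_type, doc_id)).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


def row_key(org_id: str, doc_type: str, doc_id: str, timestamp_ms: int) -> bytes:
    """Layout: <salt byte><orgid>\\0<doctype>\\0<docid>\\0<8-byte big-endian timestamp>."""
    salt = bytes([salt_for(org_id, doc_type, doc_id)])
    identity = doc_identity(org_id, doc_type, doc_id)
    return salt + identity + b"\x00" + struct.pack(">Q", timestamp_ms)


# Two versions of the same document share a bucket and sort by timestamp.
v1 = row_key("org1", "invoice", "doc42", 1479486000000)
v2 = row_key("org1", "invoice", "doc42", 1479486122000)
```

Because the salt is a deterministic function of the document identity, a reader can recompute it and go straight to one bucket. A random or round-robin salt would spread writes equally well but would force every read to check all buckets.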

Assuming a hybrid load, will salting affect read performance (getting the latest version of a document given <orgid><doctype><docid>) for both large and small orgs when the table holds more than 10 billion rows in total?
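
To make the read in question concrete, the sketch below simulates it over a sorted in-memory list of keys standing in for the table. It assumes (none of this is confirmed for the real table) a one-byte salt computed from <orgid><doctype><docid>, zero-byte field separators and an 8-byte big-endian timestamp suffix. Under those assumptions the latest version is the last key in the document's prefix range, found with a single binary search in one bucket.

```python
import bisect
import hashlib
import struct

NUM_BUCKETS = 16  # hypothetical bucket count


def key_prefix(org_id: str, doc_type: str, doc_id: str) -> bytes:
    """Salted prefix shared by every version of one document (assumed layout)."""
    identity = "\x00".join((org_id, doc_type, doc_id)).encode("utf-8")
    salt = int.from_bytes(hashlib.md5(identity).digest()[:4], "big") % NUM_BUCKETS
    return bytes([salt]) + identity + b"\x00"


def make_key(org_id: str, doc_type: str, doc_id: str, timestamp_ms: int) -> bytes:
    return key_prefix(org_id, doc_type, doc_id) + struct.pack(">Q", timestamp_ms)


def latest_version(sorted_keys, org_id, doc_type, doc_id):
    """Return the newest timestamp stored for the document, or None if absent."""
    prefix = key_prefix(org_id, doc_type, doc_id)
    # Smallest key that sorts after every key starting with the prefix.
    stop = prefix[:-1] + b"\x01"
    hi = bisect.bisect_left(sorted_keys, stop)
    if hi == 0 or not sorted_keys[hi - 1].startswith(prefix):
        return None
    return struct.unpack(">Q", sorted_keys[hi - 1][-8:])[0]


# A tiny stand-in for the table: keys kept in sorted order, as HBase stores rows.
table = sorted([
    make_key("bigorg", "invoice", "d1", 1000),
    make_key("bigorg", "invoice", "d1", 3000),
    make_key("bigorg", "invoice", "d1", 2000),
    make_key("smallorg", "invoice", "d1", 5000),
])
newest = latest_version(table, "bigorg", "invoice", "d1")
```

In this model the lookup cost depends on the total number of keys only through the binary search, not on how many docs the org owns. With ascending timestamps a real client would need a reverse scan from the stop key; storing a reversed timestamp so the newest version sorts first is a common alternative.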

Thanks,
Pradheep