Posted to user@hbase.apache.org by Usman Waheed <us...@opera.com> on 2011/02/18 09:16:16 UTC

Tall versus wide tables in Hbase

Hi,

I would like to set up an HBase table that gives users the ability to
perform selects only (gets and scans). We don't need users to perform
inserts or updates at the moment, but I will of course have to
load/insert the data into the tables before users can perform selects.

I can have the row key as a composite, "brand:date:users", where
brand is a 4-letter code for each brand, date is DD-MM-YYYY and users is
the metric (how many people bought a certain brand). This will give me a
rather tall table with millions of rows and few columns (2 at
most).
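
To make the tall layout concrete, here is a minimal sketch in plain Python
(no HBase client involved) of how a composite key behaves in a store that
keeps rows sorted by key, as HBase does. It makes two assumptions that are
not in the original message: the date is written as YYYY-MM-DD rather than
DD-MM-YYYY so that keys sort chronologically, and the metric value is left
out of the key. The helper names make_row_key and prefix_scan are
hypothetical.

```python
from bisect import bisect_left

def make_row_key(brand: str, day: str) -> str:
    """Build a hypothetical 'brand:date' key.

    Assumption: day is YYYY-MM-DD, so lexicographic order matches date order.
    """
    return f"{brand}:{day}"

def prefix_scan(sorted_keys, prefix):
    """Return every key that starts with prefix, walking a sorted key list
    the way a range scan walks sorted rows."""
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches

# Sample keys for two made-up brand codes; sorted() stands in for the store's
# key ordering (string order equals byte order for ASCII keys).
rows = sorted(
    make_row_key(brand, day)
    for brand in ("ABCD", "WXYZ")
    for day in ("2011-02-17", "2011-02-18", "2011-01-31")
)

abcd_rows = prefix_scan(rows, "ABCD:")         # all days for one brand
abcd_feb = prefix_scan(rows, "ABCD:2011-02")   # one brand, one month
```

With this ordering, all rows for one brand are adjacent, so selecting a
single brand or a brand plus a date range is one contiguous scan.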

or

Would it be better to have a wider table with the row key as users:date
only and have the brands become a column family? There are many brands to
track on a daily basis. People using my table will need to select a
particular brand, a group of brands or all brands to retrieve and display data.
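
For comparison, the wide layout can be sketched the same way in plain
Python, with one row per date and one column per brand. This sketch assumes
the brands are column qualifiers inside a single family rather than one
family per brand, since HBase column families are declared in the table
schema and are usually kept few in number. The brand codes and counts are
made-up sample data, and select_brands is a hypothetical helper.

```python
def select_brands(wide_row, brands=None):
    """Return the requested brand columns from one wide row.

    brands=None means "all brands"; brands missing from the row are skipped.
    """
    if brands is None:
        return dict(wide_row)
    return {b: wide_row[b] for b in brands if b in wide_row}

# One row per date; each column holds the user count for one brand.
# All values below are illustrative only.
wide_table = {
    "2011-02-17": {"ABCD": 120, "EFGH": 75, "WXYZ": 300},
    "2011-02-18": {"ABCD": 130, "WXYZ": 310},
}

one_brand = select_brands(wide_table["2011-02-18"], ["ABCD"])
brand_group = select_brands(wide_table["2011-02-17"], ["ABCD", "EFGH"])
all_brands = select_brands(wide_table["2011-02-17"])
```

Here a single-row read returns every brand for a date, while a history for
one brand means reading that column across many rows.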

If I recollect correctly, is it recommended to have tall tables if one is
not doing atomic operations? Does a get/scan in HBase perform any row
locking? Having a tall table means more data can be spread out over regions
on different nodes in my cluster. I have a small test cluster of 3 nodes at
the moment.

I intend to have other metrics (quantity, price, etc.) and types (brand,
products, campaigns, etc.), so my table will grow fast and hold lots of
data.

If I use the type (brand, campaign, product) as part of the row key, then
my inserts will be in the millions over time, but if I make the type a
column family, then I will end up with wider entries and fewer rows.

Thanks,
Usman







Re: Tall versus wide tables in Hbase

Posted by Jean-Daniel Cryans <jd...@apache.org>.
This has been discussed recently on the mailing list; see these two
threads, for example:

http://search-hadoop.com/m/amq9c1OaV9z1/wide+tall+hbase+table&subj=Insert+into+tall+table+50+faster+than+wide+table

and

http://search-hadoop.com/m/zbKmE14o0Js/wide+tall+hbase+table&subj=Re+Parent+child+relation+go+vertical+horizontal+or+many+tables+

These should help you get started. Feel free to write back if you
have any follow-up questions.

J-D
