Posted to user@hbase.apache.org by Mark <st...@gmail.com> on 2011/08/16 17:36:47 UTC

Rows vs Columns

We have a typical site that includes users and products. If we wanted to 
log all user product views and when they viewed them, how would one 
model this in HBase? As far as I can tell there are at least 2 ways.

1) Each row key would be user/epoch and there would be only 1 column 
"products:id" with a value of the item id. This would lead to 1 row per 
user per product view.

2) Each row key would be the user while each column would be 
"products:epoch" with a value of the item id. This would be one row per 
user having 1 column per product view.

3) I'm sure there is a third but I can't think of one :)
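
To make the two layouts concrete, here is a minimal sketch that models
them with plain Python dicts. It does not use an HBase client; the user
ids, epochs, item ids and the zero-padded key format are all assumptions
made up for illustration.

```python
# Plain-Python model of the two layouts; no HBase client is involved.
# User ids, epochs, item ids and the zero-padding width are hypothetical.

def tall_row_key(user, epoch):
    # Option 1: row key is user/epoch. Zero-padding the epoch keeps the
    # keys of one user in chronological order when sorted as strings.
    return f"{user}/{epoch:010d}"

def build_tall(views):
    # One row per view, each holding a single "products:id" column.
    return {tall_row_key(u, t): {"products:id": item} for u, t, item in views}

def build_wide(views):
    # Option 2: one row per user, one "products:<epoch>" column per view.
    table = {}
    for u, t, item in views:
        table.setdefault(u, {})[f"products:{t:010d}"] = item
    return table

views = [
    ("alice", 1313508000, "item42"),
    ("alice", 1313508060, "item7"),
    ("bob", 1313508100, "item42"),
]

tall = build_tall(views)
wide = build_wide(views)
print(len(tall), "rows in the tall layout")
print(len(wide), "rows in the wide layout")
```

With these three sample views, option 1 yields one row per view and
option 2 yields one row per user, with alice's row carrying two columns.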

Which is preferable and, more importantly, why? A large number of rows 
or fatter rows?

Thanks



Re: Rows vs Columns

Posted by Jean-Daniel Cryans <jd...@apache.org>.
This is pretty much the same answer I just gave to Stuti in the thread
"Hbase Schema Query": the query pattern is very important. For example,
are you more likely to fetch all of a user's product views, or to look
up single user/view pairs?

It's always easier to filter by row key, since it's the "first level
index", so in general I would tend toward lots of small rows.
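
As a rough illustration of why the row key behaves like a first level
index: HBase keeps rows sorted by row key, so every row sharing a key
prefix is contiguous and can be read as one range. The sketch below
imitates that with a sorted Python list and bisect. It is not HBase
client code, and the prefix_scan helper and sample keys are hypothetical.

```python
import bisect

def prefix_scan(sorted_keys, prefix):
    # Binary-search to the first key >= prefix, then read forward until
    # a key no longer starts with the prefix. Nothing before the start
    # position or after the stop position is touched.
    start = bisect.bisect_left(sorted_keys, prefix)
    result = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        result.append(key)
    return result

# Hypothetical user/epoch row keys, kept in sorted order like a table.
row_keys = sorted([
    "alice/1313508000",
    "alice/1313508060",
    "alicia/1313508200",
    "bob/1313508100",
])

# All views for one user: a single contiguous range of rows.
alice_views = prefix_scan(row_keys, "alice/")
print(alice_views)
```

Including the "/" separator in the prefix keeps "alicia" out of the
result for "alice".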

J-D
