Posted to user@cassandra.apache.org by Marcelo Elias Del Valle <mv...@gmail.com> on 2012/09/27 16:25:31 UTC

Re: best design

2012/9/27 Andre Tavares <an...@gmail.com>

> create column family users_test with comparator=UTF8Type and
> column_metadata=[
> {column_name: generic_key, validation_class: UTF8Type, index_type: KEYS},
> {column_name: user_key, validation_class: UTF8Type, index_type: KEYS}
> ];
>
> where generic_id can be: user_cook_id value, or a user_facebook_id,
> user_cell_phone, user_personal_id values ... the "problem" of this solution
> is that I have 200 million users_id x 4 keys (user_cook_id,
> user_facebook_id, user_cell_phone, user_personal_id) = 800 million rows
>

If I understood correctly, if any key is the same, the user is the same. So
your row key is generic_id, and for each generic_id you want to find the
corresponding user.

Your search, if I understood correctly, is: "find user by generic_id"
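
To make sure I am reading the model right, here is a minimal sketch of that
lookup in Python. The dict is only a hypothetical in-memory stand-in for the
column family, and the names register_user and find_user and the sample
identifiers are made up for illustration; this is not Cassandra client code.

```python
# Hypothetical in-memory stand-in for the lookup column family:
# row key = generic_id, value = the user_key that identifier belongs to.
lookup = {}


def register_user(user_key, identifiers):
    """Write one lookup row per identifier, all pointing at the same user."""
    for generic_id in identifiers:
        lookup[generic_id] = user_key


def find_user(generic_id):
    """'Find user by generic_id': a single read by row key."""
    return lookup.get(generic_id)


# One user with four identifiers produces four lookup rows.
register_user(
    "user-1",
    ["cookie:abc", "facebook:123", "cell:5511999990000", "personal:42"],
)
print(find_user("facebook:123"))  # user-1
print(len(lookup))                # 4
```

If this is the access pattern, every read is a direct get by row key and the
secondary indexes are not involved in it.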

The way you designed it, there are no partitions. I don't know Cassandra
well, and I am not sure what would happen if you had 3 billion users, for
instance. You would have 12 billion rows... Would Cassandra have any
problem finding the user by row key? How would Cassandra index these rows?
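
For reference, the row counts and a rough picture of how a row key could map
to a node can be sketched as below. This assumes a hash-based partitioner
such as RandomPartitioner, which hashes the row key with MD5. The modulo step
is a deliberate simplification of the token ring, and lookup_rows,
node_for_key and the node count are hypothetical names and values chosen only
for illustration.

```python
import hashlib

IDENTIFIERS_PER_USER = 4  # cookie id, facebook id, cell phone, personal id


def lookup_rows(users):
    """Number of lookup rows when each identifier is stored as its own row."""
    return users * IDENTIFIERS_PER_USER


def node_for_key(row_key, node_count):
    """Simplified placement: hash the row key and pick a node by modulo.

    A real cluster assigns token ranges on a ring instead of using modulo;
    this only illustrates that placement depends on the key hash and not
    on the total number of rows.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % node_count


print(lookup_rows(200_000_000))    # 800000000
print(lookup_rows(3_000_000_000))  # 12000000000
print(node_for_key("facebook:123", 6))
```

Under that assumption, the same key always hashes to the same node, so the
open question is more about how much data each node holds than about finding
a single row.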


-- 
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr