Posted to user@cassandra.apache.org by Data Craftsman <da...@gmail.com> on 2012/03/01 22:37:09 UTC

Re: Is this the correct data model thinking?

Yes.  Think in queries.
• Break your normalization habit
• Roughly one CF per query
• Denormalize!
• Use in-column entity caching
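
To make the "one CF per query" point concrete for the two layouts discussed in the quoted thread below, here is a minimal sketch. It uses plain Python dicts as a stand-in for column families and does not touch any Cassandra client; the names users_cf, split_cf, fetch_row, load_user_a and load_user_b, as well as the sample user "alice" and her column values, are all hypothetical. It only illustrates how many row reads each layout needs to assemble one user's data.

```python
# Plain dicts stand in for column families: {row_key: {column_name: value}}.
# All names and sample values are hypothetical; no Cassandra client is used.

# Option A: everything about a user lives in one wide row.
users_cf = {
    "alice": {
        "email": "alice@example.com",
        "state": "CA",
        "pref:fav_color": "blue",
        "change:123456": "changed email",
    }
}

# Option B: the same data split across several row keys.
split_cf = {
    "alice": {"email": "alice@example.com", "state": "CA"},
    "preferences_alice": {"fav_color": "blue"},
    "changes_alice": {"change123456": "changed email"},
}


def fetch_row(cf, row_key):
    """Simulate a single row read; returns a copy of the row's columns."""
    return dict(cf.get(row_key, {}))


def load_user_a(username):
    """Option A: one row read returns all of the user's columns."""
    return fetch_row(users_cf, username), 1


def load_user_b(username):
    """Option B: one row read per split row, merged on the client."""
    keys = [username, "preferences_" + username, "changes_" + username]
    merged = {}
    for key in keys:
        merged.update(fetch_row(split_cf, key))
    return merged, len(keys)


if __name__ == "__main__":
    data_a, reads_a = load_user_a("alice")
    data_b, reads_b = load_user_b("alice")
    print("Option A row reads:", reads_a)
    print("Option B row reads:", reads_b)
```

In this toy model, the query "give me everything about this user" costs one row read with option A and three with option B, which is the sense in which the layout should follow the read queries.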


On Tue, Feb 28, 2012 at 12:12 AM, aaron morton <aa...@thelastpickle.com> wrote:

> A.) store ALL the data associated with the user onto a single users
> row-key. Some user keys may be small, others may get larger over time
> depending upon activity.
>
> I would go with this.
> The important thing is supporting the read queries.
>
> Cheers
> Aaron
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 28/02/2012, at 7:40 PM, Blake Starkenburg wrote:
>
> Using a user/member as an example I am curious which of the data models
> would be the best fit for performance and longevity of data in Cassandra?
>
> Consider the simple staples of user/member details like
> username, email, address, state, preferences, etc. Storing this data
> under a row key is fairly simple: users->username[email], etc.
>
> Now as time goes on, more data, such as snapshot changes like
> users->username['change:123456'] = 'changed email', etc., accumulates as
> columns on the users row-key. Perhaps more preferences or login
> information are added to the row-key. I wouldn't expect the number of
> columns to grow hugely, but I've also learned to plan for the unexpected...
>
> Simplicity would tell me to:
>
> A.) store ALL the data associated with the user onto a single users
> row-key. Some user keys may be small, others may get larger over time
> depending upon activity.
>
> but would B be a better model for performance?
>
> B.) Split out user data into separate row-keys such as
> users->changes_username['change123456'] = 'changed email' AND
> users->preferences_username['fav_color'] = 'blue'. This would add a level of
> complexity and, in some cases, tiny row-keys, along with multiple fetches to
> retrieve all user/member data.
>
> Curious what your opinions are?
>
> Thanks!
>
>
>
-- 
Thanks,

Charlie (Yi) Zhu (一个 木匠)
=======
Data Solution Architect Developer
http://mujiang.blogspot.com