Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2012/01/02 19:08:00 UTC

[Cassandra Wiki] Update of "DataModel" by DaveBrosius

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "DataModel" page has been changed by DaveBrosius:
http://wiki.apache.org/cassandra/DataModel?action=diff&rev1=12&rev2=13

Comment:
change OrderPreservingPartitioner to ByteOrderedPartitioner

Just like normal columns, super columns are sparse: each row may contain as many or as few as it likes; Cassandra imposes no restrictions.

= Range queries =
- Cassandra supports pluggable partitioning schemes with a relatively small amount of code. Out of the box, Cassandra provides the hash-based RandomPartitioner and an OrderPreservingPartitioner. RandomPartitioner gives you pretty good load balancing with no further work required. OrderPreservingPartitioner on the other hand lets you perform range queries on the keys you have stored, but requires choosing node tokens carefully or active load balancing. Systems that only support hash-based partitioning cannot perform range queries efficiently.
+ Cassandra supports pluggable partitioning schemes with a relatively small amount of code. Out of the box, Cassandra provides the hash-based RandomPartitioner and a ByteOrderedPartitioner. RandomPartitioner gives you pretty good load balancing with no further work required. ByteOrderedPartitioner on the other hand lets you perform range queries on the keys you have stored, but requires choosing node tokens carefully or active load balancing. Systems that only support hash-based partitioning cannot perform range queries efficiently.
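
As a rough illustration of the trade-off described above, the following Python sketch compares a hash-style token with a byte-ordered token. It is a toy model, not Cassandra code: the names md5_token, byte_token and range_scan and the sample keys are hypothetical, and the "ring" is just a sorted list in one process. It only shows why ordered tokens make a key range a contiguous slice, while hashed tokens scatter neighbouring keys.

```python
import bisect
import hashlib

# Sample row keys (hypothetical data for illustration only).
keys = ["alice", "bob", "carol", "dave", "erin", "frank"]

def md5_token(key):
    """Hash-style token: an integer derived from the MD5 digest of the key."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

def byte_token(key):
    """Byte-ordered token: the raw key bytes, so token order equals key order."""
    return key.encode("utf-8")

def range_scan(ring, start, end):
    """Return the keys whose token lies in [start, end] as one contiguous slice."""
    tokens = [token for token, _ in ring]
    lo = bisect.bisect_left(tokens, start)
    hi = bisect.bisect_right(tokens, end)
    return [key for _, key in ring[lo:hi]]

# Toy "rings": rows sorted by token, as they would be laid out across nodes.
byte_ring = sorted((byte_token(k), k) for k in keys)
hash_ring = sorted((md5_token(k), k) for k in keys)

# Byte-ordered tokens: the key range maps to a single slice of the ring.
in_range = range_scan(byte_ring, byte_token("bob"), byte_token("dave"))

# Hashed tokens: token order is unrelated to key order, so answering the
# same question means examining every row.
full_scan = sorted(k for _, k in hash_ring if "bob" <= k <= "dave")

print(in_range)
print(full_scan)
```

Both approaches return the same rows; the difference is that the byte-ordered lookup touches only the matching slice, while the hashed layout has to visit every row.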

= Modeling your application =
Unlike with relational systems, where you model entities and relationships and then add indexes to support whatever queries become necessary, with Cassandra you need to decide ahead of time which queries you want to support efficiently, and model accordingly. Since there are no automatically-provided indexes, you will end up much closer to one !ColumnFamily per query than you would to one table per query in a relational design. Don't be afraid to denormalize accordingly; Cassandra is much faster at writes than relational systems, without giving up speed on reads.
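
The query-first approach can be sketched in a few lines of Python. This is a minimal sketch using plain dictionaries, not a Cassandra client: the column family names users and users_by_city, the helper add_user and the sample rows are all hypothetical. Each dictionary stands in for one !ColumnFamily that serves exactly one query, and every write is repeated into each of them.

```python
from collections import defaultdict

# Each "column family" is modeled as: row key -> {column name -> value}.
users = defaultdict(dict)          # serves: look up a user by id
users_by_city = defaultdict(dict)  # serves: list the users in a city

def add_user(user_id, name, city):
    """Write the same fact into every column family that serves a query."""
    users[user_id] = {"name": name, "city": city}
    # Denormalized copy: the row key is the city, the columns are user ids.
    users_by_city[city][user_id] = name

add_user("u1", "Ada", "London")
add_user("u2", "Grace", "New York")
add_user("u3", "Alan", "London")

# Each query is a single row read; no secondary index or join is needed.
london_users = sorted(users_by_city["London"].items())
print(users["u2"])
print(london_users)
```

The cost is an extra write per supported query, which is the trade the paragraph above argues for.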