Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2014/09/29 04:16:05 UTC

[Cassandra Wiki] Update of "Operations" by JonHaddad

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by JonHaddad:
https://wiki.apache.org/cassandra/Operations?action=diff&rev1=112&rev2=113

  When the !RandomPartitioner is used, Tokens are integers from 0 to 2**127.  Keys are converted to this range by MD5 hashing for comparison with Tokens.  (Thus, keys are always convertible to Tokens, but the reverse is not always true.)
  
  === Token selection ===
+ 
  Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread across the Token space. You can still have imbalances if your Tokens do not divide up the range evenly, so you should specify !InitialToken for your first nodes as `i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`.
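
  The formula above can be worked through with a short script. This is only a minimal sketch: the names `RING_SIZE` and `initial_tokens` are made up for illustration and are not part of Cassandra, and integer division is assumed so that the results are whole-number tokens.

```python
# Illustrative sketch only: RING_SIZE and initial_tokens are hypothetical
# names, not part of any Cassandra tool.
RING_SIZE = 2**127  # upper end of the RandomPartitioner token range


def initial_tokens(node_count):
    """Return i * (2**127 / N) for i = 0 .. N-1, using integer division."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    # Print one token per node, e.g. for a 4-node cluster.
    for i, token in enumerate(initial_tokens(4)):
        print(f"node {i}: {token}")
```

  Each printed value would be the token assigned to one node, for example as that node's `initial_token` in `cassandra.yaml`.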
  
  With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently. Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the 3rd, and so on.  Thus, for a 4-node cluster in 2 datacenters, you would have
@@ -45, +46 @@

  With order preserving partitioners, your key distribution will be application-dependent.  You should still take your best guess at specifying initial tokens (guided by sampling actual data, if possible), but you will be more dependent on active load balancing (see below) and/or adding new nodes to hot spots.
  
  Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over.
+ 
+ As a caveat to the above section, it is generally not necessary to manually select individual tokens when using the vnodes feature.
+ 
  
  === Replication ===
  A Cassandra cluster always divides up the key space into ranges delimited by Tokens as described above, but additional replica placement is customizable via IReplicaPlacementStrategy in the configuration file.  The standard strategies are