Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2011/09/06 03:12:56 UTC

[Cassandra Wiki] Update of "Operations" by JonathanEllis

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=96&rev2=97

  Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify !InitialToken for your first nodes as `i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`.
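  
  As a minimal sketch of that formula (Python, chosen just for the 127-bit arithmetic; N = 4 is an arbitrary example size):
  
  {{{
  # Evenly spaced RandomPartitioner tokens: i * (2**127 / N) for i = 0 .. N-1
  N = 4
  for i in range(N):
      print("node %d = %d" % (i + 1, i * (2**127 // N)))
  }}}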
  
  With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently. Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the 3rd, and so on.  Thus, for a 4-node cluster in 2 datacenters, you would have
+ 
  {{{
  DC1
  node 1 = 0
@@ -33, +34 @@

  node 3 = 1
  node 4 = 85070591730234615865843651857942052865
  }}}
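  
  A sketch of the per-DC calculation (Python; assumes 2 DCs of 2 nodes each, as in the listing above, with the DC index serving as the uniqueness offset):
  
  {{{
  # Compute each DC's tokens independently, then add the DC index
  # (0, 1, ...) so tokens stay unique across the whole cluster.
  nodes_per_dc = 2
  n = 0
  for dc in range(2):
      for i in range(nodes_per_dc):
          n += 1
          print("node %d = %d" % (n, i * (2**127 // nodes_per_dc) + dc))
  }}}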
- 
- 
  If you happen to have the same number of nodes in each data center, you can also alternate data centers when assigning tokens:
+ 
  {{{
  [DC1] node 1 = 0
  [DC2] node 2 = 42535295865117307932921825928971026432
  [DC1] node 3 = 85070591730234615865843651857942052864
  [DC2] node 4 = 127605887595351923798765477786913079296
  }}}
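  
  Equivalently, the alternating layout is just evenly spaced tokens for all four nodes with data centers assigned round-robin (a Python sketch):
  
  {{{
  # Evenly spaced tokens for 4 nodes, alternating the DC assignment
  total = 4
  for i in range(total):
      print("[DC%d] node %d = %d" % (i % 2 + 1, i + 1, i * (2**127 // total)))
  }}}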
- 
  With order preserving partitioners, your key distribution will be application-dependent.  You should still take your best guess at specifying initial tokens (guided by sampling actual data, if possible), but you will be more dependent on active load balancing (see below) and/or adding new nodes to hot spots.
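  
  As a rough sketch of the sampling approach (Python; the key sample is hypothetical, and with an order-preserving partitioner the tokens are keys themselves):
  
  {{{
  # Hypothetical sample of real row keys, sorted; pick evenly spaced
  # entries from the sample as the initial tokens.
  sample = sorted(["apple", "banana", "cherry", "kiwi", "mango", "peach"])
  N = 3  # number of nodes
  for i in range(N):
      print("node %d = %s" % (i + 1, sample[i * len(sample) // N]))
  }}}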
  
  Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over.
@@ -127, +126 @@

  
  The status of move and balancing operations can be monitored using `nodetool` with the `netstats` argument. (Cassandra 0.6.* and lower use the `streams` argument.)
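  
  For example (the host name is illustrative):
  
  {{{
  nodetool -h 10.0.0.1 netstats
  }}}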
  
- === Replacing a Dead Node (with same token): ===
+ === Replacing a Dead Node ===
- 
- Since Cassandra 1.0 we can replace an existing node with a new node using the property "cassandra.replace_token=<Token>", This property can be set using -D option while starting cassandra demon process.
+ Since Cassandra 1.0 we can replace a dead node with a new one using the property `cassandra.replace_token=<Token>`. This property can be set with the -D option when starting the Cassandra daemon process.
  
  (Note: This property takes effect only when the node doesn't have any data in it. You might want to empty the data directory if you want to force the node replacement.)
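  
  For example, starting the replacement daemon (the token value here is illustrative; use the dead node's token):
  
  {{{
  bin/cassandra -Dcassandra.replace_token=85070591730234615865843651857942052864
  }}}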
  
+ You must use this property only when replacing a dead node (if you try to replace an existing live node, the bootstrapping node will throw an exception). The token used with this property must already be part of the ring, belonging to the node that died.
- You must use this property when replacing a dead node (If tried to replace an existing live node, the bootstrapping node will throw a Exception).
- The token used via this property must be part of the ring and the node have died due to various reasons.
  
  Once this property is enabled the node starts in a hibernate state, during which all the other nodes will see it as down. The new node will then bootstrap the data from the rest of the nodes in the cluster (the main difference from normal bootstrapping of a new node is that this node will not accept any writes during this phase). Once the bootstrap is complete the node will be marked "UP"; we rely on hinted handoff to make this node consistent (since it does not accept writes from the start of the bootstrap).
  
@@ -238, +235 @@

  NOTE: Starting with version 0.7, json2sstable and sstable2json must be run in such a way that the schema can be loaded from system tables.  This means that cassandra.yaml must be found in the classpath and refer to valid storage directories.
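  
  A sketch of a typical invocation (all paths and the sstable file name are illustrative; the bin/ scripts normally put the conf directory containing cassandra.yaml on the classpath):
  
  {{{
  bin/sstable2json /var/lib/cassandra/data/Keyspace1/Standard1-e-1-Data.db > Standard1.json
  }}}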
  
  == Monitoring ==
- Running `nodetool cfstats` can provide an overview of each Column Family, and important metrics to graph your cluster. Cassandra also exposes internal metrics as JMX data. This is a common standard in the JVM world; OpenNMS, Nagios, and Munin at least offer some level of JMX support. For a non-stupid JMX plugin for Munin check out https://github.com/tcurdt/jmx2munin
+ Running `nodetool cfstats` can provide an overview of each Column Family, and important metrics to graph for your cluster. Cassandra also exposes internal metrics as JMX data; this is a common standard in the JVM world, and OpenNMS, Nagios, and Munin at least offer some level of JMX support. The specifics of the JMX interface are documented at JmxInterface. For a non-stupid JMX plugin for Munin, check out https://github.com/tcurdt/jmx2munin
- The specifics of the JMX Interface are documented at JmxInterface.
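  
  For example, to pull the per-Column Family overview from a local node:
  
  {{{
  nodetool -h localhost cfstats
  }}}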
  
  Some folks prefer to deal with non-JMX clients; there is a JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge/ . Bridging to SNMP is a bit more work, but can be done with https://github.com/tcurdt/jmx2snmp