Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/14 15:30:38 UTC

FAQ reverted to revision 73 on Cassandra Wiki

Dear wiki user,

You have subscribed to a wiki page "Cassandra Wiki" for change notification.

The page FAQ has been reverted to revision 73 by JonathanEllis.
The comment on this change is: un-bork formatting.  thanks, gui mode.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=74&rev2=75

--------------------------------------------------

  = Frequently asked questions =
-  *
-  [[#cant_listen_on_ip_any|Why can't I make Cassandra listen on 0.0.0.0 (all my addresses)?]]
+  * [[#cant_listen_on_ip_any|Why can't I make Cassandra listen on 0.0.0.0 (all my addresses)?]]
- 
-  *
-  [[#ports|What ports does Cassandra use?]]
+  * [[#ports|What ports does Cassandra use?]]
- 
-  *
-  [[#slows_down_after_lotso_inserts|Why does Cassandra slow down after doing a lot of inserts?]]
+  * [[#slows_down_after_lotso_inserts|Why does Cassandra slow down after doing a lot of inserts?]]
- 
-  *
-  [[#existing_data_when_adding_new_nodes|What happens to existing data in my cluster when I add new nodes?]]
+  * [[#existing_data_when_adding_new_nodes|What happens to existing data in my cluster when I add new nodes?]]
- 
-  *
-  [[#modify_cf_config|Can I add/remove/rename Column Families on a working cluster?]]
+  * [[#modify_cf_config|Can I add/remove/rename Column Families on a working cluster?]]
- 
-  *
-  [[#node_clients_connect_to|Does it matter which node a Thrift client connects to?]]
+  * [[#node_clients_connect_to|Does it matter which node a Thrift client connects to?]]
- 
-  *
-  [[#what_kind_of_hardware_should_i_use|What kind of hardware should I run Cassandra on?]]
+  * [[#what_kind_of_hardware_should_i_use|What kind of hardware should I run Cassandra on?]]
- 
-  *
-  [[#architecture|What are SSTables and Memtables?]]
+  * [[#architecture|What are SSTables and Memtables?]]
- 
-  *
-  [[#working_with_timeuuid_in_java|Why is it so hard to work with TimeUUIDType in Java?]]
+  * [[#working_with_timeuuid_in_java|Why is it so hard to work with TimeUUIDType in Java?]]
- 
-  *
-  [[#i_deleted_what_gives|I delete data from Cassandra, but disk usage stays the same. What gives?]]
+  * [[#i_deleted_what_gives|I delete data from Cassandra, but disk usage stays the same. What gives?]]
- 
-  *
-  [[#reads_slower_writes|Why are reads slower than writes?]]
+  * [[#reads_slower_writes|Why are reads slower than writes?]]
- 
-  *
-  [[#cloned|Why does nodeprobe ring only show one entry, even though my nodes logged that they see each other joining the ring?]]
+  * [[#cloned|Why does nodeprobe ring only show one entry, even though my nodes logged that they see each other joining the ring?]]
- 
-  *
-  [[#range_ghosts|Why do deleted keys show up during range scans?]]
+  * [[#range_ghosts|Why do deleted keys show up during range scans?]]
- 
-  *
-  [[#change_replication|Can I change the ReplicationFactor on a live cluster?]]
+  * [[#change_replication|Can I change the ReplicationFactor on a live cluster?]]
- 
-  *
-  [[#large_file_and_blob_storage|Can I store large files or BLOBs in Cassandra?]]
+  * [[#large_file_and_blob_storage|Can I store large files or BLOBs in Cassandra?]]
- 
-  *
-  [[#jmx_localhost_refused|Nodetool says "Connection refused to host: 127.0.1.1", for any remote host. What gives?]]
+  * [[#jmx_localhost_refused|Nodetool says "Connection refused to host: 127.0.1.1", for any remote host. What gives?]]
- 
-  *
-  [[#iter_world|How can I iterate over all the rows in a ColumnFamily?]]
+  * [[#iter_world|How can I iterate over all the rows in a ColumnFamily?]]
- 
-  *
-  [[#no_keyspaces|Why were none of the keyspaces described in storage-conf.xml loaded?]]
+  * [[#no_keyspaces|Why were none of the keyspaces described in storage-conf.xml loaded?]]
- 
-  *
-  [[#gui|Is there a GUI admin tool for Cassandra?]]
+  * [[#gui|Is there a GUI admin tool for Cassandra?]]
- 
-  *
-  [[#a_long_is_exactly_8_bytes|Insert operation throws InvalidRequestException with message "A long is exactly 8 bytes"]]
+  * [[#a_long_is_exactly_8_bytes|Insert operation throws InvalidRequestException with message "A long is exactly 8 bytes"]]
- 
-  *
-  [[#clustername_mismatch|Cassandra says "ClusterName mismatch: oldClusterName != newClusterName" and refuses to start]]
+  * [[#clustername_mismatch|Cassandra says "ClusterName mismatch: oldClusterName != newClusterName" and refuses to start]]
- 
-  *
-  [[#batch_mutate_atomic|Are batch_mutate operations atomic?]]
+  * [[#batch_mutate_atomic|Are batch_mutate operations atomic?]]
- 
  
  <<Anchor(cant_listen_on_ip_any)>>
  
@@ -124, +80 @@

  
   1. You can maintain a list of contact nodes (all or a subset of the nodes in the cluster), and configure your clients to choose among them.
   1. Use round-robin DNS and create a record that points to a set of contact nodes (recommended).
-  1.
-  Use the `get_string_property("token map")` RPC to obtain an up-to-date list of the nodes in the cluster and cycle through them.
+  1. Use the `get_string_property("token map")` RPC to obtain an up-to-date list of the nodes in the cluster and cycle through them.
- 
   1. Deploy a load-balancer, proxy, etc.
  
  <<Anchor(what_kind_of_hardware_should_i_use)>>
@@ -249, +203 @@

  == Can I change the ReplicationFactor on a live cluster? ==
  Yes, but it will require restarting and running repair manually to change the replica count of existing data.
  
-  *
-  Alter the ReplicationFactor for the desired keyspace(s) in the storage configuration on each node in the cluster.
+  * Alter the ReplicationFactor for the desired keyspace(s) in the storage configuration on each node in the cluster.
- 
   * Restart Cassandra on each node in the cluster.
  
  If you're reducing the ReplicationFactor:
@@ -269, +221 @@

  
   * The main limitation on column and super column size is that all the data for a single key and column must fit (on disk) on a single machine (node) in the cluster.  Because keys alone are used to determine the nodes responsible for replicating their data, the amount of data associated with a single key has this upper bound.  This is an inherent limitation of the distribution model.
  
-  *
-  When large columns are created and retrieved, that column's data is loaded into RAM, which can get resource-intensive quickly.  Consider loading 200 rows with columns that each store a 10 MB image file into RAM.  That small result set would consume about 2 GB of RAM.  Clearly, as more and more large columns are loaded, RAM is consumed quickly.  This can be worked around, but it will take some upfront planning and testing to get a workable solution for most applications.  You can find more information regarding this behavior here: [[MemtableThresholds|memtables]], and a possible solution in 0.7 here: [[https://issues.apache.org/jira/browse/CASSANDRA-16|CASSANDRA-16]].
+  * When large columns are created and retrieved, that column's data is loaded into RAM, which can get resource-intensive quickly.  Consider loading 200 rows with columns that each store a 10 MB image file into RAM.  That small result set would consume about 2 GB of RAM.  Clearly, as more and more large columns are loaded, RAM is consumed quickly.  This can be worked around, but it will take some upfront planning and testing to get a workable solution for most applications.  You can find more information regarding this behavior here: [[MemtableThresholds|memtables]], and a possible solution in 0.7 here: [[https://issues.apache.org/jira/browse/CASSANDRA-16|CASSANDRA-16]].
  
- 
-  *
-  Please refer to the notes in the Cassandra limitations section for more information: [[CassandraLimitations|Cassandra Limitations]]
+  * Please refer to the notes in the Cassandra limitations section for more information: [[CassandraLimitations|Cassandra Limitations]]
- 
  
  <<Anchor(jmx_localhost_refused)>>
  
@@ -341, +289 @@

  <<Anchor(batch_mutate_atomic)>>
  
  == Are batch_mutate operations atomic? ==
- As a special case, mutations against a single key are atomic, but more generally no.  [[API#batch_mutate|batch_mutate]] allows grouping operations on many keys into a single call in order to save on the cost of network round-trips.  If `batch_mutate` fails in the middle of its list of mutations, no rollback occurs and the mutations that have already been applied stay applied. The client should typically retry the `batch_mutate` operation.
+ No.  [[API#batch_mutate|batch_mutate]] is a way to group many operations into a single call in order to save on the cost of network round-trips.  If `batch_mutate` fails in the middle of its list of mutations, no rollback occurs and the mutations that have already been applied stay applied. The client should typically retry the mutation.
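
The retry advice in the batch_mutate answer can be illustrated with a small sketch. Nothing here is the real Thrift client: `FlakyStore`, its `batch_mutate` method, the tuple layout of a mutation, and `batch_mutate_with_retry` are all hypothetical stand-ins. The stand-in assumes that re-applying a mutation with the same timestamp leaves the stored value unchanged, which is what makes resending the whole batch safe in this model.

```python
import time


class FlakyStore:
    """Hypothetical in-memory stand-in for a Cassandra client (not a real API)."""

    def __init__(self, fail_first_n_calls=1):
        self.rows = {}  # (key, column) -> (value, timestamp)
        self.calls = 0
        self._failures_left = fail_first_n_calls

    def batch_mutate(self, mutations):
        """Apply (key, column, value, timestamp) tuples one by one, no rollback."""
        self.calls += 1
        for i, (key, column, value, ts) in enumerate(mutations):
            # Simulate a failure halfway through the batch: earlier
            # mutations stay applied, later ones are never attempted.
            if self._failures_left > 0 and i == len(mutations) // 2:
                self._failures_left -= 1
                raise IOError("simulated failure mid-batch")
            current = self.rows.get((key, column))
            # Assumption: the write with the newest timestamp wins, so
            # replaying an already-applied mutation changes nothing.
            if current is None or ts >= current[1]:
                self.rows[(key, column)] = (value, ts)


def batch_mutate_with_retry(store, mutations, max_attempts=3, delay_seconds=0.0):
    """Resend the whole batch until it succeeds; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            store.batch_mutate(mutations)
            return attempt
        except IOError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            time.sleep(delay_seconds)


store = FlakyStore(fail_first_n_calls=1)
batch = [
    ("k1", "name", "a", 1),
    ("k2", "name", "b", 1),
    ("k3", "name", "c", 1),
    ("k4", "name", "d", 1),
]
attempts = batch_mutate_with_retry(store, batch)
```

In this run the first call applies `k1` and `k2` and then fails. The second call replays all four mutations and completes, so the caller never has to work out which mutations had already been applied.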