Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2010/06/14 15:17:28 UTC

[Cassandra Wiki] Update of "FAQ" by JonathanEllis

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "FAQ" page has been changed by JonathanEllis.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=72&rev2=73

--------------------------------------------------

  
   * The main limitation on column and super column size is that all the data for a single key and column must fit (on disk) on a single machine (node) in the cluster.  Because keys alone are used to determine the nodes responsible for replicating their data, the amount of data associated with a single key has this upper bound.  This is an inherent limitation of the distribution model.
  
-  * When large columns are created and retrieved, that columns data is loaded into RAM which  can get resource intensive quickly.  Consider, loading  200 rows with columns  that store 10Mb image files each into RAM.  That small result set would consume about 2Gb of RAM.  Clearly as more and more large columns are loaded,  RAM would start to get consumed quickly.  This can be worked around, but will take some upfront planning and testing to get a workable solution for most applications.  You can find more information regarding this behavior here: [[MemtableThresholds|memtables]], and a possible solution in 0.7 here: [[https://issues.apache.org/jira/browse/CASSANDRA-16|CASSANDRA-16 ]].
+  * When large columns are created and retrieved, that column's data is loaded into RAM, which can get resource-intensive quickly.  Consider loading 200 rows whose columns each store a 10MB image file into RAM: that small result set would consume about 2GB of RAM.  As more and more large columns are loaded, RAM is consumed quickly.  This can be worked around, but it takes some upfront planning and testing to reach a workable solution for most applications.  You can find more information regarding this behavior here: [[MemtableThresholds|memtables]], and a possible solution in 0.7 here: [[https://issues.apache.org/jira/browse/CASSANDRA-16|CASSANDRA-16]].
  
   * Please refer to the notes in the Cassandra limitations section for more information: [[CassandraLimitations|Cassandra Limitations]]
  
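The 2GB figure above can be checked with a quick back-of-the-envelope calculation (a sketch, not part of the wiki page):

```python
# 200 rows, each holding one column with a 10 MB image, loaded fully into RAM.
rows = 200
image_bytes = 10 * 1024 * 1024   # 10 MB per column value
total = rows * image_bytes
print(total)                     # 2097152000 bytes, i.e. roughly 2 GB
```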
@@ -258, +258 @@

  <<Anchor(a_long_is_exactly_8_bytes)>>
  
  == Insert operation throws InvalidRequestException with message "A long is exactly 8 bytes" ==
- 
  You are probably using the !LongType column sorter in your column family. !LongType assumes that the numbers stored in column names are exactly 64 bits (8 bytes) long and in big-endian format. Example PHP code showing how to pack an integer for storage in Cassandra and how to unpack it again:
  
  {{{
- 	/**
+         /**
- 	 * Takes php integer and packs it to 64bit (8 bytes) long big endian binary representation.
+          * Takes a PHP integer and packs it into a 64-bit (8-byte) big-endian binary representation.
- 	 * @param  $x integer
+          * @param  $x integer
- 	 * @return string eight bytes long binary repersentation of the integer in big endian order.
+          * @return string eight-byte binary representation of the integer in big-endian order.
- 	 */
+          */
- 	public static function pack_longtype($x) {
+         public static function pack_longtype($x) {
- 		return pack('C8', ($x >> 56) & 0xff, ($x >> 48) & 0xff,	($x >> 40) & 0xff, ($x >> 32) & 0xff,
+                 return pack('C8', ($x >> 56) & 0xff, ($x >> 48) & 0xff, ($x >> 40) & 0xff, ($x >> 32) & 0xff,
- 				($x >> 24) & 0xff, ($x >> 16) & 0xff, ($x >> 8) & 0xff,	$x & 0xff);
+                                 ($x >> 24) & 0xff, ($x >> 16) & 0xff, ($x >> 8) & 0xff, $x & 0xff);
- 	}
+         }
  
- 	/**
+         /**
- 	 * Takes eight bytes long big endian binary representation of an integer and unpacks it to a php integer.
+          * Takes an eight-byte big-endian binary representation of an integer and unpacks it to a PHP integer.
- 	 * @param  $x
+          * @param  $x
- 	 * @return php integer
+          * @return php integer
- 	 */
+          */
- 	public static function unpack_longtype($x) {
+         public static function unpack_longtype($x) {
- 		$a = unpack('C8', $x);
+                 $a = unpack('C8', $x);
- 		return ($a[1] << 56) + ($a[2] << 48) + ($a[3] << 40) + ($a[4] << 32) + ($a[5] << 24) + ($a[6] << 16) + ($a[7] << 8) + $a[8];
+                 return ($a[1] << 56) + ($a[2] << 48) + ($a[3] << 40) + ($a[4] << 32) + ($a[5] << 24) + ($a[6] << 16) + ($a[7] << 8) + $a[8];
- 	}
+         }
  }}}
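For comparison, the same packing can be sketched in Python with the standard `struct` module, where the format `'>q'` means a big-endian signed 64-bit integer (an equivalent illustration of the PHP helpers above, not part of the wiki page):

```python
import struct

def pack_longtype(x):
    # Big-endian signed 64-bit: exactly the 8 bytes LongType expects.
    return struct.pack('>q', x)

def unpack_longtype(b):
    # Inverse of pack_longtype: 8 big-endian bytes back to an integer.
    return struct.unpack('>q', b)[0]

packed = pack_longtype(1234567890)
print(len(packed))               # 8
print(unpack_longtype(packed))   # 1234567890
```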
- 
  <<Anchor(clustername_mismatch)>>
  
  == Cassandra says "ClusterName mismatch: oldClusterName != newClusterName" and refuses to start ==
- 
  To prevent operator errors, Cassandra stores the name of the cluster in its system table.  If you need to rename a cluster, it is safe to remove system/LocationInfo* after forcing a compaction on all ColumnFamilies (with the old cluster name), provided you've specified the node's token in the config file, or if you don't care about preserving the node's token (for instance, in single-node clusters).
  
  <<Anchor(batch_mutate_atomic)>>
+ 
  == Are batch_mutate operations atomic? ==
- 
- No.  [[API#batch_mutate|batch_mutate]] is a way to group many operations into a single call in order to save on the cost of network round-trips.  If `batch_mutate` fails in the middle of its list of mutations, no rollback occurs and the mutations that have already been applied stay applied. 
+ No.  [[API#batch_mutate|batch_mutate]] is a way to group many operations into a single call in order to save on the cost of network round-trips.  If `batch_mutate` fails in the middle of its list of mutations, no rollback occurs and the mutations that have already been applied stay applied. The client should typically retry the mutation.
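Because Cassandra writes carry client-supplied timestamps, re-applying an already-applied mutation is harmless, so retrying the whole batch after a partial failure is a reasonable client strategy. A minimal retry-wrapper sketch (the helper names are illustrative, not part of the Cassandra API):

```python
import time

def with_retries(operation, attempts=3, delay=0.1):
    # Re-run the whole batch on failure; safe because already-applied
    # mutations are simply overwritten with identical values.
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

# Example: a stand-in operation that fails twice before succeeding.
calls = {'n': 0}
def flaky_mutate():
    calls['n'] += 1
    if calls['n'] < 3:
        raise IOError('partial failure')
    return 'applied'

print(with_retries(flaky_mutate))  # applied
```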