Posted to commits@cassandra.apache.org by Apache Wiki <wi...@apache.org> on 2014/05/19 14:31:16 UTC

[Cassandra Wiki] Trivial Update of "HadoopSupport" by jeremyhanna

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "HadoopSupport" page has been changed by jeremyhanna:
https://wiki.apache.org/cassandra/HadoopSupport?action=diff&rev1=58&rev2=59

  If you are running into timeout exceptions, you might need to tweak one or both of these settings:
  
   * Each input split is read from Cassandra in sequential batches of rows.  The batch size is the '''cassandra.range.batch.size''' property, which defaults to 4096.  If you are experiencing timeouts, first try reducing the batch size so that each request can more easily complete within the timeout.  Set it either in your Hadoop configuration or with `org.apache.cassandra.hadoop.ConfigHelper.setRangeBatchSize`.
-  * Starting in Cassandra 1.2, there is a range-request-specific timeout called '''range_request_timeout_in_ms''' in cassandra.yaml.  Hadoop will request data in sequential batches and the request has to complete within this timeout.  Prior to Cassandra 1.2, you can set the general '''rpc_timeout_in_ms''' higher, which affects timeouts for reads, writes, and truncate operations in addition to range requests.
+  * Starting in Cassandra 1.2, there is a range-request-specific timeout called '''range_request_timeout_in_ms''' in cassandra.yaml.  Hadoop requests data in sequential batches and each request has to complete within this timeout.  Prior to Cassandra 1.2, you can set the general '''rpc_timeout_in_ms''' higher, which affects timeouts for reads, writes, and truncate operations in addition to range requests.
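
As a rough illustration of the first setting, a Hadoop configuration entry that lowers the batch size might look like the fragment below. The property name is the one given above; the value 1024 is only an assumed starting point for experimentation, not a recommended setting.

```xml
<!-- Sketch only: lowers the number of rows requested per batch from the default of 4096. -->
<!-- The value 1024 is an assumption for illustration; tune it for your own cluster. -->
<property>
  <name>cassandra.range.batch.size</name>
  <value>1024</value>
</property>
```

A smaller batch means more round trips per input split, so it trades some throughput for a better chance that each request finishes within the range request timeout.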
  
  If you still see timeout exceptions with resultant failed jobs and/or blacklisted tasktrackers, there are settings that can give Cassandra more latitude before failing the jobs.  An example of usage (in either the job configuration or tasktracker mapred-site.xml):