Posted to user@cassandra.apache.org by Patrik Modesto <pa...@gmail.com> on 2012/03/06 09:32:59 UTC

Re: newer Cassandra + Hadoop = TimedOutException()

Hi,

I was recently trying the Hadoop job + cassandra-all 0.8.10 again, and the
timeouts I get are not because Cassandra can't handle the
requests. I've noticed there are several tasks that show progress of
several thousand percent. It seems they are looping over their range of
keys. I've run the job with debug enabled and the ranges look OK, see
http://pastebin.com/stVsFzLM

Another difference between cassandra-all 0.8.7 and 0.8.10 is the
number of mappers the job creates:
0.8.7: 4680
0.8.10: 595

Task       Complete
task_201202281457_2027_m_000041	9076.81%
task_201202281457_2027_m_000073	9639.04%
task_201202281457_2027_m_000105	10538.60%
task_201202281457_2027_m_000108	9364.17%

None of this happens with cassandra-all 0.8.7.

Regards,
P.



On Tue, Feb 28, 2012 at 12:29, Patrik Modesto <pa...@gmail.com> wrote:
> I'll alter these settings and will let you know.
>
> Regards,
> P.
>
> On Tue, Feb 28, 2012 at 09:23, aaron morton <aa...@thelastpickle.com> wrote:
>> Have you tried lowering the batch size and increasing the timeout? Even
>> just to get it to work.
>>
>> If you get a TimedOutException it means CL number of servers did not respond
>> in time.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>
>> Hi aaron,
>>
>> this is our current settings:
>>
>>      <property>
>>          <name>cassandra.range.batch.size</name>
>>          <value>1024</value>
>>      </property>
>>
>>      <property>
>>          <name>cassandra.input.split.size</name>
>>          <value>16384</value>
>>      </property>
>>
>> rpc_timeout_in_ms: 30000
>>
>> Regards,
>> P.
>>
>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aa...@thelastpickle.com> wrote:
>>
>> What settings do you have for cassandra.range.batch.size
>>
>> and rpc_timeout_in_ms  ? Have you tried reducing the first and/or increasing
>>
>> the second ?
>>
>>
>> Cheers
>>
>>
>> -----------------
>>
>> Aaron Morton
>>
>> Freelance Developer
>>
>> @aaronmorton
>>
>> http://www.thelastpickle.com
>>
>>
>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>
>>
>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo <ed...@gmail.com>
>>
>> wrote:
>>
>>
>> Did you see the notes here?
>>
>>
>>
>> I'm not sure what you mean by the notes?
>>
>>
>> I'm using the mapred.* settings suggested there:
>>
>>
>>     <property>
>>
>>         <name>mapred.max.tracker.failures</name>
>>
>>         <value>20</value>
>>
>>     </property>
>>
>>     <property>
>>
>>         <name>mapred.map.max.attempts</name>
>>
>>         <value>20</value>
>>
>>     </property>
>>
>>     <property>
>>
>>         <name>mapred.reduce.max.attempts</name>
>>
>>         <value>20</value>
>>
>>     </property>
>>
>>
>> But I still see the timeouts that I didn't see with cassandra-all 0.8.7.
>>
>>
>> P.
>>
>>
>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>>
>>
>>
>>

Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Patrik Modesto <pa...@gmail.com>.
I've tried cassandra-all 0.8.10 with the rpc_endpoints == "0.0.0.0"
bug fixed, but the result is the same: there are still tasks over
1000%. The only change is that there are real host names instead of
0.0.0.0 in the debug output.

Reconfiguring the whole cluster is not possible, so I can't test with
"rpc_address" commented out.

Regards,
P.



Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Florent Lefillâtre <fl...@gmail.com>.
I remember a bug in the ColumnFamilyInputFormat class in 0.8.10.
It compared rpc_endpoints == "0.0.0.0" instead of
rpc_endpoint.equals("0.0.0.0"); maybe it can help you.
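For readers who don't write Java daily: `==` on two `String`s compares object references, not contents, so a check written as `rpc_endpoints == "0.0.0.0"` is effectively always false for strings constructed at runtime (such as values arriving over Thrift). A minimal illustration of the bug class; the variable name is illustrative, not the actual Cassandra code:

```java
public class StringCompare {
    public static void main(String[] args) {
        // Simulate an rpc endpoint as it would arrive from a Thrift call:
        // a String constructed at runtime, not an interned literal.
        String rpcEndpoint = new String("0.0.0.0");

        // Reference comparison: the objects differ, so this is false.
        System.out.println(rpcEndpoint == "0.0.0.0");       // false

        // Content comparison: the characters match, so this is true.
        System.out.println(rpcEndpoint.equals("0.0.0.0"));  // true
    }
}
```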


Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Florent Lefillâtre <fl...@gmail.com>.
Sorry, I had misunderstood.
For me, the problem comes from the change in the ColumnFamilyInputFormat
class between 0.8.7 and 0.8.10, where the splits are created (0.8.7 uses
endpoints and 0.8.10 uses rpc_endpoints).
With your config the split computation fails, so Hadoop doesn't run a map
task on approximately 16384 rows (your cassandra.input.split.size) but on
all the rows of a node (certainly far more than 16384). However, Hadoop
estimates the task progress against 16384 inputs, which is why you see
something like 9076.81%.

If you can't change the rpc_address configuration, I don't know how you
can solve your problem :/, sorry.
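Florent's arithmetic can be sketched as follows: a Hadoop record reader typically reports progress as rows read divided by the rows the split was estimated to contain, so a reader that iterates a whole node's rows against a 16384-row estimate reports thousands of percent. This is a hedged sketch of the mechanism, not the actual ColumnFamilyRecordReader code, and the row count is chosen only to reproduce the figure above:

```java
public class ProgressSketch {
    // Progress as a Hadoop-style fraction: rows actually read over the
    // number of rows the split was estimated to contain.
    static float progress(long rowsRead, long estimatedRows) {
        return (float) rowsRead / estimatedRows;
    }

    public static void main(String[] args) {
        long estimated = 16_384;       // cassandra.input.split.size
        long rowsRead  = 1_487_143;    // a whole node's rows, not one split

        // Displayed as a percentage this is roughly 9076%, matching the
        // task list in the original report.
        System.out.printf("%.2f%%%n", progress(rowsRead, estimated) * 100);
    }
}
```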


Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Patrik Modesto <pa...@gmail.com>.
Hi Florent,

I don't change the server version; it is Cassandra 0.8.10. I just
change the version of cassandra-all in the pom.xml of the mapreduce
job.

I have 'rpc_address: 0.0.0.0' in cassandra.yaml because I want
Cassandra to bind RPC to all interfaces.

Regards,
P.


Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Florent Lefillâtre <fl...@gmail.com>.
Hi, I had the same problem on Hadoop 0.20.2 and Cassandra 1.0.5.
In my case the split of the token range failed.
I commented out the line 'rpc_address: 0.0.0.0' in cassandra.yaml.
Maybe check whether any configuration changed between 0.8.7 and 0.8.10.
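For reference, the workaround described here amounts to the following change in cassandra.yaml. This is a sketch of the edit, under the assumption (worth verifying on a test node, as behaviour can vary by version) that an unset rpc_address makes the node advertise an address derived from its hostname instead of the unusable wildcard 0.0.0.0:

```yaml
# Before: RPC bound to all interfaces; 0.0.0.0 is then what ends up in
# the rpc_endpoints that describe_ring reports to Hadoop clients.
# rpc_address: 0.0.0.0

# After: leave rpc_address commented out so the node falls back to an
# address based on its hostname.
```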



Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Patrik Modesto <pa...@gmail.com>.
I changed rpc_endpoint to endpoints and now the splits are computed
correctly, so it's a bug in the Cassandra-to-Hadoop interface. I suspect
it has something to do with the wide rows with tens of thousands of
columns that we have, because the unpatched getSubSplits() works with
the small test data we have for development.
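The change described (iterate over `range.endpoints` instead of `range.rpc_endpoints` when the advertised RPC address is unusable) can be sketched roughly like this. The types are simplified stand-ins for the Thrift `TokenRange` structure, and the method is an assumption about the shape of the fix, not the real ColumnFamilyInputFormat code:

```java
import java.util.ArrayList;
import java.util.List;

public class SubSplitHosts {
    // Simplified stand-in for the Thrift TokenRange structure.
    static class TokenRange {
        List<String> endpoints = new ArrayList<>();     // gossip addresses
        List<String> rpcEndpoints = new ArrayList<>();  // advertised RPC addresses
    }

    // Pick the hosts to contact for describe_splits: prefer the rpc
    // endpoint, but fall back to the gossip endpoint when the node
    // advertises the unusable wildcard address 0.0.0.0.
    static List<String> hostsToQuery(TokenRange range) {
        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < range.endpoints.size(); i++) {
            String rpc = range.rpcEndpoints.get(i);
            if (rpc == null || rpc.equals("0.0.0.0")) {
                hosts.add(range.endpoints.get(i));
            } else {
                hosts.add(rpc);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        TokenRange r = new TokenRange();
        r.endpoints.add("cass-node-1.example.com");  // hypothetical host name
        r.rpcEndpoints.add("0.0.0.0");
        System.out.println(hostsToQuery(r));  // [cass-node-1.example.com]
    }
}
```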

Regards,
P.


On Wed, Mar 7, 2012 at 11:02, Florent Lefillâtre <fl...@gmail.com> wrote:
> If you want to try a test: in the CFIF.getSubSplits(String, String,
> TokenRange, Configuration) method, replace the loop over
> 'range.rpc_endpoints' with the same loop over 'range.endpoints'.
> This method splits the token range of each node with the describe_splits
> method, but I think there is something wrong when the Cassandra
> connection is created on host '0.0.0.0'.
>
>
>
>
> On 7 March 2012 09:07, Patrik Modesto <pa...@gmail.com> wrote:
>
>> You're right, I wasn't looking in the right logs. Unfortunately I'd
>> need to restart the Hadoop TaskTracker with log level DEBUG and that is
>> not possible at the moment. A pity it happens only in production with
>> terabytes of data, not in the test...
>>
>> Regards,
>> P.
>>
>> On Tue, Mar 6, 2012 at 14:31, Florent Lefillâtre <fl...@gmail.com>
>> wrote:
>> > CFRR.getProgress() is called by the child mapper tasks on each
>> > TaskTracker node, so the log must appear in
>> > ${hadoop_log_dir}/attempt_201202081707_0001_m_000000_0/syslog (or
>> > something like this) on the TaskTrackers, not in the client job log.
>> > Are you sure you are looking at the right log file? I ask because in
>> > your first mail you linked the client job log.
>> > And maybe you can log the size of each split in CFIF.
>> >
>> >
>> >
>> >
>> > On 6 March 2012 13:09, Patrik Modesto <pa...@gmail.com> wrote:
>> >
>> >> I've added a debug message in CFRR.getProgress() and I can't find
>> >> it in the debug output. It seems getProgress() has not been called
>> >> at all.
>> >>
>> >> Regards,
>> >> P.
>> >>
>> >> On Tue, Mar 6, 2012 at 09:49, Jeremy Hanna <je...@gmail.com>
>> >> wrote:
>> >> > you may be running into this -
>> >> > https://issues.apache.org/jira/browse/CASSANDRA-3942 - I'm not sure
>> >> > if it
>> >> > really affects the execution of the job itself though.
>> >> >>>>
>> >> >>>> wrote:
>> >> >>>>
>> >> >>>>
>> >> >>>> Did you see the notes here?
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>> I'm not sure what you mean by the notes.
>> >> >>>>
>> >> >>>>
>> >> >>>> I'm using the mapred.* settings suggested there:
>> >> >>>>
>> >> >>>>
>> >> >>>>     <property>
>> >> >>>>
>> >> >>>>         <name>mapred.max.tracker.failures</name>
>> >> >>>>
>> >> >>>>         <value>20</value>
>> >> >>>>
>> >> >>>>     </property>
>> >> >>>>
>> >> >>>>     <property>
>> >> >>>>
>> >> >>>>         <name>mapred.map.max.attempts</name>
>> >> >>>>
>> >> >>>>         <value>20</value>
>> >> >>>>
>> >> >>>>     </property>
>> >> >>>>
>> >> >>>>     <property>
>> >> >>>>
>> >> >>>>         <name>mapred.reduce.max.attempts</name>
>> >> >>>>
>> >> >>>>         <value>20</value>
>> >> >>>>
>> >> >>>>     </property>
>> >> >>>>
>> >> >>>>
>> >> >>>> But I still see the timeouts that I didn't have with cassandra-all
>> >> >>>> 0.8.7.
>> >> >>>>
>> >> >>>>
>> >> >>>> P.
>> >> >>>>
>> >> >>>>
>> >> >>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >>>>
>> >> >
>> >
>> >
>
>

Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Florent Lefillâtre <fl...@gmail.com>.
If you want to try a test: in the CFIF.getSubSplits(String, String,
TokenRange, Configuration) method, replace the loop over
'range.rpc_endpoints' with the same loop over 'range.endpoints'.
This method splits each node's token range with the describe_splits
method, but I think something goes wrong when the Cassandra connection is
created on host '0.0.0.0'.
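A simplified sketch of that idea (this is not the actual CFIF code; the class, helper method, and sample addresses below are hypothetical): prefer each rpc_endpoint, but fall back to the gossip endpoint at the same index when the rpc_endpoint is the unconnectable '0.0.0.0' bind-all address.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EndpointPicker {
    // Hypothetical helper: pick one connectable address per replica.
    // TokenRange.rpc_endpoints can contain "0.0.0.0" when rpc_address is
    // bound to all interfaces; in that case fall back to the gossip
    // endpoint at the same index, which is what looping over
    // range.endpoints instead of range.rpc_endpoints amounts to.
    static List<String> usableEndpoints(List<String> rpcEndpoints,
                                        List<String> endpoints) {
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < endpoints.size(); i++) {
            String rpc = i < rpcEndpoints.size() ? rpcEndpoints.get(i) : null;
            if (rpc == null || "0.0.0.0".equals(rpc)) {
                result.add(endpoints.get(i)); // gossip address is routable
            } else {
                result.add(rpc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rpc = Arrays.asList("0.0.0.0", "10.0.0.2");
        List<String> gossip = Arrays.asList("10.0.0.1", "10.0.0.2");
        System.out.println(usableEndpoints(rpc, gossip));
    }
}
```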




Le 7 mars 2012 09:07, Patrik Modesto <pa...@gmail.com> a écrit :

> You're right, I wasn't looking in the right logs. Unfortunately I'd
> need to restart the Hadoop tasktracker with log level DEBUG and that is
> not possible at the moment. A pity it happens only in production with
> terabytes of data, not in testing...
>
> Regards,
> P.
>
> On Tue, Mar 6, 2012 at 14:31, Florent Lefillâtre <fl...@gmail.com>
> wrote:
> > CFRR.getProgress() is called by the child mapper tasks on each
> > TaskTracker node, so the log message should appear in
> > ${hadoop_log_dir}/attempt_201202081707_0001_m_000000_0/syslog (or
> > something like that) on the TaskTrackers, not in the client job logs.
> > Are you sure you're looking at the right log file? I ask because in
> > your first mail you linked the client job log.
> > And maybe you could log the size of each split in CFIF.
> >
> >
> >
> >
> > Le 6 mars 2012 13:09, Patrik Modesto <pa...@gmail.com> a écrit
> :
> >
> >> I've added a debug message in CFRR.getProgress() and I can't find
> >> it in the debug output. It seems getProgress() has not been
> >> called at all.
> >>
> >> Regards,
> >> P.
> >>
> >> On Tue, Mar 6, 2012 at 09:49, Jeremy Hanna <je...@gmail.com>
> >> wrote:
> >> > you may be running into this -
> >> > https://issues.apache.org/jira/browse/CASSANDRA-3942 - I'm not sure
> if it
> >> > really affects the execution of the job itself though.
> >> >
> >> > On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I was recently trying a Hadoop job + cassandra-all 0.8.10 again and the
> >> >> timeouts I get are not because Cassandra can't handle the
> >> >> requests. I've noticed there are several tasks that show progress of
> >> >> several thousand percent. It seems they are looping over their range
> >> >> of keys. I've run the job with debug enabled and the ranges look ok, see
> >> >> http://pastebin.com/stVsFzLM
> >> >>
> >> >> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
> >> >> number of mappers the job creates:
> >> >> 0.8.7: 4680
> >> >> 0.8.10: 595
> >> >>
> >> >> Task       Complete
> >> >> task_201202281457_2027_m_000041       9076.81%
> >> >> task_201202281457_2027_m_000073       9639.04%
> >> >> task_201202281457_2027_m_000105       10538.60%
> >> >> task_201202281457_2027_m_000108       9364.17%
> >> >>
> >> >> None of this happens with cassandra-all 0.8.7.
> >> >>
> >> >> Regards,
> >> >> P.
> >> >>
> >> >>
> >> >>
> >> >> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto
> >> >> <pa...@gmail.com> wrote:
> >> >>> I'll alter these settings and will let you know.
> >> >>>
> >> >>> Regards,
> >> >>> P.
> >> >>>
> >> >>> On Tue, Feb 28, 2012 at 09:23, aaron morton <
> aaron@thelastpickle.com>
> >> >>> wrote:
> >> >>>> Have you tried lowering the  batch size and increasing the time
> out?
> >> >>>> Even
> >> >>>> just to get it to work.
> >> >>>>
> >> >>>> If you get a TimedOutException it means CL number of servers did
> not
> >> >>>> respond
> >> >>>> in time.
> >> >>>>
> >> >>>> Cheers
> >> >>>>
> >> >>>> -----------------
> >> >>>> Aaron Morton
> >> >>>> Freelance Developer
> >> >>>> @aaronmorton
> >> >>>> http://www.thelastpickle.com
> >> >>>>
> >> >>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
> >> >>>>
> >> >>>> Hi aaron,
> >> >>>>
> >> >>>> this is our current settings:
> >> >>>>
> >> >>>>      <property>
> >> >>>>          <name>cassandra.range.batch.size</name>
> >> >>>>          <value>1024</value>
> >> >>>>      </property>
> >> >>>>
> >> >>>>      <property>
> >> >>>>          <name>cassandra.input.split.size</name>
> >> >>>>          <value>16384</value>
> >> >>>>      </property>
> >> >>>>
> >> >>>> rpc_timeout_in_ms: 30000
> >> >>>>
> >> >>>> Regards,
> >> >>>> P.
> >> >>>>
> >> >>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <
> aaron@thelastpickle.com>
> >> >>>> wrote:
> >> >>>>
> >> >>>> What settings do you have for cassandra.range.batch.size
> >> >>>>
> >> >>>> and rpc_timeout_in_ms  ? Have you tried reducing the first and/or
> >> >>>> increasing
> >> >>>>
> >> >>>> the second ?
> >> >>>>
> >> >>>>
> >> >>>> Cheers
> >> >>>>
> >> >>>>
> >> >>>> -----------------
> >> >>>>
> >> >>>> Aaron Morton
> >> >>>>
> >> >>>> Freelance Developer
> >> >>>>
> >> >>>> @aaronmorton
> >> >>>>
> >> >>>> http://www.thelastpickle.com
> >> >>>>
> >> >>>>
> >> >>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
> >> >>>>
> >> >>>>
> >> >>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo
> >> >>>> <ed...@gmail.com>
> >> >>>>
> >> >>>> wrote:
> >> >>>>
> >> >>>>
> >> >>>> Did you see the notes here?
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> I'm not sure what you mean by the notes.
> >> >>>>
> >> >>>>
> >> >>>> I'm using the mapred.* settings suggested there:
> >> >>>>
> >> >>>>
> >> >>>>     <property>
> >> >>>>
> >> >>>>         <name>mapred.max.tracker.failures</name>
> >> >>>>
> >> >>>>         <value>20</value>
> >> >>>>
> >> >>>>     </property>
> >> >>>>
> >> >>>>     <property>
> >> >>>>
> >> >>>>         <name>mapred.map.max.attempts</name>
> >> >>>>
> >> >>>>         <value>20</value>
> >> >>>>
> >> >>>>     </property>
> >> >>>>
> >> >>>>     <property>
> >> >>>>
> >> >>>>         <name>mapred.reduce.max.attempts</name>
> >> >>>>
> >> >>>>         <value>20</value>
> >> >>>>
> >> >>>>     </property>
> >> >>>>
> >> >>>>
> >> >>>> But I still see the timeouts that I didn't have with cassandra-all
> >> >>>> 0.8.7.
> >> >>>>
> >> >>>>
> >> >>>> P.
> >> >>>>
> >> >>>>
> >> >>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >
> >
> >
>

Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Patrik Modesto <pa...@gmail.com>.
You're right, I wasn't looking in the right logs. Unfortunately I'd
need to restart the Hadoop tasktracker with log level DEBUG and that is
not possible at the moment. A pity it happens only in production with
terabytes of data, not in testing...

Regards,
P.

On Tue, Mar 6, 2012 at 14:31, Florent Lefillâtre <fl...@gmail.com> wrote:
> CFRR.getProgress() is called by the child mapper tasks on each TaskTracker
> node, so the log message should appear in
> ${hadoop_log_dir}/attempt_201202081707_0001_m_000000_0/syslog (or something
> like that) on the TaskTrackers, not in the client job logs.
> Are you sure you're looking at the right log file? I ask because in your
> first mail you linked the client job log.
> And maybe you could log the size of each split in CFIF.
>
>
>
>
> Le 6 mars 2012 13:09, Patrik Modesto <pa...@gmail.com> a écrit :
>
>> I've added a debug message in CFRR.getProgress() and I can't find
>> it in the debug output. It seems getProgress() has not been
>> called at all.
>>
>> Regards,
>> P.
>>
>> On Tue, Mar 6, 2012 at 09:49, Jeremy Hanna <je...@gmail.com>
>> wrote:
>> > you may be running into this -
>> > https://issues.apache.org/jira/browse/CASSANDRA-3942 - I'm not sure if it
>> > really affects the execution of the job itself though.
>> >
>> > On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:
>> >
>> >> Hi,
>> >>
>> >> I was recently trying a Hadoop job + cassandra-all 0.8.10 again and the
>> >> timeouts I get are not because Cassandra can't handle the
>> >> requests. I've noticed there are several tasks that show progress of
>> >> several thousand percent. It seems they are looping over their range of
>> >> keys. I've run the job with debug enabled and the ranges look ok, see
>> >> http://pastebin.com/stVsFzLM
>> >>
>> >> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
>> >> number of mappers the job creates:
>> >> 0.8.7: 4680
>> >> 0.8.10: 595
>> >>
>> >> Task       Complete
>> >> task_201202281457_2027_m_000041       9076.81%
>> >> task_201202281457_2027_m_000073       9639.04%
>> >> task_201202281457_2027_m_000105       10538.60%
>> >> task_201202281457_2027_m_000108       9364.17%
>> >>
>> >> None of this happens with cassandra-all 0.8.7.
>> >>
>> >> Regards,
>> >> P.
>> >>
>> >>
>> >>
>> >> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto
>> >> <pa...@gmail.com> wrote:
>> >>> I'll alter these settings and will let you know.
>> >>>
>> >>> Regards,
>> >>> P.
>> >>>
>> >>> On Tue, Feb 28, 2012 at 09:23, aaron morton <aa...@thelastpickle.com>
>> >>> wrote:
>> >>>> Have you tried lowering the  batch size and increasing the time out?
>> >>>> Even
>> >>>> just to get it to work.
>> >>>>
>> >>>> If you get a TimedOutException it means CL number of servers did not
>> >>>> respond
>> >>>> in time.
>> >>>>
>> >>>> Cheers
>> >>>>
>> >>>> -----------------
>> >>>> Aaron Morton
>> >>>> Freelance Developer
>> >>>> @aaronmorton
>> >>>> http://www.thelastpickle.com
>> >>>>
>> >>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>> >>>>
>> >>>> Hi aaron,
>> >>>>
>> >>>> this is our current settings:
>> >>>>
>> >>>>      <property>
>> >>>>          <name>cassandra.range.batch.size</name>
>> >>>>          <value>1024</value>
>> >>>>      </property>
>> >>>>
>> >>>>      <property>
>> >>>>          <name>cassandra.input.split.size</name>
>> >>>>          <value>16384</value>
>> >>>>      </property>
>> >>>>
>> >>>> rpc_timeout_in_ms: 30000
>> >>>>
>> >>>> Regards,
>> >>>> P.
>> >>>>
>> >>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aa...@thelastpickle.com>
>> >>>> wrote:
>> >>>>
>> >>>> What settings do you have for cassandra.range.batch.size
>> >>>>
>> >>>> and rpc_timeout_in_ms  ? Have you tried reducing the first and/or
>> >>>> increasing
>> >>>>
>> >>>> the second ?
>> >>>>
>> >>>>
>> >>>> Cheers
>> >>>>
>> >>>>
>> >>>> -----------------
>> >>>>
>> >>>> Aaron Morton
>> >>>>
>> >>>> Freelance Developer
>> >>>>
>> >>>> @aaronmorton
>> >>>>
>> >>>> http://www.thelastpickle.com
>> >>>>
>> >>>>
>> >>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>> >>>>
>> >>>>
>> >>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo
>> >>>> <ed...@gmail.com>
>> >>>>
>> >>>> wrote:
>> >>>>
>> >>>>
>> >>>> Did you see the notes here?
>> >>>>
>> >>>>
>> >>>>
>> >>>> I'm not sure what you mean by the notes.
>> >>>>
>> >>>>
>> >>>> I'm using the mapred.* settings suggested there:
>> >>>>
>> >>>>
>> >>>>     <property>
>> >>>>
>> >>>>         <name>mapred.max.tracker.failures</name>
>> >>>>
>> >>>>         <value>20</value>
>> >>>>
>> >>>>     </property>
>> >>>>
>> >>>>     <property>
>> >>>>
>> >>>>         <name>mapred.map.max.attempts</name>
>> >>>>
>> >>>>         <value>20</value>
>> >>>>
>> >>>>     </property>
>> >>>>
>> >>>>     <property>
>> >>>>
>> >>>>         <name>mapred.reduce.max.attempts</name>
>> >>>>
>> >>>>         <value>20</value>
>> >>>>
>> >>>>     </property>
>> >>>>
>> >>>>
>> >>>> But I still see the timeouts that I didn't have with cassandra-all 0.8.7.
>> >>>>
>> >>>>
>> >>>> P.
>> >>>>
>> >>>>
>> >>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >
>
>

Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Florent Lefillâtre <fl...@gmail.com>.
CFRR.getProgress() is called by the child mapper tasks on each TaskTracker
node, so the log message should appear in
${hadoop_log_dir}/attempt_201202081707_0001_m_000000_0/syslog (or
something like that) on the TaskTrackers, not in the client job logs.
Are you sure you're looking at the right log file? I ask because in your
first mail you linked the client job log.
And maybe you could log the size of each split in CFIF.




Le 6 mars 2012 13:09, Patrik Modesto <pa...@gmail.com> a écrit :

> I've added a debug message in CFRR.getProgress() and I can't find
> it in the debug output. It seems getProgress() has not been
> called at all.
>
> Regards,
> P.
>
> On Tue, Mar 6, 2012 at 09:49, Jeremy Hanna <je...@gmail.com>
> wrote:
> > you may be running into this -
> https://issues.apache.org/jira/browse/CASSANDRA-3942 - I'm not sure if it
> really affects the execution of the job itself though.
> >
> > On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:
> >
> >> Hi,
> >>
> >> I was recently trying a Hadoop job + cassandra-all 0.8.10 again and the
> >> timeouts I get are not because Cassandra can't handle the
> >> requests. I've noticed there are several tasks that show progress of
> >> several thousand percent. It seems they are looping over their range of
> >> keys. I've run the job with debug enabled and the ranges look ok, see
> >> http://pastebin.com/stVsFzLM
> >>
> >> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
> >> number of mappers the job creates:
> >> 0.8.7: 4680
> >> 0.8.10: 595
> >>
> >> Task       Complete
> >> task_201202281457_2027_m_000041       9076.81%
> >> task_201202281457_2027_m_000073       9639.04%
> >> task_201202281457_2027_m_000105       10538.60%
> >> task_201202281457_2027_m_000108       9364.17%
> >>
> >> None of this happens with cassandra-all 0.8.7.
> >>
> >> Regards,
> >> P.
> >>
> >>
> >>
> >> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto <pa...@gmail.com>
> wrote:
> >>> I'll alter these settings and will let you know.
> >>>
> >>> Regards,
> >>> P.
> >>>
> >>> On Tue, Feb 28, 2012 at 09:23, aaron morton <aa...@thelastpickle.com>
> wrote:
> >>>> Have you tried lowering the  batch size and increasing the time out?
> Even
> >>>> just to get it to work.
> >>>>
> >>>> If you get a TimedOutException it means CL number of servers did not
> respond
> >>>> in time.
> >>>>
> >>>> Cheers
> >>>>
> >>>> -----------------
> >>>> Aaron Morton
> >>>> Freelance Developer
> >>>> @aaronmorton
> >>>> http://www.thelastpickle.com
> >>>>
> >>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
> >>>>
> >>>> Hi aaron,
> >>>>
> >>>> this is our current settings:
> >>>>
> >>>>      <property>
> >>>>          <name>cassandra.range.batch.size</name>
> >>>>          <value>1024</value>
> >>>>      </property>
> >>>>
> >>>>      <property>
> >>>>          <name>cassandra.input.split.size</name>
> >>>>          <value>16384</value>
> >>>>      </property>
> >>>>
> >>>> rpc_timeout_in_ms: 30000
> >>>>
> >>>> Regards,
> >>>> P.
> >>>>
> >>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aa...@thelastpickle.com>
> wrote:
> >>>>
> >>>> What settings do you have for cassandra.range.batch.size
> >>>>
> >>>> and rpc_timeout_in_ms  ? Have you tried reducing the first and/or
> increasing
> >>>>
> >>>> the second ?
> >>>>
> >>>>
> >>>> Cheers
> >>>>
> >>>>
> >>>> -----------------
> >>>>
> >>>> Aaron Morton
> >>>>
> >>>> Freelance Developer
> >>>>
> >>>> @aaronmorton
> >>>>
> >>>> http://www.thelastpickle.com
> >>>>
> >>>>
> >>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
> >>>>
> >>>>
> >>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo <edlinuxguru@gmail.com
> >
> >>>>
> >>>> wrote:
> >>>>
> >>>>
> >>>> Did you see the notes here?
> >>>>
> >>>>
> >>>>
> >>>> I'm not sure what you mean by the notes.
> >>>>
> >>>>
> >>>> I'm using the mapred.* settings suggested there:
> >>>>
> >>>>
> >>>>     <property>
> >>>>
> >>>>         <name>mapred.max.tracker.failures</name>
> >>>>
> >>>>         <value>20</value>
> >>>>
> >>>>     </property>
> >>>>
> >>>>     <property>
> >>>>
> >>>>         <name>mapred.map.max.attempts</name>
> >>>>
> >>>>         <value>20</value>
> >>>>
> >>>>     </property>
> >>>>
> >>>>     <property>
> >>>>
> >>>>         <name>mapred.reduce.max.attempts</name>
> >>>>
> >>>>         <value>20</value>
> >>>>
> >>>>     </property>
> >>>>
> >>>>
> >>>> But I still see the timeouts that I didn't have with cassandra-all 0.8.7.
> >>>>
> >>>>
> >>>> P.
> >>>>
> >>>>
> >>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
> >>>>
> >>>>
> >>>>
> >>>>
> >
>

Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Patrik Modesto <pa...@gmail.com>.
I've added a debug message in CFRR.getProgress() and I can't find
it in the debug output. It seems getProgress() has not been
called at all.
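For context, a hedged sketch (hypothetical, not the actual CFRR code) of how a record reader typically reports progress, and why a too-low row estimate for a split pushes the reported figure past 100% unless it is clamped:

```java
public class ProgressSketch {
    // Hypothetical progress computation: rows read so far divided by the
    // estimated rows in the split. If the estimate is too low (or the row
    // iterator never terminates), the raw ratio exceeds 1.0 -- i.e. the
    // thousands-of-percent figures seen in the task list. Clamping keeps
    // the report sane but does not fix the underlying iteration problem.
    static float progress(long rowsRead, long estimatedRows) {
        if (estimatedRows <= 0) {
            return 0f;
        }
        return Math.min(1.0f, (float) rowsRead / estimatedRows);
    }

    public static void main(String[] args) {
        // 150000 rows read against an estimate of 16384 would report ~915%
        // unclamped; here it is capped at 1.0.
        System.out.println(progress(150000L, 16384L));
    }
}
```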

Regards,
P.

On Tue, Mar 6, 2012 at 09:49, Jeremy Hanna <je...@gmail.com> wrote:
> you may be running into this - https://issues.apache.org/jira/browse/CASSANDRA-3942 - I'm not sure if it really affects the execution of the job itself though.
>
> On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:
>
>> Hi,
>>
>> I was recently trying a Hadoop job + cassandra-all 0.8.10 again and the
>> timeouts I get are not because Cassandra can't handle the
>> requests. I've noticed there are several tasks that show progress of
>> several thousand percent. It seems they are looping over their range of
>> keys. I've run the job with debug enabled and the ranges look ok, see
>> http://pastebin.com/stVsFzLM
>>
>> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
>> number of mappers the job creates:
>> 0.8.7: 4680
>> 0.8.10: 595
>>
>> Task       Complete
>> task_201202281457_2027_m_000041       9076.81%
>> task_201202281457_2027_m_000073       9639.04%
>> task_201202281457_2027_m_000105       10538.60%
>> task_201202281457_2027_m_000108       9364.17%
>>
>> None of this happens with cassandra-all 0.8.7.
>>
>> Regards,
>> P.
>>
>>
>>
>> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto <pa...@gmail.com> wrote:
>>> I'll alter these settings and will let you know.
>>>
>>> Regards,
>>> P.
>>>
>>> On Tue, Feb 28, 2012 at 09:23, aaron morton <aa...@thelastpickle.com> wrote:
>>>> Have you tried lowering the  batch size and increasing the time out? Even
>>>> just to get it to work.
>>>>
>>>> If you get a TimedOutException it means CL number of servers did not respond
>>>> in time.
>>>>
>>>> Cheers
>>>>
>>>> -----------------
>>>> Aaron Morton
>>>> Freelance Developer
>>>> @aaronmorton
>>>> http://www.thelastpickle.com
>>>>
>>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>>>
>>>> Hi aaron,
>>>>
>>>> this is our current settings:
>>>>
>>>>      <property>
>>>>          <name>cassandra.range.batch.size</name>
>>>>          <value>1024</value>
>>>>      </property>
>>>>
>>>>      <property>
>>>>          <name>cassandra.input.split.size</name>
>>>>          <value>16384</value>
>>>>      </property>
>>>>
>>>> rpc_timeout_in_ms: 30000
>>>>
>>>> Regards,
>>>> P.
>>>>
>>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aa...@thelastpickle.com> wrote:
>>>>
>>>> What settings do you have for cassandra.range.batch.size
>>>>
>>>> and rpc_timeout_in_ms  ? Have you tried reducing the first and/or increasing
>>>>
>>>> the second ?
>>>>
>>>>
>>>> Cheers
>>>>
>>>>
>>>> -----------------
>>>>
>>>> Aaron Morton
>>>>
>>>> Freelance Developer
>>>>
>>>> @aaronmorton
>>>>
>>>> http://www.thelastpickle.com
>>>>
>>>>
>>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>>>
>>>>
>>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo <ed...@gmail.com>
>>>>
>>>> wrote:
>>>>
>>>>
>>>> Did you see the notes here?
>>>>
>>>>
>>>>
>>>> I'm not sure what you mean by the notes.
>>>>
>>>>
>>>> I'm using the mapred.* settings suggested there:
>>>>
>>>>
>>>>     <property>
>>>>
>>>>         <name>mapred.max.tracker.failures</name>
>>>>
>>>>         <value>20</value>
>>>>
>>>>     </property>
>>>>
>>>>     <property>
>>>>
>>>>         <name>mapred.map.max.attempts</name>
>>>>
>>>>         <value>20</value>
>>>>
>>>>     </property>
>>>>
>>>>     <property>
>>>>
>>>>         <name>mapred.reduce.max.attempts</name>
>>>>
>>>>         <value>20</value>
>>>>
>>>>     </property>
>>>>
>>>>
>>>> But I still see the timeouts that I didn't have with cassandra-all 0.8.7.
>>>>
>>>>
>>>> P.
>>>>
>>>>
>>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>>>>
>>>>
>>>>
>>>>
>

Re: newer Cassandra + Hadoop = TimedOutException()

Posted by Jeremy Hanna <je...@gmail.com>.
you may be running into this - https://issues.apache.org/jira/browse/CASSANDRA-3942 - I'm not sure if it really affects the execution of the job itself though.

On Mar 6, 2012, at 2:32 AM, Patrik Modesto wrote:

> Hi,
> 
> I was recently trying a Hadoop job + cassandra-all 0.8.10 again and the
> timeouts I get are not because Cassandra can't handle the
> requests. I've noticed there are several tasks that show progress of
> several thousand percent. It seems they are looping over their range of
> keys. I've run the job with debug enabled and the ranges look ok, see
> http://pastebin.com/stVsFzLM
> 
> Another difference between cassandra-all 0.8.7 and 0.8.10 is the
> number of mappers the job creates:
> 0.8.7: 4680
> 0.8.10: 595
> 
> Task       Complete
> task_201202281457_2027_m_000041	9076.81%
> task_201202281457_2027_m_000073	9639.04%
> task_201202281457_2027_m_000105	10538.60%
> task_201202281457_2027_m_000108	9364.17%
> 
> None of this happens with cassandra-all 0.8.7.
> 
> Regards,
> P.
> 
> 
> 
> On Tue, Feb 28, 2012 at 12:29, Patrik Modesto <pa...@gmail.com> wrote:
>> I'll alter these settings and will let you know.
>> 
>> Regards,
>> P.
>> 
>> On Tue, Feb 28, 2012 at 09:23, aaron morton <aa...@thelastpickle.com> wrote:
>>> Have you tried lowering the  batch size and increasing the time out? Even
>>> just to get it to work.
>>> 
>>> If you get a TimedOutException it means CL number of servers did not respond
>>> in time.
>>> 
>>> Cheers
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 28/02/2012, at 8:18 PM, Patrik Modesto wrote:
>>> 
>>> Hi aaron,
>>> 
>>> this is our current settings:
>>> 
>>>      <property>
>>>          <name>cassandra.range.batch.size</name>
>>>          <value>1024</value>
>>>      </property>
>>> 
>>>      <property>
>>>          <name>cassandra.input.split.size</name>
>>>          <value>16384</value>
>>>      </property>
>>> 
>>> rpc_timeout_in_ms: 30000
>>> 
>>> Regards,
>>> P.
>>> 
>>> On Mon, Feb 27, 2012 at 21:54, aaron morton <aa...@thelastpickle.com> wrote:
>>> 
>>> What settings do you have for cassandra.range.batch.size
>>> 
>>> and rpc_timeout_in_ms  ? Have you tried reducing the first and/or increasing
>>> 
>>> the second ?
>>> 
>>> 
>>> Cheers
>>> 
>>> 
>>> -----------------
>>> 
>>> Aaron Morton
>>> 
>>> Freelance Developer
>>> 
>>> @aaronmorton
>>> 
>>> http://www.thelastpickle.com
>>> 
>>> 
>>> On 27/02/2012, at 8:02 PM, Patrik Modesto wrote:
>>> 
>>> 
>>> On Sun, Feb 26, 2012 at 04:25, Edward Capriolo <ed...@gmail.com>
>>> 
>>> wrote:
>>> 
>>> 
>>> Did you see the notes here?
>>> 
>>> 
>>> 
>>> I'm not sure what you mean by the notes.
>>> 
>>> 
>>> I'm using the mapred.* settings suggested there:
>>> 
>>> 
>>>     <property>
>>> 
>>>         <name>mapred.max.tracker.failures</name>
>>> 
>>>         <value>20</value>
>>> 
>>>     </property>
>>> 
>>>     <property>
>>> 
>>>         <name>mapred.map.max.attempts</name>
>>> 
>>>         <value>20</value>
>>> 
>>>     </property>
>>> 
>>>     <property>
>>> 
>>>         <name>mapred.reduce.max.attempts</name>
>>> 
>>>         <value>20</value>
>>> 
>>>     </property>
>>> 
>>> 
>>> But I still see the timeouts that I didn't have with cassandra-all 0.8.7.
>>> 
>>> 
>>> P.
>>> 
>>> 
>>> http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
>>> 
>>> 
>>> 
>>>