You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Hefeng Yuan <hf...@rhapsody.com> on 2011/08/17 20:24:28 UTC

One hot node slows down whole cluster

Hi,

We're noticing that when one node gets hot (very high cpu usage) because of 'nodetool repair', the whole cluster's performance becomes really bad.

We're using 0.8.0 with random partition. We have 6 nodes with RF 5. Our repair is scheduled to run once a week, spread across whole cluster. I do get suggestion from Jonothan that 0.8.0 has some bug on the repair, but wondering why one hot node would slow down the whole cluster.

We saw this symptom yesterday on one node, and today on the adjacent node. Most probably it'll happen on the next one tomorrow.

We do see lots of (~200) RejectedExecutionException 3 hours before the repair job, and also in the middle of the repair job, not sure whether they're related. Full stack is attached in the end.

Do we have any suggestion/hint?

Thanks,
Hefeng


ERROR [pool-2-thread-3097] 2011-08-17 08:42:38,118 Cassandra.java (line 3462) Internal error processing batch_mutate
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
	at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
	at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
	at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
	at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
	at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
	at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
	at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
ERROR [Thread-137480] 2011-08-17 08:42:38,121 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-137480,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
	at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:444)
	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:117)

Re: One hot node slows down whole cluster

Posted by Hefeng Yuan <hf...@rhapsody.com>.
Thanks Aaron for the response. We're not doing drain on node, and there's no that message in the log.

We used LOCAL_QUORUM CL,
endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
dynamic_snitch: false
dynamic_snitch_badness_threshold: 0.0

Because we have another 3 nodes DC for Brisk, maybe that's the problem?


On Aug 17, 2011, at 4:27 PM, aaron morton wrote:

> wrt the Exception something has shutdown the Mutation thread pool. The only thing I can see in the code to do this is nodetool drain and running the Embedded server. If it was drain you should see an INFO level messages "Node is drained" somewhere. Could either of these things be happening ? 
> 
> wrt the slow down:
> - what CL are you using  for reads and writes ? What does the ring look like ? 
> - have a look at tp stats to see what stage backing up 
> - ensure you have the dynamic snitch enabled
> - what setting do you have for dynamic_snitch_badness_threshold in yaml
> - have a look at the o.a.c.DynamicEndpointSnitch info in JMX / JConsole  at dumpTimings() and scores
> 
> Basically slower nodes should be used less. But there are reasons they may not be, so lets work out what requests are running slow and if the Dynamic Snitch is doing the right thing. I would look at that error first, seems odd.
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 18/08/2011, at 6:52 AM, Hefeng Yuan wrote:
> 
>> Just wondering, would it help if we shorten the rpc_timeout_in_ms (currently using 30,000), so that when one node gets hot and responding slowly, others will just take it as down and move forward?
>> 
>> On Aug 17, 2011, at 11:35 AM, Hefeng Yuan wrote:
>> 
>>> Sorry, correction, we're using 0.8.1.
>>> 
>>> On Aug 17, 2011, at 11:24 AM, Hefeng Yuan wrote:
>>> 
>>>> Hi,
>>>> 
>>>> We're noticing that when one node gets hot (very high cpu usage) because of 'nodetool repair', the whole cluster's performance becomes really bad.
>>>> 
>>>> We're using 0.8.1 with random partition. We have 6 nodes with RF 5. Our repair is scheduled to run once a week, spread across whole cluster. I do get suggestion from Jonothan that 0.8.0 has some bug on the repair, but wondering why one hot node would slow down the whole cluster.
>>>> 
>>>> We saw this symptom yesterday on one node, and today on the adjacent node. Most probably it'll happen on the next one tomorrow.
>>>> 
>>>> We do see lots of (~200) RejectedExecutionException 3 hours before the repair job, and also in the middle of the repair job, not sure whether they're related. Full stack is attached in the end.
>>>> 
>>>> Do we have any suggestion/hint?
>>>> 
>>>> Thanks,
>>>> Hefeng
>>>> 
>>>> 
>>>> ERROR [pool-2-thread-3097] 2011-08-17 08:42:38,118 Cassandra.java (line 3462) Internal error processing batch_mutate
>>>> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
>>>> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
>>>> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>>>> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>>>> 	at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
>>>> 	at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
>>>> 	at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
>>>> 	at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
>>>> 	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
>>>> 	at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
>>>> 	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
>>>> 	at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
>>>> 	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
>>>> 	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
>>>> 	at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>>>> 	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>>>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>> 	at java.lang.Thread.run(Thread.java:619)
>>>> ERROR [Thread-137480] 2011-08-17 08:42:38,121 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-137480,5,main]
>>>> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
>>>> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
>>>> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>>>> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>>>> 	at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:444)
>>>> 	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:117)
>>> 
>> 
> 


Re: One hot node slows down whole cluster

Posted by aaron morton <aa...@thelastpickle.com>.
wrt the Exception something has shutdown the Mutation thread pool. The only thing I can see in the code to do this is nodetool drain and running the Embedded server. If it was drain you should see an INFO level messages "Node is drained" somewhere. Could either of these things be happening ? 

wrt the slow down:
- what CL are you using  for reads and writes ? What does the ring look like ? 
- have a look at tp stats to see what stage backing up 
- ensure you have the dynamic snitch enabled
- what setting do you have for dynamic_snitch_badness_threshold in yaml
- have a look at the o.a.c.DynamicEndpointSnitch info in JMX / JConsole  at dumpTimings() and scores

Basically slower nodes should be used less. But there are reasons they may not be, so lets work out what requests are running slow and if the Dynamic Snitch is doing the right thing. I would look at that error first, seems odd.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 18/08/2011, at 6:52 AM, Hefeng Yuan wrote:

> Just wondering, would it help if we shorten the rpc_timeout_in_ms (currently using 30,000), so that when one node gets hot and responding slowly, others will just take it as down and move forward?
> 
> On Aug 17, 2011, at 11:35 AM, Hefeng Yuan wrote:
> 
>> Sorry, correction, we're using 0.8.1.
>> 
>> On Aug 17, 2011, at 11:24 AM, Hefeng Yuan wrote:
>> 
>>> Hi,
>>> 
>>> We're noticing that when one node gets hot (very high cpu usage) because of 'nodetool repair', the whole cluster's performance becomes really bad.
>>> 
>>> We're using 0.8.1 with random partition. We have 6 nodes with RF 5. Our repair is scheduled to run once a week, spread across whole cluster. I do get suggestion from Jonothan that 0.8.0 has some bug on the repair, but wondering why one hot node would slow down the whole cluster.
>>> 
>>> We saw this symptom yesterday on one node, and today on the adjacent node. Most probably it'll happen on the next one tomorrow.
>>> 
>>> We do see lots of (~200) RejectedExecutionException 3 hours before the repair job, and also in the middle of the repair job, not sure whether they're related. Full stack is attached in the end.
>>> 
>>> Do we have any suggestion/hint?
>>> 
>>> Thanks,
>>> Hefeng
>>> 
>>> 
>>> ERROR [pool-2-thread-3097] 2011-08-17 08:42:38,118 Cassandra.java (line 3462) Internal error processing batch_mutate
>>> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
>>> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
>>> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>>> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>>> 	at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
>>> 	at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
>>> 	at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
>>> 	at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
>>> 	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
>>> 	at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
>>> 	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
>>> 	at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
>>> 	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
>>> 	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
>>> 	at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>>> 	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> 	at java.lang.Thread.run(Thread.java:619)
>>> ERROR [Thread-137480] 2011-08-17 08:42:38,121 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-137480,5,main]
>>> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
>>> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
>>> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>>> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>>> 	at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:444)
>>> 	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:117)
>> 
> 


Re: One hot node slows down whole cluster

Posted by Hefeng Yuan <hf...@rhapsody.com>.
Just wondering, would it help if we shorten the rpc_timeout_in_ms (currently using 30,000), so that when one node gets hot and responding slowly, others will just take it as down and move forward?

On Aug 17, 2011, at 11:35 AM, Hefeng Yuan wrote:

> Sorry, correction, we're using 0.8.1.
> 
> On Aug 17, 2011, at 11:24 AM, Hefeng Yuan wrote:
> 
>> Hi,
>> 
>> We're noticing that when one node gets hot (very high cpu usage) because of 'nodetool repair', the whole cluster's performance becomes really bad.
>> 
>> We're using 0.8.1 with random partition. We have 6 nodes with RF 5. Our repair is scheduled to run once a week, spread across whole cluster. I do get suggestion from Jonothan that 0.8.0 has some bug on the repair, but wondering why one hot node would slow down the whole cluster.
>> 
>> We saw this symptom yesterday on one node, and today on the adjacent node. Most probably it'll happen on the next one tomorrow.
>> 
>> We do see lots of (~200) RejectedExecutionException 3 hours before the repair job, and also in the middle of the repair job, not sure whether they're related. Full stack is attached in the end.
>> 
>> Do we have any suggestion/hint?
>> 
>> Thanks,
>> Hefeng
>> 
>> 
>> ERROR [pool-2-thread-3097] 2011-08-17 08:42:38,118 Cassandra.java (line 3462) Internal error processing batch_mutate
>> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
>> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
>> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>> 	at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
>> 	at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
>> 	at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
>> 	at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
>> 	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
>> 	at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
>> 	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
>> 	at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
>> 	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
>> 	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
>> 	at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
>> 	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> 	at java.lang.Thread.run(Thread.java:619)
>> ERROR [Thread-137480] 2011-08-17 08:42:38,121 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-137480,5,main]
>> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
>> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
>> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
>> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
>> 	at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:444)
>> 	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:117)
> 


Re: One hot node slows down whole cluster

Posted by Hefeng Yuan <hf...@rhapsody.com>.
Sorry, correction, we're using 0.8.1.

On Aug 17, 2011, at 11:24 AM, Hefeng Yuan wrote:

> Hi,
> 
> We're noticing that when one node gets hot (very high cpu usage) because of 'nodetool repair', the whole cluster's performance becomes really bad.
> 
> We're using 0.8.1 with random partition. We have 6 nodes with RF 5. Our repair is scheduled to run once a week, spread across whole cluster. I do get suggestion from Jonothan that 0.8.0 has some bug on the repair, but wondering why one hot node would slow down the whole cluster.
> 
> We saw this symptom yesterday on one node, and today on the adjacent node. Most probably it'll happen on the next one tomorrow.
> 
> We do see lots of (~200) RejectedExecutionException 3 hours before the repair job, and also in the middle of the repair job, not sure whether they're related. Full stack is attached in the end.
> 
> Do we have any suggestion/hint?
> 
> Thanks,
> Hefeng
> 
> 
> ERROR [pool-2-thread-3097] 2011-08-17 08:42:38,118 Cassandra.java (line 3462) Internal error processing batch_mutate
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
> 	at org.apache.cassandra.service.StorageProxy.insertLocal(StorageProxy.java:360)
> 	at org.apache.cassandra.service.StorageProxy.sendToHintedEndpoints(StorageProxy.java:241)
> 	at org.apache.cassandra.service.StorageProxy.access$000(StorageProxy.java:62)
> 	at org.apache.cassandra.service.StorageProxy$1.apply(StorageProxy.java:99)
> 	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:210)
> 	at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:154)
> 	at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:560)
> 	at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:511)
> 	at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:519)
> 	at org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.process(Cassandra.java:3454)
> 	at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> 	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:619)
> ERROR [Thread-137480] 2011-08-17 08:42:38,121 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[Thread-137480,5,main]
> java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
> 	at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:73)
> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
> 	at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:444)
> 	at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:117)