You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Stefano Ortolani (JIRA)" <ji...@apache.org> on 2017/09/03 15:56:00 UTC

[jira] [Comment Edited] (CASSANDRA-13043) UnavailabeException caused by counter writes forwarded to leaders without complete cluster view

    [ https://issues.apache.org/jira/browse/CASSANDRA-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149262#comment-16149262 ] 

Stefano Ortolani edited comment on CASSANDRA-13043 at 9/3/17 3:55 PM:
----------------------------------------------------------------------

Some updates: added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). 

This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043

Attached a patch that try to fix it by removing from a leader election nodes that are not RPC ready.
Also, fixed corner case where the CL required is local.


was (Author: ostefano):
Some updates:

* Added to ccm the ability to send a byteman rule when restarting a node (https://github.com/ostefano/ccm/tree/startup_byteman). 
* Not trying to slow down the gossip anymore, but rather I instruct the other two nodes to pick the restarting node as leader.

This allowed me to finally reproduce the bug: https://github.com/ostefano/cassandra-dtest/tree/CASSANDRA-13043
The way I plan to fix is to make `assureSufficientLiveNodes` wait for the gossip to settle.
What do you think, [~iamaleksey]? Would that be a sound approach to fix it?

> UnavailabeException caused by counter writes forwarded to leaders without complete cluster view
> -----------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13043
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13043
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Debian
>            Reporter: Catalin Alexandru Zamfir
>         Attachments: patch.diff
>
>
> In version 3.9 of Cassandra, we get the following exceptions on the system.log whenever booting an agent. They seem to grow in number with each reboot. Any idea where they come from or what can we do about them? Note that the cluster is healthy (has sufficient live nodes).
> {noformat}
> 2/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMINFO  10:39:47 Updating topology for /10.136.64.120
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread Thread[CounterMutationStage-111,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread Thread[CounterMutationStage-118,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread Thread[CounterMutationStage-164,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> 12/14/2016 12:39:47 PMWARN  10:39:47 Uncaught exception on thread Thread[CounterMutationStage-117,5,main]: {}
> 12/14/2016 12:39:47 PMorg.apache.cassandra.exceptions.UnavailableException: Cannot achieve consistency level LOCAL_QUORUM
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.ConsistencyLevel.assureSufficientLiveNodes(ConsistencyLevel.java:313) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.AbstractWriteResponseHandler.assureSufficientLiveNodes(AbstractWriteResponseHandler.java:146) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:1054) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.service.StorageProxy.applyCounterMutationOnLeader(StorageProxy.java:1450) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.db.CounterMutationVerbHandler.doVerb(CounterMutationVerbHandler.java:48) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_111]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.9.jar:3.9]
> 12/14/2016 12:39:47 PM	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org