Posted to commits@cassandra.apache.org by "Philip Thompson (JIRA)" <ji...@apache.org> on 2016/07/21 14:49:20 UTC

[jira] [Commented] (CASSANDRA-12260) dtest failure in topology_test.TestTopology.decommissioned_node_cant_rejoin_test

    [ https://issues.apache.org/jira/browse/CASSANDRA-12260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387818#comment-15387818 ] 

Philip Thompson commented on CASSANDRA-12260:
---------------------------------------------

So, [~jkni], this test flakes a lot, and for a trivial reason. We decommission a node and check that it can't rejoin. That behavior always works correctly: the expected errors are in the log, and the node fails to join the ring. The test flakes only because the node's C* process doesn't shut down in time.

Last time this happened, in CASSANDRA-11665, we tripled the timeout. Do we have a guarantee that this node will eventually shut down? If so, how long should we need to wait for that? Would it be better to only wait for the rejoin error and check that the node isn't in the ring, without waiting for the C* process to shut down?
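
To make that last option concrete, here is a rough sketch of what such a check might look like. It is not the dtest's actual code: `wait_for_log_line` and `ring_addresses` are hypothetical helpers, the error pattern is left as a parameter rather than quoting the real log message, and the parser assumes `nodetool status` rows start with a two-letter state code (such as `UN` or `DN`) followed by the node's address.

```python
import re
import time


def wait_for_log_line(log_path, pattern, timeout=60.0, poll_interval=0.5):
    """Poll a log file until a line matches `pattern` or `timeout` elapses.

    Hypothetical helper: it only reads the log and never looks at whether
    the node's process has exited.
    """
    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(log_path, errors="replace") as log:
                for line in log:
                    if regex.search(line):
                        return line.rstrip("\n")
        except FileNotFoundError:
            pass  # the node may not have created its log file yet
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no line matching %r in %s after %.1fs" % (pattern, log_path, timeout)
            )
        time.sleep(poll_interval)


def ring_addresses(status_output):
    """Return the set of node addresses listed in status-style text.

    Assumption: each node row begins with a state code made of U or D
    followed by N, L, J or M, then whitespace, then the address.
    """
    return set(re.findall(r"^[UD][NLJM]\s+(\S+)", status_output, re.MULTILINE))
```

The test would then call `wait_for_log_line` on the decommissioned node's log with whatever the rejoin error text is, take the status output from a node that is still live, and assert that the decommissioned node's address is not in `ring_addresses(...)`. Whether the process ever exits drops out of the test entirely, which is the trade-off being asked about.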

> dtest failure in topology_test.TestTopology.decommissioned_node_cant_rejoin_test
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12260
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12260
>             Project: Cassandra
>          Issue Type: Test
>            Reporter: Philip Thompson
>            Assignee: Philip Thompson
>              Labels: dtest
>         Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.9_novnode_dtest/14/testReport/topology_test/TestTopology/decommissioned_node_cant_rejoin_test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)