Posted to commits@cassandra.apache.org by "Ankit Patel (JIRA)" <ji...@apache.org> on 2014/01/21 17:33:19 UTC

[jira] [Created] (CASSANDRA-6608) Cassandra timeout on node failure

Ankit Patel created CASSANDRA-6608:
--------------------------------------

             Summary: Cassandra timeout on node failure
                 Key: CASSANDRA-6608
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6608
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Ankit Patel


We are seeing a weird issue with our Cassandra cluster (version 1.0.10). We have 6 nodes (DC1:3, DC2:3) in our cluster, so all 6 nodes are replicas of each other. All reads and writes use LOCAL_QUORUM. When one of the nodes in DC1 fails, we see timeout errors. When we turned on DEBUG-level logging, we saw the following error in the Cassandra logs:

DEBUG [Thrift:322] 2013-12-20 14:30:20,123 StorageProxy.java (line 676) Read timeout: java.util.concurrent.TimeoutException: Operation timed out - received only 2 responses from / xxx.xxx.xxx.IP1, xxx.xxx.xxx.IP2, .

Considering that LOCAL_QUORUM only needs 2 of the 3 nodes in the DC, I am surprised we are seeing this issue. Interestingly, when we connect to the third node after the second node has returned a timeout error, it works as expected.
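
For reference, the arithmetic behind that expectation can be sketched as below. This is only an illustration, not Cassandra code: the function names are hypothetical, and it assumes a replication factor of 3 in each DC (implied by all 6 nodes being replicas) and the usual quorum formula of floor(RF / 2) + 1.

```python
def quorum(replication_factor: int) -> int:
    """Number of replica responses a quorum requires: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def can_satisfy_local_quorum(local_rf: int, live_local_replicas: int) -> bool:
    """True if enough replicas in the local DC are alive to reach LOCAL_QUORUM."""
    return live_local_replicas >= quorum(local_rf)


# Assumed setup from the report: 3 replicas in DC1, one of them down.
local_rf = 3
live_after_failure = local_rf - 1

print(quorum(local_rf))                                        # 2
print(can_satisfy_local_quorum(local_rf, live_after_failure))  # True
```

With one DC1 node down, 2 live replicas should still be enough, which is why the timeout is unexpected.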




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)