You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Sergio Bossa (JIRA)" <ji...@apache.org> on 2013/06/23 15:03:19 UTC

[jira] [Commented] (CASSANDRA-5692) Race condition in detecting version on a mixed 1.1/1.2 cluster

    [ https://issues.apache.org/jira/browse/CASSANDRA-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13691453#comment-13691453 ] 

Sergio Bossa commented on CASSANDRA-5692:
-----------------------------------------

I believe the correct thing to do is to _never_ assume a version when opening an outbound connection: rather, retrying a few times waiting for the version to be detected, and eventually failing fast.
                
> Race condition in detecting version on a mixed 1.1/1.2 cluster
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-5692
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5692
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.9, 1.2.5
>            Reporter: Sergio Bossa
>
> On a mixed 1.1 / 1.2 cluster, starting 1.2 nodes fires sometimes a race condition in version detection, where the 1.2 node wrongly detects version 6 for a 1.1 node.
> It works as follows:
> 1) The just started 1.2 node quickly opens an OutboundTcpConnection toward a 1.1 node before receiving any messages from the latter.
> 2) Given the version is correctly detected only when the first message is received, the version is momentarily set at 6.
> 3) This opens an OutboundTcpConnection from 1.2 to 1.1 at version 6, which gets stuck in the connect() method.
> Later, the version is correctly fixed, but all outbound connections from 1.2 to 1.1 are stuck at this point.
> Evidence from 1.2 logs:
> TRACE 13:48:31,133 Assuming current protocol version for /127.0.0.2
> DEBUG 13:48:37,837 Setting version 5 for /127.0.0.2

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira