Posted to commits@cassandra.apache.org by "Nick Puz (JIRA)" <ji...@apache.org> on 2013/06/19 21:05:19 UTC

[jira] [Created] (CASSANDRA-5667) Change timestamps used in CAS ballot proposals to be more resilient to clock skew

Nick Puz created CASSANDRA-5667:
-----------------------------------

             Summary: Change timestamps used in CAS ballot proposals to be more resilient to clock skew
                 Key: CASSANDRA-5667
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5667
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: n/a
            Reporter: Nick Puz


The current time is used to generate the timeuuid for CAS ballot proposals, on the logic that if a newer proposal already exists, the current proposer must complete that one and then re-propose. The problem is that if a machine's clock is skewed and drifts into the future, it will propose with a large timestamp (which will be accepted), and subsequent proposals with lower (but correct) timestamps will then be unable to proceed. This prevents CAS write operations and also reads at serializable consistency level.

The workaround is to propose initially with the current time (the current behavior), but if the proposal fails because a larger one already exists, to re-propose (after completing the existing proposal if necessary) with max(currentTime, mostRecent+1, proposed+1).
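
A minimal sketch of the timestamp selection described above, in Java. The class and method names (BallotTimestamps, initial, retry) are hypothetical and are not taken from Cassandra's code base; timestamps are modeled as plain long values in whatever unit the ballot uses, and the conversion to a timeuuid is left out.

```java
// Hypothetical helper for illustration only; not Cassandra's actual API.
final class BallotTimestamps {
    private BallotTimestamps() {}

    /** First attempt: propose with the local clock (the current behavior). */
    static long initial(long currentTime) {
        return currentTime;
    }

    /**
     * Retry after a rejection: choose a timestamp strictly greater than both
     * the most recent ballot reported by the replicas and our own rejected
     * proposal, but never earlier than the local clock.
     * Math.addExact throws ArithmeticException on overflow instead of wrapping.
     */
    static long retry(long currentTime, long mostRecent, long proposed) {
        long afterMostRecent = Math.addExact(mostRecent, 1L);
        long afterProposed = Math.addExact(proposed, 1L);
        return Math.max(currentTime, Math.max(afterMostRecent, afterProposed));
    }
}
```

For example, if a node with a correct clock reading 1000 is rejected because a skewed node already proposed at 5000, retry(1000, 5000, 1000) yields 5001, so the retry can proceed even though the local clock is behind the existing ballot.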

Since small drift between nodes in the same datacenter is normal, this can happen even when NTP is working properly, if a write hits one node and a subsequent serialized read hits another. With NTP configuration issues (or OS time bugs, especially around DST), the unavailability window could be much larger.


