Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2011/06/14 11:36:47 UTC

[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

    [ https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049078#comment-13049078 ] 

Sylvain Lebresne commented on CASSANDRA-2768:
---------------------------------------------

The key point is that this is not specific to repair per se. The important part of the reported log output is the 'Excluding ...' message.
It is triggered by the following code in AES.getNeighbors:
{noformat}
  if (Gossiper.instance.getVersion(endpoint) <= MessagingService.VERSION_07)
  {
      logger.info("Excluding " + endpoint + " from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.");
      neighbors.remove(endpoint);
  }
{noformat}
Since Sasha has reportedly verified that all nodes report being on 0.8.0, this suggests a bug in the Gossiper that makes it report the wrong version (even after node restarts).

The exception itself has been fixed in CASSANDRA-2767 and should not be the focus of attention here.
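
For context on that exception: the trace in the description points at a HashMap key iterator inside getNeighbors, which is consistent with neighbors.remove(endpoint) being called while the set is still being iterated. The following is only a minimal sketch of the general technique for avoiding that (removing through the iterator), not the actual CASSANDRA-2767 patch. The class NeighborFilter, the method excludeOldNodes, the versions map standing in for Gossiper.instance.getVersion, and the numeric value of VERSION_07 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Cassandra.
class NeighborFilter {
    // Stand-in for MessagingService.VERSION_07; the real value is not given in this thread.
    static final int VERSION_07 = 1;

    /**
     * Removes every endpoint whose known version is <= VERSION_07 from
     * 'neighbors' and returns the removed endpoints.
     * 'versions' is an assumed substitute for Gossiper.instance.getVersion().
     * Endpoints with no known version are left in place (a choice made for
     * this sketch only).
     */
    static List<String> excludeOldNodes(Set<String> neighbors, Map<String, Integer> versions) {
        List<String> excluded = new ArrayList<String>();
        Iterator<String> it = neighbors.iterator();
        while (it.hasNext()) {
            String endpoint = it.next();
            Integer version = versions.get(endpoint);
            if (version != null && version <= VERSION_07) {
                // Removing through the iterator keeps its internal state
                // consistent, so no ConcurrentModificationException is thrown.
                it.remove();
                excluded.add(endpoint);
            }
        }
        return excluded;
    }

    public static void main(String[] args) {
        Set<String> neighbors = new HashSet<String>();
        neighbors.add("10.0.0.1");
        neighbors.add("10.0.0.2");
        neighbors.add("10.0.0.3");

        Map<String, Integer> versions = new HashMap<String, Integer>();
        versions.put("10.0.0.1", 2); // newer than VERSION_07 in this sketch
        versions.put("10.0.0.2", 1); // equal to VERSION_07, so excluded
        versions.put("10.0.0.3", 0); // older, so excluded

        List<String> excluded = excludeOldNodes(neighbors, versions);
        System.out.println("Excluded: " + excluded);
        System.out.println("Remaining: " + neighbors);
    }
}
```

Note that this only addresses the exception. If the Gossiper reports a stale version, up-to-date nodes would still be excluded, which is the actual problem tracked here.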

> AntiEntropyService excluding nodes that are on version 0.7 or sooner
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-2768
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2768
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.0
>         Environment: 4 node environment -- 
> Originally 0.7.6-2 with a Keyspace defined with RF=3
> Upgraded all nodes ( 1 at a time ) to version 0.8.0:  For each node, the node was shut down, new version was turned on, using the existing data files / directories and a nodetool repair was run.  
>            Reporter: Sasha Dolgy
>            Assignee: Sylvain Lebresne
>
> When I run nodetool repair on any of the nodes, the /var/log/cassandra/system.log reports errors similar to:
> INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
> ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI Runtime]
> java.util.ConcurrentModificationException
>       at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>       at java.util.HashMap$KeyIterator.next(HashMap.java:828)
>       at org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173)
>       at org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776)
> The INFO message and subsequent ERROR message are logged for 2 nodes. I suspect that this is because RF=3.
> nodetool ring shows that all nodes are up.
> Client connections (read / write) are not having issues.
> nodetool version on all nodes shows that each node is on 0.8.0.
> At the suggestion of some contributors, I restarted each node and ran nodetool repair again. The result is the same, with the same messages being logged.
