Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2012/09/07 17:02:08 UTC

[jira] [Updated] (CASSANDRA-4626) Multiple values for CurrentLocal Node ID

     [ https://issues.apache.org/jira/browse/CASSANDRA-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4626:
----------------------------------------

    Attachment: 4626.txt

I think this can happen because of the commit log. Basically, when you restart a node, it may not pick the correct current NodeId if it attempts to read the current NodeId before the commit log is fully replayed (and the more recent NodeId is in the log, not yet replayed). This would then lead to having 2 columns in the CurrentLocal row.

However, the main problem is that the way we maintain the CurrentLocal row is fragile and honestly dumb (I wrote it so I'm blaming myself). We already store all the generated NodeIds sorted by creation time in a separate row, so reading the last column of that row is a much simpler and more resilient way to do it. Attaching a patch that does just that.
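
To illustrate the idea (this is a toy sketch, not the attached patch): the class NodeIdHistory and its methods are hypothetical names, and a TreeMap keyed by creation time stands in for the row that holds every generated NodeId in creation order.

```java
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical stand-in for the row holding every generated NodeId,
// sorted by creation time. Not Cassandra's actual classes.
class NodeIdHistory {
    // creation time (ms) -> NodeId, kept in ascending key order
    private final TreeMap<Long, String> idsByCreationTime = new TreeMap<>();

    // Record a newly generated NodeId under its creation time.
    void record(long createdAtMillis, String nodeId) {
        idsByCreationTime.put(createdAtMillis, nodeId);
    }

    // The current NodeId is simply the most recently created one,
    // so there is no separate "current" marker that can get out of sync.
    String current() {
        if (idsByCreationTime.isEmpty()) {
            throw new NoSuchElementException("no NodeId recorded yet");
        }
        return idsByCreationTime.lastEntry().getValue();
    }
}
```

Because the "current" value is derived from the sorted history rather than stored separately, there is no second copy of it that could end up with two entries.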

The patch also adds a forceFlush in SystemTable.writeCurrentNodeId to avoid the problem of not reading the last NodeId because of log replay.
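
A rough model of why the flush helps, under heavy simplifying assumptions: ToyNodeIdStore below is a hypothetical class, not Cassandra's storage engine. It only mimics the relevant behaviour, namely that unflushed writes live in the commit log and are invisible after a restart until the log is replayed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (hypothetical): writes go to a commit log and a memtable;
// only flushed data is readable right after a restart, before log replay.
class ToyNodeIdStore {
    private final List<String> flushed = new ArrayList<>();   // survives restart, readable at once
    private final List<String> commitLog = new ArrayList<>(); // survives restart, needs replay
    private final List<String> memtable = new ArrayList<>();  // lost on restart

    void write(String nodeId, boolean forceFlush) {
        commitLog.add(nodeId);
        memtable.add(nodeId);
        if (forceFlush) {
            flush();
        }
    }

    // Move in-memory data to durable storage; the log is no longer needed.
    void flush() {
        flushed.addAll(memtable);
        memtable.clear();
        commitLog.clear();
    }

    // A restart drops in-memory state; flushed data and the log remain.
    void restart() {
        memtable.clear();
    }

    void replayCommitLog() {
        memtable.addAll(commitLog);
    }

    // Newest NodeId visible to a reader right now, or null if none.
    String latestVisible() {
        if (!memtable.isEmpty()) {
            return memtable.get(memtable.size() - 1);
        }
        return flushed.isEmpty() ? null : flushed.get(flushed.size() - 1);
    }
}
```

In this model, writing "id-1" with a flush and "id-2" without one, then restarting, makes latestVisible() return the stale "id-1" until replayCommitLog() runs. Writing both with a flush makes "id-2" visible immediately after the restart.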

                
> Multiple values for CurrentLocal Node ID
> ----------------------------------------
>
>                 Key: CASSANDRA-4626
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4626
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.11
>            Reporter: Aaron Morton
>         Attachments: 4626.txt
>
>
> From this email thread http://www.mail-archive.com/user@cassandra.apache.org/msg24677.html
> There are multiple columns for the CurrentLocal row in NodeIdInfo:
> {noformat}
> [default@system] list NodeIdInfo ;
> Using default limit of 100
> ...
> -------------------
> RowKey: 43757272656e744c6f63616c
> => (column=01efa5d0-e133-11e1-0000-51be601cd0ff, value=0a1020d2, timestamp=1344414498989)
> => (column=92109b80-ea0a-11e1-0000-51be601cd0af, value=0a1020d2, timestamp=1345386691897)
> {noformat}
> SystemTable.getCurrentLocalNodeId() throws an assertion that occurs when the static constructor for o.a.c.utils.NodeId is in the stack.
> The impact is a java.lang.NoClassDefFoundError when accessing a particular CF (I assume one with counters) on a particular node.
> Cannot see an obvious cause in the code.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira