Posted to commits@cassandra.apache.org by "Eric Evans (JIRA)" <ji...@apache.org> on 2012/06/28 00:32:44 UTC
[jira] [Commented] (CASSANDRA-4384) HintedHandoff can begin before SS knows the hostID
[ https://issues.apache.org/jira/browse/CASSANDRA-4384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402628#comment-13402628 ]
Eric Evans commented on CASSANDRA-4384:
---------------------------------------
This is a corner case that only affects a node that has been restarted (and has lost its hostId map as a result). It occurs when a hint delivery is triggered (by the _reception_ of a gossip message) before the hostId can be processed from that message.
I see two ways to fix this:
# Skip hint delivery when we don't (yet) have a hostId.
# Persist hostIds the way we do tokens.
(1) has the benefit of being a one-liner. Presumably this code exists to schedule hint delivery for a remote host that was dead and is now alive. Since in this case it is _us_ that was down, I don't think skipping would be Evil.
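A minimal sketch of what the guard in (1) might look like, written as a standalone class rather than against the real code. {{HintDeliveryGuard}}, its methods, and the use of plain strings for endpoints are all hypothetical stand-ins for illustration; the actual change would live wherever hint delivery is scheduled.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the hint scheduling path; not Cassandra's actual class.
class HintDeliveryGuard {
    // endpoint -> hostId, populated as gossip state for each endpoint is processed
    private final Map<String, UUID> hostIds = new ConcurrentHashMap<>();

    // Called once the hostId has been read out of the gossip message.
    void onHostIdLearned(String endpoint, UUID hostId) {
        hostIds.put(endpoint, hostId);
    }

    // Returns true if delivery was scheduled, false if it was skipped.
    boolean scheduleHintDelivery(String endpoint) {
        UUID hostId = hostIds.get(endpoint);
        if (hostId == null) {
            // Option (1): we do not know the hostId yet, so skip instead of
            // passing null further down and hitting a NullPointerException.
            return false;
        }
        deliver(endpoint, hostId);
        return true;
    }

    private void deliver(String endpoint, UUID hostId) {
        // Placeholder: the real delivery logic is out of scope for this sketch.
    }
}
```

With this shape, a delivery attempt made before the hostId arrives is simply dropped, and a later attempt for the same endpoint proceeds once the hostId is known.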
(2) adds complexity, but guards against any future cases of an {{IEndpointStateChangeSubscriber.onAlive()}} relying on {{TokenMetadata.isMember()}} before looking up a hostId.
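For comparison, a rough sketch of the idea behind (2): keep the endpoint-to-hostId mapping on disk so it survives a restart. This is only an illustration under stated assumptions. {{PersistentHostIdMap}} is a hypothetical class, and a properties file is used purely to keep the example self-contained; it is not how tokens are actually persisted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import java.util.UUID;

// Hypothetical illustration of persisting hostIds; the file format is an assumption.
class PersistentHostIdMap {
    private final Path file;
    private final Properties entries = new Properties();

    // Reloads any previously saved mappings, as would happen after a restart.
    PersistentHostIdMap(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                entries.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Records a mapping and writes the whole map back to disk immediately.
    synchronized void put(String endpoint, UUID hostId) {
        entries.setProperty(endpoint, hostId.toString());
        try (OutputStream out = Files.newOutputStream(file)) {
            entries.store(out, "endpoint -> hostId");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the saved hostId, or null if the endpoint has never been seen.
    synchronized UUID get(String endpoint) {
        String value = entries.getProperty(endpoint);
        return value == null ? null : UUID.fromString(value);
    }
}
```

A node constructing this map at startup would already have hostIds for previously seen endpoints before the first gossip message is received, which is the property (2) is after.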
> HintedHandoff can begin before SS knows the hostID
> --------------------------------------------------
>
> Key: CASSANDRA-4384
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4384
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.2
> Reporter: Brandon Williams
> Assignee: Brandon Williams
> Fix For: 1.2
>
>
> Since hinted handoff (HH) fires from the failure detector (FD), StorageService (SS) won't quite have the hostId yet:
> {noformat}
> INFO 18:58:04,196 Started hinted handoff for host: null with IP: /10.179.65.102
> INFO 18:58:04,197 Node /10.179.65.102 state jump to normal
> ERROR 18:58:04,197 Exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.NullPointerException
> at org.apache.cassandra.utils.UUIDGen.decompose(UUIDGen.java:120)
> at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:304)
> at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:250)
> at org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:87)
> at org.apache.cassandra.db.HintedHandOffManager$4.runMayThrow(HintedHandOffManager.java:433)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:26)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> {noformat}
> Simple solution seems to be getting the hostId from gossip instead.