Posted to dev@gossip.apache.org by "Edward Capriolo (JIRA)" <ji...@apache.org> on 2017/01/28 16:50:24 UTC
[jira] [Created] (GOSSIP-49) Refactor Failure detector Lambda into named class
Edward Capriolo created GOSSIP-49:
-------------------------------------
Summary: Refactor Failure detector Lambda into named class
Key: GOSSIP-49
URL: https://issues.apache.org/jira/browse/GOSSIP-49
Project: Gossip
Issue Type: Improvement
Reporter: Edward Capriolo
When a message is received, the PassiveGossipThread updates heartbeats. Failure detection currently lives in a lambda in the GossipManager, which periodically walks the member list, marks hosts as down or up, and fires the event notification listener:
{noformat}
scheduledServiced.scheduleAtFixedRate(() -> {
  try {
    for (Entry<LocalGossipMember, GossipState> entry : members.entrySet()) {
      Double result = null;
      try {
        result = entry.getKey().detect(clock.nanoTime());
        //System.out.println(entry.getKey() +" "+ result);
        if (result != null) {
          if (result > settings.getConvictThreshold() && entry.getValue() == GossipState.UP) {
            members.put(entry.getKey(), GossipState.DOWN);
            listener.gossipEvent(entry.getKey(), GossipState.DOWN);
          }
          if (result <= settings.getConvictThreshold() && entry.getValue() == GossipState.DOWN) {
            members.put(entry.getKey(), GossipState.UP);
            listener.gossipEvent(entry.getKey(), GossipState.UP);
          }
        }
      } catch (IllegalArgumentException ex) {
        //0.0 returns throws exception computing the mean.
        long now = clock.nanoTime();
        long nowInMillis = TimeUnit.MILLISECONDS.convert(now,TimeUnit.NANOSECONDS);
        if (nowInMillis - settings.getCleanupInterval() > entry.getKey().getHeartbeat() && entry.getValue() == GossipState.UP){
          LOGGER.warn("Marking down");
          members.put(entry.getKey(), GossipState.DOWN);
          listener.gossipEvent(entry.getKey(), GossipState.DOWN);
        }
      } //end catch
    } // end for
  } catch (RuntimeException ex) {
    LOGGER.warn("scheduled state had exception", ex);
  }
{noformat}
This should be moved to a named class that is injected with the data members it needs, which would make the logic easier to unit test with mocks. We still need to run it periodically for the rare case in which no messages are arriving, but we could also run it right after receiving a message rather than waiting for the scheduled executor to trigger it. In many cases that would raise the alert sooner.
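
One possible shape for such a class is sketched below. It is an illustration only, not the project's implementation. The names FailureDetectorTask, MemberState, SuspicionSource and StateListener are hypothetical stand-ins for the real Gossip types, and the IllegalArgumentException fallback from the lambda above is left out for brevity.

```java
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the project's GossipState. */
enum MemberState { UP, DOWN }

/** Hypothetical: yields the detector value for a member, or null if none is available yet. */
interface SuspicionSource<M> {
  Double detect(M member, long nowNanos);
}

/** Hypothetical stand-in for the event notification listener. */
interface StateListener<M> {
  void gossipEvent(M member, MemberState state);
}

/** Named replacement for the lambda; every collaborator is injected through the constructor. */
final class FailureDetectorTask<M> implements Runnable {
  private final Map<M, MemberState> members;
  private final SuspicionSource<M> source;
  private final LongSupplier nanoClock;
  private final double convictThreshold;
  private final StateListener<M> listener;

  FailureDetectorTask(Map<M, MemberState> members, SuspicionSource<M> source,
      LongSupplier nanoClock, double convictThreshold, StateListener<M> listener) {
    this.members = members;
    this.source = source;
    this.nanoClock = nanoClock;
    this.convictThreshold = convictThreshold;
    this.listener = listener;
  }

  @Override
  public void run() {
    long now = nanoClock.getAsLong();
    for (Map.Entry<M, MemberState> entry : members.entrySet()) {
      Double result = source.detect(entry.getKey(), now);
      if (result == null) {
        continue; // no reading yet, leave the state alone
      }
      MemberState next = result > convictThreshold ? MemberState.DOWN : MemberState.UP;
      if (next != entry.getValue()) {
        // Replaces the value of an existing key, so the map is not structurally modified.
        members.put(entry.getKey(), next);
        listener.gossipEvent(entry.getKey(), next);
      }
    }
  }
}
```

Because the sketch implements Runnable, the same instance could be handed to the scheduled executor and also invoked directly via run() after a message is received. A unit test can pass in a fixed clock, a stubbed detector and a recording listener without starting any threads.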