Posted to dev@gossip.apache.org by "Edward Capriolo (JIRA)" <ji...@apache.org> on 2017/01/28 16:50:24 UTC

[jira] [Created] (GOSSIP-49) Refactor Failure detector Lambda into named class

Edward Capriolo created GOSSIP-49:
-------------------------------------

             Summary: Refactor Failure detector Lambda into named class
                 Key: GOSSIP-49
                 URL: https://issues.apache.org/jira/browse/GOSSIP-49
             Project: Gossip
          Issue Type: Improvement
            Reporter: Edward Capriolo


When receiving a message, the PassiveGossipThread updates heartbeats. Failure detection currently lives in a lambda in the GossipManager, which periodically moves through the member list, marks hosts as down (or back up), and fires the event notification listener:
{noformat}
scheduledServiced.scheduleAtFixedRate(() -> {
      try {
        for (Entry<LocalGossipMember, GossipState> entry : members.entrySet()) {
          Double result = null;
          try {
            result = entry.getKey().detect(clock.nanoTime());
            //System.out.println(entry.getKey() +" "+ result);
            if (result != null) {
              if (result > settings.getConvictThreshold() && entry.getValue() == GossipState.UP) {
                members.put(entry.getKey(), GossipState.DOWN);
                listener.gossipEvent(entry.getKey(), GossipState.DOWN);
              }
              if (result <= settings.getConvictThreshold() && entry.getValue() == GossipState.DOWN) {
                members.put(entry.getKey(), GossipState.UP);
                listener.gossipEvent(entry.getKey(), GossipState.UP);
              }
            }
          } catch (IllegalArgumentException ex) {
            // a 0.0 value throws an exception when computing the mean
            long now = clock.nanoTime(); 
            long nowInMillis = TimeUnit.MILLISECONDS.convert(now,TimeUnit.NANOSECONDS);
            if (nowInMillis - settings.getCleanupInterval() > entry.getKey().getHeartbeat() && entry.getValue() == GossipState.UP){
              LOGGER.warn("Marking down");
              members.put(entry.getKey(), GossipState.DOWN);
              listener.gossipEvent(entry.getKey(), GossipState.DOWN);
            }
          } //end catch
        } // end for
      } catch (RuntimeException ex) {
        LOGGER.warn("scheduled state had exception", ex);
      }
{noformat}

This should be moved to a named class that is injected with the data members it needs, which would make the logic easier to unit test with mocks. We still need to run it periodically for the rare case where no messages are arriving, but we could also run it right after receiving a message rather than waiting for the scheduled executor to trigger it. In many cases that would fire the notification sooner.
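
A rough sketch of what such a named class could look like, not a finished design. The names FailureDetectorTask, Member, State and StateListener are hypothetical stand-ins (for something like LocalGossipMember, GossipState and the existing listener) so that the example compiles on its own. The IllegalArgumentException fallback from the lambda is left out for brevity.

```java
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical stand-ins for the project's own types; assumptions for this sketch only.
enum State { UP, DOWN }

interface Member {
  // Returns a suspicion value, or null when there is not enough data yet.
  Double detect(long nowNanos);
}

interface StateListener {
  void stateChanged(Member member, State state);
}

// Named replacement for the lambda: everything it needs is passed in,
// so a test can supply a fake clock, fake members and a recording listener.
class FailureDetectorTask implements Runnable {
  private final Map<Member, State> members;
  private final double convictThreshold;
  private final LongSupplier nanoClock;
  private final StateListener listener;

  FailureDetectorTask(Map<Member, State> members, double convictThreshold,
                      LongSupplier nanoClock, StateListener listener) {
    this.members = members;
    this.convictThreshold = convictThreshold;
    this.nanoClock = nanoClock;
    this.listener = listener;
  }

  // synchronized so that a scheduled run and a message-triggered run
  // do not both fire an event for the same transition.
  @Override
  public synchronized void run() {
    long now = nanoClock.getAsLong();
    // Assumes a concurrent map, as updating during iteration requires one.
    for (Map.Entry<Member, State> entry : members.entrySet()) {
      Double result = entry.getKey().detect(now);
      if (result == null) {
        continue; // not enough samples to decide
      }
      State next = result > convictThreshold ? State.DOWN : State.UP;
      if (next != entry.getValue()) {
        members.put(entry.getKey(), next);
        listener.stateChanged(entry.getKey(), next);
      }
    }
  }
}
```

Because the class is a Runnable, the same instance could be handed to scheduleAtFixedRate and also invoked directly via run() after a message is received.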



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)