Posted to dev@gossip.apache.org by "Dorian Ellerbe (JIRA)" <ji...@apache.org> on 2017/01/28 17:39:25 UTC

[jira] [Assigned] (GOSSIP-49) Refactor Failure detector Lambda into named class

     [ https://issues.apache.org/jira/browse/GOSSIP-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dorian Ellerbe reassigned GOSSIP-49:
------------------------------------

    Assignee: Dorian Ellerbe

> Refactor Failure detector Lambda into named class
> -------------------------------------------------
>
>                 Key: GOSSIP-49
>                 URL: https://issues.apache.org/jira/browse/GOSSIP-49
>             Project: Gossip
>          Issue Type: Improvement
>            Reporter: Edward Capriolo
>            Assignee: Dorian Ellerbe
>
> When receiving a message, the PassiveGossipThread updates heartbeats. Currently a lambda in the GossipManager periodically moves through the member list, marks hosts as down or up, and fires the event notification listener:
> {noformat}
> scheduledServiced.scheduleAtFixedRate(() -> {
>       try {
>         for (Entry<LocalGossipMember, GossipState> entry : members.entrySet()) {
>           Double result = null;
>           try {
>             result = entry.getKey().detect(clock.nanoTime());
>             //System.out.println(entry.getKey() +" "+ result);
>             if (result != null) {
>               if (result > settings.getConvictThreshold() && entry.getValue() == GossipState.UP) {
>                 members.put(entry.getKey(), GossipState.DOWN);
>                 listener.gossipEvent(entry.getKey(), GossipState.DOWN);
>               }
>               if (result <= settings.getConvictThreshold() && entry.getValue() == GossipState.DOWN) {
>                 members.put(entry.getKey(), GossipState.UP);
>                 listener.gossipEvent(entry.getKey(), GossipState.UP);
>               }
>             }
>           } catch (IllegalArgumentException ex) {
>             // 0.0 throws an exception when computing the mean.
>             long now = clock.nanoTime(); 
>             long nowInMillis = TimeUnit.MILLISECONDS.convert(now,TimeUnit.NANOSECONDS);
>             if (nowInMillis - settings.getCleanupInterval() > entry.getKey().getHeartbeat() && entry.getValue() == GossipState.UP){
>               LOGGER.warn("Marking down");
>               members.put(entry.getKey(), GossipState.DOWN);
>               listener.gossipEvent(entry.getKey(), GossipState.DOWN);
>             }
>           } //end catch
>         } // end for
>       } catch (RuntimeException ex) {
>         LOGGER.warn("scheduled state had exception", ex);
>       }
> {noformat}
> This should be moved to a named class that is injected with the data members it needs. This would make the logic easier to cover with unit tests and mocks. We still need to run it periodically for the rare case that no messages are coming to us, but we could also run it right after receiving a message rather than waiting for the scheduled executor to trigger it. In many cases that would report a state change sooner.
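
One possible shape for such a class is sketched below. This is only an illustration under assumptions, not the project's actual code: the names FailureDetectorTask and Member, and the use of LongSupplier and BiConsumer in place of the clock and listener, are hypothetical stand-ins so the example compiles on its own. The IllegalArgumentException fallback from the lambda above is left out for brevity.

```java
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.LongSupplier;

// Hypothetical stand-in for the project's GossipState enum.
enum GossipState { UP, DOWN }

// Hypothetical stand-in for LocalGossipMember: detect() returns a
// suspicion value, or null when no value can be computed yet.
interface Member {
  Double detect(long nowNanos);
}

// Named replacement for the lambda. Everything it needs is injected
// through the constructor, so a test can pass in fakes.
final class FailureDetectorTask implements Runnable {
  private final Map<Member, GossipState> members;
  private final double convictThreshold;
  private final LongSupplier nanoClock;
  private final BiConsumer<Member, GossipState> listener;

  FailureDetectorTask(Map<Member, GossipState> members,
                      double convictThreshold,
                      LongSupplier nanoClock,
                      BiConsumer<Member, GossipState> listener) {
    this.members = members;
    this.convictThreshold = convictThreshold;
    this.nanoClock = nanoClock;
    this.listener = listener;
  }

  @Override
  public void run() {
    long now = nanoClock.getAsLong();
    for (Map.Entry<Member, GossipState> entry : members.entrySet()) {
      Double result = entry.getKey().detect(now);
      if (result == null) {
        continue; // nothing to decide for this member yet
      }
      // Above the threshold means DOWN, otherwise UP (same rule as the lambda).
      GossipState next =
          result > convictThreshold ? GossipState.DOWN : GossipState.UP;
      if (next != entry.getValue()) {
        // Only record and notify on an actual state change.
        members.put(entry.getKey(), next);
        listener.accept(entry.getKey(), next);
      }
    }
  }
}
```

Because the class is a plain Runnable, the same instance could be passed to ScheduledExecutorService.scheduleAtFixedRate for the periodic sweep and also have run() called directly after a message is handled. A test can drive it with a fixed clock and a listener that records events.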


