Posted to issues@geode.apache.org by "Bruce Schuchardt (JIRA)" <ji...@apache.org> on 2019/02/15 23:13:00 UTC

[jira] [Created] (GEODE-6423) availability checks sometimes immediately initiate removal

Bruce Schuchardt created GEODE-6423:
---------------------------------------

             Summary: availability checks sometimes immediately initiate removal
                 Key: GEODE-6423
                 URL: https://issues.apache.org/jira/browse/GEODE-6423
             Project: Geode
          Issue Type: Bug
          Components: membership
            Reporter: Bruce Schuchardt


If the network goes down, the JGroupsMessenger service initiates suspect processing when it tries to send messages.  In 1.8 this seems to result in immediate removal of the suspect:

An IOException while sending a UDP message initiates suspicion.

Suspect processing initiates a final check.

The final check fails immediately (it uses a timed Socket.connect(), which fails immediately rather than waiting out the timeout).

The member is declared dead.
{noformat}
[info 2019/02/13 17:44:59.366 CST perf157-130-167-server1 <Geode Failure Detection thread 3> tid=0xc2] received suspect message from myself for 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000: Unable to send messages to this member via JGroups

[info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection thread 4> tid=0xc3] Performing final check for suspect member 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000 reason=Unable to send messages to this member via JGroups

[info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection thread 5> tid=0xc4] Performing final check for suspect member 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202 reason=Unable to send messages to this member via JGroups

[info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection thread 4> tid=0xc3] Failure detection is now watching 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200

[info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection thread 5> tid=0xc4] Failure detection is now watching 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000

[info 2019/02/13 17:44:59.368 CST perf157-130-167-server1 <Geode Failure Detection thread 3> tid=0xc2] received suspect message from myself for 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201: Unable to send messages to this member via JGroups

[info 2019/02/13 17:44:59.369 CST perf157-130-167-server1 <Geode Failure Detection thread 6> tid=0xc5] Performing final check for suspect member 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201 reason=Unable to send messages to this member via JGroups

[info 2019/02/13 17:44:59.369 CST perf157-130-167-server1 <Geode Failure Detection thread 6> tid=0xc5] Failure detection is now watching 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200

[info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection thread 5> tid=0xc4] Final check failed for member 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202

[info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection thread 5> tid=0xc4] Requesting removal of suspect member 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202

[info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection thread 4> tid=0xc3] Final check failed for member 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000

[info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection thread 4> tid=0xc3] Requesting removal of suspect member 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000

[info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection thread 4> tid=0xc3] This member is becoming the membership coordinator with address 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200

[info 2019/02/13 17:44:59.371 CST perf157-130-167-server1 <Geode Failure Detection thread 6> tid=0xc5] Final check failed for member 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201

[info 2019/02/13 17:44:59.373 CST perf157-130-167-server1 <Geode Failure Detection thread 6> tid=0xc5] Requesting removal of suspect member 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201

[info 2019/02/13 17:44:59.376 CST perf157-130-167-server1 <Geode Failure Detection thread 4> tid=0xc3] ViewCreator starting on:192.168.130.167(perf157-130-167-server1:225263)<v1>:16200

[info 2019/02/13 17:44:59.376 CST perf157-130-167-server1 <Geode Membership View Creator> tid=0xc6] View Creator thread is starting

[info 2019/02/13 17:44:59.377 CST perf157-130-167-server1 <Geode Membership View Creator> tid=0xc6] 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000 had a weight of 3

[info 2019/02/13 17:44:59.377 CST perf157-130-167-server1 <Geode Membership View Creator> tid=0xc6] 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202 had a weight of 10

[info 2019/02/13 17:44:59.377 CST perf157-130-167-server1 <Geode Membership View Creator> tid=0xc6] preparing new view View[192.168.130.167(perf157-130-167-server1:225263)<v1>:16200|10] members: [192.168.130.167(perf157-130-167-server1:225263)<v1>:16200{lead}, 192.168.130.167(perf157-130-167-server2:225522)<v2>:16201] crashed: [192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000, 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202]

[info 2019/02/13 17:45:03.627 CST perf157-130-167-server1 <unicast receiver,perf157-130-167-62066> tid=0x21] received suspect message from 192.168.130.167(perf157-130-167-worker1:225794)<v3>:16202 for 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000: Unable to send messages to this member via JGroups

[info 2019/02/13 17:45:03.718 CST perf157-130-167-server1 <unicast receiver,perf157-130-167-62066> tid=0x21] Membership received a request to remove 192.168.130.167(perf157-130-167-server1:225263)<v1>:16200 from 192.168.130.167(perf157-130-167-locator1:225065:locator)<ec><v0>:41000 reason=Unable to send messages to this member via JGroups

[severe 2019/02/13 17:45:03.719 CST perf157-130-167-server1 <unicast receiver,perf157-130-167-62066> tid=0x21] Membership service failure: Unable to send messages to this member via JGroups
org.apache.geode.ForcedDisconnectException: Unable to send messages to this member via JGroups
{noformat}
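
In the log above, the final checks start at 17:44:59.368 and are reported failed at 17:44:59.371. That fits the timeout passed to Socket.connect() being only an upper bound: if the connection attempt is rejected or errors out, connect() throws at once. The sketch below illustrates this with plain java.net; it is not Geode code. The class and method names are hypothetical, and a refused loopback connection stands in for the network failure in this report.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Geode.
final class TimedConnectDemo {

  /**
   * Attempts a timed connect and returns the elapsed milliseconds until it
   * failed, or -1 if the connection succeeded.
   */
  static long millisUntilConnectFails(InetSocketAddress target, int timeoutMillis) {
    long start = System.nanoTime();
    try (Socket socket = new Socket()) {
      // timeoutMillis caps how long connect() may block; it does not
      // guarantee that connect() waits that long before failing.
      socket.connect(target, timeoutMillis);
      return -1;
    } catch (IOException e) {
      return (System.nanoTime() - start) / 1_000_000;
    }
  }

  public static void main(String[] args) throws IOException {
    // Bind and release a loopback port so that (most likely) nothing listens on it.
    int port;
    try (ServerSocket probe = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
      port = probe.getLocalPort();
    }
    InetSocketAddress target =
        new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
    long elapsed = millisUntilConnectFails(target, 5000);
    System.out.println("connect failed after " + elapsed + " ms (timeout was 5000 ms)");
  }
}
```

On a typical system the reported time is far below the 5000 ms timeout, which is the behavior the final check appears to be hitting.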
 

We expect the final check to respect the member-timeout setting.
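
One way to get that behavior, sketched under assumptions: treat member-timeout as a deadline for the whole final check and keep retrying the connect until the deadline passes, so that an immediate connect failure does not by itself condemn the member. This is not the actual Geode implementation or fix; finalCheck, its parameters, and the retry interval are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not part of Geode.
final class FinalCheckSketch {

  /**
   * Returns true if a TCP connection to target succeeds before
   * memberTimeoutMillis has elapsed; returns false only once the full
   * timeout has been used up.
   */
  static boolean finalCheck(InetSocketAddress target,
                            long memberTimeoutMillis,
                            long retryIntervalMillis) throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(memberTimeoutMillis);
    while (true) {
      long remainingMillis = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
      if (remainingMillis <= 0) {
        return false; // the whole member-timeout window has passed
      }
      try (Socket socket = new Socket()) {
        // Never wait longer than what is left of the window.
        socket.connect(target, (int) Math.min(remainingMillis, Integer.MAX_VALUE));
        return true; // the member answered, so it is not dead
      } catch (IOException e) {
        // The connect may have failed immediately; do not declare the
        // member dead yet, retry until the deadline instead.
      }
      long sleepMillis = Math.min(retryIntervalMillis,
          TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime()));
      if (sleepMillis > 0) {
        Thread.sleep(sleepMillis);
      }
    }
  }
}
```

With this shape, a call such as finalCheck(address, 5000, 100) would take about five seconds to report an unreachable member, giving a briefly interrupted network time to recover before removal is requested.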

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)