Posted to dev@geode.apache.org by Jason Huynh <hu...@gmail.com> on 2015/10/23 20:10:27 UTC

Review Request 39605: GEODE-77: Coordinator shutdown does not trigger coordinator reassignment

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/
-----------------------------------------------------------

Review request for geode, anilkumar gingade, Bruce Schuchardt, Jianxia Chen, and Lynn Gallinat.


Repository: geode


Description
-------

It looks like the coordinator node left and sent a leave message, but the member that is supposed to become coordinator next ignores this leave message (as it is not yet the coordinator).
The fix is to not ignore these leave messages and instead have each member track them, similar to the removedMembers list. When a leave message is processed by a non-coordinator, the results will be the same. However, if that non-coordinator then becomes coordinator, it can now process these leave messages.
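
As a rough illustration of the idea above (not the actual GMSJoinLeave change), the sketch below models a single member that records every leave message, whether or not it is currently coordinator. The class name, method names, string member IDs, and the rule that the first live member in view order is coordinator are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of one member; not the actual GMSJoinLeave code.
class LeaveTrackingSketch {
    private final String self;
    // Assumption: the view is ordered and its first live member acts as coordinator.
    private final List<String> view;
    // Leave messages are recorded here by every member, coordinator or not.
    private final Set<String> leftMembers = new LinkedHashSet<>();

    LeaveTrackingSketch(String self, List<String> view) {
        this.self = self;
        this.view = new ArrayList<>(view);
    }

    // Record the departure instead of dropping it when we are not coordinator.
    void onLeaveMessage(String departed) {
        leftMembers.add(departed);
    }

    // True when this member is the first one in the view not known to have left.
    boolean shouldBecomeCoordinator() {
        for (String member : view) {
            if (!leftMembers.contains(member)) {
                return member.equals(self);
            }
        }
        return false;
    }

    // The view a new coordinator would propose: current view minus tracked leavers.
    List<String> prepareNextView() {
        List<String> next = new ArrayList<>(view);
        next.removeAll(leftMembers);
        return next;
    }

    public static void main(String[] args) {
        LeaveTrackingSketch b = new LeaveTrackingSketch("B", Arrays.asList("A", "B", "C"));
        System.out.println(b.shouldBecomeCoordinator()); // A is still in the view
        b.onLeaveMessage("A");                           // coordinator announces it is leaving
        System.out.println(b.shouldBecomeCoordinator()); // B is now first live member
        System.out.println(b.prepareNextView());
    }
}
```

Because the leave message is kept rather than discarded, the member that takes over already knows which members to drop from the next view.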


Diffs
-----

  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java cb61a1b 
  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java ed9f214 
  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeave.java 57611e6 
  gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java 42594cf 

Diff: https://reviews.apache.org/r/39605/diff/


Testing
-------

Ran precheckin but had a handful of failures due to suspect strings; will rerun.


Thanks,

Jason Huynh


Re: Review Request 39605: GEODE-77: Coordinator shutdown does not trigger coordinator reassignment

Posted by Bruce Schuchardt <bs...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/#review104213
-----------------------------------------------------------

Ship it!


I have a suggestion for your tests.  The product changes look good.


gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java (line 617)
<https://reviews.apache.org/r/39605/#comment162506>

    I think this should install a member list of m[0], m[1], m[2], gmsJoinLeaveMemberId, m[3].
    
    Both m[1] and m[2] are leaving so this will exercise your new code better.



gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java (line 640)
<https://reviews.apache.org/r/39605/#comment162507>

    same thing here


- Bruce Schuchardt




Re: Review Request 39605: GEODE-77: Coordinator shutdown does not trigger coordinator reassignment

Posted by Jason Huynh <hu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/
-----------------------------------------------------------

(Updated Oct. 27, 2015, 9:41 p.m.)


Review request for geode, anilkumar gingade, Bruce Schuchardt, Jianxia Chen, and Lynn Gallinat.


Repository: geode


Description
-------

It looks like the coordinator node left and sent a leave message, but the member that is supposed to become coordinator next ignores this leave message (as it is not yet the coordinator).
The fix is to not ignore these leave messages and instead have each member track them, similar to the removedMembers list. When a leave message is processed by a non-coordinator, the results will be the same. However, if that non-coordinator then becomes coordinator, it can now process these leave messages.


Diffs (updated)
-----

  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java cb61a1b 
  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java 774ab37 
  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeave.java 6d39a6a 
  gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java 41b0df7 

Diff: https://reviews.apache.org/r/39605/diff/


Testing
-------

Ran precheckin but had a handful of failures due to suspect strings; will rerun.


Thanks,

Jason Huynh


Re: Review Request 39605: GEODE-77: Coordinator shutdown does not trigger coordinator reassignment

Posted by Jason Huynh <hu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/
-----------------------------------------------------------

(Updated Oct. 27, 2015, 9:25 p.m.)


Review request for geode, anilkumar gingade, Bruce Schuchardt, Jianxia Chen, and Lynn Gallinat.


Changes
-------

Additional change: the removedMembers and leftMembers collections are no longer cleared when a view is installed.  Instead, entries are removed from those collections only if the new view does not contain the member IDs.  This should help with booting out members in a more timely fashion.
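
A minimal sketch of that view-installation behavior, assuming plain string member IDs and a hypothetical holder class (the field names mirror the collections mentioned above, but this is not the actual GMSJoinLeave code): rather than calling clear(), each collection keeps only the IDs that still appear in the newly installed view.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder for pending departures; illustration only.
class PendingDeparturesSketch {
    final Set<String> removedMembers = new HashSet<>();
    final Set<String> leftMembers = new HashSet<>();

    // Called when a new view is installed. Entries for members that are no
    // longer in the view are dropped; entries for members the view still
    // contains are kept so they can be acted on in a later view.
    void installView(Collection<String> newViewMembers) {
        removedMembers.retainAll(newViewMembers);
        leftMembers.retainAll(newViewMembers);
    }

    public static void main(String[] args) {
        PendingDeparturesSketch pending = new PendingDeparturesSketch();
        pending.leftMembers.addAll(Arrays.asList("A", "B"));
        pending.removedMembers.add("C");

        // The new view no longer has A, but B and C are still listed.
        pending.installView(Arrays.asList("B", "C", "D"));

        System.out.println(pending.leftMembers);    // B is still pending
        System.out.println(pending.removedMembers); // C is still pending
    }
}
```

With an unconditional clear(), the pending entries for B and C would be lost and would have to be rediscovered before those members could be removed.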


Repository: geode


Description
-------

It looks like the coordinator node left and sent a leave message, but the member that is supposed to become coordinator next ignores this leave message (as it is not yet the coordinator).
The fix is to not ignore these leave messages and instead have each member track them, similar to the removedMembers list. When a leave message is processed by a non-coordinator, the results will be the same. However, if that non-coordinator then becomes coordinator, it can now process these leave messages.


Diffs (updated)
-----

  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java cb61a1b 
  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java 774ab37 
  gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeave.java 6d39a6a 
  gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java 41b0df7 

Diff: https://reviews.apache.org/r/39605/diff/


Testing
-------

Ran precheckin but had a handful of failures due to suspect strings; will rerun.


Thanks,

Jason Huynh


Re: Review Request 39605: GEODE-77: Coordinator shutdown does not trigger coordinator reassignment

Posted by Jason Huynh <hu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/#review103816
-----------------------------------------------------------



gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java (line 155)
<https://reviews.apache.org/r/39605/#comment161909>

    Just saw this in the NetView class and wanted Bruce to take a look


- Jason Huynh




Re: Review Request 39605: GEODE-77: Coordinator shutdown does not trigger coordinator reassignment

Posted by Bruce Schuchardt <bs...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/#review103824
-----------------------------------------------------------

Ship it!



gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java (line 155)
<https://reviews.apache.org/r/39605/#comment161924>

    yes, please add it


- Bruce Schuchardt

