Posted to dev@geode.apache.org by Jason Huynh <hu...@gmail.com> on 2015/10/23 20:10:27 UTC
Review Request 39605: GEODE-77: Coordinator shutdown does not trigger
coordinator reassignment
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/
-----------------------------------------------------------
Review request for geode, anilkumar gingade, Bruce Schuchardt, Jianxia Chen, and Lynn Gallinat.
Repository: geode
Description
-------
It looks like the coordinator node left and sent a leave message, but the member that is supposed to become coordinator next ignores this leave message (as it is not yet the coordinator).
The fix is to not ignore these leave messages and instead have each member track them, similar to the removedMembers list. When the leave message is processed by non-coordinators, the results will be the same. However, if a non-coordinator then becomes coordinator, it can now process these leave messages.
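As a rough illustration of the idea described above, here is a minimal sketch in which every member records leave messages and uses them once it takes over as coordinator. The class LeaveTrackingMember, its methods, and the rule "coordinator is the first view member that has not left" are assumptions made for this example; this is not the actual GMSJoinLeave code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of a member; not the real GMSJoinLeave class.
final class LeaveTrackingMember {
  private final String selfId;
  // Current membership view; by assumption the coordinator is listed first.
  private final List<String> view;
  // Leave messages seen so far, tracked by every member (assumed analogue of removedMembers).
  private final Set<String> leftMembers = new LinkedHashSet<>();

  LeaveTrackingMember(String selfId, List<String> initialView) {
    this.selfId = selfId;
    this.view = new ArrayList<>(initialView);
  }

  // Record the leave regardless of whether this member is currently the coordinator.
  void processLeave(String leavingMemberId) {
    leftMembers.add(leavingMemberId);
  }

  // Assumed rule: the coordinator is the first view member that has not announced a leave.
  boolean isCoordinator() {
    for (String id : view) {
      if (!leftMembers.contains(id)) {
        return id.equals(selfId);
      }
    }
    return false;
  }

  // Once acting as coordinator, build the next view without the members that left.
  List<String> prepareNextView() {
    List<String> next = new ArrayList<>(view);
    next.removeAll(leftMembers);
    return next;
  }
}
```

In this sketch, a member "B" in the view [A, B, C] that receives A's leave message no longer drops it: it records the leave, sees itself as the next coordinator, and can prepare a view that excludes A.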
Diffs
-----
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java cb61a1b
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java ed9f214
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeave.java 57611e6
gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java 42594cf
Diff: https://reviews.apache.org/r/39605/diff/
Testing
-------
Ran precheckin but had a handful of failures due to suspect strings; will rerun.
Thanks,
Jason Huynh
Re: Review Request 39605: GEODE-77: Coordinator shutdown does not
trigger coordinator reassignment
Posted by Bruce Schuchardt <bs...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/#review104213
-----------------------------------------------------------
Ship it!
I have a suggestion for your tests. The product changes look good.
gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java (line 617)
<https://reviews.apache.org/r/39605/#comment162506>
I think this should install a member list of m[0], m[1], m[2], gmsJoinLeaveMemberId, m[3].
Both m[1] and m[2] are leaving so this will exercise your new code better.
gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java (line 640)
<https://reviews.apache.org/r/39605/#comment162507>
same thing here
- Bruce Schuchardt
Re: Review Request 39605: GEODE-77: Coordinator shutdown does not
trigger coordinator reassignment
Posted by Jason Huynh <hu...@gmail.com>.
(Updated Oct. 27, 2015, 9:41 p.m.)
Review request for geode, anilkumar gingade, Bruce Schuchardt, Jianxia Chen, and Lynn Gallinat.
Repository: geode
Description
-------
It looks like the coordinator node left and sent a leave message, but the member that is supposed to become coordinator next ignores this leave message (as it is not yet the coordinator).
The fix is to not ignore these leave messages and instead have each member track them, similar to the removedMembers list. When the leave message is processed by non-coordinators, the results will be the same. However, if a non-coordinator then becomes coordinator, it can now process these leave messages.
Diffs (updated)
-----
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java cb61a1b
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java 774ab37
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeave.java 6d39a6a
gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java 41b0df7
Diff: https://reviews.apache.org/r/39605/diff/
Testing
-------
Ran precheckin but had a handful of failures due to suspect strings; will rerun.
Thanks,
Jason Huynh
Re: Review Request 39605: GEODE-77: Coordinator shutdown does not
trigger coordinator reassignment
Posted by Jason Huynh <hu...@gmail.com>.
(Updated Oct. 27, 2015, 9:25 p.m.)
Review request for geode, anilkumar gingade, Bruce Schuchardt, Jianxia Chen, and Lynn Gallinat.
Changes
-------
Additional change: the removedMembers and leftMembers collections are no longer cleared when a view is installed. Instead, a member id is removed from those collections only if the new view does not contain it. This should help with booting out members in a more timely fashion.
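The retention rule described in this change can be sketched roughly as follows. The class PendingDepartures and its method names are hypothetical and only model the two collections as sets of member ids; the real collections and view types in GMSJoinLeave may differ.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical holder for the two tracking collections; not the actual Geode code.
final class PendingDepartures {
  private final Set<String> removedMembers = new HashSet<>();
  private final Set<String> leftMembers = new HashSet<>();

  void recordRemoval(String memberId) {
    removedMembers.add(memberId);
  }

  void recordLeave(String memberId) {
    leftMembers.add(memberId);
  }

  // On view installation, do not clear the collections. Drop only the ids that
  // the new view no longer contains; ids still present stay pending.
  void onViewInstalled(List<String> newViewMembers) {
    Set<String> present = new HashSet<>(newViewMembers);
    removedMembers.retainAll(present);
    leftMembers.retainAll(present);
  }

  // Defensive copies so callers cannot mutate the internal state.
  Set<String> removedMembers() {
    return new HashSet<>(removedMembers);
  }

  Set<String> leftMembers() {
    return new HashSet<>(leftMembers);
  }
}
```

With this rule, a member whose leave was recorded but who still appears in a newly installed view remains in leftMembers, so a later view change can still act on it instead of waiting for the leave to be detected again.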
Repository: geode
Description
-------
It looks like the coordinator node left and sent a leave message, but the member that is supposed to become coordinator next ignores this leave message (as it is not yet the coordinator).
The fix is to not ignore these leave messages and instead have each member track them, similar to the removedMembers list. When the leave message is processed by non-coordinators, the results will be the same. However, if a non-coordinator then becomes coordinator, it can now process these leave messages.
Diffs (updated)
-----
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java cb61a1b
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/fd/GMSHealthMonitor.java 774ab37
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeave.java 6d39a6a
gemfire-core/src/test/java/com/gemstone/gemfire/distributed/internal/membership/gms/membership/GMSJoinLeaveJUnitTest.java 41b0df7
Diff: https://reviews.apache.org/r/39605/diff/
Testing
-------
Ran precheckin but had a handful of failures due to suspect strings; will rerun.
Thanks,
Jason Huynh
Re: Review Request 39605: GEODE-77: Coordinator shutdown does not
trigger coordinator reassignment
Posted by Jason Huynh <hu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/#review103816
-----------------------------------------------------------
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java (line 155)
<https://reviews.apache.org/r/39605/#comment161909>
Just saw this in the NetView class and wanted Bruce to take a look
- Jason Huynh
Re: Review Request 39605: GEODE-77: Coordinator shutdown does not
trigger coordinator reassignment
Posted by Bruce Schuchardt <bs...@pivotal.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39605/#review103824
-----------------------------------------------------------
Ship it!
gemfire-core/src/main/java/com/gemstone/gemfire/distributed/internal/membership/NetView.java (line 155)
<https://reviews.apache.org/r/39605/#comment161924>
yes, please add it
- Bruce Schuchardt