Posted to reviews@mesos.apache.org by Benno Evers <be...@mesosphere.com> on 2019/08/16 13:57:26 UTC
Review Request 71297: Fixed a flaky operation reconciliation test.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/
-----------------------------------------------------------
Review request for mesos, Andrei Sekretenko, Greg Mann, Joseph Wu, and Till Toenshoff.
Bugs: MESOS-9928
https://issues.apache.org/jira/browse/MESOS-9928
Repository: mesos
Description
-------
The FrameworkReconciliationRaceWithUpdateSlaveMessage test in the
operation reconciliation tests was flaky because it did not wait
for the scheduler to reconnect before advancing the clock to
trigger reregistration.
Diffs
-----
src/tests/operation_reconciliation_tests.cpp 9d084c027ec2f910515cafebf715f7428c43f1a9
Diff: https://reviews.apache.org/r/71297/diff/1/
Testing
-------
Ran `./src/mesos-tests --gtest_filter="*FrameworkReconciliationRaceWithUpdateSlaveMessage*" --gtest_repeat=200` while `stress-ng` was running in the background to put the machine under load.
Thanks,
Benno Evers
Re: Review Request 71297: Fixed a flaky operation reconciliation test.
Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/#review217291
-----------------------------------------------------------
Patch looks great!
Reviews applied: [71297]
Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh
- Mesos Reviewbot
Re: Review Request 71297: Fixed a flaky operation reconciliation test.
Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/#review217334
-----------------------------------------------------------
Patch looks great!
Reviews applied: [71297]
Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh
- Mesos Reviewbot
Re: Review Request 71297: Fixed a flaky operation reconciliation test.
Posted by Benno Evers <be...@mesosphere.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/
-----------------------------------------------------------
(Updated Aug. 19, 2019, 12:36 p.m.)
Review request for mesos, Andrei Sekretenko, Greg Mann, Joseph Wu, and Till Toenshoff.
Bugs: MESOS-9928
https://issues.apache.org/jira/browse/MESOS-9928
Repository: mesos
Description (updated)
-------
The FrameworkReconciliationRaceWithUpdateSlaveMessage test in the
operation reconciliation tests was flaky because it did not wait
for the scheduler to reconnect before attempting to send a
subscribe call.
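To make the ordering problem concrete, here is a minimal single-threaded sketch in C++. It is not Mesos code: `FakeScheduler` and `EventQueue` are hypothetical stand-ins, and the pending connect event is modeled explicitly so the outcome is deterministic, whereas in the real test it depends on timing.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for a scheduler: a subscribe call sent
// while disconnected is dropped.
struct FakeScheduler
{
  bool connected = false;

  // Returns true if the call was accepted, false if it was dropped.
  bool subscribe() const { return connected; }
};

// Hypothetical stand-in for asynchronous event delivery.
class EventQueue
{
public:
  void post(std::function<void()> event)
  {
    pending.push_back(std::move(event));
  }

  // Delivers every pending event in order.
  void drain()
  {
    while (!pending.empty()) {
      std::function<void()> event = std::move(pending.front());
      pending.pop_front();
      event();
    }
  }

private:
  std::deque<std::function<void()>> pending;
};

// Flaky pattern: the reconnect is still in flight when we subscribe.
bool subscribeBeforeConnect()
{
  FakeScheduler scheduler;
  EventQueue queue;
  queue.post([&scheduler]() { scheduler.connected = true; });

  bool accepted = scheduler.subscribe(); // Connect not yet delivered.
  queue.drain();
  return accepted;
}

// Fixed pattern: wait for the reconnect, then subscribe.
bool subscribeAfterConnect()
{
  FakeScheduler scheduler;
  EventQueue queue;
  queue.post([&scheduler]() { scheduler.connected = true; });

  queue.drain(); // Stands in for awaiting the connected event.
  return scheduler.subscribe();
}
```

In this model `subscribeBeforeConnect()` returns `false` and `subscribeAfterConnect()` returns `true`; draining the queue plays the role that awaiting the scheduler's connected event plays in the test.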
Diffs (updated)
-----
src/tests/operation_reconciliation_tests.cpp 9d084c027ec2f910515cafebf715f7428c43f1a9
Diff: https://reviews.apache.org/r/71297/diff/2/
Changes: https://reviews.apache.org/r/71297/diff/1-2/
Testing
-------
Ran `./src/mesos-tests --gtest_filter="*FrameworkReconciliationRaceWithUpdateSlaveMessage*" --gtest_repeat=200` while `stress-ng` was running in the background to put the machine under load.
Thanks,
Benno Evers
Re: Review Request 71297: Fixed a flaky operation reconciliation test.
Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/#review217247
-----------------------------------------------------------
Patch looks great!
Reviews applied: [71297]
Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh
- Mesos Reviewbot
Re: Review Request 71297: Fixed a flaky operation reconciliation test.
Posted by Benno Evers <be...@mesosphere.com>.
> On Aug. 16, 2019, 2:43 p.m., Andrei Sekretenko wrote:
> > src/tests/operation_reconciliation_tests.cpp
> > Lines 1842 (patched)
> > <https://reviews.apache.org/r/71297/diff/1/?file=2161041#file2161041line1842>
> >
> > We restart the master once, but the scheduler gets a connected/disconnected event pair twice during the master restart. Is this guaranteed by TestMesos + StandaloneMasterDetector, a stable coincidence, or just the most likely outcome of some other race?
> >
> > IMO, this is at least worth a comment, but what about preventing it? (Would something like `detector->appoint(None())` + `AWAIT_READY(disconnected)` before killing the master help?)
Great idea, thanks!
- Benno
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/#review217239
-----------------------------------------------------------
Re: Review Request 71297: Fixed a flaky operation reconciliation test.
Posted by Andrei Sekretenko <as...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71297/#review217239
-----------------------------------------------------------
src/tests/operation_reconciliation_tests.cpp
Lines 1842 (patched)
<https://reviews.apache.org/r/71297/#comment304522>
We restart the master once, but the scheduler gets a connected/disconnected event pair twice during the master restart. Is this guaranteed by TestMesos + StandaloneMasterDetector, a stable coincidence, or just the most likely outcome of some other race?
IMO, this is at least worth a comment, but what about preventing it? (Would something like `detector->appoint(None())` + `AWAIT_READY(disconnected)` before killing the master help?)
- Andrei Sekretenko
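The sequencing suggested in the comment above can be sketched in plain C++ roughly as follows. This is only an illustration under stated assumptions: `RecordingScheduler` and `FakeDetector` are hypothetical models, not the Mesos `StandaloneMasterDetector` or scheduler API, and the empty string stands in for "no leader".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical scheduler that records the connection events it sees.
struct RecordingScheduler
{
  std::vector<std::string> events;

  void onConnected() { events.push_back("connected"); }
  void onDisconnected() { events.push_back("disconnected"); }
};

// Hypothetical detector under full control of the test: the leader
// only changes when the test calls appoint().
class FakeDetector
{
public:
  explicit FakeDetector(RecordingScheduler* scheduler)
    : scheduler_(scheduler) {}

  // An empty string means "no leader" in this model.
  void appoint(const std::string& leader)
  {
    if (leader == leader_) {
      return; // No change, so no event.
    }

    if (!leader_.empty()) {
      scheduler_->onDisconnected();
    }

    leader_ = leader;

    if (!leader_.empty()) {
      scheduler_->onConnected();
    }
  }

private:
  RecordingScheduler* scheduler_;
  std::string leader_;
};

// Returns the events the scheduler observes across one restart.
std::vector<std::string> restartWithExplicitDisconnect()
{
  RecordingScheduler scheduler;
  FakeDetector detector(&scheduler);

  detector.appoint("master@1"); // Initial connection.
  scheduler.events.clear();     // Only the restart is of interest.

  detector.appoint("");         // 1. Withdraw the leader first.
  // 2. A real test would await the disconnected event here and only
  //    then stop and restart the master.
  detector.appoint("master@2"); // 3. Announce the restarted master.

  return scheduler.events;
}
```

Because the test drives every leader change itself, the scheduler in this model observes exactly one `disconnected` followed by one `connected` event, which is the property the comment asks for.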