Posted to reviews@mesos.apache.org by Joseph Wu <jo...@mesosphere.io> on 2019/01/31 01:38:53 UTC
Review Request 69869: [WIP] Added test for tearing down frameworks
while creating disks.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------
Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
Bugs: MESOS-9542
https://issues.apache.org/jira/browse/MESOS-9542
Repository: mesos
Description
-------
The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.
This commit adds a test to make sure the master can gracefully handle
such a case.
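
To make the race concrete, here is a minimal stand-alone sketch. It is not Mesos code: `ToyMaster` and all of its members are hypothetical names invented for illustration, and the handling shown is only one possible way to treat the late update gracefully (the actual fix for MESOS-9542 is not described here).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical toy model, not Mesos code: every name below is invented.
struct ToyMaster {
  std::set<std::string> frameworks;            // Registered framework IDs.
  std::map<std::string, std::string> pending;  // Operation ID -> framework ID.
  int disks = 0;                               // Resources the master knows of.
  int orphanedUpdates = 0;                     // Updates whose framework is gone.

  // A non-speculative operation is only recorded here; `disks` stays
  // untouched until the terminal status update arrives.
  void createDisk(const std::string& frameworkId,
                  const std::string& operationId) {
    assert(frameworks.count(frameworkId) == 1);
    pending[operationId] = frameworkId;
  }

  // Teardown removes the framework while its operations may still be pending.
  void teardown(const std::string& frameworkId) {
    frameworks.erase(frameworkId);
  }

  // Handles a terminal status update. Returns false for unknown operations.
  bool terminalUpdate(const std::string& operationId, bool succeeded) {
    auto it = pending.find(operationId);
    if (it == pending.end()) {
      return false;
    }

    // The framework may already be gone, so look it up instead of
    // assuming that it still exists.
    if (frameworks.count(it->second) == 0) {
      ++orphanedUpdates;
    }

    // Only now, after the operation finished, are resources updated.
    if (succeeded) {
      ++disks;
    }

    pending.erase(it);
    return true;
  }
};
```

Calling `createDisk()`, then `teardown()`, then `terminalUpdate()` on the same `ToyMaster` walks through the ordering that the new test exercises: the update arrives after the framework has been removed and must not be treated as an error.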
Diffs
-----
src/tests/slave_tests.cpp 22a0295086ae4f4ec26df00a0e077eecfa27f1fb
Diff: https://reviews.apache.org/r/69869/diff/1/
Testing
-------
The test does not currently pass, as it is meant to be a regression
test for MESOS-9542 (a bug whose fix has not been started yet).
src/mesos-tests --gtest_filter="SlaveTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
Thanks,
Joseph Wu
Re: Review Request 69869: [WIP] Added test for tearing down
frameworks while creating disks.
Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review212588
-----------------------------------------------------------
PASS: Mesos patch 69869 was successfully built and tested.
Reviews applied: `['69872', '69869']`
All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2856/mesos-review-69869
- Mesos Reviewbot Windows
On Jan. 31, 2019, 1:38 a.m., Joseph Wu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69869/
> -----------------------------------------------------------
>
> (Updated Jan. 31, 2019, 1:38 a.m.)
>
>
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
>
>
> Bugs: MESOS-9542
> https://issues.apache.org/jira/browse/MESOS-9542
>
>
> Repository: mesos
>
>
> Description
> -------
>
> The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
> operations, which means the master must wait for the operations to
> complete successfully before the master can update its resources.
> Because the master must wait to update the results of non-speculative
> operations, it is possible for the framework making the
> CREATE/DESTROY_DISK to be torn down before the operation completes.
>
> This commit adds a test to make sure the master can gracefully handle
> such a case.
>
>
> Diffs
> -----
>
> src/tests/storage_local_resource_provider_tests.cpp fb001aa8d32d1a0a03014a35772fe10b65ce8d9a
>
>
> Diff: https://reviews.apache.org/r/69869/diff/2/
>
>
> Testing
> -------
>
> The test currently fails with the same message as MESOS-9542, which is the intended behavior right now (we are discussing the fix).
>
> src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
>
>
> Thanks,
>
> Joseph Wu
>
>
Re: Review Request 69869: [WIP] Added test for tearing down
frameworks while creating disks.
Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review212472
-----------------------------------------------------------
FAIL: Some of the unit tests failed. Please check the relevant logs.
Reviews applied: `['69869']`
Failed command: `Start-MesosCITesting`
All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2837/mesos-review-69869
Relevant logs:
- [mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2837/mesos-review-69869/logs/mesos-tests.log):
```
W0131 02:42:53.282217 27312 slave.cpp:3934] Ignoring shutdown framework 4f25b998-2b3d-46b6-92d8-f59210e156af-0000 because it is terminating
I0131 02:42:53.284219 27700 master.cpp:1269] Agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 at slave(478)@192.10.1.6:59648 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net) disconnected
I0131 02:42:53.284219 27700 master.cpp:3272] Disconnecting agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 at slave(478)@192.10.1.6:59648 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0131 02:42:53.284219 27700 master.cpp:3291] Deactivating agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 at slave(478)@192.10.1.6:59648 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0131 02:42:53.284219 27524 hierarchical.cpp:358] Removed framework 4f25b998-2b3d-46b6-92d8-f59210e156af-0000
I0131 02:42:53.285212 27524 hierarchical.cpp:793] Agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 deactivated
I0131 02:42:53.286213 27504 containerizer.cpp:2477] Destroying container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605 in RUNNING state
I0131 02:42:53.286213 27504 containerizer.cpp:3144] Transitioning the state of container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605 from RUNNING to DESTROYING
I0131 02:42:53.287214 27504 launcher.cpp:161] Asked to destroy container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605
W0131 02:42:53.288219 26296 process.cpp:838] Failed to recv on socket WindowsFD::Type::SOCKET=6384 to peer '192.10.1.6:61562': IO failed with error code: The specified network name is no longer available.
W0131 02:42:53.288219 26296 process.cpp:1423] Failed to recv on socket WindowsFD::Type::SOCKET=6528 to peer '192.10.1.6:61561': IO failed with error code: The specified network name is no longer available.
I0131 02:42:53.363895 27700 containerizer.cpp:2983] Container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605 has exited
I0131 02:42:53.392890 26164 master.cpp:1109] Master terminating
I0131 02:42:53.394879 27504 hierarchical.cpp:644] Removed agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0
[ OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (686 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (703 ms total)
[----------] Global test environment tear-down
[==========] 1096 tests from 104 test cases ran. (519202 ms total)
[ PASSED ] 1095 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] SlaveTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate
1 FAILED TEST
YOU HAVE 231 DISABLED TESTS
I0131 02:42:53.800878 26296 process.cpp:927] Stopped the socket accept loop
```
- Mesos Reviewbot Windows
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------
(Updated Feb. 26, 2019, 11:24 a.m.)
Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
Changes
-------
Comment tweak only.
Bugs: MESOS-9542
https://issues.apache.org/jira/browse/MESOS-9542
Repository: mesos
Description
-------
The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.
This commit adds a test to make sure the master can gracefully handle
such a case.
Diffs (updated)
-----
src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941
Diff: https://reviews.apache.org/r/69869/diff/5/
Changes: https://reviews.apache.org/r/69869/diff/4-5/
Testing
-------
src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
One more test added next patch.
Thanks,
Joseph Wu
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213229
-----------------------------------------------------------
Ship it!
- Greg Mann
On Feb. 25, 2019, 9 p.m., Joseph Wu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69869/
> -----------------------------------------------------------
>
> (Updated Feb. 25, 2019, 9 p.m.)
>
>
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
>
>
> Bugs: MESOS-9542
> https://issues.apache.org/jira/browse/MESOS-9542
>
>
> Repository: mesos
>
>
> Description
> -------
>
> The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
> operations, which means the master must wait for the operations to
> complete successfully before the master can update its resources.
> Because the master must wait to update the results of non-speculative
> operations, it is possible for the framework making the
> CREATE/DESTROY_DISK to be torn down before the operation completes.
>
> This commit adds a test to make sure the master can gracefully handle
> such a case.
>
>
> Diffs
> -----
>
> src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941
>
>
> Diff: https://reviews.apache.org/r/69869/diff/4/
>
>
> Testing
> -------
>
> src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
>
> One more test added next patch.
>
>
> Thanks,
>
> Joseph Wu
>
>
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------
(Updated Feb. 25, 2019, 1 p.m.)
Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
Changes
-------
Addressed comments.
Bugs: MESOS-9542
https://issues.apache.org/jira/browse/MESOS-9542
Repository: mesos
Description
-------
The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.
This commit adds a test to make sure the master can gracefully handle
such a case.
Diffs (updated)
-----
src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941
Diff: https://reviews.apache.org/r/69869/diff/4/
Changes: https://reviews.apache.org/r/69869/diff/3-4/
Testing
-------
src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
One more test added next patch.
Thanks,
Joseph Wu
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Joseph Wu <jo...@mesosphere.io>.
> On Feb. 25, 2019, 10:59 a.m., Greg Mann wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 5130 (patched)
> > <https://reviews.apache.org/r/69869/diff/3/?file=2126214#file2126214line5130>
> >
> > LAUNCH_GROUP isn't a non-speculative operation; it uses `addTask()` while processing the operation in the master, which updates the agent's used resources under the assumption that the task launches will succeed: https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/master/master.cpp#L12615
I grabbed the definition of non-speculative from this helper:
https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/common/protobuf_utils.cpp#L891-L894
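
As a rough illustration of the classification in question, the sketch below uses a stand-in enum and a stand-in function. Neither is the Mesos source; the grouping of LAUNCH_GROUP with CREATE_DISK and DESTROY_DISK is an assumption based on how the linked helper is described in this thread, and RESERVE is included only as an example of a speculative operation (it is not mentioned elsewhere in this thread).

```cpp
#include <cassert>

// Hypothetical stand-in enum; this is not `Offer::Operation::Type`.
enum class OperationType {
  RESERVE,
  LAUNCH_GROUP,
  CREATE_DISK,
  DESTROY_DISK
};

// Returns true if the master may apply the operation's effects right away,
// on the assumption that the operation will succeed.
// ASSUMPTION: the grouping below mirrors how the helper is described in this
// thread; it is not copied from the Mesos source.
bool isSpeculative(OperationType type) {
  switch (type) {
    case OperationType::RESERVE:
      return true;
    case OperationType::LAUNCH_GROUP:
    case OperationType::CREATE_DISK:
    case OperationType::DESTROY_DISK:
      return false;
  }

  // Not reached for valid enum values.
  assert(false && "unknown operation type");
  return false;
}
```

Under this grouping, LAUNCH_GROUP counts as non-speculative even though, as noted in the quoted comment, the master updates the agent's used resources for it up front.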
> On Feb. 25, 2019, 10:59 a.m., Greg Mann wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 5243-5246 (patched)
> > <https://reviews.apache.org/r/69869/diff/3/?file=2126214#file2126214line5243>
> >
> > Should we do this before the call to `mesos.send()`?
We need the clock unpaused for the SLRP's initialization, but it is safe to pause the clock afterwards. I guess it wouldn't hurt to move up the pausing as much as possible.
- Joseph
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213179
-----------------------------------------------------------
On Feb. 21, 2019, 4:08 p.m., Joseph Wu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69869/
> -----------------------------------------------------------
>
> (Updated Feb. 21, 2019, 4:08 p.m.)
>
>
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
>
>
> Bugs: MESOS-9542
> https://issues.apache.org/jira/browse/MESOS-9542
>
>
> Repository: mesos
>
>
> Description
> -------
>
> The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
> operations, which means the master must wait for the operations to
> complete successfully before the master can update its resources.
> Because the master must wait to update the results of non-speculative
> operations, it is possible for the framework making the
> CREATE/DESTROY_DISK to be torn down before the operation completes.
>
> This commit adds a test to make sure the master can gracefully handle
> such a case.
>
>
> Diffs
> -----
>
> src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941
>
>
> Diff: https://reviews.apache.org/r/69869/diff/3/
>
>
> Testing
> -------
>
> src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
>
> One more test added next patch.
>
>
> Thanks,
>
> Joseph Wu
>
>
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Greg Mann <gr...@mesosphere.io>.
> On Feb. 25, 2019, 6:59 p.m., Greg Mann wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 5130 (patched)
> > <https://reviews.apache.org/r/69869/diff/3/?file=2126214#file2126214line5130>
> >
> > LAUNCH_GROUP isn't a non-speculative operation; it uses `addTask()` while processing the operation in the master, which updates the agent's used resources under the assumption that the task launches will succeed: https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/master/master.cpp#L12615
>
> Joseph Wu wrote:
> I grabbed the definition of non-speculative from this helper:
> https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/common/protobuf_utils.cpp#L891-L894
Hm weird, ok we can leave it for now.
- Greg
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213179
-----------------------------------------------------------
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213179
-----------------------------------------------------------
src/tests/storage_local_resource_provider_tests.cpp
Lines 5130 (patched)
<https://reviews.apache.org/r/69869/#comment299043>
LAUNCH_GROUP isn't a non-speculative operation; it uses `addTask()` while processing the operation in the master, which updates the agent's used resources under the assumption that the task launches will succeed: https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/master/master.cpp#L12615
src/tests/storage_local_resource_provider_tests.cpp
Lines 5243-5246 (patched)
<https://reviews.apache.org/r/69869/#comment299046>
Should we do this before the call to `mesos.send()`?
src/tests/storage_local_resource_provider_tests.cpp
Lines 5248 (patched)
<https://reviews.apache.org/r/69869/#comment299044>
s/completes/completed/
src/tests/storage_local_resource_provider_tests.cpp
Lines 5282 (patched)
<https://reviews.apache.org/r/69869/#comment299045>
Nit: indent two spaces further.
- Greg Mann
Re: Review Request 69869: Added test for tearing down frameworks while
creating disks.
Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------
(Updated Feb. 21, 2019, 4:08 p.m.)
Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
Changes
-------
Fixed a mock expectation for the case where the scheduler tries to reconnect to the master during scheduler teardown (this almost always occurs).
Summary (updated)
-----------------
Added test for tearing down frameworks while creating disks.
Bugs: MESOS-9542
https://issues.apache.org/jira/browse/MESOS-9542
Repository: mesos
Description
-------
The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.
This commit adds a test to make sure the master can gracefully handle
such a case.
Diffs (updated)
-----
src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941
Diff: https://reviews.apache.org/r/69869/diff/3/
Changes: https://reviews.apache.org/r/69869/diff/2-3/
Testing
-------
The test currently fails with the same message as MESOS-9542, which is the intended behavior right now (we are discussing the fix).
src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
Thanks,
Joseph Wu