Posted to reviews@mesos.apache.org by Joseph Wu <jo...@mesosphere.io> on 2019/01/31 01:38:53 UTC

Review Request 69869: [WIP] Added test for tearing down frameworks while creating disks.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------

Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Bugs: MESOS-9542
    https://issues.apache.org/jira/browse/MESOS-9542


Repository: mesos


Description
-------

CREATE_DISK and DESTROY_DISK are "non-speculative" operations: the
master must wait for them to complete successfully before it can
update its resources. Because of this wait, the framework that issued
the CREATE_DISK or DESTROY_DISK operation can be torn down before the
operation completes.

This commit adds a test to make sure the master can gracefully handle
such a case.
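
To make the race concrete, here is a minimal sketch of the ordering the test exercises. It is a toy model, not Mesos code: `ToyMaster` and all of its methods are hypothetical names invented for illustration, and the real master tracks far more state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Toy model only: every name below is hypothetical; none of it is Mesos code.
class ToyMaster {
public:
  void addFramework(const std::string& frameworkId) {
    frameworks_.insert(frameworkId);
  }

  // Non-speculative: remember the request, but leave `disks_` untouched
  // until a terminal status update arrives.
  void createDisk(const std::string& operationId,
                  const std::string& frameworkId) {
    assert(frameworks_.count(frameworkId) == 1);
    pending_[operationId] = frameworkId;
  }

  // Teardown forgets the framework while its operation is still in flight.
  void teardownFramework(const std::string& frameworkId) {
    frameworks_.erase(frameworkId);
  }

  // Terminal status update. Returns false for unknown operations.
  // It never looks the framework up, so a torn-down framework cannot
  // cause a failure here.
  bool operationFinished(const std::string& operationId, bool succeeded) {
    auto it = pending_.find(operationId);
    if (it == pending_.end()) {
      return false;
    }
    if (succeeded) {
      ++disks_;
    }
    pending_.erase(it);
    return true;
  }

  bool hasFramework(const std::string& frameworkId) const {
    return frameworks_.count(frameworkId) == 1;
  }
  std::size_t pendingOperations() const { return pending_.size(); }
  int disks() const { return disks_; }

private:
  std::set<std::string> frameworks_;
  std::map<std::string, std::string> pending_;  // operation ID -> framework ID
  int disks_ = 0;
};
```

Calling `createDisk`, then `teardownFramework`, then `operationFinished` reproduces the sequence described above: the resource count changes only at the last step, after the issuing framework is already gone.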


Diffs
-----

  src/tests/slave_tests.cpp 22a0295086ae4f4ec26df00a0e077eecfa27f1fb 


Diff: https://reviews.apache.org/r/69869/diff/1/


Testing
-------

The test does not currently pass, as it is meant to be a regression
test for MESOS-9542 (a bug whose fix has not been started yet).

src/mesos-tests --gtest_filter="SlaveTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose


Thanks,

Joseph Wu


Re: Review Request 69869: [WIP] Added test for tearing down frameworks while creating disks.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review212588
-----------------------------------------------------------



PASS: Mesos patch 69869 was successfully built and tested.

Reviews applied: `['69872', '69869']`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2856/mesos-review-69869

- Mesos Reviewbot Windows


On Jan. 31, 2019, 1:38 a.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69869/
> -----------------------------------------------------------
> 
> (Updated Jan. 31, 2019, 1:38 a.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-9542
>     https://issues.apache.org/jira/browse/MESOS-9542
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
> operations, which means the master must wait for the operations to
> complete successfully before the master can update its resources.
> Because the master must wait to update the results of non-speculative
> operations, it is possible for the framework making the
> CREATE/DESTROY_DISK to be torn down before the operation completes.
> 
> This commit adds a test to make sure the master can gracefully handle
> such a case.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp fb001aa8d32d1a0a03014a35772fe10b65ce8d9a 
> 
> 
> Diff: https://reviews.apache.org/r/69869/diff/2/
> 
> 
> Testing
> -------
> 
> The test currently fails with the same message as MESOS-9542, which is the intended behavior right now (we are discussing the fix).
> 
> src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>


Re: Review Request 69869: [WIP] Added test for tearing down frameworks while creating disks.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review212472
-----------------------------------------------------------



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['69869']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2837/mesos-review-69869

Relevant logs:

- [mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2837/mesos-review-69869/logs/mesos-tests.log):

```
W0131 02:42:53.282217 27312 slave.cpp:3934] Ignoring shutdown framework 4f25b998-2b3d-46b6-92d8-f59210e156af-0000 because it is terminating
I0131 02:42:53.284219 27700 master.cpp:1269] Agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 at slave(478)@192.10.1.6:59648 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net) disconnected
I0131 02:42:53.284219 27700 master.cpp:3272] Disconnecting agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 at slave(478)@192.10.1.6:59648 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0131 02:42:53.284219 27700 master.cpp:3291] Deactivating agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 at slave(478)@192.10.1.6:59648 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0131 02:42:53.284219 27524 hierarchical.cpp:358] Removed framework 4f25b998-2b3d-46b6-92d8-f59210e156af-0000
I0131 02:42:53.285212 27524 hierarchical.cpp:793] Agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0 deactivated
I0131 02:42:53.286213 27504 containerizer.cpp:2477] Destroying container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605 in RUNNING state
I0131 02:42:53.286213 27504 containerizer.cpp:3144] Transitioning the state of container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605 from RUNNING to DESTROYING
I0131 02:42:53.287214 27504 launcher.cpp:161] Asked to destroy container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605
W0131 02:42:53.288219 26296 process.cpp:838] Failed to recv on socket WindowsFD::Type::SOCKET=6384 to peer '192.10.1.6:61562': IO failed with error code: The specified network name is no longer available.

W0131 02:42:53.288219 26296 process.cpp:1423] Failed to recv on socket WindowsFD::Type::SOCKET=6528 to peer '192.10.1.6:61561': IO failed with error code: The specified network name is no longer available.

I0131 02:42:53.363895 27700 containerizer.cpp:2983] Container 60d4eba9-ceae-4c21-bd7e-755ddb5d1605 has exited
I0131 02:42:53.392890 26164 master.cpp:1109] Master terminating
I0131 02:42:53.394879 27504 hierarchical.cpp:644] Removed agent 4f25b998-2b3d-46b6-92d8-f59210e156af-S0
[       OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (686 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (703 ms total)

[----------] Global test environment tear-down
[==========] 1096 tests from 104 test cases ran. (519202 ms total)
[  PASSED  ] 1095 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] SlaveTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate

 1 FAILED TEST
  YOU HAVE 231 DISABLED TESTS

I0131 02:42:53.800878 26296 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows




Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------

(Updated Feb. 26, 2019, 11:24 a.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
-------

Comment tweak only.


Bugs: MESOS-9542
    https://issues.apache.org/jira/browse/MESOS-9542


Repository: mesos


Description
-------

The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.

This commit adds a test to make sure the master can gracefully handle
such a case.


Diffs (updated)
-----

  src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941 


Diff: https://reviews.apache.org/r/69869/diff/5/

Changes: https://reviews.apache.org/r/69869/diff/4-5/


Testing
-------

src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose

One more test is added in the next patch.


Thanks,

Joseph Wu


Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213229
-----------------------------------------------------------


Ship it!

- Greg Mann


On Feb. 25, 2019, 9 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69869/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2019, 9 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-9542
>     https://issues.apache.org/jira/browse/MESOS-9542
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
> operations, which means the master must wait for the operations to
> complete successfully before the master can update its resources.
> Because the master must wait to update the results of non-speculative
> operations, it is possible for the framework making the
> CREATE/DESTROY_DISK to be torn down before the operation completes.
> 
> This commit adds a test to make sure the master can gracefully handle
> such a case.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941 
> 
> 
> Diff: https://reviews.apache.org/r/69869/diff/4/
> 
> 
> Testing
> -------
> 
> src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
> 
> One more test is added in the next patch.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>


Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------

(Updated Feb. 25, 2019, 1 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
-------

Addressed comments.


Bugs: MESOS-9542
    https://issues.apache.org/jira/browse/MESOS-9542


Repository: mesos


Description
-------

The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.

This commit adds a test to make sure the master can gracefully handle
such a case.


Diffs (updated)
-----

  src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941 


Diff: https://reviews.apache.org/r/69869/diff/4/

Changes: https://reviews.apache.org/r/69869/diff/3-4/


Testing
-------

src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose

One more test is added in the next patch.


Thanks,

Joseph Wu


Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Joseph Wu <jo...@mesosphere.io>.

> On Feb. 25, 2019, 10:59 a.m., Greg Mann wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 5130 (patched)
> > <https://reviews.apache.org/r/69869/diff/3/?file=2126214#file2126214line5130>
> >
> >     LAUNCH_GROUP isn't a non-speculative operation; it uses `addTask()` while processing the operation in the master, which updates the agent's used resources under the assumption that the task launches will succeed: https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/master/master.cpp#L12615

I grabbed the definition of non-speculative from this helper:
https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/common/protobuf_utils.cpp#L891-L894
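
As a rough illustration of the kind of classification that helper performs, here is a self-contained sketch. It assumes, as this exchange implies, that the helper groups LAUNCH_GROUP with CREATE_DISK and DESTROY_DISK. The enum and the function name `isSpeculative` are hypothetical stand-ins, not the Mesos definitions, and treating RESERVE as speculative is an assumption added only to give the function a contrasting case.

```cpp
// Hypothetical stand-in for a few operation types; not the Mesos enum.
enum class OperationType { RESERVE, LAUNCH_GROUP, CREATE_DISK, DESTROY_DISK };

// Assumption: results of speculative operations are applied right away,
// while non-speculative ones are applied only after they finish.
inline bool isSpeculative(OperationType type) {
  switch (type) {
    case OperationType::RESERVE:
      return true;  // Assumed speculative, for contrast only.
    case OperationType::LAUNCH_GROUP:  // Grouping implied by this thread.
    case OperationType::CREATE_DISK:
    case OperationType::DESTROY_DISK:
      return false;
  }
  return false;  // Unreachable for valid enum values.
}
```

Under this reading, "non-speculative" is simply whatever the helper does not report as speculative, which is a different criterion from whether the master pre-charges resources via `addTask()`.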


> On Feb. 25, 2019, 10:59 a.m., Greg Mann wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 5243-5246 (patched)
> > <https://reviews.apache.org/r/69869/diff/3/?file=2126214#file2126214line5243>
> >
> >     Should we do this before the call to `mesos.send()`?

We need the clock unpaused for the SLRP's initialization, but it is safe to pause the clock afterwards.  I guess it wouldn't hurt to pause the clock as early as possible after that.


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213179
-----------------------------------------------------------


On Feb. 21, 2019, 4:08 p.m., Joseph Wu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69869/
> -----------------------------------------------------------
> 
> (Updated Feb. 21, 2019, 4:08 p.m.)
> 
> 
> Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.
> 
> 
> Bugs: MESOS-9542
>     https://issues.apache.org/jira/browse/MESOS-9542
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
> operations, which means the master must wait for the operations to
> complete successfully before the master can update its resources.
> Because the master must wait to update the results of non-speculative
> operations, it is possible for the framework making the
> CREATE/DESTROY_DISK to be torn down before the operation completes.
> 
> This commit adds a test to make sure the master can gracefully handle
> such a case.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941 
> 
> 
> Diff: https://reviews.apache.org/r/69869/diff/3/
> 
> 
> Testing
> -------
> 
> src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose
> 
> One more test is added in the next patch.
> 
> 
> Thanks,
> 
> Joseph Wu
> 
>


Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Greg Mann <gr...@mesosphere.io>.

> On Feb. 25, 2019, 6:59 p.m., Greg Mann wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 5130 (patched)
> > <https://reviews.apache.org/r/69869/diff/3/?file=2126214#file2126214line5130>
> >
> >     LAUNCH_GROUP isn't a non-speculative operation; it uses `addTask()` while processing the operation in the master, which updates the agent's used resources under the assumption that the task launches will succeed: https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/master/master.cpp#L12615
> 
> Joseph Wu wrote:
>     I grabbed the definition of non-speculative from this helper:
>     https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/common/protobuf_utils.cpp#L891-L894

Hm weird, ok we can leave it for now.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213179
-----------------------------------------------------------




Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/#review213179
-----------------------------------------------------------




src/tests/storage_local_resource_provider_tests.cpp
Lines 5130 (patched)
<https://reviews.apache.org/r/69869/#comment299043>

    LAUNCH_GROUP isn't a non-speculative operation; it uses `addTask()` while processing the operation in the master, which updates the agent's used resources under the assumption that the task launches will succeed: https://github.com/apache/mesos/blob/68b73928a622be093a720277ec2ae1589c221b88/src/master/master.cpp#L12615



src/tests/storage_local_resource_provider_tests.cpp
Lines 5243-5246 (patched)
<https://reviews.apache.org/r/69869/#comment299046>

    Should we do this before the call to `mesos.send()`?



src/tests/storage_local_resource_provider_tests.cpp
Lines 5248 (patched)
<https://reviews.apache.org/r/69869/#comment299044>

    s/completes/completed/



src/tests/storage_local_resource_provider_tests.cpp
Lines 5282 (patched)
<https://reviews.apache.org/r/69869/#comment299045>

    Nit: indent two spaces further.


- Greg Mann




Re: Review Request 69869: Added test for tearing down frameworks while creating disks.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69869/
-----------------------------------------------------------

(Updated Feb. 21, 2019, 4:08 p.m.)


Review request for mesos, Benno Evers, Gastón Kleiman, and Greg Mann.


Changes
-------

Fixed a mock expectation for the case (which almost always occurs) where the scheduler tries to reconnect to the master during scheduler teardown.


Summary (updated)
-----------------

Added test for tearing down frameworks while creating disks.


Bugs: MESOS-9542
    https://issues.apache.org/jira/browse/MESOS-9542


Repository: mesos


Description
-------

The CREATE_DISK and DESTROY_DISK operations are "non-speculative"
operations, which means the master must wait for the operations to
complete successfully before the master can update its resources.
Because the master must wait to update the results of non-speculative
operations, it is possible for the framework making the
CREATE/DESTROY_DISK to be torn down before the operation completes.

This commit adds a test to make sure the master can gracefully handle
such a case.


Diffs (updated)
-----

  src/tests/storage_local_resource_provider_tests.cpp a661951a0a326cc342aa0c45dd0967692ae70941 


Diff: https://reviews.apache.org/r/69869/diff/3/

Changes: https://reviews.apache.org/r/69869/diff/2-3/


Testing
-------

The test currently fails with the same message as MESOS-9542, which is the intended behavior right now (we are discussing the fix).

src/mesos-tests --gtest_filter="StorageLocalResourceProviderTest.FrameworkTeardownBeforeTerminalOperationStatusUpdate" --verbose


Thanks,

Joseph Wu