You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2019/02/11 18:21:11 UTC

Review Request 69942: Added a test for MESOS-9554.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69942/
-----------------------------------------------------------

Review request for mesos, Benjamin Bannier and Meng Zhu.


Bugs: MESOS-9554
    https://issues.apache.org/jira/browse/MESOS-9554


Repository: mesos


Description
-------

This test ensures that the framework allocation loop does not
incorrectly break in the case that a framework is incapable
of receiving the resources on an agent. In this case, the loop
should continue as other frameworks may be capable of receiving
the resources.


Diffs
-----

  src/tests/hierarchical_allocator_tests.cpp cc88afbad1b4e6bf707cb13b50c964aa01f9a3ee 


Diff: https://reviews.apache.org/r/69942/diff/1/


Testing
-------

Ensured it fails on master, succeeds after fix to MESOS-9554.

Ran in repetition.


Thanks,

Benjamin Mahler


Re: Review Request 69942: Added a test for MESOS-9554.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69942/#review212712
-----------------------------------------------------------



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['69900', '69902', '69942']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2872/mesos-review-69942

Relevant logs:

- [mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2872/mesos-review-69942/logs/mesos-tests.log):

```
W0211 19:33:19.789881 41488 slave.cpp:3934] Ignoring shutdown framework 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-0000 because it is terminating
I0211 19:33:19.791877 43452 hierarchical.cpp:358] Removed framework 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-0000
I0211 19:33:19.792883 37668 master.cpp:1269] Agent 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-S0 at slave(477)@192.10.1.6:60203 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net) disconnected
I0211 19:33:19.792883 37668 master.cpp:3272] Disconnecting agent 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-S0 at slave(477)@192.10.1.6:60203 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0211 19:33:19.792883 37668 master.cpp:3291] Deactivating agent 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-S0 at slave(477)@192.10.1.6:60203 (windows-02.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0211 19:33:19.792883 41488 hierarchical.cpp:793] Agent 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-S0 deactivated
I0211 19:33:19.793946 41488 containerizer.cpp:2477] Destroying container 00a47ff7-873c-4085-978b-485cbca4089f in RUNNING state
I0211 19:33:19.793946 41488 containerizer.cpp:3144] Transitioning the state of container 00a47ff7-873c-4085-978b-485cbca4089f from RUNNING to DESTROYING
I0211 19:33:19.794879 41488 launcher.cpp:161] Asked to destroy container 00a47ff7-873c-4085-978b-485cbca4089f
W0211 19:33:19.795878 43388 process.cpp[       OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (688 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (706 ms total)

[----------] Global test environment tear-down
[==========] 1096 tests from 104 test cases ran. (514431 ms total)
[  PASSED  ] 1095 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_InvokeFetchByName

 1 FAILED TEST
  YOU HAVE 231 DISABLED TESTS

:1423] Failed to recv on socket WindowsFD::Type::SOCKET=6532 to peer '192.10.1.6:62100': IO failed with error code: The specified network name is no longer available.

W0211 19:33:19.796885 43388 process.cpp:838] Failed to recv on socket WindowsFD::Type::SOCKET=8316 to peer '192.10.1.6:62101': IO failed with error code: The specified network name is no longer available.

I0211 19:33:19.815894 44752 containerizer.cpp:2983] Container 00a47ff7-873c-4085-978b-485cbca4089f has exited
I0211 19:33:19.844877 40992 master.cpp:1109] Master terminating
I0211 19:33:19.846884 41488 hierarchical.cpp:644] Removed agent 0e08c1e8-6346-4bce-bd53-64c57bdab0cb-S0
I0211 19:33:20.391891 43388 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows


On Feb. 11, 2019, 6:21 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69942/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2019, 6:21 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Meng Zhu.
> 
> 
> Bugs: MESOS-9554
>     https://issues.apache.org/jira/browse/MESOS-9554
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test ensures that the framework allocation loop does not
> incorrectly break in the case that a framework is incapable
> of receiving the resources on an agent. In this case, the loop
> should continue as other frameworks may be capable of receiving
> the resources.
> 
> 
> Diffs
> -----
> 
>   src/tests/hierarchical_allocator_tests.cpp cc88afbad1b4e6bf707cb13b50c964aa01f9a3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69942/diff/1/
> 
> 
> Testing
> -------
> 
> Ensured it fails on master, succeeds after fix to MESOS-9554.
> 
> Ran in repetition.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 69942: Added a test for MESOS-9554.

Posted by Benjamin Bannier <be...@mesosphere.io>.

> On Feb. 12, 2019, 7:34 p.m., Benjamin Bannier wrote:
> > src/tests/hierarchical_allocator_tests.cpp
> > Lines 2924-2929 (patched)
> > <https://reviews.apache.org/r/69942/diff/1/?file=2124739#file2124739line2924>
> >
> >     This test actually is more than a regression test for MESOS-9554 as it validates to some extent that framework allocation decisions are independent (but for possibly decreasing the available resource pool).
> >     
> >     Maybe rephrase along the lines of
> >     ```
> >     This test validates with one case that framework allocation decisions are independent (but for possibly decreasing the available resource pool). In particular, we validate that the decision not to allocate resources of an agent to a particular framework does not influence to allocate the same resources to another agent. In this test we make agent resources not allocatable to a framework by refining the resource reservations while the framework which should be allocated these resources based on DRF share is not capable of receiving them (it lacks the `RESERVATION_REFINEMENT` capability).
> >     
> >     This is a regression test for MESOS-9554.
> >     ```
> >     
> >     We could also rename the test, e.g., to `HierarchicalAllocatorTest.AllocationIndependent` (doesn't read to well, either). This tests models a whole class of scenarios we currently don't validate.
> 
> Benjamin Mahler wrote:
>     Thanks for the suggestion. I will have to think on this, and so I'll leave it open for now

Thinking more about this, "allocations have no effect on each other" doesn't seem right. A better invariant might that absent quota all cluster resources get allocated if at least one framework could receive them. We could even generalize this

Given
- `k` reasons resources could not be allocatable to a framework (e.g., filters for whole agents or parts of them, framework capabilities)
- `n > k` frameworks

For all assignments of reason to `n - 1` frameworks (reasons can be repeated; an assignement doesn't need to use all `k` reasons, `n`th framework can always receive any resource):
  For all DRF sortings of frameworks:
    sum of allocations == total resources

If we care about this, it should be possible to generate all possible configurations and check them.

There might be other invariants which could be checked in a similar way. Quota be treated similarly, but we'd need to include quota headroom.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69942/#review212751
-----------------------------------------------------------


On Feb. 11, 2019, 7:21 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69942/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2019, 7:21 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Meng Zhu.
> 
> 
> Bugs: MESOS-9554
>     https://issues.apache.org/jira/browse/MESOS-9554
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test ensures that the framework allocation loop does not
> incorrectly break in the case that a framework is incapable
> of receiving the resources on an agent. In this case, the loop
> should continue as other frameworks may be capable of receiving
> the resources.
> 
> 
> Diffs
> -----
> 
>   src/tests/hierarchical_allocator_tests.cpp cc88afbad1b4e6bf707cb13b50c964aa01f9a3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69942/diff/1/
> 
> 
> Testing
> -------
> 
> Ensured it fails on master, succeeds after fix to MESOS-9554.
> 
> Ran in repetition.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 69942: Added a test for MESOS-9554.

Posted by Benjamin Mahler <bm...@apache.org>.

> On Feb. 12, 2019, 6:34 p.m., Benjamin Bannier wrote:
> > src/tests/hierarchical_allocator_tests.cpp
> > Lines 2924-2929 (patched)
> > <https://reviews.apache.org/r/69942/diff/1/?file=2124739#file2124739line2924>
> >
> >     This test actually is more than a regression test for MESOS-9554 as it validates to some extent that framework allocation decisions are independent (but for possibly decreasing the available resource pool).
> >     
> >     Maybe rephrase along the lines of
> >     ```
> >     This test validates with one case that framework allocation decisions are independent (but for possibly decreasing the available resource pool). In particular, we validate that the decision not to allocate resources of an agent to a particular framework does not influence to allocate the same resources to another agent. In this test we make agent resources not allocatable to a framework by refining the resource reservations while the framework which should be allocated these resources based on DRF share is not capable of receiving them (it lacks the `RESERVATION_REFINEMENT` capability).
> >     
> >     This is a regression test for MESOS-9554.
> >     ```
> >     
> >     We could also rename the test, e.g., to `HierarchicalAllocatorTest.AllocationIndependent` (doesn't read to well, either). This tests models a whole class of scenarios we currently don't validate.

Thanks for the suggestion. I will have to think on this, and so I'll leave it open for now


> On Feb. 12, 2019, 6:34 p.m., Benjamin Bannier wrote:
> > src/tests/hierarchical_allocator_tests.cpp
> > Lines 2984-2990 (patched)
> > <https://reviews.apache.org/r/69942/diff/1/?file=2124739#file2124739line2984>
> >
> >     I don't think using a single statement here reads much worse,
> >     ```
> >     Resources resources =?
> >       CHECK_NOTERROR(Resources::parse("cpus:1;mem:1;disk:1"))?
> >         .pushReservation(createDynamicReservationInfo("parent"))?
> >         .pushReservation(createDynamicReservationInfo("parent/child"));?
> >     ```

Looks good, thanks


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69942/#review212751
-----------------------------------------------------------


On Feb. 11, 2019, 6:21 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69942/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2019, 6:21 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Meng Zhu.
> 
> 
> Bugs: MESOS-9554
>     https://issues.apache.org/jira/browse/MESOS-9554
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test ensures that the framework allocation loop does not
> incorrectly break in the case that a framework is incapable
> of receiving the resources on an agent. In this case, the loop
> should continue as other frameworks may be capable of receiving
> the resources.
> 
> 
> Diffs
> -----
> 
>   src/tests/hierarchical_allocator_tests.cpp cc88afbad1b4e6bf707cb13b50c964aa01f9a3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69942/diff/1/
> 
> 
> Testing
> -------
> 
> Ensured it fails on master, succeeds after fix to MESOS-9554.
> 
> Ran in repetition.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 69942: Added a test for MESOS-9554.

Posted by Benjamin Bannier <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69942/#review212751
-----------------------------------------------------------


Fix it, then Ship it!





src/tests/hierarchical_allocator_tests.cpp
Lines 2924-2929 (patched)
<https://reviews.apache.org/r/69942/#comment298655>

    This test actually is more than a regression test for MESOS-9554 as it validates to some extent that framework allocation decisions are independent (but for possibly decreasing the available resource pool).
    
    Maybe rephrase along the lines of
    ```
    This test validates with one case that framework allocation decisions are independent (but for possibly decreasing the available resource pool). In particular, we validate that the decision not to allocate resources of an agent to a particular framework does not influence to allocate the same resources to another agent. In this test we make agent resources not allocatable to a framework by refining the resource reservations while the framework which should be allocated these resources based on DRF share is not capable of receiving them (it lacks the `RESERVATION_REFINEMENT` capability).
    
    This is a regression test for MESOS-9554.
    ```
    
    We could also rename the test, e.g., to `HierarchicalAllocatorTest.AllocationIndependent` (doesn't read to well, either). This tests models a whole class of scenarios we currently don't validate.



src/tests/hierarchical_allocator_tests.cpp
Lines 2970 (patched)
<https://reviews.apache.org/r/69942/#comment298654>

    Let's `ASSERT_EQ` here as the test is pointless without this being true.



src/tests/hierarchical_allocator_tests.cpp
Lines 2984-2990 (patched)
<https://reviews.apache.org/r/69942/#comment298656>

    I don't think using a single statement here reads much worse,
    ```
    Resources resources =?
      CHECK_NOTERROR(Resources::parse("cpus:1;mem:1;disk:1"))?
        .pushReservation(createDynamicReservationInfo("parent"))?
        .pushReservation(createDynamicReservationInfo("parent/child"));?
    ```


- Benjamin Bannier


On Feb. 11, 2019, 7:21 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69942/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2019, 7:21 p.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Meng Zhu.
> 
> 
> Bugs: MESOS-9554
>     https://issues.apache.org/jira/browse/MESOS-9554
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test ensures that the framework allocation loop does not
> incorrectly break in the case that a framework is incapable
> of receiving the resources on an agent. In this case, the loop
> should continue as other frameworks may be capable of receiving
> the resources.
> 
> 
> Diffs
> -----
> 
>   src/tests/hierarchical_allocator_tests.cpp cc88afbad1b4e6bf707cb13b50c964aa01f9a3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69942/diff/1/
> 
> 
> Testing
> -------
> 
> Ensured it fails on master, succeeds after fix to MESOS-9554.
> 
> Ran in repetition.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 69942: Added a test for MESOS-9554.

Posted by Meng Zhu <mz...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69942/#review212753
-----------------------------------------------------------


Ship it!




Ship It!

- Meng Zhu


On Feb. 11, 2019, 10:21 a.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69942/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2019, 10:21 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier and Meng Zhu.
> 
> 
> Bugs: MESOS-9554
>     https://issues.apache.org/jira/browse/MESOS-9554
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This test ensures that the framework allocation loop does not
> incorrectly break in the case that a framework is incapable
> of receiving the resources on an agent. In this case, the loop
> should continue as other frameworks may be capable of receiving
> the resources.
> 
> 
> Diffs
> -----
> 
>   src/tests/hierarchical_allocator_tests.cpp cc88afbad1b4e6bf707cb13b50c964aa01f9a3ee 
> 
> 
> Diff: https://reviews.apache.org/r/69942/diff/1/
> 
> 
> Testing
> -------
> 
> Ensured it fails on master, succeeds after fix to MESOS-9554.
> 
> Ran in repetition.
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>