Posted to reviews@mesos.apache.org by Meng Zhu <mz...@mesosphere.io> on 2019/03/29 18:45:29 UTC

Review Request 70344: Fixed a bug where quota may be under allocated.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70344/
-----------------------------------------------------------

Review request for mesos and Benjamin Mahler.


Bugs: MESOS-9692
    https://issues.apache.org/jira/browse/MESOS-9692


Repository: mesos


Description
-------

The resource-chopping logic in the allocator
currently over-shrinks resources that share a
name (e.g. vanilla disk and mount disk) when there
are multiple such resources. This can leave
quota under-allocated. This patch corrects
the shrinking logic. See MESOS-9692.
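
To illustrate the behavior the description is aiming for, here is a minimal
sketch. It is not the code in this patch. The `Res` struct and the
`shrinkToTarget` helper are hypothetical stand-ins, not Mesos types or
functions. The sketch shrinks against a per-name budget that is decremented as
each entry is consumed, so two entries that share the name `disk` can together
fill the target instead of each being cut down on its own. It ignores details
a real allocator has to handle (for example, resources that cannot be split).

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a scalar resource; NOT the Mesos `Resource` type.
struct Res {
  std::string name;  // e.g. "disk"
  std::string kind;  // metadata distinguishing entries, e.g. "ROOT" or "MOUNT"
  double amount;
};

// Hypothetical helper: returns a subset of `resources` whose summed amount
// per name does not exceed the target for that name. Names that are absent
// from `target` are dropped entirely.
std::vector<Res> shrinkToTarget(
    const std::vector<Res>& resources,
    std::map<std::string, double> target)  // taken by value: running budget
{
  std::vector<Res> result;

  for (Res r : resources) {
    auto it = target.find(r.name);
    if (it == target.end() || it->second <= 0.0) {
      continue;  // no budget (left) for this name
    }

    // Take as much of this entry as the remaining budget allows, then
    // charge it against the budget so later entries with the same name
    // see only what is left.
    r.amount = std::min(r.amount, it->second);
    it->second -= r.amount;
    result.push_back(r);
  }

  return result;
}
```

With a `disk` target of 100 and entries of 50 (ROOT) and 50 (MOUNT), both
entries are kept whole. With a target of 80, the first entry is kept at 50 and
the second is shrunk to 30.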


Diffs
-----

  src/master/allocator/mesos/hierarchical.cpp 8bc749903b8ceb09a02e260919377483479302b5 
  src/tests/hierarchical_allocator_tests.cpp a34fa96fbc455d830c1fd7c1df83f3db72c96ee3 


Diff: https://reviews.apache.org/r/70344/diff/1/


Testing
-------

make check
Added a dedicated test. The test fails without the fix and passes with the fix.


Thanks,

Meng Zhu


Re: Review Request 70344: Fixed a bug where disk quota may be under allocated.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70344/#review214218
-----------------------------------------------------------



Patch looks great!

Reviews applied: [70344]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot




Re: Review Request 70344: Fixed a bug where disk quota may be under allocated.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70344/#review214220
-----------------------------------------------------------



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['70344']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3045/mesos-review-70344

Relevant logs:

- [mesos-tests.log](http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/3045/mesos-review-70344/logs/mesos-tests.log):

```
W0329 22:16:30.679788 21580 slave.cpp:3932] Ignoring shutdown framework 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-0000 because it is terminating
I0329 22:16:30.682793 16952 master.cpp:1295] Agent 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-S0 at slave(500)@192.10.1.4:57935 (windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net) disconnected
I0329 22:16:30.682793 16952 master.cpp:3333] Disconnecting agent 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-S0 at slave(500)@192.10.1.4:57935 (windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0329 22:16:30.682793 16952 master.cpp:3352] Deactivating agent 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-S0 at slave(500)@192.10.1.4:57935 (windows-01.chtsmhjxogyevckjfayqqcnjda.xx.internal.cloudapp.net)
I0329 22:16:30.683786 22456 hierarchical.cpp:391] Removed framework 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-0000
I0329 22:16:30.683786 22456 hierarchical.cpp:828] Agent 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-S0 deactivated
I0329 22:16:30.683786 20600 containerizer.cpp:2576] Destroying container cb233331-b0cb-420a-b639-06ea955effa1 in RUNNING state
I0329 22:16:30.683786 20600 containerizer.cpp:3278] Transitioning the state of container cb233331-b0cb-420a-b639-06ea955effa1 from RUNNING to DESTROYING
I0329 22:16:30.684798 20600 launcher.cpp:161] Asked to destroy container cb233331-b0cb-420a-b639-06ea955effa1
W0329 22:16:30.685784 14640 process.cpp:1423] Failed to recv on socket WindowsFD::Type::SOCKET=4260 to peer '192.10.1.4:60335': IO failed with error code: The specified network name is no longer available.

W0329 22:16:30.685784 14640 process.cpp:838] Failed to recv on socket WindowsFD::Type::SOCKET=928 to peer '192.10.1.4:60336': IO failed with error code: The specified network name is no longer available.

[       OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (683 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (701 ms total)

[----------] Global test environment tear-down
[==========] 1131 tests from 107 test cases ran. (568799 ms total)
[  PASSED  ] 1130 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] DockerFetcherPluginTest.INTERNET_CURL_FetchBlob

 1 FAILED TEST
  YOU HAVE 231 DISABLED TESTS

I0329 22:16:30.705838 16952 containerizer.cpp:3117] Container cb233331-b0cb-420a-b639-06ea955effa1 has exited
I0329 22:16:30.738899 20848 master.cpp:1135] Master terminating
I0329 22:16:30.739795 12768 hierarchical.cpp:679] Removed agent 5c9a38ec-cccd-4681-ac45-c0eb5f66624d-S0
I0329 22:16:31.088856 14640 process.cpp:927] Stopped the socket accept loop
```

- Mesos Reviewbot Windows




Re: Review Request 70344: Fixed a bug where disk quota may be under allocated.

Posted by Meng Zhu <mz...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70344/
-----------------------------------------------------------

(Updated March 29, 2019, 2:28 p.m.)


Review request for mesos and Benjamin Mahler.


Changes
-------

Addressed Ben's comment


Summary (updated)
-----------------

Fixed a bug where disk quota may be under allocated.


Bugs: MESOS-9692
    https://issues.apache.org/jira/browse/MESOS-9692


Repository: mesos


Description (updated)
-------

The resource-chopping logic in the allocator
currently over-shrinks resources that share a
name (e.g. vanilla disk and mount disk) when there
are multiple such resources. At present, only
different disk resources can share a name
(we only chop unreserved/non-revocable/non-shared
resources), so the effect is that disk quota can be
under-allocated. This patch corrects the shrinking logic.
See MESOS-9692.


Diffs (updated)
-----

  src/master/allocator/mesos/hierarchical.cpp 8bc749903b8ceb09a02e260919377483479302b5 
  src/tests/hierarchical_allocator_tests.cpp a34fa96fbc455d830c1fd7c1df83f3db72c96ee3 


Diff: https://reviews.apache.org/r/70344/diff/2/

Changes: https://reviews.apache.org/r/70344/diff/1-2/


Testing
-------

make check
Added a dedicated test. The test fails without the fix and passes with the fix.


Thanks,

Meng Zhu


Re: Review Request 70344: Fixed a bug where quota may be under allocated.

Posted by Benjamin Mahler <bm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70344/#review214211
-----------------------------------------------------------


Fix it, then Ship it!




Could we make the summary/description/test name a bit more specific about it being related to disk w/ metadata?


src/tests/hierarchical_allocator_tests.cpp
Lines 3948-3950 (patched)
<https://reviews.apache.org/r/70344/#comment300406>

    Could we describe this as multiple disk resources rather than the more generic resources w/ same name?



src/tests/hierarchical_allocator_tests.cpp
Lines 3960-3965 (patched)
<https://reviews.apache.org/r/70344/#comment300407>

    Hm.. a high-level comment at the top of the test explaining how it works would be helpful (i.e. we create 50 MOUNT disk and 50 regular disk and expect both to be allocated for quota?). We should probably also mention the ticket, since the test makes most sense to the reader as a regression test for that issue.


- Benjamin Mahler

