Posted to reviews@mesos.apache.org by Benjamin Bannier <be...@mesosphere.io> on 2017/12/15 13:15:58 UTC
Review Request 64648: Fixed handling of resource versions in agent
oversubscribed updates.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64648/
-----------------------------------------------------------
Review request for mesos, Jie Yu and Jan Schlicht.
Repository: mesos
Description
-------
Agents can use 'UpdateSlaveMessage' to send updates on their
oversubscribed resources, their resource provider state, or both. We
previously assumed that 'UpdateSlaveMessage' from a resource
provider-capable agent would always contain the most recent resource
version of the agent even though the field is marked 'optional'.
This patch simplifies the handling in the master to not assert a set
resource version for 'UpdateSlaveMessage' for resource
provider-capable agents. Instead we explicitly and unconditionally
check whether the field is set and handle only set values.
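The change in handling can be illustrated with a minimal, self-contained sketch. This is not the code in src/master/master.cpp: the 'UpdateSlaveMessage' and 'Agent' structs below are hypothetical stand-ins, and 'applyResourceVersion' is an invented helper name. The sketch only shows the idea of checking whether the optional field is set instead of asserting that it is.

```cpp
#include <string>

// Hypothetical stand-in for the protobuf 'UpdateSlaveMessage'. The real
// message has more fields and generated accessors; here the optional
// resource version is modelled as a flag plus a value.
struct UpdateSlaveMessage {
  bool hasResourceVersion = false;
  std::string resourceVersion;
};

// Hypothetical stand-in for the master's per-agent bookkeeping.
struct Agent {
  std::string resourceVersion;
};

// Applies the resource version from an update without asserting that it
// is present. The check is unconditional, i.e., it does not depend on
// the agent's capabilities. Returns true if the stored version changed
// hands, false if the message carried no version.
bool applyResourceVersion(Agent& agent, const UpdateSlaveMessage& message) {
  if (!message.hasResourceVersion) {
    // E.g., an update that only reports oversubscribed resources.
    return false;
  }

  agent.resourceVersion = message.resourceVersion;
  return true;
}
```

With this shape, an update that reports only oversubscribed resources leaves the stored version untouched, while an update that carries a version replaces it.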
Diffs
-----
src/master/master.cpp e082da8267fa22c26818c67bd6da573fe1808696
Diff: https://reviews.apache.org/r/64648/diff/1/
Testing
-------
Tested on a number of platforms and setups in internal CI.
Thanks,
Benjamin Bannier
Re: Review Request 64648: Fixed handling of resource versions in agent
oversubscribed updates.
Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64648/#review193932
-----------------------------------------------------------
Ship it!
- Jie Yu
Re: Review Request 64648: Fixed handling of resource versions in agent
oversubscribed updates.
Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64648/#review193912
-----------------------------------------------------------
FAIL: Some Mesos tests failed.
Reviews applied: `['64648']`
Failed command: `D:\DCOS\mesos\src\mesos-tests.exe --verbose`
All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64648
Relevant logs:
- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64648/logs/mesos-tests-stdout.log):
```
[----------] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[ OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2425 ms)
[----------] 1 test from IsolationFlag/CpuIsolatorTest (2450 ms total)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[ OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2374 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (2400 ms total)
[----------] Global test environment tear-down
[==========] 835 tests from 85 test cases ran. (319626 ms total)
[ PASSED ] 825 tests.
[ FAILED ] 10 tests, listed below:
[ FAILED ] OfferOperationStatusUpdateManagerTest.UpdateAndAckNonTerminalUpdate
[ FAILED ] OfferOperationStatusUpdateManagerTest.RecoverCheckpointedStream
[ FAILED ] OfferOperationStatusUpdateManagerTest.RecoverEmptyFile
[ FAILED ] OfferOperationStatusUpdateManagerTest.RecoverTerminatedStream
[ FAILED ] OfferOperationStatusUpdateManagerTest.IgnoreDuplicateUpdate
[ FAILED ] OfferOperationStatusUpdateManagerTest.IgnoreDuplicateUpdateAfterRecover
[ FAILED ] OfferOperationStatusUpdateManagerTest.RejectDuplicateAck
[ FAILED ] OfferOperationStatusUpdateManagerTest.RejectDuplicateAckAfterRecover
[ FAILED ] OfferOperationStatusUpdateManagerTest.NonStrictRecoveryCorruptedFile
[ FAILED ] OfferOperationStatusUpdateManagerTest.UpdateLatestWhenResending
10 FAILED TESTS
YOU HAVE 205 DISABLED TESTS
```
- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64648/logs/mesos-tests-stderr.log):
```
I1215 14:11:51.022387 6896 slave.cpp:3401] Shutting down framework 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-0000
I1215 14:11:51.022387 6896 slave.cpp:6109] Shutting down executor '891703b9-fd6f-4f81-b804-5d47fc675f0c' of framework 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-0000 at executor(1)@10.3.1.11:64964
I1215 14:11:50.339382 7592 exec.cpp:162] Version: 1.5.0
I1215 14:11:50.363382 3216 exec.cpp:237] Executor registered on agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0
I1215 14:11:50.366389 1476 executor.cpp:171] Received SUBSCRIBED event
I1215 14:11:50.371387 1476 executor.cpp:175] Subscribed executor on build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I1215 14:11:50.371387 1476 executor.cpp:171] Received LAUNCH event
I1215 14:11:50.375386 1476 executor.cpp:638] Starting task 891703b9-fd6f-4f81-b804-5d47fc675f0c
I1215 14:11:50.454388 1476 executor.cpp:478] Running 'D:\DCOS\mesos\src\mesos-containerizer.exe launch <POSSIBLY-SENSITIVE-DATA>'
I1215 14:11:50.997388 1476 executor.cpp:651] Forked command at 7272
I1215 14:11:51.025388 9772 exec.cpp:435] Executor asked to shutdown
I1215 14:11:51.025388 1476 executor.cpp:171] Received SHUTDOWN event
I1215 14:11:51.025388 1476 executor.cpp:748] Shutting down
I1215 14:11:51.025388 1476 executor.cpp:855] Sending SIGTERM to process tree at pid 7
I1215 14:11:51.023658 4796 master.cpp:10156] Updating the state of task 891703b9-fd6f-4f81-b804-5d47fc675f0c of framework 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-0000 (latest state: TASK_KILLED, status update state: TASK_KILLED)
I1215 14:11:51.024386 6896 slave.cpp:909] Agent terminating
W1215 14:11:51.024386 6896 slave.cpp:3397] Ignoring shutdown framework 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-0000 because it is terminating
I1215 14:11:51.025388 4796 master.cpp:10262] Removing task 891703b9-fd6f-4f81-b804-5d47fc675f0c with resources cpus(allocated: *):4; mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: *):[31000-32000] of framework 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-0000 on agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0 at slave(327)@10.3.1.11:64942 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1215 14:11:51.026387 8100 containerizer.cpp:2337] Destroying container 99967f28-5cd5-483e-b317-47b4dfbf6fb7 in RUNNING state
I1215 14:11:51.027398 8100 containerizer.cpp:2939] Transitioning the state of container 99967f28-5cd5-483e-b317-47b4dfbf6fb7 from RUNNING to DESTROYING
I1215 14:11:51.028406 4796 master.cpp:1305] Agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0 at slave(327)@10.3.1.11:64942 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I1215 14:11:51.028406 4796 master.cpp:3364] Disconnecting agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0 at slave(327)@10.3.1.11:64942 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1215 14:11:51.028406 4796 master.cpp:3383] Deactivating agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0 at slave(327)@10.3.1.11:64942 (build-srv-03.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1215 14:11:51.028406 8100 launcher.cpp:156] Asked to destroy container 99967f28-5cd5-483e-b317-47b4dfbf6fb7
I1215 14:11:51.029387 1120 hierarchical.cpp:344] Removed framework 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-0000
I1215 14:11:51.029387 1120 hierarchical.cpp:766] Agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0 deactivated
I1215 14:11:51.038398 8100 containerizer.cpp:2788] Container 99967f28-5cd5-483e-b317-47b4dfbf6fb7 has exited
I1215 14:11:51.068393 8880 master.cpp:1147] Master terminating
I1215 14:11:51.070389 4660 hierarchical.cpp:609] Removed agent 1f3c1bf2-6079-4fd6-937f-984bfd86e2ce-S0
I1215 14:11:51.386394 8640 process.cpp:887] Failed to accept socket: future discarded
```
- Mesos Reviewbot Windows