Posted to reviews@mesos.apache.org by Greg Mann <gr...@mesosphere.io> on 2017/12/08 23:46:15 UTC
Review Request 64464: Made master reconcile known offer operations
with agent.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64464/
-----------------------------------------------------------
Review request for mesos, Benjamin Bannier, Gaston Kleiman, and Jie Yu.
Bugs: MESOS-8195
https://issues.apache.org/jira/browse/MESOS-8195
Repository: mesos
Description
-------
In cases where the agent fails over, or where an `UpdateSlaveMessage`
races with an `ApplyOfferOperationMessage`, the master may know about
an offer operation that is not contained in an `UpdateSlaveMessage`.
In such cases, the master should send a `ReconcileOfferOperations`
message to the agent. The agent then responds by sending
`OFFER_OPERATION_DROPPED` status updates for any operations that it
does not know about.
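The reconciliation described above can be illustrated with a minimal
standalone sketch. It is not the code in this patch and does not use the
Mesos API: `OperationId`, `operationsToReconcile`, and `droppedOperations`
are hypothetical names, and operations are reduced to plain string IDs.
It assumes the master simply diffs the operations it tracks against those
listed in the agent's update.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for an operation identifier; not the Mesos type.
using OperationId = std::string;

// Master side (sketch): operations the master tracks for an agent that are
// absent from the agent's latest update. These are the ones the master
// would ask the agent to reconcile.
std::vector<OperationId> operationsToReconcile(
    const std::set<OperationId>& knownToMaster,
    const std::set<OperationId>& reportedByAgent)
{
  std::vector<OperationId> missing;
  std::set_difference(
      knownToMaster.begin(), knownToMaster.end(),
      reportedByAgent.begin(), reportedByAgent.end(),
      std::back_inserter(missing));
  return missing;
}

// Agent side (sketch): of the operations the master asked about, those the
// agent has no record of. Each would get a "dropped" status update.
std::vector<OperationId> droppedOperations(
    const std::vector<OperationId>& toReconcile,
    const std::set<OperationId>& knownToAgent)
{
  std::vector<OperationId> dropped;
  for (const OperationId& id : toReconcile) {
    if (knownToAgent.count(id) == 0) {
      dropped.push_back(id);
    }
  }
  return dropped;
}
```

In this sketch, a master tracking operations `a`, `b`, and `c` that
receives an update listing only `a` would ask the agent to reconcile `b`
and `c`; an agent that knows `a` and `b` would then report `c` as dropped.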
Diffs
-----
src/master/master.cpp b3e074cfe86600793310deb87932fa145e95055d
Diff: https://reviews.apache.org/r/64464/diff/1/
Testing
-------
make check
Thanks,
Greg Mann
Re: Review Request 64464: Made master reconcile known offer
operations with agent.
Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64464/#review193408
-----------------------------------------------------------
FAIL: Some Mesos tests failed.
Reviews applied: `['64457', '64458', '64462', '64463', '64464']`
Failed command: `D:\DCOS\mesos\src\mesos-tests.exe --verbose`
All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64464
Relevant logs:
- [mesos-tests-stdout.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64464/logs/mesos-tests-stdout.log):
```
[----------] 1 test from IsolationFlag/CpuIsolatorTest
[ RUN ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0
[ OK ] IsolationFlag/CpuIsolatorTest.ROOT_UserCpuUsage/0 (2318 ms)
[----------] 1 test from IsolationFlag/CpuIsolatorTest (2340 ms total)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest
[ RUN ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0
[ OK ] IsolationFlag/MemoryIsolatorTest.ROOT_MemUsage/0 (2267 ms)
[----------] 1 test from IsolationFlag/MemoryIsolatorTest (2289 ms total)
[----------] Global test environment tear-down
[==========] 825 tests from 84 test cases ran. (306961 ms total)
[ PASSED ] 815 tests.
[ FAILED ] 10 tests, listed below:
[ FAILED ] OfferOperationStatusUpdateManagerTest.UpdateAndAckNonTerminalUpdate
[ FAILED ] OfferOperationStatusUpdateManagerTest.RecoverCheckpointedStream
[ FAILED ] OfferOperationStatusUpdateManagerTest.RecoverEmptyFile
[ FAILED ] OfferOperationStatusUpdateManagerTest.RecoverTerminatedStream
[ FAILED ] OfferOperationStatusUpdateManagerTest.IgnoreDuplicateUpdate
[ FAILED ] OfferOperationStatusUpdateManagerTest.IgnoreDuplicateUpdateAfterRecover
[ FAILED ] OfferOperationStatusUpdateManagerTest.RejectDuplicateAck
[ FAILED ] OfferOperationStatusUpdateManagerTest.RejectDuplicateAckAfterRecover
[ FAILED ] OfferOperationStatusUpdateManagerTest.NonStrictRecoveryCorruptedFile
[ FAILED ] SlaveTest.ResourceProviderPublishAll
10 FAILED TESTS
YOU HAVE 204 DISABLED TESTS
```
- [mesos-tests-stderr.log](http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/64464/logs/mesos-tests-stderr.log):
```
I1211 17:58:04.948421 4308 executor.cpp:171] Received SUBSCRIBED event
I1211 17:58:04.952181 4308 executor.cpp:175] Subscribed executor on build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net
I1211 17:58:04.953182 4308 executor.cpp:171] Received LAUNCH event
I1211 17:58:04.956182 4308 executor.cpp:637] Starting task 61c1e9db-979d-4f24-a6e8-f4b778638371
I1211 17:58:05.028184 4308 executor.cpp:477] Running 'D:\DCOS\mesos\src\mesos-containerizer.exe launch <POSSIBLY-SENSITIVE-DATA>'
I1211 17:58:05.531175 4308 executor.cpp:650] Forked command at 6512
I1211 17:58:05.557174 2020 exec.cpp:435] Executor asked to shutdown
I1211 17:58:05.558176 4308 executor.cpp:171] Received SHUTDOWN event
I1211 17:58:05.558176 4308 executor.cpp:747] Shutting down
I1211 17:58:05.558176 4308 executor.cpp:854] Sending SIGTERM to process tree at pid 629-71bb7e019a9f@10.3.1.5:59670
I1211 17:58:05.556175 1640 hierarchical.cpp:405] Deactivated framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000
I1211 17:58:05.556175 8680 master.cpp:10115] Updating the state of task 61c1e9db-979d-4f24-a6e8-f4b778638371 of framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000 (latest state: TASK_KILLED, status update state: TASK_KILLED)
I1211 17:58:05.556175 8984 slave.cpp:3400] Shutting down framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000
I1211 17:58:05.556175 8984 slave.cpp:6091] Shutting down executor '61c1e9db-979d-4f24-a6e8-f4b778638371' of framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000 at executor(1)@10.3.1.5:59691
I1211 17:58:05.557174 8984 slave.cpp:909] Agent terminating
W1211 17:58:05.557174 8984 slave.cpp:3396] Ignoring shutdown framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000 because it is terminating
I1211 17:58:05.558176 8680 master.cpp:10221] Removing task 61c1e9db-979d-4f24-a6e8-f4b778638371 with resources cpus(allocated: *):4; mem(allocated: *):2048; disk(allocated: *):1024; ports(allocated: *):[31000-32000] of framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000 on agent b24356ce-7f3c-4f21-b78c-6f012fc8a020-S0 at slave(326)@10.3.1.5:59670 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1211 17:58:05.560175 6484 containerizer.cpp:2328] Destroying container 7f665881-5c1f-4b11-b437-f0f3c42c187c in RUNNING state
I1211 17:58:05.560175 6484 containerizer.cpp:2930] Transitioning the state of container 7f665881-5c1f-4b11-b437-f0f3c42c187c from RUNNING to DESTROYING
I1211 17:58:05.561177 8680 master.cpp:1310] Agent b24356ce-7f3c-4f21-b78c-6f012fc8a020-S0 at slave(326)@10.3.1.5:59670 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net) disconnected
I1211 17:58:05.561177 8680 master.cpp:3369] Disconnecting agent b24356ce-7f3c-4f21-b78c-6f012fc8a020-S0 at slave(326)@10.3.1.5:59670 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1211 17:58:05.561177 6484 launcher.cpp:156] Asked to destroy container 7f665881-5c1f-4b11-b437-f0f3c42c187c
I1211 17:58:05.561177 8680 master.cpp:3388] Deactivating agent b24356ce-7f3c-4f21-b78c-6f012fc8a020-S0 at slave(326)@10.3.1.5:59670 (build-srv-04.zq4gs31qjdiunm1ryi1452nvnh.dx.internal.cloudapp.net)
I1211 17:58:05.561177 8984 hierarchical.cpp:344] Removed framework b24356ce-7f3c-4f21-b78c-6f012fc8a020-0000
I1211 17:58:05.561177 8984 hierarchical.cpp:762] Agent b24356ce-7f3c-4f21-b78c-6f012fc8a020-S0 deactivated
I1211 17:58:05.653242 8680 containerizer.cpp:2779] Container 7f665881-5c1f-4b11-b437-f0f3c42c187c has exited
I1211 17:58:05.681249 8160 master.cpp:1152] Master terminating
I1211 17:58:05.683250 8680 hierarchical.cpp:605] Removed agent b24356ce-7f3c-4f21-b78c-6f012fc8a020-S0
I1211 17:58:05.957247 576 process.cpp:887] Failed to accept socket: future discarded
```
- Mesos Reviewbot Windows
Re: Review Request 64464: Made master reconcile known offer
operations with agent.
Posted by Gaston Kleiman <ga...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64464/#review193306
-----------------------------------------------------------
Ship it!
- Gaston Kleiman
Re: Review Request 64464: Made master reconcile known offer
operations with agent.
Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64464/#review193473
-----------------------------------------------------------
Ship it!
- Jie Yu