Posted to reviews@mesos.apache.org by Chun-Hung Hsiao <ch...@apache.org> on 2018/08/17 21:12:12 UTC

Review Request 68417: Ignored pre-existing CSI volumes known to SLRP.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68417/
-----------------------------------------------------------

Review request for mesos, Benjamin Bannier and Jie Yu.


Bugs: MESOS-9166
    https://issues.apache.org/jira/browse/MESOS-9166


Repository: mesos


Description
-------

If a pre-existing volume is known to the SLRP, i.e., the SLRP keeps a CSI
volume state checkpoint for the volume, but there is no corresponding
resource checkpoint, then the volume was created by a previous SLRP
instance, but the SLRP has since lost the state checkpoint (this
typically happens when the agent ID changes). In such a case, we should
not report the volume as an unmanaged pre-existing volume. For now we
simply ignore such a volume.
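
The rule described above can be sketched as a small classification step.
This is only an illustration under stated assumptions, not the code in
provider.cpp: the names `VolumeKind`, `classify` and `unmanagedPreexisting`
are hypothetical, and both kinds of checkpoint are modeled as plain sets of
volume IDs.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical classification of a volume listed by the CSI plugin.
enum class VolumeKind {
  MANAGED,     // A resource checkpoint exists for the volume.
  IGNORED,     // Only a volume state checkpoint exists; skipped for now.
  PREEXISTING  // Unknown to the SLRP; reported as unmanaged pre-existing.
};

// Assumption: checkpoints are represented as sets of volume IDs.
VolumeKind classify(
    const std::string& volumeId,
    const std::set<std::string>& volumeStateCheckpoints,
    const std::set<std::string>& resourceCheckpoints)
{
  if (resourceCheckpoints.count(volumeId) > 0) {
    return VolumeKind::MANAGED;
  }

  // Known to the SLRP but without a resource checkpoint: the volume was
  // created by a previous SLRP instance, so it is not reported.
  if (volumeStateCheckpoints.count(volumeId) > 0) {
    return VolumeKind::IGNORED;
  }

  return VolumeKind::PREEXISTING;
}

// Returns only the volumes that would be reported as unmanaged
// pre-existing volumes, preserving the order in which they were listed.
std::vector<std::string> unmanagedPreexisting(
    const std::vector<std::string>& listedVolumes,
    const std::set<std::string>& volumeStateCheckpoints,
    const std::set<std::string>& resourceCheckpoints)
{
  std::vector<std::string> result;
  for (const std::string& volumeId : listedVolumes) {
    if (classify(volumeId, volumeStateCheckpoints, resourceCheckpoints) ==
        VolumeKind::PREEXISTING) {
      result.push_back(volumeId);
    }
  }
  return result;
}
```

In this sketch a volume that has only a volume state checkpoint falls into
`IGNORED`, so it is neither treated as managed nor reported as pre-existing.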

See MESOS-9167 for future work.


Diffs
-----

  src/resource_provider/storage/provider.cpp fc48072aac531bac3cbffc3ba089b8dfa2a2f200 


Diff: https://reviews.apache.org/r/68417/diff/1/


Testing
-------

sudo make check


Thanks,

Chun-Hung Hsiao


Re: Review Request 68417: Ignored pre-existing CSI volumes known to SLRP.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68417/#review207567
-----------------------------------------------------------



FAIL: Some of the unit tests failed. Please check the relevant logs.

Reviews applied: `['68417']`

Failed command: `Start-MesosCITesting`

All the build artifacts available at: http://dcos-win.westus.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2191/mesos-review-68417

Relevant logs:

- [stout-tests.log](http://dcos-win.westus.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2191/mesos-review-68417/logs/stout-tests.log):

```
[ RUN      ] SocketTests.InitSocket
[       OK ] SocketTests.InitSocket (2 ms)
[ RUN      ] SocketTests.IntFD
[       OK ] SocketTests.IntFD (1 ms)
[----------] 2 tests from SocketTests (3 ms total)

[----------] 2 tests from StrerrorTest
[ RUN      ] StrerrorTest.ValidErrno
[       OK ] StrerrorTest.ValidErrno (0 ms)
[ RUN      ] StrerrorTest.InvalidErrno
[       OK ] StrerrorTest.InvalidErrno (0 ms)
[----------] 2 tests from StrerrorTest (0 ms total)

[----------] 2 tests from OsSendfileTest
[ RUN      ] OsSendfileTest.Sendfile
[       OK ] OsSendfileTest.Sendfile (3 ms)
[ RUN      ] OsSendfileTest.SendfileAsync
[       OK ] OsSendfileTest.SendfileAsync (14 ms)
[----------] 2 tests from OsSendfileTest (18 ms total)

[----------] Global test environment tear-down
[==========] 333 tests from 52 test cases ran. (8267 ms total)
[  PASSED  ] 332 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] ProcessTest.Pstree

 1 FAILED TEST
  YOU HAVE 1 DISABLED TEST

```

- [mesos-tests.log](http://dcos-win.westus.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2191/mesos-review-68417/logs/mesos-tests.log):

```

goroutine 18 [runnable]:
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:20
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x48
'
If recovery failed due to a change in configuration and you want to
keep the current agent id, you might want to change the
`--reconfiguration_policy` flag to a more permissive value.

To restart this agent with a new agent id instead, do as follows:
rm -f C:\Users\jenkins\AppData\Local\Temp\KLENJx\meta\slaves\latest
This ensures that the agent does not recover old live executors.

If you use the Docker containerizer and think that the Docker
daemon state is broken, you can try to clear it. 
d:\dcos\mesos\mesos\src\tests\mock_registrar.cpp(54): ERROR: this mock object (used in test ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskRunning/0) should be deleted but never is. Its address is @0000012227CFC1C0.
d:\dcos\mesos\mesos\3rdparty\libprocess\include\process\gmock.hpp(247): ERROR: this mock object (used in test ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskRunning/0) should be deleted but never is. Its address is @000001222B39CF28.
d:\dcos\mesos\mesos\src\tests\default_executor_tests.cpp(126): ERROR: this mock object (used in test ROOT_DOCKER_DockerAndMesosContainerizers/DefaultExecutorTest.TaskRunning/0) should be deleted but never is. Its address is @000001222B8AF010.
ERROR: 3 leaked mock objects found at program exit.
But be careful:
these commands will erase all containers and images from this host,
not just those started by Mesos!
docker kill $(docker ps -q)
docker rm $(docker ps -a -q)
docker rmi $(docker images -q)

Finally, restart the agent.
```

- Mesos Reviewbot Windows




Re: Review Request 68417: Ignored pre-existing CSI volumes known to SLRP.

Posted by Jie Yu <yu...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68417/#review207551
-----------------------------------------------------------


Ship it!





- Jie Yu

