Posted to reviews@aurora.apache.org by Joshua Cohen <jc...@apache.org> on 2016/04/29 18:22:50 UTC

Review Request 46835: Add client and scheduler support for launching tasks using the Mesos unified containerizer

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46835/
-----------------------------------------------------------

Review request for Aurora, John Sirois, Maxim Khutornenko, and Bill Farner.


Bugs: AURORA-1636, AURORA-1637, AURORA-1638, and AURORA-1639
    https://issues.apache.org/jira/browse/AURORA-1636
    https://issues.apache.org/jira/browse/AURORA-1637
    https://issues.apache.org/jira/browse/AURORA-1638
    https://issues.apache.org/jira/browse/AURORA-1639


Repository: aurora


Description
-------

A few notes:

1. It's not possible to configure Mesos 0.27.x to launch Docker tasks due to a bug in parsing the docker_store_dir flag. The bug is fixed in https://reviews.apache.org/r/43451/, but the fix has not been backported to Mesos 0.27. This means we can only launch tasks that use AppC images until we upgrade our Mesos dependency to 0.28.x. The good news is that I've confirmed launching tasks with Docker images *does* work by using Aurora linked against 0.27.x but running Mesos 0.28.x in Vagrant.
1. In order to work around the setuid issues (i.e. the task is launched as root, but the executor cannot setuid because the role user does not exist), I've mounted /etc/passwd and /etc/group into the container and added a new flag, `thermos_run_as_job_role`, to the scheduler. This flag is only used when launching a task with a filesystem image, and causes us to add `--execute-as-user <role from job key>` to the thermos executor command line.
1. The Mesos unified containerizer does not automatically create mount points in the filesystem from the image. It expects the full path to the mount to exist in the image. For /etc/passwd and /etc/group this is not a problem, but for the announcer ACL file it was. I ended up moving the announcer ACL file into its own directory and mounting that instead. In conjunction with this I also had to modify our http_example Dockerfile to explicitly create that mount point. A case could be made for sticking with the current path and just creating an empty file in the image, but I felt that creating an empty directory was slightly less gross. The Mesos limitation is tracked by https://issues.apache.org/jira/browse/MESOS-5229.
1. The AppC image for end-to-end tests is created by running [docker2aci](https://github.com/appc/docker2aci) on our http_example Docker image. The base box needed to be upgraded to add this utility. I haven't published the new base box yet, even though I've updated the Vagrantfile to point to version 6. Once this review has been approved and I'm sure there are no further changes that need to be made, I'll publish the base box before committing.
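
To make notes 2 and 3 concrete, here is a minimal sketch of the two behaviours they describe. It is not the code in this patch (the scheduler side lives in Java, in MesosTaskFactory); the function names, the `thermos_executor.pex` command, and the example paths are all hypothetical and chosen only for illustration. The first function appends `--execute-as-user` only when the task has a filesystem image and the flag is enabled; the second reports which mount targets are absent from an unpacked image root filesystem, since the unified containerizer expects them to exist already.

```python
import os
import tempfile
from typing import List


def executor_command(base_command: str, role: str, has_image: bool,
                     run_as_job_role: bool) -> str:
    """Hypothetical helper: add --execute-as-user only for image-based tasks
    when the (assumed boolean) thermos_run_as_job_role setting is on."""
    if has_image and run_as_job_role:
        return f"{base_command} --execute-as-user {role}"
    return base_command


def missing_mount_points(rootfs: str, container_paths: List[str]) -> List[str]:
    """Hypothetical helper: return the container paths that do not already
    exist inside the unpacked image root filesystem."""
    missing = []
    for path in container_paths:
        # Resolve the absolute container path relative to the image rootfs.
        target = os.path.join(rootfs, path.lstrip("/"))
        if not os.path.exists(target):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Illustrative values only; "www-data" stands in for the role from the job key.
    print(executor_command("thermos_executor.pex", "www-data", True, True))

    with tempfile.TemporaryDirectory() as rootfs:
        # Simulate an image that ships /etc/passwd but no announcer ACL directory.
        os.makedirs(os.path.join(rootfs, "etc"))
        open(os.path.join(rootfs, "etc", "passwd"), "w").close()
        print(missing_mount_points(rootfs, ["/etc/passwd", "/example/announcer-acl"]))
```

A check like the second function would flag the announcer ACL directory until the image's Dockerfile creates it, which is the situation note 3 works around.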


Diffs
-----

  RELEASE-NOTES.md 7a37d0d69f688bece624628fe5b98efc85d506a2 
  Vagrantfile 3f126ee348d0f95d6f159b62280de79f41e87e2e 
  build-support/packer/build.sh 76197c31c365aa3d8a67049da40b2976c1e25d22 
  docs/reference/configuration.md 9fcfdfcd9ab793e888ca2bba2035d5122142a5ab 
  docs/reference/scheduler-configuration.md d2262f79edfde23eccd87bae7f1cf319b63b1103 
  examples/vagrant/announcer-auth.json  
  examples/vagrant/mesos_config/etc_mesos-slave/appc_store_dir PRE-CREATION 
  examples/vagrant/mesos_config/etc_mesos-slave/image_providers PRE-CREATION 
  examples/vagrant/mesos_config/etc_mesos-slave/image_provisioner_backend PRE-CREATION 
  examples/vagrant/mesos_config/etc_mesos-slave/isolation PRE-CREATION 
  examples/vagrant/upstart/aurora-scheduler.conf 084016abc169ed82b7ed00f5d14aea2e0ff38a49 
  src/main/java/org/apache/aurora/scheduler/configuration/executor/ExecutorModule.java 32f2fa90b21189180e2bcd65a3cebf13f6551646 
  src/main/java/org/apache/aurora/scheduler/configuration/executor/ExecutorSettings.java 501e6431f21822d9816952377546586da02ce42a 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosTaskFactory.java b325106c7f45b1ad1657221aaa39e3a428719ab0 
  src/main/java/org/apache/aurora/scheduler/mesos/TestExecutorSettings.java 9aadcebf547bd1eb4b4e238507e27ae2b699f473 
  src/main/python/apache/aurora/config/schema/base.py 00be8747d70dbf1cb370f09536588f8602d8fcce 
  src/main/python/apache/aurora/config/thrift.py 928ca9313b2c2062a322ba80b504a09c55e5377f 
  src/main/python/apache/aurora/executor/common/sandbox.py 36f1eabedc3ae47b23d9ab2ac0ab7a576ea36fd7 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosTaskFactoryImplTest.java bf18d5d53f7eda62120299146a956fa0a0985f71 
  src/test/python/apache/aurora/config/test_thrift.py 7a076f0350ab2967abc6b8b7a2e5da0817926a56 
  src/test/python/apache/aurora/executor/common/test_sandbox.py bd402fc03c7790eab0198dd48414ad4de138e195 
  src/test/sh/org/apache/aurora/e2e/Dockerfile b2557b5a20cc369e31bd10ea92462bdb1879add7 
  src/test/sh/org/apache/aurora/e2e/http/http_example.aurora 2813b6c79e4d44007dde79a10e2c7c9e9c1cecd9 
  src/test/sh/org/apache/aurora/e2e/http/http_example_bad_healthcheck.aurora 0534c9e589d10c53b834850477f95ad15b50010e 
  src/test/sh/org/apache/aurora/e2e/http/http_example_updated.aurora b33e8f5cd95ce25ba0dc4c08da32783cecf1c44d 
  src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh eee6b4c62130567ecd5c32603feae88fce1c13a8 

Diff: https://reviews.apache.org/r/46835/diff/


Testing
-------

./gradlew build -Pq
e2e tests with new base box.


Thanks,

Joshua Cohen


Re: Review Request 46835: Add client and scheduler support for launching tasks using the Mesos unified containerizer

Posted by Maxim Khutornenko <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46835/#review131220
-----------------------------------------------------------



Following up on our offline conversation, it would be great to explore the feasibility of running the executor outside of the user image. This was one of the proposed [goals](http://markmail.org/message/g2xkh7nzzblokdgk) behind moving to the unified containerizer and is likely [already possible](https://github.com/apache/mesos/blob/master/docs/container-image.md#executor-dependencies-in-a-container-image) with Mesos. As it stands, there is not much that separates AppC adoption from the existing Docker implementation feature-wise. I am fine with proceeding with this patch as an interim solution, though, as long as we identify the follow-up work to explore the possibilities here.

- Maxim Khutornenko




Re: Review Request 46835: Add client and scheduler support for launching tasks using the Mesos unified containerizer

Posted by Aurora ReviewBot <wf...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/46835/#review131130
-----------------------------------------------------------


Ship it!




Master (450d881) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot

