Posted to user@mesos.apache.org by ajkf9uvxc ajkf9uvxc <aj...@yahoo.com> on 2018/02/02 21:30:25 UTC

Struggling with running Docker container on Windows agent

Hi,
I am trying to get a job in DCOS to run a docker container on a Windows agent machine. DCOS was installed using the AWS CF template here: https://downloads.dcos.io/dcos/stable/aws.html (single master).
The Windows agent is added:
C:\mesos\mesos\build\src\mesos-agent.exe --attributes=os:windows --containerizers=docker,mesos --hostname=10.19.10.206 --IP=10.19.10.206 --master=zk://10.22.1.94:2181/mesos --work_dir=c:\mesos\work_dir --launcher_dir=c:\mesos\mesos\build\src --log_dir=c:\mesos\logs

And a simple job works:

dcos.activestate.com -> Job -> New



{
  "id": "mywindowstest01",
  "labels": {},
  "run": {
    "cpus": 0.01,
    "mem": 128,
    "disk": 0,
    "cmd": "C:\\Windows\\System32\\cmd.exe /c echo helloworld > c:\\mesos\\work_dir\\helloworld2",
    "env": {},
    "placement": {
      "constraints": [
        {
          "attribute": "os",
          "operator": "EQ",
          "value": "windows"
        }
      ]
    },
    "artifacts": [],
    "maxLaunchDelay": 3600,
    "volumes": [],
    "restart": {
      "policy": "NEVER"
    }
  },
  "schedules": []
}

creates: "c:\\mesos\\work_dir\\helloworld2"

The Windows agent has Docker CE installed and is set to run Windows containers (I tried Linux containers as well and got the same problem, but for the purposes of this question let's stick to Windows containers).
I confirmed that it's possible to run a Windows container manually, directly on Windows 10, by starting PowerShell as Administrator and running:
docker run -ti microsoft/windowsservercore
and
docker run microsoft/windowsservercore
Both commands create a new container (verified with "docker ps"; in addition, I get a cmd.exe shell in the container for the first command).
Now the problem:
trying to run a container from DCOS does not work:

dcos job add a.json

with the json:

{  "id": "myattempt11",  "labels": {},  "run": {    "env": {},    "cpus": 1.00,    "mem": 512,    "disk": 1000,    "placement": {      "constraints": [        {          "attribute": "os",          "operator": "EQ",          "value": "windows"        }      ]    },    "artifacts": [],    "maxLaunchDelay": 3600,    "docker": {      "image": "microsoft/windowsservercore"    },    "restart": {      "policy": "NEVER"    }  },  "schedules": []}
Does not work:
# dcos job add a.json
# dcos job run myattempt11 
Run ID: 20180202203339zVpxc
The log on the Mesos Agent on Windows shows activity but not much information about the problem (see "TASK_FAILED" at the end below):
Log file created at: 2018/02/02 12:52:47
Running on machine: DESKTOP-JJK06UJ
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0202 12:52:47.330880  8388 logging.cpp:201] INFO level logging started!
I0202 12:52:47.335886  8388 main.cpp:365] Build: 2017-12-20 23:35:42 UTC by Anne S Bell
I0202 12:52:47.335886  8388 main.cpp:366] Version: 1.5.0
I0202 12:52:47.337895  8388 main.cpp:373] Git SHA: 327726d3c7272806c8f3c3b7479758c26e55fd43
I0202 12:52:47.358888  8388 resolver.cpp:69] Creating default secret resolver
I0202 12:52:47.574883  8388 containerizer.cpp:304] Using isolation { windows/cpu, filesystem/windows, windows/mem, environment_secret }
I0202 12:52:47.577883  8388 provisioner.cpp:299] Using default backend 'copy'
I0202 12:52:47.596886  3348 slave.cpp:262] Mesos agent started on (1)@10.19.10.206:5051
I0202 12:52:47.597883  3348 slave.cpp:263] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="C:\Users\activeit\AppData\Local\Temp\mesos\store\appc" --attributes="os:windows" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="//./pipe/docker_engine" --docker_stop_timeout="0ns" --docker_store_dir="C:\Users\activeit\AppData\Local\Temp\mesos\store\docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="C:\Users\activeit\AppData\Local\Temp\mesos\fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname="10.19.10.206" --hostname_lookup="true" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --ip="10.19.10.206" --isolation="windows/cpu,windows/mem" --launcher="windows" --launcher_dir="c:\mesos\mesos\build\src" --log_dir="c:\mesos\logs" --logbufsecs="0" --logging_level="INFO" --master="zk://10.22.1.94:2181/mesos" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --runtime_dir="C:\ProgramData\mesos\runtime" --sandbox_directory="C:\mesos\sandbox" --strict="true" --version="false" --work_dir="c:\mesos\work_dir" --zk_session_timeout="10secs"
I0202 12:52:47.604887  3348 slave.cpp:612] Agent resources: [{"name":"cpus","scalar":{"value":4.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":15290.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":470301.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
I0202 12:52:47.725885  3348 slave.cpp:620] Agent attributes: [ os=windows ]
I0202 12:52:47.727886  3348 slave.cpp:629] Agent hostname: 10.19.10.206
I0202 12:52:47.735886  7652 task_status_update_manager.cpp:181] Pausing sending task status updates
I0202 12:52:47.738890  4052 group.cpp:341] Group process (zookeeper-group(1)@10.19.10.206:5051) connected to ZooKeeper
I0202 12:52:47.739887  4052 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0202 12:52:47.740885  4052 group.cpp:419] Trying to create path '/mesos' in ZooKeeper
I0202 12:52:47.773885  5168 state.cpp:66] Recovering state from 'c:\mesos\work_dir\meta'
E0202 12:52:47.773885  3348 slave.cpp:1009] Failed to attach 'c:\mesos\logs\mesos-agent.exe.INFO' to virtual path '/slave/log': Failed to get realpath of 'c:\mesos\logs\mesos-agent.exe.INFO': Failed to get attributes for file 'c:\mesos\logs\mesos-agent.exe.INFO': The system cannot find the file specified.
I0202 12:52:47.774884  5168 state.cpp:724] No committed checkpointed resources found at 'c:\mesos\work_dir\meta\resources\resources.info'
I0202 12:52:47.779883  5168 state.cpp:110] Failed to find the latest agent from 'c:\mesos\work_dir\meta'
I0202 12:52:47.781888  3528 task_status_update_manager.cpp:207] Recovering task status update manager
I0202 12:52:47.782883  3348 docker.cpp:890] Recovering Docker containers
I0202 12:52:47.782883  7652 containerizer.cpp:674] Recovering containerizer
I0202 12:52:47.807888  3768 provisioner.cpp:495] Provisioner recovery complete
I0202 12:52:47.891667  5168 detector.cpp:152] Detected a new leader: (id='1171')
I0202 12:52:47.892666  7652 group.cpp:700] Trying to get '/mesos/json.info_0000001171' in ZooKeeper
I0202 12:52:47.970657  5168 zookeeper.cpp:262] A new leading master (UPID=master@10.22.1.94:5050) is detected
I0202 12:52:48.011252  7652 slave.cpp:6776] Finished recovery
I0202 12:52:48.020246  3768 task_status_update_manager.cpp:181] Pausing sending task status updates
I0202 12:52:48.020246  7652 slave.cpp:1055] New master detected at master@10.22.1.94:5050
I0202 12:52:48.021251  7652 slave.cpp:1099] No credentials provided. Attempting to register without authentication
I0202 12:52:48.023254  7652 slave.cpp:1110] Detecting new master
I0202 12:52:48.330085  4052 slave.cpp:1275] Registered with master master@10.22.1.94:5050; given agent ID a0664e60-846a-42d0-9586-cf97e997eba3-S0
I0202 12:52:48.331082  5168 task_status_update_manager.cpp:188] Resuming sending task status updates
I0202 12:52:48.348086  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
W0202 12:52:48.351085  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
I0202 12:52:48.356086  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
W0202 12:52:48.358086  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
I0202 12:52:48.359086  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
W0202 12:52:48.362089  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
I0202 12:52:48.363085  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
W0202 12:52:48.364082  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
I0202 12:52:48.365085  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
I0202 12:52:50.938498  7652 slave.cpp:1831] Got assigned task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:50.962504  7652 slave.cpp:2101] Authorizing task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:50.965504  3768 slave.cpp:2494] Launching task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:50.988512  3768 slave.cpp:8373] Launching executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}] in work directory 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85'
I0202 12:52:50.995501  3768 slave.cpp:3046] Launching container 74298e92-9700-486d-b211-a42e5fd0bf85 for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:51.010500  3768 slave.cpp:2580] Queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:51.017498  3348 docker.cpp:1144] Starting container '74298e92-9700-486d-b211-a42e5fd0bf85' for task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' (and executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88') of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:53.731667  1104 docker.cpp:784] Checkpointing pid 7732 to 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85\pids\forked.pid'
I0202 12:52:53.894371  4052 slave.cpp:4314] Got registration for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
I0202 12:52:53.911371  1104 docker.cpp:1627] Ignoring updating container 74298e92-9700-486d-b211-a42e5fd0bf85 because resources passed to update are identical to existing resources
I0202 12:52:53.914371  3768 slave.cpp:2785] Sending queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' to executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855
I0202 12:52:53.931371  7652 slave.cpp:4771] Handling status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
I0202 12:52:53.942371  5168 task_status_update_manager.cpp:328] Received task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:53.948371  5168 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:53.950371  1104 slave.cpp:5254] Forwarding the update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050
I0202 12:52:53.953371  1104 slave.cpp:5163] Sending acknowledgement for status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855
I0202 12:52:54.049816  3348 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:54.051817  3348 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:59.255755  4052 slave.cpp:4771] Handling status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
I0202 12:52:59.260759  4052 task_status_update_manager.cpp:328] Received task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:59.261757  4052 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:59.263756  5168 slave.cpp:5254] Forwarding the update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050
I0202 12:52:59.265756  5168 slave.cpp:5163] Sending acknowledgement for status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855
I0202 12:52:59.367189  7052 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:52:59.368187  7052 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:53:00.261153  4052 slave.cpp:5386] Got exited event for executor(1)@10.19.10.206:49855
I0202 12:53:00.471400  7052 docker.cpp:2415] Executor for container 74298e92-9700-486d-b211-a42e5fd0bf85 has exited
I0202 12:53:00.472362  7052 docker.cpp:2186] Destroying container 74298e92-9700-486d-b211-a42e5fd0bf85 in RUNNING state
I0202 12:53:00.474362  7052 docker.cpp:2236] Running docker stop on container 74298e92-9700-486d-b211-a42e5fd0bf85
I0202 12:53:00.477478  3348 slave.cpp:5795] Executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 exited with status 0
I0202 12:53:00.478476  3348 slave.cpp:5899] Cleaning up executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855
I0202 12:53:00.481472  4052 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99989026072889days in the future
I0202 12:53:00.483475  3528 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987866347259days in the future
I0202 12:53:00.484474  5168 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99999439265185days in the future
I0202 12:53:00.485474  5168 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987864033482days in the future
I0202 12:53:00.485474  3348 slave.cpp:6006] Cleaning up framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:53:00.486479  1104 task_status_update_manager.cpp:289] Closing task status update streams for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
I0202 12:53:00.487473  3768 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.9998786172days in the future
I0202 12:53:00.488477  3768 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.99987860557926days in the future
I0202 12:53:47.742332  7052 slave.cpp:6314] Current disk usage 24.73%. Max allowed age: 4.568714599279827days
I0202 12:54:01.675030  7052 slave.cpp:6222] Framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 seems to have exited. Ignoring registration timeout for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88'
I0202 12:54:03.169529  3348 slave.cpp:970] Received SIGUSR1 signal; unregistering and shutting down
I0202 12:54:03.170536  3348 slave.cpp:931] Agent terminating
I0202 12:54:03.199530  3308 process.cpp:887] Failed to accept socket: future discarded

In the DCOS web UI -> Jobs -> myattempt11 -> Run History there is also no information.

Are there any good troubleshooting tips, ideas on what to try, or more informative logs to look at for getting a Docker container to run on Windows using Mesos?

Are there any more suitable alternative orchestration tools to run Docker Windows containers in a cluster?

Re: RE: Struggling with running Docker container on Windows agent

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
 Awesome to hear it!

On 02/05/2018 3:30 pm, ajkf9uvxc ajkf9uvxc wrote: 

> After compiling the tip of master from 2018-02-02 on Windows and then doing the exact same steps as before IT WORKS NOW ! docker ps shows the started container. (In this case the network setting is "networks": [ { "mode": "container/bridge" } ] ) 
> 
> Thanks a lot everybody for your help! 
> 
> On Friday, February 2, 2018, 4:26:11 p.m. PST, Akash Gupta (EOSG) <ak...@microsoft.com> wrote: 
> 
> To summarize: 
> 
> If you update to the latest 1.5.x branch, then it will fix the docker $PATH issue, but you will still run into problems with running docker containers, because Mesos 1.5.x doesn't have the Windows docker network patches that the master Mesos branch has. A work around is to send a "network=nat" through the docker.parameters field in the json like this: 
> 
> "docker": { 
> 
> "parameters": [ 
> 
> { "key": "network", "value": "nat" } 
> 
> ] 
> 
> } 
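> 
> For concreteness, applied to the a.json from earlier in this thread, that workaround would presumably look something like this (just a sketch; everything except the "parameters" entry is copied from the original job definition): 
> 
> { 
>   "id": "myattempt11", 
>   "labels": {}, 
>   "run": { 
>     "env": {}, 
>     "cpus": 1.00, 
>     "mem": 512, 
>     "disk": 1000, 
>     "placement": { 
>       "constraints": [ 
>         { "attribute": "os", "operator": "EQ", "value": "windows" } 
>       ] 
>     }, 
>     "artifacts": [], 
>     "maxLaunchDelay": 3600, 
>     "docker": { 
>       "image": "microsoft/windowsservercore", 
>       "parameters": [ 
>         { "key": "network", "value": "nat" } 
>       ] 
>     }, 
>     "restart": { "policy": "NEVER" } 
>   }, 
>   "schedules": [] 
> } 
> 
> The parameters should just be passed straight through to the docker run command line, so only the network mode changes; nothing else about the job is affected. 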
> 
> If you update to the tip of master, then you should be able to run your job by adding the "networks": [ { "mode": "container/bridge" } ] field to your json. You need the network field because the default network setting in marathon is `HOST` mode, which is Linux only. 
> 
> FROM: ajkf9uvxc ajkf9uvxc [mailto:ajkf9uvxc@yahoo.com] 
> SENT: Friday, February 2, 2018 3:27 PM
> TO: Andrew Schwartzmeyer <an...@schwartzmeyer.com>
> CC: user@mesos.apache.org; ulrichb@activestate.com; Akash Gupta (EOSG) <ak...@microsoft.com>; Joseph Wu <jo...@mesosphere.io>
> SUBJECT: Re: Struggling with running Docker container on Windows agent 
> 
> Yes, I got the same result with Marathon after adding "networks": [ { "mode": "container/bridge" } ]. It sounds like there are multiple reasons for compiling a newer version, and the PATH issue you mentioned is the most likely fix that will solve the problem. 
> 
> Knowing what to do next is a big step further. I will tell you how it worked by mid next week. 
> 
> Thank you! 
> 
> On Friday, February 2, 2018, 2:50:34 p.m. PST, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote: 
> 
> Oh, geez, this is even simpler.
> 
> We'd temporarily broken the Docker containerizer in 1.5 when we fixed environment variables. You need at least commit 1b6f9e90f, where we fixed it. You don't have to move to the tip of master, we backported it (as af64bcb387) to the 1.5.x branch.
> 
> The bug was: https://issues.apache.org/jira/browse/MESOS-8443 [1]
> 
> commit 1b6f9e90f
> Author: Akash Gupta <ak...@microsoft.com>
> Date: Fri Jan 12 16:23:39 2018 -0800
> 
> Windows: Fixed docker executor `PATH` variable.
> 
> The `docker` executable is not usually installed in
> `os::host_default_path()` on Windows, so the Executor cannot find it.
> Now, before launching the Executor, the Agent finds the directory
> containing `docker` and prepends it to the `PATH` given to the Executor
> so that both the Executor and Agent use the same `docker`.
> 
> Review: https://reviews.apache.org/r/65147 [2]
> 
> Sorry about that!
> 
> Andy 
> 
> On 02/02/2018 2:33 pm, ajkf9uvxc ajkf9uvxc wrote: 
> 
> Thanks for all your replies. 
> 
> Here is the stderr requested by Andy (good to know about this log): 
> 
> I0202 12:52:53.865368 7140 exec.cpp:162] Version: 1.5.0 
> 
> I0202 12:52:53.911371 7684 exec.cpp:237] Executor registered on agent a0664e60-846a-42d0-9586-cf97e997eba3-S0 
> 
> I0202 12:52:53.915374 7192 executor.cpp:120] Registered docker executor on 10.19.10.206 
> 
> I0202 12:52:53.920373 548 executor.cpp:160] Starting task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 
> 
> I0202 12:52:59.252701 6752 executor.cpp:546] Failed to run docker container: Failed to create subprocess 'docker': Could not launch child process: Failed to call `CreateProcess`: docker -H npipe:////./pipe/docker_engine run --cpu-shares 1024 --memory 536870912 -e HOST=10.19.10.206 -e MARATHON_APP_DOCKER_IMAGE=microsoft/windowsservercore -e MARATHON_APP_ID=/myattempt11/20180202203339zVpxc -e MARATHON_APP_LABELS= -e MARATHON_APP_RESOURCE_CPUS=1.0 -e MARATHON_APP_RESOURCE_DISK=1000.0 -e MARATHON_APP_RESOURCE_MEM=512.0 -e MARATHON_APP_VERSION=1970-01-01T00:00:00.000Z -e MESOS_CONTAINER_NAME=mesos-74298e92-9700-486d-b211-a42e5fd0bf85 -e MESOS_SANDBOX=C:\mesos\sandbox -e MESOS_TASK_ID=myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 -v c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\docker\links\74298e92-9700-486d-b211-a42e5fd0bf85:C:\mesos\sandbox --net host --name mesos-74298e92-9700-486d-b211-a42e5fd0bf85 microsoft/windowsservercore\00: The system cannot find the file specified. 
> 
> I0202 12:53:00.255151 1164 process.cpp:887] Failed to accept socket: future discarded 
> 
> Note about that: the path c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\docker\links exists but it's empty. There is also no "C:\mesos\sandbox". I guess the error must be referring to one of those. 
> 
> I will try and see if I get it running by specifying "NAT". Hold on... 
> 
> In the end, the Windows boxes are supposed to build an application and won't run services so NAT is probably fine. 
> 
> On Friday, February 2, 2018, 2:13:51 p.m. PST, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote: 
> 
> You may want to build from the tip of master (e.g. 1.6), as the following change/fix went in:
> 
> commit 6b35c93ba
> Author: Akash Gupta <ak...@hotmail.com>
> Date: Wed Jan 17 13:51:44 2018 -0800
> 
> Windows: Mapped the Docker network info types.
> 
> The Network enum in DockerInfo is specific to Linux containers. `HOST`
> doesn't exist on Windows and `BRIDGE` is `NAT` on Windows. The current
> default docker network setting was always `HOST`, which broke the
> Windows docker executor. Now, if a specific network isn't given, the
> network mode will default to `HOST` on Linux agents and `NAT` on Windows
> agents. Also, `BRIDGE` mode will be translated to `NAT` on Windows.
> 
> Review: https://reviews.apache.org/r/63860/ [3]
> 
> Then you can set the network type to "BRIDGE" in Marathon, which on Windows is equivalent to (and will use) the Docker NAT network. See https://mesos.apache.org/documentation/latest/networking/#docker-containerizer [4] (but note that this applies to 1.6, not your current version, 1.5). 
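> 
> As a rough sketch (untested here), a minimal Marathon app definition using that setting might look like the following; the app id is made up, and apart from the "networks" and "docker" parts discussed in this thread the remaining fields are just ordinary Marathon app fields: 
> 
> { 
>   "id": "/windows-docker-test", 
>   "cpus": 1, 
>   "mem": 512, 
>   "instances": 1, 
>   "container": { 
>     "type": "DOCKER", 
>     "docker": { "image": "microsoft/windowsservercore" } 
>   }, 
>   "networks": [ { "mode": "container/bridge" } ], 
>   "constraints": [ [ "os", "LIKE", "windows" ] ] 
> } 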
> 
> On 02/02/2018 1:44 pm, Joseph Wu wrote: 
> 
> It doesn't look like you've set the network type to `NAT`. Marathon isn't aware (yet) that Docker on Windows does not support HOST/BRIDGE networks. 
> 
> On Fri, Feb 2, 2018 at 1:37 PM, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote: 
> 
> Hello,
> 
> Would you please provide me with the executor's stderr log? It can be found in the work directory on the agent, and it should give us a bit more information as to why it failed to start the task.
> 
> It'll be deeply nested, something like:
> 
> c:\mesos\work_dir\slaves\7dc02270-a4e1-4f59-9ad7-56bad5182ea4-S3\frameworks\eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000\executors\notepad.fcf078d1-084a-11e8-8f77-02421c3bc93c\runs\latest\stderr (and stdout) 
> 
> Thanks,
> 
> Andy 
> 
> On 02/02/2018 1:30 pm, ajkf9uvxc ajkf9uvxc wrote: 
> 
> Hi, 
> 
> I am trying to get a job in DCOS to run a docker container on a Windows agent machine. DCOS was installed using the AWS CF template here: https://downloads.dcos.io/dcos/stable/aws.html [5] (single master). 
> 
> The Windows agent is added: 
> 
> C:\mesos\mesos\build\src\mesos-agent.exe --attributes=os:windows --containerizers=docker,mesos --hostname=10.19.10.206 --IP=10.19.10.206 --master=zk://10.22.1.94:2181/mesos [6] --work_dir=c:\mesos\work_dir --launcher_dir=c:\mesos\mesos\build\src --log_dir=c:\mesos\logs 
> 
> And a simple job works: 
> 
> dcos.activestate.com [7] -> Job -> New 
> 
> { 
> 
> "id": "mywindowstest01", 
> 
> "labels": {}, 
> 
> "run": { 
> 
> "cpus": 0.01, 
> 
> "mem": 128, 
> 
> "disk": 0, 
> 
> "cmd": "C:\Windows\System32\cmd.exe /c echo helloworld > c:\mesos\work_dir\helloworld2", 
> 
> "env": {}, 
> 
> "placement": { 
> 
> "constraints": [ 
> 
> { 
> 
> "attribute": "os", 
> 
> "operator": "EQ", 
> 
> "value": "windows" 
> 
> } 
> 
> ] 
> 
> }, 
> 
> "artifacts": [], 
> 
> "maxLaunchDelay": 3600, 
> 
> "volumes": [], 
> 
> "restart": { 
> 
> "policy": "NEVER" 
> 
> } 
> 
> }, 
> 
> "schedules": [] 
> 
> } 
> 
> creates: "c:\mesos\work_dir\helloworld2" 
> 
> The Windows agent has DockerCE installed and is set to run Windows containers (tried with Linux containers as well and getting the same problem, but for the purpose of this question let's stick to Windows containers) 
> 
> I confirmed that it's possible to run a Windows container manually, directly on Windows 10 by starting a Powershell as Administrator and running: 
> 
> docker run -ti microsoft/windowsservercore 
> 
> and 
> 
> docker run microsoft/windowsservercore 
> 
> Both commands create a new container (verified with "docker ps" , besides I get a cmd.exe shell in the conatiner for the first command) 
> 
> Now the problem: 
> 
> trying to run a container from DCOS does not work: 
> 
> dcos job add a.json 
> 
> with the json: 
> 
> { 
> 
> "id": "myattempt11", 
> 
> "labels": {}, 
> 
> "run": { 
> 
> "env": {}, 
> 
> "cpus": 1.00, 
> 
> "mem": 512, 
> 
> "disk": 1000, 
> 
> "placement": { 
> 
> "constraints": [ 
> 
> { 
> 
> "attribute": "os", 
> 
> "operator": "EQ", 
> 
> "value": "windows" 
> 
> } 
> 
> ] 
> 
> }, 
> 
> "artifacts": [], 
> 
> "maxLaunchDelay": 3600, 
> 
> "docker": { 
> 
> "image": "microsoft/windowsservercore" 
> 
> }, 
> 
> "restart": { 
> 
> "policy": "NEVER" 
> 
> } 
> 
> }, 
> 
> "schedules": [] 
> 
> } 
> 
> Does not work: 
> 
> # dcos job add a.json 
> 
> # dcos job run myattempt11 
> 
> Run ID: 20180202203339zVpxc 
> 
> The log on the Mesos Agent on Windows shows activity but not much information about the problem (see "TASK_FAILED" at the end below): 
> 
> Log file created at: 2018/02/02 12:52:47 
> 
> Running on machine: DESKTOP-JJK06UJ 
> 
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg 
> 
> I0202 12:52:47.330880 8388 logging.cpp:201] INFO level logging started! 
> 
> I0202 12:52:47.335886 8388 main.cpp:365] Build: 2017-12-20 23:35:42 UTC by Anne S Bell 
> 
> I0202 12:52:47.335886 8388 main.cpp:366] Version: 1.5.0 
> 
> I0202 12:52:47.337895 8388 main.cpp:373] Git SHA: 327726d3c7272806c8f3c3b7479758c26e55fd43 
> 
> I0202 12:52:47.358888 8388 resolver.cpp:69] Creating default secret resolver 
> 
> I0202 12:52:47.574883 8388 containerizer.cpp:304] Using isolation { windows/cpu, filesystem/windows, windows/mem, environment_secret } 
> 
> I0202 12:52:47.577883 8388 provisioner.cpp:299] Using default backend 'copy' 
> 
> I0202 12:52:47.596886 3348 slave.cpp:262] Mesos agent started on (1)@10.19.10.206:5051 [8] 
> 
> I0202 12:52:47.597883 3348 slave.cpp:263] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="C:UsersactiveitAppDataLocalTempmesosstoreappc" --attributes="os:windows" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io [9]" --docker_remove_delay="6hrs" --docker_socket="//./pipe/docker_engine" --docker_stop_timeout="0ns" --docker_store_dir="C:UsersactiveitAppDataLocalTempmesosstoredocker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="C:UsersactiveitAppDataLocalTempmesosfetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname="10.19.10.206" --hostname_lookup="true" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --ip="10.19.10.206" --isolation="windows/cpu,windows/mem" --launcher="windows" --launcher_dir="c:mesosmesosbuildsrc" --log_dir="c:mesoslogs" --logbufsecs="0" --logging_level="INFO" --master="zk://10.22.1.94:2181/mesos [10]" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --runtime_dir="C:ProgramDatamesosruntime" --sandbox_directory="C:mesossandbox" --strict="true" --version="false" --work_dir="c:mesoswork_dir"
--zk_session_timeout="10secs" 
> 
> I0202 12:52:47.604887 3348 slave.cpp:612] Agent resources: [{"name":"cpus","scalar":{"value":4.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":15290.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":470301.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}] 
> 
> I0202 12:52:47.725885 3348 slave.cpp:620] Agent attributes: [ os=windows ] 
> 
> I0202 12:52:47.727886 3348 slave.cpp:629] Agent hostname: 10.19.10.206 
> 
> I0202 12:52:47.735886 7652 task_status_update_manager.cpp:181] Pausing sending task status updates 
> 
> I0202 12:52:47.738890 4052 group.cpp:341] Group process (zookeeper-group(1)@10.19.10.206:5051 [8]) connected to ZooKeeper 
> 
> I0202 12:52:47.739887 4052 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) 
> 
> I0202 12:52:47.740885 4052 group.cpp:419] Trying to create path '/mesos' in ZooKeeper 
> 
> I0202 12:52:47.773885 5168 state.cpp:66] Recovering state from 'c:mesoswork_dirmeta' 
> 
> E0202 12:52:47.773885 3348 slave.cpp:1009] Failed to attach 'c:mesoslogsmesos-agent.exe.INFO [11]' to virtual path '/slave/log': Failed to get realpath of 'c:mesoslogsmesos-agent.exe.INFO [11]': Failed to get attributes for file 'c:mesoslogsmesos-agent.exe.INFO [11]': The system cannot find the file specified. 
> 
> I0202 12:52:47.774884 5168 state.cpp:724] No committed checkpointed resources found at 'c:mesoswork_dirmetaresourcesresources.info [12]' 
> 
> I0202 12:52:47.779883 5168 state.cpp:110] Failed to find the latest agent from 'c:mesoswork_dirmeta' 
> 
> I0202 12:52:47.781888 3528 task_status_update_manager.cpp:207] Recovering task status update manager 
> 
> I0202 12:52:47.782883 3348 docker.cpp:890] Recovering Docker containers 
> 
> I0202 12:52:47.782883 7652 containerizer.cpp:674] Recovering containerizer 
> 
> I0202 12:52:47.807888 3768 provisioner.cpp:495] Provisioner recovery complete 
> 
> I0202 12:52:47.891667 5168 detector.cpp:152] Detected a new leader: (id='1171') 
> 
> I0202 12:52:47.892666 7652 group.cpp:700] Trying to get '/mesos/json.info_0000001171' in ZooKeeper 
> 
> I0202 12:52:47.970657 5168 zookeeper.cpp:262] A new leading master (UPID=master@10.22.1.94:5050 [13]) is detected 
> 
> I0202 12:52:48.011252 7652 slave.cpp:6776] Finished recovery 
> 
> I0202 12:52:48.020246 3768 task_status_update_manager.cpp:181] Pausing sending task status updates 
> 
> I0202 12:52:48.020246 7652 slave.cpp:1055] New master detected at master@10.22.1.94:5050 [13] 
> 
> I0202 12:52:48.021251 7652 slave.cpp:1099] No credentials provided. Attempting to register without authentication 
> 
> I0202 12:52:48.023254 7652 slave.cpp:1110] Detecting new master 
> 
> I0202 12:52:48.330085 4052 slave.cpp:1275] Registered with master master@10.22.1.94:5050 [14]; given agent ID a0664e60-846a-42d0-9586-cf97e997eba3-S0 
> 
> I0202 12:52:48.331082 5168 task_status_update_manager.cpp:188] Resuming sending task status updates 
> 
> I0202 12:52:48.348086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> 
> W0202 12:52:48.351085 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [14] 
> 
> I0202 12:52:48.356086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> 
> W0202 12:52:48.358086 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [14] 
> 
> I0202 12:52:48.359086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> 
> W0202 12:52:48.362089 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [14] 
> 
> I0202 12:52:48.363085 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> 
> W0202 12:52:48.364082 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [14] 
> 
> I0202 12:52:48.365085 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> 
> I0202 12:52:50.938498 7652 slave.cpp:1831] Got assigned task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:50.962504 7652 slave.cpp:2101] Authorizing task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:50.965504 3768 slave.cpp:2494] Launching task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:50.988512 3768 slave.cpp:8373] Launching executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}] in work directory 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' 
> 
> I0202 12:52:50.995501 3768 slave.cpp:3046] Launching container 74298e92-9700-486d-b211-a42e5fd0bf85 for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:51.010500 3768 slave.cpp:2580] Queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:51.017498 3348 docker.cpp:1144] Starting container '74298e92-9700-486d-b211-a42e5fd0bf85' for task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' (and executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88') of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:53.731667 1104 docker.cpp:784] Checkpointing pid 7732 to 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85pidsforked.pid' 
> 
> I0202 12:52:53.894371 4052 slave.cpp:4314] Got registration for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [15] 
> 
> I0202 12:52:53.911371 1104 docker.cpp:1627] Ignoring updating container 74298e92-9700-486d-b211-a42e5fd0bf85 because resources passed to update are identical to existing resources 
> 
> I0202 12:52:53.914371 3768 slave.cpp:2785] Sending queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' to executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855 [15] 
> 
> I0202 12:52:53.931371 7652 slave.cpp:4771] Handling status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [15] 
> 
> I0202 12:52:53.942371 5168 task_status_update_manager.cpp:328] Received task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:53.948371 5168 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:53.950371 1104 slave.cpp:5254] Forwarding the update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050 [14] 
> 
> I0202 12:52:53.953371 1104 slave.cpp:5163] Sending acknowledgement for status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855 [15] 
> 
> I0202 12:52:54.049816 3348 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:54.051817 3348 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:59.255755 4052 slave.cpp:4771] Handling status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [15] 
> 
> I0202 12:52:59.260759 4052 task_status_update_manager.cpp:328] Received task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:59.261757 4052 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:59.263756 5168 slave.cpp:5254] Forwarding the update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050 [14] 
> 
> I0202 12:52:59.265756 5168 slave.cpp:5163] Sending acknowledgement for status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855 [16] 
> 
> I0202 12:52:59.367189 7052 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:52:59.368187 7052 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:53:00.261153 4052 slave.cpp:5386] Got exited event for executor(1)@10.19.10.206:49855 [16] 
> 
> I0202 12:53:00.471400 7052 docker.cpp:2415] Executor for container 74298e92-9700-486d-b211-a42e5fd0bf85 has exited 
> 
> I0202 12:53:00.472362 7052 docker.cpp:2186] Destroying container 74298e92-9700-486d-b211-a42e5fd0bf85 in RUNNING state 
> 
> I0202 12:53:00.474362 7052 docker.cpp:2236] Running docker stop on container 74298e92-9700-486d-b211-a42e5fd0bf85 
> 
> I0202 12:53:00.477478 3348 slave.cpp:5795] Executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 exited with status 0 
> 
> I0202 12:53:00.478476 3348 slave.cpp:5899] Cleaning up executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855 [16] 
> 
> I0202 12:53:00.481472 4052 gc.cpp:90] Scheduling 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99989026072889days in the future 
> 
> I0202 12:53:00.483475 3528 gc.cpp:90] Scheduling 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987866347259days in the future 
> 
> I0202 12:53:00.484474 5168 gc.cpp:90] Scheduling 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99999439265185days in the future 
> 
> I0202 12:53:00.485474 5168 gc.cpp:90] Scheduling 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987864033482days in the future 
> 
> I0202 12:53:00.485474 3348 slave.cpp:6006] Cleaning up framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:53:00.486479 1104 task_status_update_manager.cpp:289] Closing task status update streams for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> 
> I0202 12:53:00.487473 3768 gc.cpp:90] Scheduling 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.9998786172days in the future 
> 
> I0202 12:53:00.488477 3768 gc.cpp:90] Scheduling 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.99987860557926days in the future 
> 
> I0202 12:53:47.742332 7052 slave.cpp:6314] Current disk usage 24.73%. Max allowed age: 4.568714599279827days 
> 
> I0202 12:54:01.675030 7052 slave.cpp:6222] Framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 seems to have exited. Ignoring registration timeout for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' 
> 
> I0202 12:54:03.169529 3348 slave.cpp:970] Received SIGUSR1 signal; unregistering and shutting down 
> 
> I0202 12:54:03.170536 3348 slave.cpp:931] Agent terminating 
> 
> I0202 12:54:03.199530 3308 process.cpp:887] Failed to accept socket: future discarded 
> 
> in DCOS web-ui -> Jobs -> myattempt11 -> Run History there is also no information. 
> 
> Are there any good troubleshooting tips / ideas what to try or where to find more informative logs to run a Docker container on Windows using Mesos? 
> 
> Are there any more suitable alternative orchestration tools to run Docker Windows containers in a cluster?
 

Links:
------
[1] https://issues.apache.org/jira/browse/MESOS-8443
[2] https://reviews.apache.org/r/65147
[3] https://reviews.apache.org/r/63860/
[4] https://mesos.apache.org/documentation/latest/networking/#docker-containerizer
[5] https://downloads.dcos.io/dcos/stable/aws.html
[6] http://10.22.1.94:2181/mesos
[7] http://dcos.activestate.com/
[8] http://10.19.10.206:5051/
[9] https://registry-1.docker.io/
[10] http://10.22.1.94:2181/mesos
[11] http://mesos-agent.exe.info/
[12] http://resources.info/
[13] http://master@10.22.1.94:5050/
[14] http://master@10.22.1.94:5050/
[15] http://10.19.10.206:49855/
[16] http://10.19.10.206:49855/

Re: RE: Struggling with running Docker container on Windows agent

Posted by ajkf9uvxc ajkf9uvxc <aj...@yahoo.com>.
After compiling the tip of master from 2018-02-02 on Windows and then doing the exact same steps as before, IT WORKS NOW! docker ps shows the started container. (In this case the network setting is "networks": [ { "mode": "container/bridge" } ].)
Thanks a lot everybody for your help!


    On Friday, February 2, 2018, 4:26:11 p.m. PST, Akash Gupta (EOSG) <ak...@microsoft.com> wrote:  
 
To summarize:
 
  
 
If you update to the latest 1.5.x branch, then it will fix the docker $PATH issue, but you will still run into problems with running docker containers, because Mesos 1.5.x doesn’t have the Windows docker network patches that the master Mesos branch has. A workaround is to send a “network=nat” through the docker.parameters field in the json like this:
 
  
 
  "docker": {
 
        "parameters": [
 
            { "key": "network", "value": "nat" }
 
        ]
 
    }
 
  
 
If you update to the tip of master, then you should be able to run your job by adding the "networks": [ { "mode": "container/bridge" } ] field to your json. You need the network field because the default network setting in Marathon is `HOST` mode, which is Linux only.
 
  
 
  
 
From: ajkf9uvxc ajkf9uvxc [mailto:ajkf9uvxc@yahoo.com]
Sent: Friday, February 2, 2018 3:27 PM
To: Andrew Schwartzmeyer <an...@schwartzmeyer.com>
Cc: user@mesos.apache.org; ulrichb@activestate.com; Akash Gupta (EOSG) <ak...@microsoft.com>; Joseph Wu <jo...@mesosphere.io>
Subject: Re: Struggling with running Docker container on Windows agent
 
  
 
Yes, I got the same result with Marathon after adding "networks": [ { "mode": "container/bridge" } ]. It sounds like there are multiple reasons for compiling a newer version, and the PATH issue you mentioned is the most likely fix that will solve the problem.
 
  
 
Knowing what to do next is a big step further. I will tell you how it worked by mid next week. 
 
  
 
Thank you!
 
  
 
  
 
  
 
On Friday, February 2, 2018, 2:50:34 p.m. PST, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote:
 
  
 
  
 
Oh, geez, this is even simpler.

We'd temporarily broken the Docker containerizer in 1.5 when we fixed environment variables. You need at least commit 1b6f9e90f, where we fixed it. You don't have to move to the tip of master; we backported the fix (as af64bcb387) to the 1.5.x branch.

The bug was: https://issues.apache.org/jira/browse/MESOS-8443

commit 1b6f9e90f
Author: Akash Gupta <ak...@microsoft.com>
Date:   Fri Jan 12 16:23:39 2018 -0800

    Windows: Fixed docker executor `PATH` variable.

    The `docker` executable is not usually installed in
    `os::host_default_path()` on Windows, so the Executor cannot find it.
    Now, before launching the Executor, the Agent finds the directory
    containing `docker` and prepends it to the `PATH` given to the Executor
    so that both the Executor and Agent use the same `docker`.

    Review: https://reviews.apache.org/r/65147

Sorry about that!

Andy
 
On 02/02/2018 2:33 pm, ajkf9uvxc ajkf9uvxc wrote:
 

 
 
Thanks for all your replies.
 
 
 
Here is the stderr requested by Andy (good to know about this log):
 
 
 
I0202 12:52:53.865368  7140 exec.cpp:162] Version: 1.5.0
 
I0202 12:52:53.911371  7684 exec.cpp:237] Executor registered on agent a0664e60-846a-42d0-9586-cf97e997eba3-S0
 
I0202 12:52:53.915374  7192 executor.cpp:120] Registered docker executor on 10.19.10.206
 
I0202 12:52:53.920373   548 executor.cpp:160] Starting task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88
 
I0202 12:52:59.252701  6752 executor.cpp:546] Failed to run docker container: Failed to create subprocess 'docker': Could not launch child process: Failed to call `CreateProcess`: docker -H npipe:////./pipe/docker_engine run --cpu-shares 1024 --memory 536870912 -e HOST=10.19.10.206 -e MARATHON_APP_DOCKER_IMAGE=microsoft/windowsservercore -e MARATHON_APP_ID=/myattempt11/20180202203339zVpxc -e MARATHON_APP_LABELS= -e MARATHON_APP_RESOURCE_CPUS=1.0 -e MARATHON_APP_RESOURCE_DISK=1000.0 -e MARATHON_APP_RESOURCE_MEM=512.0 -e MARATHON_APP_VERSION=1970-01-01T00:00:00.000Z -e MESOS_CONTAINER_NAME=mesos-74298e92-9700-486d-b211-a42e5fd0bf85 -e MESOS_SANDBOX=C:\mesos\sandbox -e MESOS_TASK_ID=myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 -v c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\docker\links\74298e92-9700-486d-b211-a42e5fd0bf85:C:\mesos\sandbox --net host --name mesos-74298e92-9700-486d-b211-a42e5fd0bf85 microsoft/windowsservercore\00: The system cannot find the file specified.
 
 
 
I0202 12:53:00.255151  1164 process.cpp:887] Failed to accept socket: future discarded
 
 
 
 
 
Note about that: the path c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\docker\links exists but it's empty. There also is no "C:\mesos\sandbox" . I guess the error must be referring to one of those.
 
 
 
I will try and see if I get it running by specifying "NAT". Hold on...
 
 
 
In the end, the Windows boxes are supposed to build an application and won't run services, so NAT is probably fine. 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
On Friday, February 2, 2018, 2:13:51 p.m. PST, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote:
 
 
 
 
 
You may want to build from the tip of master (e.g. 1.6), as the following change/fix went in:

commit 6b35c93ba
Author: Akash Gupta <ak...@hotmail.com>
Date:   Wed Jan 17 13:51:44 2018 -0800

    Windows: Mapped the Docker network info types.

    The Network enum in DockerInfo is specific to Linux containers. `HOST`
    doesn't exist on Windows and `BRIDGE` is `NAT` on Windows. The current
    default docker network setting was always `HOST`, which broke the
    Windows docker executor. Now, if a specific network isn't given, the
    network mode will default to `HOST` on Linux agents and `NAT` on Windows
    agents. Also, `BRIDGE` mode will be translated to `NAT` on Windows.

    Review: https://reviews.apache.org/r/63860/

Then you can set the network type to "BRIDGE" in Marathon, which on Windows is equivalent to (and will use) the Docker NAT network. See https://mesos.apache.org/documentation/latest/networking/#docker-containerizer (but note that it applies to 1.6, not your current version at 1.5).
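
For illustration only, a minimal, hypothetical Marathon app showing the classic way of setting the network type to "BRIDGE"; the app id and resource values are invented, and newer Marathon releases express the same thing with the top-level "networks": [ { "mode": "container/bridge" } ] array (shown near the top of this thread) instead of container.docker.network:

{
  "id": "/windows-servercore-test",
  "instances": 1,
  "cpus": 1,
  "mem": 512,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "microsoft/windowsservercore",
      "network": "BRIDGE"
    }
  }
}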
 
 
 
  
 
On 02/02/2018 1:44 pm, Joseph Wu wrote:
 

It doesn't look like you've set the network type to `NAT`.  Marathon isn't aware (yet) that Docker on Windows does not support HOST/BRIDGE networks.
 
  
 
On Fri, Feb 2, 2018 at 1:37 PM, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote:
 

Hello,

Would you please provide me with the executor's stderr log? This can be found in the work directory on the agent, it should give us a bit more information as to why it failed to start the task.

It'll be deeply nested, something like:

c:\mesos\work_dir\slaves\7dc02270-a4e1-4f59-9ad7-56bad5182ea4-S3\frameworks\eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000\executors\notepad.fcf078d1-084a-11e8-8f77-02421c3bc93c\runs\latest\stderr (and stdout)
 
 
 

Thanks,

Andy 
 
  
 
On 02/02/2018 1:30 pm, ajkf9uvxc ajkf9uvxc wrote:
 

Hi,
 
 
 
I am trying to get a job in DCOS to run a docker container on a Windows agent machine. DCOS was installed using the AWS CF template here: https://downloads.dcos.io/dcos/stable/aws.html (single master).
 
 
 
The Windows agent is added:
 
 
 
C:\mesos\mesos\build\src\mesos-agent.exe --attributes=os:windows --containerizers=docker,mesos --hostname=10.19.10.206 --IP=10.19.10.206 --master=zk://10.22.1.94:2181/mesos --work_dir=c:\mesos\work_dir --launcher_dir=c:\mesos\mesos\build\src --log_dir=c:\mesos\logs
 
 
 
And a simple job works:
 
 
 
dcos.activestate.com -> Job -> New
 
  
 
 
 
{
 
 "id": "mywindowstest01",
 
 "labels": {},
 
 "run": {
 
   "cpus": 0.01,
 
   "mem": 128,
 
   "disk": 0,
 
   "cmd": "C:\\Windows\\System32\\cmd.exe /c echo helloworld > c:\\mesos\\work_dir\\helloworld2",
 
   "env": {},
 
   "placement": {
 
     "constraints": [
 
       {
 
         "attribute": "os",
 
         "operator": "EQ",
 
         "value": "windows"
 
       }
 
     ]
 
   },
 
   "artifacts": [],
 
   "maxLaunchDelay": 3600,
 
   "volumes": [],
 
   "restart": {
 
     "policy": "NEVER"
 
   }
 
 },
 
 "schedules": []
 
}
 
 
 
creates: "c:\\mesos\\work_dir\\helloworld2"
 
 
 
 
 
The Windows agent has DockerCE installed and is set to run Windows containers (tried with Linux containers as well and getting the same problem, but for the purpose of this question let's stick to Windows containers)
 
 
 
I confirmed that it's possible to run a Windows container manually, directly on Windows 10 by starting a Powershell as Administrator and running:
 
 
 
docker run -ti microsoft/windowsservercore
 
and 
 
docker run microsoft/windowsservercore
 
 
 
Both commands create a new container (verified with "docker ps"; besides, I get a cmd.exe shell in the container for the first command)
 
 
 
Now the problem:
 
 
 
trying to run a container from DCOS does not work:
 
 
 
dcos job add a.json
 
 
 
with the json:
 
 
 
{
 
  "id": "myattempt11",
 
  "labels": {},
 
  "run": {
 
    "env": {},
 
    "cpus": 1.00,
 
    "mem": 512,
 
    "disk": 1000,
 
    "placement": {
 
      "constraints": [
 
        {
 
          "attribute": "os",
 
          "operator": "EQ",
 
          "value": "windows"
 
        }
 
      ]
 
    },
 
    "artifacts": [],
 
    "maxLaunchDelay": 3600,
 
    "docker": {
 
      "image": "microsoft/windowsservercore"
 
    },
 
    "restart": {
 
      "policy": "NEVER"
 
    }
 
  },
 
  "schedules": []
 
}
 
 
 
Does not work:
 
 
 
# dcos job add a.json
 
# dcos job run myattempt11 
 
Run ID: 20180202203339zVpxc
 
 
 
The log on the Mesos Agent on Windows shows activity but not much information about the problem (see "TASK_FAILED" at the end below):
 
 
 
Log file created at: 2018/02/02 12:52:47
 
Running on machine: DESKTOP-JJK06UJ
 
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
 
I0202 12:52:47.330880  8388 logging.cpp:201] INFO level logging started!
 
I0202 12:52:47.335886  8388 main.cpp:365] Build: 2017-12-20 23:35:42 UTC by Anne S Bell
 
I0202 12:52:47.335886  8388 main.cpp:366] Version: 1.5.0
 
I0202 12:52:47.337895  8388 main.cpp:373] Git SHA: 327726d3c7272806c8f3c3b7479758c26e55fd43
 
I0202 12:52:47.358888  8388 resolver.cpp:69] Creating default secret resolver
 
I0202 12:52:47.574883  8388 containerizer.cpp:304] Using isolation { windows/cpu, filesystem/windows, windows/mem, environment_secret }
 
I0202 12:52:47.577883  8388 provisioner.cpp:299] Using default backend 'copy'
 
I0202 12:52:47.596886  3348 slave.cpp:262] Mesos agent started on (1)@10.19.10.206:5051
 
I0202 12:52:47.597883  3348 slave.cpp:263] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="C:\Users\activeit\AppData\Local\Temp\mesos\store\appc" --attributes="os:windows" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io" --docker_remove_delay="6hrs" --docker_socket="//./pipe/docker_engine" --docker_stop_timeout="0ns" --docker_store_dir="C:\Users\activeit\AppData\Local\Temp\mesos\store\docker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs" --fetcher_cache_dir="C:\Users\activeit\AppData\Local\Temp\mesos\fetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname="10.19.10.206" --hostname_lookup="true" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --ip="10.19.10.206" --isolation="windows/cpu,windows/mem" --launcher="windows" --launcher_dir="c:\mesos\mesos\build\src" --log_dir="c:\mesos\logs" --logbufsecs="0" --logging_level="INFO" --master="zk://10.22.1.94:2181/mesos" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --runtime_dir="C:\ProgramData\mesos\runtime" --sandbox_directory="C:\mesos\sandbox" --strict="true" --version="false" --work_dir="c:\mesos\work_dir" --zk_session_timeout="10secs"
 
I0202 12:52:47.604887  3348 slave.cpp:612] Agent resources: [{"name":"cpus","scalar":{"value":4.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":15290.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":470301.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}]
 
I0202 12:52:47.725885  3348 slave.cpp:620] Agent attributes: [ os=windows ]
 
I0202 12:52:47.727886  3348 slave.cpp:629] Agent hostname: 10.19.10.206
 
I0202 12:52:47.735886  7652 task_status_update_manager.cpp:181] Pausing sending task status updates
 
I0202 12:52:47.738890  4052 group.cpp:341] Group process (zookeeper-group(1)@10.19.10.206:5051) connected to ZooKeeper
 
I0202 12:52:47.739887  4052 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
 
I0202 12:52:47.740885  4052 group.cpp:419] Trying to create path '/mesos' in ZooKeeper
 
I0202 12:52:47.773885  5168 state.cpp:66] Recovering state from 'c:\mesos\work_dir\meta'
 
E0202 12:52:47.773885  3348 slave.cpp:1009] Failed to attach 'c:\mesos\logs\mesos-agent.exe.INFO' to virtual path '/slave/log': Failed to get realpath of 'c:\mesos\logs\mesos-agent.exe.INFO': Failed to get attributes for file 'c:\mesos\logs\mesos-agent.exe.INFO': The system cannot find the file specified.
 
 
 
I0202 12:52:47.774884  5168 state.cpp:724] No committed checkpointed resources found at 'c:\mesos\work_dir\meta\resources\resources.info'
 
I0202 12:52:47.779883  5168 state.cpp:110] Failed to find the latest agent from 'c:\mesos\work_dir\meta'
 
I0202 12:52:47.781888  3528 task_status_update_manager.cpp:207] Recovering task status update manager
 
I0202 12:52:47.782883  3348 docker.cpp:890] Recovering Docker containers
 
I0202 12:52:47.782883  7652 containerizer.cpp:674] Recovering containerizer
 
I0202 12:52:47.807888  3768 provisioner.cpp:495] Provisioner recovery complete
 
I0202 12:52:47.891667  5168 detector.cpp:152] Detected a new leader: (id='1171')
 
I0202 12:52:47.892666  7652 group.cpp:700] Trying to get '/mesos/json.info_0000001171' in ZooKeeper
 
I0202 12:52:47.970657  5168 zookeeper.cpp:262] A new leading master (UPID=master@10.22.1.94:5050) is detected
 
I0202 12:52:48.011252  7652 slave.cpp:6776] Finished recovery
 
I0202 12:52:48.020246  3768 task_status_update_manager.cpp:181] Pausing sending task status updates
 
I0202 12:52:48.020246  7652 slave.cpp:1055] New master detected at master@10.22.1.94:5050
 
I0202 12:52:48.021251  7652 slave.cpp:1099] No credentials provided. Attempting to register without authentication
 
I0202 12:52:48.023254  7652 slave.cpp:1110] Detecting new master
 
I0202 12:52:48.330085  4052 slave.cpp:1275] Registered with master master@10.22.1.94:5050; given agent ID a0664e60-846a-42d0-9586-cf97e997eba3-S0
 
I0202 12:52:48.331082  5168 task_status_update_manager.cpp:188] Resuming sending task status updates
 
I0202 12:52:48.348086  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
 
W0202 12:52:48.351085  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
 
I0202 12:52:48.356086  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
 
W0202 12:52:48.358086  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
 
I0202 12:52:48.359086  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
 
W0202 12:52:48.362089  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
 
I0202 12:52:48.363085  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
 
W0202 12:52:48.364082  4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050
 
I0202 12:52:48.365085  4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk\/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true}
 
I0202 12:52:50.938498  7652 slave.cpp:1831] Got assigned task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:50.962504  7652 slave.cpp:2101] Authorizing task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:50.965504  3768 slave.cpp:2494] Launching task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:50.988512  3768 slave.cpp:8373] Launching executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}] in work directory 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85'
 
I0202 12:52:50.995501  3768 slave.cpp:3046] Launching container 74298e92-9700-486d-b211-a42e5fd0bf85 for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:51.010500  3768 slave.cpp:2580] Queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:51.017498  3348 docker.cpp:1144] Starting container '74298e92-9700-486d-b211-a42e5fd0bf85' for task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' (and executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88') of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:53.731667  1104 docker.cpp:784] Checkpointing pid 7732 to 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85\pids\forked.pid'
 
I0202 12:52:53.894371  4052 slave.cpp:4314] Got registration for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
 
I0202 12:52:53.911371  1104 docker.cpp:1627] Ignoring updating container 74298e92-9700-486d-b211-a42e5fd0bf85 because resources passed to update are identical to existing resources
 
I0202 12:52:53.914371  3768 slave.cpp:2785] Sending queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' to executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855
 
I0202 12:52:53.931371  7652 slave.cpp:4771] Handling status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
 
I0202 12:52:53.942371  5168 task_status_update_manager.cpp:328] Received task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:53.948371  5168 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:53.950371  1104 slave.cpp:5254] Forwarding the update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050
 
I0202 12:52:53.953371  1104 slave.cpp:5163] Sending acknowledgement for status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855
 
I0202 12:52:54.049816  3348 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:54.051817  3348 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:59.255755  4052 slave.cpp:4771] Handling status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
 
I0202 12:52:59.260759  4052 task_status_update_manager.cpp:328] Received task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:59.261757  4052 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:59.263756  5168 slave.cpp:5254] Forwarding the update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050
 
I0202 12:52:59.265756  5168 slave.cpp:5163] Sending acknowledgement for status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855
 
I0202 12:52:59.367189  7052 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:52:59.368187  7052 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:53:00.261153  4052 slave.cpp:5386] Got exited event for executor(1)@10.19.10.206:49855
 
I0202 12:53:00.471400  7052 docker.cpp:2415] Executor for container 74298e92-9700-486d-b211-a42e5fd0bf85 has exited
 
I0202 12:53:00.472362  7052 docker.cpp:2186] Destroying container 74298e92-9700-486d-b211-a42e5fd0bf85 in RUNNING state
 
I0202 12:53:00.474362  7052 docker.cpp:2236] Running docker stop on container 74298e92-9700-486d-b211-a42e5fd0bf85
 
I0202 12:53:00.477478  3348 slave.cpp:5795] Executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 exited with status 0
 
I0202 12:53:00.478476  3348 slave.cpp:5899] Cleaning up executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855
 
I0202 12:53:00.481472  4052 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99989026072889days in the future
 
I0202 12:53:00.483475  3528 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987866347259days in the future
 
I0202 12:53:00.484474  5168 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99999439265185days in the future
 
I0202 12:53:00.485474  5168 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987864033482days in the future
 
I0202 12:53:00.485474  3348 slave.cpp:6006] Cleaning up framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:53:00.486479  1104 task_status_update_manager.cpp:289] Closing task status update streams for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000
 
I0202 12:53:00.487473  3768 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.9998786172days in the future
 
I0202 12:53:00.488477  3768 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.99987860557926days in the future
 
I0202 12:53:47.742332  7052 slave.cpp:6314] Current disk usage 24.73%. Max allowed age: 4.568714599279827days
 
I0202 12:54:01.675030  7052 slave.cpp:6222] Framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 seems to have exited. Ignoring registration timeout for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88'
 
I0202 12:54:03.169529  3348 slave.cpp:970] Received SIGUSR1 signal; unregistering and shutting down
 
I0202 12:54:03.170536  3348 slave.cpp:931] Agent terminating
 
I0202 12:54:03.199530  3308 process.cpp:887] Failed to accept socket: future discarded
 
 
 
 
 
in DCOS web-ui -> Jobs -> myattempt11 -> Run History  there is also no information.
 
 
 
Are there any good troubleshooting tips / ideas what to try or where to find more informative logs to run a Docker container on Windows using Mesos? 
 
 
 
Are there any more suitable alternative orchestration tools to run Docker Windows containers in a cluster?
 



  


Re: Struggling with running Docker container on Windows agent

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
 You may want to build from the tip of master (e.g. 1.6), as the
following change/fix went in:

commit 6b35c93ba
Author: Akash Gupta <ak...@hotmail.com>
Date: Wed Jan 17 13:51:44 2018 -0800

 Windows: Mapped the Docker network info types.

 The Network enum in DockerInfo is specific to Linux containers. `HOST`
 doesn't exist on Windows and `BRIDGE` is `NAT` on Windows. The current
 default docker network setting was always `HOST`, which broke the
 Windows docker executor. Now, if a specific network isn't given, the
 network mode will default to `HOST` on Linux agents and `NAT` on
Windows
 agents. Also, `BRIDGE` mode will be translated to `NAT` on Windows.

 Review: https://reviews.apache.org/r/63860/

Then you can set the network type to "BRIDGE" in Marathon, which on
Windows is equivalent (and will use) the Docker NAT network. See
https://mesos.apache.org/documentation/latest/networking/#docker-containerizer
(but note that it applies to 1.6, not your current version at 1.5).

On 02/02/2018 1:44 pm, Joseph Wu wrote: 

> It doesn't look like you've set the network type to `NAT`. Marathon isn't aware (yet) that Docker on Windows does not support HOST/BRIDGE networks. 
> 
> On Fri, Feb 2, 2018 at 1:37 PM, Andrew Schwartzmeyer <an...@schwartzmeyer.com> wrote:
> 
> Hello,
> 
> Would you please provide me with the executor's stderr log? This can be found in the work directory on the agent, it should give us a bit more information as to why it failed to start the task.
> 
> It'll be deeply nested, something like:
> 
> c:\mesos\work_dir\slaves\7dc02270-a4e1-4f59-9ad7-56bad5182ea4-S3\frameworks\eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000\executors\notepad.fcf078d1-084a-11e8-8f77-02421c3bc93c\runs\latest\stderr (and stdout)
> 
> Thanks,
> 
> Andy 
> 
> On 02/02/2018 1:30 pm, ajkf9uvxc ajkf9uvxc wrote: 
> 
> Hi, 
> 
> I am trying to get a job in DCOS to run a docker container on a Windows agent machine. DCOS was installed using the AWS CF template here: https://downloads.dcos.io/dcos/stable/aws.html [1] (single master). 
> 
> The Windows agent is added: 
> 
> C:\mesos\mesos\build\src\mesos-agent.exe --attributes=os:windows --containerizers=docker,mesos --hostname=10.19.10.206 --IP=10.19.10.206 --master=zk://10.22.1.94:2181/mesos [2] --work_dir=c:\mesos\work_dir --launcher_dir=c:\mesos\mesos\build\src --log_dir=c:\mesos\logs 
> 
> And a simple job works: 
> 
> dcos.activestate.com [3] -> Job -> New 
> 
> { 
> 
> "id": "mywindowstest01", 
> 
> "labels": {}, 
> 
> "run": { 
> 
> "cpus": 0.01, 
> 
> "mem": 128, 
> 
> "disk": 0, 
> 
> "cmd": "C:\Windows\System32\cmd.exe /c echo helloworld > c:\mesos\work_dir\helloworld2", 
> 
> "env": {}, 
> 
> "placement": { 
> 
> "constraints": [ 
> 
> { 
> 
> "attribute": "os", 
> 
> "operator": "EQ", 
> 
> "value": "windows" 
> 
> } 
> 
> ] 
> 
> }, 
> 
> "artifacts": [], 
> 
> "maxLaunchDelay": 3600, 
> 
> "volumes": [], 
> 
> "restart": { 
> 
> "policy": "NEVER" 
> 
> } 
> 
> }, 
> 
> "schedules": [] 
> 
> } 
> 
> creates: "c:\mesos\work_dir\helloworld2" 
> 
> The Windows agent has DockerCE installed and is set to run Windows containers (tried with Linux containers as well and getting the same problem, but for the purpose of this question let's stick to Windows containers) 
> 
> I confirmed that it's possible to run a Windows container manually, directly on Windows 10 by starting a Powershell as Administrator and running: 
> 
> docker run -ti microsoft/windowsservercore 
> and 
> 
> docker run microsoft/windowsservercore 
> 
> Both commands create a new container (verified with "docker ps" , besides I get a cmd.exe shell in the conatiner for the first command) 
> 
> Now the problem: 
> 
> trying to run a container from DCOS does not work: 
> 
> dcos job add a.json 
> 
> with the json: 
> 
> { 
> "id": "myattempt11", 
> "labels": {}, 
> "run": { 
> "env": {}, 
> "cpus": 1.00, 
> "mem": 512, 
> "disk": 1000, 
> "placement": { 
> "constraints": [ 
> { 
> "attribute": "os", 
> "operator": "EQ", 
> "value": "windows" 
> } 
> ] 
> }, 
> "artifacts": [], 
> "maxLaunchDelay": 3600, 
> "docker": { 
> "image": "microsoft/windowsservercore" 
> }, 
> "restart": { 
> "policy": "NEVER" 
> } 
> }, 
> "schedules": [] 
> } 
> 
> Does not work: 
> 
> # dcos job add a.json 
> 
> # dcos job run myattempt11 
> Run ID: 20180202203339zVpxc 
> 
> The log on the Mesos Agent on Windows shows activity but not much information about the problem (see "TASK_FAILED" at the end below): 
> 
> Log file created at: 2018/02/02 12:52:47 
> Running on machine: DESKTOP-JJK06UJ 
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg 
> I0202 12:52:47.330880 8388 logging.cpp:201] INFO level logging started! 
> I0202 12:52:47.335886 8388 main.cpp:365] Build: 2017-12-20 23:35:42 UTC by Anne S Bell 
> I0202 12:52:47.335886 8388 main.cpp:366] Version: 1.5.0 
> I0202 12:52:47.337895 8388 main.cpp:373] Git SHA: 327726d3c7272806c8f3c3b7479758c26e55fd43 
> I0202 12:52:47.358888 8388 resolver.cpp:69] Creating default secret resolver 
> I0202 12:52:47.574883 8388 containerizer.cpp:304] Using isolation { windows/cpu, filesystem/windows, windows/mem, environment_secret } 
> I0202 12:52:47.577883 8388 provisioner.cpp:299] Using default backend 'copy' 
> I0202 12:52:47.596886 3348 slave.cpp:262] Mesos agent started on (1)@10.19.10.206:5051 [4] 
> I0202 12:52:47.597883 3348 slave.cpp:263] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="C:UsersactiveitAppDataLocalTempmesosstoreappc" --attributes="os:windows" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io [5]" --docker_remove_delay="6hrs" --docker_socket="//./pipe/docker_engine" --docker_stop_timeout="0ns" --docker_store_dir="C:UsersactiveitAppDataLocalTempmesosstoredocker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="C:UsersactiveitAppDataLocalTempmesosfetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname="10.19.10.206" --hostname_lookup="true" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --ip="10.19.10.206" --isolation="windows/cpu,windows/mem" --launcher="windows" --launcher_dir="c:mesosmesosbuildsrc" --log_dir="c:mesoslogs" --logbufsecs="0" --logging_level="INFO" --master="zk://10.22.1.94:2181/mesos [2]" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --runtime_dir="C:ProgramDatamesosruntime" --sandbox_directory="C:mesossandbox" --strict="true" --version="false" --work_dir="c:mesoswork_dir"
--zk_session_timeout="10secs" 
> I0202 12:52:47.604887 3348 slave.cpp:612] Agent resources: [{"name":"cpus","scalar":{"value":4.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":15290.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":470301.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}] 
> I0202 12:52:47.725885 3348 slave.cpp:620] Agent attributes: [ os=windows ] 
> I0202 12:52:47.727886 3348 slave.cpp:629] Agent hostname: 10.19.10.206 
> I0202 12:52:47.735886 7652 task_status_update_manager.cpp:181] Pausing sending task status updates 
> I0202 12:52:47.738890 4052 group.cpp:341] Group process (zookeeper-group(1)@10.19.10.206:5051 [4]) connected to ZooKeeper 
> I0202 12:52:47.739887 4052 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) 
> I0202 12:52:47.740885 4052 group.cpp:419] Trying to create path '/mesos' in ZooKeeper 
> I0202 12:52:47.773885 5168 state.cpp:66] Recovering state from 'c:mesoswork_dirmeta' 
> E0202 12:52:47.773885 3348 slave.cpp:1009] Failed to attach 'c:mesoslogsmesos-agent.exe.INFO [6]' to virtual path '/slave/log': Failed to get realpath of 'c:mesoslogsmesos-agent.exe.INFO [6]': Failed to get attributes for file 'c:mesoslogsmesos-agent.exe.INFO [6]': The system cannot find the file specified. 
> 
> I0202 12:52:47.774884 5168 state.cpp:724] No committed checkpointed resources found at 'c:mesoswork_dirmetaresourcesresources.info [7]' 
> I0202 12:52:47.779883 5168 state.cpp:110] Failed to find the latest agent from 'c:mesoswork_dirmeta' 
> I0202 12:52:47.781888 3528 task_status_update_manager.cpp:207] Recovering task status update manager 
> I0202 12:52:47.782883 3348 docker.cpp:890] Recovering Docker containers 
> I0202 12:52:47.782883 7652 containerizer.cpp:674] Recovering containerizer 
> I0202 12:52:47.807888 3768 provisioner.cpp:495] Provisioner recovery complete 
> I0202 12:52:47.891667 5168 detector.cpp:152] Detected a new leader: (id='1171') 
> I0202 12:52:47.892666 7652 group.cpp:700] Trying to get '/mesos/json.info_0000001171' in ZooKeeper 
> I0202 12:52:47.970657 5168 zookeeper.cpp:262] A new leading master (UPID=master@10.22.1.94:5050 [8]) is detected 
> I0202 12:52:48.011252 7652 slave.cpp:6776] Finished recovery 
> I0202 12:52:48.020246 3768 task_status_update_manager.cpp:181] Pausing sending task status updates 
> I0202 12:52:48.020246 7652 slave.cpp:1055] New master detected at master@10.22.1.94:5050 [8] 
> I0202 12:52:48.021251 7652 slave.cpp:1099] No credentials provided. Attempting to register without authentication 
> I0202 12:52:48.023254 7652 slave.cpp:1110] Detecting new master 
> I0202 12:52:48.330085 4052 slave.cpp:1275] Registered with master master@10.22.1.94:5050 [8]; given agent ID a0664e60-846a-42d0-9586-cf97e997eba3-S0 
> I0202 12:52:48.331082 5168 task_status_update_manager.cpp:188] Resuming sending task status updates 
> I0202 12:52:48.348086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.351085 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.356086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.358086 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.359086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.362089 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.363085 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.364082 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.365085 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> I0202 12:52:50.938498 7652 slave.cpp:1831] Got assigned task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:50.962504 7652 slave.cpp:2101] Authorizing task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:50.965504 3768 slave.cpp:2494] Launching task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:50.988512 3768 slave.cpp:8373] Launching executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}] in work directory 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' 
> I0202 12:52:50.995501 3768 slave.cpp:3046] Launching container 74298e92-9700-486d-b211-a42e5fd0bf85 for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:51.010500 3768 slave.cpp:2580] Queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:51.017498 3348 docker.cpp:1144] Starting container '74298e92-9700-486d-b211-a42e5fd0bf85' for task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' (and executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88') of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:53.731667 1104 docker.cpp:784] Checkpointing pid 7732 to 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85pidsforked.pid' 
> I0202 12:52:53.894371 4052 slave.cpp:4314] Got registration for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:53.911371 1104 docker.cpp:1627] Ignoring updating container 74298e92-9700-486d-b211-a42e5fd0bf85 because resources passed to update are identical to existing resources 
> I0202 12:52:53.914371 3768 slave.cpp:2785] Sending queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' to executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:53.931371 7652 slave.cpp:4771] Handling status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:53.942371 5168 task_status_update_manager.cpp:328] Received task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:53.948371 5168 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:53.950371 1104 slave.cpp:5254] Forwarding the update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050 [8] 
> I0202 12:52:53.953371 1104 slave.cpp:5163] Sending acknowledgement for status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:54.049816 3348 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:54.051817 3348 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.255755 4052 slave.cpp:4771] Handling status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:59.260759 4052 task_status_update_manager.cpp:328] Received task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.261757 4052 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.263756 5168 slave.cpp:5254] Forwarding the update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050 [8] 
> I0202 12:52:59.265756 5168 slave.cpp:5163] Sending acknowledgement for status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:59.367189 7052 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.368187 7052 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:53:00.261153 4052 slave.cpp:5386] Got exited event for executor(1)@10.19.10.206:49855 [9] 
> I0202 12:53:00.471400 7052 docker.cpp:2415] Executor for container 74298e92-9700-486d-b211-a42e5fd0bf85 has exited 
> I0202 12:53:00.472362 7052 docker.cpp:2186] Destroying container 74298e92-9700-486d-b211-a42e5fd0bf85 in RUNNING state 
> I0202 12:53:00.474362 7052 docker.cpp:2236] Running docker stop on container 74298e92-9700-486d-b211-a42e5fd0bf85 
> I0202 12:53:00.477478 3348 slave.cpp:5795] Executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 exited with status 0 
> I0202 12:53:00.478476 3348 slave.cpp:5899] Cleaning up executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855 [9] 
> I0202 12:53:00.481472 4052 gc.cpp:90] Scheduling 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99989026072889days in the future 
> I0202 12:53:00.483475 3528 gc.cpp:90] Scheduling 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987866347259days in the future 
> I0202 12:53:00.484474 5168 gc.cpp:90] Scheduling 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99999439265185days in the future 
> I0202 12:53:00.485474 5168 gc.cpp:90] Scheduling 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987864033482days in the future 
> I0202 12:53:00.485474 3348 slave.cpp:6006] Cleaning up framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:53:00.486479 1104 task_status_update_manager.cpp:289] Closing task status update streams for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:53:00.487473 3768 gc.cpp:90] Scheduling 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.9998786172days in the future 
> I0202 12:53:00.488477 3768 gc.cpp:90] Scheduling 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.99987860557926days in the future 
> I0202 12:53:47.742332 7052 slave.cpp:6314] Current disk usage 24.73%. Max allowed age: 4.568714599279827days 
> I0202 12:54:01.675030 7052 slave.cpp:6222] Framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 seems to have exited. Ignoring registration timeout for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' 
> I0202 12:54:03.169529 3348 slave.cpp:970] Received SIGUSR1 signal; unregistering and shutting down 
> I0202 12:54:03.170536 3348 slave.cpp:931] Agent terminating 
> I0202 12:54:03.199530 3308 process.cpp:887] Failed to accept socket: future discarded 
> 
> in DCOS web-ui -> Jobs -> myattempt11 -> Run History there is also no information. 
> 
> Are there any good troubleshooting tips / ideas what to try or where to find more informative logs to run a Docker container on Windows using Mesos? 
> 
> Are there any more suitable alternative orchestration tools to run Docker Windows containers in a cluster?
 

Links:
------
[1] https://downloads.dcos.io/dcos/stable/aws.html
[2] http://10.22.1.94:2181/mesos
[3] http://dcos.activestate.com/
[4] http://10.19.10.206:5051/
[5] https://registry-1.docker.io/
[6] http://mesos-agent.exe.info/
[7] http://resources.info/
[8] http://master@10.22.1.94:5050/
[9] http://10.19.10.206:49855/

Re: Struggling with running Docker container on Windows agent

Posted by Joseph Wu <jo...@mesosphere.io>.
It doesn't look like you've set the network type to `NAT`.  Marathon isn't
aware (yet) that Docker on Windows does not support HOST/BRIDGE networks.
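
One quick sanity check on the agent is to list the networks the Windows Docker
host actually provides; on a typical install running Windows containers the
default network is named "nat" (output varies by setup):

docker network ls
docker network inspect nat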

On Fri, Feb 2, 2018 at 1:37 PM, Andrew Schwartzmeyer <
andrew@schwartzmeyer.com> wrote:

> Hello,
>
> Would you please provide me with the executor's stderr log? This can be
> found in the work directory on the agent; it should give us a bit more
> information as to why it failed to start the task.
>
> It'll be deeply nested, something like:
>
> c:\mesos\work_dir\slaves\7dc02270-a4e1-4f59-9ad7-
> 56bad5182ea4-S3\frameworks\eb32cef4-c503-4ab7-85d4-
> 8d4577e6a3bf-0000\executors\notepad.fcf078d1-084a-11e8-
> 8f77-02421c3bc93c\runs\latest\stderr (and stdout)
>
>
> Thanks,
>
> Andy
>
> On 02/02/2018 1:30 pm, ajkf9uvxc ajkf9uvxc wrote:
>
> [...]
>

Re: Struggling with running Docker container on Windows agent

Posted by Andrew Schwartzmeyer <an...@schwartzmeyer.com>.
 Hello,

Would you please provide me with the executor's stderr log? This can be
found in the work directory on the agent; it should give us a bit more
information as to why it failed to start the task.

It'll be deeply nested, something like:

c:\mesos\work_dir\slaves\7dc02270-a4e1-4f59-9ad7-56bad5182ea4-S3\frameworks\eb32cef4-c503-4ab7-85d4-8d4577e6a3bf-0000\executors\notepad.fcf078d1-084a-11e8-8f77-02421c3bc93c\runs\latest\stderr
(and stdout)
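
If the run directory is hard to spot, one way to find it (assuming the
work_dir from your agent flags, c:\mesos\work_dir) is to list every file
named stderr beneath it from cmd.exe and open the newest one:

dir /s /b c:\mesos\work_dir\slaves\stderr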

Thanks,

Andy

On 02/02/2018 1:30 pm, ajkf9uvxc ajkf9uvxc wrote: 

> Hi, 
> 
> I am trying to get a job in DCOS to run a docker container on a Windows agent machine. DCOS was installed using the AWS CF template here: https://downloads.dcos.io/dcos/stable/aws.html [1] (single master). 
> 
> The Windows agent is added: 
> 
> C:mesosmesosbuildsrcmesos-agent.exe --attributes=os:windows --containerizers=docker,mesos --hostname=10.19.10.206 --IP=10.19.10.206 --master=zk://10.22.1.94:2181/mesos [2] --work_dir=c:mesoswork_dir --launcher_dir=c:mesosmesosbuildsrc --log_dir=c:mesoslogs 
> 
> And a simple job works: 
> 
> dcos.activestate.com [3] -> Job -> New 
> 
> { 
> 
> "id": "mywindowstest01", 
> 
> "labels": {}, 
> 
> "run": { 
> 
> "cpus": 0.01, 
> 
> "mem": 128, 
> 
> "disk": 0, 
> 
> "cmd": "C:\Windows\System32\cmd.exe /c echo helloworld > c:\mesos\work_dir\helloworld2", 
> 
> "env": {}, 
> 
> "placement": { 
> 
> "constraints": [ 
> 
> { 
> 
> "attribute": "os", 
> 
> "operator": "EQ", 
> 
> "value": "windows" 
> 
> } 
> 
> ] 
> 
> }, 
> 
> "artifacts": [], 
> 
> "maxLaunchDelay": 3600, 
> 
> "volumes": [], 
> 
> "restart": { 
> 
> "policy": "NEVER" 
> 
> } 
> 
> }, 
> 
> "schedules": [] 
> 
> } 
> 
> creates: "c:\mesos\work_dir\helloworld2" 
> 
> The Windows agent has DockerCE installed and is set to run Windows containers (tried with Linux containers as well and getting the same problem, but for the purpose of this question let's stick to Windows containers) 
> 
> I confirmed that it's possible to run a Windows container manually, directly on Windows 10 by starting a Powershell as Administrator and running: 
> 
> docker run -ti microsoft/windowsservercore 
> and 
> 
> docker run microsoft/windowsservercore 
> 
> Both commands create a new container (verified with "docker ps" , besides I get a cmd.exe shell in the conatiner for the first command) 
> 
> Now the problem: 
> 
> trying to run a container from DCOS does not work: 
> 
> dcos job add a.json 
> 
> with the json: 
> 
> { 
> "id": "myattempt11", 
> "labels": {}, 
> "run": { 
> "env": {}, 
> "cpus": 1.00, 
> "mem": 512, 
> "disk": 1000, 
> "placement": { 
> "constraints": [ 
> { 
> "attribute": "os", 
> "operator": "EQ", 
> "value": "windows" 
> } 
> ] 
> }, 
> "artifacts": [], 
> "maxLaunchDelay": 3600, 
> "docker": { 
> "image": "microsoft/windowsservercore" 
> }, 
> "restart": { 
> "policy": "NEVER" 
> } 
> }, 
> "schedules": [] 
> } 
> 
> Does not work: 
> 
> # dcos job add a.json 
> 
> # dcos job run myattempt11 
> Run ID: 20180202203339zVpxc 
> 
> The log on the Mesos Agent on Windows shows activity but not much information about the problem (see "TASK_FAILED" at the end below): 
> 
> Log file created at: 2018/02/02 12:52:47 
> Running on machine: DESKTOP-JJK06UJ 
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg 
> I0202 12:52:47.330880 8388 logging.cpp:201] INFO level logging started! 
> I0202 12:52:47.335886 8388 main.cpp:365] Build: 2017-12-20 23:35:42 UTC by Anne S Bell 
> I0202 12:52:47.335886 8388 main.cpp:366] Version: 1.5.0 
> I0202 12:52:47.337895 8388 main.cpp:373] Git SHA: 327726d3c7272806c8f3c3b7479758c26e55fd43 
> I0202 12:52:47.358888 8388 resolver.cpp:69] Creating default secret resolver 
> I0202 12:52:47.574883 8388 containerizer.cpp:304] Using isolation { windows/cpu, filesystem/windows, windows/mem, environment_secret } 
> I0202 12:52:47.577883 8388 provisioner.cpp:299] Using default backend 'copy' 
> I0202 12:52:47.596886 3348 slave.cpp:262] Mesos agent started on (1)@10.19.10.206:5051 [4] 
> I0202 12:52:47.597883 3348 slave.cpp:263] Flags at startup: --appc_simple_discovery_uri_prefix="http://" --appc_store_dir="C:UsersactiveitAppDataLocalTempmesosstoreappc" --attributes="os:windows" --authenticate_http_readonly="false" --authenticate_http_readwrite="false" --authenticatee="crammd5" --authentication_backoff_factor="1secs" --authorizer="local" --container_disk_watch_interval="15secs" --containerizers="docker,mesos" --default_role="*" --disk_watch_interval="1mins" --docker="docker" --docker_kill_orphans="true" --docker_registry="https://registry-1.docker.io [5]" --docker_remove_delay="6hrs" --docker_socket="//./pipe/docker_engine" --docker_stop_timeout="0ns" --docker_store_dir="C:UsersactiveitAppDataLocalTempmesosstoredocker" --docker_volume_checkpoint_dir="/var/run/mesos/isolators/docker/volume" --enforce_container_disk_quota="false" --executor_registration_timeout="1mins" --executor_reregistration_timeout="2secs" --executor_shutdown_grace_period="5secs"
--fetcher_cache_dir="C:UsersactiveitAppDataLocalTempmesosfetch" --fetcher_cache_size="2GB" --frameworks_home="" --gc_delay="1weeks" --gc_disk_headroom="0.1" --hadoop_home="" --help="false" --hostname="10.19.10.206" --hostname_lookup="true" --http_command_executor="false" --http_heartbeat_interval="30secs" --initialize_driver_logging="true" --ip="10.19.10.206" --isolation="windows/cpu,windows/mem" --launcher="windows" --launcher_dir="c:mesosmesosbuildsrc" --log_dir="c:mesoslogs" --logbufsecs="0" --logging_level="INFO" --master="zk://10.22.1.94:2181/mesos [2]" --max_completed_executors_per_framework="150" --oversubscribed_resources_interval="15secs" --port="5051" --qos_correction_interval_min="0ns" --quiet="false" --reconfiguration_policy="equal" --recover="reconnect" --recovery_timeout="15mins" --registration_backoff_factor="1secs" --runtime_dir="C:ProgramDatamesosruntime" --sandbox_directory="C:mesossandbox" --strict="true" --version="false" --work_dir="c:mesoswork_dir"
--zk_session_timeout="10secs" 
> I0202 12:52:47.604887 3348 slave.cpp:612] Agent resources: [{"name":"cpus","scalar":{"value":4.0},"type":"SCALAR"},{"name":"mem","scalar":{"value":15290.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":470301.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"type":"RANGES"}] 
> I0202 12:52:47.725885 3348 slave.cpp:620] Agent attributes: [ os=windows ] 
> I0202 12:52:47.727886 3348 slave.cpp:629] Agent hostname: 10.19.10.206 
> I0202 12:52:47.735886 7652 task_status_update_manager.cpp:181] Pausing sending task status updates 
> I0202 12:52:47.738890 4052 group.cpp:341] Group process (zookeeper-group(1)@10.19.10.206:5051 [4]) connected to ZooKeeper 
> I0202 12:52:47.739887 4052 group.cpp:831] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0) 
> I0202 12:52:47.740885 4052 group.cpp:419] Trying to create path '/mesos' in ZooKeeper 
> I0202 12:52:47.773885 5168 state.cpp:66] Recovering state from 'c:mesoswork_dirmeta' 
> E0202 12:52:47.773885 3348 slave.cpp:1009] Failed to attach 'c:mesoslogsmesos-agent.exe.INFO [6]' to virtual path '/slave/log': Failed to get realpath of 'c:mesoslogsmesos-agent.exe.INFO [6]': Failed to get attributes for file 'c:mesoslogsmesos-agent.exe.INFO [6]': The system cannot find the file specified. 
> 
> I0202 12:52:47.774884 5168 state.cpp:724] No committed checkpointed resources found at 'c:mesoswork_dirmetaresourcesresources.info [7]' 
> I0202 12:52:47.779883 5168 state.cpp:110] Failed to find the latest agent from 'c:mesoswork_dirmeta' 
> I0202 12:52:47.781888 3528 task_status_update_manager.cpp:207] Recovering task status update manager 
> I0202 12:52:47.782883 3348 docker.cpp:890] Recovering Docker containers 
> I0202 12:52:47.782883 7652 containerizer.cpp:674] Recovering containerizer 
> I0202 12:52:47.807888 3768 provisioner.cpp:495] Provisioner recovery complete 
> I0202 12:52:47.891667 5168 detector.cpp:152] Detected a new leader: (id='1171') 
> I0202 12:52:47.892666 7652 group.cpp:700] Trying to get '/mesos/json.info_0000001171' in ZooKeeper 
> I0202 12:52:47.970657 5168 zookeeper.cpp:262] A new leading master (UPID=master@10.22.1.94:5050 [8]) is detected 
> I0202 12:52:48.011252 7652 slave.cpp:6776] Finished recovery 
> I0202 12:52:48.020246 3768 task_status_update_manager.cpp:181] Pausing sending task status updates 
> I0202 12:52:48.020246 7652 slave.cpp:1055] New master detected at master@10.22.1.94:5050 [8] 
> I0202 12:52:48.021251 7652 slave.cpp:1099] No credentials provided. Attempting to register without authentication 
> I0202 12:52:48.023254 7652 slave.cpp:1110] Detecting new master 
> I0202 12:52:48.330085 4052 slave.cpp:1275] Registered with master master@10.22.1.94:5050 [8]; given agent ID a0664e60-846a-42d0-9586-cf97e997eba3-S0 
> I0202 12:52:48.331082 5168 task_status_update_manager.cpp:188] Resuming sending task status updates 
> I0202 12:52:48.348086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.351085 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.356086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.358086 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.359086 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.362089 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.363085 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> W0202 12:52:48.364082 4052 slave.cpp:1334] Already registered with master master@10.22.1.94:5050 [8] 
> I0202 12:52:48.365085 4052 slave.cpp:1352] Forwarding agent update {"offer_operations":{},"resource_version_uuid":{"value":"DEVEk/KOR5KLtmOgVG9qvw=="},"slave_id":{"value":"a0664e60-846a-42d0-9586-cf97e997eba3-S0"},"update_oversubscribed_resources":true} 
> I0202 12:52:50.938498 7652 slave.cpp:1831] Got assigned task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:50.962504 7652 slave.cpp:2101] Authorizing task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:50.965504 3768 slave.cpp:2494] Launching task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:50.988512 3768 slave.cpp:8373] Launching executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 with resources [{"allocation_info":{"role":"*"},"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"allocation_info":{"role":"*"},"name":"mem","scalar":{"value":32.0},"type":"SCALAR"}] in work directory 'c:mesoswork_dirslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85' 
> I0202 12:52:50.995501 3768 slave.cpp:3046] Launching container 74298e92-9700-486d-b211-a42e5fd0bf85 for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:51.010500 3768 slave.cpp:2580] Queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:51.017498 3348 docker.cpp:1144] Starting container '74298e92-9700-486d-b211-a42e5fd0bf85' for task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' (and executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88') of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:53.731667 1104 docker.cpp:784] Checkpointing pid 7732 to 'c:mesoswork_dirmetaslavesa0664e60-846a-42d0-9586-cf97e997eba3-S0frameworksca2eae6-8912-4f6a-984a-d501ac02ff88-0000executorsmyattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88runs74298e92-9700-486d-b211-a42e5fd0bf85pidsforked.pid' 
> I0202 12:52:53.894371 4052 slave.cpp:4314] Got registration for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:53.911371 1104 docker.cpp:1627] Ignoring updating container 74298e92-9700-486d-b211-a42e5fd0bf85 because resources passed to update are identical to existing resources 
> I0202 12:52:53.914371 3768 slave.cpp:2785] Sending queued task 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' to executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:53.931371 7652 slave.cpp:4771] Handling status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855 [9] 
> I0202 12:52:53.942371 5168 task_status_update_manager.cpp:328] Received task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:53.948371 5168 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:53.950371 1104 slave.cpp:5254] Forwarding the update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050
> I0202 12:52:53.953371 1104 slave.cpp:5163] Sending acknowledgement for status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855
> I0202 12:52:54.049816 3348 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:54.051817 3348 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_STARTING (Status UUID: ef5adc2f-6f66-44c3-bc98-7697c1315ebf) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.255755 4052 slave.cpp:4771] Handling status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 from executor(1)@10.19.10.206:49855
> I0202 12:52:59.260759 4052 task_status_update_manager.cpp:328] Received task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.261757 4052 task_status_update_manager.cpp:842] Checkpointing UPDATE for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.263756 5168 slave.cpp:5254] Forwarding the update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to master@10.22.1.94:5050
> I0202 12:52:59.265756 5168 slave.cpp:5163] Sending acknowledgement for status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 to executor(1)@10.19.10.206:49855
> I0202 12:52:59.367189 7052 task_status_update_manager.cpp:401] Received task status update acknowledgement (UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:52:59.368187 7052 task_status_update_manager.cpp:842] Checkpointing ACK for task status update TASK_FAILED (Status UUID: c0775c86-4f1b-44a6-ae8f-347486f6fa9f) for task myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88 of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:53:00.261153 4052 slave.cpp:5386] Got exited event for executor(1)@10.19.10.206:49855
> I0202 12:53:00.471400 7052 docker.cpp:2415] Executor for container 74298e92-9700-486d-b211-a42e5fd0bf85 has exited 
> I0202 12:53:00.472362 7052 docker.cpp:2186] Destroying container 74298e92-9700-486d-b211-a42e5fd0bf85 in RUNNING state 
> I0202 12:53:00.474362 7052 docker.cpp:2236] Running docker stop on container 74298e92-9700-486d-b211-a42e5fd0bf85 
> I0202 12:53:00.477478 3348 slave.cpp:5795] Executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 exited with status 0 
> I0202 12:53:00.478476 3348 slave.cpp:5899] Cleaning up executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' of framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 at executor(1)@10.19.10.206:49855
> I0202 12:53:00.481472 4052 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99989026072889days in the future
> I0202 12:53:00.483475 3528 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987866347259days in the future
> I0202 12:53:00.484474 5168 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88\runs\74298e92-9700-486d-b211-a42e5fd0bf85' for gc 6.99999439265185days in the future
> I0202 12:53:00.485474 5168 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000\executors\myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' for gc 6.99987864033482days in the future
> I0202 12:53:00.485474 3348 slave.cpp:6006] Cleaning up framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:53:00.486479 1104 task_status_update_manager.cpp:289] Closing task status update streams for framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 
> I0202 12:53:00.487473 3768 gc.cpp:90] Scheduling 'c:\mesos\work_dir\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.9998786172days in the future
> I0202 12:53:00.488477 3768 gc.cpp:90] Scheduling 'c:\mesos\work_dir\meta\slaves\a0664e60-846a-42d0-9586-cf97e997eba3-S0\frameworks\0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000' for gc 6.99987860557926days in the future
> I0202 12:53:47.742332 7052 slave.cpp:6314] Current disk usage 24.73%. Max allowed age: 4.568714599279827days 
> I0202 12:54:01.675030 7052 slave.cpp:6222] Framework 0ca2eae6-8912-4f6a-984a-d501ac02ff88-0000 seems to have exited. Ignoring registration timeout for executor 'myattempt11_20180202203339zVpxc.07298e1c-085b-11e8-bc6d-ae95ed0c8d88' 
> I0202 12:54:03.169529 3348 slave.cpp:970] Received SIGUSR1 signal; unregistering and shutting down 
> I0202 12:54:03.170536 3348 slave.cpp:931] Agent terminating 
> I0202 12:54:03.199530 3308 process.cpp:887] Failed to accept socket: future discarded 
> 
> In the DCOS web UI (Jobs -> myattempt11 -> Run History) there is also no information about the failure.
> 
> Are there any good troubleshooting tips, ideas of what to try, or places to find more informative logs for running a Docker container on Windows using Mesos?
> 
> Are there any alternative orchestration tools better suited to running Windows Docker containers in a cluster?
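
On the troubleshooting question: the most informative output is usually the executor sandbox on the Windows agent itself, because the Docker executor writes the output of its docker run invocation to the sandbox's stdout/stderr files. A minimal sketch, assuming the agent flags shown in the log above (--work_dir=c:\mesos\work_dir); the directory names under slaves\, frameworks\ and executors\ come from the IDs printed in the agent log:

    # On the Windows agent (PowerShell): print the most recently written
    # sandbox stderr file, which typically contains the docker CLI error.
    Get-ChildItem -Path C:\mesos\work_dir\slaves -Recurse -Filter stderr |
        Sort-Object LastWriteTime |
        Select-Object -Last 1 |
        Get-Content

    # Ask Docker directly whether the container started and why it exited.
    # Containers launched by the Mesos docker containerizer are named with
    # a "mesos-" prefix; the exact suffix depends on the Mesos version.
    docker ps -a --filter "name=mesos-"
    docker logs <container-name-from-previous-command>
    docker inspect --format "{{.State.ExitCode}} {{.State.Error}}" <container-name-from-previous-command>

From the DCOS side, "dcos task log --completed <task-id> stderr" should surface the same sandbox output, although I have not verified how reliably that works against Windows agents.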
 
