Posted to issues@mesos.apache.org by "Matt Christiansen (JIRA)" <ji...@apache.org> on 2015/02/05 09:27:34 UTC

[jira] [Comment Edited] (MESOS-1837) failed to determine cgroup for the 'cpu' subsystem

    [ https://issues.apache.org/jira/browse/MESOS-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306819#comment-14306819 ] 

Matt Christiansen edited comment on MESOS-1837 at 2/5/15 8:26 AM:
------------------------------------------------------------------

I am also getting this error:

E0205 00:07:42.387457  4456 slave.cpp:2344] Failed to update resources for container 8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207 of executor 8004091e-e93b-4415-6c7b-3e52d62abef9 running task 8004091e-e93b-4415-6c7b-3e52d62abef9 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/4519/cgroup: Failed to open file '/proc/4519/cgroup': No such file or directory

E0205 00:13:42.638646  4454 slave.cpp:2344] Failed to update resources for container 29890198-88e8-4809-91a1-590546bee1fd of executor 8005091e-e93b-4415-6c7b-3e52d62abef9 running task 8005091e-e93b-4415-6c7b-3e52d62abef9 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/4891/cgroup: Failed to open file '/proc/4891/cgroup': No such file or directory


The task does not end in TASK_KILLED or TASK_FAILED; it reports TASK_FINISHED, but the job never moves from Active Tasks to Completed Tasks, even though its resources are freed.

This is on CentOS 6.6, Mesos 0.21.1, and Docker 1.4.1 with a custom in-house framework. I have tried jobs with both custom executors and the built-in one; it made no difference.

This seems to happen every time I run a job, regardless of whether the job succeeds or fails.
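
For anyone trying to reproduce the message by hand, here is a rough Python sketch of the kind of lookup the error describes: read /proc/<pid>/cgroup and pick out the path for one subsystem. This is not the Mesos code; `parse_cgroup`, `cgroup_of` and the `proc_root` parameter are names I made up for illustration. It only relies on the kernel's "hierarchy-id:subsystems:path" line format. If the pid has already exited, the file is gone and the read fails, which matches the "No such file or directory" part of the error.

```python
from pathlib import Path
from typing import Optional


def parse_cgroup(text: str, subsystem: str) -> Optional[str]:
    """Return the cgroup path for `subsystem` from /proc/<pid>/cgroup content."""
    for line in text.splitlines():
        # Each line has the form "hierarchy-id:subsystem[,subsystem...]:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        if subsystem in parts[1].split(","):
            return parts[2]
    return None


def cgroup_of(pid: int, subsystem: str, proc_root: str = "/proc") -> str:
    """Look up the cgroup of a live process (hypothetical helper, not Mesos)."""
    path = Path(proc_root) / str(pid) / "cgroup"
    try:
        text = path.read_text()
    except FileNotFoundError as e:
        # The process has exited (or never existed), so /proc/<pid> is gone.
        raise RuntimeError(
            f"Failed to determine cgroup for the '{subsystem}' subsystem: "
            f"Failed to read {path}: {e.strerror}"
        ) from e
    cgroup = parse_cgroup(text, subsystem)
    if cgroup is None:
        raise RuntimeError(f"No '{subsystem}' subsystem listed in {path}")
    return cgroup
```

Calling `cgroup_of(pid, "cpu")` for a pid that no longer exists raises the RuntimeError from the first branch, whether or not the task itself finished cleanly.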

Here is the full log for task 8004091e-e93b-4415-6c7b-3e52d62abef9:

{noformat}
I0205 00:07:13.098376  4454 slave.cpp:1083] Got assigned task 8004091e-e93b-4415-6c7b-3e52d62abef9 for framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:13.114512  4454 slave.cpp:1193] Launching task 8004091e-e93b-4415-6c7b-3e52d62abef9 for framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:13.115538  4454 slave.cpp:3997] Launching executor 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 in work directory '/tmp/mesos/slaves/20150204-233402-654050058-5050-15891-S2/frameworks/20150204-233402-654050058-5050-15891-0000/executors/8004091e-e93b-4415-6c7b-3e52d62abef9/runs/8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207'
I0205 00:07:13.115799  4454 slave.cpp:1316] Queuing task '8004091e-e93b-4415-6c7b-3e52d62abef9' for executor 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework '20150204-233402-654050058-5050-15891-0000
I0205 00:07:13.118450  4452 docker.cpp:927] Starting container '8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207' for task '8004091e-e93b-4415-6c7b-3e52d62abef9' (and executor '8004091e-e93b-4415-6c7b-3e52d62abef9') of framework '20150204-233402-654050058-5050-15891-0000'
I0205 00:07:14.945565  4454 docker.cpp:633] Checkpointing pid 4527 to '/tmp/mesos/meta/slaves/20150204-233402-654050058-5050-15891-S2/frameworks/20150204-233402-654050058-5050-15891-0000/executors/8004091e-e93b-4415-6c7b-3e52d62abef9/runs/8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207/pids/forked.pid'
I0205 00:07:14.946475  4450 slave.cpp:2840] Monitoring executor '8004091e-e93b-4415-6c7b-3e52d62abef9' of framework '20150204-233402-654050058-5050-15891-0000' in container '8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207'
I0205 00:07:15.000608  4454 slave.cpp:1860] Got registration for executor '8004091e-e93b-4415-6c7b-3e52d62abef9' of framework 20150204-233402-654050058-5050-15891-0000 from executor(1)@10.3.0.112:53291
I0205 00:07:15.001273  4454 slave.cpp:1979] Flushing queued task 8004091e-e93b-4415-6c7b-3e52d62abef9 for executor '8004091e-e93b-4415-6c7b-3e52d62abef9' of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:15.010057  4457 slave.cpp:2215] Handling status update TASK_RUNNING (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 from executor(1)@10.3.0.112:53291
I0205 00:07:15.010246  4457 status_update_manager.cpp:317] Received status update TASK_RUNNING (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:15.010418  4457 status_update_manager.hpp:346] Checkpointing UPDATE for status update TASK_RUNNING (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:15.047556  4455 docker.cpp:1298] Updated 'cpu.shares' to 1126 at /cgroup/cpu/docker/a8d803bfb154165cfb0491fd7d17f66e2697e09f3908a89e2cf1e9e3647ab3e0 for container 8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207
I0205 00:07:15.047868  4455 docker.cpp:1333] Updated 'memory.soft_limit_in_bytes' to 4544MB for container 8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207
I0205 00:07:15.048305  4455 docker.cpp:1359] Updated 'memory.limit_in_bytes' to 4544MB at /cgroup/memory/docker/a8d803bfb154165cfb0491fd7d17f66e2697e09f3908a89e2cf1e9e3647ab3e0 for container 8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207
I0205 00:07:15.159543  4450 slave.cpp:2458] Forwarding the update TASK_RUNNING (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 to master@10.3.252.38:5050
I0205 00:07:15.159682  4450 slave.cpp:2391] Sending acknowledgement for status update TASK_RUNNING (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 to executor(1)@10.3.0.112:53291
I0205 00:07:15.162772  4452 status_update_manager.cpp:389] Received status update acknowledgement (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:15.162827  4452 status_update_manager.hpp:346] Checkpointing ACK for status update TASK_RUNNING (UUID: d4a33cd4-d09f-423a-8dce-31f485854e5d) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:42.387159  4450 slave.cpp:2215] Handling status update TASK_FINISHED (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 from executor(1)@10.3.0.112:53291
E0205 00:07:42.387457  4456 slave.cpp:2344] Failed to update resources for container 8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207 of executor 8004091e-e93b-4415-6c7b-3e52d62abef9 running task 8004091e-e93b-4415-6c7b-3e52d62abef9 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/4519/cgroup: Failed to open file '/proc/4519/cgroup': No such file or directory
I0205 00:07:42.398448  4456 status_update_manager.cpp:317] Received status update TASK_FINISHED (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:42.398532  4456 status_update_manager.hpp:346] Checkpointing UPDATE for status update TASK_FINISHED (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:42.398494  4457 docker.cpp:1501] Destroying container '8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207'
I0205 00:07:42.398794  4457 docker.cpp:1593] Running docker stop on container '8ac5e3d1-edc9-4d74-8658-9b0b4fd6f207'
I0205 00:07:42.508761  4450 slave.cpp:2458] Forwarding the update TASK_FINISHED (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 to master@10.3.252.38:5050
I0205 00:07:42.508932  4450 slave.cpp:2391] Sending acknowledgement for status update TASK_FINISHED (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000 to executor(1)@10.3.0.112:53291
I0205 00:07:42.516126  4454 status_update_manager.cpp:389] Received status update acknowledgement (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
I0205 00:07:42.516222  4454 status_update_manager.hpp:346] Checkpointing ACK for status update TASK_FINISHED (UUID: d8a4a00c-8c91-4311-af16-f3dc5f7cd8f9) for task 8004091e-e93b-4415-6c7b-3e52d62abef9 of framework 20150204-233402-654050058-5050-15891-0000
{noformat}
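
One thing that stands out in the log above is that the pid being checkpointed and the pid in the failed /proc read are not the same. A small Python sketch for pulling both out of a slave log, in case it helps compare other runs. The function name and the two regexes are my own, written against the message wording shown above, and it only looks at the first match of each.

```python
import re
from typing import Optional, Tuple

# Patterns written against the log wording in this comment (an assumption:
# other Mesos versions may phrase these messages differently).
CHECKPOINT_RE = re.compile(r"Checkpointing pid (\d+) to")
FAILED_READ_RE = re.compile(r"Failed to read /proc/(\d+)/cgroup")


def pids_in_log(log: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (checkpointed pid, pid whose cgroup file could not be read)."""
    checkpointed = CHECKPOINT_RE.search(log)
    failed = FAILED_READ_RE.search(log)
    return (
        int(checkpointed.group(1)) if checkpointed else None,
        int(failed.group(1)) if failed else None,
    )
```

Run on the log above, this gives (4527, 4519). The log alone does not say why the two differ, so I would treat that only as a pointer for further digging.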


was (Author: nikore): the same comment as above, with the log posted without the {noformat} markers.


> failed to determine cgroup for the 'cpu' subsystem
> --------------------------------------------------
>
>                 Key: MESOS-1837
>                 URL: https://issues.apache.org/jira/browse/MESOS-1837
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 0.20.1
>         Environment: Ubuntu 14.04
>            Reporter: Chris Fortier
>            Assignee: Timothy Chen
>
> Attempting to launch Docker container with Marathon. Container is launched then fails. 
> A search of /var/log/syslog reveals:
> Sep 27 03:01:43 vagrant-ubuntu-trusty-64 mesos-slave[1409]: E0927 03:01:43.546957  1463 slave.cpp:2205] Failed to update resources for container 8c2429d9-f090-4443-8108-0206ca37f3fd of executor hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 running task hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/9792/cgroup: Failed to open file '/proc/9792/cgroup': No such file or directory


