Posted to dev@mesos.apache.org by 王瑜 <wa...@nfs.iscas.ac.cn> on 2013/05/15 09:00:32 UTC

Tasks always lost when running hadoop test!

Hi Ben,

I think the problem is that Mesos found the executor at hdfs://master/user/mesos/hadoop.tar.gz but did not download it, and so never used it.
Because Mesos found the executor, it did not report an error; it simply updated the task status to lost. But since it never used the executor, the executor directory contains nothing!

But I am not very familiar with the source code, so I do not know why Mesos cannot use the executor, and I am not sure whether my analysis is right. Thanks very much for your help!
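
One way to check this guess is to try the fetch by hand from a slave. A rough sketch only (it assumes the Hadoop client is on the PATH of the user running mesos-slave, and that the executor URI configured for the framework is exactly the one above):

hadoop fs -ls hdfs://master/user/mesos/hadoop.tar.gz      # is the file there and readable by this user?
hadoop fs -get hdfs://master/user/mesos/hadoop.tar.gz /tmp/hadoop-check.tar.gz
tar tzf /tmp/hadoop-check.tar.gz | head                   # does it unpack, and does it contain hadoop/bin/mesos-executor?

If any of these fail, the slave's fetch would fail the same way and the executor directory would stay empty.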




Wang Yu

From: 王瑜
Sent: 2013-05-15 11:04
To: mesos-dev
Cc: Benjamin Mahler
Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060
Hi, Ben,

I have rerun the test and checked the log directory again; it is still empty, the same as below.
I think there is a problem with my executor, but I do not know how to make the executor work. The logs are below...
"Asked to update resources for an unknown/killed executor": why does it always kill the executor?

1. I looked into all of the executor directories, but all of them are empty. I do not know what happened to them...
[root@slave1 logs]# cd /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c
[root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls
[root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l
total 0
[root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a
.  ..
[root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]#
2. I added "--isolation=cgroups" to the slaves, but it still does not work. Tasks are always lost. There is no error any more, but I still do not know what happened to the executor... The logs from one slave are below (a way to replay the executor launch by hand is sketched after the log excerpt). Please help me, thanks very much!

mesos-slave.INFO
Log file created at: 2013/05/13 09:12:54
Running on machine: slave1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator
I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by root
I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave
I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 1)@192.168.0.3:36668
I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24; mem=63356; ports=[31000-32000]; disk=29143
I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as cgroups hierarchy root
I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at master@192.168.0.2:5050
I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file '/home/mesos/build/logs/mesos-slave.INFO'
I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master detected at master@192.168.0.2:5050
I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering isolator
I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery
I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master; given slave ID 201305130913-33597632-5050-3893-0
I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal
I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal
I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal
I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal
I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal
I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal
I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal
I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task Task_Tracker_0 for framework 201305130913-33597632-5050-3893-0000
I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:23.059368 24190 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
I0513 09:16:23.060478 24190 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:23.061101 24190 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:23.061101 24190 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:23.061807 24190 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:23.063297 24190 cgroups_isolator.cpp:555] Forked executor at = 24552
I0513 09:16:29.055598 24190 slave.cpp:587] Got assigned task Task_Tracker_1 for framework 201305130913-33597632-5050-3893-0000
I0513 09:16:29.058297 24190 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
I0513 09:16:29.059012 24203 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
I0513 09:16:29.059865 24200 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_1 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:29.061282 24200 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
I0513 09:16:29.062208 24200 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:29.062940 24200 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:29.063705 24200 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:29.065239 24200 cgroups_isolator.cpp:555] Forked executor at = 24628
I0513 09:16:34.457746 24188 cgroups_isolator.cpp:806] Executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 terminated with status 256
I0513 09:16:34.457909 24188 cgroups_isolator.cpp:635] Killing executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.459873 24188 cgroups_isolator.cpp:1025] OOM notifier is triggered for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 with uuid 6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:34.460028 24188 cgroups_isolator.cpp:1030] Discarded OOM notifier for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 with uuid 6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:34.461314 24190 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:34.461675 24190 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495 after 1 attempts
I0513 09:16:34.464400 24197 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:34.464659 24197 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:34.477118 24199 cgroups_isolator.cpp:1144] Successfully destroyed cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
I0513 09:16:34.477439 24190 slave.cpp:1479] Executor 'executor_Task_Tracker_0' of framework 201305130913-33597632-5050-3893-0000 has exited with status 1
I0513 09:16:34.479852 24190 slave.cpp:1232] Handling status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.480123 24190 slave.cpp:1280] Forwarding status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 to the status update manager
I0513 09:16:34.480136 24199 cgroups_isolator.cpp:666] Asked to update resources for an unknown/killed executor
I0513 09:16:34.480480 24185 status_update_manager.cpp:254] Received status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.480716 24185 status_update_manager.cpp:403] Creating StatusUpdate stream for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.480927 24185 status_update_manager.hpp:314] Handling UPDATE for status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.481107 24185 status_update_manager.cpp:289] Forwarding status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 to the master at master@192.168.0.2:5050
I0513 09:16:34.487007 24194 slave.cpp:979] Got acknowledgement of status update for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.487257 24185 status_update_manager.cpp:314] Received status update acknowledgement for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.487412 24185 status_update_manager.hpp:314] Handling ACK for status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.487547 24185 status_update_manager.cpp:434] Cleaning up status update stream for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.487788 24207 slave.cpp:1016] Status update manager successfully handled status update acknowledgement for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:34.488142 24202 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495' for removal
I0513 09:16:35.063462 24199 slave.cpp:587] Got assigned task Task_Tracker_2 for framework 201305130913-33597632-5050-3893-0000
I0513 09:16:35.066090 24199 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_2/runs/f4729d73-5000-4c40-9c0e-1e77ad414f27'
I0513 09:16:35.066673 24188 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_2/runs/f4729d73-5000-4c40-9c0e-1e77ad414f27'
I0513 09:16:35.066985 24205 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_2 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_2/runs/f4729d73-5000-4c40-9c0e-1e77ad414f27 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_2_tag_f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:35.068594 24205 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
I0513 09:16:35.069341 24205 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:35.070061 24205 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:35.070828 24205 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:35.071966 24205 cgroups_isolator.cpp:555] Forked executor at = 24704
I0513 09:16:40.464987 24197 cgroups_isolator.cpp:806] Executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 terminated with status 256
I0513 09:16:40.465175 24197 cgroups_isolator.cpp:635] Killing executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.467118 24197 cgroups_isolator.cpp:1025] OOM notifier is triggered for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 with uuid 38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:40.467269 24197 cgroups_isolator.cpp:1030] Discarded OOM notifier for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 with uuid 38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:40.468596 24198 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:40.468945 24198 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b after 1 attempts
I0513 09:16:40.471577 24200 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:40.471850 24200 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:40.480960 24185 cgroups_isolator.cpp:1144] Successfully destroyed cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
I0513 09:16:40.481230 24196 slave.cpp:1479] Executor 'executor_Task_Tracker_1' of framework 201305130913-33597632-5050-3893-0000 has exited with status 1
I0513 09:16:40.483572 24196 slave.cpp:1232] Handling status update TASK_LOST from task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.483801 24196 slave.cpp:1280] Forwarding status update TASK_LOST from task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 to the status update manager
I0513 09:16:40.483846 24193 cgroups_isolator.cpp:666] Asked to update resources for an unknown/killed executor
I0513 09:16:40.484094 24205 status_update_manager.cpp:254] Received status update TASK_LOST from task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.484267 24205 status_update_manager.cpp:403] Creating StatusUpdate stream for task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.484412 24205 status_update_manager.hpp:314] Handling UPDATE for status update TASK_LOST from task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.484558 24205 status_update_manager.cpp:289] Forwarding status update TASK_LOST from task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 to the master at master@192.168.0.2:5050
I0513 09:16:40.487229 24202 slave.cpp:979] Got acknowledgement of status update for task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.487457 24196 status_update_manager.cpp:314] Received status update acknowledgement for task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.487607 24196 status_update_manager.hpp:314] Handling ACK for status update TASK_LOST from task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.487741 24196 status_update_manager.cpp:434] Cleaning up status update stream for task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.487949 24207 slave.cpp:1016] Status update manager successfully handled status update acknowledgement for task Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:40.488278 24193 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b' for removal
I0513 09:16:41.072098 24194 slave.cpp:587] Got assigned task Task_Tracker_3 for framework 201305130913-33597632-5050-3893-0000
I0513 09:16:41.074632 24194 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_3/runs/22f6e84b-d07f-430a-a322-6f804b3cd642'
I0513 09:16:41.075546 24198 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_3/runs/22f6e84b-d07f-430a-a322-6f804b3cd642'
I0513 09:16:41.076081 24194 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_3 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_3/runs/22f6e84b-d07f-430a-a322-6f804b3cd642 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_3_tag_22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:41.077606 24194 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
I0513 09:16:41.078402 24194 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:41.079186 24194 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:41.080008 24194 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:41.081447 24194 cgroups_isolator.cpp:555] Forked executor at = 24780
I0513 09:16:44.482589 24200 status_update_manager.cpp:379] Checking for unacknowledged status updates
I0513 09:16:46.473145 24199 cgroups_isolator.cpp:806] Executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000 terminated with status 256
I0513 09:16:46.473307 24199 cgroups_isolator.cpp:635] Killing executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.475491 24199 cgroups_isolator.cpp:1025] OOM notifier is triggered for executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000 with uuid f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:46.475649 24199 cgroups_isolator.cpp:1030] Discarded OOM notifier for executor executor_Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000 with uuid f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:46.476820 24192 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_2_tag_f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:46.477181 24192 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_2_tag_f4729d73-5000-4c40-9c0e-1e77ad414f27 after 1 attempts
I0513 09:16:46.479907 24201 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_2_tag_f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:46.480229 24201 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_2_tag_f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:46.493069 24200 cgroups_isolator.cpp:1144] Successfully destroyed cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_2_tag_f4729d73-5000-4c40-9c0e-1e77ad414f27
I0513 09:16:46.493391 24184 slave.cpp:1479] Executor 'executor_Task_Tracker_2' of framework 201305130913-33597632-5050-3893-0000 has exited with status 1
I0513 09:16:46.495689 24184 slave.cpp:1232] Handling status update TASK_LOST from task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.495933 24184 slave.cpp:1280] Forwarding status update TASK_LOST from task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000 to the status update manager
I0513 09:16:46.495980 24189 cgroups_isolator.cpp:666] Asked to update resources for an unknown/killed executor
I0513 09:16:46.496305 24193 status_update_manager.cpp:254] Received status update TASK_LOST from task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.496553 24193 status_update_manager.cpp:403] Creating StatusUpdate stream for task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.496707 24193 status_update_manager.hpp:314] Handling UPDATE for status update TASK_LOST from task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.496868 24193 status_update_manager.cpp:289] Forwarding status update TASK_LOST from task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000 to the master at master@192.168.0.2:5050
I0513 09:16:46.499631 24201 slave.cpp:979] Got acknowledgement of status update for task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.499961 24193 status_update_manager.cpp:314] Received status update acknowledgement for task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.500128 24193 status_update_manager.hpp:314] Handling ACK for status update TASK_LOST from task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.500257 24193 status_update_manager.cpp:434] Cleaning up status update stream for task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.500452 24192 slave.cpp:1016] Status update manager successfully handled status update acknowledgement for task Task_Tracker_2 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:46.500743 24204 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_2/runs/f4729d73-5000-4c40-9c0e-1e77ad414f27' for removal
I0513 09:16:47.079013 24193 slave.cpp:587] Got assigned task Task_Tracker_4 for framework 201305130913-33597632-5050-3893-0000
I0513 09:16:47.081650 24193 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c'
I0513 09:16:47.082447 24198 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c'
I0513 09:16:47.082861 24194 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_4 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_4_tag_8a4dd631-1ec0-4946-a1bc-0644a7238e3c
I0513 09:16:47.084478 24194 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_4 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
I0513 09:16:47.085273 24194 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_4 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:47.086045 24194 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_4 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:47.086853 24194 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_4 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:47.088227 24194 cgroups_isolator.cpp:555] Forked executor at = 24856
I0513 09:16:50.485791 24194 status_update_manager.cpp:379] Checking for unacknowledged status updates
I0513 09:16:52.480471 24185 cgroups_isolator.cpp:806] Executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000 terminated with status 256
I0513 09:16:52.480622 24185 cgroups_isolator.cpp:635] Killing executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000
I0513 09:16:52.482652 24185 cgroups_isolator.cpp:1025] OOM notifier is triggered for executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000 with uuid 22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:52.482805 24185 cgroups_isolator.cpp:1030] Discarded OOM notifier for executor executor_Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000 with uuid 22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:52.484110 24195 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_3_tag_22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:52.484447 24195 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_3_tag_22f6e84b-d07f-430a-a322-6f804b3cd642 after 1 attempts
I0513 09:16:52.487893 24184 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_3_tag_22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:52.488129 24184 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_3_tag_22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:52.496047 24207 cgroups_isolator.cpp:1144] Successfully destroyed cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_3_tag_22f6e84b-d07f-430a-a322-6f804b3cd642
I0513 09:16:52.496247 24203 slave.cpp:1479] Executor 'executor_Task_Tracker_3' of framework 201305130913-33597632-5050-3893-0000 has exited with status 1
I0513 09:16:52.498538 24203 slave.cpp:1232] Handling status update TASK_LOST from task Task_Tracker_3 of framework 201305130913-33597632-5050-3893-0000
......
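
To see why the executor exits almost immediately ("terminated with status 256" in the log is an exit code of 1), the launch can be replayed by hand on the slave in a scratch directory. A sketch only: the URI hdfs://master/user/mesos/hadoop.tar.gz is the one mentioned earlier, the launch command is copied from the cgroups_isolator.cpp:525 lines above, and the real environment the slave sets up may differ:

mkdir /tmp/executor-test && cd /tmp/executor-test
hadoop fs -get hdfs://master/user/mesos/hadoop.tar.gz .    # what the fetcher should have put into the sandbox
tar xzf hadoop.tar.gz                                      # assumes the tarball unpacks to a hadoop/ directory, as the launch command expects
cd hadoop && ./bin/mesos-executor                          # the exact command from the log

Whatever error this prints is what the executor would normally write to stderr in the run directory; since that directory is empty here, the fetch step itself is the most likely point of failure.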




Wang Yu

Re: Re: Tasks always lost when running hadoop test!

Posted by Wang Yu <wa...@nfs.iscas.ac.cn>.
1. There are no logs in directories like "/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c".
2. I use the git version, downloaded with "git clone git://git.apache.org/mesos.git". This is what you told me to use before...
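
To pin down the exact version, the commit of the checkout can be printed from the source directory (a small sketch, assuming the default clone directory "mesos"):

cd mesos
git rev-parse HEAD      # full SHA of the current HEAD
git log -1 --oneline    # the same commit in short form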

2013-05-15



Wang Yu



From: Vinod Kone <vi...@twitter.com>
Sent: 2013-05-15 23:14
Subject: Re: Tasks always lost when running hadoop test!
To: "mesos-dev@incubator.apache.org" <me...@incubator.apache.org>
Cc: "mesos-dev" <me...@incubator.apache.org>, "Benjamin Mahler" <be...@gmail.com>

logs? Also what version of mesos? 

@vinodkone 
Sent from my mobile  

On May 15, 2013, at 12:00 AM, 王瑜 <wa...@nfs.iscas.ac.cn> wrote: 


Re: /home/mesos/build/hadoop/hadoop-0.20.205.0/build.xml:666: The following error occurred while executing this line

Posted by Benjamin Mahler <be...@gmail.com>.
Hi Wang,

We will be releasing 0.12.0 shortly, which contains a completely new Hadoop
framework that we wrote a while back. However, you'll want to keep in mind that
the bundled Hadoop framework is not production-vetted in 0.12.0.

Brenden Matthews has been running it extensively since then and has been a
great help with fixing bugs and improving the framework.

Keep an eye out for the VOTE and release of 0.12.0!

Ben


On Sun, Jun 9, 2013 at 1:25 AM, 王瑜 <wa...@nfs.iscas.ac.cn> wrote:

> Hi all,
>
> When I compile the new stable version of Mesos and deploy Hadoop on it, it
> cannot build the hadoop.tar.gz file for the task executor. The log is as follows;
> thanks very much for helping me.
> It seems the problem is that javac cannot find some symbols.
>
> compile:
>      [echo] contrib: mesos
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/build-contrib.xml:185:
> warning: 'includeantruntime' was not set, defaulting to
> build.sysclasspath=last; set to false for repeatable builds
>     [javac] Compiling 5 source files to
> /home/mesos/build/hadoop/hadoop-0.20.205.0/build/contrib/mesos/classes
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:61:
> error: cannot find symbol
>     [javac]       Class<?>[] instClasses =
> TaskTracker.getInstrumentationClasses(conf);
>     [javac]                                           ^
>     [javac]   symbol:   method getInstrumentationClasses(JobConf)
>     [javac]   location: class TaskTracker
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:136:
> error: cannot find symbol
>     [javac]     if (task.extraData.equals("")) {
>     [javac]             ^
>     [javac]   symbol:   variable extraData
>     [javac]   location: variable task of type Task
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:143:
> error: cannot find symbol
>     [javac]       .setValue(task.extraData)
>     [javac]                     ^
>     [javac]   symbol:   variable extraData
>     [javac]   location: variable task of type Task
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:176:
> error: cannot find symbol
>     [javac]
> .setTaskId(TaskID.newBuilder().setValue(task.extraData).build())
>     [javac]                                                     ^
>     [javac]   symbol:   variable extraData
>     [javac]   location: variable task of type Task
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:143:
> error: jobTracker has private access in MesosScheduler
>     [javac]     this.jobTracker = mesosSched.jobTracker;
>     [javac]                                 ^
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:557:
> error: cannot find symbol
>     [javac]                 task.extraData = "" + nt.mesosId.getValue();
>     [javac]                     ^
>     [javac]   symbol:   variable extraData
>     [javac]   location: variable task of type Task
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:572:
> error: cannot find symbol
>     [javac]                 task.extraData = "" + nt.mesosId.getValue();
>     [javac]                     ^
>     [javac]   symbol:   variable extraData
>     [javac]   location: variable task of type Task
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:725:
> error: cannot find symbol
>     [javac]       int maxLevel = job.getMaxCacheLevel();
>     [javac]                         ^
>     [javac]   symbol:   method getMaxCacheLevel()
>     [javac]   location: variable job of type JobInProgress
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java:545:
> error: cannot find symbol
>     [javac]                     .setName("Hadoop TaskTracker")
>     [javac]                     ^
>     [javac]   symbol:   method setName(String)
>     [javac]   location: class Builder
>     [javac]
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/MesosTaskTrackerInstrumentation.java:24:
> error: method does not override or implement a method from a supertype
>     [javac]   @Override
>     [javac]   ^
>     [javac] Note:
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java uses or overrides a
> deprecated API.
>     [javac] Note: Recompile with -Xlint:deprecation for details.
>     [javac] 10 errors
>
> BUILD FAILED
> /home/mesos/build/hadoop/hadoop-0.20.205.0/build.xml:666: The following
> error occurred while executing this line:
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/build.xml:30: The
> following error occurred while executing this line:
> /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/build-contrib.xml:185:
> Compile failed; see the compiler error output for details.
>
> Total time: 24 seconds
>
> Oh no! We failed to run 'ant -Dversion=0.20.205.0 compile bin-package'. If
> you need help try emailing:
>
>   mesos-dev@incubator.apache.org
>
> (Remember to include as much debug information as possible.)
>
>
>
>
> Wang Yu
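
For what it's worth, the errors above show that the Hadoop classes being compiled against do not contain the Mesos additions (task.extraData, TaskTracker.getInstrumentationClasses, and so on), which suggests the contrib/mesos sources were built against a hadoop-0.20.205.0 tree that was never patched for Mesos. A quick check one could run (a sketch; it assumes the usual 0.20.x source layout, with the build path taken from the javac output above):

cd /home/mesos/build/hadoop/hadoop-0.20.205.0
grep -n "extraData" src/mapred/org/apache/hadoop/mapred/Task.java
# no match would suggest the core Hadoop patch step was never applied,
# so the contrib/mesos sources cannot compile against this tree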

Re: Re: Tasks always lost when running hadoop test!

Posted by 王瑜 <wa...@nfs.iscas.ac.cn>.
Hi all,

Here an even stranger thing happened: when I run Spark on Mesos, the tasks are lost just like with Hadoop. What is wrong with my Mesos? Do you have any suggestions?

[root@master spark]# ./run spark.examples.SparkPi master:5050
13/05/16 10:43:17 INFO spark.BoundedMemoryCache: BoundedMemoryCache.maxBytes = 6791445872
13/05/16 10:43:17 INFO spark.CacheTrackerActor: Registered actor on port 7077
13/05/16 10:43:17 INFO spark.CacheTrackerActor: Started slave cache (size 6.3GB) on master
13/05/16 10:43:17 INFO spark.MapOutputTrackerActor: Registered actor on port 7077
13/05/16 10:43:17 INFO spark.ShuffleManager: Shuffle dir: /tmp/spark-local-79237de3-efea-48d6-bb13-c32d98c1d7ec/shuffle
13/05/16 10:43:17 INFO server.Server: jetty-7.5.3.v20111011
13/05/16 10:43:17 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:44208 STARTING
13/05/16 10:43:17 INFO spark.ShuffleManager: Local URI: http://192.168.0.2:44208
13/05/16 10:43:17 INFO server.Server: jetty-7.5.3.v20111011
13/05/16 10:43:17 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:51063 STARTING
13/05/16 10:43:17 INFO broadcast.HttpBroadcast: Broadcast server started at http://192.168.0.2:51063
13/05/16 10:43:17 INFO spark.MesosScheduler: Registered as framework ID 201305151124-33597632-5050-20218-0001
13/05/16 10:43:17 INFO spark.SparkContext: Starting job...
13/05/16 10:43:17 INFO spark.CacheTracker: Registering RDD ID 1 with cache
13/05/16 10:43:17 INFO spark.CacheTrackerActor: Registering RDD 1 with 2 partitions
13/05/16 10:43:17 INFO spark.CacheTracker: Registering RDD ID 0 with cache
13/05/16 10:43:17 INFO spark.CacheTrackerActor: Registering RDD 0 with 2 partitions
13/05/16 10:43:17 INFO spark.CacheTrackerActor: Asked for current cache locations
13/05/16 10:43:17 INFO spark.MesosScheduler: Final stage: Stage 0
13/05/16 10:43:17 INFO spark.MesosScheduler: Parents of final stage: List()
13/05/16 10:43:17 INFO spark.MesosScheduler: Missing parents: List()
13/05/16 10:43:17 INFO spark.MesosScheduler: Submitting Stage 0, which has no missing parents
13/05/16 10:43:17 INFO spark.MesosScheduler: Got a job with 2 tasks
13/05/16 10:43:17 INFO spark.MesosScheduler: Adding job with ID 0
13/05/16 10:43:17 INFO spark.SimpleJob: Starting task 0:0 as TID 0 on slave 201305151124-33597632-5050-20218-1: slave5 (preferred)
13/05/16 10:43:17 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 72 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:17 INFO spark.SimpleJob: Starting task 0:1 as TID 1 on slave 201305151124-33597632-5050-20218-0: slave1 (preferred)
13/05/16 10:43:17 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:18 INFO spark.SimpleJob: Lost TID 1 (task 0:1)
13/05/16 10:43:18 INFO spark.SimpleJob: Starting task 0:1 as TID 2 on slave 201305151124-33597632-5050-20218-0: slave1 (preferred)
13/05/16 10:43:18 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:18 INFO spark.SimpleJob: Lost TID 0 (task 0:0)
13/05/16 10:43:19 INFO spark.SimpleJob: Lost TID 2 (task 0:1)
13/05/16 10:43:19 INFO spark.SimpleJob: Starting task 0:1 as TID 3 on slave 201305151124-33597632-5050-20218-1: slave5 (preferred)
13/05/16 10:43:19 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:19 INFO spark.SimpleJob: Starting task 0:0 as TID 4 on slave 201305151124-33597632-5050-20218-0: slave1 (preferred)
13/05/16 10:43:19 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:19 INFO spark.SimpleJob: Lost TID 3 (task 0:1)
13/05/16 10:43:20 INFO spark.SimpleJob: Lost TID 4 (task 0:0)
13/05/16 10:43:20 INFO spark.SimpleJob: Starting task 0:0 as TID 5 on slave 201305151124-33597632-5050-20218-1: slave5 (preferred)
13/05/16 10:43:20 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:20 INFO spark.SimpleJob: Starting task 0:1 as TID 6 on slave 201305151124-33597632-5050-20218-0: slave1 (preferred)
13/05/16 10:43:20 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:20 INFO spark.SimpleJob: Lost TID 5 (task 0:0)
13/05/16 10:43:21 INFO spark.SimpleJob: Lost TID 6 (task 0:1)
13/05/16 10:43:21 INFO spark.SimpleJob: Starting task 0:1 as TID 7 on slave 201305151124-33597632-5050-20218-1: slave5 (preferred)
13/05/16 10:43:21 INFO spark.SimpleJob: Size of task 0:1 is 1606 bytes and took 1 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:21 INFO spark.SimpleJob: Starting task 0:0 as TID 8 on slave 201305151124-33597632-5050-20218-0: slave1 (preferred)
13/05/16 10:43:21 INFO spark.SimpleJob: Size of task 0:0 is 1606 bytes and took 2 ms to serialize by spark.JavaSerializerInstance
13/05/16 10:43:21 INFO spark.SimpleJob: Lost TID 7 (task 0:1)
13/05/16 10:43:21 ERROR spark.SimpleJob: Task 0:1 failed more than 4 times; aborting job
13/05/16 10:43:22 INFO spark.MesosScheduler: Ignoring update from TID 8 because its job is gone
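
Since Spark loses its TIDs the same way, the slave-side sandboxes for this framework are the next place to look. A sketch (the framework ID 201305151124-33597632-5050-20218-0001 is from the log above; the slave ID, executor ID and run UUID have to be read off the slave, so the angle-bracket names below are placeholders, not real values):

ssh slave1
cd /tmp/mesos/slaves/<slave_id>/frameworks/201305151124-33597632-5050-20218-0001/executors
ls                                          # pick an executor directory
cat <executor_id>/runs/<run_uuid>/stderr    # empty or missing again would point at the same fetch/launch problem as with Hadoop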




Wang Yu

From: Vinod Kone
Date: 2013-05-15 23:45
To: Wang Yu
CC: mesos-dev@incubator.apache.org; Benjamin Mahler
Subject: Re: Tasks always lost when running hadoop test!
What is the git sha of your HEAD?

Also can you post the scheduler/master/slave logs?

@vinodkone
Sent from my mobile 

On May 15, 2013, at 8:22 AM, "Wang Yu" <wa...@nfs.iscas.ac.cn> wrote:


/home/mesos/build/hadoop/hadoop-0.20.205.0/build.xml:666: The following error occurred while executing this line

Posted by 王瑜 <wa...@nfs.iscas.ac.cn>.
Hi all,

When I compile the new stable version of Mesos and deploy Hadoop on it, it cannot build the hadoop.tar.gz file for the task executor. The log is as follows; thanks very much for helping me.
It seems the problem is that javac cannot find several symbols.

compile:
     [echo] contrib: mesos
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/build-contrib.xml:185: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 5 source files to /home/mesos/build/hadoop/hadoop-0.20.205.0/build/contrib/mesos/classes
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:61: error: cannot find symbol
    [javac]       Class<?>[] instClasses = TaskTracker.getInstrumentationClasses(conf);
    [javac]                                           ^
    [javac]   symbol:   method getInstrumentationClasses(JobConf)
    [javac]   location: class TaskTracker
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:136: error: cannot find symbol
    [javac]     if (task.extraData.equals("")) {
    [javac]             ^
    [javac]   symbol:   variable extraData
    [javac]   location: variable task of type Task
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:143: error: cannot find symbol
    [javac]       .setValue(task.extraData)
    [javac]                     ^
    [javac]   symbol:   variable extraData
    [javac]   location: variable task of type Task
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkExecutor.java:176: error: cannot find symbol
    [javac]         .setTaskId(TaskID.newBuilder().setValue(task.extraData).build())
    [javac]                                                     ^
    [javac]   symbol:   variable extraData
    [javac]   location: variable task of type Task
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:143: error: jobTracker has private access in MesosScheduler
    [javac]     this.jobTracker = mesosSched.jobTracker;
    [javac]                                 ^
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:557: error: cannot find symbol
    [javac]                 task.extraData = "" + nt.mesosId.getValue();
    [javac]                     ^
    [javac]   symbol:   variable extraData
    [javac]   location: variable task of type Task
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:572: error: cannot find symbol
    [javac]                 task.extraData = "" + nt.mesosId.getValue();
    [javac]                     ^
    [javac]   symbol:   variable extraData
    [javac]   location: variable task of type Task
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java:725: error: cannot find symbol
    [javac]       int maxLevel = job.getMaxCacheLevel();
    [javac]                         ^
    [javac]   symbol:   method getMaxCacheLevel()
    [javac]   location: variable job of type JobInProgress
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/MesosScheduler.java:545: error: cannot find symbol
    [javac]                     .setName("Hadoop TaskTracker")
    [javac]                     ^
    [javac]   symbol:   method setName(String)
    [javac]   location: class Builder
    [javac] /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/MesosTaskTrackerInstrumentation.java:24: error: method does not override or implement a method from a supertype
    [javac]   @Override
    [javac]   ^
    [javac] Note: /home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/mesos/src/java/org/apache/hadoop/mapred/FrameworkScheduler.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 10 errors

BUILD FAILED
/home/mesos/build/hadoop/hadoop-0.20.205.0/build.xml:666: The following error occurred while executing this line:
/home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/build.xml:30: The following error occurred while executing this line:
/home/mesos/build/hadoop/hadoop-0.20.205.0/src/contrib/build-contrib.xml:185: Compile failed; see the compiler error output for details.

Total time: 24 seconds

Oh no! We failed to run 'ant -Dversion=0.20.205.0 compile bin-package'. If you need help try emailing:

  mesos-dev@incubator.apache.org

(Remember to include as much debug information as possible.)
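
Those "cannot find symbol" errors usually mean the Hadoop 0.20.205.0 tree was never patched with the Mesos contrib changes that add the members the contrib sources expect (for example the Task.extraData field and TaskTracker.getInstrumentationClasses). A minimal sanity check, assuming the stock 0.20.205.0 source layout and the paths from the log above (only a sketch, not the official procedure):

  cd /home/mesos/build/hadoop/hadoop-0.20.205.0
  # If the Mesos patch step was skipped, this grep finds nothing and the
  # compile fails exactly as shown above.
  grep -n "extraData" src/mapred/org/apache/hadoop/mapred/Task.java \
    || echo "patch missing: re-run the Mesos hadoop tutorial/patch step, then: ant -Dversion=0.20.205.0 compile bin-package"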




Wang Yu

Re: Re: Tasks always lost when running hadoop test!

Posted by 王瑜 <wa...@nfs.iscas.ac.cn>.
Hi Vinod,
The Mesos version is 0.13.0, and the master and slave logs are attached; can you get them?
Where should I find the scheduler logs?

Thanks very much for your help!
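
In case it is useful, a rough sketch of where those logs normally live with this kind of setup (the file names here are assumptions; adjust to the actual installation):

  # The MesosScheduler runs inside the Hadoop JobTracker, so its messages
  # end up in the JobTracker log:
  ls $HADOOP_HOME/logs/hadoop-*-jobtracker-*.log
  # The slave log path already seen in this thread; the master log is assumed
  # to be written the same way on the master host:
  ls /home/mesos/build/logs/mesos-slave.INFO
  ls /home/mesos/build/logs/mesos-master.INFO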





Wang Yu

From: Vinod Kone
Date: 2013-05-15 23:45
To: Wang Yu
CC: mesos-dev@incubator.apache.org; Benjamin Mahler
Subject: Re: Tasks always lost when running hadoop test!
What is the git sha of your HEAD?

Also can you post the scheduler/master/slave logs?

@vinodkone
Sent from my mobile 

On May 15, 2013, at 8:22 AM, "Wang Yu" <wa...@nfs.iscas.ac.cn> wrote:

> 1. There is no log in directories like "/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c"
> 2. I use the git version, just download it using "git clone git://git.apache.org/mesos.git". This is what you told me before...<1.gif>
>  
> 2013-05-15
> Wang Yu
> From: Vinod Kone <vi...@twitter.com>
> Sent: 2013-05-15 23:14
> Subject: Re: Tasks always lost when running hadoop test!
> To: "mesos-dev@incubator.apache.org"<me...@incubator.apache.org>
> Cc: "mesos-dev"<me...@incubator.apache.org>,"Benjamin Mahler"<be...@gmail.com>
>  
> logs? Also what version of mesos? 
>  
> @vinodkone 
> Sent from my mobile  
>  
> On May 15, 2013, at 12:00 AM, 王瑜 <wa...@nfs.iscas.ac.cn> wrote: 
>  
> > Hi Ben, 
> >  
> > I think the problem is mesos have found the executor on hdfs://master/user/mesos/hadoop.tar.gz, but it did not download it, so did not use it. 
> > Mesos found the executor, so it did not output error, just update the task status as lost; but mesos did not use the executor, so the executor directory contains nothing!  
> >  
> > But I am not very familiar with source code, so I do not know why mesos can not use the executor. And I also do not know whether my analysis is right. Thanks very much for your help! 
> >  
> >  
> >  
> >  
> > Wang Yu 
> >  
> > From: 王瑜 
> > Sent: 2013-05-15 11:04 
> > To: mesos-dev 
> > Cc: Benjamin Mahler 
> > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060 
> > Hi, Ben, 
> >  
> > I have reworked the test, and checked log directory again, it is still null. The same as following. 
> > I think there is the problem with my executor, but I do not know how to let the executor works. Logs is as following... 
> > " Asked to update resources for an unknown/killed executor" why it always kill the executor? 
> >  
> > 1. I opened all the executor directory, but all of them are null. I do not know what happened to them... 
> > [root@slave1 logs]# cd /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l 
> > total 0 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a 
> > .  .. 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# 
> > 2. I added "--isolation=cgroups" for slaves, but it still not work. Tasks are always lost. But there is no error any more, I still do not know what happened to the executor...Logs on one slave is as follows. Please help me, thanks very much! 
> >  
> > mesos-slave.INFO 
> > Log file created at: 2013/05/13 09:12:54 
> > Running on machine: slave1 
> > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg 
> > I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator 
> > I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by root 
> > I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave 
> > I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 1)@192.168.0.3:36668 
> > I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24; mem=63356; ports=[31000-32000]; disk=29143 
> > I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as cgroups hierarchy root 
> > I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at master@192.168.0.2:5050 
> > I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file '/home/mesos/build/logs/mesos-slave.INFO' 
> > I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master detected at master@192.168.0.2:5050 
> > I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering isolator 
> > I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery 
> > I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master; given slave ID 201305130913-33597632-5050-3893-0 
> > I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal 
> > I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal 
> > I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal 
> > I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal 
> > I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal 
> > I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal 
> > I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal 
> > I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days 
> > I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days 
> > I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days 
> > I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task Task_Tracker_0 for framework 201305130913-33597632-5050-3893-0000 
> > I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495' 
> > I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495' 
> > I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495 
> > I0513&nbs

Re: Tasks always lost when running hadoop test!

Posted by Vinod Kone <vi...@gmail.com>.
What is the git sha of your HEAD?

Also can you post the scheduler/master/slave logs?

@vinodkone
Sent from my mobile 
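
(The exact commit of a git checkout can be printed as below; the checkout path is a placeholder for wherever the clone lives.)

  cd /path/to/mesos        # the directory created by "git clone git://git.apache.org/mesos.git"
  git rev-parse HEAD       # full 40-character sha
  git log -1 --oneline     # abbreviated sha plus the commit subject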

On May 15, 2013, at 8:22 AM, "Wang Yu" <wa...@nfs.iscas.ac.cn> wrote:

> 1. There is no log in directories like "/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c"
> 2. I use the git version, just download it using "git clone git://git.apache.org/mesos.git". This is what you told me before...<1.gif>
>  
> 2013-05-15
> Wang Yu
> From: Vinod Kone <vi...@twitter.com>
> Sent: 2013-05-15 23:14
> Subject: Re: Tasks always lost when running hadoop test!
> To: "mesos-dev@incubator.apache.org"<me...@incubator.apache.org>
> Cc: "mesos-dev"<me...@incubator.apache.org>,"Benjamin Mahler"<be...@gmail.com>
>  
> logs? Also what version of mesos? 
>  
> @vinodkone 
> Sent from my mobile  
>  
> On May 15, 2013, at 12:00 AM, 王瑜 <wa...@nfs.iscas.ac.cn> wrote: 
>  
> > Hi Ben, 
> >  
> > I think the problem is mesos have found the executor on hdfs://master/user/mesos/hadoop.tar.gz, but it did not download it, so did not use it. 
> > Mesos found the executor, so it did not output error, just update the task status as lost; but mesos did not use the executor, so the executor directory contains nothing!  
> >  
> > But I am not very familiar with source code, so I do not know why mesos can not use the executor. And I also do not know whether my analysis is right. Thanks very much for your help! 
> >  
> >  
> >  
> >  
> > Wang Yu 
> >  
> > From: 王瑜 
> > Sent: 2013-05-15 11:04 
> > To: mesos-dev 
> > Cc: Benjamin Mahler 
> > Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060 
> > Hi, Ben, 
> >  
> > I have reworked the test, and checked log directory again, it is still null. The same as following. 
> > I think there is the problem with my executor, but I do not know how to let the executor works. Logs is as following... 
> > " Asked to update resources for an unknown/killed executor" why it always kill the executor? 
> >  
> > 1. I opened all the executor directory, but all of them are null. I do not know what happened to them... 
> > [root@slave1 logs]# cd /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l 
> > total 0 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a 
> > .  .. 
> > [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# 
> > 2. I added "--isolation=cgroups" for slaves, but it still not work. Tasks are always lost. But there is no error any more, I still do not know what happened to the executor...Logs on one slave is as follows. Please help me, thanks very much! 
> >  
> > mesos-slave.INFO 
> > Log file created at: 2013/05/13 09:12:54 
> > Running on machine: slave1 
> > Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg 
> > I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator 
> > I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by root 
> > I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave 
> > I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 1)@192.168.0.3:36668 
> > I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24; mem=63356; ports=[31000-32000]; disk=29143 
> > I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as cgroups hierarchy root 
> > I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at master@192.168.0.2:5050 
> > I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file '/home/mesos/build/logs/mesos-slave.INFO' 
> > I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master detected at master@192.168.0.2:5050 
> > I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering isolator 
> > I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery 
> > I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master; given slave ID 201305130913-33597632-5050-3893-0 
> > I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal 
> > I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal 
> > I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal 
> > I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal 
> > I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal 
> > I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal 
> > I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal 
> > I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days 
> > I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days 
> > I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days 
> > I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task Task_Tracker_0 for framework 201305130913-33597632-5050-3893-0000 
> > I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495' 
> > I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495' 
> > I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495 
> > I0513&nbs

Re: Tasks always lost when running hadoop test!

Posted by Vinod Kone <vi...@twitter.com>.
logs? Also what version of mesos?

@vinodkone
Sent from my mobile 

On May 15, 2013, at 12:00 AM, 王瑜 <wa...@nfs.iscas.ac.cn> wrote:

> Hi Ben,
> 
> I think the problem is mesos have found the executor on hdfs://master/user/mesos/hadoop.tar.gz, but it did not download it, so did not use it.
> Mesos found the executor, so it did not output error, just update the task status as lost; but mesos did not use the executor, so the executor directory contains nothing! 
> 
> But I am not very familiar with source code, so I do not know why mesos can not use the executor. And I also do not know whether my analysis is right. Thanks very much for your help!
> 
> 
> 
> 
> Wang Yu
> 
> From: 王瑜
> Sent: 2013-05-15 11:04
> To: mesos-dev
> Cc: Benjamin Mahler
> Subject: Re: Re: org.apache.hadoop.mapred.MesosScheduler: Unknown/exited TaskTracker: http://slave5:50060
> Hi, Ben,
> 
> I have reworked the test, and checked log directory again, it is still null. The same as following.
> I think there is the problem with my executor, but I do not know how to let the executor works. Logs is as following...
> " Asked to update resources for an unknown/killed executor" why it always kill the executor?
> 
> 1. I opened all the executor directory, but all of them are null. I do not know what happened to them...
> [root@slave1 logs]# cd /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_4/runs/8a4dd631-1ec0-4946-a1bc-0644a7238e3c
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -l
> total 0
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]# ls -a
> .  ..
> [root@slave1 8a4dd631-1ec0-4946-a1bc-0644a7238e3c]#
> 2. I added "--isolation=cgroups" for slaves, but it still not work. Tasks are always lost. But there is no error any more, I still do not know what happened to the executor...Logs on one slave is as follows. Please help me, thanks very much!
> 
> mesos-slave.INFO
> Log file created at: 2013/05/13 09:12:54
> Running on machine: slave1
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> I0513 09:12:54.170383 24183 main.cpp:124] Creating "cgroups" isolator
> I0513 09:12:54.171617 24183 main.cpp:132] Build: 2013-04-10 16:07:43 by root
> I0513 09:12:54.171656 24183 main.cpp:133] Starting Mesos slave
> I0513 09:12:54.173495 24197 slave.cpp:203] Slave started on 1)@192.168.0.3:36668
> I0513 09:12:54.173578 24197 slave.cpp:204] Slave resources: cpus=24; mem=63356; ports=[31000-32000]; disk=29143
> I0513 09:12:54.174486 24192 cgroups_isolator.cpp:242] Using /cgroup as cgroups hierarchy root
> I0513 09:12:54.179914 24197 slave.cpp:453] New master detected at master@192.168.0.2:5050
> I0513 09:12:54.180809 24197 slave.cpp:436] Successfully attached file '/home/mesos/build/logs/mesos-slave.INFO'
> I0513 09:12:54.180817 24207 status_update_manager.cpp:132] New master detected at master@192.168.0.2:5050
> I0513 09:12:54.194345 24192 cgroups_isolator.cpp:730] Recovering isolator
> I0513 09:12:54.195453 24189 slave.cpp:377] Finished recovery
> I0513 09:12:54.197798 24206 slave.cpp:487] Registered with master; given slave ID 201305130913-33597632-5050-3893-0
> I0513 09:12:54.198086 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081719-33597632-5050-4050-1' for removal
> I0513 09:12:54.198329 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305100938-33597632-5050-19520-1' for removal
> I0513 09:12:54.198490 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081625-33597632-5050-2991-1' for removal
> I0513 09:12:54.198593 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081746-33597632-5050-12378-1' for removal
> I0513 09:12:54.198874 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305090914-33597632-5050-5072-1' for removal
> I0513 09:12:54.199028 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201305081730-33597632-5050-8558-1' for removal
> I0513 09:12:54.199149 24201 gc.cpp:56] Scheduling '/tmp/mesos/slaves/201304131144-33597632-5050-4949-2' for removal
> I0513 09:13:54.176460 24204 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
> I0513 09:14:54.178444 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
> I0513 09:15:54.180680 24203 slave.cpp:1811] Current disk usage 26.93%. Max allowed age: 5.11days
> I0513 09:16:23.051203 24200 slave.cpp:587] Got assigned task Task_Tracker_0 for framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.054324 24200 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> I0513 09:16:23.055605 24188 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495'
> I0513 09:16:23.056043 24190 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_0 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_0/runs/6522748a-9d43-41b7-8f88-cd537a502495 with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:23.059368 24190 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
> I0513 09:16:23.060478 24190 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.061101 24190 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.061807 24190 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:23.063297 24190 cgroups_isolator.cpp:555] Forked executor at = 24552
> I0513 09:16:29.055598 24190 slave.cpp:587] Got assigned task Task_Tracker_1 for framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.058297 24190 paths.hpp:302] Created executor directory '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
> I0513 09:16:29.059012 24203 slave.cpp:436] Successfully attached file '/tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b'
> I0513 09:16:29.059865 24200 cgroups_isolator.cpp:525] Launching executor_Task_Tracker_1 (cd hadoop && ./bin/mesos-executor) in /tmp/mesos/slaves/201305130913-33597632-5050-3893-0/frameworks/201305130913-33597632-5050-3893-0000/executors/executor_Task_Tracker_1/runs/38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b with resources cpus=1; mem=1280 for framework 201305130913-33597632-5050-3893-0000 in cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_1_tag_38d83d3a-ef9d-4118-b28c-c6c3cfba6c4b
> I0513 09:16:29.061282 24200 cgroups_isolator.cpp:670] Changing cgroup controls for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000 with resources cpus=1; mem=1280
> I0513 09:16:29.062208 24200 cgroups_isolator.cpp:841] Updated 'cpu.shares' to 1024 for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.062940 24200 cgroups_isolator.cpp:979] Updated 'memory.limit_in_bytes' to 1342177280 for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.063705 24200 cgroups_isolator.cpp:1005] Started listening for OOM events for executor executor_Task_Tracker_1 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:29.065239 24200 cgroups_isolator.cpp:555] Forked executor at = 24628
> I0513 09:16:34.457746 24188 cgroups_isolator.cpp:806] Executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 terminated with status 256
> I0513 09:16:34.457909 24188 cgroups_isolator.cpp:635] Killing executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.459873 24188 cgroups_isolator.cpp:1025] OOM notifier is triggered for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 with uuid 6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.460028 24188 cgroups_isolator.cpp:1030] Discarded OOM notifier for executor executor_Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 with uuid 6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.461314 24190 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.461675 24190 cgroups.cpp:1214] Successfully froze cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495 after 1 attempts
> I0513 09:16:34.464400 24197 cgroups.cpp:1190] Trying to thaw cgroup /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.464659 24197 cgroups.cpp:1298] Successfully thawed /cgroup/mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.477118 24199 cgroups_isolator.cpp:1144] Successfully destroyed cgroup mesos/framework_201305130913-33597632-5050-3893-0000_executor_executor_Task_Tracker_0_tag_6522748a-9d43-41b7-8f88-cd537a502495
> I0513 09:16:34.477439 24190 slave.cpp:1479] Executor 'executor_Task_Tracker_0' of framework 201305130913-33597632-5050-3893-0000 has exited with status 1
> I0513 09:16:34.479852 24190 slave.cpp:1232] Handling status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.480123 24190 slave.cpp:1280] Forwarding status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 to the status update manager
> I0513 09:16:34.480136 24199 cgroups_isolator.cpp:666] Asked to update resources for an unknown/killed executor
> I0513 09:16:34.480480 24185 status_update_manager.cpp:254] Received status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.480716 24185 status_update_manager.cpp:403] Creating StatusUpdate stream for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.480927 24185 status_update_manager.hpp:314] Handling UPDATE for status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.481107 24185 status_update_manager.cpp:289] Forwarding status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000 to the master at master@192.168.0.2:5050
> I0513 09:16:34.487007 24194 slave.cpp:979] Got acknowledgement of status update for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.487257 24185 status_update_manager.cpp:314] Received status update acknowledgement for task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.487412 24185 status_update_manager.hpp:314] Handling ACK for status update TASK_LOST from task Task_Tracker_0 of framework 201305130913-33597632-5050-3893-0000
> I0513 09:16:34.487547 24185 status_upda
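
The slave log quoted above shows each TaskTracker executor being forked and then terminating with status 256 shortly afterwards, after which the task is reported TASK_LOST. One way to surface the underlying error is to repeat by hand what the slave does with the executor URI; this is only a sketch using the paths quoted in this thread, not a guaranteed fix:

  # Confirm the executor tarball really is in HDFS and fetchable from a slave:
  hadoop fs -ls hdfs://master/user/mesos/hadoop.tar.gz
  # Unpack it somewhere scratch and run the same command the slave runs:
  mkdir -p /tmp/executor-test && cd /tmp/executor-test
  hadoop fs -copyToLocal hdfs://master/user/mesos/hadoop.tar.gz .
  tar tzf hadoop.tar.gz | head        # should list a top-level hadoop/ directory
  tar xzf hadoop.tar.gz
  cd hadoop && ./bin/mesos-executor   # the same command the slave launches; run outside a
                                      # slave it will likely stop early for lack of Mesos
                                      # environment variables, but a failure before that
                                      # point (missing file, bad permissions, unpack error)
                                      # is the clue to look for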