Posted to issues@mesos.apache.org by "Rohan Nahata (JIRA)" <ji...@apache.org> on 2015/09/30 23:04:04 UTC

[jira] [Comment Edited] (MESOS-2990) Task dropped into LOST state

    [ https://issues.apache.org/jira/browse/MESOS-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14938870#comment-14938870 ] 

Rohan Nahata edited comment on MESOS-2990 at 9/30/15 9:03 PM:
--------------------------------------------------------------

[~ykrips], can you point me to how you actually went about fixing this? I'm new to Mesos and have no clue what I'm doing.
This is the error message that I got from the log that [~vinodkone] mentioned:
error while loading shared libraries: libmesos-0.24.0.so: cannot open shared object file: No such file or directory
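
One way to narrow this down is to check whether the dynamic loader on the slave host can find the library at all. The following is a minimal sketch, assuming Python is available on that host; the helper name `can_load` is hypothetical and not part of Mesos. If the load fails, one possible cause is that the directory holding the library is not on the loader's search path.

```python
import ctypes


def can_load(library_name):
    """Try to open a shared library through the system's dynamic loader.

    Returns (True, None) on success, or (False, error_message) when the
    loader cannot open the library. `can_load` is an illustrative helper,
    not a Mesos API.
    """
    try:
        ctypes.CDLL(library_name)
        return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Library name taken from the error message above.
    ok, error = can_load("libmesos-0.24.0.so")
    print("loadable" if ok else "not loadable: %s" % error)
```

Running it as the same user and with the same environment as the slave process gives the most representative answer, since the loader's search path can depend on the environment.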



was (Author: rohanahata):
[~ykrips] can you point me to how you actually went about fixing this... new to mesos and have no clue as to what I'm doing

> Task dropped into LOST state
> ----------------------------
>
>                 Key: MESOS-2990
>                 URL: https://issues.apache.org/jira/browse/MESOS-2990
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>    Affects Versions: 0.22.1
>         Environment: RHEL 7.0 ppc64
> IBM JDK 1.7.0 SR 7
>            Reporter: Jihun Kang
>
> Every time I ran the "test-framework" command from the shell, Mesos failed to run any of the tasks. The first task of the framework dropped into the *LOST* state, and the other tasks also terminated.
> The following is the output from "test-framework".
> {code}
> # ./src/test-framework --master=10.10.14.72:5050
> I0706 17:24:44.202020 38486 sched.cpp:157] Version: 0.22.1
> I0706 17:24:44.210917 38523 sched.cpp:254] New master detected at master@10.10.14.72:5050
> I0706 17:24:44.212316 38523 sched.cpp:264] No credentials provided. Attempting to register without authentication
> I0706 17:24:44.215756 38529 sched.cpp:448] Framework registered with 20150706-154445-168431176-5050-19360-0000
> Registered!
> Received offer 20150706-154445-168431176-5050-19360-O0 with cpus(*):40; mem(*):60064; disk(*):46055; ports(*):[31000-32000]
> Launching task 0 using offer 20150706-154445-168431176-5050-19360-O0
> Launching task 1 using offer 20150706-154445-168431176-5050-19360-O0
> Launching task 2 using offer 20150706-154445-168431176-5050-19360-O0
> Launching task 3 using offer 20150706-154445-168431176-5050-19360-O0
> Launching task 4 using offer 20150706-154445-168431176-5050-19360-O0
> Task 0 is in state TASK_LOST
> Aborting because task 0 is in unexpected state TASK_LOST with reason 1 from source 1 with message 'Executor terminated'
> I0706 17:24:44.428568 38513 sched.cpp:1623] Asked to abort the driver
> I0706 17:24:44.428665 38513 sched.cpp:856] Aborting framework '20150706-154445-168431176-5050-19360-0000'
> I0706 17:24:44.428987 38486 sched.cpp:1589] Asked to stop the driver
> I0706 17:24:44.429121 38539 sched.cpp:831] Stopping framework '20150706-154445-168431176-5050-19360-0000'
> {code}
> The following is from the slave log.
> {code}
> I0706 17:24:44.225492 19452 slave.cpp:1144] Got assigned task 0 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.226763 19452 slave.cpp:1144] Got assigned task 1 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.227041 19452 slave.cpp:1144] Got assigned task 2 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.227252 19452 slave.cpp:1254] Launching task 0 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.238466 19452 slave.cpp:4208] Launching executor default of framework 20150706-154445-168431176-5050-19360-0000 in work directory '/tmp/mesos/slaves/20150706-154445-168431176-5050-19360-S0/frameworks/20150706-154445-168431176-5050-19360-0000/executors/default/runs/d235751e-986c-44ae-a6c9-953814dac2f8'
> I0706 17:24:44.239434 19460 containerizer.cpp:484] Starting container 'd235751e-986c-44ae-a6c9-953814dac2f8' for executor 'default' of framework '20150706-154445-168431176-5050-19360-0000'
> I0706 17:24:44.239447 19452 slave.cpp:1401] Queuing task '0' for executor default of framework '20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.239769 19452 slave.cpp:1144] Got assigned task 3 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240003 19452 slave.cpp:1254] Launching task 1 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240056 19452 slave.cpp:1401] Queuing task '1' for executor default of framework '20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240100 19452 slave.cpp:1254] Launching task 2 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240146 19452 slave.cpp:1401] Queuing task '2' for executor default of framework '20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240228 19452 slave.cpp:1144] Got assigned task 4 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240414 19452 slave.cpp:1254] Launching task 3 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240460 19452 slave.cpp:1401] Queuing task '3' for executor default of framework '20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240501 19452 slave.cpp:1254] Launching task 4 for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.240592 19452 slave.cpp:1401] Queuing task '4' for executor default of framework '20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.245450 19460 launcher.cpp:130] Forked child with pid '38542' for container 'd235751e-986c-44ae-a6c9-953814dac2f8'
> I0706 17:24:44.247877 19458 slave.cpp:3165] Monitoring executor 'default' of framework '20150706-154445-168431176-5050-19360-0000' in container 'd235751e-986c-44ae-a6c9-953814dac2f8'
> I0706 17:24:44.346531 19472 containerizer.cpp:1123] Executor for container 'd235751e-986c-44ae-a6c9-953814dac2f8' has exited
> I0706 17:24:44.346587 19472 containerizer.cpp:918] Destroying container 'd235751e-986c-44ae-a6c9-953814dac2f8'
> I0706 17:24:44.392073 19450 slave.cpp:3223] Executor 'default' of framework 20150706-154445-168431176-5050-19360-0000 exited with status 127
> I0706 17:24:44.398993 19450 slave.cpp:2531] Handling status update TASK_LOST (UUID: f013f24c-5f2d-4623-82e8-b96f46bb3143) for task 0 of framework 20150706-154445-168431176-5050-19360-0000 from @0.0.0.0:0
> W0706 17:24:44.399422 19455 containerizer.cpp:814] Ignoring update for unknown container: d235751e-986c-44ae-a6c9-953814dac2f8
> I0706 17:24:44.406010 19450 slave.cpp:2531] Handling status update TASK_LOST (UUID: 73b14123-8bc2-443b-a04b-89bfe3ff9893) for task 1 of framework 20150706-154445-168431176-5050-19360-0000 from @0.0.0.0:0
> W0706 17:24:44.406121 19465 containerizer.cpp:814] Ignoring update for unknown container: d235751e-986c-44ae-a6c9-953814dac2f8
> I0706 17:24:44.412560 19450 slave.cpp:2531] Handling status update TASK_LOST (UUID: 5c3420c3-9082-4904-8a6e-61b1a0ec52ca) for task 2 of framework 20150706-154445-168431176-5050-19360-0000 from @0.0.0.0:0
> W0706 17:24:44.412675 19479 containerizer.cpp:814] Ignoring update for unknown container: d235751e-986c-44ae-a6c9-953814dac2f8
> I0706 17:24:44.418977 19450 slave.cpp:2531] Handling status update TASK_LOST (UUID: 475a9f26-6f0e-4c11-a959-8be83a633101) for task 3 of framework 20150706-154445-168431176-5050-19360-0000 from @0.0.0.0:0
> W0706 17:24:44.419096 19467 containerizer.cpp:814] Ignoring update for unknown container: d235751e-986c-44ae-a6c9-953814dac2f8
> I0706 17:24:44.425416 19450 slave.cpp:2531] Handling status update TASK_LOST (UUID: dedc265d-c09b-4311-a77f-d9ab4f53960f) for task 4 of framework 20150706-154445-168431176-5050-19360-0000 from @0.0.0.0:0
> W0706 17:24:44.425529 19446 containerizer.cpp:814] Ignoring update for unknown container: d235751e-986c-44ae-a6c9-953814dac2f8
> I0706 17:24:44.425901 19478 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: f013f24c-5f2d-4623-82e8-b96f46bb3143) for task 0 of framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.426853 19441 slave.cpp:2776] Forwarding the update TASK_LOST (UUID: f013f24c-5f2d-4623-82e8-b96f46bb3143) for task 0 of framework 20150706-154445-168431176-5050-19360-0000 to master@10.10.14.72:5050
> I0706 17:24:44.426983 19478 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: 73b14123-8bc2-443b-a04b-89bfe3ff9893) for task 1 of framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.427263 19478 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: 5c3420c3-9082-4904-8a6e-61b1a0ec52ca) for task 2 of framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.427326 19441 slave.cpp:2776] Forwarding the update TASK_LOST (UUID: 73b14123-8bc2-443b-a04b-89bfe3ff9893) for task 1 of framework 20150706-154445-168431176-5050-19360-0000 to master@10.10.14.72:5050
> I0706 17:24:44.427475 19441 slave.cpp:2776] Forwarding the update TASK_LOST (UUID: 5c3420c3-9082-4904-8a6e-61b1a0ec52ca) for task 2 of framework 20150706-154445-168431176-5050-19360-0000 to master@10.10.14.72:5050
> I0706 17:24:44.427508 19478 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: 475a9f26-6f0e-4c11-a959-8be83a633101) for task 3 of framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.427753 19449 slave.cpp:2776] Forwarding the update TASK_LOST (UUID: 475a9f26-6f0e-4c11-a959-8be83a633101) for task 3 of framework 20150706-154445-168431176-5050-19360-0000 to master@10.10.14.72:5050
> I0706 17:24:44.427798 19478 status_update_manager.cpp:317] Received status update TASK_LOST (UUID: dedc265d-c09b-4311-a77f-d9ab4f53960f) for task 4 of framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.428028 19449 slave.cpp:2776] Forwarding the update TASK_LOST (UUID: dedc265d-c09b-4311-a77f-d9ab4f53960f) for task 4 of framework 20150706-154445-168431176-5050-19360-0000 to master@10.10.14.72:5050
> I0706 17:24:44.431067 19471 slave.cpp:1768] Asked to shut down framework 20150706-154445-168431176-5050-19360-0000 by master@10.10.14.72:5050
> I0706 17:24:44.431099 19471 slave.cpp:1793] Shutting down framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.431450 19471 slave.cpp:3332] Cleaning up executor 'default' of framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.431660 19476 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150706-154445-168431176-5050-19360-S0/frameworks/20150706-154445-168431176-5050-19360-0000/executors/default/runs/d235751e-986c-44ae-a6c9-953814dac2f8' for gc 6.99999500509333days in the future
> I0706 17:24:44.432011 19471 slave.cpp:3411] Cleaning up framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.432106 19476 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150706-154445-168431176-5050-19360-S0/frameworks/20150706-154445-168431176-5050-19360-0000/executors/default' for gc 6.99999500129482days in the future
> I0706 17:24:44.432111 19468 status_update_manager.cpp:279] Closing status update streams for framework 20150706-154445-168431176-5050-19360-0000
> I0706 17:24:44.432212 19476 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20150706-154445-168431176-5050-19360-S0/frameworks/20150706-154445-168431176-5050-19360-0000' for gc 6.99999499845037days in the future
> I0706 17:25:32.152350 19462 slave.cpp:3648] Current disk usage 15.70%. Max allowed age: 5.201143517953102days
> I0706 17:25:44.240165 19446 slave.cpp:3564] Framework 20150706-154445-168431176-5050-19360-0000 seems to have exited. Ignoring registration timeout for executor 'default'
> {code}
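
The key line in the slave log above is the one reporting that the executor exited with status 127, which is consistent with an executor binary that could not be started (for example, because a shared library failed to load). As a rough sketch for pulling such lines out of a longer slave log, assuming the message format shown above stays the same, something like the following could be used; the names `EXIT_PATTERN` and `find_executor_exits` are hypothetical.

```python
import re

# Matches slave log lines such as:
#   ... Executor 'default' of framework <id> exited with status 127
# The pattern is inferred from the log excerpt above and is an assumption
# about the message format, not a documented Mesos interface.
EXIT_PATTERN = re.compile(
    r"Executor '(?P<executor>[^']+)' of framework (?P<framework>\S+) "
    r"exited with status (?P<status>\d+)"
)


def find_executor_exits(log_lines):
    """Return a list of (executor, framework, status) tuples, one per
    log line that reports an executor exit."""
    exits = []
    for line in log_lines:
        match = EXIT_PATTERN.search(line)
        if match:
            exits.append(
                (
                    match.group("executor"),
                    match.group("framework"),
                    int(match.group("status")),
                )
            )
    return exits


if __name__ == "__main__":
    sample = [
        "I0706 17:24:44.392073 19450 slave.cpp:3223] Executor 'default' of "
        "framework 20150706-154445-168431176-5050-19360-0000 exited with "
        "status 127",
    ]
    for executor, framework, status in find_executor_exits(sample):
        print(executor, framework, status)
```

A non-zero status found this way only says that the executor process failed; the executor's own stderr in its work directory is where the underlying error message would normally appear.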



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)