Posted to issues@mesos.apache.org by "Matthias Veit (JIRA)" <ji...@apache.org> on 2015/10/20 14:25:27 UTC

[jira] [Created] (MESOS-3766) Can not kill task in Status STAGING

Matthias Veit created MESOS-3766:
------------------------------------

             Summary: Can not kill task in Status STAGING
                 Key: MESOS-3766
                 URL: https://issues.apache.org/jira/browse/MESOS-3766
             Project: Mesos
          Issue Type: Bug
          Components: general
    Affects Versions: 0.25.0
         Environment: OSX 
            Reporter: Matthias Veit


I created a simple Marathon application with an instance count of 100 (100 tasks) that runs a simple sleep command. Before all tasks were running, I killed all of them. The kill succeeded for all but 2 tasks, which remain in state STAGING (according to the Mesos UI). Marathon has been retrying the kill for those 2 tasks every 5 seconds for over an hour now, without success.

I picked one task and grepped the slave log:
{noformat}
I1020 12:39:38.480478 315482112 slave.cpp:1270] Got assigned task app.dc98434b-7716-11e5-a5fc-1ea69edef42d for framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:39:38.887559 315482112 slave.cpp:1386] Launching task app.dc98434b-7716-11e5-a5fc-1ea69edef42d for framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:39:38.898221 315482112 slave.cpp:4852] Launching executor app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000 with resour
I1020 12:39:38.899521 315482112 slave.cpp:1604] Queuing task 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' for executor app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework '80
I1020 12:39:39.740401 313872384 containerizer.cpp:640] Starting container '5ce75a17-12db-4c8f-9131-b40f8280b9f7' for executor 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' of fr
I1020 12:39:40.495931 313872384 containerizer.cpp:873] Checkpointing executor's forked pid 37096 to '/tmp/mesos/meta/slaves/80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0/frameworks
I1020 12:39:41.744439 313335808 slave.cpp:2379] Got registration for executor 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-000
I1020 12:39:42.080734 313335808 slave.cpp:1760] Sending queued task 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' to executor 'app.dc98434b-7716-11e5-a5fc-1ea69edef42d' of frame
I1020 12:40:13.073390 312262656 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:18.079651 312262656 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:23.097504 313335808 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:28.118443 313872384 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:33.138137 313335808 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:38.158529 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:43.177901 314408960 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:48.197852 313872384 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:53.216672 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:40:58.238471 314945536 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:41:03.256614 312799232 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:41:08.276450 313335808 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:41:13.297114 315482112 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:41:18.316463 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
I1020 12:41:23.337116 313872384 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
.
.
.
I1020 14:11:03.614157 316018688 slave.cpp:1789] Asked to kill task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000
{noformat}
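
To check the retry cadence without reading the log by eye, the kill requests can be pulled out of the slave log and the gaps between them computed. The following is only a rough sketch: the function names are made up for illustration, the regular expression assumes the glog line layout seen in the excerpt above, and the year is assumed to be 2015 because glog timestamps do not carry one. The sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

# glog prefix as seen above: severity, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_RE = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ [\w.]+:\d+\] (.*)$"
)


def kill_request_times(lines, task_id, year=2015):
    """Return the timestamps of all 'Asked to kill task' lines for task_id.

    The year is an assumption; glog does not log it.
    """
    times = []
    for line in lines:
        match = GLOG_RE.match(line)
        if not match:
            continue  # skip elision markers and anything that is not a glog line
        mmdd, clock, message = match.groups()
        if message.startswith("Asked to kill task " + task_id):
            stamp = datetime.strptime(f"{year}{mmdd} {clock}", "%Y%m%d %H:%M:%S.%f")
            times.append(stamp)
    return times


def intervals_seconds(times):
    """Return the gaps, in seconds, between consecutive timestamps."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


TASK_ID = "app.dc98434b-7716-11e5-a5fc-1ea69edef42d"
FRAMEWORK = "80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000"

# Three consecutive kill requests taken from the slave log excerpt.
SAMPLE = [
    f"I1020 12:40:13.073390 312262656 slave.cpp:1789] Asked to kill task {TASK_ID} of framework {FRAMEWORK}",
    f"I1020 12:40:18.079651 312262656 slave.cpp:1789] Asked to kill task {TASK_ID} of framework {FRAMEWORK}",
    f"I1020 12:40:23.097504 313335808 slave.cpp:1789] Asked to kill task {TASK_ID} of framework {FRAMEWORK}",
]

times = kill_request_times(SAMPLE, TASK_ID)
gaps = intervals_seconds(times)
print(len(times), [round(gap, 1) for gap in gaps])
```

Run against the full slave log, this should show one request roughly every 5 seconds, matching the Marathon retry interval, with no status update for the task in between.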

The master log looks like this:

{noformat}
I1020 12:39:38.044208 351387648 master.hpp:176] Adding task app.dc98434b-7716-11e5-a5fc-1ea69edef42d with resources cpus(*):0.1; mem(*):16; ports(*):[31232-31232] on slave 80
I1020 12:39:38.044494 351387648 master.cpp:3248] Launching task app.dc98434b-7716-11e5-a5fc-1ea69edef42d of framework 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-0000 (marathon) at 
I1020 12:40:13.061883 350314496 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
I1020 12:40:18.079074 351387648 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
I1020 12:40:23.097110 352460800 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
I1020 12:40:28.117952 352997376 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
I1020 12:40:33.137667 352460800 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
I1020 12:40:38.157832 354070528 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
I1020 12:40:43.177223 353533952 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
.
.
.
I1020 14:11:33.611827 353533952 master.cpp:3482] Telling slave 80ba2050-bf0f-4472-a2f7-2636c4f7b8c8-S0 at slave(1)@127.0.0.1:5051 (localhost) to kill task app.dc98434b-7716-1
{noformat}

In the sandbox, stdout is empty and stderr has the following content:
{noformat}
I1020 12:39:41.551882 2047558400 exec.cpp:134] Version: 0.25.0
{noformat}

Just for reference, this was the Marathon Application used:

{noformat}
{
  "id": "/app", 
  "mem": 16.0, 
  "cmd": "sleep 10000", 
  "cpus": 0.1, 
  "disk": 0.0, 
  "env": {
      "foo": "bla"
  } 
}
{noformat}
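
The definition above does not show the instance count. To reproduce with 100 instances, something like the following sketch could build the request for Marathon's `POST /v2/apps` endpoint. The Marathon address `http://localhost:8080` and the helper names are assumptions for illustration, and the `instances` field is added here from the description rather than taken from the definition as posted. The request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: a local Marathon listening on its default port.
MARATHON_URL = "http://localhost:8080"


def app_definition(instances=100):
    """Return the app definition from this report plus an instance count."""
    return {
        "id": "/app",
        "mem": 16.0,
        "cmd": "sleep 10000",
        "cpus": 0.1,
        "disk": 0.0,
        "env": {"foo": "bla"},
        "instances": instances,  # not in the posted definition; taken from the description
    }


def create_app_request(definition, base_url=MARATHON_URL):
    """Build (but do not send) the HTTP request that creates the app."""
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        base_url + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = create_app_request(app_definition())
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it; killing the tasks before all of them are running is the step that triggered the behaviour described above.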




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)