Posted to issues@aurora.apache.org by "Stephan Erb (JIRA)" <ji...@apache.org> on 2016/06/16 15:14:05 UTC

[jira] [Created] (AURORA-1719) Aurora warns about every terminated executor

Stephan Erb created AURORA-1719:
-----------------------------------

             Summary: Aurora warns about every terminated executor
                 Key: AURORA-1719
                 URL: https://issues.apache.org/jira/browse/AURORA-1719
             Project: Aurora
          Issue Type: Bug
            Reporter: Stephan Erb


A clean executor shutdown in Mesos:
{code}
I0616 13:55:16.580080 16915 master.cpp:4891] Executor 'thermos-role-env-job-0-d94972f8-760e-4bb0-beef-654e2df1f5e0' of framework 20151001-085346-58917130-5050-37976-0000 on slave d4218d85-e294-4405-af4c-80fc7a66f1a4-S0 at slave(1)@xxxx:5051 (xxxx): exited with status 0
I0616 13:55:16.580286 16915 master.cpp:6540] Removing executor 'thermos-role-env-job-0-d94972f8-760e-4bb0-beef-654e2df1f5e0' with resources cpus(*):0.01; mem(*):128 of framework 20151001-085346-58917130-5050-37976-0000 on slave d4218d85-e294-4405-af4c-80fc7a66f1a4-S0 at slave(1)@xxxx:5051 (xxxx)
{code}

On the Aurora side, this produces the following warning and increments the {{scheduler_lost_executors}} stat:

{code}
W0616 14:17:14.603 [Thread-7189, MesosSchedulerImpl:258] Lost executor value: "thermos-role-env-job-0-d94972f8-760e-4bb0-beef-654e2df1f5e0"
{code}

As I read the responsible [Mesos commit|https://github.com/apache/mesos/commit/932df4885c91083697a378ca8f09a47c194e5fbb], it is fully intentional to always send {{executorLost}}, even if the shutdown was clean :-(. We should therefore add a guard to our handler so that it only logs a warning (and bumps {{scheduler_lost_executors}}) when the executor did not exit cleanly.
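
A minimal sketch of what such a guard could look like. This is not the actual {{MesosSchedulerImpl}} code: the class name {{ExecutorLostGuard}}, the plain-string IDs, and the in-memory counter are hypothetical stand-ins for illustration. It also assumes that the {{status}} value passed along with {{executorLost}} is 0 for a clean exit, as the master log above suggests; that would need to be verified against Mesos before relying on it.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/**
 * Hypothetical stand-in for the executorLost handling in the scheduler.
 * Plain strings replace the Mesos ExecutorID and SlaveID types so the
 * sketch compiles without any Mesos dependency.
 */
final class ExecutorLostGuard {
  private static final Logger LOG =
      Logger.getLogger(ExecutorLostGuard.class.getName());

  // Stand-in for the scheduler_lost_executors stat.
  private final AtomicLong lostExecutors = new AtomicLong();

  /**
   * Handles an executorLost notification.
   *
   * @return true if the executor was counted as lost, false if the exit
   *     was treated as clean.
   */
  boolean executorLost(String executorId, String slaveId, int status) {
    // Assumption: status 0 means the executor shut down cleanly.
    if (status == 0) {
      LOG.fine("Executor " + executorId + " on " + slaveId + " exited cleanly.");
      return false;
    }

    // Only a non-zero status produces the warning and the stat increment.
    LOG.warning("Lost executor " + executorId + " on " + slaveId
        + " with status " + status);
    lostExecutors.incrementAndGet();
    return true;
  }

  long getLostExecutors() {
    return lostExecutors.get();
  }
}
```

With this shape, a notification with status 0 leaves the counter untouched, while any other status is logged at warning level and counted exactly as before.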





