Posted to issues@mesos.apache.org by "Meng Zhu (JIRA)" <ji...@apache.org> on 2019/03/22 20:14:00 UTC

[jira] [Created] (MESOS-9673) Add timeout mechanism to GC incomplete task.

Meng Zhu created MESOS-9673:
-------------------------------

             Summary: Add timeout mechanism to GC incomplete task.
                 Key: MESOS-9673
                 URL: https://issues.apache.org/jira/browse/MESOS-9673
             Project: Mesos
          Issue Type: Improvement
          Components: containerization
            Reporter: Meng Zhu


Currently, an executor's meta and sandbox directories are only GCed when a task is completed, i.e. a terminal task with all status updates acked.

However, if a status update remains unacked, the agent will keep resending it and keep the directories forever.

One issue is that the agent will keep recovering this executor upon every failover, and if a later executor happens to use the same pid (almost a certainty, considering the old meta dir will never be GCed), it will send the agent into a crash loop (MESOS-9672).

We should consider introducing a timeout mechanism to GC incomplete tasks.
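
A rough sketch of what such a timeout check could look like is below. It is illustrative only and not Mesos code: `TaskState`, `should_gc`, and `timeout_secs` are hypothetical names, and the choice to measure the timeout from the moment the task became terminal is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskState:
    # Hypothetical record for illustration; not a Mesos data structure.
    terminal: bool                          # task has reached a terminal state
    acked: bool                             # all status updates were acknowledged
    terminal_time: Optional[float] = None   # when the task became terminal (seconds)


def should_gc(tasks: List[TaskState], now: float, timeout_secs: float) -> bool:
    """Return True if the executor's meta and sandbox directories may be GCed."""
    for task in tasks:
        if not task.terminal:
            return False  # a task is still running: never GC
        if task.acked:
            continue      # completed task: eligible under the current behavior
        # Incomplete task (terminal but unacked): eligible only after the timeout.
        if task.terminal_time is None or now - task.terminal_time < timeout_secs:
            return False
    return True
```

With this check, an executor whose only remaining obstacle is an unacked terminal update would become eligible for GC once `timeout_secs` has elapsed, instead of being recovered on every failover.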



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)