Posted to dev@mesos.apache.org by "Benjamin Mahler (JIRA)" <ji...@apache.org> on 2013/03/16 21:34:12 UTC

[jira] [Assigned] (MESOS-396) Slave GarbageCollector needs to delete the parent executor directories. It currently only deletes the executor run directories.

     [ https://issues.apache.org/jira/browse/MESOS-396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler reassigned MESOS-396:
-------------------------------------

    Assignee: Benjamin Mahler
    
> Slave GarbageCollector needs to delete the parent executor directories. It currently only deletes the executor run directories.
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-396
>                 URL: https://issues.apache.org/jira/browse/MESOS-396
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Blocker
>
> As a result, long-lived slaves accumulate a large number of empty executor directories. All that remains in each of these directories is a broken link to the 'latest' run.
> Over time, as the number of empty executor directories on a slave approaches LINK_MAX, mkdir fails and the slave crashes, as was found in MESOS-391.
> The fix is to schedule the executor parent directories for deletion as well. However, the GC module does not know whether a parent executor directory can safely be deleted, because more tasks may have been launched with the same executor id after the directory was scheduled for deletion.
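
One possible way to handle the race described above is to cancel the pending deletion of the parent directory whenever the same executor id is launched again. The following is only a minimal sketch in Python, not the Mesos C++ implementation: the GarbageCollector class, its schedule/unschedule/prune methods, the on_executor_launch and on_executor_exit helpers, and the executor_dir/runs/run_id layout are all hypothetical names chosen for illustration.

```python
import os
import shutil
import time


class GarbageCollector:
    """Hypothetical GC: maps a path to the time after which it may be removed."""

    def __init__(self):
        self._deadlines = {}

    def schedule(self, path, delay_secs, now=None):
        # Record (or refresh) the deletion deadline for this path.
        now = time.time() if now is None else now
        self._deadlines[path] = now + delay_secs

    def unschedule(self, path):
        # Cancel a pending deletion; returns True if one was pending.
        return self._deadlines.pop(path, None) is not None

    def prune(self, now=None):
        # Delete every path whose deadline has passed and return those paths.
        now = time.time() if now is None else now
        expired = [p for p, t in self._deadlines.items() if t <= now]
        for path in expired:
            del self._deadlines[path]
            shutil.rmtree(path, ignore_errors=True)
        return expired


def on_executor_launch(gc, executor_dir, run_id):
    # The executor id is in use again, so its parent directory must survive.
    gc.unschedule(executor_dir)
    run_dir = os.path.join(executor_dir, "runs", run_id)  # assumed layout
    os.makedirs(run_dir)
    return run_dir


def on_executor_exit(gc, executor_dir, run_dir, delay_secs, now=None):
    # Schedule both the run directory and its parent executor directory.
    gc.schedule(run_dir, delay_secs, now)
    gc.schedule(executor_dir, delay_secs, now)
```

In this sketch the GC itself still does not know whether a directory is safe to delete; the caller that launches executors takes responsibility for unscheduling the parent directory before creating a new run inside it.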

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira