Posted to issues@tez.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2017/12/15 20:29:00 UTC

[jira] [Commented] (TEZ-3877) Delete spill files once merge is done

    [ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293186#comment-16293186 ] 

Jason Lowe commented on TEZ-3877:
---------------------------------

TEZ-3350 tracks the problem where intermediate spills are not placed in the container-specific directory.  Even if the task deletes intermediate spills once they are consumed, we still need TEZ-3350 to solve the problem of "leaking" spill files if the task crashes or is killed during a shuffle/merge.
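
To illustrate the distinction, here is a rough sketch of the "delete once consumed" half. It is not Tez code: the class SpillCleanupSketch and the method mergeAndDelete are hypothetical names, and plain concatenation stands in for the real sorted merge. It only shows that each spill can be removed as soon as it has been read.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Tez.
class SpillCleanupSketch {

    // Copies every spill file into `output` (a stand-in for the real
    // sorted merge) and deletes each spill as soon as it has been read.
    static void mergeAndDelete(List<Path> spills, Path output) throws IOException {
        try (OutputStream out = Files.newOutputStream(output)) {
            for (Path spill : spills) {
                Files.copy(spill, out);      // consume the spill
                Files.deleteIfExists(spill); // reclaim its disk space right away
            }
        }
        // If the process is killed inside the loop, the spills not yet
        // reached stay on disk. Nothing in this method can prevent that.
    }
}
```

A task that dies partway through the loop leaves the remaining spills behind, which is why the files also need to live somewhere that gets cleaned up independently of the task.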


> Delete spill files once merge is done
> -------------------------------------
>
>                 Key: TEZ-3877
>                 URL: https://issues.apache.org/jira/browse/TEZ-3877
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>
>   I see that spill files are not deleted right after the merge completes. We should delete them at that point, since they take up a lot of space and we can't afford that waste when Tez uses a lot of shuffle space for complex DAGs. [~jlowe] told me they are only cleaned up after the application completes, because they are written to the app directory and not the container directory. They should also be written to the container directory so that the node manager cleans them up after task failures or container crashes.


