Posted to issues@tez.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2018/01/05 22:08:00 UTC

[jira] [Updated] (TEZ-3877) Delete unordered spill files once merge is done

     [ https://issues.apache.org/jira/browse/TEZ-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated TEZ-3877:
----------------------------
    Attachment: TEZ-3877.001.patch

Attaching a patch that cleans up the intermediate spills in the unordered writer after the merge completes or encounters an error.
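
For readers following along, the cleanup pattern described above (delete the spills whether the merge succeeds or fails) can be sketched roughly as below. This is only an illustration and not the code in TEZ-3877.001.patch: the class name SpillCleanupSketch, the methods mergeAndCleanup and demo, and the file names are all hypothetical, and the "merge" is a plain byte concatenation standing in for the real one.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and structure are assumptions, not Tez APIs.
final class SpillCleanupSketch {

    // Concatenates the spill files into finalOutput (a stand-in for the real
    // merge), then deletes every spill file whether the merge succeeded or threw.
    static void mergeAndCleanup(List<Path> spills, Path finalOutput) throws IOException {
        try (OutputStream out = Files.newOutputStream(finalOutput)) {
            for (Path spill : spills) {
                Files.copy(spill, out);
            }
        } finally {
            for (Path spill : spills) {
                try {
                    Files.deleteIfExists(spill);
                } catch (IOException e) {
                    // Best effort: a failed delete should not mask a merge error.
                    System.err.println("Could not delete spill " + spill + ": " + e);
                }
            }
        }
    }

    // Exercises the success path and the failure path in a temp directory.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("spill-sketch");
            Path s0 = Files.write(dir.resolve("spill_0.out"), "a".getBytes());
            Path s1 = Files.write(dir.resolve("spill_1.out"), "b".getBytes());
            Path out = dir.resolve("file.out");

            mergeAndCleanup(Arrays.asList(s0, s1), out);
            boolean successOk = new String(Files.readAllBytes(out)).equals("ab")
                    && Files.notExists(s0) && Files.notExists(s1);

            // Failure path: the first spill is missing, so the merge throws,
            // but the remaining spill must still be removed.
            Path s2 = Files.write(dir.resolve("spill_2.out"), "c".getBytes());
            Path missing = dir.resolve("spill_missing.out");
            boolean threw = false;
            try {
                mergeAndCleanup(Arrays.asList(missing, s2), out);
            } catch (IOException expected) {
                threw = true;
            }
            return successOk && threw && Files.notExists(s2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cleanup verified: " + demo());
    }
}
```

Putting the deletes in a finally block is what covers the "encounters an error" case; the per-file catch keeps one failed delete from hiding the original merge exception or skipping the remaining spills.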


> Delete unordered spill files once merge is done
> -----------------------------------------------
>
>                 Key: TEZ-3877
>                 URL: https://issues.apache.org/jira/browse/TEZ-3877
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Jason Lowe
>         Attachments: TEZ-3877.001.patch
>
>
>   I see that spill files are not deleted right after the merge completes. We should delete them at that point: they take up a lot of space, and we can't afford that waste when Tez already uses a lot of shuffle space with complex DAGs. [~jlowe] told me they are only cleaned up after the application completes because they are written to the app directory and not the container directory. That should also change, so that the node manager cleans them up on task failures or container crashes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)