Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/11/03 01:58:33 UTC

[jira] [Updated] (TEZ-1731) OnDiskMerger can end up clobbering files across tasks with LocalDiskFetch

     [ https://issues.apache.org/jira/browse/TEZ-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Seth updated TEZ-1731:
--------------------------------
    Attachment: TEZ-1731.1.txt

Attached a patch that makes the merge use the same filename that would have been used for a fetch without the local-fetch optimization.

The patch also addresses TEZ-1727 by using only the name component of the path instead of the entire path.

[~pramachandran], [~rajesh.balamohan], [~gopalv] - review please.

> OnDiskMerger can end up clobbering files across tasks with LocalDiskFetch
> -------------------------------------------------------------------------
>
>                 Key: TEZ-1731
>                 URL: https://issues.apache.org/jira/browse/TEZ-1731
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: TEZ-1731.1.txt
>
>
> When an on-disk fetch starts with LOCAL files (optimize.local.fetch), the filename used by the merger is based on the source file name. This name can be the same for all tasks reading the same input on the node, so files can be overwritten between tasks, depending on the order in which events are processed and on the dir allocated by the local dir-allocator.
> This leads to ChecksumExceptions and FileNotFoundExceptions during the merge.
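
To make the collision described above concrete, here is a minimal sketch in plain Java. It is not the Tez code or the attached patch: the class MergeOutputNames, both helper methods, the ".merged" suffix, the example paths, and the task identifiers are all hypothetical and chosen only for illustration. It shows that a merge output name derived only from the source file is identical for two tasks that land in the same local dir, while a name that includes something unique to the consuming task is not.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class MergeOutputNames {

    // Hypothetical: output name derived only from the source file's name,
    // which is the situation the issue describes. Nothing task-specific
    // goes into the result.
    static Path sourceBasedName(Path localDir, Path sourceFile) {
        return localDir.resolve(sourceFile.getFileName().toString() + ".merged");
    }

    // Hypothetical: output name that includes an identifier unique to the
    // consuming task, so two tasks never compute the same path.
    static Path taskScopedName(Path localDir, String uniqueTaskId, int srcIndex) {
        return localDir.resolve(uniqueTaskId + "_src_" + srcIndex + ".merged");
    }

    public static void main(String[] args) {
        // Assumed example values: both tasks are given the same local dir
        // and read the same source file on the node.
        Path localDir = Paths.get("/tmp/local-dir-0");
        Path source = Paths.get("/data/shuffle/output_3/file.out");

        Path taskA = sourceBasedName(localDir, source);
        Path taskB = sourceBasedName(localDir, source);
        System.out.println("source-based names collide: " + taskA.equals(taskB));

        Path scopedA = taskScopedName(localDir, "task_A", 3);
        Path scopedB = taskScopedName(localDir, "task_B", 3);
        System.out.println("task-scoped names collide: " + scopedA.equals(scopedB));
    }
}
```

With the source-based scheme, whichever task writes second overwrites the first task's file, which is consistent with the checksum and missing-file errors reported during the merge. The sketch also uses only the name component of the source path (getFileName) rather than the full path when building the output name.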



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)