Posted to issues@tez.apache.org by "Kuhu Shukla (JIRA)" <ji...@apache.org> on 2018/07/06 15:44:00 UTC

[jira] [Updated] (TEZ-3912) Fetchers should be more robust to corrupted inputs

     [ https://issues.apache.org/jira/browse/TEZ-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kuhu Shukla updated TEZ-3912:
-----------------------------
    Attachment: TEZ-3912.002.patch

> Fetchers should be more robust to corrupted inputs
> --------------------------------------------------
>
>                 Key: TEZ-3912
>                 URL: https://issues.apache.org/jira/browse/TEZ-3912
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jason Lowe
>            Assignee: Kuhu Shukla
>            Priority: Major
>         Attachments: TEZ-3912.001.patch, TEZ-3912.002.patch
>
>
> I recently saw a case where a bad node in the cluster produced corrupted shuffle data, which caused the codec to throw IllegalArgumentException during a fetch. Fetchers currently handle only IOException and InternalError; any other type of exception causes the entire task to be torn down. We should consider catching Exception, as MapReduce does, so that fetchers are more robust to other types of errors coming from the codec and retries can still occur.
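
A minimal sketch of the idea described above, not the attached patch and not Tez's actual Fetcher code. Every name here (RobustFetchSketch, SegmentReader, Outcome, fetchOnce, fetchWithRetries) is hypothetical and chosen only for illustration. It assumes a failed fetch can simply be attempted again; a real fetcher would presumably also report the failed input so it can be re-fetched or regenerated.

```java
import java.io.IOException;

// Hypothetical illustration only; none of these names come from Tez.
final class RobustFetchSketch {

    /** Result of a single fetch attempt. */
    enum Outcome { SUCCESS, RETRYABLE_FAILURE }

    /** Stand-in for reading and decoding one shuffle input; may throw anything. */
    interface SegmentReader {
        void read() throws Exception;
    }

    /** Runs one attempt and maps failures to a retryable outcome. */
    static Outcome fetchOnce(SegmentReader reader) {
        try {
            reader.read();
            return Outcome.SUCCESS;
        } catch (IOException | InternalError e) {
            // The two failure types the issue says fetchers already handle.
            return Outcome.RETRYABLE_FAILURE;
        } catch (Exception e) {
            // The proposed addition: a codec fed corrupted data may throw
            // unchecked exceptions such as IllegalArgumentException. Treat
            // them as a failed fetch instead of letting them escape and
            // tear down the whole task.
            return Outcome.RETRYABLE_FAILURE;
        }
    }

    /** Retries up to maxAttempts times; returns true once an attempt succeeds. */
    static boolean fetchWithRetries(SegmentReader reader, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (fetchOnce(reader) == Outcome.SUCCESS) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch, a reader that throws IllegalArgumentException on its first call and succeeds on its second makes fetchWithRetries return true, whereas without the catch-all branch the exception would propagate to the caller.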



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)