Posted to issues@tez.apache.org by "TezQA (JIRA)" <ji...@apache.org> on 2018/09/05 20:31:00 UTC

[jira] [Commented] (TEZ-3984) Shuffle: Out of Band DME event sending causes errors

    [ https://issues.apache.org/jira/browse/TEZ-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604904#comment-16604904 ] 

TezQA commented on TEZ-3984:
----------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12938522/TEZ-3984.3.patch
  against master revision a37a367.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests:
                   org.apache.tez.runtime.library.output.TestOnFileSortedOutput

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2907//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2907//console

This message is automatically generated.


> Shuffle: Out of Band DME event sending causes errors
> ----------------------------------------------------
>
>                 Key: TEZ-3984
>                 URL: https://issues.apache.org/jira/browse/TEZ-3984
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.4, 0.9.1, 0.10.0
>            Reporter: Gopal V
>            Assignee: Jaume M
>            Priority: Critical
>              Labels: correctness
>         Attachments: TEZ-3984.1.patch, TEZ-3984.2.patch, TEZ-3984.3.patch
>
>
> If a task Input throws an exception, the outputs are also closed in LogicalIOProcessorRuntimeTask.cleanup().
> cleanup() ignores all the events returned by closing the outputs. However, if any output sends an event out of band by calling outputContext.sendEvents(events) directly, those events can reach the AM before the task failure is reported.
> This can cause correctness issues with shuffle: zero-sized events can be sent out because of an input failure, and downstream tasks may never reattempt a fetch from the valid attempt.
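
The ordering problem described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: FakeAppMaster, FakeOutputContext, FakeOutput and the guardEnabled flag are hypothetical stand-ins invented for illustration. They are not Tez classes and are not taken from the attached patches. The guard shown (dropping out-of-band events once the task is marked failed) is one possible mitigation, not necessarily the one the patch implements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OutOfBandEventSketch {

    /** Hypothetical stand-in for the AM: records events in arrival order. */
    static final class FakeAppMaster {
        final List<String> received = new ArrayList<>();
    }

    /** Hypothetical stand-in for an output context (not the Tez API). */
    static final class FakeOutputContext {
        private final FakeAppMaster am;
        private final boolean guardEnabled;
        private boolean taskFailed = false;

        FakeOutputContext(FakeAppMaster am, boolean guardEnabled) {
            this.am = am;
            this.guardEnabled = guardEnabled;
        }

        void markTaskFailed() {
            taskFailed = true;
        }

        /** Out-of-band path: events go straight to the AM unless guarded. */
        void sendEvents(List<String> events) {
            if (guardEnabled && taskFailed) {
                return; // assumed mitigation: drop events from a failed task
            }
            am.received.addAll(events);
        }
    }

    /** Output whose close() sends one event out of band and returns another. */
    static final class FakeOutput {
        private final FakeOutputContext context;

        FakeOutput(FakeOutputContext context) {
            this.context = context;
        }

        List<String> close() {
            context.sendEvents(Collections.singletonList("DME(empty, out-of-band)"));
            return Collections.singletonList("DME(returned)");
        }
    }

    /** Simulates an input failure followed by cleanup and failure reporting. */
    static List<String> runFailingTask(boolean guardEnabled) {
        FakeAppMaster am = new FakeAppMaster();
        FakeOutputContext context = new FakeOutputContext(am, guardEnabled);
        FakeOutput output = new FakeOutput(context);

        context.markTaskFailed();       // the input threw an exception
        output.close();                 // cleanup: returned events are ignored
        am.received.add("TASK_FAILED"); // the failure is reported only now
        return am.received;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + runFailingTask(false));
        System.out.println("with guard:    " + runFailingTask(true));
    }
}
```

Without the guard, the simulated AM sees the empty out-of-band event before TASK_FAILED, which is the ordering the description warns about. With the guard, only TASK_FAILED arrives.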



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)