Posted to issues@tez.apache.org by "TezQA (JIRA)" <ji...@apache.org> on 2015/04/20 09:35:58 UTC

[jira] [Commented] (TEZ-2313) Regression in handling obsolete events in ShuffleScheduler

    [ https://issues.apache.org/jira/browse/TEZ-2313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502469#comment-14502469 ] 

TezQA commented on TEZ-2313:
----------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12726521/TEZ-2313.1.patch
  against master revision 57c62f1.

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/489//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/489//console

This message is automatically generated.

> Regression in handling obsolete events in ShuffleScheduler
> ----------------------------------------------------------
>
>                 Key: TEZ-2313
>                 URL: https://issues.apache.org/jira/browse/TEZ-2313
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Assignee: Rajesh Balamohan
>            Priority: Blocker
>         Attachments: TEZ-2313.1.patch, TEZ-2313.WIP.patch
>
>
> /cc [~rohini]
> When an obsolete event is received, the shuffle scheduler fails fast even when pipelining is disabled. IIRC, obsolete inputs were supposed to fail the shuffled inputs only if we were reading and merging partial spilled outputs. In this case pipelining is not on, so it is not clear why we are failing fast.
> {noformat}
> Caused by: java.io.IOException: InputAttemptIdentifier [inputIdentifier=InputIdentifier [inputIndex=4485], attemptNumber=1, pathComponent=null, fetchTypeInfo=FINAL_MERGE_ENABLED, spillEventId=-1] is marked as obsoleteInput, but it exists in shuffleInfoEventMap. Some data could have been already merged to memory/disk outputs. Failing the fetch early.
> at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.obsoleteInput(ShuffleScheduler.java:546)
> at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleInputEventHandlerOrderedGrouped.processTaskFailedEvent(ShuffleInputEventHandlerOrderedGrouped.java:122)
> at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleInputEventHandlerOrderedGrouped.handleEvent(ShuffleInputEventHandlerOrderedGrouped.java:73)
> at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleInputEventHandlerOrderedGrouped.handleEvents(ShuffleInputEventHandlerOrderedGrouped.java:63)
> at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle.handleEvents(Shuffle.java:246)
> at org.apache.tez.runtime.library.input.OrderedGroupedKVInput.handleEvents(OrderedGroupedKVInput.java:265)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:620)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:93)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.runInternal(LogicalIOProcessorRuntimeTask.java:683)
> at org.apache.tez.common.RunnableWithNdc.run(RunnableWithNdc.java:35){noformat}
> /cc [~rajesh.balamohan]
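
For illustration, the behaviour the description expects can be sketched roughly as follows. This is a minimal, hypothetical sketch and not the actual ShuffleScheduler code or the attached patch: the class, its fields, and its methods are invented for this example, and it assumes that spillEventId=-1 (as in the trace above) means the input was not produced with pipelining.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; not the real Tez ShuffleScheduler. */
final class ObsoleteInputSketch {
    // Inputs that have been marked obsolete and should be ignored from now on.
    private final Set<Integer> obsoleteInputs = new HashSet<>();
    // Number of partial spills already received per input (pipelined case only).
    private final Map<Integer, Integer> partialSpillsSeen = new HashMap<>();

    /** Records that one partial spill of the given input has been received. */
    void recordPartialSpill(int inputIndex) {
        partialSpillsSeen.merge(inputIndex, 1, Integer::sum);
    }

    /**
     * Handles an obsolete-input event.
     *
     * @return true if the task has to fail fast, false if the input was
     *         simply marked obsolete.
     */
    boolean obsoleteInput(int inputIndex, int spillEventId) {
        // Assumption: a negative spillEventId means pipelining is off.
        boolean pipelined = spillEventId >= 0;
        if (pipelined && partialSpillsSeen.containsKey(inputIndex)) {
            // Partial spills may already be merged to memory/disk, so the
            // fetched data cannot be trusted any more.
            return true;
        }
        // No partial data can have been merged: just mark the input obsolete.
        obsoleteInputs.add(inputIndex);
        return false;
    }

    boolean isObsolete(int inputIndex) {
        return obsoleteInputs.contains(inputIndex);
    }
}
```

Under these assumptions, an obsolete event for input 4485 with spillEventId=-1 would only mark the input obsolete, and fail-fast would be reserved for inputs whose partial spills have already been received.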


