Posted to dev@tez.apache.org by "Kuhu Shukla (JIRA)" <ji...@apache.org> on 2018/04/18 19:17:00 UTC

[jira] [Created] (TEZ-3917) Speculative task attempt's DMEs can cause downstream fetcher to NPE or duplicate fetch

Kuhu Shukla created TEZ-3917:
--------------------------------

             Summary: Speculative task attempt's DMEs can cause downstream fetcher to NPE or duplicate fetch
                 Key: TEZ-3917
                 URL: https://issues.apache.org/jira/browse/TEZ-3917
             Project: Apache Tez
          Issue Type: Bug
    Affects Versions: 0.9.1
            Reporter: Kuhu Shukla
            Assignee: Kuhu Shukla


STA0 , STA1
     |
     |
DTA0 , DTA1

Take the above example: DTA0 initially fetches from an upstream source task that has two attempts, one of them speculative (say STA1).

There is a race in which the DME (DataMovementEvent) from STA1 reaches DTA0 and its output is fetched first; the later fetch from STA0 (the successful attempt) is then marked as a duplicate. This happens because STA1 sends its DME before the AM marks it as killed.

This additional event can also lead to an NPE: a fetcher thread is assigned the extra output to fetch while ShuffleScheduler believes it has already fetched all the map outputs, because it is not prepared to handle the extra events coming in from the speculative attempts.

There are cases where DTA0 hits the NPE and DTA1 shows duplicate fetches.
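
The ordering problem can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class SpeculativeFetchModel and its methods are hypothetical names invented for illustration, it is not Tez's ShuffleScheduler, and it is not a proposed fix. It tracks one accepted attempt per source task index, so whichever attempt's event arrives first wins and any later event for the same task is ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical toy model (not Tez code): one accepted attempt per source task. */
class SpeculativeFetchModel {
    /** Result of offering a data movement event to the model. */
    enum Outcome { SCHEDULED, IGNORED }

    // source task index -> attempt number whose event was accepted first
    private final Map<Integer, Integer> acceptedAttempt = new HashMap<>();
    private final int expectedInputs;

    SpeculativeFetchModel(int expectedInputs) {
        this.expectedInputs = expectedInputs;
    }

    /** First event seen for a task index is scheduled; later ones are ignored. */
    synchronized Outcome onEvent(int taskIndex, int attempt) {
        Integer prior = acceptedAttempt.putIfAbsent(taskIndex, attempt);
        return prior == null ? Outcome.SCHEDULED : Outcome.IGNORED;
    }

    /** True once every expected source task has one accepted attempt. */
    synchronized boolean allInputsAccounted() {
        return acceptedAttempt.size() == expectedInputs;
    }

    /** Attempt number accepted for the task, or -1 if none yet. */
    synchronized int acceptedAttemptFor(int taskIndex) {
        return acceptedAttempt.getOrDefault(taskIndex, -1);
    }
}
```

In this model, if the event from attempt 1 (standing in for STA1) arrives before the event from attempt 0 (STA0), the speculative attempt is the one recorded and the successful attempt's event is treated as the extra one, which mirrors the duplicate-fetch symptom described above.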



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)