Posted to issues@hive.apache.org by "Prasanth Jayachandran (Jira)" <ji...@apache.org> on 2020/08/26 17:49:00 UTC

[jira] [Resolved] (HIVE-24068) Add re-execution plugin for handling DAG submission and unmanaged AM failures

     [ https://issues.apache.org/jira/browse/HIVE-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran resolved HIVE-24068.
------------------------------------------
    Resolution: Fixed

PR merged to master. Thanks [~kgyrtkirk] for the review!

> Add re-execution plugin for handling DAG submission and unmanaged AM failures
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-24068
>                 URL: https://issues.apache.org/jira/browse/HIVE-24068
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> DAG submission failure can also happen in environments where the AM container has died, causing DNS issues. DAG submissions are safe to retry because the DAG has not started execution yet. There are already retries at the getSession and submitDAG levels individually, but some submitDAG failures also need to retry getSession because the AM could be unreachable. This can be handled in a re-execution plugin.
> There is already a re-execution plugin for AM loss, but it only handles managed AMs. It can be extended to handle unmanaged AMs as well.
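
The retry shape described above (a failed submitDAG going back through getSession before resubmitting) can be illustrated with a minimal, self-contained sketch. It is not the code merged for HIVE-24068 and it uses no Hive or Tez APIs: ResubmitSketch, SubmissionException, submitWithFreshSession and the maxAttempts parameter are all hypothetical names introduced only for illustration. The sketch assumes a submission failure means nothing has executed yet, so repeating the whole acquire-then-submit sequence is safe.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names are Hive or Tez APIs.
final class ResubmitSketch {

    /** Signals (in this sketch only) that the DAG was never submitted, so no work has run. */
    static final class SubmissionException extends RuntimeException {
        SubmissionException(String message) {
            super(message);
        }
    }

    /**
     * Acquires a session and submits through it. If submission fails, the old
     * session is discarded and BOTH steps are repeated, because the AM behind
     * the old session may be unreachable. Gives up after maxAttempts tries.
     */
    static <S, R> R submitWithFreshSession(Supplier<S> getSession,
                                           Function<S, R> submit,
                                           int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SubmissionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Re-acquire the session on every attempt instead of reusing a stale one.
            S session = getSession.get();
            try {
                return submit.apply(session);
            } catch (SubmissionException e) {
                // Only submission failures are retried; any other exception propagates.
                last = e;
            }
        }
        throw last;
    }
}
```

Retrying only on the dedicated submission exception keeps the sketch in line with the reasoning in the description: a failure after the DAG has started is a different case and is left to propagate.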



--
This message was sent by Atlassian Jira
(v8.3.4#803005)