Posted to dev@gobblin.apache.org by "Chen Guo (Jira)" <ji...@apache.org> on 2019/12/10 22:42:00 UTC

[jira] [Resolved] (GOBBLIN-998) ExecutionStatus should be reset to PENDING before a job retries

     [ https://issues.apache.org/jira/browse/GOBBLIN-998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Guo resolved GOBBLIN-998.
------------------------------
    Resolution: Fixed

> ExecutionStatus should be reset to PENDING before a job retries
> ---------------------------------------------------------------
>
>                 Key: GOBBLIN-998
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-998
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Chen Guo
>            Priority: Critical
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> In KafkaJobStatusMonitor.modifyStateIfRetryRequired, when the state is FAILED and currentAttempts < maxAttempts, the ExecutionStatus is set to RUNNING.
> However, because of the change checked in for GOBBLIN-974 ([https://github.com/apache/incubator-gobblin/blob/9f50a2563cc257039da44018663b6b9e119fb499/gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java#L159]), a currentAttempts update carried by a lower-order event (such as ORCHESTRATED) can no longer be consumed to update the jobState file. Because the stored value is never updated, DagManagerThread retries failed jobs indefinitely whenever it runs pollAndAdvanceDag.
>
> The solution is to reset ExecutionStatus to PENDING instead of RUNNING.
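
The interaction described above can be illustrated with a minimal, self-contained sketch. It is not the Gobblin code: the class RetryStatusSketch, the methods accepts and statusForRetry, and the exact status ordering are all assumptions made for illustration. It only shows why resetting to RUNNING blocks a later lower-order event, while resetting to PENDING lets it through.

```java
import java.util.Arrays;
import java.util.List;

public class RetryStatusSketch {
    // Assumed ordering of statuses, lowest first (hypothetical; not copied from Gobblin).
    enum Status { COMPILED, PENDING, ORCHESTRATED, RUNNING, COMPLETE, FAILED }

    static final List<Status> ORDER = Arrays.asList(Status.values());

    // Hypothetical stand-in for the GOBBLIN-974 guard: an incoming event is
    // applied only if its status is not lower-order than the stored status.
    static boolean accepts(Status stored, Status incoming) {
        return ORDER.indexOf(incoming) >= ORDER.indexOf(stored);
    }

    // Hypothetical stand-in for the retry check: a FAILED job with attempts
    // remaining is moved to resetTo; any other state is left unchanged.
    static Status statusForRetry(Status status, int currentAttempts,
                                 int maxAttempts, Status resetTo) {
        if (status == Status.FAILED && currentAttempts < maxAttempts) {
            return resetTo;
        }
        return status;
    }

    public static void main(String[] args) {
        // Behavior before the fix: reset to RUNNING.
        Status before = statusForRetry(Status.FAILED, 1, 3, Status.RUNNING);
        // Behavior after the fix: reset to PENDING.
        Status after = statusForRetry(Status.FAILED, 1, 3, Status.PENDING);

        // The retry's ORCHESTRATED event carries the new currentAttempts value.
        System.out.println(accepts(before, Status.ORCHESTRATED)); // false: update dropped
        System.out.println(accepts(after, Status.ORCHESTRATED));  // true: update applied
    }
}
```

Under these assumptions, the RUNNING reset leaves the stored status above ORCHESTRATED, so the event that would advance currentAttempts is ignored; the PENDING reset places it below, so the event is consumed.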



--
This message was sent by Atlassian Jira
(v8.3.4#803005)