Posted to dev@gobblin.apache.org by "Chen Guo (Jira)" <ji...@apache.org> on 2019/12/06 22:32:00 UTC

[jira] [Created] (GOBBLIN-998) ExecutionStatus should be reset to PENDING before a job retries

Chen Guo created GOBBLIN-998:
--------------------------------

             Summary: ExecutionStatus should be reset to PENDING before a job retries
                 Key: GOBBLIN-998
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-998
             Project: Apache Gobblin
          Issue Type: Bug
            Reporter: Chen Guo


In the modifyStateIfRetryRequired method of KafkaJobStatusMonitor, when the state is FAILED and currentAttempts < maxAttempts, the ExecutionStatus is set to RUNNING.

However, because of the change checked in for GOBBLIN-974 ([https://github.com/apache/incubator-gobblin/blob/9f50a2563cc257039da44018663b6b9e119fb499/gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java#L159]), the currentAttempts update carried by an event of lower order than RUNNING (such as ORCHESTRATED) can no longer be consumed to update the jobState file. As a result, DagManagerThread retries failed jobs indefinitely when it runs pollAndAdvanceDag.

 

The proposed solution is to set ExecutionStatus to PENDING instead of RUNNING when a retry is required.
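
A minimal sketch of the proposed behavior follows. It is not the actual Gobblin code: the class name, the map-based state, the property keys, the enum ordering and the accepts() helper are all assumptions made for illustration. It shows why resetting to PENDING could let a later ORCHESTRATED event (and its currentAttempts update) through, if events of lower order than the stored status are dropped.

```java
import java.util.Map;

// Hypothetical stand-in for the retry logic described in this issue.
class RetryStatusSketch {
    // Assumed ordering, lowest to highest; the real enum may differ.
    enum ExecutionStatus { PENDING, ORCHESTRATED, RUNNING, COMPLETE, FAILED }

    // Hypothetical property keys, not the real Gobblin constants.
    static final String STATUS = "status";
    static final String CURRENT_ATTEMPTS = "currentAttempts";
    static final String MAX_ATTEMPTS = "maxAttempts";
    static final String SHOULD_RETRY = "shouldRetry";

    // Proposed fix: a failed job with attempts left goes back to PENDING,
    // not RUNNING.
    static void modifyStateIfRetryRequired(Map<String, String> state) {
        int maxAttempts = Integer.parseInt(state.getOrDefault(MAX_ATTEMPTS, "1"));
        int currentAttempts = Integer.parseInt(state.getOrDefault(CURRENT_ATTEMPTS, "1"));
        boolean failed = ExecutionStatus.FAILED.name().equals(state.get(STATUS));
        if (failed && currentAttempts < maxAttempts) {
            state.put(SHOULD_RETRY, "true");
            state.put(STATUS, ExecutionStatus.PENDING.name());
        }
    }

    // Simplified model of an ordering check: an incoming event is applied
    // only if it is not of lower order than the stored status.
    static boolean accepts(ExecutionStatus stored, ExecutionStatus incoming) {
        return incoming.ordinal() >= stored.ordinal();
    }
}
```

Under these assumptions, a stored status of RUNNING rejects a subsequent ORCHESTRATED event, while a stored status of PENDING accepts it, so the updated currentAttempts can be persisted and the retry count can eventually reach maxAttempts.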



--
This message was sent by Atlassian Jira
(v8.3.4#803005)