Posted to dev@gobblin.apache.org by "Chen Guo (Jira)" <ji...@apache.org> on 2019/12/06 22:32:00 UTC
[jira] [Created] (GOBBLIN-998) ExecutionStatus should be reset to PENDING before a job retries
Chen Guo created GOBBLIN-998:
--------------------------------
Summary: ExecutionStatus should be reset to PENDING before a job retries
Key: GOBBLIN-998
URL: https://issues.apache.org/jira/browse/GOBBLIN-998
Project: Apache Gobblin
Issue Type: Bug
Reporter: Chen Guo
In KafkaJobStatusMonitor.modifyStateIfRetryRequired, when the state is FAILED and currentAttempts < maxAttempts, the ExecutionStatus is set to RUNNING.
However, due to the check-in from GOBBLIN-974 ([https://github.com/apache/incubator-gobblin/blob/9f50a2563cc257039da44018663b6b9e119fb499/gobblin-service/src/main/java/org/apache/gobblin/service/monitoring/KafkaJobStatusMonitor.java#L159]), the currentAttempts update from a lower-order event (such as ORCHESTRATED) cannot be consumed to update the jobState file. As a result, DagManagerThread retries failed jobs indefinitely when it calls pollAndAdvanceDag.
The solution is to set ExecutionStatus to PENDING instead of RUNNING.
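The interaction described above can be illustrated with a minimal, self-contained sketch. It is not the Gobblin source: the class RetryStateSketch, the simplified ExecutionStatus enum and its ordering, the map keys, and the shouldAccept helper are all assumptions made for illustration. It only models the idea that an event of lower order than the stored status is dropped, so resetting to PENDING (rather than RUNNING) lets a later ORCHESTRATED event through.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the retry logic; not the real KafkaJobStatusMonitor.
class RetryStateSketch {
    // Assumed ordering: earlier constants are "lower-order" statuses.
    enum ExecutionStatus { PENDING, ORCHESTRATED, RUNNING, COMPLETE, FAILED }

    // Hypothetical property keys for the stored job state.
    static final String STATUS = "status";
    static final String CURRENT_ATTEMPTS = "currentAttempts";
    static final String MAX_ATTEMPTS = "maxAttempts";

    // Proposed behavior: a FAILED job with attempts remaining is reset to PENDING.
    static void modifyStateIfRetryRequired(Map<String, String> state) {
        ExecutionStatus status = ExecutionStatus.valueOf(state.get(STATUS));
        int current = Integer.parseInt(state.getOrDefault(CURRENT_ATTEMPTS, "1"));
        int max = Integer.parseInt(state.getOrDefault(MAX_ATTEMPTS, "1"));
        if (status == ExecutionStatus.FAILED && current < max) {
            state.put(STATUS, ExecutionStatus.PENDING.name());
        }
    }

    // Assumed stand-in for the ordering check: an incoming event is dropped
    // when its status is lower-order than the one already stored.
    static boolean shouldAccept(ExecutionStatus stored, ExecutionStatus incoming) {
        return incoming.ordinal() >= stored.ordinal();
    }

    // Helper that builds a stored state for the examples.
    static Map<String, String> newState(ExecutionStatus status, int current, int max) {
        Map<String, String> state = new HashMap<>();
        state.put(STATUS, status.name());
        state.put(CURRENT_ATTEMPTS, Integer.toString(current));
        state.put(MAX_ATTEMPTS, Integer.toString(max));
        return state;
    }
}
```

Under these assumptions, a stored status of RUNNING rejects a subsequent ORCHESTRATED event, so currentAttempts is never advanced and the job keeps looking retryable; a stored status of PENDING accepts it.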
--
This message was sent by Atlassian Jira
(v8.3.4#803005)