Posted to dev@gobblin.apache.org by "William Lo (Jira)" <ji...@apache.org> on 2023/02/09 15:58:00 UTC

[jira] [Created] (GOBBLIN-1784) Race condition where on service restart DagManager will lose track of dags

William Lo created GOBBLIN-1784:
-----------------------------------

             Summary: Race condition where on service restart DagManager will lose track of dags
                 Key: GOBBLIN-1784
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1784
             Project: Apache Gobblin
          Issue Type: Bug
          Components: gobblin-service
            Reporter: William Lo
            Assignee: Abhishek Tiwari


Gobblin-as-a-Service has a race condition on restart: the DagManager can clean up a dag even though the corresponding flow event is never sent.
If the underlying notification system never sends that event, the dag has already been cleaned up by the DagManager, so the job status stays permanently stuck in a running state.

The DagManager should therefore clean up its own reference to a dag only after it reads that the job status monitor has properly saved the final flow status. If that status has not been received within some timeout (e.g. 5 minutes), the DagManager should re-emit the event in case it was lost.
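
The proposed behavior could be sketched roughly as follows. This is only an illustration under stated assumptions, not Gobblin's actual code: the class DagCleanupSketch, the interfaces FlowStatusStore and FlowEventEmitter, and all method names are hypothetical, and the real DagManager is structured differently.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Gobblin codebase. */
class DagCleanupSketch {
    /** Assumed read-only view of what the job status monitor has persisted. */
    interface FlowStatusStore {
        boolean hasFinalStatus(String dagId);
    }

    /** Assumed hook for emitting the final flow event for a dag. */
    interface FlowEventEmitter {
        void emitFinalFlowEvent(String dagId);
    }

    private final FlowStatusStore store;
    private final FlowEventEmitter emitter;
    private final Duration reemitAfter;
    // dagId -> time the final flow event was last emitted
    private final Map<String, Instant> pendingCleanup = new HashMap<>();

    DagCleanupSketch(FlowStatusStore store, FlowEventEmitter emitter, Duration reemitAfter) {
        this.store = store;
        this.emitter = emitter;
        this.reemitAfter = reemitAfter;
    }

    /** A dag finished: emit the event, but keep the dag reference for now. */
    void onDagFinished(String dagId, Instant now) {
        emitter.emitFinalFlowEvent(dagId);
        pendingCleanup.put(dagId, now);
    }

    /** Periodic pass: clean up confirmed dags, re-emit for overdue ones. */
    void poll(Instant now) {
        Iterator<Map.Entry<String, Instant>> it = pendingCleanup.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Instant> entry = it.next();
            if (store.hasFinalStatus(entry.getKey())) {
                // Final status is saved, so dropping the reference is safe.
                it.remove();
            } else if (Duration.between(entry.getValue(), now).compareTo(reemitAfter) >= 0) {
                // No confirmation within the timeout: assume the event was lost.
                emitter.emitFinalFlowEvent(entry.getKey());
                entry.setValue(now);
            }
        }
    }

    boolean isTracked(String dagId) {
        return pendingCleanup.containsKey(dagId);
    }
}
```

In this sketch, state recovered after a restart would go through the same poll path, so a dag whose final status was never saved stays tracked until the status is confirmed. Passing the current time in explicitly is a choice made here only to keep the timeout logic easy to test.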



--
This message was sent by Atlassian Jira
(v8.20.10#820010)