Posted to commits@airflow.apache.org by "champon1020 (via GitHub)" <gi...@apache.org> on 2023/10/04 18:09:27 UTC

[I] DataflowJob is failed when wait_until_finished=True although the state is JOB_STATE_DONE [airflow]

champon1020 opened a new issue, #34767:
URL: https://github.com/apache/airflow/issues/34767

   ### Apache Airflow version
   
   2.7.1
   
   ### What happened
   
   We currently use `DataflowHook` in the tasks of an Airflow DAG. After upgrading apache-airflow-providers-google to 10.9.0, we get the following error even though the Dataflow job has completed.
   ```
   Traceback (most recent call last):
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1384, in _run_raw_task
       self._execute_task_with_callbacks(context, test_mode)
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1531, in _execute_task_with_callbacks
       result = self._execute_task(context, task_orig)
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1586, in _execute_task
       result = execute_callable(context=context)
     File "xxx", line 65, in execute
       hook.wait_for_done(
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 475, in inner_wrapper
       return func(self, *args, **kwargs)
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataflow.py", line 1203, in wait_for_done
       job_controller.wait_for_done()
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataflow.py", line 439, in wait_for_done
       while self._jobs and not all(self._check_dataflow_job_state(job) for job in self._jobs):
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataflow.py", line 439, in <genexpr>
       while self._jobs and not all(self._check_dataflow_job_state(job) for job in self._jobs):
     File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataflow.py", line 430, in _check_dataflow_job_state
       raise Exception(
   Exception: Google Cloud Dataflow job <xxx> is in an unexpected terminal state: JOB_STATE_DONE, expected terminal state: JOB_STATE_DONE
   ```
   
   ### What you think should happen instead
   
   The error message "an unexpected terminal state: JOB_STATE_DONE, expected terminal state: JOB_STATE_DONE" is contradictory. If the Dataflow job has completed, I think the task should not fail, even when `expected_terminal_state` is not passed to `DataflowHook`.
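
The expected behavior can be sketched as a small state check. This is only an illustration under stated assumptions, not the provider's actual code: `job_is_finished`, the state constants, and the default of treating `JOB_STATE_DONE` as the expected state when none is given are all hypothetical names and choices made for this example.

```python
from typing import Optional

# Hypothetical constants: they mirror the state strings seen in the
# traceback, not the provider's actual implementation.
JOB_STATE_DONE = "JOB_STATE_DONE"
JOB_STATE_FAILED = "JOB_STATE_FAILED"
JOB_STATE_CANCELLED = "JOB_STATE_CANCELLED"
JOB_STATE_RUNNING = "JOB_STATE_RUNNING"
TERMINAL_STATES = {JOB_STATE_DONE, JOB_STATE_FAILED, JOB_STATE_CANCELLED}


def job_is_finished(
    current_state: str, expected_terminal_state: Optional[str] = None
) -> bool:
    """Return True when polling can stop, False to keep waiting.

    Raises RuntimeError only when the job ended in a terminal state
    other than the expected one.
    """
    # Assumption: with no explicit expectation, success means DONE.
    expected = expected_terminal_state or JOB_STATE_DONE

    # Reaching the expected state is always success, so the message
    # "unexpected X, expected X" can never be produced.
    if current_state == expected:
        return True
    if current_state in TERMINAL_STATES:
        raise RuntimeError(
            f"job is in an unexpected terminal state: {current_state}, "
            f"expected terminal state: {expected}"
        )
    return False
```

With this ordering, a job reported as `JOB_STATE_DONE` stops the wait loop without an exception, while `JOB_STATE_FAILED` or `JOB_STATE_CANCELLED` still raises.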
   
   ### How to reproduce
   
   Install apache-airflow-providers-google 10.9.0.
   Pass `wait_until_finished=True` to `DataflowHook` and call `start_template_dataflow`.
   
   
   ### Operating System
   
   Ubuntu 20.04.6 LTS (Focal Fossa)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-google===10.9.0
   
   ### Deployment
   
   Google Cloud Composer
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] DataflowJob is failed when wait_until_finished=True although the state is JOB_STATE_DONE [airflow]

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34767:
URL: https://github.com/apache/airflow/issues/34767#issuecomment-1747400431

   Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
   




Re: [I] Dataflow job is failed when wait_until_finished=True although the state is JOB_STATE_DONE [airflow]

Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal closed issue #34767: Dataflow job is failed when wait_until_finished=True although the state is JOB_STATE_DONE
URL: https://github.com/apache/airflow/issues/34767

