Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/08 21:15:48 UTC

[GitHub] [airflow] MatrixManAtYrService commented on issue #25210: Many tasks updating dataset at once causes some of them to fail

MatrixManAtYrService commented on issue #25210:
URL: https://github.com/apache/airflow/issues/25210#issuecomment-1208618232

   Ran across this again (at least I think it's the same thing) with this dag:
   
   ```python3
   from airflow import Dataset, DAG
   from airflow.operators.python import PythonOperator
   from datetime import datetime
   
   
   fan_out = Dataset("fan_out")
   fan_in = Dataset("fan_in")
   
   # the leader: its single task updates fan_out, which every duckling is scheduled on
   with DAG(
       dag_id="momma_duck", start_date=datetime(1970, 1, 1), schedule_interval=None
   ) as leader:
   
       PythonOperator(
           task_id="has_outlet", python_callable=lambda: None, outlets=[fan_out]
       )
   
   # the many: 999 DAGs, each scheduled on fan_out and each updating fan_in
   for i in range(1, 1000):
       with DAG(
           dag_id=f"duckling_{i}", start_date=datetime(1970, 1, 1), schedule_on=[fan_out]
       ) as duck:
   
           PythonOperator(
               task_id="has_outlet", python_callable=lambda: None, outlets=[fan_in]
           )
       globals()[f"duck_{i}"] = duck
   
   
   # the straggler: scheduled on fan_in, so it runs after the ducklings update it
   with DAG(
       dag_id="straggler_duck", start_date=datetime(1970, 1, 1), schedule_on=[fan_in]
   ) as straggler:
   
       PythonOperator(task_id="has_outlet", python_callable=lambda: None)
   
   ```
   
   This time I noticed that, with the LocalExecutor at least, there are messages regarding zombies in the scheduler log:
   
   ```
   {dag.py:3178} INFO - Setting next_dagrun for duckling_897 to None, run_after=None
   {dagrun.py:567} INFO - Marking run <DagRun duckling_855 @ 2022-08-08 21:01:33.571168+00:00: dataset_triggered__2022-08-08T21:01:33.568548+00:00, state:running, queued_at: 2022-08-08 21:01:33.569170+00:00. externally triggered: False> successful
   {dagrun.py:612} INFO - DagRun Finished: dag_id=duckling_855, execution_date=2022-08-08 21:01:33.571168+00:00, run_id=dataset_triggered__2022-08-08T21:01:33.568548+00:00, run_start_date=2022-08-08 21:03:40.332173+00:00, run_end_date=2022-08-08 21:07:27.579980+00:00, run_duration=227.247807, state=success, external_trigger=False, run_type=dataset_triggered, data_interval_start=None, data_interval_end=None, dag_hash=ff772192a0a9c22e28dbebf2b14f1ebc
   {dag.py:3178} INFO - Setting next_dagrun for duckling_855 to None, run_after=None
   {scheduler_job.py:1364} WARNING - Failing (58) jobs without heartbeat after 2022-08-08 21:02:27.737896+00:00
   {scheduler_job.py:1372} ERROR - Detected zombie job: {'full_filepath': '/home/matt/2022/08/07/dags/many_to_one.py', 'msg': 'Detected <TaskInstance: duckling_289.has_outlet dataset_triggered__2022-08-08T21:01:31.846596+00:00 [running]> as zombie', 'simple_task_instance': <airflow.models.taskinstance.SimpleTaskInstance object at 0x7f98ebffc850>, 'is_failure_callback': True}
   {scheduler_job.py:1372} ERROR - Detected zombie job: {'full_filepath': '/home/matt/2022/08/07/dags/many_to_one.py', 'msg': 'Detected <TaskInstance: duckling_127.has_outlet dataset_triggered__2022-08-08T21:01:31.823203+00:00 [running]> as zombie', 'simple_task_instance': <airflow.models.taskinstance.SimpleTaskInstance object at 0x7f98ec93f7f0>, 'is_failure_callback': True}
   
   ```
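
In case it helps with triage, here is a minimal sketch for pulling the affected task instances out of a scheduler log like the one above. The regular expression is an assumption based only on the two "Detected zombie job" lines shown here, and `zombie_task_instances` is a hypothetical helper, not part of Airflow. It also assumes the dag_id contains no dots, which holds for the `duckling_*` DAGs.

```python
import re

# Assumed shape of the 'msg' field, inferred from the log lines quoted above:
#   Detected <TaskInstance: DAG_ID.TASK_ID RUN_ID [STATE]> as zombie
ZOMBIE_RE = re.compile(
    r"Detected <TaskInstance: (?P<dag_id>[^.\s]+)\.(?P<task_id>\S+) "
    r"(?P<run_id>\S+) \[(?P<state>\w+)\]> as zombie"
)


def zombie_task_instances(log_lines):
    """Return a (dag_id, task_id, run_id) tuple for each zombie message found."""
    found = []
    for line in log_lines:
        match = ZOMBIE_RE.search(line)
        if match:
            found.append((match["dag_id"], match["task_id"], match["run_id"]))
    return found


# Lines taken from the log above, trimmed to the 'msg' field for brevity.
SAMPLE_LINES = [
    "{scheduler_job.py:1364} WARNING - Failing (58) jobs without heartbeat "
    "after 2022-08-08 21:02:27.737896+00:00",
    "{scheduler_job.py:1372} ERROR - Detected zombie job: {'msg': 'Detected "
    "<TaskInstance: duckling_289.has_outlet "
    "dataset_triggered__2022-08-08T21:01:31.846596+00:00 [running]> as zombie'}",
    "{scheduler_job.py:1372} ERROR - Detected zombie job: {'msg': 'Detected "
    "<TaskInstance: duckling_127.has_outlet "
    "dataset_triggered__2022-08-08T21:01:31.823203+00:00 [running]> as zombie'}",
]

for dag_id, task_id, run_id in zombie_task_instances(SAMPLE_LINES):
    print(dag_id, task_id, run_id)
```

Passing an open scheduler log file instead of `SAMPLE_LINES` should work the same way, since the function only iterates over lines.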

