Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/09/21 00:03:00 UTC

[GitHub] [airflow] nsepetys opened a new issue #18398: When clearing a successful Subdag, child tasks are not ran

nsepetys opened a new issue #18398:
URL: https://github.com/apache/airflow/issues/18398


   ### Apache Airflow version
   
   2.1.4 (latest released)
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==2.2.0
   apache-airflow-providers-celery==2.0.0
   apache-airflow-providers-cncf-kubernetes==2.0.2
   apache-airflow-providers-docker==2.1.1
   apache-airflow-providers-elasticsearch==2.0.3
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-google==5.1.0
   apache-airflow-providers-grpc==2.0.1
   apache-airflow-providers-hashicorp==2.1.0
   apache-airflow-providers-http==2.0.1
   apache-airflow-providers-imap==2.0.1
   apache-airflow-providers-microsoft-azure==3.1.1
   apache-airflow-providers-mysql==2.1.1
   apache-airflow-providers-postgres==2.2.0
   apache-airflow-providers-redis==2.0.1
   apache-airflow-providers-sendgrid==2.0.1
   apache-airflow-providers-sftp==2.1.1
   apache-airflow-providers-slack==4.0.1
   apache-airflow-providers-sqlite==2.0.1
   apache-airflow-providers-ssh==2.1.1
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   dockerfile from apache/airflow:2.1.4-python3.7
   `EXECUTOR=LocalExecutor`
   
   ### What happened
   
   After a SubDAG runs successfully, clearing it (with Downstream and Recursive selected) does not trigger the inner tasks. Instead, the SubDAG is marked successful while its inner tasks all stay cleared and are not re-run.
   
   ### What you expected to happen
   
   Clearing with Downstream + Recursive should re-run all SubDAG tasks.
   
   ### How to reproduce
   
   1. Using a slightly modified version of https://airflow.apache.org/docs/apache-airflow/stable/concepts.html#subdags:
   ```python
   from airflow import DAG
   from airflow.operators.dummy import DummyOperator
   from airflow.operators.subdag import SubDagOperator
   from airflow.utils.dates import days_ago
   
   # Local sub-DAG factory; the example_dags `subdag` import was dropped because this definition shadowed it.
   def subdag(parent_dag_name, child_dag_name, args):
       dag_subdag = DAG(
           dag_id=f'{parent_dag_name}.{child_dag_name}',
           default_args=args,
           start_date=days_ago(2),
           schedule_interval=None,
       )
   
       for i in range(5):
           DummyOperator(
               task_id='{}-task-{}'.format(child_dag_name, i + 1),
               default_args=args,
               dag=dag_subdag,
           )
   
       return dag_subdag
   
   DAG_NAME = 'example_subdag_operator'
   
   args = {
       'owner': 'airflow',
   }
   
   dag = DAG(
       dag_id=DAG_NAME, default_args=args, start_date=days_ago(2), schedule_interval=None, tags=['example']
   )
   
   start = DummyOperator(
       task_id='start',
       dag=dag,
   )
   
   section_1 = SubDagOperator(
       task_id='section-1',
       subdag=subdag(DAG_NAME, 'section-1', args),
       dag=dag,
   )
   
   some_other_task = DummyOperator(
       task_id='some-other-task',
       dag=dag,
   )
   
   section_2 = SubDagOperator(
       task_id='section-2',
       subdag=subdag(DAG_NAME, 'section-2', args),
       dag=dag,
   )
   
   end = DummyOperator(
       task_id='end',
       dag=dag,
   )
   
   start >> section_1 >> some_other_task >> section_2 >> end
   ```
       
   
   2. Run the SubDAG to completion.
   3. Clear (with Recursive/Downstream) any of the SubDAGs.
   4. The SubDAG is marked successful, but if you zoom into it, you'll see that none of the child tasks were run.
   
   
   ### Anything else
   
   This appears to be the same issue as the "resolved" bug https://github.com/apache/airflow/issues/13295. Airflow 2.0.2 does not appear to fix it, and neither do 2.1.0 or 2.1.4. I am uncertain how the fix (https://github.com/apache/airflow/pull/14776) was determined to resolve this.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nsepetys commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
nsepetys commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-925311762


   It appears the SubDAG tasks are being hit in Airflow 2.0.2. However, judging from how much more quickly the scheduler runs through them than in 1.10.14, I suspect some form of caching lets the scheduler skip checking the DB for the state of the SubDAG's tasks. I mention this because if you change the IDs of the SubDAG's tasks _after_ a run (this may sound like odd and potentially unsupported Airflow behavior, but it is my use case), the SubDAG's tasks will not be hit. Using the example from my original post, I changed the task IDs of the SubDAG's tasks, and this is where things appear not to work:
   
   Outside SubDAG after "successful" run:
   ![image](https://user-images.githubusercontent.com/4511738/134417359-fa52ee17-3cbe-4feb-a7bf-770a1978c687.png)
   
   Inside SubDAG after "successful" run:
   ![image](https://user-images.githubusercontent.com/4511738/134417505-1c48a105-1042-4503-9d41-daf82fc47db8.png)
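
The renamed-task scenario described above can be made concrete by comparing the task IDs the DAG file currently defines with the task IDs recorded for the earlier run. The following is only a minimal sketch in plain Python: the helper `diff_task_ids` and the renamed IDs are hypothetical, it does not call Airflow, and it makes no claim about how the scheduler itself handles the mismatch.

```python
def diff_task_ids(defined_ids, recorded_ids):
    """Compare task IDs in the DAG file with task IDs stored for a run."""
    defined, recorded = set(defined_ids), set(recorded_ids)
    return {
        # Defined in the DAG file but with no stored row.
        "missing_rows": sorted(defined - recorded),
        # Stored rows whose task no longer exists in the DAG file.
        "stale_rows": sorted(recorded - defined),
    }


# IDs recorded by the first run, following the '{child}-task-{n}' pattern
# from the reproduction DAG.
recorded = [f"section-1-task-{i + 1}" for i in range(5)]
# Hypothetical renamed IDs after editing the DAG file (an assumption).
defined = [f"section-1-renamed-{i + 1}" for i in range(5)]

result = diff_task_ids(defined, recorded)
print(result["missing_rows"])
print(result["stale_rows"])
```

If both lists are non-empty after a rename, the stored rows and the DAG definition no longer describe the same tasks, which may help narrow down where to look.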
   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-923447119


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] nsepetys commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
nsepetys commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-925141408


   Testing on 2.0.2 and comparing against the behavior in 1.10.14, I am noticing a difference. For starters, the SubDAG's task statuses are not being cleared from the `task_instance` table in 2.0.2, but they are in 1.10.14. See below:
   **1.10.14**
   ![image](https://user-images.githubusercontent.com/4511738/134385788-269a29db-852c-47c7-8f12-60c050006930.png)
   
   **2.0.2**
   ![image](https://user-images.githubusercontent.com/4511738/134385849-ebbf18ac-e4cd-41a2-88de-dc53fef0a402.png)
   
   Any suggestions on how I can debug this further?
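
The `task_instance` comparison above can also be expressed as a query instead of screenshots. The sketch below is an assumption-laden stand-in rather than a recipe for a real deployment: it uses an in-memory SQLite table holding only three of the `task_instance` columns (`task_id`, `dag_id`, `state`), the rows are invented, and the helper `subdag_task_states` is hypothetical. It relies on the sub-DAG `dag_id` being `<parent>.<child>`, as in the reproduction DAG, and on cleared task instances being expected to have no state.

```python
import sqlite3


def subdag_task_states(conn, parent_dag_id, subdag_task_id):
    """Return {task_id: state} for the rows belonging to one sub-DAG."""
    # The reproduction DAG names each sub-DAG '<parent_dag_id>.<subdag_task_id>'.
    subdag_id = f"{parent_dag_id}.{subdag_task_id}"
    rows = conn.execute(
        "SELECT task_id, state FROM task_instance "
        "WHERE dag_id = ? ORDER BY task_id",
        (subdag_id,),
    ).fetchall()
    return dict(rows)


# Stand-in for the metadata DB: only the columns this check needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT, dag_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?)",
    [
        # Invented rows for illustration only.
        ("section-1", "example_subdag_operator", "success"),
        ("section-1-task-1", "example_subdag_operator.section-1", "success"),
        ("section-1-task-2", "example_subdag_operator.section-1", None),
    ],
)

states = subdag_task_states(conn, "example_subdag_operator", "section-1")
# Rows with a NULL state are the ones that look cleared.
cleared = [task_id for task_id, state in states.items() if state is None]
print(states)
print(cleared)
```

Running the same query before and after pressing Clear on each version would show whether the inner rows actually change state.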





[GitHub] [airflow] nsepetys commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
nsepetys commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-926715453


   @ephraimbuddy - I am uncertain how to continue debugging this. Running the DAG with `DebugExecutor` is not telling me much about why these tasks are being passed over.





[GitHub] [airflow] nsepetys commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
nsepetys commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-923993752


   Could I get another pair of eyes on this to confirm whether it is indeed an issue? As I said in my post above, it seems odd that https://github.com/apache/airflow/pull/14776 made it into master without being confirmed as addressing the issue.





[GitHub] [airflow] uranusjr commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-924545967


   cc @ephraimbuddy 





[GitHub] [airflow] ephraimbuddy commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-925320498


   Can you check the tree view? That was what I checked. Let’s see if there’s a difference between it and the graph view





[GitHub] [airflow] ephraimbuddy commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-924726023


   @nsepetys, I have tested this on main and also on 2.0.2, and it works. I wonder why you're seeing a different result.





[GitHub] [airflow] ephraimbuddy commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-926820607


   This is on my list and I will look into it. However, I would advise you to move to TaskGroup. SubDAG is deprecated and we don't pay much attention to it anymore.





[GitHub] [airflow] nsepetys edited a comment on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
nsepetys edited a comment on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-925141408


   Testing on 2.0.2 and comparing against the behavior in 1.10.14, I am noticing a difference. For starters, the SubDAGs' task statuses are not being cleared from the `task_instance` table in 2.0.2, but they are in 1.10.14. See below:
   **1.10.14**
   ![image](https://user-images.githubusercontent.com/4511738/134385788-269a29db-852c-47c7-8f12-60c050006930.png)
   
   **2.0.2**
   ![image](https://user-images.githubusercontent.com/4511738/134385849-ebbf18ac-e4cd-41a2-88de-dc53fef0a402.png)
   
   I also tried manually deleting the SubDAGs' tasks from the `task_instance` table to see if they would get hit, but to no avail. The SubDAGs' tasks were not even inserted back into the `task_instance` table. Any suggestions on how I can debug this further?





[GitHub] [airflow] nsepetys commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
nsepetys commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-925330228


   Appears to be the same in the tree view:
   <img src="https://user-images.githubusercontent.com/4511738/134421859-6569e350-5304-4017-aa0e-18d4e2a5c494.png" width="1000" height="200">
   





[GitHub] [airflow] eladkal commented on issue #18398: When clearing a successful Subdag, child tasks are not ran

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #18398:
URL: https://github.com/apache/airflow/issues/18398#issuecomment-923982647


   @nsepetys assigned the issue to you




