Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/03 18:34:13 UTC

[GitHub] [airflow] rustikk opened a new issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

rustikk opened a new issue #21972:
URL: https://github.com/apache/airflow/issues/21972


   ### Apache Airflow version
   
   main (development)
   
   ### What happened
   
   I'm getting an import error of ` Cycle detected in DAG. Faulty task: group1.dummy4`
   when I load this dag into airflow
   
   ### What you expected to happen
   
   I expected the dag to have no import errors. It imports fine in version 2.2.4-2 but when switching to version 2.3.0.dev20220302 it gives the import error described above.
   
   ### How to reproduce
   
   ```
   from airflow.utils.edgemodifier import Label
   from airflow.models.baseoperator import chain
   from airflow.decorators import dag, task, task_group
   from airflow.utils.dot_renderer import render_dag
   from textwrap import indent
   
   from datetime import datetime, timedelta
   
   two_days = datetime.now() - timedelta(days=2)
   
   @dag(dag_id="taskflow_compare2",
       schedule_interval=None,
       start_date=two_days,
       tags=['core'])
   def task_grouper():
   
       @task
       def assert_homomorphic(task_group_names, **context):
           """
           The structure of all of the task groups above should be the same
           """
           # get the dag in dot notation, focus only on its edges
           dag = context["dag"]
           # gives a string that represents the whole DAG structure
           graph = render_dag(dag)
           print("Whole DAG:")
           print(indent(str(graph), "    "))
           lines = list(filter(lambda x: "->" in x, str(graph).split("\n")))
   
           # bin them by task group, then remove the group names
           group_strings = []
           # removes everything that's not a task name
           for name in task_group_names:
               print(name)
               relevant_lines = filter(lambda x: name in x, lines)
               normalized_lines = map(
                   lambda x: x.strip().replace(name, ""), sorted(relevant_lines)
               )
               edges_str = "\n".join(normalized_lines)
               group_strings.append(edges_str)
               print(indent(edges_str, "    "))
   
           # these should be identical
           for xgroup, ygroup in zip(group_strings, group_strings[1:]):
               assert xgroup == ygroup
   
       @task_group(group_id="group1")
       def grouper1():
   
           @task
           def dummy00():
               return 0
       
           @task
           def dummy0():
               return 0
       
           @task
           def dummy1(val):
               return val + 1
   
           @task
           def dummy2(val):
               return val + 2
       
           @task
           def dummy3(val):
               return val + 3
       
           @task
           def dummy4(val):
               return val + 4
           
           @task
           def dummy5(val):
               return val + 5
           
           @task
           def dummy6(val):
               return val + 6
           
           @task
           def dummy7(val):
               return val + 7
           
           @task
           def dummy8(val):
               return val + 8
           
           @task
           def dummy9(val):
               return val + 9
           
           @task
           def dummy10(val):
               return val + 10
   
           d00 = dummy00()
           d0 = dummy0()
           d1 = dummy1(0)
           d2 = dummy2(0)
           d3 = dummy3(0)
           d4 = dummy4(0)
           d5 = dummy5(0)
           d6 = dummy6(0)
           d7 = dummy7(0)
           d8 = dummy8(0)
           d9 = dummy9(0)
           d10 = dummy10(0)
   
           chain(d00, [d0, d1, d2], d3, [Label("branch one"), Label("branch two"), Label("branch three"), Label("branch four"), Label("branch five")], [d4, d5, d6, d7, d8], d9, d10)
           return d10
       
       @task_group(group_id="group2")
       def grouper2():
           
           @task
           def dummy00():
               return 0
   
           @task
           def dummy0():
               return 0
       
           @task
           def dummy1(val):
               return val + 1
       
           @task
           def dummy2(val):
               return val + 2
       
           @task
           def dummy3(val):
               return val + 3
       
           @task
           def dummy4(val):
               return val + 4
           
           @task
           def dummy5(val):
               return val + 5
           
           @task
           def dummy6(val):
               return val + 6
   
           @task
           def dummy7(val):
               return val + 7
           
           @task
           def dummy8(val):
               return val + 8
           
           @task
           def dummy9(val):
               return val + 9
           
           @task
           def dummy10(val):
               return val + 10
   
           d00 = dummy00()
           d0 = dummy0()
           d1 = dummy1(0)
           d2 = dummy2(0)
           d3 = dummy3(0)
           d4 = dummy4(0)
           d5 = dummy5(0)
           d6 = dummy6(0)
           d7 = dummy7(0)
           d8 = dummy8(0)
           d9 = dummy9(0)
           d10 = dummy10(0)
   
           d00.set_downstream(d0)
           d00.set_downstream(d1)
           d00.set_downstream(d2)
           d0.set_downstream(d3)
           d1.set_downstream(d3)
           d2.set_downstream(d3)
           d3.set_downstream(d4, edge_modifier=Label("branch one"))
           d3.set_downstream(d5, edge_modifier=Label("branch two"))
           d3.set_downstream(d6, edge_modifier=Label("branch three"))
           d3.set_downstream(d7, edge_modifier=Label("branch four"))
           d3.set_downstream(d8, edge_modifier=Label("branch five"))
           d4.set_downstream(d9)
           d5.set_downstream(d9)
           d6.set_downstream(d9)
           d7.set_downstream(d9)
           d8.set_downstream(d9)
           d9.set_downstream(d10)
   
           #return dummy4(dummy3(dummy2(dummy1(dummy0()))))
           return d10
   
       tg1 = grouper1()
       tg2 = grouper2()
   
       [tg1, tg2] >> assert_homomorphic(["group1", "group2"])
   dag = task_grouper()
   
   
   
   ```
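
As a sanity check independent of Airflow, the edges that `grouper2` declares through `set_downstream` can be fed to Python's standard-library `graphlib.TopologicalSorter` (Python 3.9+), which raises `CycleError` for a cyclic graph. This is only a sketch: the short names `d00` through `d10` stand in for the task ids, and the edge list is transcribed by hand from the reproducer above rather than read from a real DAG object.

```python
from graphlib import CycleError, TopologicalSorter

# Edges transcribed by hand from the set_downstream calls in grouper2
# (assumption: short names stand in for the real task ids).
edges = (
    [("d00", d) for d in ("d0", "d1", "d2")]
    + [(u, "d3") for u in ("d0", "d1", "d2")]
    + [("d3", d) for d in ("d4", "d5", "d6", "d7", "d8")]
    + [(u, "d9") for u in ("d4", "d5", "d6", "d7", "d8")]
    + [("d9", "d10")]
)

# TopologicalSorter expects a mapping of node -> set of predecessors.
predecessors = {}
for upstream, downstream in edges:
    predecessors.setdefault(downstream, set()).add(upstream)

try:
    order = list(TopologicalSorter(predecessors).static_order())
    has_cycle = False
except CycleError:
    order = []
    has_cycle = True

print("cycle found" if has_cycle else "no cycle; one valid order: " + ", ".join(order))
```

If the intended graph sorts cleanly like this, the cycle reported at import time would have to come from how the edges are built, not from the edges as written.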
   
   ### Operating System
   
   Docker (debian:buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   Using the Astro CLI with this image:
   quay.io/astronomer/ap-airflow-dev:main
   
   ### Anything else
   
   Happens every time.
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] RNHTTR edited a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR edited a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059846990


   Interestingly, this also happens for me using the image `quay.io/astronomer/astro-runtime:4.1.0` (Airflow version 2.2.4) with the DAG included in `astrocloud dev init`, `example-dag-advanced.py`, which I believe has the following graph:
   
   ```mermaid
   graph LR
       %% A[begin] -->|Get money| B(Go shopping)
       begin --> check_day_of_week
       check_day_of_week --> weekday --> weekday_activities
       check_day_of_week --> weekend --> weekend_activities
       weekday_activities --> Z[end]
       weekend_activities --> Z[end]
       
       X{weekday_activities} --> which_weekday_activity_day
       which_weekday_activity_day --> |monday| guitar_lessons
       which_weekday_activity_day --> |tuesday| studying
       which_weekday_activity_day --> |wednesday| soccer_practice
       which_weekday_activity_day --> |thursday| contributing_to_Airflow
       which_weekday_activity_day --> |friday| family_dinner
   
       Y{weekend_activities} --> which_weekend_activity_day
       which_weekend_activity_day --> |saturday| going_to_the_beach
       which_weekend_activity_day --> |sunday| sleeping_in
   ```
   
   I'm not sure why this would result in a cycle being detected.





[GitHub] [airflow] RNHTTR commented on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR commented on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059846990





[GitHub] [airflow] josh-fell commented on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
josh-fell commented on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1063607864









[GitHub] [airflow] RNHTTR removed a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR removed a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059417995


   ~I don't believe this is an import error.~ The offending error message can be found in [utils/dag_cycle_tester.py](https://github.com/apache/airflow/blob/efd365274a548e7dd859ca1823da6a7c417a34f1/airflow/utils/dag_cycle_tester.py#L61)
   
   Edit: I misunderstood "import error"; I guess in this context it means an error raised while Airflow imports the DAG.





[GitHub] [airflow] RNHTTR edited a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR edited a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1061376695


   This issue can be traced back to #21404. With the latest code on the main branch, rolling back that PR's changes to three files ([airflow/models/taskmixin.py](https://github.com/apache/airflow/blob/main/airflow/models/taskmixin.py), [airflow/utils/edgemodifier.py](https://github.com/apache/airflow/blob/main/airflow/utils/edgemodifier.py), and [airflow/utils/task_group.py](https://github.com/apache/airflow/blob/main/airflow/utils/task_group.py)) lets your DAG import with no issues. I'll try to take a deeper look tomorrow.
   
   cc: @avkirilishin 





[GitHub] [airflow] RNHTTR edited a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR edited a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059417995


   I don't believe this is an import error. The offending error message can be found in [utils/dag_cycle_tester.py](https://github.com/apache/airflow/blob/efd365274a548e7dd859ca1823da6a7c417a34f1/airflow/utils/dag_cycle_tester.py#L61)
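
For readers unfamiliar with how such a check can name a "faulty task", here is a generic depth-first sketch. It is not Airflow's implementation: the function name `find_cycle_node` and the edge-list input are hypothetical, chosen only for illustration. A node is reported when the search reaches it again while it is still on the current path.

```python
from collections import defaultdict


def find_cycle_node(edges):
    """Return a node that closes a cycle, or None if the graph is acyclic.

    Hypothetical helper for illustration; `edges` is a list of
    (upstream, downstream) pairs.
    """
    downstream = defaultdict(list)
    nodes = set()
    for up, down in edges:
        downstream[up].append(down)
        nodes.update((up, down))

    # WHITE: unvisited, GREY: on the current DFS path, BLACK: fully explored.
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def visit(node):
        color[node] = GREY
        for nxt in downstream[node]:
            if color[nxt] == GREY:
                # Back edge: nxt is already on the current path.
                return nxt
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found is not None:
                    return found
        color[node] = BLACK
        return None

    for n in sorted(nodes):
        if color[n] == WHITE:
            found = visit(n)
            if found is not None:
                return found
    return None


acyclic = [("a", "b"), ("b", "c"), ("a", "c")]
cyclic = acyclic + [("c", "a")]
print(find_cycle_node(acyclic))
print(find_cycle_node(cyclic))
```

A detector of this shape only sees the edges it is given, so a spurious extra edge added while the DAG is being wired up would be enough to trigger the error on a graph that looks acyclic in the source.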





[GitHub] [airflow] RNHTTR commented on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR commented on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1064363534


   Sure thing. I don't know the fix yet, but I've got a good idea of where to look.





[GitHub] [airflow] RNHTTR commented on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR commented on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1061376695


   This issue can be traced back to [this merged PR](https://github.com/apache/airflow/pull/21404/files). With the latest code on the main branch, rolling back that PR's changes to three files ([airflow/models/taskmixin.py](https://github.com/apache/airflow/blob/main/airflow/models/taskmixin.py), [airflow/utils/edgemodifier.py](https://github.com/apache/airflow/blob/main/airflow/utils/edgemodifier.py), and [airflow/utils/task_group.py](https://github.com/apache/airflow/blob/main/airflow/utils/task_group.py)) lets your DAG import with no issues. I'll try to take a deeper look tomorrow.
   
   cc: @avkirilishin 





[GitHub] [airflow] RNHTTR commented on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR commented on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059417995


   I don't believe this is an import error. The offending error message can be found in [utils/dag_cycle_tester.py](https://github.com/apache/airflow/blob/efd365274a548e7dd859ca1823da6a7c417a34f1/airflow/utils/dag_cycle_tester.py)





[GitHub] [airflow] RNHTTR removed a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR removed a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059846990





[GitHub] [airflow] RNHTTR edited a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR edited a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059417995





[GitHub] [airflow] RNHTTR edited a comment on issue #21972: Cycle detected in dag when there is no cycle using taskgroups and the taskflow api

Posted by GitBox <gi...@apache.org>.
RNHTTR edited a comment on issue #21972:
URL: https://github.com/apache/airflow/issues/21972#issuecomment-1059846990


   Interestingly, this also happens for me using the image `quay.io/astronomer/astro-runtime:4.1.0` (Airflow version 2.2.4) with the DAG included in `astrocloud dev start`, `example-dag-advanced.py`, which I believe has the following graph:
   
   ```mermaid
   graph LR
       %% A[begin] -->|Get money| B(Go shopping)
       begin --> check_day_of_week
       check_day_of_week --> weekday --> weekday_activities
       check_day_of_week --> weekend --> weekend_activities
       weekday_activities --> Z[end]
       weekend_activities --> Z[end]
       
       X{weekday_activities} --> which_weekday_activity_day
       which_weekday_activity_day --> |monday| guitar_lessons
       which_weekday_activity_day --> |tuesday| studying
       which_weekday_activity_day --> |wednesday| soccer_practice
       which_weekday_activity_day --> |thursday| contributing_to_Airflow
       which_weekday_activity_day --> |friday| family_dinner
   
       Y{weekend_activities} --> which_weekend_activity_day
       which_weekend_activity_day --> |saturday| going_to_the_beach
       which_weekend_activity_day --> |sunday| sleeping_in
   ```
   
   I'm not sure why this would result in a cycle being detected.

