Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/08 09:24:02 UTC

[GitHub] [airflow] toddy86 opened a new issue #19461: Missing DagRuns when catchup=True

toddy86 opened a new issue #19461:
URL: https://github.com/apache/airflow/issues/19461


   ### Apache Airflow version
   
   2.2.1 (latest released)
   
   ### Operating System
   
   PRETTY_NAME="Debian GNU/Linux 10 (buster)"
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-ftp==2.0.1
   apache-airflow-providers-http==2.0.1
   apache-airflow-providers-imap==2.0.1
   apache-airflow-providers-sqlite==2.0.1
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   Backfilling via catchup=True leads to missing DagRuns. 
   
   See reproduction steps for full details
   
   ### What you expected to happen
   
   _No response_
   
   ### How to reproduce
   
   Note: we first hit this issue in our production environment, with a much more complicated DAG. The reproduction steps below use Breeze.
   
   1. Set up the `./breeze` environment with the config modifications below.
   2. Create a simple DAG with dummy tasks in it (see the example below).
   3. Set a `start_date` in the past.
   4. Set `catchup=True`.
   5. Unpause the DAG.
   6. Catchup starts, and the tree view gives the false impression that everything has caught up correctly.
   7. However, the calendar view shows the missing DagRuns.
   
   **Breeze Config**
   ```
   export DB_RESET="true"
   export START_AIRFLOW="true"
   export INSTALL_AIRFLOW_VERSION="2.2.1"
   export USE_AIRFLOW_VERSION="2.2.1"
   ```
   
   **Dummy DAG**
   ```
   from datetime import datetime
   from airflow import DAG
   from airflow.operators.dummy import DummyOperator
   
   dag = DAG(
       dag_id="temp_test",
       schedule_interval="@daily",
       catchup=True,
       start_date=datetime(2021, 8, 1),
       max_active_tasks=10,
       max_active_runs=5,
       is_paused_upon_creation=True,
   )
   
   with dag:
       task1 = DummyOperator(task_id="task1")
       task2 = DummyOperator(task_id="task2")
       task3 = DummyOperator(task_id="task3")
       task4 = DummyOperator(task_id="task4")
       task5 = DummyOperator(task_id="task5")
   
       task1 >> task2 >> task3 >> task4 >> task5
   ```
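
For reference, here is a minimal sketch of which data intervals catchup would be expected to cover for the DAG above. It does not call Airflow at all. It assumes a plain `@daily` schedule where each run covers one full day and is only created once that day has ended. The helper name `expected_daily_intervals` and the chosen "now" (the time this issue was posted) are illustrative assumptions, not anything Airflow provides.

```python
from datetime import datetime, timedelta


def expected_daily_intervals(start_date, now):
    """Return (interval_start, interval_end) pairs for every complete daily interval.

    Hypothetical helper: it mirrors the assumption that a @daily run for a given
    day only exists once that day's interval has fully elapsed.
    """
    one_day = timedelta(days=1)
    intervals = []
    interval_start = start_date
    while interval_start + one_day <= now:
        intervals.append((interval_start, interval_start + one_day))
        interval_start += one_day
    return intervals


# start_date comes from the example DAG; "now" is an assumed unpause time.
intervals = expected_daily_intervals(datetime(2021, 8, 1), datetime(2021, 11, 8, 9, 24))
print(f"{len(intervals)} expected runs, "
      f"first {intervals[0][0].date()}, last {intervals[-1][0].date()}")
```

Comparing this count against the number of DagRuns that actually exist is a quick way to tell whether any are missing.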
   
   **Results**
   <img width="1430" alt="tree_view" src="https://user-images.githubusercontent.com/10559757/140715465-6bc3831c-d71c-4025-bcde-985010ab31f8.png">
   
   <img width="1435" alt="calendar_view" src="https://user-images.githubusercontent.com/10559757/140715467-1a1a5c9a-3eb6-40ff-8720-ebe6db999028.png">
   
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bbovenzi edited a comment on issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
bbovenzi edited a comment on issue #19461:
URL: https://github.com/apache/airflow/issues/19461#issuecomment-963287615


   ~~Let me get this right:
   In tree view all of the runs are there, but in calendar view they are not? Are you able to see those individual runs in the graph view?~~
   
   Edit: Never mind, I tested it locally and yes, there are some gaps in the `dataInterval` on both views.





[GitHub] [airflow] toddy86 commented on issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
toddy86 commented on issue #19461:
URL: https://github.com/apache/airflow/issues/19461#issuecomment-963389506


   > ~Let me get this right: In tree view all of the runs are there, but in calendar view they are not? Are you able to see those individual runs in the graph view?~
   > 
   > Edit: Never mind, I tested it locally and yes, there are some gaps in the `dataInterval` on both views.
   
   Yeah, the gaps are there in the tree view too. Because the tree view doesn't show empty runs, it gives you the false impression that all of the DagRuns have actually run. It's not until you look at the calendar view that you actually see the gaps.
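
Since the gaps are easy to miss by eye, a small sketch like the one below can flag them from a list of run dates. It is plain Python and makes no Airflow calls. The function name `find_gaps` and the sample dates are hypothetical (they are not read from the screenshots above), and it assumes one run per calendar day, as with the `@daily` example DAG.

```python
from datetime import date, timedelta


def find_gaps(logical_dates):
    """Return (first_missing, last_missing) ranges between consecutive observed dates.

    Hypothetical helper: assumes one expected run per calendar day.
    """
    one_day = timedelta(days=1)
    ordered = sorted(set(logical_dates))
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        # More than one day between neighbours means at least one run is missing.
        if current - previous > one_day:
            gaps.append((previous + one_day, current - one_day))
    return gaps


# Made-up observed run dates, for illustration only.
observed = [
    date(2021, 8, 1),
    date(2021, 8, 2),
    date(2021, 8, 5),
    date(2021, 8, 6),
    date(2021, 8, 9),
]
gaps = find_gaps(observed)
for first_missing, last_missing in gaps:
    print(f"missing runs from {first_missing} to {last_missing}")
```

Note that this only finds holes between runs that exist. Runs missing before the first or after the last observed date would need a comparison against the expected schedule.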





[GitHub] [airflow] boring-cyborg[bot] commented on issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #19461:
URL: https://github.com/apache/airflow/issues/19461#issuecomment-962961242


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] kaxil closed issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #19461:
URL: https://github.com/apache/airflow/issues/19461


   





[GitHub] [airflow] bbovenzi commented on issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
bbovenzi commented on issue #19461:
URL: https://github.com/apache/airflow/issues/19461#issuecomment-963287615


   Let me get this right:
   In tree view all of the runs are there, but in calendar view they are not? Are you able to see those individual runs in the graph view?








[GitHub] [airflow] toddy86 commented on issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
toddy86 commented on issue #19461:
URL: https://github.com/apache/airflow/issues/19461#issuecomment-962972588


   Also note that in the example DAG above we have overridden the max active runs and max active tasks values on the DAG. However, we get the same result if these overrides are removed.





[GitHub] [airflow] enguerranchevalier commented on issue #19461: Missing DagRuns when catchup=True

Posted by GitBox <gi...@apache.org>.
enguerranchevalier commented on issue #19461:
URL: https://github.com/apache/airflow/issues/19461#issuecomment-963287051


   I've got exactly the same behavior on 2.2.1 too.

