Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/22 22:24:02 UTC

[GitHub] [airflow] oz-r opened a new issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

oz-r opened a new issue #13853:
URL: https://github.com/apache/airflow/issues/13853


   **Apache Airflow version**: 2.0.0
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**:
   - **OS** (e.g. from /etc/os-release): Amazon Linux 2
   - **Kernel** (e.g. `uname -a`): Linux
   - **Install tools**:
   - **Others**:
   
   **What happened**:
   
   Clearing a DagRun whose `execution_date` lies in the past, where the time difference between now and that `execution_date` is greater than the DagRun timeout, causes the DagRun to fail as soon as it is cleared instead of running.
   
   **What you expected to happen**:
   The DagRun should enter the Running state
   
   I suspect this is a bug where the DAG duration is not reset on clear, so the new DagRun times out as soon as it starts.
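
The hypothesis above can be sketched in plain Python. This is a toy model, not Airflow's code: `RunModel`, `clear`, `reset_start`, and `timed_out` are hypothetical names invented for illustration, and the 2-minute timeout is an arbitrary value. It only shows why a cleared run that keeps its original start time would look timed out right away.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RunModel:
    """Hypothetical stand-in for a DagRun; not Airflow's actual class."""

    start: datetime
    state: str = "success"

    def clear(self, now: datetime, reset_start: bool) -> None:
        # Clearing puts the run back into "running". Whether the start
        # time is refreshed is the behaviour under suspicion.
        self.state = "running"
        if reset_start:
            self.start = now


def timed_out(run: RunModel, timeout: timedelta, now: datetime) -> bool:
    # A running run counts as timed out once it is older than the timeout.
    return run.state == "running" and now - run.start > timeout


timeout = timedelta(minutes=2)
first_start = datetime(2021, 1, 22, 12, 0)
# Clear the run "DagRun timeout + 1" minutes after it first started.
now = first_start + timeout + timedelta(minutes=1)

stale = RunModel(start=first_start)
stale.clear(now, reset_start=False)  # start time is kept

fresh = RunModel(start=first_start)
fresh.clear(now, reset_start=True)  # start time is refreshed

print(timed_out(stale, timeout, now))  # True
print(timed_out(fresh, timeout, now))  # False
```

If the real behaviour matches the `reset_start=False` branch, the symptom in this report follows directly: the run is already older than the timeout at the moment it is cleared.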
   
   **How to reproduce it**:
   
   Create a DAG with a DagRun timeout and trigger it. Wait for `DagRun timeout + 1` minutes, then clear that DagRun. The new DagRun will fail immediately.
   
   **Anything else we need to know**:
   This occurs regardless of whether the DAG/Task succeeded or failed.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] darthale commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
darthale commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-841046794


   This is still happening for me in airflow 2.0.2.





[GitHub] [airflow] ebdavison commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
ebdavison commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-856147738


   I am on airflow 2.1.0 and this is definitely an issue on that version.
   
   Can someone comment on the status of this?  I have had to update 400+ DAGs to remove the `dagrun_timeout` config item so that older DAG runs can be re-run.  We had an issue recently and I needed to re-run DAGs from the last 5 days, but with a `dagrun_timeout` of 120 seconds, every DAG run that got cleared failed.
   
   How am I supposed to limit the length of time a DAG can run _AND_ still be able to re-run older DAGs as needed in airflow 2.+?
   





[GitHub] [airflow] renanleme commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
renanleme commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-865961271


   We have this problem on our cluster. 
   I found a workaround to use until it is fixed in a newer version.
   The workaround is to clear the failed tasks from the homepage/DAG run list instead of from the DAG tree view itself.
   Example in pictures:
   Let's say that after clearing this execution it started failing. If you go to the homepage and search for the DAG, you will see that an execution failed:
   ![image](https://user-images.githubusercontent.com/129836/122928087-70a3bf00-d361-11eb-81e1-0de21cc36dd7.png)
   If you click on it (marked in yellow in the image), the DAG run list opens and shows the failed run:
   ![image](https://user-images.githubusercontent.com/129836/122928350-aea0e300-d361-11eb-809a-82442d10e9c4.png)
   
   Then you can select it and clear the state. This clears the execution, and it works as expected 🙏🏻 
   ![image](https://user-images.githubusercontent.com/129836/122928577-ead44380-d361-11eb-87ab-f887045f0a17.png)
   
   The workflow behind this action (clearing) is probably different from the one behind the tree view UI. I hope this helps you fix it :)





[GitHub] [airflow] yogyang commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
yogyang commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-783115905


   Hi @kaxil, could you help look into this? This bug is preventing me from upgrading Airflow to 2.0, as we need the ability to re-run previous DAG runs when the data turns out to be bad.
   
   I compared the code with 1.10.14 (which does not have this problem):
   
   In 2.0.0, the task instances are skipped when the DAG run is found to have timed out:
   https://github.com/apache/airflow/blob/c743b95a02ba1ec04013635a56ad042ce98823d2/airflow/jobs/scheduler_job.py#L1290-L1299
   
   In 1.10.14, the task instances still run when the DAG run is found to have timed out:
   https://github.com/apache/airflow/blob/c743b95a02ba1ec04013635a56ad042ce98823d2/airflow/jobs/scheduler_job.py#L1290-L1299
   
   It seems to have a different cause than https://github.com/apache/airflow/issues/13407





[GitHub] [airflow] oz-r edited a comment on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
oz-r edited a comment on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-765718399


   I'm not entirely sure #13407 is related, but it may be.





[GitHub] [airflow] github-actions[bot] commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-910933964


   This issue has been closed because it has not received response from the issue author.





[GitHub] [airflow] kaxil commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-857024420


   Looks like a bug, needs fixing - added to 2.1.1 milestone





[GitHub] [airflow] oz-r commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
oz-r commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-799870151


   > I think the old behavior makes more sense, as it:
   > 
   > 1. Enforces a maximum DAG duration (intended effect)
   > 2. While still allowing a DAG to be re-executed
   
   I agree with this (I was about to make the same point).





[GitHub] [airflow] JCoder01 commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
JCoder01 commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-874655698


   Yes, still seems to be an issue. 
   





[GitHub] [airflow] github-actions[bot] closed issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #13853:
URL: https://github.com/apache/airflow/issues/13853


   





[GitHub] [airflow] yogyang edited a comment on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
yogyang edited a comment on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-783115905


   Hi @kaxil, could you help look into this? This bug is preventing me from upgrading Airflow to 2.0, as we need the ability to re-run previous DAG runs when the data turns out to be bad.
   
   I compared the code with 1.10.14 (which does not have this problem):
   
   In 2.0.0, the task instances are skipped when the DAG run is found to have timed out:
   https://github.com/apache/airflow/blob/cc9827dceb14468f69e7a5281297af4221c96a8c/airflow/jobs/scheduler_job.py#L1699-L1747
   
   In 1.10.14, the task instances still run when the DAG run is found to have timed out:
   https://github.com/apache/airflow/blob/c743b95a02ba1ec04013635a56ad042ce98823d2/airflow/jobs/scheduler_job.py#L1290-L1299
   
   It seems to have a different cause than https://github.com/apache/airflow/issues/13407





[GitHub] [airflow] kaxil commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-784153400


   Yeah I think the dagrun_timeout was broken (or didn't work as intended) in 1.10.x 





[GitHub] [airflow] github-actions[bot] commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-905956858


   This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.





[GitHub] [airflow] lsowen commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
lsowen commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-799744386


   I think the issue is that in `1.10.x`, `dagrun_timeout` was essentially "how long has this DAG been executing".  In `2.0.x` it is now "how long has it been since the _scheduled_ start time".  
   
   I think the old behavior makes more sense, as it:
   1. Enforces a maximum DAG duration (intended effect)
   2. While still allowing a DAG to be re-executed
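
A minimal sketch of the two interpretations described above, written in plain Python rather than taken from Airflow's scheduler. The function names are hypothetical, and the 120-second timeout and 5-day gap are illustrative values borrowed from another report in this thread.

```python
from datetime import datetime, timedelta


def exceeded_since_execution_start(execution_start, now, timeout):
    # "How long has this DAG been executing": the clock starts when the
    # current attempt actually begins.
    return now - execution_start > timeout


def exceeded_since_scheduled_start(scheduled_start, now, timeout):
    # "How long has it been since the scheduled start time": the clock
    # starts at the original schedule, no matter when the run is retried.
    return now - scheduled_start > timeout


timeout = timedelta(seconds=120)
scheduled_start = datetime(2021, 3, 10, 0, 0)
rerun_start = scheduled_start + timedelta(days=5)  # cleared five days later
now = rerun_start + timedelta(seconds=30)  # 30 seconds into the re-run

print(exceeded_since_execution_start(rerun_start, now, timeout))  # False
print(exceeded_since_scheduled_start(scheduled_start, now, timeout))  # True
```

Under the first check a re-run gets the full timeout again. Under the second, any run cleared later than `scheduled_start + timeout` is over the limit before its first task starts.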








[GitHub] [airflow] renanleme edited a comment on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
renanleme edited a comment on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-865961271


   We have this problem on our cluster. 
   I found a workaround to use until it is fixed in a newer version.
   The workaround is to clear the failed tasks from the homepage/DAG run list instead of from the DAG tree view itself.
   Example in pictures:
   Let's say that after clearing this execution it started failing. If you go to the homepage and search for the DAG, you will see that an execution failed:
   ![image](https://user-images.githubusercontent.com/129836/122928087-70a3bf00-d361-11eb-81e1-0de21cc36dd7.png)
   If you click on it (marked in yellow in the image), the DAG run list opens and shows the failed run:
   ![image](https://user-images.githubusercontent.com/129836/122928963-46063600-d362-11eb-86eb-400f8de81fc5.png)
   
   Then you can select it and clear the state. This clears the execution, and it works as expected 🙏🏻 
   ![image](https://user-images.githubusercontent.com/129836/122929029-57e7d900-d362-11eb-8845-0cecd6d814ca.png)
   
   The workflow behind this action (clearing) is probably different from the one behind the tree view UI. I hope this helps you fix it :)





[GitHub] [airflow] lsowen commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
lsowen commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-837226456


   Is this or #14265 a candidate for https://github.com/apache/airflow/milestone/21?





[GitHub] [airflow] ericdevries commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
ericdevries commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-839704780


   I also agree that the old behavior is more sensible, but can we at least confirm whether the new implementation is a bug or a feature?





[GitHub] [airflow] ephraimbuddy commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-886919517


   I was able to reproduce this in 2.0.0 but not in 2.1.2. @oz-r can you confirm?





[GitHub] [airflow] boring-cyborg[bot] commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-765717820


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] oz-r commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
oz-r commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-765718399


   I imagine #13407 has the same root cause.





[GitHub] [airflow] lsowen commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
lsowen commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-799745414


   It looks like https://github.com/apache/airflow/pull/14455 is meant to address this.





[GitHub] [airflow] uranusjr commented on issue #13853: Clearing of historic Task or DagRuns leads to failed DagRun

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #13853:
URL: https://github.com/apache/airflow/issues/13853#issuecomment-873757605


   Could someone verify whether this is still an issue with 2.1.1? Context: https://github.com/apache/airflow/issues/14265#issuecomment-873756374

