Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/15 13:55:47 UTC

[GitHub] [airflow] lukasbriaukus opened a new issue #21585: Sensor timeout sometimes is not respected

lukasbriaukus opened a new issue #21585:
URL: https://github.com/apache/airflow/issues/21585


   ### Apache Airflow version
   
   2.2.3 (latest released)
   
   ### What happened
   
   Tasks continue to be rescheduled even after the timeout has been reached, so the sensor never fails.
   
   ### What you expected to happen
   
   I think we hit https://github.com/apache/airflow/issues/10790 or another Airflow issue, so no record was associated with the first task run and Airflow kept treating the task as if it had just started. The code below, from https://github.com/apache/airflow/blob/main/airflow/sensors/base.py, caused the timeout value to be ignored: the first attempt doesn't exist in the DB, so start_date is always set to utcnow.
   ```python
   def execute(self, context: Context) -> Any:
       started_at: Union[datetime.datetime, float]

       if self.reschedule:

           # If reschedule, use the start date of the first try (first try can be either the very
           # first execution of the task, or the first execution after the task was cleared.)
           first_try_number = context['ti'].max_tries - self.retries + 1
           task_reschedules = TaskReschedule.find_for_task_instance(
               context['ti'], try_number=first_try_number
           )
           if not task_reschedules:
               start_date = timezone.utcnow()
           else:
               start_date = task_reschedules[0].start_date
           started_at = start_date

           def run_duration() -> float:
               # If we are in reschedule mode, then we have to compute diff
               # based on the time in a DB, so can't use time.monotonic
               return (timezone.utcnow() - start_date).total_seconds()
   ```
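
To illustrate the effect without an Airflow installation, here is a minimal standalone sketch of the start-date fallback. It is not the Airflow implementation: `resolve_start_date` and `has_timed_out` are hypothetical helper names, the list of datetimes stands in for the `TaskReschedule` rows of the first try, and the "elapsed time greater than timeout" comparison is my assumption about how the run duration is checked.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def resolve_start_date(reschedule_start_dates: List[datetime], now: datetime) -> datetime:
    # Same branch as in the excerpt: with no reschedule rows for the first try,
    # the start date silently falls back to the current time.
    if not reschedule_start_dates:
        return now
    return reschedule_start_dates[0]


def has_timed_out(reschedule_start_dates: List[datetime], now: datetime, timeout: float) -> bool:
    # Hypothetical helper: compares the elapsed run duration with the timeout (seconds).
    start_date = resolve_start_date(reschedule_start_dates, now)
    return (now - start_date).total_seconds() > timeout


first_try_start = datetime(2022, 2, 15, 12, 0, tzinfo=timezone.utc)
two_hours_later = first_try_start + timedelta(hours=2)
one_hour = 3600.0

# Record for the first try exists: two hours elapsed against a one hour timeout.
with_record = has_timed_out([first_try_start], two_hours_later, one_hour)
# Record is missing: the start date is reset to "now", so zero seconds have elapsed.
without_record = has_timed_out([], two_hours_later, one_hour)
print(with_record, without_record)  # True False
```

With the record missing, every poke sees a run duration of roughly zero, which is why the sensor keeps getting rescheduled instead of failing.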
   
   ### How to reproduce
   
   Run a sensor in reschedule mode, force the first attempt to fail, and then remove its status from db.task_instance.
   
   ### Operating System
   
   CentOS 7.9.2009
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   Since we have an issue where the first attempt fails without writing its state to the backend DB, this happens fairly frequently.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #21585: Sensor timeout sometimes is not respected

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #21585:
URL: https://github.com/apache/airflow/issues/21585#issuecomment-1043064319


   I think the root cause should be treated, not this one - to be honest.





[GitHub] [airflow] lukasbriaukus commented on issue #21585: Sensor timeout sometimes is not respected

Posted by GitBox <gi...@apache.org>.
lukasbriaukus commented on issue #21585:
URL: https://github.com/apache/airflow/issues/21585#issuecomment-1043163998


   > I think the root cause should be treated, not this one - to be honest.
   
   This works assuming that everything is fine with the first try state. If it is not, the timeout is ignored and further DAG runs can be silently blocked if nobody takes care of it.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #21585: Sensor timeout sometimes is not respected

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #21585:
URL: https://github.com/apache/airflow/issues/21585#issuecomment-1040303481


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] potiuk commented on issue #21585: Sensor timeout sometimes is not respected

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #21585:
URL: https://github.com/apache/airflow/issues/21585#issuecomment-1052518274


   I do not really understand what the issue is. Could you please make a PR with the fix proposal? Maybe it will be easier to discuss it. In the meantime I am not sure this issue is an "issue" - I will make it into a discussion - but feel free to provide a PR and link it to the discussion. We do not need issues at all, and PRs fixing stuff are recommended.




