Posted to commits@airflow.apache.org by "sahar1454 (via GitHub)" <gi...@apache.org> on 2023/10/11 17:24:01 UTC

[I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

sahar1454 opened a new issue, #34869:
URL: https://github.com/apache/airflow/issues/34869

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   In Airflow version `2.5.1`, adding retries to a `ShortCircuitOperator` task causes the scheduler to crash with the following error:
   
   ```
   File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1243, in next_retry_datetime  
   return self.end_date + delay  
   OverflowError: date value out of range
   ```
   Retries on other operators are working fine.
   
   ### What you think should happen instead
   
   We should be able to add the `retries` option to `ShortCircuitOperator` tasks so that they are retried automatically after random failures. For example, if CPU usage on the Airflow workers is high and tasks fail randomly as a result, we would like them to be retried.
   
   ### How to reproduce
   
   Add `retries` with a value greater than 0 to a `ShortCircuitOperator` task in Airflow.
   
   ### Operating System
   
   Linux
   
   ### Versions of Apache Airflow Providers
   
   2.5.1 on AWS Managed Apache Airflow
   
   ### Deployment
   
   Amazon (AWS) MWAA
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #34869:
URL: https://github.com/apache/airflow/issues/34869#issuecomment-1758153133

   Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
   




Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #34869:
URL: https://github.com/apache/airflow/issues/34869#issuecomment-1759951396

   I can't reproduce this; the sample below retries without any issues:
   
   ```python
   
   from datetime import timedelta
   from airflow.decorators import task
   from airflow.models.dag import DAG
   from airflow.operators.python import ShortCircuitOperator
   from airflow.utils.timezone import datetime
   
   
   def _process():
       return 1 / 0
   
   
   with DAG(
       "issue_34869",
       start_date=datetime(2023, 10, 1),
       schedule_interval="@daily",
       catchup=False,
       tags=["issue", "34869"]
   ) as dag:
       task_a = ShortCircuitOperator(
           task_id="always_failed",
           python_callable=_process,
           retries=999_999,
           retry_delay=timedelta(seconds=1),
       )
   
       @task
       def task_b():
           ...
   
       task_a >> task_b()
   ```
   
   ```console
   
   4792c3ee7560
   *** Found local files:
   ***   * /root/airflow/logs/dag_id=issue_34869/run_id=scheduled__2023-10-11T00:00:00+00:00/task_id=always_failed/attempt=120.log
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:1914} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: issue_34869.always_failed scheduled__2023-10-11T00:00:00+00:00 [queued]>
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:1914} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: issue_34869.always_failed scheduled__2023-10-11T00:00:00+00:00 [queued]>
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:2116} INFO - Starting attempt 120 of 1000000
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:2137} INFO - Executing <Task(ShortCircuitOperator): always_failed> on 2023-10-11 00:00:00+00:00
   [2023-10-12, 16:17:45 UTC] {standard_task_runner.py:60} INFO - Started process 1091 to run task
   [2023-10-12, 16:17:45 UTC] {standard_task_runner.py:87} INFO - Running: ['***', 'tasks', 'run', 'issue_34869', 'always_failed', 'scheduled__2023-10-11T00:00:00+00:00', '--job-id', '122', '--raw', '--subdir', 'DAGS_FOLDER/issue_34869.py', '--cfg-path', '/tmp/tmprrzxh1ux']
   [2023-10-12, 16:17:45 UTC] {standard_task_runner.py:88} INFO - Job 122: Subtask always_failed
   [2023-10-12, 16:17:45 UTC] {task_command.py:421} INFO - Running <TaskInstance: issue_34869.always_failed scheduled__2023-10-11T00:00:00+00:00 [running]> on host 4792c3ee7560
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:2390} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='***' AIRFLOW_CTX_DAG_ID='issue_34869' AIRFLOW_CTX_TASK_ID='always_failed' AIRFLOW_CTX_EXECUTION_DATE='2023-10-11T00:00:00+00:00' AIRFLOW_CTX_TRY_NUMBER='120' AIRFLOW_CTX_DAG_RUN_ID='scheduled__2023-10-11T00:00:00+00:00'
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:2608} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/opt/airflow/airflow/models/taskinstance.py", line 432, in _execute_task
       result = execute_callable(context=context, **execute_callable_kwargs)
     File "/opt/airflow/airflow/operators/python.py", line 263, in execute
       condition = super().execute(context)
     File "/opt/airflow/airflow/operators/python.py", line 195, in execute
       return_value = self.execute_callable()
     File "/opt/airflow/airflow/operators/python.py", line 212, in execute_callable
       return self.python_callable(*self.op_args, **self.op_kwargs)
     File "/files/dags/issue_34869.py", line 8, in _process
       return 1 / 0
   ZeroDivisionError: division by zero
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:1130} INFO - Marking task as UP_FOR_RETRY. dag_id=issue_34869, task_id=always_failed, execution_date=20231011T000000, start_date=20231012T161745, end_date=20231012T161745
   [2023-10-12, 16:17:45 UTC] {standard_task_runner.py:107} ERROR - Failed to execute job 122 for task always_failed (division by zero; 1091)
   [2023-10-12, 16:17:45 UTC] {local_task_job_runner.py:233} INFO - Task exited with return code 1
   [2023-10-12, 16:17:45 UTC] {taskinstance.py:3188} INFO - 0 downstream tasks scheduled from follow-on schedule check
   ```




Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "sahar1454 (via GitHub)" <gi...@apache.org>.
sahar1454 commented on issue #34869:
URL: https://github.com/apache/airflow/issues/34869#issuecomment-1758460080

   @Taragolis I don't believe the fix you linked is related to the issue I reported. That fix addresses a different issue: https://github.com/apache/airflow/issues/28171
   I don't see the relation between the two. We are not using exponential backoff and we have set retries to 1, so I'm not sure why it should overflow.




Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "sahar1454 (via GitHub)" <gi...@apache.org>.
sahar1454 commented on issue #34869:
URL: https://github.com/apache/airflow/issues/34869#issuecomment-1759513168

   @Taragolis I believe it is the second case in this scenario (i.e. an invalid end_date). Steps to reproduce are given in the issue description.




Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #34869:
URL: https://github.com/apache/airflow/issues/34869#issuecomment-1758539398

   The only way to overflow the date in this method is for the result to exceed `9999-12-31T23:59:59`:
   - The delay (provided or computed by backoff) is greater than roughly 7976 years, give or take a couple of years
   - The task instance has an invalid end_date
   
   So if you think neither of these is your case, then you should provide a reproducible example.
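
The two overflow conditions listed above can be illustrated with plain `datetime` arithmetic. This is only a sketch: `next_retry_datetime` and `headroom_years` here are hypothetical stand-alone helpers, not Airflow's code, and the only thing taken from the thread is the failing expression `end_date + delay`. The sample `end_date` is an arbitrary 2023 timestamp chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def next_retry_datetime(end_date: datetime, delay: timedelta) -> datetime:
    """Hypothetical stand-in for the failing expression: end_date + delay."""
    return end_date + delay


def headroom_years(end_date: datetime) -> float:
    """Approximate number of years between end_date and datetime.max."""
    limit = datetime.max.replace(tzinfo=end_date.tzinfo)
    return (limit - end_date).days / 365.2425


# Arbitrary sample end_date (an assumption, not taken from a real task instance).
end_date = datetime(2023, 10, 11, 17, 24, tzinfo=timezone.utc)

# An ordinary retry delay is far from the limit and works as expected.
ok = next_retry_datetime(end_date, timedelta(seconds=1))

# Case 1: the delay itself is larger than the remaining headroom.
try:
    next_retry_datetime(end_date, timedelta(days=366 * 8000))
    huge_delay_overflowed = False
except OverflowError:
    huge_delay_overflowed = True

# Case 2: end_date is already at the upper bound, so even 1 second overflows.
bad_end_date = datetime.max.replace(tzinfo=timezone.utc)
try:
    next_retry_datetime(bad_end_date, timedelta(seconds=1))
    bad_end_date_overflowed = False
except OverflowError:
    bad_end_date_overflowed = True

print(f"headroom from 2023: about {headroom_years(end_date):.0f} years")
print("huge delay overflowed:", huge_delay_overflowed)
print("bad end_date overflowed:", bad_end_date_overflowed)
```

With `retries=1` and a short fixed delay, only the second case can plausibly apply, which is why the stored `end_date` of the affected task instance would be the first thing to inspect.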




Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis closed issue #34869: Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator
URL: https://github.com/apache/airflow/issues/34869




Re: [I] Scheduler crashes with OverflowError: date value out of range when retries are added to ShortCircuitOperator [airflow]

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #34869:
URL: https://github.com/apache/airflow/issues/34869#issuecomment-1758259679

   Should be fixed as part of Airflow 2.6.0
   - https://github.com/apache/airflow/pull/28172

