You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "cosinequanon (via GitHub)" <gi...@apache.org> on 2023/02/28 18:48:15 UTC

[GitHub] [airflow] cosinequanon opened a new issue, #29819: DAG fails serialization if template_field contains execution_timeout

cosinequanon opened a new issue, #29819:
URL: https://github.com/apache/airflow/issues/29819

   ### Apache Airflow version
   
   2.5.1
   
   ### What happened
   
   If an Operator specifies a template_field with `execution_timeout` then the DAG will serialize correctly but throw an error during deserialization. This causes the entire scheduler to crash and breaks the application.
   
   ### What you think should happen instead
   
   The scheduler should never go down because of some code someone wrote, this should probably throw an error during serialization.
   
   ### How to reproduce
   
   Define an operator like this
   ```
   class ExecutionTimeoutOperator(BaseOperator):
       template_fields = ("execution_timeout", )
   
       def __init__(self, execution_timeout: timedelta, **kwargs):
           super().__init__(**kwargs)
           self.execution_timeout = execution_timeout
   ```
   then make a dag like this
   ```
   dag = DAG(
       "serialize_with_default",
       schedule_interval="0 12 * * *",
       start_date=datetime(2023, 2, 28),
       catchup=False,
       default_args={
           "execution_timeout": timedelta(days=4),
       },
   )
   
   with dag:
       execution = ExecutionTimeoutOperator(task_id="execution", execution_timeout=timedelta(hours=1))
   ```
   that will break the scheduler, you can force the stack trace by doing this
   ```
   from airflow.models import DagBag
   db = DagBag('dags/siri/staging/exp_airflow_dags', read_dags_from_db=True)
   db.get_dag('serialize_with_default')
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/usr/local/lib/python3.9/site-packages/airflow/utils/session.py", line 75, in wrapper
       return func(*args, session=session, **kwargs)
     File "/usr/local/lib/python3.9/site-packages/airflow/models/dagbag.py", line 190, in get_dag
       self._add_dag_from_db(dag_id=dag_id, session=session)
     File "/usr/local/lib/python3.9/site-packages/airflow/models/dagbag.py", line 265, in _add_dag_from_db
       dag = row.dag
     File "/usr/local/lib/python3.9/site-packages/airflow/models/serialized_dag.py", line 218, in dag
       dag = SerializedDAG.from_dict(self.data)
     File "/usr/local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 1287, in from_dict
       return cls.deserialize_dag(serialized_obj["dag"])
     File "/usr/local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 1194, in deserialize_dag
       v = {task["task_id"]: SerializedBaseOperator.deserialize_operator(task) for task in v}
     File "/usr/local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 1194, in <dictcomp>
       v = {task["task_id"]: SerializedBaseOperator.deserialize_operator(task) for task in v}
     File "/usr/local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 955, in deserialize_operator
       cls.populate_operator(op, encoded_op)
     File "/usr/local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 864, in populate_operator
       v = cls._deserialize_timedelta(v)
     File "/usr/local/lib/python3.9/site-packages/airflow/serialization/serialized_objects.py", line 513, in _deserialize_timedelta
       return datetime.timedelta(seconds=seconds)
   TypeError: unsupported type for timedelta seconds component: str
   ```
   
   ### Operating System
   
   Mac 13.1 (22C65)
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-amazon==5.1.0
   apache-airflow-providers-apache-hdfs==3.2.0
   apache-airflow-providers-apache-hive==5.1.1
   apache-airflow-providers-apache-spark==4.0.0
   apache-airflow-providers-celery==3.1.0
   apache-airflow-providers-cncf-kubernetes==5.1.1
   apache-airflow-providers-common-sql==1.3.3
   apache-airflow-providers-datadog==3.1.0
   apache-airflow-providers-ftp==3.3.0
   apache-airflow-providers-http==4.1.1
   apache-airflow-providers-imap==3.1.1
   apache-airflow-providers-jdbc==3.3.0
   apache-airflow-providers-jenkins==3.2.0
   apache-airflow-providers-mysql==4.0.0
   apache-airflow-providers-pagerduty==3.1.0
   apache-airflow-providers-postgres==5.4.0
   apache-airflow-providers-presto==4.2.1
   apache-airflow-providers-slack==7.2.0
   apache-airflow-providers-sqlite==3.3.1
   apache-airflow-providers-ssh==3.4.0
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   I could repro this with docker-compose and in a helm backed deployment so I don't think it's really related to the deployment details
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk closed issue #29819: DAG fails serialization if template_field contains execution_timeout

Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk closed issue #29819: DAG fails serialization if template_field contains execution_timeout
URL: https://github.com/apache/airflow/issues/29819


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on issue #29819: DAG fails serialization if template_field contains execution_timeout

Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on issue #29819:
URL: https://github.com/apache/airflow/issues/29819#issuecomment-1448688696

   Thanks for opening your first issue here! Be sure to follow the issue template!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org