You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/11 04:16:42 UTC

[GitHub] [airflow] jedcunningham commented on a change in pull request #21501: Support different timeout value for dag file parsing

jedcunningham commented on a change in pull request #21501:
URL: https://github.com/apache/airflow/pull/21501#discussion_r804344322



##########
File path: airflow/settings.py
##########
@@ -209,6 +209,18 @@ def get_airflow_context_vars(context):
     return {}
 
 
+def get_dagbag_import_timeout(dag_file_path: str) -> float:
+    """
+    This setting allows to dynamically control the dag file parsing timeout.
+
+    It is useful when there are a few dag files, requiring long parsing time while others not

Review comment:
       ```suggestion
       It is useful when there are a few DAG files requiring longer parsing times, while others do not.
   ```

##########
File path: tests/models/test_dagbag.py
##########
@@ -206,6 +207,49 @@ def test_zip(self):
         assert dagbag.get_dag("test_zip_dag")
         assert sys.path == syspath_before  # sys.path doesn't change
 
+    @patch("airflow.models.dagbag.timeout")
+    @patch("airflow.models.dagbag.settings.get_dagbag_import_timeout")
+    def test_process_dag_file_without_timeout(self, mocked_get_dagbag_import_timeout, mocked_timeout):
+        """
+        Test dag file parsing without timeout
+        """
+        mocked_get_dagbag_import_timeout.return_value = 0

Review comment:
       We also check with a negative, no?

##########
File path: tests/models/test_dagbag.py
##########
@@ -206,6 +207,49 @@ def test_zip(self):
         assert dagbag.get_dag("test_zip_dag")
         assert sys.path == syspath_before  # sys.path doesn't change
 
+    @patch("airflow.models.dagbag.timeout")
+    @patch("airflow.models.dagbag.settings.get_dagbag_import_timeout")
+    def test_process_dag_file_without_timeout(self, mocked_get_dagbag_import_timeout, mocked_timeout):
+        """
+        Test dag file parsing without timeout
+        """
+        mocked_get_dagbag_import_timeout.return_value = 0
+
+        dagbag = models.DagBag(dag_folder=self.empty_dir, include_examples=False)
+        dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, 'test_default_views.py'))
+        mocked_timeout.assert_not_called()
+
+    @patch("airflow.models.dagbag.timeout")
+    @patch("airflow.models.dagbag.settings.get_dagbag_import_timeout")
+    def test_process_dag_file_with_non_default_timeout(
+        self, mocked_get_dagbag_import_timeout, mocked_timeout
+    ):
+        """
+        Test customized dag file parsing timeout
+        """
+        timeout_value = 100
+        mocked_get_dagbag_import_timeout.return_value = timeout_value
+
+        # ensure the test value is not equal to the default value
+        assert timeout_value != settings.conf.getfloat('core', 'DAGBAG_IMPORT_TIMEOUT')
+
+        dagbag = models.DagBag(dag_folder=self.empty_dir, include_examples=False)
+        dagbag.process_file(os.path.join(TEST_DAGS_FOLDER, 'test_default_views.py'))
+
+        mocked_timeout.assert_called_once_with(timeout_value, error_message=mock.ANY)
+
+    @patch("airflow.models.dagbag.settings.get_dagbag_import_timeout")
+    def test_check_value_type_from_get_dagbag_import_timeout(self, mocked_get_dagbag_import_timeout):
+        """
+        Test correctness of value from get_dagbag_import_timeout
+        """
+        mocked_get_dagbag_import_timeout.return_value = '1'

Review comment:
       ```suggestion
           mocked_get_dagbag_import_timeout.return_value = 0.1
   ```
   Faster, and the right type?

##########
File path: airflow/settings.py
##########
@@ -209,6 +209,18 @@ def get_airflow_context_vars(context):
     return {}
 
 
+def get_dagbag_import_timeout(dag_file_path: str) -> float:
+    """
+    This setting allows to dynamically control the dag file parsing timeout.
+
+    It is useful when there are a few dag files, requiring long parsing time while others not
+    You can control them separately instead of having one value for all dag files.

Review comment:
       ```suggestion
       You can control them separately instead of having one value for all DAG files.
   ```

##########
File path: airflow/settings.py
##########
@@ -209,6 +209,18 @@ def get_airflow_context_vars(context):
     return {}
 
 
+def get_dagbag_import_timeout(dag_file_path: str) -> float:
+    """
+    This setting allows to dynamically control the dag file parsing timeout.
+
+    It is useful when there are a few dag files, requiring long parsing time while others not
+    You can control them separately instead of having one value for all dag files.
+
+    If the return value is less than 0, it means no timeout during the dag parsing.

Review comment:
       ```suggestion
       If the return value is less than or equal to 0, it means no timeout while parsing.
   ```

##########
File path: airflow/settings.py
##########
@@ -209,6 +209,18 @@ def get_airflow_context_vars(context):
     return {}
 
 
+def get_dagbag_import_timeout(dag_file_path: str) -> float:

Review comment:
       ```suggestion
   def get_dagbag_import_timeout(dag_file_path: str) -> Union[int, float]:
   ```
   

##########
File path: airflow/settings.py
##########
@@ -209,6 +209,18 @@ def get_airflow_context_vars(context):
     return {}
 
 
+def get_dagbag_import_timeout(dag_file_path: str) -> float:
+    """
+    This setting allows to dynamically control the dag file parsing timeout.

Review comment:
       ```suggestion
       This setting allows for dynamically control of the dag file parsing timeout based on the DAG file path.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org