Posted to commits@airflow.apache.org by "o-nikolas (via GitHub)" <gi...@apache.org> on 2023/02/14 00:04:13 UTC

[GitHub] [airflow] o-nikolas commented on pull request #29494: Fix circular imports when airflow starts

o-nikolas commented on PR #29494:
URL: https://github.com/apache/airflow/pull/29494#issuecomment-1428888690

   >The first traceback you 
   
   > I have not found yet which change triggered it (for sure that was not the change by @o-nikolas #29257 as I reverted it and it did not help) but it seems our main cli `airflow` fails with circular imports. Maybe others can find where it is from but my changes fix it (@o-nikolas - this is another manifestation of the circular "dependencies" we have - between config and settings):
   > 
   > Example stack trace I got before that one in various stages of the fix:
   > 
   > ```
   > Traceback (most recent call last):
   >   File "/Users/jarek/.pyenv/versions/airflow-3.9/bin/airflow", line 5, in <module>
   >     from airflow.__main__ import main
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/__main__.py", line 27, in <module>
   >     from airflow.cli import cli_parser
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/cli/cli_parser.py", line 33, in <module>
   >     from airflow import settings
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/settings.py", line 39, in <module>
   >     from airflow.configuration import AIRFLOW_HOME, WEBSERVER_CONFIG, conf  # NOQA F401
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/configuration.py", line 1794, in <module>
   >     conf.validate()
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/configuration.py", line 343, in validate
   >     self._validate_config_dependencies()
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/configuration.py", line 439, in _validate_config_dependencies
   >     executor, _ = ExecutorLoader.import_default_executor_cls()
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/executors/executor_loader.py", line 151, in import_default_executor_cls
   >     return cls.import_executor_cls(executor_name)
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/executors/executor_loader.py", line 126, in import_executor_cls
   >     return import_string(cls.executors[executor_name]), ConnectorSource.CORE
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/utils/module_loading.py", line 36, in import_string
   >     module = import_module(module_path)
   >   File "/Users/jarek/.pyenv/versions/3.9.9/lib/python3.9/importlib/__init__.py", line 127, in import_module
   >     return _bootstrap._gcd_import(name[level:], package, level)
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/executors/sequential_executor.py", line 30, in <module>
   >     from airflow.executors.base_executor import BaseExecutor, CommandType
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/executors/base_executor.py", line 34, in <module>
   >     from airflow.models.taskinstance import TaskInstance, TaskInstanceKey
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/models/taskinstance.py", line 72, in <module>
   >     from airflow.datasets.manager import dataset_manager
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/datasets/manager.py", line 27, in <module>
   >     from airflow.models.dataset import DatasetDagRunQueue, DatasetEvent, DatasetModel
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/models/dataset.py", line 40, in <module>
   >     from airflow.utils import timezone
   >   File "/Users/jarek/IdeaProjects/airflow/airflow/utils/timezone.py", line 27, in <module>
   >     from airflow.settings import TIMEZONE
   > ImportError: cannot import name 'TIMEZONE' from partially initialized module 'airflow.settings' (most likely due to a circular import) (/Users/jarek/IdeaProjects/airflow/airflow/settings.py)
   > ```
   
   @potiuk I believe a fix similar to the one I applied in `airflow.__init__` (in #29257) would also have fixed this case. Basically, if `__main__.py` imports `airflow.configuration.conf` **before** `airflow.cli.cli_parser` (which imports `airflow.settings`), the cycle would be removed.
   We get cycles like this when `airflow.settings` **is the first module to import/load `airflow.configuration`**: the very first import of `airflow.configuration` creates and validates the conf object, and much of the conf validation code depends on `airflow.settings` downstream, so we can't let that first import happen inside `airflow.settings`.
   
   So importing `airflow.configuration` early (thus initializing and validating conf **outside** the context of `airflow.settings`) would, I think, have fixed the issue in `__main__.py`:
   ```diff
    from __future__ import annotations
    
    import os
    
    import argcomplete
    
   -from airflow.cli import cli_parser
    from airflow.configuration import conf
   +from airflow.cli import cli_parser
    
    
    def main():
        """Main executable function."""
        if conf.get("core", "security") == "kerberos":
            os.environ["KRB5CCNAME"] = conf.get("kerberos", "ccache")
   ```
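
To make the ordering argument concrete, here is a minimal, self-contained sketch that reproduces the same shape of cycle outside Airflow. The modules `demo_settings`, `demo_configuration` and `demo_timezone` are hypothetical stand-ins for `airflow.settings`, `airflow.configuration` and `airflow.utils.timezone`; they are not Airflow code and only mimic the import relationships described above. The helper `try_import_order` is also made up for this illustration.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical stand-ins (not Airflow code) for airflow.settings,
# airflow.configuration and airflow.utils.timezone.
MODULE_SOURCES = {
    "demo_settings": """
        from demo_configuration import conf
        TIMEZONE = "UTC"
    """,
    "demo_configuration": """
        class Conf:
            def validate(self):
                # Validation reaches code that needs settings.
                import demo_timezone

        conf = Conf()
        conf.validate()
    """,
    "demo_timezone": """
        from demo_settings import TIMEZONE
    """,
}

# Write the toy modules to a temporary directory and make them importable.
_module_dir = Path(tempfile.mkdtemp())
for _name, _source in MODULE_SOURCES.items():
    (_module_dir / f"{_name}.py").write_text(textwrap.dedent(_source))
sys.path.insert(0, str(_module_dir))
importlib.invalidate_caches()


def try_import_order(order):
    """Import `order` from a clean state; return the ImportError text or None."""
    for name in MODULE_SOURCES:
        sys.modules.pop(name, None)
    try:
        for name in order:
            importlib.import_module(name)
    except ImportError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print("settings first:     ", try_import_order(["demo_settings"]))
    print("configuration first:", try_import_order(["demo_configuration", "demo_settings"]))
```

In this toy setup, importing `demo_settings` first fails because validation runs while `demo_settings` is still partially initialized and has not yet defined `TIMEZONE`. Importing `demo_configuration` first succeeds because `conf` already exists by the time `demo_settings` asks for it, and `demo_settings` can then finish loading before `demo_timezone` needs `TIMEZONE`.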

