Posted to commits@airflow.apache.org by "dinigo (via GitHub)" <gi...@apache.org> on 2023/02/28 13:56:13 UTC

[GitHub] [airflow] dinigo opened a new issue, #29803: Run DAG in isolated session

dinigo opened a new issue, #29803:
URL: https://github.com/apache/airflow/issues/29803

   ### Apache Airflow version
   
   2.5.1
   
   ### What happened
   
   While trying the new `airflow.models.DAG.test` function to run end-to-end tests on a DAG with `pytest`, I found there is no way to force it to write to a database other than the configured one.
   
   The following should create a SQLAlchemy session for an in-memory database, initialise the database, and then use it for the test:
   
   ```python
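   from pytest import fixture
   from sqlalchemy import create_engine
   from sqlalchemy.orm import Session
   
   from airflow.utils.db import initdb
   
   
   @fixture(scope="session")
   def airflow_db():
       # in-memory database
       engine = create_engine("sqlite://")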
       with Session(engine) as db_session:
           initdb(session=db_session, load_connections=False)
           yield db_session
   
   
   def test_dag_runs_default(airflow_db):
       dag.test(session=airflow_db)
   ```
   
   However, `initdb` never uses the engine bound to the session it receives. It uses the engine **from `settings`**, which was initialised earlier, **instead of the engine from the session**.
   https://github.com/apache/airflow/blob/main/airflow/utils/db.py#L694-L695
   ```python
   with create_global_lock(session=session, lock=DBLocks.MIGRATIONS):
       Base.metadata.create_all(settings.engine)
       Model.metadata.create_all(settings.engine)
   ```
   
   Then `_create_flask_session_tbl()` reads the database from the config again (which may or may not be the same one as when `settings` was initialised) and, once more, creates the Airflow tables in a database different from the one provided in the session.
   
   
   ### What you think should happen instead
   
   The SQLAlchemy base, models, and Airflow tables should be created in the database provided by the session.
   
   If the session is injected automatically, it will match the config. But if a session is passed explicitly, that session should be used instead.
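
A minimal sketch of the expected behaviour, using plain SQLAlchemy rather than Airflow itself. `ExampleModel` and `create_tables` are hypothetical stand-ins (not Airflow code), and `configured_engine` only plays the role of `settings.engine`. The idea is that table creation could ask the session for its bind via `Session.get_bind()` instead of reading a global engine:

```python
from sqlalchemy import Column, Integer, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base

# Stand-in for Airflow's declarative base; `ExampleModel` is hypothetical.
Base = declarative_base()


class ExampleModel(Base):
    __tablename__ = "example_model"
    id = Column(Integer, primary_key=True)


def create_tables(session: Session) -> None:
    """Create all tables in the database the session is bound to."""
    # Session.get_bind() returns the engine (or connection) backing the
    # session, so no global engine is consulted.
    Base.metadata.create_all(session.get_bind())


# Plays the role of the globally configured engine (`settings.engine`).
configured_engine = create_engine("sqlite://")
# Plays the role of the isolated engine created by the test fixture.
isolated_engine = create_engine("sqlite://")

with Session(isolated_engine) as session:
    create_tables(session)

# Each in-memory SQLite engine has its own database, so the two can be compared.
isolated_tables = inspect(isolated_engine).get_table_names()
configured_tables = inspect(configured_engine).get_table_names()
```

With this approach the tables land only in the database behind the provided session, and the configured database is left untouched. Whether the same change is enough inside `initdb` is an assumption, since other helpers may also read the config.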
   
   ### How to reproduce
   
   This initialises the database specified in the config (defaults to `${HOME}/airflow/airflow.db`); the test then tries to use the in-memory one and breaks.
   ```python
   @fixture(scope="session")
   def airflow_db():
       # in-memory database
       engine = create_engine("sqlite://")
       with Session(engine) as db_session:
           initdb(session=db_session, load_connections=False)
           yield db_session
   
   
   def test_dag_runs_default(airflow_db):
       dag.test(session=airflow_db)
   ```
   
   ### Operating System
   
   macOS
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Virtualenv installation
   
   ### Deployment details
   
   _No response_
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr closed issue #29803: Run DAG in isolated session

Posted by "uranusjr (via GitHub)" <gi...@apache.org>.
uranusjr closed issue #29803: Run DAG in isolated session
URL: https://github.com/apache/airflow/issues/29803

