Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/29 18:39:05 UTC

[GitHub] [airflow] potiuk edited a comment on pull request #19860: Restore stability and unquarantine all test_scheduler_job tests

potiuk edited a comment on pull request #19860:
URL: https://github.com/apache/airflow/pull/19860#issuecomment-981908081


   Hmm. @ashb @uranusjr @ephraimbuddy maybe you could take a look at this one and help me figure out what happened. I have a theory and I am trying it out in the latest commit, but I have just debugged it and learned more about how dag_maker works internally (pretty cool, actually).
   
   Below you will see a coloured dump of the failing `test_scheduler_keeps_scheduling_pool_full` test.
   
   Here is my line of thinking:
   
   What I see there is that the `create_dagrun` loop for the `d2` dag failed with `Couldn't find dag` (scheduler_job.py:992). You can see five of those failures (which is expected, as we try to create 5 dagruns).
   
   When the test (normally) runs successfully there are, as expected, no error messages, the DagRuns are created and life is good. So clearly the reason for the test failing is those `Couldn't find dag` errors. But from what I understand of how get_dag() works, the only way to get `Couldn't find dag` is that there is no SerializedDagModel available.
   
   For whatever reason, when the test gets to the `d1` create_dagruns loop, the SerializedDag model is not available via sqlalchemy. The whole test runs (as I understand it) without a single commit() - so all the SerializedDag objects should only exist in the session (because they are never committed to the database). This basically means that the whole test only has a chance to work if both `dag_makers` above use the same session.
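   
   To illustrate why that matters, here is a minimal sqlalchemy-only sketch (the model below is a made-up stand-in, not the real SerializedDagModel): pending, uncommitted objects are visible to queries made through the same session that holds them, but a second, independent session does not see them at all - which is exactly the `Couldn't find dag` situation:
   
   ```python
   import tempfile

   from sqlalchemy import Column, Integer, String, create_engine
   from sqlalchemy.orm import declarative_base, sessionmaker

   Base = declarative_base()


   class SerializedDag(Base):
       """Stand-in for SerializedDagModel - just enough to show session visibility."""

       __tablename__ = "serialized_dag"
       id = Column(Integer, primary_key=True)
       dag_id = Column(String(250))


   # File-based sqlite so the two sessions really get separate connections.
   engine = create_engine(f"sqlite:///{tempfile.mkdtemp()}/demo.db")
   Base.metadata.create_all(engine)
   Session = sessionmaker(bind=engine)

   session_1 = Session()
   session_1.add(SerializedDag(dag_id="d2"))  # added, but never commit()-ed

   # Same session: autoflush sends the pending row to the open transaction,
   # so the query sees it.
   assert session_1.query(SerializedDag).filter_by(dag_id="d2").first() is not None

   # Different session: separate connection and transaction, so the
   # uncommitted row is simply not there.
   session_2 = Session()
   assert session_2.query(SerializedDag).filter_by(dag_id="d2").first() is None
   ```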
   
   My theory is that, for some reason, the second run of the dag_maker used a different Session() from sqlalchemy. This is the only way I can explain this behaviour, but I am not sure how it could happen. Yes, the dag_makers did not have a session passed to them - which triggers `own_session` - so each of the dag_maker contexts would create a new Session(), but at least in theory those should be in the same scope: from what I understand, sqlalchemy stores the actual session in a thread-local and reuses it if Session() is called twice. But maybe I do not understand sqlalchemy session behaviour, and there is a way the same thread can get two different sessions when it calls Session() twice.
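   
   For completeness, this is the thread-local behaviour I mean (a small sketch of the plain sqlalchemy `scoped_session` mechanism, not Airflow's actual settings code): calling the scoped session twice in the same thread hands back the very same session object, unless something clears the registry in between:
   
   ```python
   from sqlalchemy import create_engine
   from sqlalchemy.orm import scoped_session, sessionmaker

   engine = create_engine("sqlite://")
   Session = scoped_session(sessionmaker(bind=engine))

   s1 = Session()
   s2 = Session()
   # Same thread, same registry -> the exact same Session object comes back.
   assert s1 is s2

   # ...unless the registry is cleared in between (e.g. by some fixture
   # teardown calling Session.remove()) - then the next call builds a fresh one.
   Session.remove()
   s3 = Session()
   assert s3 is not s1
   ```
   
   So if anything called `Session.remove()` (or swapped the registry) between the two dag_maker contexts, the second one would silently get a brand new session - and with it, none of the uncommitted SerializedDag rows.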
   
   I am trying out my theory by adding a change where the two dag_makers actually share a session created before them - these tests were consistently failing in about 1 out of 2 runs before, so I hope we will see the result of it soon. But if the theory is confirmed, then I wonder under what circumstances this could happen.
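   
   In case it helps to picture it, the change is roughly along these lines (a simplified sketch from memory, not the literal test code from the PR - dag ids and operators are placeholders):
   
   ```python
   from airflow import settings
   from airflow.operators.dummy import DummyOperator


   def test_scheduler_keeps_scheduling_pool_full(dag_maker):
       # One explicit session, shared by both dag_maker contexts, so neither
       # of them falls back to creating its own one.
       session = settings.Session()

       with dag_maker("d1", session=session):
           DummyOperator(task_id="dummy")
       with dag_maker("d2", session=session):
           DummyOperator(task_id="dummy")

       # ... the rest of the test keeps querying through the same `session`,
       # so the uncommitted SerializedDag rows stay visible to it.
   ```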
   
   ![Screenshot from 2021-11-29 19-07-05](https://user-images.githubusercontent.com/595491/143920234-54bb3984-2099-4f3e-8dd9-03cd749b83f5.png)
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org