Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/05/06 17:57:41 UTC

[GitHub] [airflow] ashb opened a new pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

ashb opened a new pull request #8739:
URL: https://github.com/apache/airflow/pull/8739


   As part of the scheduler HA work we are going to want to separate the
   parsing from the scheduling, so this changes the tests to ensure that
   the important methods of DagFileProcessor can do everything they need to
   when given a SerializedDAG, not just a DAG. i.e. that we have correctly
   serialized all the necessary fields.
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r425839031



##########
File path: airflow/models/dagbag.py
##########
@@ -456,4 +456,5 @@ def sync_to_db(self):
         from airflow.models.dag import DAG
         from airflow.models.serialized_dag import SerializedDagModel
         DAG.bulk_sync_to_db(self.dags.values())
-        SerializedDagModel.bulk_sync_to_db(self.dags.values())
+        if self.store_serialized_dags:
+            SerializedDagModel.bulk_sync_to_db(self.dags.values())

Review comment:
       @kaxil FYI, plus the next hunk to models/s10n_dag.







[GitHub] [airflow] ashb commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r421844181



##########
File path: tests/jobs/test_scheduler_job.py
##########
@@ -2253,7 +2307,17 @@ def evaluate_dagrun(
             self.null_exec.mock_task_fail(dag_id, tid, ex_date)
 
         try:
-            dag.run(start_date=ex_date, end_date=ex_date, executor=self.null_exec, **run_kwargs)
+            # This needs a _REAL_ dag, not the serialized version
+            if not isinstance(dag, SerializedDAG):
+                real_dag = dag
+            else:
+                # It may not be loaded. This "could" live in DagBag, but it's
+                # only really needed here in tests, not in normal code.
+                if dag_id not in self.non_serialized_dagbag.dag_ids:
+                    self.non_serialized_dagbag.process_file(dag.fileloc)
+
+                real_dag = self.non_serialized_dagbag.get_dag(dag_id)

Review comment:
       Probably fixed by https://github.com/apache/airflow/pull/8775







[GitHub] [airflow] ashb commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r432745457



##########
File path: tests/models/test_serialized_dag.py
##########
@@ -117,7 +116,6 @@ def test_remove_stale_dags(self):
         self.assertFalse(SDM.has_dag(stale_dag.dag_id))
         self.assertTrue(SDM.has_dag(fresh_dag.dag_id))
 
-    @mock.patch('airflow.models.serialized_dag.STORE_SERIALIZED_DAGS', True)
     def test_bulk_sync_to_db(self):

Review comment:
       `SDM.bulk_sync_to_db` will now _always_ write to the DB when called.
   
   Before, it would return early if the config was false. Now the check is moved into `DagBag.sync_to_db`:
   
   ```python
           if self.store_serialized_dags:
               SerializedDagModel.bulk_sync_to_db(self.dags.values())
   ```
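A minimal sketch of the behaviour described in this comment, assuming simplified stand-ins: `FakeSerializedStore` and `FakeDagBag` are hypothetical and only mirror the shape of the change, they are not Airflow's `SerializedDagModel` or `DagBag`.

```python
# Hypothetical stand-ins that mirror the shape of the change, not Airflow's code.
class FakeSerializedStore:
    rows = []

    @classmethod
    def bulk_sync_to_db(cls, dags):
        # No config check here any more: calling this always writes.
        cls.rows.extend(dags)


class FakeDagBag:
    def __init__(self, dags, store_serialized_dags):
        self.dags = dags
        self.store_serialized_dags = store_serialized_dags

    def sync_to_db(self):
        # The guard now lives in the caller, on a per-instance attribute.
        if self.store_serialized_dags:
            FakeSerializedStore.bulk_sync_to_db(self.dags.values())


bag = FakeDagBag({"a": "a", "b": "b"}, store_serialized_dags=False)
bag.sync_to_db()
rows_when_disabled = list(FakeSerializedStore.rows)

# Flipping the attribute on the same instance enables the write.
bag.store_serialized_dags = True
bag.sync_to_db()
rows_when_enabled = list(FakeSerializedStore.rows)
```

Because the guard reads an instance attribute rather than a module-level constant, a test can enable or disable the write per bag without patching anything.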







[GitHub] [airflow] kaxil merged pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #8739:
URL: https://github.com/apache/airflow/pull/8739


   





[GitHub] [airflow] ashb commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r425839302



##########
File path: airflow/models/dagbag.py
##########
@@ -456,4 +456,5 @@ def sync_to_db(self):
         from airflow.models.dag import DAG
         from airflow.models.serialized_dag import SerializedDagModel
         DAG.bulk_sync_to_db(self.dags.values())
-        SerializedDagModel.bulk_sync_to_db(self.dags.values())
+        if self.store_serialized_dags:
+            SerializedDagModel.bulk_sync_to_db(self.dags.values())

Review comment:
       Means I can do
   
   ```python
           non_serialized_dagbag = DagBag(store_serialized_dags=False, include_examples=False)
           non_serialized_dagbag.store_serialized_dags = True
           non_serialized_dagbag.sync_to_db()
   ```







[GitHub] [airflow] ashb commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r421460848



##########
File path: tests/jobs/test_scheduler_job.py
##########
@@ -2253,7 +2307,17 @@ def evaluate_dagrun(
             self.null_exec.mock_task_fail(dag_id, tid, ex_date)
 
         try:
-            dag.run(start_date=ex_date, end_date=ex_date, executor=self.null_exec, **run_kwargs)
+            # This needs a _REAL_ dag, not the serialized version
+            if not isinstance(dag, SerializedDAG):
+                real_dag = dag
+            else:
+                # It may not be loaded. This "could" live in DagBag, but it's
+                # only really needed here in tests, not in normal code.
+                if dag_id not in self.non_serialized_dagbag.dag_ids:
+                    self.non_serialized_dagbag.process_file(dag.fileloc)
+
+                real_dag = self.non_serialized_dagbag.get_dag(dag_id)

Review comment:
       This _feels_ like it shouldn't be necessary; I'm taking a look at it.
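For readers following the hunk above, the fallback it adds can be sketched in isolation. Everything here is a hypothetical stand-in (`FakeDag`, `FakeSerializedDag`, `FakeFileBag`, `resolve_real_dag`); it only illustrates the "parse the file on demand when handed a serialized DAG" pattern, not the real `DagBag` behaviour.

```python
# Hypothetical stand-ins; the real DagBag and SerializedDAG are not used here.
class FakeDag:
    def __init__(self, dag_id, fileloc):
        self.dag_id = dag_id
        self.fileloc = fileloc


class FakeSerializedDag(FakeDag):
    """Marker subclass: carries metadata only and cannot be run."""


class FakeFileBag:
    """Parses 'files' lazily; FILES maps a path to the DAG ids defined in it."""

    FILES = {"/dags/example.py": ["example"]}

    def __init__(self):
        self.dags = {}
        self.parsed = []

    @property
    def dag_ids(self):
        return list(self.dags)

    def process_file(self, path):
        self.parsed.append(path)
        for dag_id in self.FILES.get(path, []):
            self.dags[dag_id] = FakeDag(dag_id, path)

    def get_dag(self, dag_id):
        return self.dags.get(dag_id)


def resolve_real_dag(dag, file_bag):
    """Return a runnable DAG, parsing its source file only if needed."""
    if not isinstance(dag, FakeSerializedDag):
        return dag
    if dag.dag_id not in file_bag.dag_ids:
        file_bag.process_file(dag.fileloc)
    return file_bag.get_dag(dag.dag_id)


file_bag = FakeFileBag()
serialized = FakeSerializedDag("example", "/dags/example.py")
first = resolve_real_dag(serialized, file_bag)
second = resolve_real_dag(serialized, file_bag)
```

The second call reuses the already-parsed DAG, so the source file is processed only once.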







[GitHub] [airflow] ashb commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r432745615



##########
File path: tests/models/test_serialized_dag.py
##########
@@ -117,7 +116,6 @@ def test_remove_stale_dags(self):
         self.assertFalse(SDM.has_dag(stale_dag.dag_id))
         self.assertTrue(SDM.has_dag(fresh_dag.dag_id))
 
-    @mock.patch('airflow.models.serialized_dag.STORE_SERIALIZED_DAGS', True)
     def test_bulk_sync_to_db(self):

Review comment:
       (This test isn't great, but that wasn't changed by this)







[GitHub] [airflow] kaxil commented on pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#issuecomment-624930516


   Looks like it works fine :)





[GitHub] [airflow] kaxil commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r425893005



##########
File path: airflow/models/dagbag.py
##########
@@ -456,4 +456,5 @@ def sync_to_db(self):
         from airflow.models.dag import DAG
         from airflow.models.serialized_dag import SerializedDagModel
         DAG.bulk_sync_to_db(self.dags.values())
-        SerializedDagModel.bulk_sync_to_db(self.dags.values())
+        if self.store_serialized_dags:
+            SerializedDagModel.bulk_sync_to_db(self.dags.values())

Review comment:
       Nice







[GitHub] [airflow] kaxil commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r432742447



##########
File path: tests/models/test_serialized_dag.py
##########
@@ -117,7 +116,6 @@ def test_remove_stale_dags(self):
         self.assertFalse(SDM.has_dag(stale_dag.dag_id))
         self.assertTrue(SDM.has_dag(fresh_dag.dag_id))
 
-    @mock.patch('airflow.models.serialized_dag.STORE_SERIALIZED_DAGS', True)
     def test_bulk_sync_to_db(self):

Review comment:
       How does this test work now? It doesn't store the serialized DAG anymore, does it?







[GitHub] [airflow] kaxil commented on a change in pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#discussion_r432742933



##########
File path: tests/models/test_serialized_dag.py
##########
@@ -117,7 +116,6 @@ def test_remove_stale_dags(self):
         self.assertFalse(SDM.has_dag(stale_dag.dag_id))
         self.assertTrue(SDM.has_dag(fresh_dag.dag_id))
 
-    @mock.patch('airflow.models.serialized_dag.STORE_SERIALIZED_DAGS', True)
     def test_bulk_sync_to_db(self):

Review comment:
       I mean, with the patch it would only sync and write DAGs to the DAGModel table and assert that the query count is 7. Is that the intention?







[GitHub] [airflow] ashb commented on pull request #8739: Test that DagFileProcessor can operator against on a Serialized DAG

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #8739:
URL: https://github.com/apache/airflow/pull/8739#issuecomment-632164460


   Only the Travis/Kube tests are failing now, and they have been doing that for a while :(

