Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/10/21 21:18:02 UTC

[GitHub] [airflow] o-nikolas commented on a diff in pull request #26400: Create a more efficient airflow dag test command that also has better local logging

o-nikolas commented on code in PR #26400:
URL: https://github.com/apache/airflow/pull/26400#discussion_r1002192571


##########
docs/apache-airflow/executor/debug.rst:
##########
@@ -15,11 +15,59 @@
     specific language governing permissions and limitations
     under the License.
 
+Testing DAGs with dag.test()
+=============================
+
+To debug DAGs in an IDE, you can set up the ``dag.test`` command in your DAG file and run through your DAG in a single
+serialized Python process.
+
+This approach can be used with any supported database (including a local SQLite database) and will
+*fail fast* as all tasks run in a single process.
+
+To set up ``dag.test``, add these two lines to the bottom of your DAG file:
+
+.. code-block:: python
+
+  if __name__ == "__main__":
+      dag.test()
+
+and that's it! You can add arguments such as ``execution_date`` if you want to test argument-specific DAG runs, but otherwise
+you can run or debug DAGs as needed.
+
+Comparison with DebugExecutor
+*****************************
+
+The ``dag.test`` command has the following benefits over the :class:`~airflow.executors.debug_executor.DebugExecutor`
+class, which is now deprecated:
+
+1. It does not require running an executor at all. Tasks are run one at a time with no executor or scheduler logs.
+2. It is significantly faster than running code with a DebugExecutor as it does not need to go through a scheduler loop.
+3. It does not perform a backfill.
+
+
+Debugging Airflow DAGs on the command line
+==========================================
+
+With the same two-line addition described in the section above, you can now easily debug a DAG using pdb as well.
+Run ``python -m pdb <path to dag file>.py`` for an interactive debugging experience on the command line.
+
+.. code-block:: bash
+
+  root@ef2c84ad4856:/opt/airflow# python -m pdb airflow/example_dags/example_bash_operator.py
+  > /opt/airflow/airflow/example_dags/example_bash_operator.py(18)<module>()
+  -> """Example DAG demonstrating the usage of the BashOperator."""
+  (Pdb) b 45
+  Breakpoint 1 at /opt/airflow/airflow/example_dags/example_bash_operator.py:45
+  (Pdb) c
+  > /opt/airflow/airflow/example_dags/example_bash_operator.py(45)<module>()
+  -> bash_command='echo 1',
+  (Pdb) run_this_last
+  <Task(EmptyOperator): run_this_last>
 
 .. _executor:DebugExecutor:
 
-Debug Executor
-==================
+Debug Executor (deprecated)

Review Comment:
   +1 to @eladkal's question.
   I stumbled across this while working on something entirely different. I'm very curious to have this discussion as well.
   
   Currently [AIP-47](https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-47+New+design+of+Airflow+System+Tests) system tests depend on the DebugExecutor:
   > The tests should be structured in the way that they are easy to run as “standalone” tests manually but they should also nicely be integrated into pytest test execution environment. This can be achieved by leveraging the **DebugExecutor** and utilising modern pytest test discovery mechanism 
   
   So we'd need to do some migration there if we were to deprecate this executor (CC @potiuk @mnojek). I know @bhirsz has mentioned before that they're using a different executor altogether for their system tests; maybe that's a more viable option.
   
   We should also add deprecation warnings if we plan to deprecate this executor.
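
As a rough illustration of the deprecation warning mentioned above, here is a minimal sketch using only Python's standard `warnings` module. The class name `DebugExecutorSketch` and the message text are hypothetical stand-ins, not Airflow code; a real change would presumably live in the executor's own constructor and might use a project-specific warning category instead of `DeprecationWarning`.

```python
import warnings


class DebugExecutorSketch:
    """Hypothetical stand-in for the executor being deprecated (not Airflow code)."""

    def __init__(self) -> None:
        # Warn whenever the deprecated class is instantiated.
        # stacklevel=2 attributes the warning to the caller's line,
        # which is where the user needs to change their code.
        warnings.warn(
            "DebugExecutor is deprecated; consider using dag.test() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
```

Emitting the warning at construction time means it fires for each executor instance that gets created, and `stacklevel=2` makes the reported location the calling code rather than the constructor itself.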


