Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/06 15:09:18 UTC

[GitHub] [airflow] eladkal commented on a change in pull request #22025: Add information on DAG pausing/deactivation/deletion

eladkal commented on a change in pull request #22025:
URL: https://github.com/apache/airflow/pull/22025#discussion_r820247041



##########
File path: docs/apache-airflow/concepts/dags.rst
##########
@@ -745,3 +745,34 @@ the dependency graph.
 
 The dependency detector is configurable, so you can implement your own logic different than the defaults in
 :class:`~airflow.serialization.serialized_objects.DependencyDetector`
+
+DAG pausing, deactivation and deletion
+--------------------------------------
+
+The DAGs have several states when it comes to being "not running". DAGs can be paused, deactivated
+and finally all metadata for the ``DAG`` can be deleted.
+
+Dag can be paused via UI when it is present in the ``DAGS_FOLDER``, and scheduler stored it in
+the database, but the user chose to disable it via the UI. The ``pause/unpause`` actions are available
+via UI and API. Paused ``DAG`` is not scheduled by the Scheduler, but you can trigger them via UI for
+manual runs.
+
+Dag can be deactivated by removing them from the ``DAGS_FOLDER``. When scheduler parses the ``DAGS_FOLDER``
+and misses the ``DAG`` that it has seen before and stored in the database it will set is as inactive. The
+metadata and history of the ``DAG`` is kept and when the ``DAG`` is re-added to the ``DAGS_FOLDER`` it will
+be again activated. You cannot activate/deactivate ``DAG`` via UI or API, this can only be done by removing
+files from the ``DAGS_FOLDER``. In the UI, you can see inactive DAGs (in ``All`` tab) but they are not
+scheduled by the scheduler. You can also trigger them via UI for manual execution but they will not get
+executed until they ``DAG`` file that generated the dag re-appears in the ``DAGS_FOLDER`` and the ``DAG``
+becomes active. No data for historical runs of the ``DAG`` are lost when it is deactivated by the scheduler.
+
+You can also delete the ``DAG`` metadata from the metadata database using UI or API , but it does not
+necessarily deletes the ``DAG`` itself. If the ``DAG`` is in ``DAGS_FOLDER`` and you delete the metadata,

Review comment:
       I think it never deletes the DAG itself?

##########
File path: docs/apache-airflow/concepts/dags.rst
##########
@@ -745,3 +745,34 @@ the dependency graph.
 
 The dependency detector is configurable, so you can implement your own logic different than the defaults in
 :class:`~airflow.serialization.serialized_objects.DependencyDetector`
+
+DAG pausing, deactivation and deletion
+--------------------------------------
+
+The DAGs have several states when it comes to being "not running". DAGs can be paused, deactivated
+and finally all metadata for the ``DAG`` can be deleted.
+
+Dag can be paused via UI when it is present in the ``DAGS_FOLDER``, and scheduler stored it in
+the database, but the user chose to disable it via the UI. The ``pause/unpause`` actions are available
+via UI and API. Paused ``DAG`` is not scheduled by the Scheduler, but you can trigger them via UI for
+manual runs.
+
+Dag can be deactivated by removing them from the ``DAGS_FOLDER``. When scheduler parses the ``DAGS_FOLDER``
+and misses the ``DAG`` that it has seen before and stored in the database it will set is as inactive. The
+metadata and history of the ``DAG`` is kept and when the ``DAG`` is re-added to the ``DAGS_FOLDER`` it will
+be again activated. You cannot activate/deactivate ``DAG`` via UI or API, this can only be done by removing
+files from the ``DAGS_FOLDER``. In the UI, you can see inactive DAGs (in ``All`` tab) but they are not
+scheduled by the scheduler. You can also trigger them via UI for manual execution but they will not get
+executed until they ``DAG`` file that generated the dag re-appears in the ``DAGS_FOLDER`` and the ``DAG``
+becomes active. No data for historical runs of the ``DAG`` are lost when it is deactivated by the scheduler.
+
+You can also delete the ``DAG`` metadata from the metadata database using UI or API , but it does not
+necessarily deletes the ``DAG`` itself. If the ``DAG`` is in ``DAGS_FOLDER`` and you delete the metadata,
+the ``DAG`` will re-appear as Scheduler will parse the folder, only historical runs information for the
+``DAG`` will be removed (except the logs - those are only removed if you delete the log files).
+
+This all means that if you want to actually delete a ``DAG`` and its all historical metadata, you need to do
+it in two steps:
+
+* delete the ``DAG`` file from the ``DAGS_FOLDER`` and wait until it becomes inactive

Review comment:
       Is there any indication to the user when it becomes inactive?
   I think the only indication is the state column in the DB, which is not exposed to users via the UI.
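
   To make the question concrete, here is a minimal sketch of checking that flag directly. It assumes the state lives in an `is_active` column on the `dag` table, as in Airflow 2.x; the in-memory SQLite table and its rows are hypothetical stand-ins, not the real metadata database.

```python
import sqlite3

# Hypothetical stand-in for the metadata DB's `dag` table. Only the columns
# relevant to this discussion are modeled, and the rows are made-up examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER, is_active INTEGER)"
)
conn.executemany(
    "INSERT INTO dag VALUES (?, ?, ?)",
    [
        ("example_a", 0, 1),  # active and unpaused
        ("example_b", 1, 1),  # paused, file still present
        ("example_c", 0, 0),  # file removed, marked inactive
    ],
)


def inactive_dag_ids(connection):
    """Return the dag_ids whose is_active flag is false, sorted by name."""
    rows = connection.execute(
        "SELECT dag_id FROM dag WHERE is_active = 0 ORDER BY dag_id"
    )
    return [row[0] for row in rows]


print(inactive_dag_ids(conn))
```

   A query like this is the kind of check a user would need today, which is why the two-step instructions should say how to tell that the DAG has become inactive.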

##########
File path: docs/apache-airflow/concepts/dags.rst
##########
@@ -745,3 +745,34 @@ the dependency graph.
 
 The dependency detector is configurable, so you can implement your own logic different than the defaults in
 :class:`~airflow.serialization.serialized_objects.DependencyDetector`
+
+DAG pausing, deactivation and deletion
+--------------------------------------
+
+The DAGs have several states when it comes to being "not running". DAGs can be paused, deactivated
+and finally all metadata for the ``DAG`` can be deleted.
+
+Dag can be paused via UI when it is present in the ``DAGS_FOLDER``, and scheduler stored it in
+the database, but the user chose to disable it via the UI. The ``pause/unpause`` actions are available
+via UI and API. Paused ``DAG`` is not scheduled by the Scheduler, but you can trigger them via UI for
+manual runs.

Review comment:
       This is a bit confusing, I think?
   When we trigger a manual run, it also automatically changes the paused DAG to On so it can run. If it is left Off, the DAG will not be scheduled.



