You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/20 21:23:38 UTC

[GitHub] [airflow] ms32035 opened a new pull request #13199: Create dag dependencies view

ms32035 opened a new pull request #13199:
URL: https://github.com/apache/airflow/pull/13199


   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   This PR add a new view displaying dependencies between the DAGs. It's based on my DAG Dependencies plugin (https://github.com/ms32035/airflow-dag-dependencies) and has been updated to work with Airflow 2.0. Since quite a bit of code is common with DAG graph view, that has been externalized to a new module. Once this is merged, I will deprecate the plugin.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r617713024



##########
File path: airflow/serialization/schema.json
##########
@@ -103,7 +109,8 @@
           { "type": "null" },
           { "$ref": "#/definitions/task_group" }
         ]},
-        "edge_info": { "$ref": "#/definitions/edge_info" }
+        "edge_info": { "$ref": "#/definitions/edge_info" },
+        "dependencies": { "$ref": "#/definitions/dependencies" }

Review comment:
       We should probably call this `dag_dependencies` WDYT?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r598124058



##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []

Review comment:
       https://github.com/pallets/flask/issues/2520 - assume that's how flask still works, so it's to prevent recalculation on every request. Open to other suggestions where to store the dependencies state in memory

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")
+
+            for task in dag.tasks:
+                task_node_id = f"t:{dag_id}:{task.task_id}"
+                if task.task_type in conf.getlist("core", "dag_dependencies_trigger"):
+                    nodes[task_node_id] = self._node_dict(
+                        task_node_id, task.task_id, "fill: rgb(255, 239, 235)"
+                    )
+
+                    edges.extend(
+                        [
+                            {"u": dag_node_id, "v": task_node_id},
+                            {"u": task_node_id, "v": f"d:{task.trigger_dag_id}"},
+                        ]
+                    )
+                elif task.task_type in conf.getlist("core", "dag_dependencies_sensor"):
+                    nodes[task_node_id] = self._node_dict(
+                        task_node_id, task.task_id, "fill: rgb(230, 241, 242)"
+                    )
+
+                    edges.extend(
+                        [
+                            {"u": task_node_id, "v": dag_node_id},
+                            {"u": f"d:{task.external_dag_id}", "v": task_node_id},
+                        ]
+                    )
+
+            implicit = getattr(dag, "implicit_dependencies", None)

Review comment:
       for now removing though

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")

Review comment:
       will change

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):

Review comment:
       aware of this, it's the best/easiest solution that came to my mind. other recommendations are welcome

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",

Review comment:
       will change

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")
+
+            for task in dag.tasks:
+                task_node_id = f"t:{dag_id}:{task.task_id}"
+                if task.task_type in conf.getlist("core", "dag_dependencies_trigger"):
+                    nodes[task_node_id] = self._node_dict(
+                        task_node_id, task.task_id, "fill: rgb(255, 239, 235)"
+                    )
+
+                    edges.extend(
+                        [
+                            {"u": dag_node_id, "v": task_node_id},
+                            {"u": task_node_id, "v": f"d:{task.trigger_dag_id}"},
+                        ]
+                    )
+                elif task.task_type in conf.getlist("core", "dag_dependencies_sensor"):
+                    nodes[task_node_id] = self._node_dict(
+                        task_node_id, task.task_id, "fill: rgb(230, 241, 242)"
+                    )
+
+                    edges.extend(
+                        [
+                            {"u": task_node_id, "v": dag_node_id},
+                            {"u": f"d:{task.external_dag_id}", "v": task_node_id},
+                        ]
+                    )
+
+            implicit = getattr(dag, "implicit_dependencies", None)

Review comment:
       correct, thanks. any other place now we could store sth like this?

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")
+
+            for task in dag.tasks:
+                task_node_id = f"t:{dag_id}:{task.task_id}"
+                if task.task_type in conf.getlist("core", "dag_dependencies_trigger"):

Review comment:
       Good spot




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-748673189


   ![dag_dependencies](https://user-images.githubusercontent.com/5060473/102724814-3f8ef700-430a-11eb-9988-b4db78c0eee8.png)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-749471655


   Statuses aren't there for 2 reasons:
   * Design & performance - this should demonstrate the static relationships between dags, but adding dag runs add a performance cost and complexity related to refresh
   * It may well be that several instances of the same DAG are active or there is an active run for something else than the latest execution, in which case this could be quite confusing. The DAG list view is more appropriate get details about statuses


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r617725966



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -335,6 +334,30 @@ def _value_is_hardcoded_default(cls, attrname: str, value: Any, instance: Any) -
         return False
 
 
+class DependencyDetector:

Review comment:
       I am actually thinking a bit ahead for future extensions:
   1) detecting dependencies from dag attributes too (like a list of dependant dag names, etc) -> second method
   2) make the class configurable




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-821555359


   [The Workflow run](https://github.com/apache/airflow/actions/runs/756857228) is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-824364772


   @kaxil please hold off for a moment. looks like I lost 1 commit when rebasing master


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-826661329


   Sharing a thought: Would it be possible to add a toggle that will exclude from the view DAGs that have no dependencies? The use case for this view is to check dependency so if a DAG doesn't have one there is no need to overwhelm the view (may be very relevant for instances with 500+ DAGs)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nathadfield commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
nathadfield commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-749480379


   > Statuses aren't there for 2 reasons:
   > 
   > * Design & performance - this should demonstrate the static relationships between dags, but adding dag runs add a performance cost and complexity related to refresh
   > * It may well be that several instances of the same DAG are active or there is an active run for something else than the latest execution, in which case this could be quite confusing. The DAG list view is more appropriate get details about statuses
   
   Putting performance issues to one side, I can certainly imagine UI concepts that might enable the user to see how many DagRuns are active and their status, e.g. some form of hover-over?  Clicking on the DAG node in the dependency graph could then either link directly to the active DagRun or the the list of active DagRuns.
   
   However I totally understand that performance is probably a limiting factor for all of this and, as I said, just being able to visualise all the dependencies is a good thing in of itself.  I'd much rather have a static graph covering all dependencies in an up coming release than nothing!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811823848


   agree, let's schedule sth through Slack for the week after then


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r628240580



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -19,6 +19,7 @@
 import datetime
 import enum
 import logging
+from dataclasses import dataclass

Review comment:
       Done on your branch via https://github.com/apache/airflow/pull/13199/commits/a1a6b0ef814ab8f60106e1e61e0cbf80e711b642




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-827653203


   @ms32035 We haven't forgotten about this PR, don't worry, and we still want to get this in to Airflow 2.1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r546442769



##########
File path: airflow/www/views.py
##########
@@ -3772,3 +3772,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()

Review comment:
       I fear this call the most. Some DAGs can contain a thousand tasks, and we can have several such DAGs. It will be very slow to load them all into memory.  Do you think we can optimize it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-827655399


   @ashb I'll bundle any changes you request with fixing the new conflicts that need to be resolved


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-803612096


   > it would be a good idea to add RBAC permissions to this view (like all other views) thus allowing admins to restrict access per role if needed.
   
   At the moment it requires `(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)`, but menu link has a `permissions.RESOURCE_DAG_DEPENDENCIES, category=permissions.RESOURCE_BROWSE_MENU` so it's not entirely consistent. I'll tighten this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r546445867



##########
File path: airflow/www/views.py
##########
@@ -3772,3 +3772,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()

Review comment:
       Not quite sure how I could optimize it? It's already optimized in that it only refreshes after an interval which is equal to the scheduler recalculation interval. All dags are required to calculate the dependencies. Few ideas I have, which are not optimization but might help:
   
   - set last_refresh_time at the start of the refresh, so that a second refresh triggered before first one finishes does not cause a recalculation in parallel
   - remove dags without dependencies from rendering
   - add a global configuration option to disable the view. I guess some deployments may be too large for such a global view.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-803311188


   @bbovenzi I merged master and fixed issues, so please make sure this is now actioned and merged. As for linting, I ran the code through eslint [here](https://github.com/ms32035/airflow/commit/64bcf8a6d498c1e5a4a6a8ff2dd8e1e2f2552d08) but most of the issues come from pre-existing code that was copied, so I don't think it should be in the scope of this PR


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-824379830


   ok, now fixed including changes you requested. I don't know how could I messed up so badly to have lost a key commit without noticing and yet tests were still passing, but with so many rebases to this I guess !@#$ happens


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r617726491



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -350,6 +373,8 @@ class SerializedBaseOperator(BaseOperator, BaseSerialization):
         if v.default is not v.empty
     }
 
+    dependency_detector = DependencyDetector()

Review comment:
       not necessarily




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-815300967


   Ok, I'd appreciate if that is in place before the weekend. I plan to work a bit on this, and have design session scheduled with @ashb in a few days.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-824890683


   It was a fully working solution, just some parts were from the old approach.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811758611


   @ms32035 For this to be accepted in to core we're going to need it to have better behaviour:
   
   1. Not having to load and walk through every dag to calculate dependencies -- i.e. storing something about the deps between dags at parse time, rather than having to re-compute it.
   2. Some kind of pagination/filtering to cope with having a large number of DAGs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] nathadfield commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
nathadfield commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-749454131


   > I am adding this PR to the Airflow 2.1 milestone. I think it is worth thinking how we can add this change to Airflow so that it meets the expectations of the largest group of users.
   > 
   > @Craig-Chatfield @nathadfield Do you or your team have any requirements for this view?
   > https://youtu.be/kmD_u8VfuT0?t=2042
   
   This is great news!  To begin with just the ability to visualise each DAG and their connected dependencies (be that via TriggerDagOperators or ExternalTaskSensors) would be a nice addition.  It would also be nice to have an indication as to the last run status and if it could be dynamic even better!
   
   The ability to directly click through to each DAG or task dependency would also make for better user experience.
   
   Hope this helps.
   
   Cheers,
   
   Nathan


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-824888017


   > ok, now fixed including changes you requested. I don't know how could I messed up so badly to have lost a key commit without noticing and yet tests were still passing, but with so many rebases to this I guess !@#$ happens
   
   If the tests were still passing could you please add some more? :D 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811151278


   @ashb can you respond to my questions / review again?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ryw commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ryw commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-753998411


   @ryanahamilton you might be interested to participate in this PR from a UX standpoint


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bbovenzi commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
bbovenzi commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-832313043


   I would recommend backend filtering. Some users have 10,000+ dags. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r628236646



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -19,6 +19,7 @@
 import datetime
 import enum
 import logging
+from dataclasses import dataclass

Review comment:
       We still need to support Python 3.6




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811810358


   > To make it clear, the main structure behind the view is a dictionary which will be no more than kilobytes for deployments with thousands of dags, so storing that in memory is not an issue at all.
   
   My primary concern is not this structure, but loading all 1000+ dags in to memory.
   
   I _do_ want this feature, it is almost embarrassing that it doesn't exist in core Airflow!
   
   I'm happy to work with you to come up with a plan this (and to carry on/finish the PR if you don't have time) - for instance I think that we can do this by adding an extra field in to the serialized DAG representation, so the existing parsing process would keep this up-to-date, and then JSON query operators can be used to find dags with dependencies (and can be indexed), or could be stored in a "denormalized" (not quite right term, but close) into a column on the DAG table


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811822194


   @ms32035 Since we are both in the UK it's probably easier to co-ordinate a more real time session to work on this, though I am off all next week. @kaxil is probably also available.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r617714377



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -350,6 +373,8 @@ class SerializedBaseOperator(BaseOperator, BaseSerialization):
         if v.default is not v.empty
     }
 
+    dependency_detector = DependencyDetector()

Review comment:
       Why do we need a new instance of DependencyDetector?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r628238700



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -19,6 +19,7 @@
 import datetime
 import enum
 import logging
+from dataclasses import dataclass

Review comment:
       We should be explicit about the dep on https://pypi.org/project/dataclasses/, rather than relying on it being pulled in via a dep




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r546451698



##########
File path: airflow/www/views.py
##########
@@ -3772,3 +3772,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()

Review comment:
       I am thinking of creating a new table that will contain only these relationships and try to be kept in good condition by DagFileProcessor.  This way, we will only be able to read the information we need.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-749261748


   I am adding this PR to the Airflow 2.1 milestone. I think it is worth thinking how we can add this change to Airflow so that it meets the expectations of the largest group of users.
   
   @Craig-Chatfield  @nathadfield  Do you or your team have any requirements for this view?
   https://youtu.be/kmD_u8VfuT0?t=2042
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811802961


   @ashb this is not helpful at all.
   
   - You are not providing any (measurable) arguments as to why implementing an additional structure storing the dependencies, which would add to the scheduler overhead regardless if it is used or not, is a better solution than recalculating a clean state  in memory, with some time based caching, when the functionality is actually used. Maintaining an additional database table also adds a number of edge cases that would require handling and are not an issue for the current approach, such as dependencies pointing to invalid dags (these are texts) or tags or tasks being removed. Furthermore, there is no reason the scheduler needs to know/use these dependencies at all. Also a tricky question here. The same supposedly heavy `collect_dags_from_db` function is called in the web server by `/refresh_all` endpoint.
   - requirement 2) defies the purpose of the view, and I’ve written my arguments before
   
   To make it clear, the main structure behind the view is a dictionary which will be no more than kilobytes for deployments with thousands of dags, so storing that in memory is not an issue at all.
   
   One other idea I have is to add a check based on `max(serialized_dag.last_updated)` to check if recalculating is needed.
   
   I’d really appreciate more support to progress this, otherwise I’ll close the PR, leave it us a plugin, and potentially abandon in the future.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-823424596


   Interesting read https://medium.com/quintoandar-tech-blog/effective-cross-dags-dependency-in-apache-airflow-1885dc7ece9f


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r625741144



##########
File path: airflow/www/views.py
##########
@@ -3907,6 +3907,76 @@ def autocomplete(self, session=None):
         return wwwutils.json_response(payload)
 
 
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "webserver",
+        "dag_dependencies_refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = timezone.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG_DEPENDENCIES),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if timezone.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):

Review comment:
       ```suggestion
           if timezone.utcnow() > self.last_refresh + self.refresh_interval:
   ```

##########
File path: airflow/www/views.py
##########
@@ -3907,6 +3907,76 @@ def autocomplete(self, session=None):
         return wwwutils.json_response(payload)
 
 
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "webserver",
+        "dag_dependencies_refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = timezone.utcnow() - timedelta(seconds=refresh_interval)

Review comment:
       ```suggestion
       refresh_interval = timedelta(seconds=conf.getint(
           "webserver",
           "dag_dependencies_refresh_interval",
           fallback=conf.getint("scheduler", "dag_dir_list_interval"),
       ))
       last_refresh = timezone.utcnow() - refresh_interval
   ```

##########
File path: airflow/models/__init__.py
##########
@@ -28,6 +28,7 @@
 from airflow.models.pool import Pool
 from airflow.models.renderedtifields import RenderedTaskInstanceFields
 from airflow.models.sensorinstance import SensorInstance  # noqa: F401
+from airflow.models.serialized_dag import SerializedDagModel

Review comment:
       This appears to be here to be able to do `models.SerializedDagModel` in www.views, right?
   
   If so could you change _that_ to use ` from airflow.models.serialized_dag import SerializedDagModel`?
   
   We want to move away from mass-auto-importing things in Airflow (as it's a slowdown for initial process startup, and makes dependency cycles more likely).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-748692939


   * This view is not used as often as the DAG list for example, so I don't
   think the extra table would be an optimization.
   * To calculate dependencies all DAGs are required, which could impact
   multi-threaded DAG processing, and transactional update of the table with
   all recalculated dependencies and referential constraints enabled would be
   a hell. Otherwise, if the dependencies were recalculated when each DAG is
   processed, there's a risk of inconsistencies (links to DAGs that don't
   exist etc)
   
   On Sun, Dec 20, 2020 at 11:25 PM Kamil Breguła <no...@github.com>
   wrote:
   
   > *@mik-laj* commented on this pull request.
   > ------------------------------
   >
   > In airflow/www/views.py
   > <https://github.com/apache/airflow/pull/13199#discussion_r546451698>:
   >
   > > +            self.last_refresh = datetime.utcnow()
   > +
   > +        return self.render_template(
   > +            "airflow/dag_dependencies.html",
   > +            title=title,
   > +            nodes=self.nodes,
   > +            edges=self.edges,
   > +            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
   > +            arrange=conf.get("webserver", "dag_orientation"),
   > +            width=request.args.get("width", "100%"),
   > +            height=request.args.get("height", "800"),
   > +        )
   > +
   > +    def _calculate_graph(self):
   > +
   > +        current_app.dag_bag.collect_dags_from_db()
   >
   > I am thinking of creating a new table that will contain only these
   > relationships and try to be kept in good condition by DagFileProcessor.
   > This way, we will only be able to read the information we need.
   >
   > —
   > You are receiving this because you authored the thread.
   > Reply to this email directly, view it on GitHub
   > <https://github.com/apache/airflow/pull/13199#discussion_r546451698>, or
   > unsubscribe
   > <https://github.com/notifications/unsubscribe-auth/ABGTO6KUN77XHQMBRAOTCTTSV2BWJANCNFSM4VDMNXNQ>
   > .
   >
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-811819631


   I do have time, I guess finding/agreeing on a satisfactory solution design is the challenge here, and the more ideas we look into the better. I agree that serialising the dependencies into an additional JSON field in serialised_dag table during parsing is a good idea. I will look into that over the next few days


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-812759581


   > @ms32035 Since we are both in the UK it's probably easier to co-ordinate a more real time session to work on this, though I am off all next week. @kaxil is probably also available.
   
   Yup, happy to jump on a call with you'll whenever :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 edited a comment on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 edited a comment on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-748692939


   * This view is not used as often as the DAG list for example, so I don't
   think the extra table would be an optimization.
   * To calculate dependencies all DAGs are required, which could impact
   multi-threaded DAG processing, and transactional update of the table with
   all recalculated dependencies and referential constraints enabled would be
   a hell. Otherwise, if the dependencies were recalculated when each DAG is
   processed, there's a risk of inconsistencies (links to DAGs that don't
   exist etc)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-801491972


   @bbovenzi I'd appreciate if this is merged before any other changes to the files affected. I already had to resolve the conflicts a couple of times, and since then I see a few new PRs popped up, including https://github.com/apache/airflow/pull/14661 which breaks this for now. I'll try to fix over the weekend.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-774682702


   @ryanahamilton @mik-laj please review as the JS in master changes and rebasing becomes painful


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-832293530


   @ashb made the changes requested
   @eladkal I have a few enhancements in mind for Airflow 2.2. Another possible filter would be enabled/disabled dags. Not sure yet though if filtering would work better in the backend or in JS.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r613180186



##########
File path: airflow/security/permissions.py
##########
@@ -26,6 +26,7 @@
 RESOURCE_DOCS_LINKS = "docs_links"
 RESOURCE_CONFIG = "Configurations"
 RESOURCE_CONNECTION = "Connections"
+RESOURCE_DAG_DEPENDENCIES = "DAG Dependencies"

Review comment:
       Paging @jhtimmins -- need you permissions hat thoughts here :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r617719654



##########
File path: airflow/serialization/schema.json
##########
@@ -103,7 +109,8 @@
           { "type": "null" },
           { "$ref": "#/definitions/task_group" }
         ]},
-        "edge_info": { "$ref": "#/definitions/edge_info" }
+        "edge_info": { "$ref": "#/definitions/edge_info" },
+        "dependencies": { "$ref": "#/definitions/dependencies" }

Review comment:
       can change




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bbovenzi commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
bbovenzi commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-815299329


   @ms32035, I'm going to move forward with the js migration stuff. I'll make a pr to your branch to resolve merge conflicts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kurtqq commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
kurtqq commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-803517359


   it would be a good idea to add RBAC permissions to this view (like all other views) thus allowing admins to restrict access per role if needed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r598109416



##########
File path: airflow/www/templates/airflow/dag_dependencies.html
##########
@@ -0,0 +1,111 @@
+{#
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+#}
+
+{% extends "airflow/main.html" %}
+
+{% block title %}Airflow - DAGs{% endblock %}

Review comment:
       Title should say something about dependencies

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",

Review comment:
       It's not a plugin anymore, so this section should change. Probably to webserver?

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")
+
+            for task in dag.tasks:
+                task_node_id = f"t:{dag_id}:{task.task_id}"
+                if task.task_type in conf.getlist("core", "dag_dependencies_trigger"):
+                    nodes[task_node_id] = self._node_dict(
+                        task_node_id, task.task_id, "fill: rgb(255, 239, 235)"
+                    )
+
+                    edges.extend(
+                        [
+                            {"u": dag_node_id, "v": task_node_id},
+                            {"u": task_node_id, "v": f"d:{task.trigger_dag_id}"},
+                        ]
+                    )
+                elif task.task_type in conf.getlist("core", "dag_dependencies_sensor"):
+                    nodes[task_node_id] = self._node_dict(
+                        task_node_id, task.task_id, "fill: rgb(230, 241, 242)"
+                    )
+
+                    edges.extend(
+                        [
+                            {"u": task_node_id, "v": dag_node_id},
+                            {"u": f"d:{task.external_dag_id}", "v": task_node_id},
+                        ]
+                    )
+
+            implicit = getattr(dag, "implicit_dependencies", None)

Review comment:
       This attribute will never exist because we are loading the SerialisedDag now.

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")

Review comment:
       We really shouldn't be hardcoding colours in the views - these should be classes and colours defined in css

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):

Review comment:
       This cache is per webserver worker

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()
+
+        nodes = {}
+        edges = []
+
+        for dag_id, dag in current_app.dag_bag.dags.items():
+            dag_node_id = f"d:{dag_id}"
+            nodes[dag_node_id] = self._node_dict(dag_node_id, dag_id, "fill: rgb(232, 247, 228)")
+
+            for task in dag.tasks:
+                task_node_id = f"t:{dag_id}:{task.task_id}"
+                if task.task_type in conf.getlist("core", "dag_dependencies_trigger"):

Review comment:
       Reading variables from the config for each task is comparatively expensive - we should call this once function at most

##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []

Review comment:
       Bit confused as to why these are class variables




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bbovenzi commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
bbovenzi commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-801269882


   From a JS perspective, this looks fine. There is some linting that could be improved but that's not a big deal. After this PR, we would move all of the inline js in `graph.html` to its own file, so we'll have to rename `graph.js` or something.
   
   With a new React UI coming, it may be worth exploring new ways to incorporate this feature but I don't think that's a blocker for now.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r546442769



##########
File path: airflow/www/views.py
##########
@@ -3772,3 +3772,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):
+            self._calculate_graph()
+            self.last_refresh = datetime.utcnow()
+
+        return self.render_template(
+            "airflow/dag_dependencies.html",
+            title=title,
+            nodes=self.nodes,
+            edges=self.edges,
+            last_refresh=self.last_refresh.strftime("%Y-%m-%d %H:%M:%S"),
+            arrange=conf.get("webserver", "dag_orientation"),
+            width=request.args.get("width", "100%"),
+            height=request.args.get("height", "800"),
+        )
+
+    def _calculate_graph(self):
+
+        current_app.dag_bag.collect_dags_from_db()

Review comment:
       I fear this call the most. Some DAGs can contain a thousand tasks, and we can have several such DAGs. It will be very slow to load them all into memory.  If more users start accessing this view at the same time, the web server will hang and stop responding to requests.
   
   Do you think we can optimize it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bbovenzi commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
bbovenzi commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-801500851


   @ms32035 sounds good. I'll hold off the js migrations until this is ready.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb merged pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb merged pull request #13199:
URL: https://github.com/apache/airflow/pull/13199


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ms32035 commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ms32035 commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-748672927


   This is still WIP, but happy to hear any suggestions early on.
   Things to improve as:
   * Some more configuration (operator types)
   * Docs (?)
   * Couldn't get `yarn run lint` to target new files. Checks everything and throws a ton of errors


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-834585638


   > I would recommend backend filtering. Some users have 10,000+ dags.
   
   We can do this in 2.1.1/2.2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r617714786



##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -335,6 +334,30 @@ def _value_is_hardcoded_default(cls, attrname: str, value: Any, instance: Any) -
         return False
 
 
+class DependencyDetector:

Review comment:
       Does it have to be a separate class, we just have one method that can be converted to a function, no?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#issuecomment-834442227


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #13199: Create dag dependencies view

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #13199:
URL: https://github.com/apache/airflow/pull/13199#discussion_r598109856



##########
File path: airflow/www/views.py
##########
@@ -3811,3 +3811,102 @@ def autocomplete(self, session=None):
         payload = [row[0] for row in dag_ids_query.union(owners_query).limit(10).all()]
 
         return wwwutils.json_response(payload)
+
+
+class DagDependenciesView(AirflowBaseView):
+    """View to show dependencies between DAGs"""
+
+    refresh_interval = conf.getint(
+        "dag_dependencies_plugin",
+        "refresh_interval",
+        fallback=conf.getint("scheduler", "dag_dir_list_interval"),
+    )
+    last_refresh = datetime.utcnow() - timedelta(seconds=refresh_interval)
+    nodes = []
+    edges = []
+
+    @expose('/dag-dependencies')
+    @auth.has_access(
+        [
+            (permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG),
+        ]
+    )
+    @gzipped
+    @action_logging
+    def list(self):
+        """Display DAG dependencies"""
+        title = "DAG Dependencies"
+
+        if datetime.utcnow() > self.last_refresh + timedelta(seconds=self.refresh_interval):

Review comment:
       This cache is per webserver worker - is that okay/expected?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org