You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/17 20:53:19 UTC

[GitHub] [airflow] mdediana opened a new pull request #9354: Task logging handlers can provide custom log links

mdediana opened a new pull request #9354:
URL: https://github.com/apache/airflow/pull/9354


   Airflow provides logging handlers that send logs to external services (Elasticsearch, StackDriver, etc). At the moment, if Elasticsearch is in use, links to its logs show up in the UI. This PR generalizes this functionality to other logging handlers.
   
   To provide links to remote logs, a logging handler must inherit from `RemoteLoggingMixin` and implement a `get_remote_log_url()` method and have a `REMOTE_LOG_NAME` attribute. The view identifies when a remote logging handler is in use and shows the link to the logs in the UI.
   
   This PR does not close but is is related issue #9083.    
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446610902



##########
File path: airflow/www/views.py
##########
@@ -760,22 +760,35 @@ def log(self, session=None):
             execution_date=execution_date, form=form,
             root=root, wrapped=conf.getboolean('webserver', 'default_wrap'))
 
-    @expose('/elasticsearch')
+    @expose('/redirect_to_external_log')
     @has_dag_access(can_dag_read=True)
     @has_access
     @action_logging
-    def elasticsearch(self):
+    @provide_session
+    def redirect_to_external_log(self, session=None):
         dag_id = request.args.get('dag_id')
         task_id = request.args.get('task_id')
         execution_date = request.args.get('execution_date')
+        dttm = timezone.parse(execution_date)
         try_number = request.args.get('try_number', 1)
-        elasticsearch_frontend = conf.get('elasticsearch', 'frontend')
-        log_id_template = conf.get('elasticsearch', 'log_id_template')
-        log_id = log_id_template.format(
-            dag_id=dag_id, task_id=task_id,
-            execution_date=execution_date, try_number=try_number)
-        url = 'https://' + elasticsearch_frontend.format(log_id=quote(log_id))
-        return redirect(url)
+
+        ti = session.query(models.TaskInstance).filter(
+            models.TaskInstance.dag_id == dag_id,
+            models.TaskInstance.task_id == task_id,
+            models.TaskInstance.execution_date == dttm).first()
+
+        if not ti:
+            flash(f"Task [{dag_id}.{task_id}] does not exist", "error")
+            return redirect(url_for('Airflow.index'))
+
+        try:
+            task_log_reader = TaskLogReader()
+            handler = task_log_reader.log_handler
+            url = handler.get_external_log_url(ti, try_number)
+            return redirect(url)
+        except AttributeError:
+            flash("Cannot redirect to external log, task log handler is not external", "error")
+            return redirect(url_for('Airflow.index'))

Review comment:
       ```suggestion
           task_log_reader = TaskLogReader()
           if not task_log_reader.supports_external_links:
               flash("Ttask log handler is not support external links", "error")
               return redirect(url_for('Airflow.index'))
   
           handler = task_log_reader.log_handler
           url = handler.get_external_log_url(ti, try_number)
           return redirect(url)
   ```
   This exception here may mean that the handler does not support external links or that a failure occurred while generating links.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446532544



##########
File path: airflow/www/views.py
##########
@@ -1493,8 +1505,14 @@ def recurse_nodes(task, visited):
 
         form = DateTimeWithNumRunsForm(data={'base_date': max_date,
                                              'num_runs': num_runs})
-        external_logs = conf.get('elasticsearch', 'frontend')
+
         doc_md = wwwutils.wrapped_markdown(getattr(dag, 'doc_md', None), css_class='dag-doc')
+
+        task_log_reader = TaskLogReader()
+        handler = task_log_reader.log_handler
+        show_external_log_redirect = isinstance(handler, ExternalLoggingMixin)

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446640999



##########
File path: airflow/utils/log/log_reader.py
##########
@@ -100,6 +101,11 @@ def is_supported(self):
 
         return hasattr(self.log_handler, 'read')
 
+    @property
+    def is_external(self):
+        """Check if the logging handler is external (e.g. Elasticsearch, Stackdriver, etc)."""

Review comment:
       Makes sense.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-646878976


   TaskLogReader has been merged. Can you do.a rebase?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek merged pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
turbaszek merged pull request #9354:
URL: https://github.com/apache/airflow/pull/9354


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-647472206


   We're good.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r445365945



##########
File path: airflow/utils/log/es_task_handler.py
##########
@@ -262,3 +267,24 @@ def close(self):
         super().close()
 
         self.closed = True
+
+    def log_name(self):

Review comment:
       WDYT?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446532395



##########
File path: airflow/www/views.py
##########
@@ -760,22 +761,33 @@ def log(self, session=None):
             execution_date=execution_date, form=form,
             root=root, wrapped=conf.getboolean('webserver', 'default_wrap'))
 
-    @expose('/elasticsearch')
+    @expose('/redirect_to_external_log')
     @has_dag_access(can_dag_read=True)
     @has_access
     @action_logging
-    def elasticsearch(self):
+    @provide_session
+    def redirect_to_external_log(self, session=None):
         dag_id = request.args.get('dag_id')
         task_id = request.args.get('task_id')
         execution_date = request.args.get('execution_date')
+        dttm = timezone.parse(execution_date)
         try_number = request.args.get('try_number', 1)
-        elasticsearch_frontend = conf.get('elasticsearch', 'frontend')
-        log_id_template = conf.get('elasticsearch', 'log_id_template')
-        log_id = log_id_template.format(
-            dag_id=dag_id, task_id=task_id,
-            execution_date=execution_date, try_number=try_number)
-        url = 'https://' + elasticsearch_frontend.format(log_id=quote(log_id))
-        return redirect(url)
+
+        ti = session.query(models.TaskInstance).filter(
+            models.TaskInstance.dag_id == dag_id,
+            models.TaskInstance.task_id == task_id,
+            models.TaskInstance.execution_date == dttm).first()
+
+        try:
+            task_log_reader = TaskLogReader()
+            handler = task_log_reader.log_handler
+            url = handler.get_external_log_url(ti, try_number)
+            return redirect(url)
+        except AttributeError as e:
+            error_message = [
+                f"Task log handler does not support external log URLs.\n{str(e)}\n"
+            ]
+            return jsonify(message=error_message, error=True)

Review comment:
       I changed it to redirect to home with an error message, not sure if this is the right thing, let me know.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446086461



##########
File path: airflow/www/views.py
##########
@@ -760,22 +761,33 @@ def log(self, session=None):
             execution_date=execution_date, form=form,
             root=root, wrapped=conf.getboolean('webserver', 'default_wrap'))
 
-    @expose('/elasticsearch')
+    @expose('/redirect_to_external_log')
     @has_dag_access(can_dag_read=True)
     @has_access
     @action_logging
-    def elasticsearch(self):
+    @provide_session
+    def redirect_to_external_log(self, session=None):
         dag_id = request.args.get('dag_id')
         task_id = request.args.get('task_id')
         execution_date = request.args.get('execution_date')
+        dttm = timezone.parse(execution_date)
         try_number = request.args.get('try_number', 1)
-        elasticsearch_frontend = conf.get('elasticsearch', 'frontend')
-        log_id_template = conf.get('elasticsearch', 'log_id_template')
-        log_id = log_id_template.format(
-            dag_id=dag_id, task_id=task_id,
-            execution_date=execution_date, try_number=try_number)
-        url = 'https://' + elasticsearch_frontend.format(log_id=quote(log_id))
-        return redirect(url)
+
+        ti = session.query(models.TaskInstance).filter(
+            models.TaskInstance.dag_id == dag_id,
+            models.TaskInstance.task_id == task_id,
+            models.TaskInstance.execution_date == dttm).first()
+

Review comment:
       Can you handle missing TI here? You should probably redirect the user to the main page + display flash message.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-646985000


   Yes. We should use TaskLogReader, if possible. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-645643685


   Thank you very much. A very useful change. I look forward to further work results.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r441849111



##########
File path: airflow/utils/log/logging_mixin.py
##########
@@ -58,6 +58,20 @@ def _set_context(self, context):
             set_context(self.log, context)
 
 
+class RemoteLoggingMixin:
+    """
+    Define a logging handler based on a remote service (e.g. ELK, StackDriver). Subclasses must override
+    REMOTE_LOG_NAME and get_remote_log_url().
+    """
+    REMOTE_LOG_NAME: str = 'RemoteLog'

Review comment:
       ```suggestion
       def log_name(self):
           raise NotImplementedError('Remote logger must provide remote name')
   ```
   This will force the user to configure the name.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r441850813



##########
File path: airflow/www/templates/airflow/dag.html
##########
@@ -366,9 +366,9 @@ <h4 class="modal-title" id="dagModalLabel">
     var task_id = '';
     var execution_date = '';
     var subdag_id = '';
-    var show_es_logs = false;
-    {% if show_external_logs is defined %}
-      show_es_logs = '{{ show_external_logs }}' == "True";
+    var show_remote_log_redirect = false;

Review comment:
       ```suggestion
       var show_external_log_redirect = false;
   ```
   This name is even more confusing here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-652676572


   @turbaszek Do you want to add anything else?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] turbaszek commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
turbaszek commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r445364741



##########
File path: airflow/utils/log/es_task_handler.py
##########
@@ -262,3 +267,24 @@ def close(self):
         super().close()
 
         self.closed = True
+
+    def log_name(self):

Review comment:
       ```suggestion
       @property
       def log_name(self):
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446087927



##########
File path: airflow/www/views.py
##########
@@ -1493,8 +1505,14 @@ def recurse_nodes(task, visited):
 
         form = DateTimeWithNumRunsForm(data={'base_date': max_date,
                                              'num_runs': num_runs})
-        external_logs = conf.get('elasticsearch', 'frontend')
+
         doc_md = wwwutils.wrapped_markdown(getattr(dag, 'doc_md', None), css_class='dag-doc')
+
+        task_log_reader = TaskLogReader()
+        handler = task_log_reader.log_handler
+        show_external_log_redirect = isinstance(handler, ExternalLoggingMixin)

Review comment:
       Can you create a property from this condition in TaskLogReader?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446610902



##########
File path: airflow/www/views.py
##########
@@ -760,22 +760,35 @@ def log(self, session=None):
             execution_date=execution_date, form=form,
             root=root, wrapped=conf.getboolean('webserver', 'default_wrap'))
 
-    @expose('/elasticsearch')
+    @expose('/redirect_to_external_log')
     @has_dag_access(can_dag_read=True)
     @has_access
     @action_logging
-    def elasticsearch(self):
+    @provide_session
+    def redirect_to_external_log(self, session=None):
         dag_id = request.args.get('dag_id')
         task_id = request.args.get('task_id')
         execution_date = request.args.get('execution_date')
+        dttm = timezone.parse(execution_date)
         try_number = request.args.get('try_number', 1)
-        elasticsearch_frontend = conf.get('elasticsearch', 'frontend')
-        log_id_template = conf.get('elasticsearch', 'log_id_template')
-        log_id = log_id_template.format(
-            dag_id=dag_id, task_id=task_id,
-            execution_date=execution_date, try_number=try_number)
-        url = 'https://' + elasticsearch_frontend.format(log_id=quote(log_id))
-        return redirect(url)
+
+        ti = session.query(models.TaskInstance).filter(
+            models.TaskInstance.dag_id == dag_id,
+            models.TaskInstance.task_id == task_id,
+            models.TaskInstance.execution_date == dttm).first()
+
+        if not ti:
+            flash(f"Task [{dag_id}.{task_id}] does not exist", "error")
+            return redirect(url_for('Airflow.index'))
+
+        try:
+            task_log_reader = TaskLogReader()
+            handler = task_log_reader.log_handler
+            url = handler.get_external_log_url(ti, try_number)
+            return redirect(url)
+        except AttributeError:
+            flash("Cannot redirect to external log, task log handler is not external", "error")
+            return redirect(url_for('Airflow.index'))

Review comment:
       ```suggestion
           task_log_reader = TaskLogReader()
           if not task_log_reader.supports_external_links:
               flash("Ttask log handler is not support external links", "error")
               return redirect(url_for('Airflow.index'))
   
           handler = task_log_reader.log_handler
           url = handler.get_external_log_url(ti, try_number)
           return redirect(url)
   ```
   This exception here may mean that the handler does not support external links or that a failure occurred while generating links. We have another easy way to check if an error occurs, so we can use it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] aneesh-joseph commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
aneesh-joseph commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-650487239


   thank you @mdediana , this is very useful. @mik-laj  - is there a chance to get this in 1.10.11? :)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446610398



##########
File path: airflow/utils/log/log_reader.py
##########
@@ -100,6 +101,11 @@ def is_supported(self):
 
         return hasattr(self.log_handler, 'read')
 
+    @property
+    def is_external(self):
+        """Check if the logging handler is external (e.g. Elasticsearch, Stackdriver, etc)."""

Review comment:
       ```suggestion
       def supports_external_link(self):
           """Check if the logging handler is external links (e.g. Elasticsearch, Stackdriver, etc)."""
   ```
   This name is because the word "external" suggests that the class is not part of the Airflow core, but we want to check here whether external links are supported.  When we introduce it, we need to improve the "A" property so that it has a similar naming convention - `support_read`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r441857809



##########
File path: airflow/utils/log/logging_mixin.py
##########
@@ -58,6 +58,20 @@ def _set_context(self, context):
             set_context(self.log, context)
 
 
+class RemoteLoggingMixin:

Review comment:
       Sounds good, I'll change it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana edited a comment on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana edited a comment on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-650568904


   @aneesh-joseph and @mik-laj I will do my best to address change requests as quickly as possible. In general, I have more time on weekends, but I can do something on weekdays too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446086992



##########
File path: airflow/www/views.py
##########
@@ -760,22 +761,33 @@ def log(self, session=None):
             execution_date=execution_date, form=form,
             root=root, wrapped=conf.getboolean('webserver', 'default_wrap'))
 
-    @expose('/elasticsearch')
+    @expose('/redirect_to_external_log')
     @has_dag_access(can_dag_read=True)
     @has_access
     @action_logging
-    def elasticsearch(self):
+    @provide_session
+    def redirect_to_external_log(self, session=None):
         dag_id = request.args.get('dag_id')
         task_id = request.args.get('task_id')
         execution_date = request.args.get('execution_date')
+        dttm = timezone.parse(execution_date)
         try_number = request.args.get('try_number', 1)
-        elasticsearch_frontend = conf.get('elasticsearch', 'frontend')
-        log_id_template = conf.get('elasticsearch', 'log_id_template')
-        log_id = log_id_template.format(
-            dag_id=dag_id, task_id=task_id,
-            execution_date=execution_date, try_number=try_number)
-        url = 'https://' + elasticsearch_frontend.format(log_id=quote(log_id))
-        return redirect(url)
+
+        ti = session.query(models.TaskInstance).filter(
+            models.TaskInstance.dag_id == dag_id,
+            models.TaskInstance.task_id == task_id,
+            models.TaskInstance.execution_date == dttm).first()
+
+        try:
+            task_log_reader = TaskLogReader()
+            handler = task_log_reader.log_handler
+            url = handler.get_external_log_url(ti, try_number)
+            return redirect(url)
+        except AttributeError as e:
+            error_message = [
+                f"Task log handler does not support external log URLs.\n{str(e)}\n"
+            ]
+            return jsonify(message=error_message, error=True)

Review comment:
       I am not sure if this is the correct response format. This is not an API, but a view for the end-user, so the response should be text/html.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446663429



##########
File path: airflow/utils/log/log_reader.py
##########
@@ -100,6 +101,11 @@ def is_supported(self):
 

Review comment:
       Can you rename `is_supported` to `supports_read` or somethings similar?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r441859257



##########
File path: airflow/www/views.py
##########
@@ -1363,6 +1374,14 @@ def success(self):
                                               confirmed, upstream, downstream,
                                               future, past, State.SUCCESS)
 
+    @staticmethod
+    def _get_log_handler():

Review comment:
       Nice. This is also used in `get_logs_with_metadata()`, but I didn't change it because it also uses `task_log_reader`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r441851779



##########
File path: airflow/www/views.py
##########
@@ -1363,6 +1374,14 @@ def success(self):
                                               confirmed, upstream, downstream,
                                               future, past, State.SUCCESS)
 
+    @staticmethod
+    def _get_log_handler():

Review comment:
       Today I was working on one class that contains a similar piece of code. In the future, we will be able to move it.
   https://github.com/apache/airflow/pull/9331/commits/530c93c8ff6140e77a3642efd1e9eff3ac6bdd1d




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446663511



##########
File path: airflow/utils/log/es_task_handler.py
##########
@@ -262,3 +267,25 @@ def close(self):
         super().close()
 
         self.closed = True
+
+    @property
+    def log_name(self):
+        return self.LOG_NAME
+
+    def get_external_log_url(self, task_instance: TaskInstance, try_number: int) -> str:
+        """
+        Creates an address for an external log collecting service.
+        :param task_instance: task instance object

Review comment:
       ```suggestion
   
           :param task_instance: task instance object
   ```
   Without this, the documentation does not comply with the Sphinx.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-646984825


   @mik-laj We're good now. Just one question: should we replace `_get_log_handler()` with `TaskLogReader`? The latter does more than just finding the log handler, but it's curious that this search end up duplicated.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446663429



##########
File path: airflow/utils/log/log_reader.py
##########
@@ -100,6 +101,11 @@ def is_supported(self):
 

Review comment:
       Can you rename is_supported to `supports_read`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446087927



##########
File path: airflow/www/views.py
##########
@@ -1493,8 +1505,14 @@ def recurse_nodes(task, visited):
 
         form = DateTimeWithNumRunsForm(data={'base_date': max_date,
                                              'num_runs': num_runs})
-        external_logs = conf.get('elasticsearch', 'frontend')
+
         doc_md = wwwutils.wrapped_markdown(getattr(dag, 'doc_md', None), css_class='dag-doc')
+
+        task_log_reader = TaskLogReader()
+        handler = task_log_reader.log_handler
+        show_external_log_redirect = isinstance(handler, ExternalLoggingMixin)

Review comment:
       Can you create a property from this condition in TaskLogReader?  In general, all logic should be outside the view method. The view should only format the responses to the expected format. I know that this code does not look like this at the moment, but I am trying to fix it so that the Web UI is not just one file, but it is a more manageable code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-650568904


   @aneesh-joseph and @mik-laj I will do my best to address change requests as quick as possible. In general, I have more time on weekends, but I can do something on weekdays too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446610398



##########
File path: airflow/utils/log/log_reader.py
##########
@@ -100,6 +101,11 @@ def is_supported(self):
 
         return hasattr(self.log_handler, 'read')
 
+    @property
+    def is_external(self):
+        """Check if the logging handler is external (e.g. Elasticsearch, Stackdriver, etc)."""

Review comment:
       ```suggestion
       def supports_external_link(self):
           """Check if the logging handler is external links (e.g. Elasticsearch, Stackdriver, etc)."""
   ```
   This name is because the word "external" suggests that the class is not part of the Airflow core, but we want to check here whether external links are supported.  When we introduce it, we need to improve the "is_supported" property so that it has a similar naming convention - `support_read`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446532508



##########
File path: airflow/utils/log/es_task_handler.py
##########
@@ -262,3 +267,24 @@ def close(self):
         super().close()
 
         self.closed = True
+
+    def log_name(self):

Review comment:
       Done, looks better, thanks. In the process pylint had some complaint about overriden properties, I turned it into an abstract property.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mdediana commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mdediana commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446532566



##########
File path: airflow/www/views.py
##########
@@ -760,22 +761,33 @@ def log(self, session=None):
             execution_date=execution_date, form=form,
             root=root, wrapped=conf.getboolean('webserver', 'default_wrap'))
 
-    @expose('/elasticsearch')
+    @expose('/redirect_to_external_log')
     @has_dag_access(can_dag_read=True)
     @has_access
     @action_logging
-    def elasticsearch(self):
+    @provide_session
+    def redirect_to_external_log(self, session=None):
         dag_id = request.args.get('dag_id')
         task_id = request.args.get('task_id')
         execution_date = request.args.get('execution_date')
+        dttm = timezone.parse(execution_date)
         try_number = request.args.get('try_number', 1)
-        elasticsearch_frontend = conf.get('elasticsearch', 'frontend')
-        log_id_template = conf.get('elasticsearch', 'log_id_template')
-        log_id = log_id_template.format(
-            dag_id=dag_id, task_id=task_id,
-            execution_date=execution_date, try_number=try_number)
-        url = 'https://' + elasticsearch_frontend.format(log_id=quote(log_id))
-        return redirect(url)
+
+        ti = session.query(models.TaskInstance).filter(
+            models.TaskInstance.dag_id == dag_id,
+            models.TaskInstance.task_id == task_id,
+            models.TaskInstance.execution_date == dttm).first()
+

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-650711579


   > thank you @mdediana , this is very useful. @mik-laj - is there a chance to get this in 1.10.11? :)
   
   I'm not the release manager, but I don't see any obstacles for it to be added to the next release.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r441850197



##########
File path: airflow/utils/log/logging_mixin.py
##########
@@ -58,6 +58,20 @@ def _set_context(self, context):
             set_context(self.log, context)
 
 
+class RemoteLoggingMixin:

Review comment:
       ```suggestion
   class ExternalLoggingMixin:
   ```
   What do you think to change the name? Remote logging is an existing mechanism that allows access to logs from Web UI. In my opinion, external logs better describe this feature.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#discussion_r446663429



##########
File path: airflow/utils/log/log_reader.py
##########
@@ -100,6 +101,11 @@ def is_supported(self):
 

Review comment:
       Can you rename is_supported to supports_read?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #9354: Task logging handlers can provide custom log links

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #9354:
URL: https://github.com/apache/airflow/pull/9354#issuecomment-650708973


   Can you add some documentation?  I am thinking about the new section - "External links" on the ["Writing logs"](https://airflow.readthedocs.io/en/latest/howto/write-logs.html#writing-logs) page. (This page has a bad title and I will try to change it to "Logging" soon)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org