Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/12 23:27:09 UTC

[GitHub] [airflow] mik-laj opened a new pull request #17591: Use gunicorn to serve logs generated by worker

mik-laj opened a new pull request #17591:
URL: https://github.com/apache/airflow/pull/17591




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688666544



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():

Review comment:
       I probably use the `--force` too much.







[GitHub] [airflow] mik-laj commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688644072



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():
+            self.cfg.set(key.lower(), value)
+
+    def load(self):
+        return self.application
+
+
 def serve_logs():
     """Serves logs generated by Worker"""
     setproctitle("airflow serve-logs")
-    app = flask_app()
+    wsgi_app = create_app()
 
     worker_log_server_port = conf.getint('celery', 'WORKER_LOG_SERVER_PORT')
-    app.run(host='0.0.0.0', port=worker_log_server_port)
+    options = {
+        'bind': f"0.0.0.0:{worker_log_server_port}",
+        'workers': 2,

Review comment:
       Good point. Added docs.







[GitHub] [airflow] potiuk commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688468320



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():
+            self.cfg.set(key.lower(), value)
+
+    def load(self):
+        return self.application
+
+
 def serve_logs():
     """Serves logs generated by Worker"""
     setproctitle("airflow serve-logs")
-    app = flask_app()
+    wsgi_app = create_app()
 
     worker_log_server_port = conf.getint('celery', 'WORKER_LOG_SERVER_PORT')
-    app.run(host='0.0.0.0', port=worker_log_server_port)
+    options = {
+        'bind': f"0.0.0.0:{worker_log_server_port}",
+        'workers': 2,

Review comment:
       Yeah, 2 is better just in case of memory errors/crashes. Just one small nit (in case someone has problems with memory usage etc.): I think it would be great to mention in the worker docs that we are using Gunicorn and that the configuration options can be overridden by the GUNICORN_CMD_ARGS env variable: https://docs.gunicorn.org/en/latest/settings.html#settings
   
   Without looking at the code, it's not at all obvious from the docs that we now have separate Gunicorn processes forked and that you can configure their behaviour via env vars.
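
A minimal sketch of what such an override looks like, assuming the variable holds ordinary command-line style flags. The value below is a hypothetical example, and whether the embedded log server actually honours the variable depends on how the application class loads its configuration, so treat this as an illustration of the format only:

```python
import shlex

# Hypothetical value an operator might export before starting the worker.
env = {"GUNICORN_CMD_ARGS": "--workers 4 --timeout 120"}

# The string is split with shell-like rules into individual arguments.
args = shlex.split(env.get("GUNICORN_CMD_ARGS", ""))
print(args)
```

Quoting follows shell rules, so a value containing spaces needs to be quoted inside the variable.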







[GitHub] [airflow] github-actions[bot] commented on pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#issuecomment-898413764


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] jedcunningham commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688555204



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():

Review comment:
       ```suggestion
           for key, value in self.options.items():
   ```
   
   I'm not sure we need this? Seems odd to silently drop options that are passed, same with None.
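
The concern can be illustrated with a small standalone sketch. `KNOWN_SETTINGS`, `filter_options` and `validate_options` are hypothetical names used only for this illustration (the first stands in for the keys of `self.cfg.settings`); this is not the code from the PR:

```python
# Hypothetical stand-in for the keys of gunicorn's `self.cfg.settings`.
KNOWN_SETTINGS = {"bind", "workers", "timeout"}


def filter_options(options):
    """Lenient variant: mirrors the dict comprehension in the diff."""
    return {
        key: value
        for key, value in options.items()
        if key in KNOWN_SETTINGS and value is not None
    }


def validate_options(options):
    """Strict variant: reject unknown keys instead of dropping them."""
    unknown = sorted(set(options) - KNOWN_SETTINGS)
    if unknown:
        raise ValueError(f"Unknown options: {unknown}")
    return dict(options)


# "wokers" is a deliberate typo; "timeout" is explicitly set to None.
options = {"bind": "0.0.0.0:8793", "workers": 2, "wokers": 4, "timeout": None}
print(filter_options(options))
```

With the lenient variant the misspelled `wokers` key disappears without any signal to the caller, while the strict variant fails loudly on the same input.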







[GitHub] [airflow] kaxil commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688388159



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():
+            self.cfg.set(key.lower(), value)
+
+    def load(self):
+        return self.application
+
+
 def serve_logs():
     """Serves logs generated by Worker"""
     setproctitle("airflow serve-logs")
-    app = flask_app()
+    wsgi_app = create_app()
 
     worker_log_server_port = conf.getint('celery', 'WORKER_LOG_SERVER_PORT')
-    app.run(host='0.0.0.0', port=worker_log_server_port)
+    options = {
+        'bind': f"0.0.0.0:{worker_log_server_port}",
+        'workers': 2,

Review comment:
       Do we need 2 workers, or can we get away with just 1?







[GitHub] [airflow] mik-laj commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688148321



##########
File path: airflow/utils/serve_logs.py
##########
@@ -19,15 +19,16 @@
 import os
 import time
 
+import gunicorn.app.base
 from flask import Flask, abort, request, send_from_directory
 from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def flask_app():
-    flask_app = Flask(__name__)
+def create_app():
+    flask_app = Flask(__name__, static_folder=None)

Review comment:
       We don't support this feature, so we can safely turn it off.







[GitHub] [airflow] jedcunningham commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688661251



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():

Review comment:
       @mik-laj, was this intentionally resolved without a comment?







[GitHub] [airflow] potiuk commented on pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#issuecomment-898066957


   > Can you explain a bit more about this change, why do we need it?
   
   Yeah, agree with @kaxil. I figured out why after seeing this Slack conversation: https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1626866127472700 - but the commit message should explain why we are doing it and what problem we are solving (what was the effect of running the dev server rather than gunicorn).





[GitHub] [airflow] mik-laj commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688664479



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():

Review comment:
       I did something wrong. I accepted this change, but I don't know why I can't see it now.







[GitHub] [airflow] jedcunningham commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688665220



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():

Review comment:
       Ah, weird 🍺







[GitHub] [airflow] mik-laj commented on pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#issuecomment-898119170


   @potiuk @kaxil Updated PR description.





[GitHub] [airflow] mik-laj commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688462179



##########
File path: airflow/utils/serve_logs.py
##########
@@ -73,10 +74,47 @@ def serve_logs_view(filename):
     return flask_app
 
 
+class StandaloneGunicornApplication(gunicorn.app.base.BaseApplication):
+    """
+    Standalone Gunicorn application/serve for usage with any WSGI-application.
+
+    Code inspired by an example from the Gunicorn documentation.
+    https://github.com/benoitc/gunicorn/blob/cf55d2cec277f220ebd605989ce78ad1bb553c46/examples/standalone_app.py
+
+    For details, about standalone gunicorn application, see:
+    https://docs.gunicorn.org/en/stable/custom.html
+    """
+
+    def __init__(self, app, options=None):
+        self.options = options or {}
+        self.application = app
+        super().__init__()
+
+    def load_config(self):
+        config = {
+            key: value
+            for key, value in self.options.items()
+            if key in self.cfg.settings and value is not None
+        }
+        for key, value in config.items():
+            self.cfg.set(key.lower(), value)
+
+    def load(self):
+        return self.application
+
+
 def serve_logs():
     """Serves logs generated by Worker"""
     setproctitle("airflow serve-logs")
-    app = flask_app()
+    wsgi_app = create_app()
 
     worker_log_server_port = conf.getint('celery', 'WORKER_LOG_SERVER_PORT')
-    app.run(host='0.0.0.0', port=worker_log_server_port)
+    options = {
+        'bind': f"0.0.0.0:{worker_log_server_port}",
+        'workers': 2,

Review comment:
       1 should work too, but I configured 2 for failover in case one worker has problems.







[GitHub] [airflow] jedcunningham commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688652693



##########
File path: docs/apache-airflow/logging-monitoring/logging-tasks.rst
##########
@@ -107,3 +107,16 @@ External Links
 When using remote logging, users can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects a user to the external UI.
 
 Some external systems require specific configuration in Airflow for redirection to work but others do not.
+
+Serving logs from workers
+-------------------------
+
+Most task handlers send logs upon completion of a task. In order to view the log in real time, airflow starts the server serving the log in the following cases:
+
+- If ``SchedulerExecutor`` or ``LocalExecutor`` is used, then after running the ``airflow scheduler`` command.
+- If ``CeleryExecutor`` is used, then after running the ``airflow worker`` command.
+
+The server is running on the port specified by ``worker_log_server_port`` option in ``celery`` section. By default, it is ``8793``.
+Communication between the webserver and the worker is signed with the key specified by ``secret_key`` option  in ``webserver`` section. You must ensure that the key matches so that communication can take place without problems.
+
+We are using `Gunicorm <https://gunicorn.org/>`__ as WSGI server. Its configuration options can be overridden by ``GUNiCORN_CMD_ARGS`` env variable. For details, see `Gunicorn settings <https://docs.gunicorn.org/en/latest/settings.html#settings>`__

Review comment:
       ```suggestion
   We are using `Gunicorn <https://gunicorn.org/>`__ as a WSGI server. Its configuration options can be overridden with the ``GUNICORN_CMD_ARGS`` env variable. For details, see `Gunicorn settings <https://docs.gunicorn.org/en/latest/settings.html#settings>`__.
   ```

##########
File path: docs/apache-airflow/logging-monitoring/logging-tasks.rst
##########
@@ -107,3 +107,16 @@ External Links
 When using remote logging, users can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects a user to the external UI.
 
 Some external systems require specific configuration in Airflow for redirection to work but others do not.
+
+Serving logs from workers
+-------------------------
+
+Most task handlers send logs upon completion of a task. In order to view the log in real time, airflow starts the server serving the log in the following cases:
+
+- If ``SchedulerExecutor`` or ``LocalExecutor`` is used, then after running the ``airflow scheduler`` command.
+- If ``CeleryExecutor`` is used, then after running the ``airflow worker`` command.

Review comment:
       ```suggestion
   - If ``SequentialExecutor`` or ``LocalExecutor`` is used, then when ``airflow scheduler`` is running.
   - If ``CeleryExecutor`` is used, then when ``airflow worker`` is running.
   ```
   
   Still not 100% satisfied with the wording here 🤷‍♂️

##########
File path: docs/apache-airflow/logging-monitoring/logging-tasks.rst
##########
@@ -107,3 +107,16 @@ External Links
 When using remote logging, users can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects a user to the external UI.
 
 Some external systems require specific configuration in Airflow for redirection to work but others do not.
+
+Serving logs from workers
+-------------------------
+
+Most task handlers send logs upon completion of a task. In order to view the log in real time, airflow starts the server serving the log in the following cases:

Review comment:
       ```suggestion
   Most task handlers send logs upon completion of a task. In order to view logs in real time, Airflow automatically starts an HTTP server to serve the logs in the following cases:
   ```

##########
File path: docs/apache-airflow/logging-monitoring/logging-tasks.rst
##########
@@ -107,3 +107,16 @@ External Links
 When using remote logging, users can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects a user to the external UI.
 
 Some external systems require specific configuration in Airflow for redirection to work but others do not.
+
+Serving logs from workers
+-------------------------
+
+Most task handlers send logs upon completion of a task. In order to view the log in real time, airflow starts the server serving the log in the following cases:
+
+- If ``SchedulerExecutor`` or ``LocalExecutor`` is used, then after running the ``airflow scheduler`` command.
+- If ``CeleryExecutor`` is used, then after running the ``airflow worker`` command.
+
+The server is running on the port specified by ``worker_log_server_port`` option in ``celery`` section. By default, it is ``8793``.
+Communication between the webserver and the worker is signed with the key specified by ``secret_key`` option  in ``webserver`` section. You must ensure that the key matches so that communication can take place without problems.
+
+We are using `Gunicorm <https://gunicorn.org/>`__ as WSGI server and that the configuration options can be overridden by ``GUNiCORN_CMD_ARGS`` env variable. For details, see `Gunicorn settings <https://docs.gunicorn.org/en/latest/settings.html#settings>`__

Review comment:
       ```suggestion
   We are using `Gunicorn <https://gunicorn.org/>`__ as a WSGI server and configuration options can be overridden with the ``GUNICORN_CMD_ARGS`` env variable. For details, see `Gunicorn settings <https://docs.gunicorn.org/en/latest/settings.html#settings>`__.
   ```







[GitHub] [airflow] potiuk merged pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #17591:
URL: https://github.com/apache/airflow/pull/17591


   





[GitHub] [airflow] mik-laj commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688147883



##########
File path: airflow/utils/serve_logs.py
##########
@@ -19,15 +19,16 @@
 import os
 import time
 
+import gunicorn.app.base
 from flask import Flask, abort, request, send_from_directory
 from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def flask_app():
-    flask_app = Flask(__name__)
+def create_app():

Review comment:
       Follow convention supported by flask. https://github.com/pallets/flask/blob/afc13b9390ae2e40f4731e815b49edc9ef52ed4b/src/flask/cli.py#L63




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #17591: Use gunicorn to serve logs generated by worker

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #17591:
URL: https://github.com/apache/airflow/pull/17591#discussion_r688645101



##########
File path: docs/apache-airflow/logging-monitoring/logging-tasks.rst
##########
@@ -107,3 +107,16 @@ External Links
 When using remote logging, users can configure Airflow to show a link to an external UI within the Airflow Web UI. Clicking the link redirects a user to the external UI.
 
 Some external systems require specific configuration in Airflow for redirection to work but others do not.
+

Review comment:
       :heart:



