You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/01 11:40:51 UTC

[GitHub] [airflow] ashb opened a new pull request #16754: Only allow webserver to request from the worker log server

ashb opened a new pull request #16754:
URL: https://github.com/apache/airflow/pull/16754


   Logs _shouldn't_ contain any sensitive info, but they often do by
   mistake. As an extra level of protection we shouldn't allow anything
   other than the webserver to access the logs.
   
   (We can't change the bind IP form 0.0.0.0 as for it to be useful it
   needs to be accessed from different hosts -- i.e. the webserver will
   almost always be on a different node)
   
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-883590588


   > Also when using airflow helm chart `secret_key` ends up being different between the pods unless it is explicitly specified in values.yaml, so maybe update helm chart documentation as well? Or use the same logic as is [currently used for fernet key in helm chart](https://github.com/apache/airflow/blob/main/chart/templates/secrets/fernetkey-secret.yaml)?
   
   cc: @jedcunningham and @kaxil -> I believe this has already been addressed (or at least planned) just wanted to make sure it's not lost :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662333769



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),

Review comment:
       Done in 27265516d

##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),
+        algorithm_name='HS512',
+        expires_in=max_request_age,
+    )
+
+    # Prevent direct access to the logs port
+    @flask_app.before_request
+    def validate_pre_signed_url():
+        try:
+            auth = request.headers['Authorization']
+
+            # We don't actually care about the payload, just that the signature

Review comment:
       Done in 27265516d




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] wanderijames edited a comment on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
wanderijames edited a comment on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-880731042


   @ashb this could have broken worker logs access from the web ui. A pod running webui cannot access logs from the worker in another pod. This can be reproduced if deployed in Kubernetes. Getting the following:
   
   ```
   *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log
   *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log
   For more information check: https://httpstatuses.com/403
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] jedcunningham commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-885665842


   @green2k, I'd just change your success criteria to be a 403 response. That said, having a `/health` endpoint probably makes sense.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-882102892


   > Edit: I thought you meant fernet key by secret key. I understood that you actually meant AIRFLOW__WEBSERVER__SECRET_KEY. It's kind of tough to understand which components need the same configuration.
   
   All components of Airflow needs exactly the same configuration. FULL CONFIGURATION. Correct me please, if I am wrong but nowhere in documentation it is stated otherwise. This is the default mode for Airflow and it's always been like that.
   
   I think it is pretty brave (and completely unnecessary) to decide on having some parts of configuration available here and unavailable there without fully realising the consequences. 
   
   By default you should have:
   
   * the same images for all airflow components
   * the same dependencies for all Airflow components
   * the same configuration for all Airlfow components
   
   If you try to approach it differently, you might hit all kinds of problems and unless you exactly know what you are doing, you should avoid it (and even that you open-up to future changes that might - like in this case - base on the assumptions that your configuration is the same for all components.
   
   Just wondering @tunix  - where the idea that different components should have different configuration came from? Is this something implied? Or something that you found in the documentation? Should we make it very explicit in our documentation that you are not supposed to selectively set some configuration parameters on some components only?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] green2k commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
green2k commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-885661439


   We use the logging endpoint for worker healthchecks (an external service for health-checking of nodes in a cluster). Does it mean these simple healthchecks won't work anymore?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-882102892


   > Edit: I thought you meant fernet key by secret key. I understood that you actually meant AIRFLOW__WEBSERVER__SECRET_KEY. It's kind of tough to understand which components need the same configuration.
   
   All components of Airflow needs exactly the same configuration. FULL CONFIGURATION. Correct me please, if I am wrong but nowhere in documentation it is stated otherwise. This is the default mode for Airflow and it's always been like that.
   
   I think it is pretty brave (and completely unnecessary) to decide on having some parts of configuration available here and unavailable there without fully realising the consequences. 
   
   By default you should have:
   
   * the same images for all airflow components
   * the same dependencies for all Airflow components
   * the same configuration for all Airlfow components
   
   If you try to approach it differently, you might hit all kinds of problems and unless you exactly know what you are doing, you should avoid it (and even that you open-up to future changes that might - like in this case - base on the assumptions that your configuration is the same for all components.
   
   Just wondering @tunix  - where the idea that different components should have different configuration came from? Is this something implied? Or something that you found in the documentation? Should we make it very explicit in our documentation that you are not supposed to selectively set some configuration parameters on some components?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-884365589


   https://github.com/apache/airflow/pull/17142 should take care of the Helm Chart -- rc1 for which will be created today and hopefully the version will be released in next 3-4 days


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb merged pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashb merged pull request #16754:
URL: https://github.com/apache/airflow/pull/16754


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] tunix commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
tunix commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-881504197


   @ashb which configuration do you mean? i have the same issue with @wanderijames on my Docker Swarm setup. My webserver & worker containers have identical configuration. Worker has a few other packages installed in it. They share the same secret.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662276667



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),
+        algorithm_name='HS512',
+        expires_in=max_request_age,
+    )
+
+    # Prevent direct access to the logs port
+    @flask_app.before_request
+    def validate_pre_signed_url():
+        try:
+            auth = request.headers['Authorization']
+
+            # We don't actually care about the payload, just that the signature

Review comment:
       Oh yeah, easy enough to as the payload is otherwise unused.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] shankarcb commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
shankarcb commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963392985


   > 
   > 
   > This is not the real propblem but your deployment issue. Your webserver and workers simply do not have time properly synchronized. Please make sure they do (for example with ntp).
   
   Initially I did think the same and made sure they are synced which did not help. They are having only 2 second delay between the nodes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] tunix edited a comment on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
tunix edited a comment on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-881504197


   @ashb which configuration do you mean? i have the same issue with @wanderijames on my Docker Swarm setup. My webserver & worker containers have identical configuration. Worker has a few other packages installed in it. They share the same secret.
   
   **Edit:** I thought you meant fernet key by secret key. I understood that you actually meant `AIRFLOW__WEBSERVER__SECRET_KEY`. It's kind of tough to understand which components need the same configuration.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashoksahoo commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashoksahoo commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-883879822


   I also faced the same issue and was not able to solve it, I downgraded to 2.1.1 in the chart for now. 
   Can we create a Issue and track it for 2.1.3


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-872301309


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-880780858


   @wanderijames That sounds like your webserver and the worker pods do not have the same configuration then -- specifically the secret key. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963406426


   I actually it's not the change you applied mto the code but likely redeploying the code or restarting the machine fixed the issue. Almost 100% sure that you had time drift. 
   
   The fact that you have 2 seconds now, means that something is wrong and probably the difference will grow. When you have NTP synchronised, both machines should have exactly the same time down to millisecond.
   
   You can check it be changing the value back to 30 - as long as the time drift is smaller it should still work.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-884444436


   > #17142 should take care of the Helm Chart -- rc1 for which will be created today and hopefully the version will be released in next 3-4 days
   
   wooohooo :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] tunix commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
tunix commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-884419057


   > Just wondering @tunix - where the idea that different components should have different configuration came from? Is this something implied? Or something that you found in the documentation? Should we make it very explicit in our documentation that you are not supposed to selectively set some configuration parameters on some components only?
   
   I'm new to Airflow and I haven't seen an explicit part within the docs that says "all configuration must be shared between airflow components". In fact, all configuration keys are prefixed with some naming scheme which led me to think that some are common, some are for webserver etc... I personally don't like to have common configuration because changing a single setting requires a change in all components (hence restart) but I get that this is intentional for Airflow. I think this can be better communicated throughout the docs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963418306


   So likely you had another issue - possibly not restarted airlfow that used different configuration. Yes. This parameter should  change grace period and yes it should work if you restart airflow. so you should look in your process/deployment not in airflow.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662263042



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),
+        algorithm_name='HS512',
+        expires_in=max_request_age,
+    )
+
+    # Prevent direct access to the logs port
+    @flask_app.before_request
+    def validate_pre_signed_url():
+        try:
+            auth = request.headers['Authorization']
+
+            # We don't actually care about the payload, just that the signature

Review comment:
       It would be *slightly* safer to add the log path to the payload. With the current approach m-i-m could potentially grab an authorisation header from one request and retrieve as many different log entries as possible within 30 seconds time window. Otherwise they could only intercept those logs that were specifically requested by the webserver. 
   
   It is very remote possibility - you'd have to be able to place yourself between the webserver and workers, but I think it should be rather easy to add  as an extra protection. I think adding the log path as payload and verifying it, would be all that is needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662329194



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),

Review comment:
       Yes.  For now, we have used the tokens in two places, but in the future we may use them more often.  If we are unlucky, it may happen that it will be possible to generate a token that has a matching format. When we add salt, this token can be used here and no other place.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] shankarcb commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
shankarcb commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963222750


   This push has introduced another bug of 403 client error when DAG logs are fetched from Airflow WebUI. This bug appears to impact when Airflow is setup in cluster mode. We have setup Airflow in cluster mode with Celery - One master and multiple worker nodes. [Not Celery kubernetes]
   Error message was that secret_key is not matching which was not the case since identical copies of airflow.cfg was used in master and worker nodes. 
   
   **After increasing the default fallback value to 120 seconds in below code snippet, this has fixed the 403 client error issue.** 
    We added "log_request_clock_grace" under [webserver] section on airflow.cfg to override this value which did not work. We had to manually change the code in airflow installation to fix the issue. However, this is not recommended/permanent solution. **Please fix this in future releases.**
   
   def create_app():
       flask_app = Flask(__name__, static_folder=None)
       max_request_age = conf.getint('webserver', 'log_request_clock_grace', **fallback=30**)
   
   Code: https://github.com/apache/airflow/blob/main/airflow/utils/serve_logs.py
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963364869


   This is not the real propblem but your deployment issue. Your webserver and workers simply do not have time properly synchronized. Please make sure they do (for example with ntp).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662263443



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),
+        algorithm_name='HS512',
+        expires_in=max_request_age,
+    )
+
+    # Prevent direct access to the logs port
+    @flask_app.before_request
+    def validate_pre_signed_url():
+        try:
+            auth = request.headers['Authorization']
+
+            # We don't actually care about the payload, just that the signature

Review comment:
       Not super strong about it, but seems easy to do with little to no drawback




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662331988



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),

Review comment:
       For details, see: https://github.com/pallets/itsdangerous/blob/main/src/itsdangerous/signer.py#L78-L79




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] shankarcb commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
shankarcb commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963414025


   Airflow library is installed on a separate python virtual environment with required Airflow python libraries installed there.. we updated the serve_logs.py in site-packages directory of that virtual environment which is picked up by Airflow process to resolve the issue..Machine reboot was not performed.
   
   All the machines are synced with same RHEL NTP servers.
   
   This impacted multiple deployments of Airflow..The other deployment did not have any time lag/drift..
   
   What is the significance of log_request_clock_grace parameter in the code? Can this be configured in airflow.cfg?
   We are fine with this solution as long as we dont need to manually change the airflow utils code after each upgrade of Airflow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-884217602


   > I also faced the same issue and was not able to solve it, I downgraded to 2.1.1 in the chart for now.
   > Can we create a Issue and track it for 2.1.3
   
   Yup, we will clear it out in documentation as well as make sure the official Helm-chart works with this config :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662263042



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),
+        algorithm_name='HS512',
+        expires_in=max_request_age,
+    )
+
+    # Prevent direct access to the logs port
+    @flask_app.before_request
+    def validate_pre_signed_url():
+        try:
+            auth = request.headers['Authorization']
+
+            # We don't actually care about the payload, just that the signature

Review comment:
       It would be *slightly* safer to add the log path to the payload. With the current approach m-i-m could potentially grab an authorisation header from one request and retrieve as many different log entries as possible within 30 seconds time window. Otherwise they could only intercept those logs that were specifically requested by the webserver. 
   
   It is very remote possibility - you'd have to be able to place yourself between the webserver and workers, but I think it should be rather easy to add  as an extra protection I think adding the log path as payload and verifying it, would be all that is needed.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662319303



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),

Review comment:
       Do you mean something like adding `salt='logs'`?
   
   Also probably doesn't hurt, but the formats are very different.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-872335442


   All passed bar upload coverage -- merging.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#discussion_r662300548



##########
File path: airflow/utils/serve_logs.py
##########
@@ -17,25 +17,61 @@
 
 """Serve logs process"""
 import os
+import time
 
-import flask
+from flask import Flask, abort, request, send_from_directory
+from itsdangerous import TimedJSONWebSignatureSerializer
 from setproctitle import setproctitle
 
 from airflow.configuration import conf
 
 
-def serve_logs():
-    """Serves logs generated by Worker"""
-    print("Starting flask")
-    flask_app = flask.Flask(__name__)
-    setproctitle("airflow serve-logs")
+def flask_app():
+    flask_app = Flask(__name__)
+    max_request_age = conf.getint('webserver', 'log_request_clock_grace', fallback=30)
+    log_directory = os.path.expanduser(conf.get('logging', 'BASE_LOG_FOLDER'))
+
+    signer = TimedJSONWebSignatureSerializer(
+        secret_key=conf.get('webserver', 'secret_key'),

Review comment:
       I think we should pass the salt parameter here as well to prevent the token from being used in the wrong context e.g. using a token from the getCode API in this method. This shouldn't work, but it's better to still explicitly prevent it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] wanderijames commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
wanderijames commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-880797073


   > @wanderijames That sounds like your webserver and the worker pods do not have the same configuration then -- specifically the secret key.
   
   I will investigate my setup further. Thanks @ashb 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-884443202


   @tunix Does https://github.com/apache/airflow/pull/17146 help?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-963406426


   I actually think it's not the change you applied mto the code but likely redeploying the code or restarting the machine fixed the issue. Almost 100% sure that you had time drift. 
   
   The fact that you have 2 seconds now, means that something is wrong and probably the difference will grow. When you have NTP synchronised, both machines should have exactly the same time down to millisecond.
   
   You can check it be changing the value back to 30 - as long as the time drift is smaller it should still work.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] green2k commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
green2k commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-885680910


   Yes, that's the thing. Our health-checking system expects HTTP/200 response. We managed to have a custom healthcheck endpoint by executing `pathlib.Path('logs/health').touch(exist_ok=True)`. So we're currently (v2.1.1) able to check health of our workers by calling `http(s)://worker/log/health`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] wanderijames commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
wanderijames commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-880731042


   @ashb this could have broken worker logs access from the web ui. A pod running webui cannot access logs from the work in another pod.  Getting the following:
   
   ```
   *** Fetching from: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log
   *** Failed to fetch log file from worker. 403 Client Error: FORBIDDEN for url: https://worker.worker-svc.default.svc.cluster.local:8793/log/<dag>/<task>/2021-07-15T11:51:59.190528+00:00/1.log
   For more information check: https://httpstatuses.com/403
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Aakcht commented on pull request #16754: Only allow webserver to request from the worker log server

Posted by GitBox <gi...@apache.org>.
Aakcht commented on pull request #16754:
URL: https://github.com/apache/airflow/pull/16754#issuecomment-883543410


   > Or something that you found in the documentation? Should we make it very explicit in our documentation that you are not supposed to selectively set some configuration parameters on some components only?
   
   I also had the same issue, as per https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#secret-key to me it looks like only instances of webserver should have the same `secret_key`, so maybe better change it?
   
   Also when using airflow helm chart `secret_key` ends up being different between the pods unless it is explicitly specified in values.yaml, so maybe update helm chart documentation as well? Or use the same logic as is [currently used for fernet key in helm chart](https://github.com/apache/airflow/blob/main/chart/templates/secrets/fernetkey-secret.yaml)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org