You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/19 17:02:56 UTC
[GitHub] [airflow] potiuk opened a new pull request #21685: Add extra information about time synchronization needed
potiuk opened a new pull request #21685:
URL: https://github.com/apache/airflow/pull/21685
When you have several machines running Airflow and their time
is not synchronized, you might get very weird behaviour - some of
the log retrieval actions might randomly return "forbidden" error
because of expiring token. This is extremely difficult to
diagnose and figure out (some of our users spent days on
investigating that). Explicitly stating the requirement in the
forbidden error and whenever secret_key parameter is
mentioned, should help our users to diagnose it more easily
(and save maintainers from unnecessary questions and discussions
in Slack :))
<!--
Thank you for contributing! Please make sure that your code changes
are covered with tests. And in case of new features or big changes
remember to adjust the documentation.
Feel free to ping committers for the review!
In case of existing issue, reference it using one of the following:
closes: #ISSUE
related: #ISSUE
How to write a good git commit message:
http://chris.beams.io/posts/git-commit/
-->
---
**^ Add meaningful description above**
Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046066935
Actually - thanks @eladkal ! I was thinking for a while about a new idea for talk / blog post and your comment gave me fantastic idea. I am going to do it :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk merged pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #21685:
URL: https://github.com/apache/airflow/pull/21685
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046064875
And when you see mistakes like that, it's also super easy to do search-replace if the content is the same :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] github-actions[bot] commented on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046063591
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046066143
Just one comment - tt's actually very similar reasoning why Tests shoudl be DAMP not DRY: https://testing.googleblog.com/2019/12/testing-on-toilet-tests-too-dry-make.html
I agree that also in docs DRY is important but when faced with the dillema DRY or IMMEDIATELY HELPFUL WITHOUT FOLLOWING MORE LINKS (when you see warnings and watchouts) I favour the other.
BTW. That gave me an idea - I need to come up with some catchy phrase like DAMP. I think this might be the next content of my blog post and presentation for the summit even :D.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] eladkal commented on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
eladkal commented on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046106835
For me documentation is not different that managing DWH. You need one source of truth otherwise more often people will consume wrong/non update information but I also agree with what you said. Duplication can sometimes be OK (for example in intro/quick guide <-> full extensive guide). Ideally we should have the ability to embed text snippets so the content is written only once and the reader doesn't have to click a link nor is even aware that the content is taken from somewhere else. Something to inspire to...
> Actually - thanks @eladkal ! I was thinking for a while about a new idea for talk / blog post and your comment gave me fantastic idea. I am going to do it :)
Waiting to read it!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk edited a comment on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046066143
Just one comment - tt's actually very similar reasoning why Tests shoudl be DAMP rather than DRY (DAMP should be favoured): https://testing.googleblog.com/2019/12/testing-on-toilet-tests-too-dry-make.html
I agree that also in docs DRY is important but when faced with the dillema DRY or IMMEDIATELY HELPFUL WITHOUT FOLLOWING MORE LINKS (when you see warnings and watchouts) I favour the other.
BTW. That gave me an idea - I need to come up with some catchy phrase like DAMP. I think this might be the next content of my blog post and presentation for the summit even :D.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] eladkal commented on a change in pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#discussion_r810515314
##########
File path: airflow/config_templates/config.yml
##########
@@ -1085,6 +1085,10 @@
Secret key used to run your flask app. It should be as random as possible. However, when running
more than 1 instances of webserver, make sure all of them use the same ``secret_key`` otherwise
one of them will error with "CSRF session token is missing".
+ The webserver key is also used to authorize requests to Celery workers when loga are retrieved.
Review comment:
```suggestion
The webserver key is also used to authorize requests to Celery workers when logs are retrieved.
```
##########
File path: docs/helm-chart/production-guide.rst
##########
@@ -111,6 +111,11 @@ Example to create a Kubernetes Secret from ``kubectl``:
kubectl create secret generic my-webserver-secret --from-literal="webserver-secret-key=$(python3 -c 'import secrets; print(secrets.token_hex(16))')"
+The webserver key is also used to authorize requests to Celery workers when loga are retrieved. The token
Review comment:
```suggestion
The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token
```
##########
File path: docs/apache-airflow/howto/set-config.rst
##########
@@ -121,3 +121,8 @@ the example below.
does not require all, some configurations need to be same otherwise they would not
work as expected. A good example for that is :ref:`secret_key<config:webserver__secret_key>` which
should be same on the Webserver and Worker to allow Webserver to fetch logs from Worker.
+
+ The webserver key is also used to authorize requests to Celery workers when loga are retrieved. The token
Review comment:
```suggestion
The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token
```
##########
File path: docs/apache-airflow/upgrading-from-1-10/index.rst
##########
@@ -339,6 +339,11 @@ the only supported UI.
this via any configuration mechanism. The 1.10.15 bridge-release modifies this feature
to use randomly generated secret keys instead of an insecure default and may break existing
deployments that rely on the default.
+ The webserver key is also used to authorize requests to Celery workers when loga are retrieved. The token
Review comment:
```suggestion
The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token
```
##########
File path: airflow/config_templates/default_airflow.cfg
##########
@@ -559,6 +559,10 @@ reload_on_plugin_change = False
# Secret key used to run your flask app. It should be as random as possible. However, when running
# more than 1 instances of webserver, make sure all of them use the same ``secret_key`` otherwise
# one of them will error with "CSRF session token is missing".
+# The webserver key is also used to authorize requests to Celery workers when loga are retrieved.
Review comment:
```suggestion
# The webserver key is also used to authorize requests to Celery workers when logs are retrieved.
```
##########
File path: docs/apache-airflow/configurations-ref.rst
##########
@@ -27,6 +27,11 @@ does not require all, some configurations need to be same otherwise they would n
work as expected. A good example for that is :ref:`secret_key<config:webserver__secret_key>` which
should be same on the Webserver and Worker to allow Webserver to fetch logs from Worker.
+The webserver key is also used to authorize requests to Celery workers when loga are retrieved. The token
Review comment:
```suggestion
The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] potiuk commented on pull request #21685: Add extra information about time synchronization needed
Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21685:
URL: https://github.com/apache/airflow/pull/21685#issuecomment-1046064722
I deliberetely put repeated content in this case. The problem with content that is "linked" from many places is that most users will not follow "see also" or "look for details". I believe some level of redundancy in "watch outs" and warnings like this is not only good but necessary - especially that (unlike code) those warnings will not get refactored or updated after they are merged.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org