You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "hakuno (via GitHub)" <gi...@apache.org> on 2023/10/23 14:27:54 UTC
[PR] fix(helm): set worker safeToEvict properly [airflow]
hakuno opened a new pull request, #35130:
URL: https://github.com/apache/airflow/pull/35130
It improves (or fixes) the `workers.safeToEvict` usage.
Kubernetes administrators/users agree that `cluster-autoscaler.kubernetes.io/safe-to-evict` can be managed to avoid the Autoscaler to kill the worker pod when scaling nodes down.
> When this annotation is set to "true", the cluster autoscaler is allowed to evict a Pod even if other rules would normally prevent that. The cluster autoscaler never evicts Pods that have this annotation explicitly set to "false"; you could set that on an important Pod that you want to keep running. If this annotation is not set then the cluster autoscaler follows its Pod-level behavior.
So, if I set `workers.safeToEvict` to `false`, it gets nothing. The Autoscaler will kill that still. Because the Helm chart has no effect on handling that. See:
```
{{- if .Values.workers.safeToEvict }}
cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
{{- end }}
```
It's like a useless piece of statement.
What about if I need to set that to false? I couldn't. Anybody is unable to.
A probaly **workaround** would set up `workers.podAnnotations` with map of _annotations_ and `workers.safeToEvict` falsely together.
Please, check it out carefully. Thanks in advance!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "hakuno (via GitHub)" <gi...@apache.org>.
hakuno commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1778325712
Ps.¹ Other Airflow components also has `safeToEvict` values.
Ps.² At another moment, I see I can contribute with other PRs. Because `workers.podAnnotations` and `airflowPodAnnotations` can fight each other whenever it has a set of annotations in common.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "potiuk (via GitHub)" <gi...@apache.org>.
potiuk commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1777228011
> @jedcunningham Thanks for that. Unfortunately, I've had some difficult to setup this test environment.
>
> Thoughts?
If you do not write what kind of difficulties you had, It's hard to have more thoughts. But I recommend to follow the steps in https://github.com/apache/airflow/blob/main/CONTRIBUTORS_QUICK_START.rst to setup breeze and then the docs pointed at by @jedcunningham provide a step-by-step (almost wizard-like) approach to make it works.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "eladkal (via GitHub)" <gi...@apache.org>.
eladkal merged PR #35130:
URL: https://github.com/apache/airflow/pull/35130
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "hakuno (via GitHub)" <gi...@apache.org>.
hakuno commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1777184228
@jedcunningham Thanks for that. Unfortunately, I've had some difficult to setup this test environment.
Thoughts?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "jedcunningham (via GitHub)" <gi...@apache.org>.
jedcunningham commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1776502995
@hakuno, Hussein meant a test in our test suit to make sure this doesn't regress. [Here is an example](https://github.com/apache/airflow/blob/f457228bd21b13a7fdb29e154202d5357d024050/helm_tests/airflow_core/test_worker.py#L50-L65), and [some details on our helm chart tests](https://github.com/apache/airflow/blob/main/TESTING.rst#helm-unit-tests).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "hakuno (via GitHub)" <gi...@apache.org>.
hakuno commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1776255255
Thank you so, @hussein-awala
I tested the Helm Chart itself with `--validate` on a Kubernetes cluster to verify the generated manifests.
```
$ helm template --dry-run --validate --set workers.safeToEvict=false .
```
We got
```
# Source: airflow/templates/workers/worker-deployment.yaml
################################
## Airflow Worker Deployment
#################################
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: release-name-worker
labels:
tier: airflow
component: worker
release: release-name
chart: "airflow-1.12.0-dev"
heritage: Helm
spec:
serviceName: release-name-worker
replicas: 1
selector:
matchLabels:
tier: airflow
component: worker
release: release-name
template:
metadata:
labels:
tier: airflow
component: worker
release: release-name
annotations:
checksum/metadata-secret: 8b8ce685079b3075a4b91c47e267db7b50cd8bfda2269dd36fef1e258a3a38eb
checksum/result-backend-secret: 98a68f230007cfa8f5d3792e1aff843a76b0686409e4a46ab2f092f6865a1b71
checksum/pgbouncer-config-secret: 1dae2adc757473469686d37449d076b0c82404f61413b58ae68b3c5e99527688
checksum/webserver-secret-key: 1b48a02846657f40ac54dc50b230572e67e9f4a88e9a486e63bd25209173bfc4
checksum/kerberos-keytab: 80979996aa3c1f48c95dfbe9bb27191e71f12442a08c0ed834413da9d430fd0e
checksum/airflow-config: 520107ea8f2b203f84baed2ba60545a7afc8fc9a9f6978aa7cd45c4dba7c8dbe
checksum/extra-configmaps: e862ea47e13e634cf17d476323784fa27dac20015550c230953b526182f5cac8
checksum/extra-secrets: e9582fdd622296c976cbc10a5ba7d6702c28a24fe80795ea5b84ba443a56c827
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
...
```
And
```
$ helm template --dry-run --validate --set workers.safeToEvict=true .
```
We got
```
# Source: airflow/templates/workers/worker-deployment.yaml
################################
## Airflow Worker Deployment
#################################
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: release-name-worker
labels:
tier: airflow
component: worker
release: release-name
chart: "airflow-1.12.0-dev"
heritage: Helm
spec:
serviceName: release-name-worker
replicas: 1
selector:
matchLabels:
tier: airflow
component: worker
release: release-name
template:
metadata:
labels:
tier: airflow
component: worker
release: release-name
annotations:
checksum/metadata-secret: 8b8ce685079b3075a4b91c47e267db7b50cd8bfda2269dd36fef1e258a3a38eb
checksum/result-backend-secret: 98a68f230007cfa8f5d3792e1aff843a76b0686409e4a46ab2f092f6865a1b71
checksum/pgbouncer-config-secret: 1dae2adc757473469686d37449d076b0c82404f61413b58ae68b3c5e99527688
checksum/webserver-secret-key: c20aeee646c14a8ab31991c17de268a577d34f05d4a1ae045725bf974f999c69
checksum/kerberos-keytab: 80979996aa3c1f48c95dfbe9bb27191e71f12442a08c0ed834413da9d430fd0e
checksum/airflow-config: 520107ea8f2b203f84baed2ba60545a7afc8fc9a9f6978aa7cd45c4dba7c8dbe
checksum/extra-configmaps: e862ea47e13e634cf17d476323784fa27dac20015550c230953b526182f5cac8
checksum/extra-secrets: e9582fdd622296c976cbc10a5ba7d6702c28a24fe80795ea5b84ba443a56c827
cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
...
```
And default values is same than `workers.safeToEvict=true` as expected.
Please, tell me if you know more ways to check it out.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1775330488
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:
- Pay attention to the quality of your code (ruff, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
- In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
- Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
- Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
- Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
- Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
- Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "hakuno (via GitHub)" <gi...@apache.org>.
hakuno commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1777258718
Thanks, @potiuk - I meant the environment setup itself. My Pytest couldn't discovery tests etc.
I'll read that docs as soon as possible... thanks again :)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "hakuno (via GitHub)" <gi...@apache.org>.
hakuno commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1778309120
@jedcunningham @hussein-awala @potiuk I got it.
```
pytest helm_tests/airflow_core/test_worker.py::TestWorker::test_safetoevict_annotations
============================================================================================ test session starts ============================================================================================
platform linux -- Python 3.8.18, pytest-7.4.2, pluggy-1.3.0 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow
configfile: pyproject.toml
plugins: timeouts-1.2.1, xdist-3.3.1, capture-warnings-0.0.4, instafail-0.5.0, mock-3.12.0, asyncio-0.21.1, requests-mock-1.11.0, time-machine-2.13.0, cov-4.1.0, rerunfailures-12.0, anyio-4.0.0, httpx-0.21.3
asyncio: mode=strict
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 2 items
helm_tests/airflow_core/test_worker.py::TestWorker::test_safetoevict_annotations[true-True] PASSED [ 50%]
helm_tests/airflow_core/test_worker.py::TestWorker::test_safetoevict_annotations[false-False] PASSED [100%]
============================================================================================ 2 passed in 16.65s =============================================================================================
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [PR] fix(helm): set worker safeToEvict properly [airflow]
Posted by "boring-cyborg[bot] (via GitHub)" <gi...@apache.org>.
boring-cyborg[bot] commented on PR #35130:
URL: https://github.com/apache/airflow/pull/35130#issuecomment-1783266148
Awesome work, congrats on your first merged pull request! You are invited to check our [Issue Tracker](https://github.com/apache/airflow/issues) for additional contributions.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org