You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/11/05 17:16:34 UTC
[GitHub] [airflow] ddelange opened a new pull request #6506: [AIRFLOW-XXX]
Fix unpacking of resources
ddelange opened a new pull request #6506: [AIRFLOW-XXX] Fix unpacking of resources
URL: https://github.com/apache/airflow/pull/6506
Make sure you have checked _all_ steps below.
### Jira
- [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
- https://issues.apache.org/jira/browse/AIRFLOW-XXX
- In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
- In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
- In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
### Description
When trying to initialize a KubernetesPodOperator with some RAM and CPU limits, I followed [Kubernetes docs](https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#what-if-you-specify-a-container-s-limit-but-not-its-request), similar to and added a dict of form similar to https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/tests/kubernetes/test_pod_generator.py#L92-L101:
```
resources = {
"limits": {"ram": f"2000Mi", "cpus": "1.0"},
"requests": {"ram": f"2000Mi", "cpus": "1.0"},
}
```
However, this structure seems to not properly propagate, so I tried to make the `resources` dict compatible with the BaseOperator inherited by the KubernetesPodOperator, and found this bug in the source code when using a simplified `resources` dict compatible with the BaseOperator:
```
> /usr/local/lib/python3.7/site-packages/airflow/models/baseoperator.py(383)__init__()
382 import ipdb;ipdb.set_trace()
--> 383 self.resources = Resources(*resources) if resources is not None else None
384 self.run_as_user = run_as_user
ipdb> resources
{'cpus': 1.0, 'ram': 2000}
ipdb> Resources(*resources)
*** TypeError: '<' not supported between instances of 'str' and 'int'
ipdb> Resources(**resources)
{'cpus': {'_name': 'CPU', '_units_str': 'core(s)', '_qty': 1.0}, 'ram': {'_name': 'RAM', '_units_str': 'MB', '_qty': 2000}, 'disk': {'_name': 'Disk', '_units_str': 'MB', '_qty': 512}, 'gpus': {'_name': 'GPU', '_units_str': 'gpu(s)', '_qty': 0}}
ipdb>
```
I have a feeling this is the wrong kwarg to limit the resources of the Pod that the KubernetesPodOperator will spawn, since this code is not used at all: https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/airflow/kubernetes/pod.py#L28
Instead, it simply uses https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/airflow/models/baseoperator.py#L397
If someone can tell me how to properly set these resource limits for a KubernetesPod, please let me know.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services