You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/11/05 17:16:34 UTC

[GitHub] [airflow] ddelange opened a new pull request #6506: [AIRFLOW-XXX] Fix unpacking of resources

ddelange opened a new pull request #6506: [AIRFLOW-XXX] Fix unpacking of resources
URL: https://github.com/apache/airflow/pull/6506
 
 
   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Airflow Jira](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
     - https://issues.apache.org/jira/browse/AIRFLOW-XXX
     - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a Jira issue.
     - In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)).
     - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Description
   
   When trying to initialize a KubernetesPodOperator with some RAM and CPU limits, I followed [Kubernetes docs](https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#what-if-you-specify-a-container-s-limit-but-not-its-request), similar to  and added a dict of form similar to https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/tests/kubernetes/test_pod_generator.py#L92-L101:
   ```
   resources = {
       "limits": {"ram": f"2000Mi", "cpus": "1.0"},
       "requests": {"ram": f"2000Mi", "cpus": "1.0"},
   }
   ```
   However, this structure seems to not properly propagate, so I tried to make the `resources` dict compatible with the BaseOperator inherited by the KubernetesPodOperator, and found this bug in the source code when using a simplified `resources` dict compatible with the BaseOperator:
   ```
   > /usr/local/lib/python3.7/site-packages/airflow/models/baseoperator.py(383)__init__()
       382         import ipdb;ipdb.set_trace()
   --> 383         self.resources = Resources(*resources) if resources is not None else None
       384         self.run_as_user = run_as_user
   
   ipdb> resources
   {'cpus': 1.0, 'ram': 2000}
   ipdb> Resources(*resources)
   *** TypeError: '<' not supported between instances of 'str' and 'int'
   ipdb> Resources(**resources)
   {'cpus': {'_name': 'CPU', '_units_str': 'core(s)', '_qty': 1.0}, 'ram': {'_name': 'RAM', '_units_str': 'MB', '_qty': 2000}, 'disk': {'_name': 'Disk', '_units_str': 'MB', '_qty': 512}, 'gpus': {'_name': 'GPU', '_units_str': 'gpu(s)', '_qty': 0}}
   ipdb>
   ```
   
   I have a feeling this is the wrong kwarg to limit the resources of the Pod that the KubernetesPodOperator will spawn, since this code is not used at all: https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/airflow/kubernetes/pod.py#L28
   Instead, it simply uses https://github.com/apache/airflow/blob/5096fa3e48b4f79a29e4ad430e426c8bc9462449/airflow/models/baseoperator.py#L397
   
   If someone can tell me how to properly set these resource limits for a KubernetesPod, please let me know.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services