Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/05/05 11:14:55 UTC

[GitHub] [airflow] Jaxing opened a new issue #15670: Pod overrides not working

Jaxing opened a new issue #15670:
URL: https://github.com/apache/airflow/issues/15670


   **Apache Airflow version**:
   2.0.1
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   1.17
   **Environment**:
   
   - **Cloud provider or hardware configuration**: GKE, some CPU some GPU nodes
   
   **What happened**:
   I'm having trouble using `executor_config` in the Python operator when running with the Kubernetes executor. I have tried both the `pod_template_file` option (to point to a specific file) and the `pod_override` option to change the pod spec, among other things to set specific resource limits and requests (a requirement in our namespace). However, when I deploy it and start the task, I get an error in the scheduler saying that the worker cannot be created because I have not specified resource limits and requests. The new pod spec contained resources (and other necessary changes, e.g. node selectors) for finding a node with a GPU and allocating a GPU. I could see that when the scheduler scheduled the worker, it used the correct GPU `pod_template_file`.
   One thing I did to debug was to use the `pod_template_file` with the GPU spec as the default pod template (by setting it in `airflow.cfg`). This works, which means that there is no issue with the `pod_template_file` itself.
   **What you expected to happen**:
   The pod overrides should be applied to the worker pod; it seems they are not applied correctly.
   **How to reproduce it**:
   Use `pod_template_file` or `pod_override` in a cluster or namespace that requires certain resource limits.
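
To make the expected behaviour concrete, here is a minimal sketch of what "applying an override" would mean for a single container: fields set in the override win, everything else is kept from the template. This is a simplified illustration using plain dicts, not Airflow's actual merge code; `merge_container`, the image tag, and the resource values are hypothetical.

```python
import copy


def merge_container(base, override):
    """Return a copy of `base` with every key set in `override` applied on top."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested mappings (e.g. "resources") key by key.
            merged[key] = merge_container(merged[key], value)
        elif value is not None:
            merged[key] = copy.deepcopy(value)
    return merged


# Container as it might come from the default pod template (hypothetical values).
template_container = {
    "name": "base",
    "image": "apache/airflow:2.0.1",
    "resources": {},
}

# Container as it might be described in the override (hypothetical values).
override_container = {
    "name": "base",
    "resources": {
        "requests": {"cpu": "1", "memory": "2Gi"},
        "limits": {"cpu": "1", "memory": "2Gi", "nvidia.com/gpu": "1"},
    },
}

expected_container = merge_container(template_container, override_container)
```

With this reading, `expected_container` keeps the template's image but carries the override's requests and limits, which is what a namespace that enforces resource limits would need to see.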
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bpleines commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
bpleines commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-846125695


   Thanks for the example @kaxil. Understood if you want to keep the discussion for `2.0+` only, but I wanted to note that I don't think this `pod_override` strategy works for `1.10.15`. I can see the scheduler pulling in the `pod_override` (logs attached), but the resource overrides are not applied to the resulting worker pod's container.

   ```python
   from datetime import timedelta
   import time

   from airflow import DAG
   from airflow.operators.python_operator import PythonOperator
   from kubernetes.client import models as k8s

   from dag_utils import utils  # project-local helper module


   def test():
       print('Sleeping for 60')
       time.sleep(60)


   with DAG(
       'kubernetes_pod_override_example',
       default_args=utils.get_default_args(),
       description='Kubernetes pod_override Example',
       schedule_interval=timedelta(minutes=2),
       max_active_runs=1,
   ) as dag:
       # Ask for 4 CPUs and 8Gi of memory on the task's "base" container.
       pod_override = PythonOperator(
           task_id='pod_override',
           python_callable=test,
           executor_config={
               "pod_override": k8s.V1PodSpec(
                   containers=[
                       k8s.V1Container(
                           name="base",
                           resources=k8s.V1ResourceRequirements(
                               requests={'cpu': '4', 'memory': '8Gi'},
                               limits={'cpu': '4', 'memory': '8Gi'},
                           ),
                       )
                   ]
               )
           },
       )
   ```
   [scheduler.log](https://github.com/apache/airflow/files/6524193/scheduler.log)
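
One way to confirm whether the requested resources reached the worker pod is to compare the pod's manifest (for example, the JSON printed by `kubectl get pod <name> -o json`) with what the DAG asked for. The sketch below does that on a parsed manifest; the helper names and the sample manifest are hypothetical, and the comparison is a naive string match that does not normalize Kubernetes quantities.

```python
import json


def container_resources(pod_manifest, container_name="base"):
    """Return the `resources` mapping of the named container in a pod manifest."""
    for container in pod_manifest.get("spec", {}).get("containers", []):
        if container.get("name") == container_name:
            return container.get("resources") or {}
    raise KeyError(f"no container named {container_name!r} in pod manifest")


def missing_overrides(actual, expected):
    """List (section, resource, expected, actual) tuples where the values differ."""
    problems = []
    for section in ("requests", "limits"):
        for name, wanted in expected.get(section, {}).items():
            found = actual.get(section, {}).get(name)
            if found != wanted:
                problems.append((section, name, wanted, found))
    return problems


# Hypothetical manifest of a worker pod that only received the template defaults.
sample_manifest = json.loads("""
{
  "spec": {
    "containers": [
      {
        "name": "base",
        "resources": {
          "requests": {"cpu": "1", "memory": "1Gi"},
          "limits": {"cpu": "1", "memory": "1Gi"}
        }
      }
    ]
  }
}
""")

# The values requested in the DAG above.
requested = {
    "requests": {"cpu": "4", "memory": "8Gi"},
    "limits": {"cpu": "4", "memory": "8Gi"},
}

problems = missing_overrides(container_resources(sample_manifest), requested)
for section, name, wanted, found in problems:
    print(f"{section}.{name}: expected {wanted}, found {found}")
```

An empty `problems` list would mean every requested value is present on the container; any entries point at the fields where the override was dropped.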





[GitHub] [airflow] CatarinaSilva commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
CatarinaSilva commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-845883145


   Hi @kaxil, I am also experiencing this issue in 2.0.2, although I was using the old overrides:
   
   ```
           "KubernetesExecutor": {
               "request_cpu": "250m",
               "request_memory": "500Mi",
               "limit_memory": "1Gi",
               "request_ephemeral_storage": "1Gi"
           }
   ```
   
   It only breaks for resource overrides; volumes, for example, work fine.

   I can try the new format, but I'm struggling to find good docs on how to convert from the previous overrides to the new ones. Can you point me to something?
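
As a rough sketch of the conversion, the legacy resource keys could be translated into the `requests`/`limits` layout that Kubernetes uses. The mapping table and the function name `legacy_resources_to_k8s` are assumptions made for illustration, not an Airflow API; only the Kubernetes resource names (`cpu`, `memory`, `ephemeral-storage`) are standard.

```python
# Assumed mapping from legacy executor_config keys to (section, resource name).
LEGACY_TO_K8S = {
    "request_cpu": ("requests", "cpu"),
    "request_memory": ("requests", "memory"),
    "request_ephemeral_storage": ("requests", "ephemeral-storage"),
    "limit_cpu": ("limits", "cpu"),
    "limit_memory": ("limits", "memory"),
    "limit_ephemeral_storage": ("limits", "ephemeral-storage"),
}


def legacy_resources_to_k8s(legacy):
    """Translate legacy resource keys into a {"requests": ..., "limits": ...} dict."""
    resources = {"requests": {}, "limits": {}}
    for key, value in legacy.items():
        if key in LEGACY_TO_K8S:
            section, name = LEGACY_TO_K8S[key]
            resources[section][name] = value
    return resources


# The legacy override from this comment.
legacy = {
    "request_cpu": "250m",
    "request_memory": "500Mi",
    "limit_memory": "1Gi",
    "request_ephemeral_storage": "1Gi",
}

converted = legacy_resources_to_k8s(legacy)
```

The two resulting mappings could then be passed as `k8s.V1ResourceRequirements(requests=converted["requests"], limits=converted["limits"])` on the container named `base`, inside a `k8s.V1Pod` given as `pod_override`, in the same shape as the `pod_override` example elsewhere in this thread.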
   
   





[GitHub] [airflow] CatarinaSilva commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
CatarinaSilva commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-846030371


   Hey @kaxil, I can confirm that the requests and limits work with the new format in 2.0.2.





[GitHub] [airflow] bpleines commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
bpleines commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-841862250


   I have the `pod_template_file` working, but I similarly have not been able to get the container `pod_override` working for resource requests/limits on Airflow `1.10.15`.





[GitHub] [airflow] boring-cyborg[bot] commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-832607036


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] kaxil commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-889548384


   Yup, you are right @Jaxing. Use the following, which should work in both 1.10.15 and 2.*. It will help with migration: once you upgrade to Airflow 2.0+, you can switch to using `pod_override`.
   
   ```python
   p1 = PythonOperator(
       task_id='example_test_task',
       dag=dag,
       python_callable=lambda: 1,
       executor_config={
           "KubernetesExecutor": {
               'resources': {
                   'limits': {'memory': '200Mi', 'cpu': '100m'},
                   'requests': {'memory': '100Mi', 'cpu': '100m'}
               }
           }
       }
   )
   ```


-- 
To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org






[GitHub] [airflow] kaxil commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-832773486


   Can you test it with 2.0.2 please and let us know if the problem persists? 





[GitHub] [airflow] CatarinaSilva commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
CatarinaSilva commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-845924350


   Thanks @kaxil, I'm currently testing with the new format and will let you know if it works.








[GitHub] [airflow] kaxil commented on issue #15670: Pod overrides not working

Posted by GitBox <gi...@apache.org>.
kaxil commented on issue #15670:
URL: https://github.com/apache/airflow/issues/15670#issuecomment-845896442


   Something like this should probably work:
   
   ```python
   sidecar_task = PythonOperator(
       task_id="task_with_sidecar",
       python_callable=test_sharedvolume_mount,
       executor_config={
           "pod_override": k8s.V1Pod(
               spec=k8s.V1PodSpec(
                   containers=[
                       k8s.V1Container(
                           name="base",
                           volume_mounts=[
                               k8s.V1VolumeMount(mount_path="/shared/", name="shared-empty-dir")
                           ],
                           resources=k8s.V1ResourceRequirements(
                               requests={'memory': '100Mi'},
                               limits={
                                   'memory': '200Mi',
                               },
                           ),
                       ),
                       k8s.V1Container(
                           name="sidecar",
                           image="ubuntu",
                           args=["echo \"retrieved from mount\" > /shared/test.txt"],
                           command=["bash", "-cx"],
                           volume_mounts=[
                               k8s.V1VolumeMount(mount_path="/shared/", name="shared-empty-dir")
                           ],
                       ),
                   ],
                   volumes=[
                       k8s.V1Volume(name="shared-empty-dir", empty_dir=k8s.V1EmptyDirVolumeSource()),
                   ],
               )
           ),
       },
   )
   ```

