You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/09 08:57:00 UTC

[GitHub] [airflow] lostkamp opened a new pull request, #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

lostkamp opened a new pull request, #24342:
URL: https://github.com/apache/airflow/pull/24342

   closes: #22298 
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragement file, named `{pr_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vanchaxy commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
vanchaxy commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1159719089

   > can you clarify your concern, perhaps with an example?
   
   @dstandish after this change the task like this will never work:
   ```
   @task(executor_config="pod_override": k8s.V1Pod(metadata=k8s.V1ObjectMeta(labels={"airflow-worker": "wrong-value"})))
   def start_task():
       print()
   ```
   
   It's now only about `airflow-worker`: specifying any label/annotation that is defined in the dynamic pod can break the executor (dag_id/task_id/try_number/run_id/map_index/kubernetes_executor/etc). Basically, we want to allow only overriding of namespace/image but the suggested approach allows us to override any value. My suggestion is to split `dynamic_pod` into optional (allowed to override) and required (denied to override):
   ```
   optional_dynamic_pod = k8s.V1Pod(
       metadata=k8s.V1ObjectMeta(namespace=namespace),
       spec=k8s.V1PodSpec(containers=[k8s.V1Container(image=image)]),
   )
   
   required_dynamic_pod = k8s.V1Pod(
       metadata=k8s.V1ObjectMeta(
           annotations=annotations,
           name=PodGenerator.make_unique_pod_id(pod_id),
           labels=labels,
       ),
       spec=k8s.V1PodSpec(
           containers=[
               k8s.V1Container(
                   name="base",
                   args=args,
                   env=[k8s.V1EnvVar(name="AIRFLOW_IS_K8S_EXECUTOR_POD", value="True")],
               ),
           ],
       ),
   )
   
   pod_list = [base_worker_pod, optional_dynamic_pod, pod_override_object, required_dynamic_pod]
   ```
   
   Also, regardless of implementation, this part of the function should be removed as it's no longer needed. 
   ```
           try:
               image = pod_override_object.spec.containers[0].image  # type: ignore
               if not image:
                   image = kube_image
           except Exception:
               image = kube_image
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] XD-DENG commented on pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
XD-DENG commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1173092750

   > > Was permission considered? Say if the scheduler pod would have the permission to launch pods in the namespace provided by user? If not, how will this be handled properly?
   > 
   > I did not consider this, because I think it is the user's responsibility to set the correct permissions. Does Airflow do permission checks in case the namespace is not overridden?
   
   Agreed that it's the user's responsibility to set the correct permissions.
   
   But if the user forgot to do that or has made any human error (say typo in namespace name, etc.), how will the error be handled in a way so that it's obvious/friendly to the users?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] lostkamp commented on pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
lostkamp commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1173085810

   > Was permission considered? Say if the scheduler pod would have the permission to launch pods in the namespace provided by user? If not, how will this be handled properly?
   
   I did not consider this, because I think it is the user's responsibility to set the correct permissions. Does Airflow do permission checks in case the namespace is not overridden?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1160630013

   looks like you might've had a rebase mishap @lostkamp 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1159727523

   OK thanks for that @vanchaxy 
   
   So I see the issue is we want to protect users from overriding something with `pod_override` that would break k8s executor.  Looking this through, there's already some logic to respect the image from pod override if one is provided.  We could do the same with namespace. labels and annotations are already additive so maybe nothing necessary there.  why not just do the same for namespace that we are doing for image?  any concerns with that solution?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1159813861

   great let's see what @lostkamp has to say.  given that pretty much everything else in "dynamic_pod" needs to be there, or can be additively merged, i don't really see the need to construct two pods; we can just have these two special case carveouts, which to me seems the simpler approach. though i'm not super strongly opinionated either way. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] lostkamp commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
lostkamp commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1160587930

   Hi,
   thanks a lot for your input so far. My first thought is that adding a try/except for the namespace makes the code easier to read than creating two different dynamic pods, so I would go with that


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1176447653

   Hello @lostkamp  - I needed to revert this one because it caused main failure on Kubernetes tests. Can you please re-opoen this one (re-apply it again on main so that:
   
   a) we can see why it did not trigger the tests before (our tests might be a bit too selective and I want to get it fixed) 
   b) you can fix the tests
   
   Sorry for that but we needed to unblock other failing PRs
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk merged pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
potiuk merged PR #24342:
URL: https://github.com/apache/airflow/pull/24342


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vanchaxy commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
vanchaxy commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1152813887

   Hi, I also encountered this bug. For now, I overcome it with pod_hook. 
   
   From what I see, at the start of the construct_pod (line 339) we check for the image in pod_override_object and use it in kube_image.
   This should be removed if we change the order of pods as you did in this PR or the namespace should be overwritten in a similar way.
   
   I see the problem that after this PR users of airflow can overwrite some important annotations/labels which airflow use to operate and this will break executor work.
   
   My suggestion is:
   1) Remove image overwriting at the start of the function (try... except block)
   2) Split dynamic_pod into 2 different pods.  One with namespace and kube_image. Another with annotations/labels/name/args/env.
   3) Change pod_list order to pod_template_file -> pod with namespace and image ->  pod from executor_config -> pod from from dynamic arguments (annotations/labels/...).
   
   After this change, we will have two dynamic pods: one with values that we allow users to overwrite and another with values that should not be overwritten by users.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] lostkamp commented on pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
lostkamp commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1164512649

   > looks like you might've had a rebase mishap @lostkamp
   
   huh.. i'm not sure what happened, but clicking on "fetch upstream" in this branch resolved it apparently.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] lostkamp commented on pull request #24342: KubernetesExecutor: Use user-provided namespace from executor_config arg

Posted by GitBox <gi...@apache.org>.
lostkamp commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1175780443

   I don't know, I guess at the moment it won't? But this is a larger issue that also applies to the default configured namespace in airflow.cfg right? So isn't this rather a separate issue / PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
dstandish commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1159227636

   > I see the problem that after this PR users of airflow can overwrite some important annotations/labels which airflow use to operate and this will break executor work.
   
   @vanchaxy can you clarify your concern, perhaps with an example?  This approach seems reasonable to me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] vanchaxy commented on pull request #24342: KubernetesExecutor: Always give precedence to pod from executor_config arg

Posted by GitBox <gi...@apache.org>.
vanchaxy commented on PR #24342:
URL: https://github.com/apache/airflow/pull/24342#issuecomment-1159747679

   @dstandish no concerns with that, I wrote about this in the first message:
   
   > or the namespace should be overwritten in a similar way
   
   For me personally, solution with two pods just looks prettier/easier to understand than try/except with `pod_override_object.spec.containers[0].image` and try/except `pod_override_object.metadata.namespace`. But both solution does exactly the same, so we can go with any.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org