You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/25 18:20:05 UTC

[GitHub] [airflow] jedcunningham opened a new pull request, #24658: Add `airflow_kpo_in_cluster` label to KPO pods

jedcunningham opened a new pull request, #24658:
URL: https://github.com/apache/airflow/pull/24658

   This allows one to determine if the pod was created with in_cluster config
   or not, both on the k8s side and in pod_mutation_hooks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ianbuss commented on pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
ianbuss commented on PR #24658:
URL: https://github.com/apache/airflow/pull/24658#issuecomment-1167092658

   Just thinking out loud, would it make sense to (also?) add the full task spec as json to a KPO label to allow maximum future flexibility in the pod mutation hook to make mutation decisions. I realise the `in_cluster` here is more complicated than if the user specifies `in_cluster=True|False` on the task as it looks for global Airflow configs and also connection config but there may be things in the future that we haven't considered that users might want to use in their mutation logic.
   
   The alternative would be to extend the `pod_mutation_hook` function signature to accept a `KubernetesPodOperator` argument but not sure how/if this could be done in a backwards-compatible way.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #24658:
URL: https://github.com/apache/airflow/pull/24658#issuecomment-1166482679

   You will need to rebase @jedcunningham to account for selective check problem from #24665 just merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on a diff in pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on code in PR #24658:
URL: https://github.com/apache/airflow/pull/24658#discussion_r906941092


##########
airflow/providers/cncf/kubernetes/hooks/kubernetes.py:
##########
@@ -127,6 +127,9 @@ def __init__(
         self.disable_verify_ssl = disable_verify_ssl
         self.disable_tcp_keepalive = disable_tcp_keepalive
 
+        # Expose whether the hook is configured to use incluster_config or not
+        self.is_in_cluster: Optional[bool] = None

Review Comment:
   That's a good idea, thanks 🍺



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on a diff in pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on code in PR #24658:
URL: https://github.com/apache/airflow/pull/24658#discussion_r907680231


##########
airflow/providers/cncf/kubernetes/hooks/kubernetes.py:
##########
@@ -273,6 +279,14 @@ def _get_default_client(self, *, cluster_context=None):
             )
         return client.ApiClient()
 
+    @property
+    def is_in_cluster(self):
+        """Expose whether the hook is configured with incluster_config or not"""
+        if self._is_in_cluster is not None:
+            return self._is_in_cluster
+        self.get_conn()  # so we can determine if we are in_cluster or not

Review Comment:
   Works for me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #24658:
URL: https://github.com/apache/airflow/pull/24658#discussion_r907523546


##########
airflow/providers/cncf/kubernetes/hooks/kubernetes.py:
##########
@@ -273,6 +279,14 @@ def _get_default_client(self, *, cluster_context=None):
             )
         return client.ApiClient()
 
+    @property
+    def is_in_cluster(self):
+        """Expose whether the hook is configured with incluster_config or not"""
+        if self._is_in_cluster is not None:
+            return self._is_in_cluster
+        self.get_conn()  # so we can determine if we are in_cluster or not

Review Comment:
   since get_conn isn't cached it seems better to call self.api_client



##########
airflow/providers/cncf/kubernetes/hooks/kubernetes.py:
##########
@@ -232,9 +234,12 @@ def get_conn(self) -> Any:
 
         if in_cluster:
             self.log.debug("loading kube_config from: in_cluster configuration")
+            self._is_in_cluster = True
             config.load_incluster_config()
             return client.ApiClient()
 
+        self._is_in_cluster = False

Review Comment:
   it's a tiny bit odd to  set it false  here when  you might set it to  true again later. 
   
   even though it's more code, it would probably be a tiny  bit better to set this when we know, i.e. closer to the return  statements.  e.g. right before  250 and 262 and conditionally in _get_default_client
   
   until we _know_, it's probably  better to leave it  not set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on a diff in pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on code in PR #24658:
URL: https://github.com/apache/airflow/pull/24658#discussion_r907680883


##########
airflow/providers/cncf/kubernetes/hooks/kubernetes.py:
##########
@@ -232,9 +234,12 @@ def get_conn(self) -> Any:
 
         if in_cluster:
             self.log.debug("loading kube_config from: in_cluster configuration")
+            self._is_in_cluster = True
             config.load_incluster_config()
             return client.ApiClient()
 
+        self._is_in_cluster = False

Review Comment:
   I don't love the sprawl, but it does make it easier to test as well. Done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] github-actions[bot] commented on pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #24658:
URL: https://github.com/apache/airflow/pull/24658#issuecomment-1167731655

   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] ianbuss commented on pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
ianbuss commented on PR #24658:
URL: https://github.com/apache/airflow/pull/24658#issuecomment-1168416272

   > 
   
   Yes, I agree @jedcunningham. Regardless of whether we considered adding the task spec as a new label or pass it as a parameter (which I think might be useful), I think it makes sense to always add the `airflow_kpo_in_cluster` as implemented here. Passing the task information could potentially be implemented as a follow up PR after some additional discussion?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham merged pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
jedcunningham merged PR #24658:
URL: https://github.com/apache/airflow/pull/24658


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] dstandish commented on a diff in pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
dstandish commented on code in PR #24658:
URL: https://github.com/apache/airflow/pull/24658#discussion_r906718644


##########
airflow/providers/cncf/kubernetes/hooks/kubernetes.py:
##########
@@ -127,6 +127,9 @@ def __init__(
         self.disable_verify_ssl = disable_verify_ssl
         self.disable_tcp_keepalive = disable_tcp_keepalive
 
+        # Expose whether the hook is configured to use incluster_config or not
+        self.is_in_cluster: Optional[bool] = None

Review Comment:
   So... maybe what you should do is this
   
   ```
   @property
   def is_in_cluster(self):
       self.client
       return self._is_in_cluster
   ```
   
   that way it will always tell the truth....
   
   sorta like we arrived at with KPO



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24658:
URL: https://github.com/apache/airflow/pull/24658#issuecomment-1167763772

   @ianbuss, I'd be more inclined to also send the task to the `pod_mutation_hook`, and it could be done in a backwards compatible way I believe.
   
   That said, I think this label still has value regardless. Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] jedcunningham commented on pull request #24658: Add `airflow_kpo_in_cluster` label to KPO pods

Posted by GitBox <gi...@apache.org>.
jedcunningham commented on PR #24658:
URL: https://github.com/apache/airflow/pull/24658#issuecomment-1167778902

   Actually, thinking a little more, it's not terribly hard to get the TI/task from the existing labels on the pod. I wonder if anyone is already doing that now. It could be worth adding to make it easier still.
   
   However, you nailed it that knowing whether `load_incluster_config` was used or not gets complicated because the connection is involved. This flag makes it easy on the user side of things.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org