Posted to commits@airflow.apache.org by "Guillaume Onfroy (Jira)" <ji...@apache.org> on 2020/02/19 15:37:00 UTC
[jira] [Commented] (AIRFLOW-6810) KubernetesPodOperator pod is completed but xcom side car is stuck
[ https://issues.apache.org/jira/browse/AIRFLOW-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040171#comment-17040171 ]
Guillaume Onfroy commented on AIRFLOW-6810:
-------------------------------------------
I can confirm this issue. It has been around for quite some time and is quite annoying.
It appears to be identical to [https://stackoverflow.com/questions/54388441/kubernetes-pod-created-through-airflow-remains-in-running-state]
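For anyone trying to spot affected pods, here is a minimal sketch, not anything Airflow ships. It assumes the pod has been dumped with `kubectl get pod my_pod -o json` and parsed into a dict, and that the container names match the ones in the issue (`base` and `airflow-xcom-sidecar`). The helper names and the sample dict are hypothetical.

```python
from typing import Any, Dict

# Container names as they appear in the pod description quoted in the issue.
BASE_CONTAINER = "base"
SIDECAR_CONTAINER = "airflow-xcom-sidecar"


def container_states(pod: Dict[str, Any]) -> Dict[str, str]:
    """Map each container name to 'running', 'terminated', 'waiting' or 'unknown'."""
    states: Dict[str, str] = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        # The `state` object is expected to carry one key naming the current state.
        state = status.get("state", {})
        states[status["name"]] = next(iter(state), "unknown")
    return states


def sidecar_is_stuck(pod: Dict[str, Any]) -> bool:
    """True when the base container has terminated but the sidecar still runs."""
    states = container_states(pod)
    return (
        states.get(BASE_CONTAINER) == "terminated"
        and states.get(SIDECAR_CONTAINER) == "running"
    )


# Hypothetical sample that mirrors the situation reported in the issue.
example_pod = {
    "status": {
        "phase": "Running",
        "containerStatuses": [
            {
                "name": "base",
                "state": {"terminated": {"reason": "Completed", "exitCode": 0}},
            },
            {
                "name": "airflow-xcom-sidecar",
                "state": {"running": {"startedAt": "2020-02-05T11:12:40Z"}},
            },
        ],
    }
}

if __name__ == "__main__":
    print(container_states(example_pod))
    print(sidecar_is_stuck(example_pod))
```

This only detects the symptom. It says nothing about why the sidecar was never told to exit.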
> KubernetesPodOperator pod is completed but xcom side car is stuck
> -----------------------------------------------------------------
>
> Key: AIRFLOW-6810
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6810
> Project: Apache Airflow
> Issue Type: Bug
> Components: executor-kubernetes
> Affects Versions: 1.10.6
> Reporter: Maxence Cramet
> Assignee: Daniel Imberman
> Priority: Major
>
> We're using KubernetesPodOperator with the parameter xcom_push=True in order to push information from our task.
> From time to time the main (base) container completes but the xcom sidecar container stays running, so the pod remains in the Running state.
> Here is the output of kubectl describe for the pod:
> {noformat}
> kubectl describe pod my_pod
> Name:               my_pod
> Namespace:          default
> Priority:           0
> PriorityClassName:  <none>
> Node:               xxx
> Start Time:         Wed, 05 Feb 2020 11:12:33 +0000
> Labels:             xxx
> Annotations:        xxx
> Status:             Running
> IP:                 xxx
> Containers:
>   base:
>     Container ID:   xxx
>     Image:          xxx
>     Image ID:       xxx
>     Port:           <none>
>     Host Port:      <none>
>     Args:
>       xxx
>     State:          Terminated
>       Reason:       Completed
>       Exit Code:    0
>       Started:      Wed, 05 Feb 2020 11:12:38 +0000
>       Finished:     Wed, 05 Feb 2020 11:12:47 +0000
>     Ready:          False
>     Restart Count:  0
>     Limits:
>       memory:  512Mi
>     Requests:
>       memory:  512Mi
>     Environment:
>       xxx
>     Mounts:
>       /airflow/xcom from xcom (rw)
>   airflow-xcom-sidecar:
>     Container ID:   docker://83053d7d292cda9156454ac13064d72ace1e4f72738ba9b62b04ff57cb7966cc
>     Image:          alpine
>     Image ID:       docker-pullable://alpine@sha256:ab00606a42621fb68f2ed6ad3c88be54397f981a7b70a79db3d1172b11c4367d
>     Port:           <none>
>     Host Port:      <none>
>     Command:
>       sh
>       -c
>       trap "exit 0" INT; while true; do sleep 30; done;
>     State:          Running
>       Started:      Wed, 05 Feb 2020 11:12:40 +0000
>     Ready:          True
>     Restart Count:  0
>     Limits:
>       memory:  4Gi
>     Requests:
>       cpu:        1m
>       memory:     2Gi
>     Environment:  <none>
>     Mounts:
>       /airflow/xcom from xcom (rw)
>       xxx
> Conditions:
>   Type              Status
>   Initialized       True
>   Ready             False
>   ContainersReady   False
>   PodScheduled      True
> Volumes:
>   xcom:
>     Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
>     Medium:
>     SizeLimit:  <unset>
>   xxx
> QoS Class:       Burstable
> Node-Selectors:  <none>
> Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
>                  node.kubernetes.io/unreachable:NoExecute for 300s
> Events:          <none>
> {noformat}
> I don't have any more information about the possible causes.
>