Posted to commits@airflow.apache.org by "Guillaume Onfroy (Jira)" <ji...@apache.org> on 2020/02/19 15:37:00 UTC

[jira] [Commented] (AIRFLOW-6810) KubernetesPodOperator pod is completed but xcom side car is stuck

    [ https://issues.apache.org/jira/browse/AIRFLOW-6810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040171#comment-17040171 ] 

Guillaume Onfroy commented on AIRFLOW-6810:
-------------------------------------------

I can confirm this issue. It has been around for quite some time and is quite annoying.

It is identical to [https://stackoverflow.com/questions/54388441/kubernetes-pod-created-through-airflow-remains-in-running-state]

> KubernetesPodOperator pod is completed but xcom side car is stuck
> -----------------------------------------------------------------
>
>                 Key: AIRFLOW-6810
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6810
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executor-kubernetes
>    Affects Versions: 1.10.6
>            Reporter: Maxence Cramet
>            Assignee: Daniel Imberman
>            Priority: Major
>
> We're using KubernetesPodOperator with the parameter xcom_push=True in order to push information from our task.
> From time to time the main container completes but the sidecar container stays stuck in the Running state, so the pod never completes.
> Here is the output of kubectl describe pod:
> {noformat}
> kubectl describe pod my_pod
> Name:               my_pod
> Namespace:          default
> Priority:           0
> PriorityClassName:  <none>
> Node:               xxx
> Start Time:         Wed, 05 Feb 2020 11:12:33 +0000
> Labels:             xxx
> Annotations:        xxx
> Status:             Running
> IP:                 xxx
> Containers:
>   base:
>     Container ID:  xxx
>     Image:         xxx
>     Image ID:      xxx
>     Port:          <none>
>     Host Port:     <none>
>     Args:
>       xxx
>     State:          Terminated
>       Reason:       Completed
>       Exit Code:    0
>       Started:      Wed, 05 Feb 2020 11:12:38 +0000
>       Finished:     Wed, 05 Feb 2020 11:12:47 +0000
>     Ready:          False
>     Restart Count:  0
>     Limits:
>       memory:  512Mi
>     Requests:
>       memory:  512Mi
>     Environment:
>       xxx
>     Mounts:
>       /airflow/xcom from xcom (rw)
>   airflow-xcom-sidecar:
>     Container ID:  docker://83053d7d292cda9156454ac13064d72ace1e4f72738ba9b62b04ff57cb7966cc
>     Image:         alpine
>     Image ID:      docker-pullable://alpine@sha256:ab00606a42621fb68f2ed6ad3c88be54397f981a7b70a79db3d1172b11c4367d
>     Port:          <none>
>     Host Port:     <none>
>     Command:
>       sh
>       -c
>       trap "exit 0" INT; while true; do sleep 30; done;
>     State:          Running
>       Started:      Wed, 05 Feb 2020 11:12:40 +0000
>     Ready:          True
>     Restart Count:  0
>     Limits:
>       memory:  4Gi
>     Requests:
>       cpu:        1m
>       memory:     2Gi
>     Environment:  <none>
>     Mounts:
>       /airflow/xcom from xcom (rw)
>       xxx
> Conditions:
>   Type              Status
>   Initialized       True 
>   Ready             False 
>   ContainersReady   False 
>   PodScheduled      True 
> Volumes:
>   xcom:
>     Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
>     Medium:     
>     SizeLimit:  <unset>
>   xxx
> QoS Class:       Burstable
> Node-Selectors:  <none>
> Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
>                  node.kubernetes.io/unreachable:NoExecute for 300s
> Events:          <none>{noformat}
> I don't have any more information about the possible causes.
>  
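
For anyone trying to spot affected pods, the state shown in the describe output above (the base container Terminated while airflow-xcom-sidecar is still Running) can be checked with a small helper. This is only a sketch under stated assumptions: the function name sidecar_is_stuck and the plain-dict input are hypothetical and not part of Airflow or the Kubernetes client; the container names and state strings are the ones from the report.

```python
from typing import Dict

# Container names as they appear in the `kubectl describe pod` output above.
BASE_CONTAINER = "base"
SIDECAR_CONTAINER = "airflow-xcom-sidecar"


def sidecar_is_stuck(container_states: Dict[str, str]) -> bool:
    """Hypothetical helper: True when the base container has terminated
    but the xcom sidecar container is still running."""
    base = container_states.get(BASE_CONTAINER)
    sidecar = container_states.get(SIDECAR_CONTAINER)
    return base == "Terminated" and sidecar == "Running"


# Container states copied from the describe output in this report.
reported = {"base": "Terminated", "airflow-xcom-sidecar": "Running"}
print(sidecar_is_stuck(reported))
```

The same combination of states probably also occurs briefly on a healthy run, between the main container finishing and the sidecar being stopped, so a real check would likely only flag pods where it persists.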



--
This message was sent by Atlassian Jira
(v8.3.4#803005)