Posted to user-zh@flink.apache.org by yzxs <cz...@gmail.com> on 2021/01/08 04:12:21 UTC

flink1.12.0 native k8s fails to start

1. I submit the job with the following command:
./bin/flink run-application \
    --target kubernetes-application \
    -Dkubernetes.cluster-id=my-first-application-cluster \
    -Dkubernetes.container.image=registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1 \
    -Dkubernetes.container.image.pull-policy=Always \
    -Dkubernetes.container-start-command-template="%java% %classpath% %jvmmem% %jvmopts% %logging% %class% %args%" \
    local:///opt/flink/usrlib/WordCount.jar

2. After the job is submitted, the pod fails and keeps restarting. kubectl logs shows the following error:
/docker-entrypoint.sh: 125: exec: native-k8s: not found

3. I checked the docker-entrypoint.sh script in the image and it has no native-k8s command. The image is built on top of the latest flink image; the Dockerfile is as follows:
FROM flink:latest
RUN mkdir -p /opt/flink/usrlib
COPY ./WordCount.jar /opt/flink/usrlib/WordCount.jar

4. Output of kubectl describe for the pod:
Name:         my-first-application-cluster-59c4445df4-4ss2m
Namespace:    default
Priority:     0
Node:         minikube/192.168.64.2
Start Time:   Wed, 23 Dec 2020 17:06:02 +0800
Labels:       app=my-first-application-cluster
              component=jobmanager
              pod-template-hash=59c4445df4
              type=flink-native-kubernetes
Annotations:  <none>
Status:       Running
IP:           172.17.0.3
IPs:
  IP:           172.17.0.3
Controlled By:  ReplicaSet/my-first-application-cluster-59c4445df4
Containers:
  flink-job-manager:
    Container ID:  docker://b8e5759488af5fd3e3273f69d42890d9750d430cbd6e18b1d024ab83293d0124
    Image:         registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1
    Image ID:      docker-pullable://registry.cn-shenzhen.aliyuncs.com/syni_test/flink@sha256:53a2cec0d0a532aa5d79c241acfdd13accb9df78eb951eb4e878485174186aa8
    Ports:         8081/TCP, 6123/TCP, 6124/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    Command:
      /docker-entrypoint.sh
    Args:
      native-k8s
      $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824 -Xms1073741824 -XX:MaxMetaspaceSize=268435456 -Dlog.file=/opt/flink/log/jobmanager.log -Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml -Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties -Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=201326592b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=1073741824b -D jobmanager.memory.jvm-overhead.max=201326592b
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    127
      Started:      Wed, 23 Dec 2020 17:37:28 +0800
      Finished:     Wed, 23 Dec 2020 17:37:28 +0800
    Ready:          False
    Restart Count:  11
    Limits:
      cpu:     1
      memory:  1600Mi
    Requests:
      cpu:     1
      memory:  1600Mi
    Environment:
      _POD_IP_ADDRESS:   (v1:status.podIP)
    Mounts:
      /opt/flink/conf from flink-config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-9hdqt (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  flink-config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      flink-config-my-first-application-cluster
    Optional:  false
  default-token-9hdqt:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-9hdqt
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  15d                  default-scheduler  Successfully assigned default/my-first-application-cluster-59c4445df4-4ss2m to minikube
  Normal   Pulled     15d                  kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 513.7913ms
  Normal   Pulled     15d                  kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 374.1125ms
  Normal   Pulled     15d                  kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 360.6719ms
  Normal   Created    15d (x4 over 15d)    kubelet            Created container flink-job-manager
  Normal   Started    15d (x4 over 15d)    kubelet            Started container flink-job-manager
  Normal   Pulled     15d                  kubelet            Successfully pulled image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1" in 374.2637ms
  Normal   Pulling    15d (x5 over 15d)    kubelet            Pulling image "registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1"
  Warning  BackOff    15d (x141 over 15d)  kubelet            Back-off restarting failed container
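
A note on the "Exit Code: 127" in the describe output: 127 is the status a POSIX shell reports when a command cannot be found, which matches the "exec: native-k8s: not found" line in the pod log. The following is a minimal sketch that reproduces the same status outside Kubernetes. The command name no-such-native-k8s-command is a placeholder assumed not to exist on PATH; it only stands in for native-k8s inside an image whose entrypoint script does not handle that argument, and the script is not the actual docker-entrypoint.sh.

```shell
#!/bin/sh
# Placeholder command name, assumed not to exist on PATH.
missing_cmd="no-such-native-k8s-command"

# Run exec in a child shell so that this script survives the failure.
# The child shell's "not found" message is discarded.
sh -c "exec $missing_cmd" 2>/dev/null
status=$?

# On a POSIX shell the status reported for a missing command is 127.
echo "exit status: $status"
```

Seeing 127 together with "not found" therefore points at the entrypoint script trying to execute its first argument as a program, rather than at the Flink job itself.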




--
Sent from: http://apache-flink.147419.n8.nabble.com/

Re: flink1.12.0 native k8s fails to start

Posted by yzxs <cz...@gmail.com>.
Thanks, the problem is solved.




Re: flink1.12.0 native k8s fails to start

Posted by Yang Wang <da...@gmail.com>.
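The root cause is that your client is version 1.12, but the base image you built from is 1.11, because the 1.12 image has not been published to Docker Hub yet.
Rebuild the image yourself with the correct Dockerfile [1] and try running it again.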

[1] https://github.com/apache/flink-docker/tree/master/1.12/scala_2.12-java8-debian

Best,
Yang
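
The rebuild suggested above could look roughly like the sketch below. It is an illustration under assumptions, not a verified procedure: the local tag flink:1.12-local is hypothetical, the repository path is taken from link [1] and may have moved since, and the job Dockerfile is assumed to sit in the current directory with its first line changed from "FROM flink:latest" to "FROM flink:1.12-local". By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: rebuild a Flink 1.12 base image from the flink-docker repository,
# then build the job image on top of it.
set -eu

BASE_TAG="flink:1.12-local"   # hypothetical local tag for the rebuilt base image
JOB_TAG="registry.cn-shenzhen.aliyuncs.com/syni_test/flink:v1"   # image name from the thread

# run: print the command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Fetch the Dockerfile and docker-entrypoint.sh referenced in [1].
run git clone https://github.com/apache/flink-docker.git

# Build the 1.12 base image locally, since it is not on Docker Hub yet.
run docker build -t "$BASE_TAG" flink-docker/1.12/scala_2.12-java8-debian

# Build the job image; its Dockerfile is assumed to start with FROM flink:1.12-local.
run docker build -t "$JOB_TAG" .

# The job was submitted with pull-policy=Always, so the image must be pushed.
run docker push "$JOB_TAG"
```

After pushing, the same ./bin/flink run-application command from the original post can be run again against the rebuilt image.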
