Posted to user@flink.apache.org by Qihua Yang <ya...@gmail.com> on 2021/09/29 18:36:51 UTC
Start Flink cluster, k8s pod behavior
Hi,
I deployed Flink in session mode. I didn't run any jobs. I saw the logs below.
That is normal, the same as what the Flink manual shows.
+ /opt/flink/bin/run-job-manager.sh
Starting HA cluster with 1 masters.
Starting standalonesession daemon on host job-manager-776dcf6dd-xzs8g.
Starting taskexecutor daemon on host job-manager-776dcf6dd-xzs8g.
But when I check kubectl, it shows the status is Completed. After a while,
the status changed to CrashLoopBackOff, and the pod restarted.
NAME                          READY   STATUS             RESTARTS   AGE
job-manager-776dcf6dd-xzs8g   0/1     Completed          5          5m27s

NAME                          READY   STATUS             RESTARTS   AGE
job-manager-776dcf6dd-xzs8g   0/1     CrashLoopBackOff   5          7m35s
Can anyone help me understand why?
Why does Kubernetes regard this pod as completed and restart it? Should I
configure something, on either the Flink side or the Kubernetes side? From the
Flink manual, after the cluster is started, I can upload a jar to run the
application.
Thanks,
Qihua
Re: Start Flink cluster, k8s pod behavior
Posted by Yang Wang <da...@gmail.com>.
Did you use "jobmanager.sh start-foreground" in your own
"run-job-manager.sh", just like what Flink does
in the docker-entrypoint.sh[1]?
I strongly suggest starting the Flink session cluster with the official
yamls[2].
[1].
https://github.com/apache/flink-docker/blob/master/1.13/scala_2.11-java11-debian/docker-entrypoint.sh#L114
[2].
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/standalone/kubernetes/#starting-a-kubernetes-cluster-session-mode
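For reference, the official session-mode manifests linked above run the JobManager through the stock image entrypoint, which keeps the process in the foreground. A minimal sketch of the relevant container spec (abridged; the image tag is illustrative, see the linked page for the full manifest):

```yaml
containers:
  - name: jobmanager
    image: apache/flink:1.14.0-scala_2.12   # illustrative tag
    args: ["jobmanager"]   # docker-entrypoint.sh runs the JM in the foreground
    ports:
      - containerPort: 6123   # RPC
      - containerPort: 8081   # REST / web UI
```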
Best,
Yang
Qihua Yang <ya...@gmail.com> wrote on Fri, Oct 1, 2021 at 2:59 AM:
> Looks like after the *flink-daemon.sh* script completes, it returns exit 0.
> Kubernetes regards it as done. Is that expected?
>
> Thanks,
> Qihua
Re: Start Flink cluster, k8s pod behavior
Posted by Qihua Yang <ya...@gmail.com>.
Looks like after the *flink-daemon.sh* script completes, it returns exit 0.
Kubernetes regards it as done. Is that expected?
Thanks,
Qihua
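That is indeed the behavior of a daemonizing wrapper: it backgrounds the real process and then exits 0 itself. A minimal stand-in (the `fake_daemon` function below is hypothetical, not the real flink-daemon.sh) shows the effect:

```shell
#!/usr/bin/env bash
# fake_daemon stands in for flink-daemon.sh: it forks the real work into
# the background and returns immediately, so the caller sees exit code 0
# even though the "daemon" is still running.
fake_daemon() {
  nohup sleep 0.2 >/dev/null 2>&1 &
  return 0
}

fake_daemon
echo "wrapper exit code: $?"
# With this as the container entrypoint, PID 1 ends here; the container is
# marked Completed (exit 0) and restarted under restartPolicy: Always.
```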
Re: Start Flink cluster, k8s pod behavior
Posted by Qihua Yang <ya...@gmail.com>.
Thank you for your reply.
From the log, the exit code is 0, and the reason is Completed.
Looks like the cluster is fine. But why does Kubernetes restart the pod? As you
said, from the perspective of Kubernetes everything is done. Then how do I
prevent the restart?
It didn't even give me a chance to upload and run a jar....
Ports:          8081/TCP, 6123/TCP, 6124/TCP, 6125/TCP
Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
  /opt/flink/bin/entrypoint.sh
Args:
  /opt/flink/bin/run-job-manager.sh
State:          Waiting
  Reason:       CrashLoopBackOff
Last State:     Terminated
  Reason:       Completed
  Exit Code:    0
  Started:      Wed, 29 Sep 2021 20:12:30 -0700
  Finished:     Wed, 29 Sep 2021 20:12:45 -0700
Ready:          False
Restart Count:  131
Thanks,
Qihua
Re: Start Flink cluster, k8s pod behavior
Posted by Chesnay Schepler <ch...@apache.org>.
Is the run-job-manager.sh script actually blocking?
Since you (apparently) use it as an entrypoint, if that script exits
after starting the JM, then from the perspective of Kubernetes everything
is done.
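To illustrate the point with a simplified sketch (the function names are made up; these are not the real Flink scripts): a container keeps running only as long as its entrypoint process does, so the last command must run in the foreground, typically via exec.

```shell
#!/usr/bin/env bash
# Sketch of two entrypoint shapes.

# Non-blocking: backgrounds its work and falls off the end of the script,
# so the shell (PID 1 in a container) exits 0 -> status "Completed".
non_blocking_entrypoint() {
  sleep 0.1 &
}

# Blocking: exec replaces the shell with a foreground process, so the
# container stays alive exactly as long as that process does.
blocking_entrypoint() {
  exec sleep 0.1
}

non_blocking_entrypoint
echo "non-blocking entrypoint returned with status $?"
```

The official image takes the blocking shape: its docker-entrypoint.sh ends by exec'ing "jobmanager.sh start-foreground" (see the link in Yang Wang's reply).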
Re: Start Flink cluster, k8s pod behavior
Posted by Qihua Yang <ya...@gmail.com>.
I did check kubectl describe; it shows the info below. The reason is Completed.
Ports:          8081/TCP, 6123/TCP, 6124/TCP, 6125/TCP
Host Ports:     0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
  /opt/flink/bin/entrypoint.sh
Args:
  /opt/flink/bin/run-job-manager.sh
State:          Waiting
  Reason:       CrashLoopBackOff
Last State:     Terminated
  Reason:       Completed
  Exit Code:    0
  Started:      Wed, 29 Sep 2021 20:12:30 -0700
  Finished:     Wed, 29 Sep 2021 20:12:45 -0700
Ready:          False
Restart Count:  131
Re: Start Flink cluster, k8s pod behavior
Posted by Matthias Pohl <ma...@ververica.com>.
Hi Qihua,
I guess, looking into kubectl describe and the JobManager logs would help
in understanding what's going on.
Best,
Matthias