Posted to user@flink.apache.org by jwatte <jw...@gmail.com> on 2018/10/01 20:35:54 UTC

flink:latest container on kubernetes fails to connect taskmanager to jobmanager

I'm using the standard Kubernetes deploy configs for jobmanager and
taskmanager deployments, and jobmanager service.
However, when the task managers start up, they try to register with the job
manager over Akka on port 6123.
This fails, because the Akka on the jobmanager discards those messages as
"non-local."

The taskmanager keeps repeating this log message and eventually exits
(and gets restarted by Kubernetes):

2018-10-01 20:08:28,365 INFO 
org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not
resolve ResourceManager address
akka.tcp://flink@flink-jobmanager:6123/user/resourcemanager, retrying in
10000 ms: Ask timed out on
[ActorSelection[Anchor(akka.tcp://flink@flink-jobmanager:6123/),
Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of
type "akka.actor.Identify"..

The jobmanager responds with this log message:

2018-10-01 20:09:38,475 ERROR akka.remote.EndpointWriter                                   
- dropping message [class akka.actor.ActorSelectionMessage] for non-local
recipient [Actor[akka.tcp://flink@flink-jobmanager:6123/]] arriving at
[akka.tcp://flink@flink-jobmanager:6123] inbound addresses are
[akka.tcp://flink@cluster:6123]

I have verified that network connectivity exists, so this must be a
configuration problem.
I notice that docker-entrypoint.sh edits the config files and calls the
taskmanager.sh / jobmanager.sh scripts based on the start mode.
Is this script editing the config file incorrectly? What needs to be done so
that Akka on the jobmanager accepts the registration messages?




--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: flink:latest container on kubernetes fails to connect taskmanager to jobmanager

Posted by Till Rohrmann <tr...@apache.org>.
Hi jwatte,

Sorry for the inconvenience. I hope that the Docker Hub images have been
updated by now.

Cheers,
Till

On Wed, Oct 10, 2018, 05:20 vino yang <ya...@gmail.com> wrote:

> Hi jwatte,
>
> Maybe Till can help you.
>
> Thanks, vino.
>
> On Tue, Oct 2, 2018, 05:30 jwatte <jw...@gmail.com> wrote:
>
>> It turns out that the current flink:latest Docker image is 5 days old, and
>> this bug was fixed 4 days ago in the flink-docker GitHub repository.
>>
>> The problem is that the docker-entrypoint.sh script chains to
>> jobmanager.sh
>> by saying "start-foreground cluster" where the "cluster" argument is
>> obsolete as of Flink 1.5.
>>
>> I patched it with a sed command in the Kubernetes manifest, until the
>> updated Docker image makes its way to the world.

Re: flink:latest container on kubernetes fails to connect taskmanager to jobmanager

Posted by vino yang <ya...@gmail.com>.
Hi jwatte,

Maybe Till can help you.

Thanks, vino.

On Tue, Oct 2, 2018, 05:30 jwatte <jw...@gmail.com> wrote:

> It turns out that the current flink:latest Docker image is 5 days old, and
> this bug was fixed 4 days ago in the flink-docker GitHub repository.
>
> The problem is that the docker-entrypoint.sh script chains to jobmanager.sh
> by saying "start-foreground cluster" where the "cluster" argument is
> obsolete as of Flink 1.5.
>
> I patched it with a sed command in the Kubernetes manifest, until the
> updated Docker image makes its way to the world.

Re: flink:latest container on kubernetes fails to connect taskmanager to jobmanager

Posted by jwatte <jw...@gmail.com>.
It turns out that the current flink:latest Docker image is 5 days old, and
this bug was fixed 4 days ago in the flink-docker GitHub repository.

The problem is that the docker-entrypoint.sh script chains to jobmanager.sh
by saying "start-foreground cluster" where the "cluster" argument is
obsolete as of Flink 1.5.

I patched it with a sed command in the Kubernetes manifest, until the
updated Docker image makes its way to the world.
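
For anyone hitting the same issue, here is a minimal sketch of that kind of
sed patch. It is not the exact command from the manifest. The temporary file
is a hypothetical stand-in for the image's docker-entrypoint.sh, and its
single line is an assumption about how the script invokes jobmanager.sh; only
the "start-foreground cluster" string comes from the description above.

```shell
#!/bin/sh
# Hypothetical stand-in for the image's docker-entrypoint.sh.
# The real script's path and contents are assumptions.
entrypoint="$(mktemp)"
cat > "$entrypoint" <<'EOF'
#!/bin/sh
exec "$FLINK_HOME/bin/jobmanager.sh" start-foreground cluster
EOF

# Drop the obsolete "cluster" argument in place.
# -i.bak keeps a backup copy and works with both GNU and BSD sed.
sed -i.bak 's/start-foreground cluster/start-foreground/' "$entrypoint"

# Show the patched invocation.
patched_line="$(grep 'jobmanager.sh' "$entrypoint")"
echo "$patched_line"
```

In a Kubernetes manifest, a command like the sed line above would presumably
run as part of the container command, against the real entrypoint script,
before that script is executed.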


