Posted to user@flink.apache.org by Günter Hipler <gu...@bluewin.ch> on 2019/05/01 19:31:42 UTC

configuration of standalone cluster

Hi,

I'm trying to set up a standalone cluster for the first time. My current
configuration is 4 servers (1 jobmanager and 3 taskmanagers).

a) starting the cluster
swissbib@sb-ust1:/swissbib_index/apps/flink/bin$ ./start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host sb-ust1.
Starting taskexecutor daemon on host sb-ust2.
Starting taskexecutor daemon on host sb-ust3.
Starting taskexecutor daemon on host sb-ust4.


On the taskmanager side I get the error
2019-05-01 21:16:32,794 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.ssl.tcp://flink@sb-ust1:6123] has failed, address is now gated for [50] ms. Reason: [class [B cannot be cast to class [C ([B and [C are in module java.base of loader 'bootstrap')]
2019-05-01 21:16:41,932 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-05-01 21:17:01,960 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..


Port 6123 is allowed on the jobmanager, but I haven't created a
dedicated Flink user.

- Is this necessary? If so, is it possible to define another user for
communication purposes?

I followed the documentation to set up SSL-based communication
(https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/security-ssl.html#example-ssl-setup-standalone-and-kubernetes)
and created a keystore as described:

keytool -genkeypair -alias swissbib.internal -keystore internal.keystore \
  -dname "CN=flink.internal" -storepass verysecret -keypass verysecret \
  -keyalg RSA -keysize 4096

and deployed the flink-conf.yaml on the whole cluster

(part of flink-conf.yaml)
security.ssl.internal.enabled: true
security.ssl.internal.keystore: /swissbib_index/apps/flink/conf/internal.keystore
security.ssl.internal.truststore: /swissbib_index/apps/flink/conf/internal.keystore
security.ssl.internal.keystore-password: verysecret
security.ssl.internal.truststore-password: verysecret
security.ssl.internal.key-password: verysecret

but this doesn't solve the problem: there is still no connection between
the taskmanagers and the jobmanager.

- Another question: which ports have to be opened in the firewall for a
standalone cluster?

Thanks for any hints!

Günter
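
On the firewall question, a possible starting point is to list the port-related settings in flink-conf.yaml and compare them with the firewall rules. The sketch below does only that. The helper name list_flink_ports is hypothetical. Of the keys it might find, only jobmanager.rpc.port (6123) is confirmed by the logs above; others such as rest.port, blob.server.port, taskmanager.rpc.port and taskmanager.data.port are assumed from the Flink configuration documentation and should be checked against the docs for the installed version.

```shell
#!/usr/bin/env bash
# list_flink_ports (hypothetical helper): print every "<key>: <value>" pair
# from a Flink config file whose key ends in "port", skipping commented lines.
list_flink_ports() {
  local conf="$1"
  # Match uncommented keys such as jobmanager.rpc.port or rest.port,
  # then strip leading whitespace and normalise the separator to ": ".
  grep -E '^[[:space:]]*[A-Za-z0-9_.-]*port[[:space:]]*:' "$conf" \
    | sed -E 's/^[[:space:]]*//; s/[[:space:]]*:[[:space:]]*/: /'
}

# Example (path taken from the post above):
#   list_flink_ports /swissbib_index/apps/flink/conf/flink-conf.yaml
```

Port settings that are missing from the output fall back to Flink's defaults. Some of those may be chosen dynamically at startup (an assumption to verify in the configuration docs), in which case pinning them to fixed values makes the firewall rules easier to write.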


Re: configuration of standalone cluster

Posted by Abhishek Jain <ab...@gmail.com>.
Java version: "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b15, mixed mode)


On Thu, 2 May 2019 at 17:18, Chesnay Schepler <ch...@apache.org> wrote:

> Which java version are you using?

-- 
Warm Regards,
Abhishek Jain

Re: configuration of standalone cluster

Posted by Chesnay Schepler <ch...@apache.org>.
Flink still only works with Java 8 at the moment. It will be a while 
until we properly support Java 11.
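
The "module java.base" wording in the original error already points to a Java 9 or later JVM, since the module system does not exist in Java 8. A minimal sketch for checking the Java major version on each host follows. The helper names java_major and current_java_version are hypothetical, and the parsing assumes the usual `java -version` output format with the version in double quotes.

```shell
#!/usr/bin/env bash
# java_major (hypothetical helper): reduce a Java version string to its
# major version, e.g. "1.8.0_112" -> 8 and "11.0.2" -> 11.
java_major() {
  local v="$1"
  # Java 8 and earlier report themselves as 1.x, so drop the leading "1.".
  case "$v" in
    1.*) v="${v#1.}" ;;
  esac
  # Keep everything before the first separator (., _, + or -).
  echo "${v%%[._+-]*}"
}

# current_java_version (hypothetical helper): extract the quoted version
# string from `java -version`, which prints to stderr.
current_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Example, to be run on each of sb-ust1 .. sb-ust4:
#   [ "$(java_major "$(current_java_version)")" = "8" ] || echo "not Java 8"
```

If several JDKs are installed, JAVA_HOME (or, according to the Flink setup documentation, the env.java.home key in flink-conf.yaml) can point Flink at the Java 8 installation; the actual path depends on the system.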

On 02/05/2019 13:58, Günter Hipler wrote:
> swissbib@sb-ust1:~$ java -version
> openjdk version "11.0.2" 2019-01-15
> OpenJDK Runtime Environment (build 11.0.2+9-Ubuntu-3ubuntu118.04.3)
> OpenJDK 64-Bit Server VM (build 11.0.2+9-Ubuntu-3ubuntu118.04.3, mixed mode, sharing)
> swissbib@sb-ust1:~$
>
> Is version 8 more appropriate?
>
> Günter


Re: configuration of standalone cluster

Posted by Günter Hipler <gu...@bluewin.ch>.
swissbib@sb-ust1:~$ java -version
openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment (build 11.0.2+9-Ubuntu-3ubuntu118.04.3)
OpenJDK 64-Bit Server VM (build 11.0.2+9-Ubuntu-3ubuntu118.04.3, mixed mode, sharing)
swissbib@sb-ust1:~$

Is version 8 more appropriate?

Günter


On 02.05.19 13:48, Chesnay Schepler wrote:
> Which java version are you using?

Re: configuration of standalone cluster

Posted by Chesnay Schepler <ch...@apache.org>.
Which java version are you using?
