Posted to user@flink.apache.org by "Kumar Bolar, Harshith" <hk...@arity.com> on 2018/10/22 10:52:30 UTC
Flink JobManager is not starting when running on a standalone cluster
Hi all,
We run Flink on a five-node cluster: three task managers and two job managers. One of the job managers, running on the flink2-0 node, is down and refuses to come back up, so the cluster is currently running with a single job manager. When I restart the service, I see this in the logs. Any idea what the issue might be?
2018-10-22 06:43:50,458 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager actor
2018-10-22 06:43:50,462 INFO org.apache.flink.runtime.blob.BlobServer - Created BLOB server storage directory /tmp/blobStore-73e8dbe2-8fdb-4310-84d4-c9f3445723f3
2018-10-22 06:43:50,466 INFO org.apache.flink.runtime.blob.BlobServer - Enabling ssl for the blob server
2018-10-22 06:43:50,482 INFO org.apache.flink.runtime.blob.BlobServer - Started BLOB server at 0.0.0.0:36880 - max concurrent requests: 50 - max backlog: 1000
2018-10-22 06:43:50,501 INFO org.apache.flink.runtime.jobmanager.MemoryArchivist - Started memory archivist akka://flink/user/archive
2018-10-22 06:43:50,525 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService.
2018-10-22 06:43:50,525 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.ssl.tcp://flink@flink2-0.flink2.us-east-1.prod.xxxxxxx.io:22902/user/jobmanager.
2018-10-22 06:43:50,526 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@2805f48f.
2018-10-22 06:43:50,532 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService.
2018-10-22 06:43:50,557 INFO org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager - Received leader address but not running in leader ActorSystem. Cancelling registration.
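For anyone comparing this startup sequence against the log of the healthy job manager, here is a minimal sketch that splits lines of this layout into fields and reports the last entry. The names `LOG_LINE`, `parse_line` and `last_entry` are hypothetical, and the pattern assumes only the "timestamp LEVEL logger - message" layout visible in the excerpt above; it is not part of Flink.

```python
import re

# Assumed layout, taken from the excerpt above:
#   "<date> <time>,<millis> <LEVEL> <logger> - <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+-\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def last_entry(lines):
    """Return the last parseable entry, i.e. where the startup sequence stops."""
    parsed = [entry for entry in map(parse_line, lines) if entry]
    return parsed[-1] if parsed else None


# Two lines copied from the excerpt above.
sample = [
    "2018-10-22 06:43:50,532 INFO org.apache.flink.runtime.leaderretrieval."
    "ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService.",
    "2018-10-22 06:43:50,557 INFO org.apache.flink.runtime.clusterframework."
    "standalone.StandaloneResourceManager - Received leader address but not "
    "running in leader ActorSystem. Cancelling registration.",
]

entry = last_entry(sample)
# Print the short class name and the message of the final log entry.
print(entry["logger"].rsplit(".", 1)[-1], "-", entry["message"])
```

Running the same helper over both job managers' logs makes it easier to see at which logger the failing node stops producing output.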
Thanks,
Harshith
Re: Flink JobManager is not starting when running on a standalone cluster
Posted by miki haiat <mi...@gmail.com>.
I think it's related to this issue:
https://issues.apache.org/jira/browse/FLINK-10011