You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user-zh@flink.apache.org by yidan zhao <hi...@gmail.com> on 2021/03/03 06:58:27 UTC

Re: Flink 1.11.2运行一段时间后,会报ResourceManager leader changed to new address null的异常

Mark下。这个问题我也遇到多次,看过一个xintognsongn的回复,由于网络、zk可用性等问题会导致。不够一般会自动恢复。

史 正超 <sh...@outlook.com> 于2020年12月7日周一 下午10:13写道:

> 8 个slot,8个并行度,jm是2G,tm配置的是8G,其它的任务配置是
> ```
> SET 'execution.checkpointing.interval' = '5min';
> SET 'execution.checkpointing.min-pause' = '10s';
> SET 'min.idle.state.retention.time' = '1d';
> SET 'max.idle.state.retention.time' = '25h';
> SET 'checkpoint.with.rocksdb' = 'true';
> set 'table.exec.mini-batch.enabled' = 'true';
> set 'table.exec.mini-batch.allow-latency' = '5s';
> set 'table.exec.mini-batch.size' = '5000';
>
> ```
>
> 2020-12-07 19:35:01
> org.apache.flink.util.FlinkException: ResourceManager leader changed to
> new address null
>     at
> org.apache.flink.runtime.taskexecutor.TaskExecutor.notifyOfNewResourceManagerLeader(TaskExecutor.java:1093)
>     at
> org.apache.flink.runtime.taskexecutor.TaskExecutor.access$800(TaskExecutor.java:173)
>     at
> org.apache.flink.runtime.taskexecutor.TaskExecutor$ResourceManagerLeaderListener.lambda$notifyLeaderAddress$0(TaskExecutor.java:1816)
>     at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
>     at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
>     at
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at
> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
>
>

Re: Flink 1.11.2运行一段时间后,会报ResourceManager leader changed to new address null的异常

Posted by yidan zhao <hi...@gmail.com>.
还有没有大佬解释下,我最近又遇到这个问题了,而且很频繁。任务启动1小时restored达到了8。

yidan zhao <hi...@gmail.com> 于2021年3月3日周三 下午2:58写道:

> Mark下。这个问题我也遇到多次,看过一个xintognsongn的回复,由于网络、zk可用性等问题会导致。不够一般会自动恢复。
>
> 史 正超 <sh...@outlook.com> 于2020年12月7日周一 下午10:13写道:
>
>> 8 个slot,8个并行度,jm是2G,tm配置的是8G,其它的任务配置是
>> ```
>> SET 'execution.checkpointing.interval' = '5min';
>> SET 'execution.checkpointing.min-pause' = '10s';
>> SET 'min.idle.state.retention.time' = '1d';
>> SET 'max.idle.state.retention.time' = '25h';
>> SET 'checkpoint.with.rocksdb' = 'true';
>> set 'table.exec.mini-batch.enabled' = 'true';
>> set 'table.exec.mini-batch.allow-latency' = '5s';
>> set 'table.exec.mini-batch.size' = '5000';
>>
>> ```
>>
>> 2020-12-07 19:35:01
>> org.apache.flink.util.FlinkException: ResourceManager leader changed to
>> new address null
>>     at
>> org.apache.flink.runtime.taskexecutor.TaskExecutor.notifyOfNewResourceManagerLeader(TaskExecutor.java:1093)
>>     at
>> org.apache.flink.runtime.taskexecutor.TaskExecutor.access$800(TaskExecutor.java:173)
>>     at
>> org.apache.flink.runtime.taskexecutor.TaskExecutor$ResourceManagerLeaderListener.lambda$notifyLeaderAddress$0(TaskExecutor.java:1816)
>>     at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)
>>     at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)
>>     at
>> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>     at akka.japi.pf
>> .UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>     at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>     at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>     at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>     at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>     at
>> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>     at
>> akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>     at
>> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>>
>>