Posted to user-zh@flink.apache.org by 陈卓宇 <25...@qq.com.INVALID> on 2022/04/20 07:05:04 UTC
The file STDOUT does not exist on the TaskExecutor exception
Hello experts,
I would like to ask what causes this exception, what impact it has on production, and how to get rid of it.
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException: The file STDOUT does not exist on the TaskExecutor.
at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$24(TaskExecutor.java:2064) ~[flink-dist_2.11-1.13.1.jar:1.13.1]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) ~[?:1.8.0_202]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_202]
Caused by: org.apache.flink.util.FlinkException: The file STDOUT does not exist on the TaskExecutor.
... 5 more
2022-04-20 14:51:47,370 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerStdoutFileHandler [] - Unhandled exception.
org.apache.flink.util.FlinkException: The file STDOUT does not exist on the TaskExecutor.
at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$24(TaskExecutor.java:2064) ~[flink-dist_2.11-1.13.1.jar:1.13.1]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) ~[?:1.8.0_202]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_202]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_202]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_202]
卓宇
Re: The file STDOUT does not exist on the TaskExecutor exception
Posted by 陈卓宇 <25...@qq.com.INVALID>.
Thank you for your answer.
陈卓宇
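------------------ Original message ------------------
From: "user-zh" <zhlonghong@gmail.com>;
Sent: Wednesday, April 20, 2022, 4:50 PM
To: "user-zh"<user-zh@flink.apache.org>;
Subject: Re: The file STDOUT does not exist on the TaskExecutor exception
Hello, 卓宇:
This error comes from the REST API. It means that you clicked the Stdout tab on the TaskManager page of the Flink Dashboard, but the stdout file could not be accessed on the corresponding TaskManager, so the request failed. The error does not affect the normal operation of the job and can be ignored.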
Best,
Zhilong
Re: Filnk: Job leader for job id xxxx lost leadership
Posted by huweihua <hu...@gmail.com>.
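One more case to add: check whether the hudi connector uses an OperatorCoordinator to interact with hudi. That work runs in the JobMaster main thread, so if it takes a long time it can cause the TaskManager to disconnect from the JobMaster.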
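The effect described above can be illustrated with a small, self-contained sketch. This is not Flink code: a single-threaded executor stands in for the JobMaster main thread, `slow_external_call` is a hypothetical stand-in for a long interaction with an external system, and the "heartbeat" is just a second task queued behind it. All names and durations are assumptions chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def heartbeat_delay(offload: bool, slow_seconds: float = 0.3) -> float:
    """Seconds a heartbeat waits behind one slow call on a single 'main thread'."""
    # Hypothetical stand-in for the JobMaster main thread: one worker only.
    main_thread = ThreadPoolExecutor(max_workers=1)
    # Separate pool that may be used for blocking work.
    io_pool = ThreadPoolExecutor(max_workers=1)

    def slow_external_call():
        # Stand-in for a long interaction with an external system.
        time.sleep(slow_seconds)

    def coordinator_event():
        if offload:
            # Hand the blocking work to another pool and return immediately.
            io_pool.submit(slow_external_call)
        else:
            # Run the blocking work on the main thread itself.
            slow_external_call()

    try:
        main_thread.submit(coordinator_event)
        queued_at = time.monotonic()
        # The "heartbeat" is queued behind the coordinator event.
        handled_at = main_thread.submit(time.monotonic).result()
        return handled_at - queued_at
    finally:
        main_thread.shutdown(wait=True)
        io_pool.shutdown(wait=True)
```

Calling `heartbeat_delay(False)` shows the heartbeat waiting for roughly the full duration of the slow call, while `heartbeat_delay(True)` returns almost immediately. If the real delay exceeds the heartbeat timeout, the peer would treat the connection as lost, which is the failure mode suggested in this reply.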
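> On April 20, 2022, at 6:42 PM, Zhanghao Chen <zh...@outlook.com> wrote:
>
> Seeing "Job leader for job id xxxx lost" means that the JM leader's session on ZK has timed out. Possible causes:
>
> 1. The network connection between the JM and ZK is unstable and the ZK connection enters the suspended state, and either you have not configured tolerance for suspended ZK connections (on 1.14 and later, set the high-availability.zookeeper.client.tolerate-suspended-connections option), or you have configured it but the session timeout is too short, so leadership is lost.
> 2. The JM really does crash frequently.
> 3. The JM has severe GC pauses, which disrupt the ZK connection and put it into the suspended state.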
>
> Best,
> Zhanghao Chen
Re: Filnk: Job leader for job id xxxx lost leadership
Posted by Zhanghao Chen <zh...@outlook.com>.
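Seeing "Job leader for job id xxxx lost" means that the JM leader's session on ZK has timed out. Possible causes:
1. The network connection between the JM and ZK is unstable and the ZK connection enters the suspended state, and either you have not configured tolerance for suspended ZK connections (on 1.14 and later, set the high-availability.zookeeper.client.tolerate-suspended-connections option), or you have configured it but the session timeout is too short, so leadership is lost.
2. The JM really does crash frequently.
3. The JM has severe GC pauses, which disrupt the ZK connection and put it into the suspended state.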
Best,
Zhanghao Chen
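To check the first cause, a minimal sketch like the following can scan a flat `key: value` configuration (as in flink-conf.yaml) for the two relevant settings. The tolerate-suspended-connections key is the one named above; `high-availability.zookeeper.client.session-timeout` is the ZooKeeper session timeout option in milliseconds. The helper names, the assumption that an unset tolerate option means "not tolerated", and the 60000 ms threshold are illustrative choices, not recommendations from this thread.

```python
TOLERATE_KEY = "high-availability.zookeeper.client.tolerate-suspended-connections"
SESSION_TIMEOUT_KEY = "high-availability.zookeeper.client.session-timeout"


def parse_flat_conf(text: str) -> dict:
    """Parse 'key: value' lines, ignoring blank lines and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def zk_warnings(conf: dict, min_session_timeout_ms: int = 60000) -> list:
    """Return human-readable warnings about the ZooKeeper client settings."""
    warnings = []
    # Assumption: when the option is absent, suspended connections are not tolerated.
    if conf.get(TOLERATE_KEY, "false").lower() != "true":
        warnings.append("suspended ZooKeeper connections are not tolerated")
    timeout = conf.get(SESSION_TIMEOUT_KEY)
    # The threshold is an arbitrary illustration; pick one that fits the cluster.
    if timeout is not None and timeout.isdigit() and int(timeout) < min_session_timeout_ms:
        warnings.append(
            f"session timeout {timeout} ms is below {min_session_timeout_ms} ms"
        )
    return warnings
```

For example, `zk_warnings(parse_flat_conf(open("conf/flink-conf.yaml").read()))` would list the settings worth reviewing; the file path here is only an example. Causes 2 and 3 cannot be seen from configuration and need the JM logs and GC logs instead.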
________________________________
From: magic <gu...@foxmail.com>
Sent: Wednesday, April 20, 2022 17:49
To: user-zh <us...@flink.apache.org>
Subject: Filnk: Job leader for job id xxxx lost leadership
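Hi, all
When we use Flink to consume Kafka data and write it to hudi, we frequently get the error: Job leader for job id xxxx lost leadership. Other Flink jobs on the same cluster have no such problem. Could anyone explain what causes this? It does not look like a ZK problem to me.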
Filnk: Job leader for job id xxxx lost leadership
Posted by magic <gu...@foxmail.com>.
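Hi, all
When we use Flink to consume Kafka data and write it to hudi, we frequently get the error: Job leader for job id xxxx lost leadership. Other Flink jobs on the same cluster have no such problem. Could anyone explain what causes this? It does not look like a ZK problem to me.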
Re: The file STDOUT does not exist on the TaskExecutor 异常
Posted by Zhilong Hong <zh...@gmail.com>.
Hello, 卓宇:
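This error comes from the REST API. It means that you clicked the Stdout tab on the TaskManager page of the Flink Dashboard, but the stdout file could not be accessed on the corresponding TaskManager, so the request failed. The error does not affect the normal operation of the job and can be ignored.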
Best,
Zhilong
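To confirm that the error is tied to the Stdout tab and not to the job, the same request can be issued by hand. The sketch below assumes the web UI fetches the file from `/taskmanagers/<taskmanager-id>/stdout` on the JobManager REST address; that path, the base URL, and the helper names are assumptions to verify against your Flink version, not something stated in this thread.

```python
import urllib.error
import urllib.parse
import urllib.request


def stdout_url(rest_base: str, taskmanager_id: str) -> str:
    """Build the assumed stdout endpoint URL for one TaskManager."""
    tm = urllib.parse.quote(taskmanager_id, safe="")
    return f"{rest_base.rstrip('/')}/taskmanagers/{tm}/stdout"


def fetch_stdout(rest_base: str, taskmanager_id: str, timeout: float = 5.0):
    """Return (http_status, body_text); HTTP error statuses are returned, not raised."""
    url = stdout_url(rest_base, taskmanager_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A missing stdout file is expected to surface here as an error status.
        return err.code, err.read().decode("utf-8", errors="replace")
```

A call such as `fetch_stdout("http://localhost:8081", "<taskmanager-id>")` should log the same exception on the server side each time it is made when the file is missing, and nothing otherwise, which would match the explanation that the error only appears when the Stdout tab is opened.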