You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user-zh@flink.apache.org by MuChen <93...@qq.com> on 2020/09/07 02:23:23 UTC

使用StreamingFileSink向hive metadata中增加分区部分失败

hi, all:
麻烦大佬们帮看个问题,多谢!


处理逻辑如下
1. 使用DataStream API读取kafka中的数据,写入DataStream ds1中
2. 新建一个tableEnv,并注册hive catalog:
&nbsp; &nbsp; tableEnv.registerCatalog(catalogName, catalog);
&nbsp; &nbsp; tableEnv.useCatalog(catalogName);
3. 声明以ds1为数据源的table
&nbsp; &nbsp; Table sourcetable = tableEnv.fromDataStream(ds1);
&nbsp; &nbsp; String souceTableName = "music_source";
&nbsp; &nbsp; tableEnv.createTemporaryView(souceTableName, sourcetable);
4. 创建一张hive表:
CREATE TABLE `dwd_music_copyright_test`(   `url` string COMMENT 'url',   `md5` string COMMENT 'md5',   `utime` bigint COMMENT '时间',   `title` string COMMENT '歌曲名',   `singer` string COMMENT '演唱者',   `company` string COMMENT '公司',   `level` int COMMENT '置信度.0是标题切词,1是acrcloud返回的结果,3是人工标准') PARTITIONED BY (   `dt` string,   `hour` string) ROW FORMAT SERDE   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION   'hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test' TBLPROPERTIES (   'connector'='HiveCatalog',   'partition.time-extractor.timestamp-pattern'='$dt $hour:00:00',   'sink.partition-commit.delay'='1 min',   'sink.partition-commit.policy.kind'='metastore,success-file',   'sink.partition-commit.trigger'='partition-time',   'sink.rolling-policy.check-interval'='30s',   'sink.rolling-policy.rollover-interval'='1min',   'sink.rolling-policy.file-size'='1MB');


5. 将step3表中的数据插入dwd_music_copyright_test


环境
flink:1.11 kafka:1.1.1 hadoop:2.6.0 hive:1.2.0
问题
&nbsp; &nbsp; 程序运行后,发现hive catalog部分分区未成功创建,如下未成功创建hour=02和hour=03分区:
show partitions rt_dwd.dwd_music_copyright_test; | dt=2020-08-29/hour=00  | | dt=2020-08-29/hour=01  | | dt=2020-08-29/hour=04  | | dt=2020-08-29/hour=05  |
&nbsp;但是hdfs目录下有文件生成:
$ hadoop fs -du -h /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/ 4.5 K   13.4 K  /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=00 2.0 K   6.1 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=01 1.7 K   5.1 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=02 1.3 K   3.8 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=03 3.1 K   9.2 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=04 


且手动add partition后可以正常读取数据。


通过flink WebUI可以看到,过程中有checkpoint在StreamingFileCommitter时失败的情况发生:









请问:


1. exactly-once只能保证写sink文件,不能保证更新catalog吗?
2. 是的话有什么方案解决这个问题吗?
3. EXACTLY_ONCE有没有必要指定kafka参数isolation.level=read_committed和enable.auto.commit=false?是不是有了如下设置就可以保证EXACTLY_ONCE?
streamEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); tableEnv.getConfig().getConfiguration().set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE, CheckpointingMode.EXACTLY_ONCE);

Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by Jingsong Li <ji...@gmail.com>.
非常感谢你的反馈,应该是真的有问题,我建个JIRA追踪下

https://issues.apache.org/jira/browse/FLINK-19166

会包含在即将发布的1.11.2中

Best,
Jingsong

On Wed, Sep 9, 2020 at 10:44 AM MuChen <93...@qq.com> wrote:

> hi,Rui Li:
> 没有提交分区的目录是commited状态,手动add partition是可以正常查询的
>
> &nbsp;/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-19/hour=07/part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1031
>
>
>
>
> ------------------&nbsp;原始邮件&nbsp;------------------
> 发件人:
>                                                   "user-zh"
>                                                                     <
> lirui.fudan@gmail.com&gt;;
> 发送时间:&nbsp;2020年9月8日(星期二) 晚上9:43
> 收件人:&nbsp;"MuChen"<9329748@qq.com&gt;;
> 抄送:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;"夏帅"<jkillers@dingtalk.com
> &gt;;
> 主题:&nbsp;Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>
>
>
> 另外也list一下没有提交的分区目录吧,看看里面的文件是什么状态
>
> On Tue, Sep 8, 2020 at 9:19 PM Rui Li <lirui.fudan@gmail.com&gt; wrote:
>
> &gt; 作业有发生failover么?还是说作业能成功结束但是某些partition始终没提交?
> &gt;
> &gt; On Tue, Sep 8, 2020 at 5:20 PM MuChen <9329748@qq.com&gt; wrote:
> &gt;
> &gt;&gt; hi, Rui Li:
> &gt;&gt; 如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
> &gt;&gt; 2020-09-04 17:17:10,548 INFO
> org.apache.flink.streaming.api.operators.
> &gt;&gt; AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of
> table
> &gt;&gt; `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be
> committed
> &gt;&gt; 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.
> &gt;&gt; MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22,
> hour=18}
> &gt;&gt; to metastore
> &gt;&gt; 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.
> &gt;&gt; SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22,
> hour=18}
> &gt;&gt; with success file
> &gt;&gt; 2020-09-04 17:17:19,652 INFO
> org.apache.flink.streaming.api.operators.
> &gt;&gt; AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of
> table
> &gt;&gt; `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be
> committed
> &gt;&gt; 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.
> &gt;&gt; MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22,
> hour=19}
> &gt;&gt; to metastore
> &gt;&gt; 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.
> &gt;&gt; SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22,
> hour=19}
> &gt;&gt; with success file
> &gt;&gt;
> &gt;&gt; 写hdfs的日志是都有的:
> &gt;&gt; 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io
> .parquet.write.
> &gt;&gt; ParquetRecordWriterWrapper [] - creating real writer to write at
> hdfs://
> &gt;&gt;
> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
> &gt;&gt; 08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140
> &gt;&gt; .inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
> &gt;&gt; 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io
> .parquet.write.
> &gt;&gt; ParquetRecordWriterWrapper [] - creating real writer to write at
> hdfs://
> &gt;&gt;
> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
> &gt;&gt; 08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142
> &gt;&gt; .inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
> &gt;&gt;
> &gt;&gt; ------------------ 原始邮件 ------------------
> &gt;&gt; *发件人:* "Rui Li" <lirui.fudan@gmail.com&gt;;
> &gt;&gt; *发送时间:* 2020年9月8日(星期二) 中午12:09
> &gt;&gt; *收件人:* "user-zh"<user-zh@flink.apache.org&gt;;"夏帅"<
> jkillers@dingtalk.com&gt;;
> &gt;&gt; *抄送:* "MuChen"<9329748@qq.com&gt;;
> &gt;&gt; *主题:* Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
> &gt;&gt;
> &gt;&gt; streaming file committer在提交分区之前会打印这样的日志:
> &gt;&gt;
> &gt;&gt; LOG.info("Partition {} of table {} is ready to be committed",
> partSpec, tableIdentifier);
> &gt;&gt;
> &gt;&gt; partition commit policy会在成功提交分区以后打印这样的日志:
> &gt;&gt;
> &gt;&gt; LOG.info("Committed partition {} to metastore", partitionSpec);
> &gt;&gt;
> &gt;&gt; LOG.info("Committed partition {} with success file",
> context.partitionSpec());
> &gt;&gt;
> &gt;&gt; 可以检查一下这样的日志,看是不是卡在什么地方了
> &gt;&gt;
> &gt;&gt; On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jkillers@dingtalk.com.invalid&gt;
> wrote:
> &gt;&gt;
> &gt;&gt;&gt; 就第二次提供的日志看,好像是你的namenode出现的问题
> &gt;&gt;&gt;
> &gt;&gt;&gt;
> &gt;&gt;&gt;
> ------------------------------------------------------------------
> &gt;&gt;&gt; 发件人:MuChen <9329748@qq.com&gt;
> &gt;&gt;&gt; 发送时间:2020年9月8日(星期二) 10:56
> &gt;&gt;&gt; 收件人:user-zh@flink.apache.org 夏帅 <jkillers@dingtalk.com&gt;;
> user-zh <
> &gt;&gt;&gt; user-zh@flink.apache.org&gt;
> &gt;&gt;&gt; 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
> &gt;&gt;&gt;
> &gt;&gt;&gt; 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
> &gt;&gt;&gt; 2020-09-04 17:17:59,520 INFO
> &gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler [] -
> Exception while
> &gt;&gt;&gt; invoking create of class ClientNamenodeProtocolTranslatorPB
> over
> &gt;&gt;&gt; uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over
> attempts.
> &gt;&gt;&gt; Trying to fail over immediately.
> &gt;&gt;&gt; java.io.IOException: java.lang.InterruptedException
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.call(Client.java:1449)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.call(Client.java:1401)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> &gt;&gt;&gt; ~[?:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> &gt;&gt;&gt; ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.
> runtime.io
> .StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.
> runtime.io
> .StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.
> runtime.io
> .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> &gt;&gt;&gt; Caused by: java.lang.InterruptedException
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> &gt;&gt;&gt; ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> java.util.concurrent.FutureTask.get(FutureTask.java:191)
> &gt;&gt;&gt; ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.call(Client.java:1443)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; ... 38 more
> &gt;&gt;&gt; 2020-09-04 17:17:59,522 WARN
> &gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler [] -
> Exception while
> &gt;&gt;&gt; invoking class
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
> &gt;&gt;&gt; over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying
> because
> &gt;&gt;&gt; failovers (15) exceeded maximum allowed (15)
> &gt;&gt;&gt; java.io.IOException: Failed on local exception:
> &gt;&gt;&gt; java.nio.channels.ClosedByInterruptException; Host Details :
> local host is:
> &gt;&gt;&gt; "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
> &gt;&gt;&gt; "uhadoop-op3raf-master1":8020;
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net
> .NetUtils.wrapException(NetUtils.java:772)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.call(Client.java:1474)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.call(Client.java:1401)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> &gt;&gt;&gt; ~[?:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> &gt;&gt;&gt; ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.
> runtime.io
> .StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.
> runtime.io
> .StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.
> runtime.io
> .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> &gt;&gt;&gt; Caused by: java.nio.channels.ClosedByInterruptException
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> &gt;&gt;&gt; ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
> &gt;&gt;&gt; ~[?:1.8.0_144]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net
> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net
> .NetUtils.connect(NetUtils.java:530)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net
> .NetUtils.connect(NetUtils.java:494)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> &gt;&gt;&gt;
> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
> org.apache.hadoop.ipc.Client.call(Client.java:1440)
> &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; ... 38 more
> &gt;&gt;&gt;
> &gt;&gt;&gt; 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
> &gt;&gt;&gt;
> &gt;&gt;&gt;
> &gt;&gt;&gt; ------------------ 原始邮件 ------------------
> &gt;&gt;&gt; 发件人: "user-zh@flink.apache.org 夏帅"
> <jkillers@dingtalk.com.INVALID&gt;;
> &gt;&gt;&gt; 发送时间: 2020年9月8日(星期二) 上午10:47
> &gt;&gt;&gt; 收件人: "user-zh"<user-zh@flink.apache.org&gt;;"MuChen"<
> 9329748@qq.com&gt;;
> &gt;&gt;&gt; 主题:&nbsp; 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
> &gt;&gt;&gt;
> &gt;&gt;&gt; 异常日志只有这些么?有没有详细点的
> &gt;&gt;
> &gt;&gt;
> &gt;&gt;
> &gt;&gt; --
> &gt;&gt; Best regards!
> &gt;&gt; Rui Li
> &gt;&gt;
> &gt;
> &gt;
> &gt; --
> &gt; Best regards!
> &gt; Rui Li
> &gt;
>
>
> --
> Best regards!
> Rui Li



-- 
Best, Jingsong Lee

回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by MuChen <93...@qq.com>.
hi,Rui Li:
没有提交分区的目录是commited状态,手动add partition是可以正常查询的
&nbsp;/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-19/hour=07/part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1031




------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "user-zh"                                                                                    <lirui.fudan@gmail.com&gt;;
发送时间:&nbsp;2020年9月8日(星期二) 晚上9:43
收件人:&nbsp;"MuChen"<9329748@qq.com&gt;;
抄送:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;"夏帅"<jkillers@dingtalk.com&gt;;
主题:&nbsp;Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败



另外也list一下没有提交的分区目录吧,看看里面的文件是什么状态

On Tue, Sep 8, 2020 at 9:19 PM Rui Li <lirui.fudan@gmail.com&gt; wrote:

&gt; 作业有发生failover么?还是说作业能成功结束但是某些partition始终没提交?
&gt;
&gt; On Tue, Sep 8, 2020 at 5:20 PM MuChen <9329748@qq.com&gt; wrote:
&gt;
&gt;&gt; hi, Rui Li:
&gt;&gt; 如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
&gt;&gt; 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.
&gt;&gt; AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table
&gt;&gt; `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
&gt;&gt; 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.
&gt;&gt; MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
&gt;&gt; to metastore
&gt;&gt; 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.
&gt;&gt; SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
&gt;&gt; with success file
&gt;&gt; 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.
&gt;&gt; AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table
&gt;&gt; `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
&gt;&gt; 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.
&gt;&gt; MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
&gt;&gt; to metastore
&gt;&gt; 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.
&gt;&gt; SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
&gt;&gt; with success file
&gt;&gt;
&gt;&gt; 写hdfs的日志是都有的:
&gt;&gt; 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.
&gt;&gt; ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
&gt;&gt; Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
&gt;&gt; 08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140
&gt;&gt; .inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
&gt;&gt; 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.
&gt;&gt; ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
&gt;&gt; Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
&gt;&gt; 08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142
&gt;&gt; .inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
&gt;&gt;
&gt;&gt; ------------------ 原始邮件 ------------------
&gt;&gt; *发件人:* "Rui Li" <lirui.fudan@gmail.com&gt;;
&gt;&gt; *发送时间:* 2020年9月8日(星期二) 中午12:09
&gt;&gt; *收件人:* "user-zh"<user-zh@flink.apache.org&gt;;"夏帅"<jkillers@dingtalk.com&gt;;
&gt;&gt; *抄送:* "MuChen"<9329748@qq.com&gt;;
&gt;&gt; *主题:* Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
&gt;&gt;
&gt;&gt; streaming file committer在提交分区之前会打印这样的日志:
&gt;&gt;
&gt;&gt; LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
&gt;&gt;
&gt;&gt; partition commit policy会在成功提交分区以后打印这样的日志:
&gt;&gt;
&gt;&gt; LOG.info("Committed partition {} to metastore", partitionSpec);
&gt;&gt;
&gt;&gt; LOG.info("Committed partition {} with success file", context.partitionSpec());
&gt;&gt;
&gt;&gt; 可以检查一下这样的日志,看是不是卡在什么地方了
&gt;&gt;
&gt;&gt; On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jkillers@dingtalk.com.invalid&gt; wrote:
&gt;&gt;
&gt;&gt;&gt; 就第二次提供的日志看,好像是你的namenode出现的问题
&gt;&gt;&gt;
&gt;&gt;&gt;
&gt;&gt;&gt; ------------------------------------------------------------------
&gt;&gt;&gt; 发件人:MuChen <9329748@qq.com&gt;
&gt;&gt;&gt; 发送时间:2020年9月8日(星期二) 10:56
&gt;&gt;&gt; 收件人:user-zh@flink.apache.org 夏帅 <jkillers@dingtalk.com&gt;; user-zh <
&gt;&gt;&gt; user-zh@flink.apache.org&gt;
&gt;&gt;&gt; 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
&gt;&gt;&gt;
&gt;&gt;&gt; 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
&gt;&gt;&gt; 2020-09-04 17:17:59,520 INFO
&gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
&gt;&gt;&gt; invoking create of class ClientNamenodeProtocolTranslatorPB over
&gt;&gt;&gt; uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
&gt;&gt;&gt; Trying to fail over immediately.
&gt;&gt;&gt; java.io.IOException: java.lang.InterruptedException
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1449)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1401)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
&gt;&gt;&gt; ~[?:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
&gt;&gt;&gt; ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
&gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
&gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
&gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
&gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
&gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
&gt;&gt;&gt; Caused by: java.lang.InterruptedException
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
&gt;&gt;&gt; ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.util.concurrent.FutureTask.get(FutureTask.java:191)
&gt;&gt;&gt; ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1443)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; ... 38 more
&gt;&gt;&gt; 2020-09-04 17:17:59,522 WARN
&gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
&gt;&gt;&gt; invoking class
&gt;&gt;&gt; org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
&gt;&gt;&gt; over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
&gt;&gt;&gt; failovers (15) exceeded maximum allowed (15)
&gt;&gt;&gt; java.io.IOException: Failed on local exception:
&gt;&gt;&gt; java.nio.channels.ClosedByInterruptException; Host Details : local host is:
&gt;&gt;&gt; "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
&gt;&gt;&gt; "uhadoop-op3raf-master1":8020;
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1474)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1401)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
&gt;&gt;&gt; ~[?:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
&gt;&gt;&gt; ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
&gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
&gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
&gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
&gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
&gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
&gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
&gt;&gt;&gt; Caused by: java.nio.channels.ClosedByInterruptException
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
&gt;&gt;&gt; ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
&gt;&gt;&gt; ~[?:1.8.0_144]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt;&gt;&gt; org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1440)
&gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; ... 38 more
&gt;&gt;&gt;
&gt;&gt;&gt; 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
&gt;&gt;&gt;
&gt;&gt;&gt;
&gt;&gt;&gt; ------------------ 原始邮件 ------------------
&gt;&gt;&gt; 发件人: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID&gt;;
&gt;&gt;&gt; 发送时间: 2020年9月8日(星期二) 上午10:47
&gt;&gt;&gt; 收件人: "user-zh"<user-zh@flink.apache.org&gt;;"MuChen"<9329748@qq.com&gt;;
&gt;&gt;&gt; 主题:&nbsp; 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
&gt;&gt;&gt;
&gt;&gt;&gt; 异常日志只有这些么?有没有详细点的
&gt;&gt;
&gt;&gt;
&gt;&gt;
&gt;&gt; --
&gt;&gt; Best regards!
&gt;&gt; Rui Li
&gt;&gt;
&gt;
&gt;
&gt; --
&gt; Best regards!
&gt; Rui Li
&gt;


-- 
Best regards!
Rui Li

回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by MuChen <93...@qq.com>.
插入表的sql如下:


INSERT INTO rt_dwd.dwd_music_copyright_test
SELECT url,md5,utime,title,singer,company,level,
&nbsp; &nbsp; from_unixtime(cast(utime/1000 as int),'yyyy-MM-dd')
&nbsp; &nbsp; ,from_unixtime(cast(utime/1000 as int),'HH') FROM music_source;





------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "user-zh"                                                                                    <jingsonglee0@gmail.com&gt;;
发送时间:&nbsp;2020年9月9日(星期三) 上午10:32
收件人:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;
抄送:&nbsp;"MuChen"<9329748@qq.com&gt;;"夏帅"<jkillers@dingtalk.com&gt;;
主题:&nbsp;Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败



插入Hive表的SQL也发下?

On Tue, Sep 8, 2020 at 9:44 PM Rui Li <lirui.fudan@gmail.com&gt; wrote:

&gt; 另外也list一下没有提交的分区目录吧,看看里面的文件是什么状态
&gt;
&gt; On Tue, Sep 8, 2020 at 9:19 PM Rui Li <lirui.fudan@gmail.com&gt; wrote:
&gt;
&gt; &gt; 作业有发生failover么?还是说作业能成功结束但是某些partition始终没提交?
&gt; &gt;
&gt; &gt; On Tue, Sep 8, 2020 at 5:20 PM MuChen <9329748@qq.com&gt; wrote:
&gt; &gt;
&gt; &gt;&gt; hi, Rui Li:
&gt; &gt;&gt; 如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
&gt; &gt;&gt; 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.
&gt; &gt;&gt; AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table
&gt; &gt;&gt; `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be
&gt; committed
&gt; &gt;&gt; 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.
&gt; &gt;&gt; MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
&gt; &gt;&gt; to metastore
&gt; &gt;&gt; 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.
&gt; &gt;&gt; SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22,
&gt; hour=18}
&gt; &gt;&gt; with success file
&gt; &gt;&gt; 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.
&gt; &gt;&gt; AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table
&gt; &gt;&gt; `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be
&gt; committed
&gt; &gt;&gt; 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.
&gt; &gt;&gt; MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
&gt; &gt;&gt; to metastore
&gt; &gt;&gt; 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.
&gt; &gt;&gt; SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22,
&gt; hour=19}
&gt; &gt;&gt; with success file
&gt; &gt;&gt;
&gt; &gt;&gt; 写hdfs的日志是都有的:
&gt; &gt;&gt; 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io
&gt; .parquet.write.
&gt; &gt;&gt; ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
&gt; &gt;&gt; Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
&gt; &gt;&gt; 08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140
&gt; &gt;&gt; .inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
&gt; &gt;&gt; 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io
&gt; .parquet.write.
&gt; &gt;&gt; ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
&gt; &gt;&gt; Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
&gt; &gt;&gt; 08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142
&gt; &gt;&gt; .inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
&gt; &gt;&gt;
&gt; &gt;&gt; ------------------ 原始邮件 ------------------
&gt; &gt;&gt; *发件人:* "Rui Li" <lirui.fudan@gmail.com&gt;;
&gt; &gt;&gt; *发送时间:* 2020年9月8日(星期二) 中午12:09
&gt; &gt;&gt; *收件人:* "user-zh"<user-zh@flink.apache.org&gt;;"夏帅"<jkillers@dingtalk.com&gt;;
&gt; &gt;&gt; *抄送:* "MuChen"<9329748@qq.com&gt;;
&gt; &gt;&gt; *主题:* Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
&gt; &gt;&gt;
&gt; &gt;&gt; streaming file committer在提交分区之前会打印这样的日志:
&gt; &gt;&gt;
&gt; &gt;&gt; LOG.info("Partition {} of table {} is ready to be committed", partSpec,
&gt; tableIdentifier);
&gt; &gt;&gt;
&gt; &gt;&gt; partition commit policy会在成功提交分区以后打印这样的日志:
&gt; &gt;&gt;
&gt; &gt;&gt; LOG.info("Committed partition {} to metastore", partitionSpec);
&gt; &gt;&gt;
&gt; &gt;&gt; LOG.info("Committed partition {} with success file",
&gt; context.partitionSpec());
&gt; &gt;&gt;
&gt; &gt;&gt; 可以检查一下这样的日志,看是不是卡在什么地方了
&gt; &gt;&gt;
&gt; &gt;&gt; On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jkillers@dingtalk.com.invalid&gt;
&gt; wrote:
&gt; &gt;&gt;
&gt; &gt;&gt;&gt; 就第二次提供的日志看,好像是你的namenode出现的问题
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt; ------------------------------------------------------------------
&gt; &gt;&gt;&gt; 发件人:MuChen <9329748@qq.com&gt;
&gt; &gt;&gt;&gt; 发送时间:2020年9月8日(星期二) 10:56
&gt; &gt;&gt;&gt; 收件人:user-zh@flink.apache.org 夏帅 <jkillers@dingtalk.com&gt;; user-zh <
&gt; &gt;&gt;&gt; user-zh@flink.apache.org&gt;
&gt; &gt;&gt;&gt; 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt; 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
&gt; &gt;&gt;&gt; 2020-09-04 17:17:59,520 INFO
&gt; &gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
&gt; &gt;&gt;&gt; invoking create of class ClientNamenodeProtocolTranslatorPB over
&gt; &gt;&gt;&gt; uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
&gt; &gt;&gt;&gt; Trying to fail over immediately.
&gt; &gt;&gt;&gt; java.io.IOException: java.lang.InterruptedException
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1449)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1401)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
&gt; &gt;&gt;&gt; ~[?:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
&gt; &gt;&gt;&gt; ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
&gt; &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
&gt; &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
&gt; &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
&gt; &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
&gt; &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io
&gt; .StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io
&gt; .StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io
&gt; .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
&gt; &gt;&gt;&gt; Caused by: java.lang.InterruptedException
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
&gt; &gt;&gt;&gt; ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.util.concurrent.FutureTask.get(FutureTask.java:191)
&gt; &gt;&gt;&gt; ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1443)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; ... 38 more
&gt; &gt;&gt;&gt; 2020-09-04 17:17:59,522 WARN
&gt; &gt;&gt;&gt; org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
&gt; &gt;&gt;&gt; invoking class
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
&gt; &gt;&gt;&gt; over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
&gt; &gt;&gt;&gt; failovers (15) exceeded maximum allowed (15)
&gt; &gt;&gt;&gt; java.io.IOException: Failed on local exception:
&gt; &gt;&gt;&gt; java.nio.channels.ClosedByInterruptException; Host Details : local
&gt; host is:
&gt; &gt;&gt;&gt; "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
&gt; &gt;&gt;&gt; "uhadoop-op3raf-master1":8020;
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1474)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1401)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
&gt; &gt;&gt;&gt; ~[?:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
&gt; &gt;&gt;&gt; ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
&gt; &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
&gt; &gt;&gt;&gt; ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
&gt; &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
&gt; &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
&gt; &gt;&gt;&gt; ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io
&gt; .StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io
&gt; .StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.streaming.runtime.io
&gt; .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
&gt; &gt;&gt;&gt; [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
&gt; &gt;&gt;&gt; Caused by: java.nio.channels.ClosedByInterruptException
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
&gt; &gt;&gt;&gt; ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
&gt; &gt;&gt;&gt; ~[?:1.8.0_144]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net
&gt; .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt;
&gt; org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt; org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at
&gt; &gt;&gt;&gt; org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1440)
&gt; &gt;&gt;&gt; ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&gt; &gt;&gt;&gt;&nbsp;&nbsp;&nbsp;&nbsp; ... 38 more
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt; 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt; ------------------ 原始邮件 ------------------
&gt; &gt;&gt;&gt; 发件人: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID&gt;;
&gt; &gt;&gt;&gt; 发送时间: 2020年9月8日(星期二) 上午10:47
&gt; &gt;&gt;&gt; 收件人: "user-zh"<user-zh@flink.apache.org&gt;;"MuChen"<9329748@qq.com&gt;;
&gt; &gt;&gt;&gt; 主题:&nbsp; 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
&gt; &gt;&gt;&gt;
&gt; &gt;&gt;&gt; 异常日志只有这些么?有没有详细点的
&gt; &gt;&gt;
&gt; &gt;&gt;
&gt; &gt;&gt;
&gt; &gt;&gt; --
&gt; &gt;&gt; Best regards!
&gt; &gt;&gt; Rui Li
&gt; &gt;&gt;
&gt; &gt;
&gt; &gt;
&gt; &gt; --
&gt; &gt; Best regards!
&gt; &gt; Rui Li
&gt; &gt;
&gt;
&gt;
&gt; --
&gt; Best regards!
&gt; Rui Li
&gt;


-- 
Best, Jingsong Lee

Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by Jingsong Li <ji...@gmail.com>.
插入Hive表的SQL也发下?

On Tue, Sep 8, 2020 at 9:44 PM Rui Li <li...@gmail.com> wrote:

> 另外也list一下没有提交的分区目录吧,看看里面的文件是什么状态
>
> On Tue, Sep 8, 2020 at 9:19 PM Rui Li <li...@gmail.com> wrote:
>
> > 作业有发生failover么?还是说作业能成功结束但是某些partition始终没提交?
> >
> > On Tue, Sep 8, 2020 at 5:20 PM MuChen <93...@qq.com> wrote:
> >
> >> hi, Rui Li:
> >> 如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
> >> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.
> >> AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table
> >> `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be
> committed
> >> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.
> >> MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
> >> to metastore
> >> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.
> >> SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22,
> hour=18}
> >> with success file
> >> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.
> >> AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table
> >> `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be
> committed
> >> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.
> >> MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
> >> to metastore
> >> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.
> >> SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22,
> hour=19}
> >> with success file
> >>
> >> 写hdfs的日志是都有的:
> >> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io
> .parquet.write.
> >> ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
> >> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
> >> 08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140
> >> .inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
> >> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io
> .parquet.write.
> >> ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
> >> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
> >> 08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142
> >> .inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
> >>
> >> ------------------ 原始邮件 ------------------
> >> *发件人:* "Rui Li" <li...@gmail.com>;
> >> *发送时间:* 2020年9月8日(星期二) 中午12:09
> >> *收件人:* "user-zh"<us...@dingtalk.com>;
> >> *抄送:* "MuChen"<93...@qq.com>;
> >> *主题:* Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
> >>
> >> streaming file committer在提交分区之前会打印这样的日志:
> >>
> >> LOG.info("Partition {} of table {} is ready to be committed", partSpec,
> tableIdentifier);
> >>
> >> partition commit policy会在成功提交分区以后打印这样的日志:
> >>
> >> LOG.info("Committed partition {} to metastore", partitionSpec);
> >>
> >> LOG.info("Committed partition {} with success file",
> context.partitionSpec());
> >>
> >> 可以检查一下这样的日志,看是不是卡在什么地方了
> >>
> >> On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid>
> wrote:
> >>
> >>> 就第二次提供的日志看,好像是你的namenode出现的问题
> >>>
> >>>
> >>> ------------------------------------------------------------------
> >>> 发件人:MuChen <93...@qq.com>
> >>> 发送时间:2020年9月8日(星期二) 10:56
> >>> 收件人:user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <
> >>> user-zh@flink.apache.org>
> >>> 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
> >>>
> >>> 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
> >>> 2020-09-04 17:17:59,520 INFO
> >>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> >>> invoking create of class ClientNamenodeProtocolTranslatorPB over
> >>> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
> >>> Trying to fail over immediately.
> >>> java.io.IOException: java.lang.InterruptedException
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1449)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> >>> ~[?:?]
> >>>     at
> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>> ~[?:1.8.0_144]
> >>>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> >>>     at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.streaming.runtime.io
> .StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.streaming.runtime.io
> .StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.streaming.runtime.io
> .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>> Caused by: java.lang.InterruptedException
> >>>     at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> >>> ~[?:1.8.0_144]
> >>>     at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> >>> ~[?:1.8.0_144]
> >>>     at
> >>>
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1443)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     ... 38 more
> >>> 2020-09-04 17:17:59,522 WARN
> >>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> >>> invoking class
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
> >>> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
> >>> failovers (15) exceeded maximum allowed (15)
> >>> java.io.IOException: Failed on local exception:
> >>> java.nio.channels.ClosedByInterruptException; Host Details : local
> host is:
> >>> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
> >>> "uhadoop-op3raf-master1":8020;
> >>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1474)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> >>> ~[?:?]
> >>>     at
> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>> ~[?:1.8.0_144]
> >>>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> >>>     at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.streaming.runtime.io
> .StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.streaming.runtime.io
> .StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.streaming.runtime.io
> .StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>> Caused by: java.nio.channels.ClosedByInterruptException
> >>>     at
> >>>
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> >>> ~[?:1.8.0_144]
> >>>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
> >>> ~[?:1.8.0_144]
> >>>     at org.apache.hadoop.net
> .SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>>
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at
> >>> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     at org.apache.hadoop.ipc.Client.call(Client.java:1440)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>>     ... 38 more
> >>>
> >>> 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
> >>>
> >>>
> >>> ------------------ 原始邮件 ------------------
> >>> 发件人: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
> >>> 发送时间: 2020年9月8日(星期二) 上午10:47
> >>> 收件人: "user-zh"<us...@qq.com>;
> >>> 主题:  回复:使用StreamingFileSink向hive metadata中增加分区部分失败
> >>>
> >>> 异常日志只有这些么?有没有详细点的
> >>
> >>
> >>
> >> --
> >> Best regards!
> >> Rui Li
> >>
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
>
>
> --
> Best regards!
> Rui Li
>


-- 
Best, Jingsong Lee

Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by Rui Li <li...@gmail.com>.
另外也list一下没有提交的分区目录吧,看看里面的文件是什么状态

On Tue, Sep 8, 2020 at 9:19 PM Rui Li <li...@gmail.com> wrote:

> 作业有发生failover么?还是说作业能成功结束但是某些partition始终没提交?
>
> On Tue, Sep 8, 2020 at 5:20 PM MuChen <93...@qq.com> wrote:
>
>> hi, Rui Li:
>> 如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
>> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.
>> AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table
>> `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
>> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.
>> MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
>> to metastore
>> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.
>> SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
>> with success file
>> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.
>> AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table
>> `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
>> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.
>> MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
>> to metastore
>> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.
>> SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
>> with success file
>>
>> 写hdfs的日志是都有的:
>> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.
>> ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
>> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
>> 08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140
>> .inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
>> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.
>> ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
>> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-
>> 08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142
>> .inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
>>
>> ------------------ 原始邮件 ------------------
>> *发件人:* "Rui Li" <li...@gmail.com>;
>> *发送时间:* 2020年9月8日(星期二) 中午12:09
>> *收件人:* "user-zh"<us...@dingtalk.com>;
>> *抄送:* "MuChen"<93...@qq.com>;
>> *主题:* Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>>
>> streaming file committer在提交分区之前会打印这样的日志:
>>
>> LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
>>
>> partition commit policy会在成功提交分区以后打印这样的日志:
>>
>> LOG.info("Committed partition {} to metastore", partitionSpec);
>>
>> LOG.info("Committed partition {} with success file", context.partitionSpec());
>>
>> 可以检查一下这样的日志,看是不是卡在什么地方了
>>
>> On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid> wrote:
>>
>>> 就第二次提供的日志看,好像是你的namenode出现的问题
>>>
>>>
>>> ------------------------------------------------------------------
>>> 发件人:MuChen <93...@qq.com>
>>> 发送时间:2020年9月8日(星期二) 10:56
>>> 收件人:user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <
>>> user-zh@flink.apache.org>
>>> 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>>>
>>> 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
>>> 2020-09-04 17:17:59,520 INFO
>>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
>>> invoking create of class ClientNamenodeProtocolTranslatorPB over
>>> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
>>> Trying to fail over immediately.
>>> java.io.IOException: java.lang.InterruptedException
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1449)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>>>     at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>>> ~[?:?]
>>>     at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> ~[?:1.8.0_144]
>>>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>>>     at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>>>     at
>>> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>>> Caused by: java.lang.InterruptedException
>>>     at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
>>> ~[?:1.8.0_144]
>>>     at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>>> ~[?:1.8.0_144]
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1443)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     ... 38 more
>>> 2020-09-04 17:17:59,522 WARN
>>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
>>> invoking class
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
>>> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
>>> failovers (15) exceeded maximum allowed (15)
>>> java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is:
>>> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
>>> "uhadoop-op3raf-master1":8020;
>>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1474)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>>>     at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>>> ~[?:?]
>>>     at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> ~[?:1.8.0_144]
>>>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>>>     at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>>>     at
>>> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>>> Caused by: java.nio.channels.ClosedByInterruptException
>>>     at
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>>> ~[?:1.8.0_144]
>>>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
>>> ~[?:1.8.0_144]
>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at
>>> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1440)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>>     ... 38 more
>>>
>>> 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
>>>
>>>
>>> ------------------ 原始邮件 ------------------
>>> 发件人: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
>>> 发送时间: 2020年9月8日(星期二) 上午10:47
>>> 收件人: "user-zh"<us...@qq.com>;
>>> 主题:  回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>>>
>>> 异常日志只有这些么?有没有详细点的
>>
>>
>>
>> --
>> Best regards!
>> Rui Li
>>
>
>
> --
> Best regards!
> Rui Li
>


-- 
Best regards!
Rui Li

Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by Rui Li <li...@gmail.com>.
作业有发生failover么?还是说作业能成功结束但是某些partition始终没提交?

On Tue, Sep 8, 2020 at 5:20 PM MuChen <93...@qq.com> wrote:

> hi, Rui Li:
> 如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.
> AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table
> `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.
> MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
> to metastore
> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.
> SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18}
> with success file
> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.
> AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table
> `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.
> MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
> to metastore
> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.
> SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19}
> with success file
>
> 写hdfs的日志是都有的:
> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.
> ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08
> -22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.
> 1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.
> ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://
> Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08
> -22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.
> 2700eded-5ed0-4794-8ee9-21721c0c2ffd
>
> ------------------ 原始邮件 ------------------
> *发件人:* "Rui Li" <li...@gmail.com>;
> *发送时间:* 2020年9月8日(星期二) 中午12:09
> *收件人:* "user-zh"<us...@dingtalk.com>;
> *抄送:* "MuChen"<93...@qq.com>;
> *主题:* Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>
> streaming file committer在提交分区之前会打印这样的日志:
>
> LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
>
> partition commit policy会在成功提交分区以后打印这样的日志:
>
> LOG.info("Committed partition {} to metastore", partitionSpec);
>
> LOG.info("Committed partition {} with success file", context.partitionSpec());
>
> 可以检查一下这样的日志,看是不是卡在什么地方了
>
> On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid> wrote:
>
>> 就第二次提供的日志看,好像是你的namenode出现的问题
>>
>>
>> ------------------------------------------------------------------
>> 发件人:MuChen <93...@qq.com>
>> 发送时间:2020年9月8日(星期二) 10:56
>> 收件人:user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <
>> user-zh@flink.apache.org>
>> 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>>
>> 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
>> 2020-09-04 17:17:59,520 INFO
>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
>> invoking create of class ClientNamenodeProtocolTranslatorPB over
>> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
>> Trying to fail over immediately.
>> java.io.IOException: java.lang.InterruptedException
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1449)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>>     at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
>>     at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> ~[?:1.8.0_144]
>>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>>     at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>>     at
>> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>> Caused by: java.lang.InterruptedException
>>     at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
>> ~[?:1.8.0_144]
>>     at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>> ~[?:1.8.0_144]
>>     at
>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1443)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     ... 38 more
>> 2020-09-04 17:17:59,522 WARN
>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
>> invoking class
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
>> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
>> failovers (15) exceeded maximum allowed (15)
>> java.io.IOException: Failed on local exception:
>> java.nio.channels.ClosedByInterruptException; Host Details : local host is:
>> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
>> "uhadoop-op3raf-master1":8020;
>>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1474)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>>     at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
>>     at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> ~[?:1.8.0_144]
>>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>>     at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>>     at
>> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>     at
>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>> Caused by: java.nio.channels.ClosedByInterruptException
>>     at
>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>> ~[?:1.8.0_144]
>>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
>> ~[?:1.8.0_144]
>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at
>> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     at org.apache.hadoop.ipc.Client.call(Client.java:1440)
>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>     ... 38 more
>>
>> 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
>>
>>
>> ------------------ 原始邮件 ------------------
>> 发件人: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
>> 发送时间: 2020年9月8日(星期二) 上午10:47
>> 收件人: "user-zh"<us...@qq.com>;
>> 主题:  回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>>
>> 异常日志只有这些么?有没有详细点的
>
>
>
> --
> Best regards!
> Rui Li
>


-- 
Best regards!
Rui Li

回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by MuChen <93...@qq.com>.
hi, Rui Li:
如你所说,的确有类似日志,但是只有成功增加的分区的日志,没有失败分区的日志:
2020-09-04 17:17:10,548 INFO  org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
2020-09-04 17:17:10,716 INFO  org.apache.flink.table.filesystem.MetastoreCommitPolicy      [] - Committed partition {dt=2020-08-22, hour=18} to metastore
2020-09-04 17:17:10,720 INFO  org.apache.flink.table.filesystem.SuccessFileCommitPolicy    [] - Committed partition {dt=2020-08-22, hour=18} with success file
2020-09-04 17:17:19,652 INFO  org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
2020-09-04 17:17:19,820 INFO  org.apache.flink.table.filesystem.MetastoreCommitPolicy      [] - Committed partition {dt=2020-08-22, hour=19} to metastore
2020-09-04 17:17:19,824 INFO  org.apache.flink.table.filesystem.SuccessFileCommitPolicy    [] - Committed partition {dt=2020-08-22, hour=19} with success file






写hdfs的日志是都有的:
2020-09-04 17:16:04,100 INFO  org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de

2020-09-04 17:16:04,126 INFO  org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd



------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "Rui Li"                                                                                    <lirui.fudan@gmail.com&gt;;
发送时间:&nbsp;2020年9月8日(星期二) 中午12:09
收件人:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;"夏帅"<jkillers@dingtalk.com&gt;;
抄送:&nbsp;"MuChen"<9329748@qq.com&gt;;
主题:&nbsp;Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败



streaming file committer在提交分区之前会打印这样的日志:LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
partition commit policy会在成功提交分区以后打印这样的日志:
LOG.info("Committed partition {} to metastore", partitionSpec);LOG.info("Committed partition {} with success file", context.partitionSpec());
可以检查一下这样的日志,看是不是卡在什么地方了


On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jkillers@dingtalk.com.invalid&gt; wrote:

就第二次提供的日志看,好像是你的namenode出现的问题
 
 
 ------------------------------------------------------------------
 发件人:MuChen <9329748@qq.com&gt;
 发送时间:2020年9月8日(星期二) 10:56
 收件人:user-zh@flink.apache.org 夏帅 <jkillers@dingtalk.com&gt;; user-zh <user-zh@flink.apache.org&gt;
 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
 
 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
 2020-09-04 17:17:59,520 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
 java.io.IOException: java.lang.InterruptedException
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
 &nbsp; &nbsp; at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
 &nbsp; &nbsp; at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
 &nbsp; &nbsp; at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
 Caused by: java.lang.InterruptedException
 &nbsp; &nbsp; at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
 &nbsp; &nbsp; at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; ... 38 more
 2020-09-04 17:17:59,522 WARN org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
 java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020; 
 &nbsp; &nbsp; at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
 &nbsp; &nbsp; at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
 &nbsp; &nbsp; at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
 &nbsp; &nbsp; at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
 Caused by: java.nio.channels.ClosedByInterruptException
 &nbsp; &nbsp; at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
 &nbsp; &nbsp; at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
 &nbsp; &nbsp; at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
 &nbsp; &nbsp; ... 38 more
 
 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
 
 
 ------------------ 原始邮件 ------------------
 发件人: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID&gt;;
 发送时间: 2020年9月8日(星期二) 上午10:47
 收件人: "user-zh"<user-zh@flink.apache.org&gt;;"MuChen"<9329748@qq.com&gt;;
 主题:&nbsp; 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
 
 异常日志只有这些么?有没有详细点的



-- 
Best regards!
 Rui Li

Re: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by Rui Li <li...@gmail.com>.
streaming file committer在提交分区之前会打印这样的日志:

LOG.info("Partition {} of table {} is ready to be committed",
partSpec, tableIdentifier);

partition commit policy会在成功提交分区以后打印这样的日志:

LOG.info("Committed partition {} to metastore", partitionSpec);

LOG.info("Committed partition {} with success file", context.partitionSpec());

可以检查一下这样的日志,看是不是卡在什么地方了

On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid> wrote:

> 就第二次提供的日志看,好像是你的namenode出现的问题
>
>
> ------------------------------------------------------------------
> 发件人:MuChen <93...@qq.com>
> 发送时间:2020年9月8日(星期二) 10:56
> 收件人:user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <
> user-zh@flink.apache.org>
> 主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>
> 在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
> 2020-09-04 17:17:59,520 INFO
> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> invoking create of class ClientNamenodeProtocolTranslatorPB over
> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
> Trying to fail over immediately.
> java.io.IOException: java.lang.InterruptedException
>     at org.apache.hadoop.ipc.Client.call(Client.java:1449)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>     at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_144]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>     at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>     at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> Caused by: java.lang.InterruptedException
>     at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> ~[?:1.8.0_144]
>     at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> ~[?:1.8.0_144]
>     at
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.ipc.Client.call(Client.java:1443)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     ... 38 more
> 2020-09-04 17:17:59,522 WARN
> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> invoking class
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
> failovers (15) exceeded maximum allowed (15)
> java.io.IOException: Failed on local exception:
> java.nio.channels.ClosedByInterruptException; Host Details : local host is:
> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
> "uhadoop-op3raf-master1":8020;
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.ipc.Client.call(Client.java:1474)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>     at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_144]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>     at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>     at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>     at
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> Caused by: java.nio.channels.ClosedByInterruptException
>     at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> ~[?:1.8.0_144]
>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
> ~[?:1.8.0_144]
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at
> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     at org.apache.hadoop.ipc.Client.call(Client.java:1440)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>     ... 38 more
>
> 补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的
>
>
> ------------------ 原始邮件 ------------------
> 发件人: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
> 发送时间: 2020年9月8日(星期二) 上午10:47
> 收件人: "user-zh"<us...@qq.com>;
> 主题:  回复:使用StreamingFileSink向hive metadata中增加分区部分失败
>
> 异常日志只有这些么?有没有详细点的



-- 
Best regards!
Rui Li

回复:回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by 夏帅 <jk...@dingtalk.com.INVALID>.
就第二次提供的日志看,好像是你的namenode出现的问题


------------------------------------------------------------------
发件人:MuChen <93...@qq.com>
发送时间:2020年9月8日(星期二) 10:56
收件人:user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <us...@flink.apache.org>
主 题:回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
2020-09-04 17:17:59,520 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.InterruptedException
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
    at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
    at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    ... 38 more
2020-09-04 17:17:59,522 WARN org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
    at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
    at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
    ... 38 more

补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的


------------------ 原始邮件 ------------------
发件人: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
发送时间: 2020年9月8日(星期二) 上午10:47
收件人: "user-zh"<us...@qq.com>;
主题:  回复:使用StreamingFileSink向hive metadata中增加分区部分失败

异常日志只有这些么?有没有详细点的

回复: 回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by MuChen <93...@qq.com>.
在checkpoint失败的时间,tm上还有一些info和warn级别的日志:
2020-09-04 17:17:59,520 INFO  org.apache.hadoop.io.retry.RetryInvocationHandler            [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
java.io.IOException: java.lang.InterruptedException
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
&nbsp;&nbsp;&nbsp;&nbsp;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.InterruptedException
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;... 38 more
2020-09-04 17:17:59,522 WARN  org.apache.hadoop.io.retry.RetryInvocationHandler            [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020; 
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
&nbsp;&nbsp;&nbsp;&nbsp;at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.nio.channels.ClosedByInterruptException
&nbsp;&nbsp;&nbsp;&nbsp;at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
&nbsp;&nbsp;&nbsp;&nbsp;... 38 more


补充:程序多次执行,均会出现部分分区创建失败的情况,而且每次失败的分区是不同的






------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "user-zh@flink.apache.org 夏帅"                                                                                    <jkillers@dingtalk.com.INVALID&gt;;
发送时间:&nbsp;2020年9月8日(星期二) 上午10:47
收件人:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;"MuChen"<9329748@qq.com&gt;;

主题:&nbsp;  回复:使用StreamingFileSink向hive metadata中增加分区部分失败



异常日志只有这些么?有没有详细点的

回复:使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by 夏帅 <jk...@dingtalk.com.INVALID>.
异常日志只有这些么?有没有详细点的

回复: 使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by MuChen <93...@qq.com>.
还是没有找到原因,请大家帮诊断一下




------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "MuChen"                                                                                    <9329748@qq.com&gt;;
发送时间:&nbsp;2020年9月7日(星期一) 中午11:01
收件人:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;

主题:&nbsp;回复: 使用StreamingFileSink向hive metadata中增加分区部分失败



hi,jingsong:


图片发失败了,传到了图床:
https://s1.ax1x.com/2020/09/07/wn1CFg.png


checkpoint失败日志:
2020-09-04 17:17:59
org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable failure threshold.
&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleJobLevelCheckpointException(CheckpointFailureManager.java:66)
&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1626)
&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1603)
&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.access$600(CheckpointCoordinator.java:90)
&nbsp;&nbsp;&nbsp; at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:1736)
&nbsp;&nbsp;&nbsp; at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
&nbsp;&nbsp;&nbsp; at java.util.concurrent.FutureTask.run(FutureTask.java:266)
&nbsp;&nbsp;&nbsp; at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
&nbsp;&nbsp;&nbsp; at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
&nbsp;&nbsp;&nbsp; at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
&nbsp;&nbsp;&nbsp; at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
&nbsp;&nbsp;&nbsp; at java.lang.Thread.run(Thread.java:748)





------------------ 原始邮件 ------------------
发件人:                                                                                                                        "user-zh"                                                                                    <jingsonglee0@gmail.com&gt;;
发送时间:&nbsp;2020年9月7日(星期一) 上午10:55
收件人:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;

主题:&nbsp;Re: 使用StreamingFileSink向hive metadata中增加分区部分失败



失败的图没有呢。。具体什么异常?

On Mon, Sep 7, 2020 at 10:23 AM MuChen <9329748@qq.com&gt; wrote:

&gt; hi, all:
&gt; 麻烦大佬们帮看个问题,多谢!
&gt;
&gt; 处理逻辑如下
&gt; 1. 使用DataStream API读取kafka中的数据,写入DataStream ds1中
&gt; 2. 新建一个tableEnv,并注册hive catalog:
&gt;&nbsp;&nbsp; &nbsp; tableEnv.registerCatalog(catalogName, catalog);
&gt;&nbsp;&nbsp; &nbsp; tableEnv.useCatalog(catalogName);
&gt; 3. 声明以ds1为数据源的table
&gt;&nbsp;&nbsp; &nbsp; Table sourcetable = tableEnv.fromDataStream(ds1);
&gt;&nbsp;&nbsp; &nbsp; String souceTableName = "music_source";
&gt;&nbsp;&nbsp; &nbsp; tableEnv.createTemporaryView(souceTableName, sourcetable);
&gt; 4. 创建一张hive表:
&gt;
&gt; CREATE TABLE `dwd_music_copyright_test`(
&gt; &nbsp; `url` string COMMENT 'url',
&gt; &nbsp; `md5` string COMMENT 'md5',
&gt; &nbsp; `utime` bigint COMMENT '时间',
&gt; &nbsp; `title` string COMMENT '歌曲名',
&gt; &nbsp; `singer` string COMMENT '演唱者',
&gt; &nbsp; `company` string COMMENT '公司',
&gt; &nbsp; `level` int COMMENT '置信度.0是标题切词,1是acrcloud返回的结果,3是人工标准')
&gt; PARTITIONED BY (
&gt; &nbsp; `dt` string,
&gt; &nbsp; `hour` string)ROW FORMAT SERDE
&gt; &nbsp; 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'STORED AS INPUTFORMAT
&gt; &nbsp; 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
&gt; OUTPUTFORMAT
&gt; &nbsp; 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
&gt; LOCATION
&gt; &nbsp; 'hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test'
&gt; TBLPROPERTIES (
&gt; &nbsp; 'connector'='HiveCatalog',
&gt; &nbsp; 'partition.time-extractor.timestamp-pattern'='$dt $hour:00:00',
&gt; &nbsp; 'sink.partition-commit.delay'='1 min',
&gt; &nbsp; 'sink.partition-commit.policy.kind'='metastore,success-file',
&gt; &nbsp; 'sink.partition-commit.trigger'='partition-time',
&gt; &nbsp; 'sink.rolling-policy.check-interval'='30s',
&gt; &nbsp; 'sink.rolling-policy.rollover-interval'='1min',
&gt; &nbsp; 'sink.rolling-policy.file-size'='1MB');
&gt;
&gt;
&gt; 5. 将step3表中的数据插入dwd_music_copyright_test
&gt;
&gt; 环境
&gt;
&gt; flink:1.11
&gt; kafka:1.1.1
&gt; hadoop:2.6.0
&gt; hive:1.2.0
&gt;
&gt; 问题
&gt;&nbsp;&nbsp; &nbsp; 程序运行后,发现hive catalog部分分区未成功创建,如下未成功创建hour=02和hour=03分区:
&gt;
&gt; show partitions rt_dwd.dwd_music_copyright_test;
&gt;
&gt; | dt=2020-08-29/hour=00&nbsp; |
&gt; | dt=2020-08-29/hour=01&nbsp; |
&gt; | dt=2020-08-29/hour=04&nbsp; |
&gt; | dt=2020-08-29/hour=05&nbsp; |
&gt;
&gt;&nbsp; 但是hdfs目录下有文件生成:
&gt;
&gt; $ hadoop fs -du -h /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/4.5 K &nbsp; 13.4 K&nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=002.0 K &nbsp; 6.1 K &nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=011.7 K &nbsp; 5.1 K &nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=021.3 K &nbsp; 3.8 K &nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=033.1 K &nbsp; 9.2 K &nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=04
&gt;
&gt;
&gt; 且手动add partition后可以正常读取数据。
&gt;
&gt; 通过flink WebUI可以看到,过程中有checkpoint在StreamingFileCommitter时失败的情况发生:
&gt;
&gt;
&gt;
&gt;
&gt;
&gt; 请问:
&gt;
&gt; 1. exactly-once只能保证写sink文件,不能保证更新catalog吗?
&gt; 2. 是的话有什么方案解决这个问题吗?
&gt; 3.
&gt; EXACTLY_ONCE有没有必要指定kafka参数isolation.level=read_committed和enable.auto.commit=false?是不是有了如下设置就可以保证EXACTLY_ONCE?
&gt;
&gt; streamEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);tableEnv.getConfig().getConfiguration().set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE, CheckpointingMode.EXACTLY_ONCE);
&gt;
&gt;

-- 
Best, Jingsong Lee

回复: 使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by MuChen <93...@qq.com>.
hi,jingsong:


图片发失败了,传到了图床:
https://s1.ax1x.com/2020/09/07/wn1CFg.png


checkpoint失败日志:
2020-09-04 17:17:59
org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable failure threshold.
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleJobLevelCheckpointException(CheckpointFailureManager.java:66)
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1626)
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1603)
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.access$600(CheckpointCoordinator.java:90)
&nbsp;&nbsp;&nbsp;&nbsp;at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:1736)
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.FutureTask.run(FutureTask.java:266)
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
&nbsp;&nbsp;&nbsp;&nbsp;at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
&nbsp;&nbsp;&nbsp;&nbsp;at java.lang.Thread.run(Thread.java:748)





------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "user-zh"                                                                                    <jingsonglee0@gmail.com&gt;;
发送时间:&nbsp;2020年9月7日(星期一) 上午10:55
收件人:&nbsp;"user-zh"<user-zh@flink.apache.org&gt;;

主题:&nbsp;Re: 使用StreamingFileSink向hive metadata中增加分区部分失败



失败的图没有呢。。具体什么异常?

On Mon, Sep 7, 2020 at 10:23 AM MuChen <9329748@qq.com&gt; wrote:

&gt; hi, all:
&gt; 麻烦大佬们帮看个问题,多谢!
&gt;
&gt; 处理逻辑如下
&gt; 1. 使用DataStream API读取kafka中的数据,写入DataStream ds1中
&gt; 2. 新建一个tableEnv,并注册hive catalog:
&gt;&nbsp;&nbsp;&nbsp;&nbsp; tableEnv.registerCatalog(catalogName, catalog);
&gt;&nbsp;&nbsp;&nbsp;&nbsp; tableEnv.useCatalog(catalogName);
&gt; 3. 声明以ds1为数据源的table
&gt;&nbsp;&nbsp;&nbsp;&nbsp; Table sourcetable = tableEnv.fromDataStream(ds1);
&gt;&nbsp;&nbsp;&nbsp;&nbsp; String souceTableName = "music_source";
&gt;&nbsp;&nbsp;&nbsp;&nbsp; tableEnv.createTemporaryView(souceTableName, sourcetable);
&gt; 4. 创建一张hive表:
&gt;
&gt; CREATE TABLE `dwd_music_copyright_test`(
&gt;&nbsp;&nbsp; `url` string COMMENT 'url',
&gt;&nbsp;&nbsp; `md5` string COMMENT 'md5',
&gt;&nbsp;&nbsp; `utime` bigint COMMENT '时间',
&gt;&nbsp;&nbsp; `title` string COMMENT '歌曲名',
&gt;&nbsp;&nbsp; `singer` string COMMENT '演唱者',
&gt;&nbsp;&nbsp; `company` string COMMENT '公司',
&gt;&nbsp;&nbsp; `level` int COMMENT '置信度.0是标题切词,1是acrcloud返回的结果,3是人工标准')
&gt; PARTITIONED BY (
&gt;&nbsp;&nbsp; `dt` string,
&gt;&nbsp;&nbsp; `hour` string)ROW FORMAT SERDE
&gt;&nbsp;&nbsp; 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'STORED AS INPUTFORMAT
&gt;&nbsp;&nbsp; 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
&gt; OUTPUTFORMAT
&gt;&nbsp;&nbsp; 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
&gt; LOCATION
&gt;&nbsp;&nbsp; 'hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test'
&gt; TBLPROPERTIES (
&gt;&nbsp;&nbsp; 'connector'='HiveCatalog',
&gt;&nbsp;&nbsp; 'partition.time-extractor.timestamp-pattern'='$dt $hour:00:00',
&gt;&nbsp;&nbsp; 'sink.partition-commit.delay'='1 min',
&gt;&nbsp;&nbsp; 'sink.partition-commit.policy.kind'='metastore,success-file',
&gt;&nbsp;&nbsp; 'sink.partition-commit.trigger'='partition-time',
&gt;&nbsp;&nbsp; 'sink.rolling-policy.check-interval'='30s',
&gt;&nbsp;&nbsp; 'sink.rolling-policy.rollover-interval'='1min',
&gt;&nbsp;&nbsp; 'sink.rolling-policy.file-size'='1MB');
&gt;
&gt;
&gt; 5. 将step3表中的数据插入dwd_music_copyright_test
&gt;
&gt; 环境
&gt;
&gt; flink:1.11
&gt; kafka:1.1.1
&gt; hadoop:2.6.0
&gt; hive:1.2.0
&gt;
&gt; 问题
&gt;&nbsp;&nbsp;&nbsp;&nbsp; 程序运行后,发现hive catalog部分分区未成功创建,如下未成功创建hour=02和hour=03分区:
&gt;
&gt; show partitions rt_dwd.dwd_music_copyright_test;
&gt;
&gt; | dt=2020-08-29/hour=00&nbsp; |
&gt; | dt=2020-08-29/hour=01&nbsp; |
&gt; | dt=2020-08-29/hour=04&nbsp; |
&gt; | dt=2020-08-29/hour=05&nbsp; |
&gt;
&gt;&nbsp; 但是hdfs目录下有文件生成:
&gt;
&gt; $ hadoop fs -du -h /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/4.5 K&nbsp;&nbsp; 13.4 K&nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=002.0 K&nbsp;&nbsp; 6.1 K&nbsp;&nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=011.7 K&nbsp;&nbsp; 5.1 K&nbsp;&nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=021.3 K&nbsp;&nbsp; 3.8 K&nbsp;&nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=033.1 K&nbsp;&nbsp; 9.2 K&nbsp;&nbsp; /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=04
&gt;
&gt;
&gt; 且手动add partition后可以正常读取数据。
&gt;
&gt; 通过flink WebUI可以看到,过程中有checkpoint在StreamingFileCommitter时失败的情况发生:
&gt;
&gt;
&gt;
&gt;
&gt;
&gt; 请问:
&gt;
&gt; 1. exactly-once只能保证写sink文件,不能保证更新catalog吗?
&gt; 2. 是的话有什么方案解决这个问题吗?
&gt; 3.
&gt; EXACTLY_ONCE有没有必要指定kafka参数isolation.level=read_committed和enable.auto.commit=false?是不是有了如下设置就可以保证EXACTLY_ONCE?
&gt;
&gt; streamEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);tableEnv.getConfig().getConfiguration().set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE, CheckpointingMode.EXACTLY_ONCE);
&gt;
&gt;

-- 
Best, Jingsong Lee

Re: 使用StreamingFileSink向hive metadata中增加分区部分失败

Posted by Jingsong Li <ji...@gmail.com>.
失败的图没有呢。。具体什么异常?

On Mon, Sep 7, 2020 at 10:23 AM MuChen <93...@qq.com> wrote:

> hi, all:
> 麻烦大佬们帮看个问题,多谢!
>
> 处理逻辑如下
> 1. 使用DataStream API读取kafka中的数据,写入DataStream ds1中
> 2. 新建一个tableEnv,并注册hive catalog:
>     tableEnv.registerCatalog(catalogName, catalog);
>     tableEnv.useCatalog(catalogName);
> 3. 声明以ds1为数据源的table
>     Table sourcetable = tableEnv.fromDataStream(ds1);
>     String souceTableName = "music_source";
>     tableEnv.createTemporaryView(souceTableName, sourcetable);
> 4. 创建一张hive表:
>
> CREATE TABLE `dwd_music_copyright_test`(
>   `url` string COMMENT 'url',
>   `md5` string COMMENT 'md5',
>   `utime` bigint COMMENT '时间',
>   `title` string COMMENT '歌曲名',
>   `singer` string COMMENT '演唱者',
>   `company` string COMMENT '公司',
>   `level` int COMMENT '置信度.0是标题切词,1是acrcloud返回的结果,3是人工标准')
> PARTITIONED BY (
>   `dt` string,
>   `hour` string)ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
> LOCATION
>   'hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test'
> TBLPROPERTIES (
>   'connector'='HiveCatalog',
>   'partition.time-extractor.timestamp-pattern'='$dt $hour:00:00',
>   'sink.partition-commit.delay'='1 min',
>   'sink.partition-commit.policy.kind'='metastore,success-file',
>   'sink.partition-commit.trigger'='partition-time',
>   'sink.rolling-policy.check-interval'='30s',
>   'sink.rolling-policy.rollover-interval'='1min',
>   'sink.rolling-policy.file-size'='1MB');
>
>
> 5. 将step3表中的数据插入dwd_music_copyright_test
>
> 环境
>
> flink:1.11
> kafka:1.1.1
> hadoop:2.6.0
> hive:1.2.0
>
> 问题
>     程序运行后,发现hive catalog部分分区未成功创建,如下未成功创建hour=02和hour=03分区:
>
> show partitions rt_dwd.dwd_music_copyright_test;
>
> | dt=2020-08-29/hour=00  |
> | dt=2020-08-29/hour=01  |
> | dt=2020-08-29/hour=04  |
> | dt=2020-08-29/hour=05  |
>
>  但是hdfs目录下有文件生成:
>
> $ hadoop fs -du -h /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/4.5 K   13.4 K  /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=002.0 K   6.1 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=011.7 K   5.1 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=021.3 K   3.8 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=033.1 K   9.2 K   /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-29/hour=04
>
>
> 且手动add partition后可以正常读取数据。
>
> 通过flink WebUI可以看到,过程中有checkpoint在StreamingFileCommitter时失败的情况发生:
>
>
>
>
>
> 请问:
>
> 1. exactly-once只能保证写sink文件,不能保证更新catalog吗?
> 2. 是的话有什么方案解决这个问题吗?
> 3.
> EXACTLY_ONCE有没有必要指定kafka参数isolation.level=read_committed和enable.auto.commit=false?是不是有了如下设置就可以保证EXACTLY_ONCE?
>
> streamEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);tableEnv.getConfig().getConfiguration().set(ExecutionCheckpointingOptions.CHECKPOINTING_MODE, CheckpointingMode.EXACTLY_ONCE);
>
>

-- 
Best, Jingsong Lee