Posted to user-zh@flink.apache.org by 夏帅 <jk...@dingtalk.com.INVALID> on 2020/09/08 02:47:07 UTC
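Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Are these the only exception logs? Do you have anything more detailed?
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails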
Posted by Jingsong Li <ji...@gmail.com>.
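Thank you very much for the feedback. This does look like a real problem; I'll open a JIRA issue to track it:
https://issues.apache.org/jira/browse/FLINK-19166
The fix will be included in the upcoming 1.11.2 release.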
Best,
Jingsong
On Wed, Sep 9, 2020 at 10:44 AM MuChen <93...@qq.com> wrote:
> Hi Rui Li,
> The files in the directories of the uncommitted partitions are in the committed state, and after a manual add partition the data can be queried normally:
>
> /user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-19/hour=07/part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1031
>
> ------------------ Original message ------------------
> From: "user-zh" <lirui.fudan@gmail.com>;
> Sent: Tuesday, September 8, 2020, 9:43 PM
> To: "MuChen"<9329748@qq.com>;
> Cc: "user-zh"<user-zh@flink.apache.org>;"夏帅"<jkillers@dingtalk.com>;
> Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>
> Also, please list the directories of the uncommitted partitions so we can see what state the files in them are in.
>
> On Tue, Sep 8, 2020 at 9:19 PM Rui Li <lirui.fudan@gmail.com> wrote:
>
> > Did the job fail over? Or does the job finish successfully while some partitions are never committed?
> >
> > On Tue, Sep 8, 2020 at 5:20 PM MuChen <9329748@qq.com> wrote:
> >
> >> Hi Rui Li,
> >> As you said, there are indeed log entries like these, but only for the partitions that were added successfully; there are none for the partitions that failed:
> >> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> >> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} to metastore
> >> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} with success file
> >> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> >> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} to metastore
> >> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} with success file
> >>
> >> The HDFS write logs are all present:
> >> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
> >> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
> >>
> >> ------------------ Original message ------------------
> >> *From:* "Rui Li" <lirui.fudan@gmail.com>;
> >> *Sent:* Tuesday, September 8, 2020, 12:09 PM
> >> *To:* "user-zh"<user-zh@flink.apache.org>;"夏帅"<jkillers@dingtalk.com>;
> >> *Cc:* "MuChen"<9329748@qq.com>;
> >> *Subject:* Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
> >> The streaming file committer prints a log message like this before it commits a partition:
> >>
> >> LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
> >>
> >> The partition commit policies print log messages like these after a partition has been committed successfully:
> >>
> >> LOG.info("Committed partition {} to metastore", partitionSpec);
> >>
> >> LOG.info("Committed partition {} with success file", context.partitionSpec());
> >>
> >> Please check for these log messages to see whether the job is getting stuck somewhere.
> >>
> >> On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jkillers@dingtalk.com.invalid> wrote:
> >>
> >>> Judging from the logs you provided the second time, it looks like a problem with your NameNode.
> >>>
> >>> ------------------------------------------------------------------
> >>> From: MuChen <9329748@qq.com>
> >>> Sent: Tuesday, September 8, 2020, 10:56
> >>> To: user-zh@flink.apache.org 夏帅 <jkillers@dingtalk.com>; user-zh <user-zh@flink.apache.org>
> >>> Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
> >>>
> >>> At the time the checkpoint failed, there were also some INFO- and WARN-level logs on the TaskManager:
> >>> 2020-09-04 17:17:59,520 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
> >>> java.io.IOException: java.lang.InterruptedException
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> >>> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
> >>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
> >>> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> >>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> >>> at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>> Caused by: java.lang.InterruptedException
> >>> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
> >>> at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
> >>> at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> ... 38 more
> >>> 2020-09-04 17:17:59,522 WARN org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
> >>> java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020;
> >>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> >>> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
> >>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
> >>> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> >>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> >>> at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>> Caused by: java.nio.channels.ClosedByInterruptException
> >>> at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
> >>> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
> >>> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> ... 38 more
> >>>
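> >>> Additional note: I have run the program many times, and each time some partitions fail to be created; the partitions that fail are different on every run.
> >>>
> >>>
> >>> ------------------ Original message ------------------
> >>> From: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID>;
> >>> Sent: Tuesday, September 8, 2020, 10:47 AM
> >>> To: "user-zh"<user-zh@flink.apache.org>;"MuChen"<9329748@qq.com>;
> >>> Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
> >>>
> >>> Are these the only exception logs? Do you have anything more detailed?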
> >>
> >>
> >>
> >> --
> >> Best regards!
> >> Rui Li
> >>
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
>
>
> --
> Best regards!
> Rui Li
--
Best, Jingsong Lee
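
The log check discussed in the quoted thread above (write logs exist for every partition, but commit logs only for some) can be automated. Below is a minimal sketch in Java, not part of Flink: the class and method names are hypothetical, and it assumes each log message sits on a single line in the TaskManager log file, unlike the mail-wrapped excerpts in this thread. The two message formats it matches are the ones shown in the excerpts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists partitions that appear in writer logs
// but never in a "Committed partition {...} to metastore" log line.
class PartitionCommitLogCheck {
    // Message formats copied from the log excerpts in this thread.
    private static final Pattern WRITTEN =
            Pattern.compile("creating real writer to write at (\\S+)");
    private static final Pattern COMMITTED =
            Pattern.compile("Committed partition \\{(.+?)\\} to metastore");

    // Turns ".../dt=2020-08-22/hour=07/file" into "dt=2020-08-22, hour=07",
    // the same form the commit log messages use.
    static String partitionOfPath(String path) {
        List<String> parts = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (segment.contains("=")) {
                parts.add(segment);
            }
        }
        return String.join(", ", parts);
    }

    static Set<String> writtenButNotCommitted(List<String> logLines) {
        Set<String> written = new TreeSet<>();
        Set<String> committed = new HashSet<>();
        for (String line : logLines) {
            Matcher w = WRITTEN.matcher(line);
            if (w.find()) {
                written.add(partitionOfPath(w.group(1)));
            }
            Matcher c = COMMITTED.matcher(line);
            if (c.find()) {
                committed.add(c.group(1));
            }
        }
        written.removeAll(committed);
        return written;
    }

    public static void main(String[] args) {
        String base = "hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/";
        List<String> lines = Arrays.asList(
                "INFO ParquetRecordWriterWrapper [] - creating real writer to write at "
                        + base + "dt=2020-08-22/hour=07/.part-0-1140.inprogress",
                "INFO ParquetRecordWriterWrapper [] - creating real writer to write at "
                        + base + "dt=2020-08-22/hour=19/.part-0-1142.inprogress",
                "INFO MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} to metastore");
        // Prints the partitions that were written but have no metastore commit log.
        System.out.println(writtenButNotCommitted(lines));
    }
}
```

In practice the lines would be read from the TaskManager log files rather than hard-coded; the file names in `main` are shortened placeholders.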
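Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by MuChen <93...@qq.com>.
Hi Rui Li,
The files in the directories of the uncommitted partitions are in the committed state, and after a manual add partition the data can be queried normally: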
/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-19/hour=07/part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1031
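
The message does not show the exact statement used for the manual add partition; in Hive this is commonly done with `ALTER TABLE ... ADD PARTITION`. The sketch below derives such a statement from a committed file path. It is an illustration only: the class and method names are hypothetical, the table name `rt_dwd.dwd_music_copyright_test` is inferred from the path and the logs in this thread, and the partition columns are assumed to be string-typed (values are quoted and not escaped).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a Hive ADD PARTITION statement from the
// key=value directory segments of a data file path.
class AddPartitionStatement {
    static String fromPath(String table, String path) {
        List<String> specs = new ArrayList<>();
        for (String segment : path.split("/")) {
            int eq = segment.indexOf('=');
            if (eq > 0) {
                String key = segment.substring(0, eq);
                String value = segment.substring(eq + 1);
                // Assumption: string-typed partition columns, so the value is quoted.
                specs.add(key + "='" + value + "'");
            }
        }
        if (specs.isEmpty()) {
            throw new IllegalArgumentException("no partition segments in path: " + path);
        }
        return "ALTER TABLE " + table + " ADD IF NOT EXISTS PARTITION ("
                + String.join(", ", specs) + ")";
    }

    public static void main(String[] args) {
        String path = "/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/"
                + "dt=2020-08-19/hour=07/part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1031";
        // Prints the statement for partition dt=2020-08-19, hour=07.
        System.out.println(fromPath("rt_dwd.dwd_music_copyright_test", path));
    }
}
```

`IF NOT EXISTS` keeps the statement safe to rerun for partitions that were already committed.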
>>> at
>>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>>> Caused by: java.nio.channels.ClosedByInterruptException
>>> at
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>>> ~[?:1.8.0_144]
>>> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
>>> ~[?:1.8.0_144]
>>> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1440)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> ... 38 more
>>>
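>>> Addendum: across multiple runs of the job, some partitions fail to be created every time, and the failing partitions differ from run to run.
>>>
>>>
>>> ------------------ Original message ------------------
>>> From: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID>;
>>> Sent: Tuesday, September 8, 2020, 10:47 AM
>>> To: "user-zh"<user-zh@flink.apache.org>;"MuChen"<9329748@qq.com>;
>>> Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>>>
>>> Is this the only exception log? Is there anything more detailed?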
>>
>>
>>
>> --
>> Best regards!
>> Rui Li
>>
>
>
> --
> Best regards!
> Rui Li
>
--
Best regards!
Rui Li
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by MuChen <93...@qq.com>.
The SQL that inserts into the table is as follows:
INSERT INTO rt_dwd.dwd_music_copyright_test
SELECT url,md5,utime,title,singer,company,level,
from_unixtime(cast(utime/1000 as int),'yyyy-MM-dd')
,from_unixtime(cast(utime/1000 as int),'HH') FROM music_source;
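For reference, the two partition expressions in this statement can be mirrored outside SQL to check which `dt`/`hour` partition a given `utime` should land in. The sketch below is only an illustration: the class `PartitionValues` and its method are hypothetical names, not part of Flink, and the time zone is passed in explicitly as an assumption, because the thread does not say which session time zone the job uses.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper (not part of Flink) mirroring the dt/hour expressions of the INSERT above. */
public class PartitionValues {
    private static final DateTimeFormatter DT = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter HOUR = DateTimeFormatter.ofPattern("HH");

    /** Returns a partition spec such as "dt=2020-08-22/hour=18" for an epoch-millisecond utime. */
    public static String of(long utimeMillis, ZoneId zone) {
        // cast(utime/1000 as int) in the SQL reduces the value to whole seconds.
        long seconds = utimeMillis / 1000;
        ZonedDateTime t = Instant.ofEpochSecond(seconds).atZone(zone);
        return "dt=" + DT.format(t) + "/hour=" + HOUR.format(t);
    }

    public static void main(String[] args) {
        // The zone here is an assumption chosen for illustration only.
        System.out.println(of(1598090400000L, ZoneId.of("Asia/Shanghai")));
    }
}
```

Comparing the computed spec with the directories under the table location is one way to tell which partitions the job was expected to commit.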
------------------ Original message ------------------
From: "user-zh" <jingsonglee0@gmail.com>;
Sent: Wednesday, September 9, 2020, 10:32 AM
To: "user-zh"<user-zh@flink.apache.org>;
Cc: "MuChen"<9329748@qq.com>;"夏帅"<jkillers@dingtalk.com>;
Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Could you also post the SQL that inserts into the Hive table?
--
Best, Jingsong Lee
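Rui Li's suggestion quoted in this thread (compare the "is ready to be committed" entries with the "Committed partition ... to metastore" entries) can be automated with a small log scan. The following is a minimal sketch under stated assumptions: the class `UncommittedPartitions` is a hypothetical name, the patterns are taken from the message formats quoted in the thread, and each log entry is assumed to sit on a single line (unlike the wrapped lines in the quoted emails).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log checker: lists partitions announced as ready but never committed to the metastore. */
public class UncommittedPartitions {
    // Patterns follow the log message formats quoted in this thread.
    private static final Pattern READY =
        Pattern.compile("Partition (\\{[^}]*\\}) of table .* is ready to be committed");
    private static final Pattern COMMITTED =
        Pattern.compile("Committed partition (\\{[^}]*\\}) to metastore");

    /** Returns partition specs that have a "ready" entry without a later "to metastore" entry. */
    public static Set<String> find(List<String> logLines) {
        Set<String> pending = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher ready = READY.matcher(line);
            Matcher committed = COMMITTED.matcher(line);
            if (ready.find()) {
                pending.add(ready.group(1));
            } else if (committed.find()) {
                pending.remove(committed.group(1));
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "INFO Partition {dt=2020-08-22, hour=18} of table `t` is ready to be committed",
            "INFO Committed partition {dt=2020-08-22, hour=18} to metastore",
            "INFO Partition {dt=2020-08-22, hour=19} of table `t` is ready to be committed");
        System.out.println(find(sample));
    }
}
```

In the case reported here, the failing partitions had no "ready to be committed" entry at all, so an empty result from such a scan would suggest the problem lies before the commit policies run rather than inside them.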
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by Jingsong Li <ji...@gmail.com>.
Could you also post the SQL that inserts into the Hive table?
On Tue, Sep 8, 2020 at 9:44 PM Rui Li <li...@gmail.com> wrote:
> Also, please list the directories of the partitions that were not committed, so we can see what state the files in them are in.
>
> On Tue, Sep 8, 2020 at 9:19 PM Rui Li <li...@gmail.com> wrote:
>
> > Did the job fail over? Or does the job finish successfully while some partitions are simply never committed?
> >
> > On Tue, Sep 8, 2020 at 5:20 PM MuChen <93...@qq.com> wrote:
> >
> >> Hi Rui Li,
> >> As you said, there are indeed such log entries, but only for the partitions that were added successfully; there are none for the partitions that failed:
> >> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> >> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} to metastore
> >> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} with success file
> >> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> >> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} to metastore
> >> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} with success file
> >>
> >> The logs for writing to HDFS are present for all of them:
> >> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
> >> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
> >>
> >> ------------------ Original message ------------------
> >> *From:* "Rui Li" <li...@gmail.com>;
> >> *Sent:* Tuesday, September 8, 2020, 12:09 PM
> >> *To:* "user-zh"<us...@dingtalk.com>;
> >> *Cc:* "MuChen"<93...@qq.com>;
> >> *Subject:* Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
> >>
> >> The streaming file committer prints a log entry like this before it commits a partition:
> >>
> >> LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
> >>
> >> The partition commit policies print log entries like these after a partition has been committed successfully:
> >>
> >> LOG.info("Committed partition {} to metastore", partitionSpec);
> >>
> >> LOG.info("Committed partition {} with success file", context.partitionSpec());
> >>
> >> You can check for these log entries to see whether the job is getting stuck somewhere.
> >>
> >> On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid> wrote:
> >>
> >>> Judging from the second batch of logs you provided, it looks like the problem is with your NameNode.
> >>>
> >>>
> >>> ------------------------------------------------------------------
> >>> From: MuChen <93...@qq.com>
> >>> Sent: Tuesday, September 8, 2020, 10:56
> >>> To: user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <user-zh@flink.apache.org>
> >>> Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
> >>>
> >>> Around the time the checkpoint failed, the TaskManager also had some INFO- and WARN-level logs:
> >>> 2020-09-04 17:17:59,520 INFO
> >>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> >>> invoking create of class ClientNamenodeProtocolTranslatorPB over
> >>> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
> >>> Trying to fail over immediately.
> >>> java.io.IOException: java.lang.InterruptedException
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1449)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> >>> ~[?:?]
> >>> at
> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>> ~[?:1.8.0_144]
> >>> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> >>> at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>> Caused by: java.lang.InterruptedException
> >>> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> >>> ~[?:1.8.0_144]
> >>> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> >>> ~[?:1.8.0_144]
> >>> at
> >>>
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1443)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> ... 38 more
> >>> 2020-09-04 17:17:59,522 WARN
> >>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> >>> invoking class
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
> >>> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
> >>> failovers (15) exceeded maximum allowed (15)
> >>> java.io.IOException: Failed on local exception:
> >>> java.nio.channels.ClosedByInterruptException; Host Details : local host is:
> >>> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
> >>> "uhadoop-op3raf-master1":8020;
> >>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1474)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
> >>> ~[?:?]
> >>> at
> >>>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>> ~[?:1.8.0_144]
> >>> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> >>> at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> >>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> >>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> >>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> >>> Caused by: java.nio.channels.ClosedByInterruptException
> >>> at
> >>>
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> >>> ~[?:1.8.0_144]
> >>> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
> >>> ~[?:1.8.0_144]
> >>> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>>
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at
> >>> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1440)
> >>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> >>> ... 38 more
> >>>
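> >>> Addendum: I have run the program several times. Every run fails to create some partitions, and the partitions that fail differ from run to run.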
> >>>
> >>>
> >>> ------------------ Original Message ------------------
> >>> From: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
> >>> Sent: Tuesday, September 8, 2020, 10:47 AM
> >>> To: "user-zh"<us...@qq.com>;
> >>> Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
> >>>
> >>> Is this the only exception log? Is there anything more detailed?
> >>
> >>
> >>
> >> --
> >> Best regards!
> >> Rui Li
> >>
> >
> >
> > --
> > Best regards!
> > Rui Li
> >
>
>
> --
> Best regards!
> Rui Li
>
--
Best, Jingsong Lee
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by Rui Li <li...@gmail.com>.
Also, please list the directories of the partitions that were not committed, and check what state the files inside them are in.
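A minimal sketch of how the file names from such a listing (for example, the output of `hdfs dfs -ls <partition directory>`) could be sorted by state. It assumes the naming seen in this thread's logs: a hidden `.part-...inprogress.<uuid>` file while data is still being written, and a plain `part-...` file once it is finalized. The class `PartFileState` and its enum are hypothetical helpers, not part of Flink, and `_SUCCESS` is assumed to be the success-file name (the default unless configured otherwise).

```java
public class PartFileState {
    public enum State { IN_PROGRESS, FINISHED, SUCCESS_MARKER, OTHER }

    // Classifies a single file name (not a full path) found in a partition directory.
    // Assumption: naming follows the examples quoted in this thread.
    public static State classify(String fileName) {
        if (fileName.startsWith(".part-") && fileName.contains(".inprogress.")) {
            return State.IN_PROGRESS;   // still being written, hidden from readers
        }
        if (fileName.startsWith("part-")) {
            return State.FINISHED;      // finalized data file
        }
        if (fileName.equals("_SUCCESS")) {
            return State.SUCCESS_MARKER; // written by the success-file commit policy
        }
        return State.OTHER;
    }

    public static void main(String[] args) {
        String[] names = {
            "part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1031",
            ".part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de",
            "_SUCCESS"
        };
        for (String name : names) {
            System.out.println(classify(name) + "  " + name);
        }
    }
}
```

If a directory holds only finished `part-` files but no success marker and no metastore entry, the data was written and only the partition commit is missing.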
On Tue, Sep 8, 2020 at 9:19 PM Rui Li <li...@gmail.com> wrote:
> Did the job fail over? Or does the job finish successfully while some partitions are never committed?
>
> On Tue, Sep 8, 2020 at 5:20 PM MuChen <93...@qq.com> wrote:
>
>> Hi Rui Li,
>> As you said, there are indeed logs like these, but only for the partitions that were added successfully. There are no such logs for the failed partitions:
>> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
>> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} to metastore
>> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} with success file
>> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
>> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} to metastore
>> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} with success file
>>
>> The logs for writing to HDFS are all present:
>> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
>> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
>>
>> ------------------ Original Message ------------------
>> *From:* "Rui Li" <li...@gmail.com>;
>> *Sent:* Tuesday, September 8, 2020, 12:09 PM
>> *To:* "user-zh"<us...@dingtalk.com>;
>> *Cc:* "MuChen"<93...@qq.com>;
>> *Subject:* Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>>
>> Before committing a partition, the streaming file committer prints a log like this:
>>
>> LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
>>
>> After a partition has been committed successfully, the partition commit policies print logs like these:
>>
>> LOG.info("Committed partition {} to metastore", partitionSpec);
>>
>> LOG.info("Committed partition {} with success file", context.partitionSpec());
>>
>> You can check for these logs to see whether the commit is getting stuck somewhere.
>>
>> On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid> wrote:
>>
>>> Judging from the second set of logs you provided, it looks like a problem with your NameNode.
>>>
>>>
>>> ------------------------------------------------------------------
>>> From: MuChen <93...@qq.com>
>>> Sent: Tuesday, September 8, 2020, 10:56
>>> To: user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <
>>> user-zh@flink.apache.org>
>>> Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>>>
>>> Around the time the checkpoint failed, the TaskManager also logged some INFO- and WARN-level messages:
>>> 2020-09-04 17:17:59,520 INFO
>>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
>>> invoking create of class ClientNamenodeProtocolTranslatorPB over
>>> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
>>> Trying to fail over immediately.
>>> java.io.IOException: java.lang.InterruptedException
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1449)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>>> at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>>> ~[?:?]
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> ~[?:1.8.0_144]
>>> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>>> at
>>> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>>> Caused by: java.lang.InterruptedException
>>> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
>>> ~[?:1.8.0_144]
>>> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>>> ~[?:1.8.0_144]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1443)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> ... 38 more
>>> 2020-09-04 17:17:59,522 WARN
>>> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
>>> invoking class
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
>>> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
>>> failovers (15) exceeded maximum allowed (15)
>>> java.io.IOException: Failed on local exception:
>>> java.nio.channels.ClosedByInterruptException; Host Details : local host is:
>>> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
>>> "uhadoop-op3raf-master1":8020;
>>> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1474)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
>>> at
>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
>>> ~[?:?]
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> ~[?:1.8.0_144]
>>> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
>>> at
>>> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
>>> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
>>> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
>>> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
>>> Caused by: java.nio.channels.ClosedByInterruptException
>>> at
>>> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>>> ~[?:1.8.0_144]
>>> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
>>> ~[?:1.8.0_144]
>>> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at
>>> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> at org.apache.hadoop.ipc.Client.call(Client.java:1440)
>>> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
>>> ... 38 more
>>>
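>>> Addendum: I have run the program several times. Every run fails to create some partitions, and the partitions that fail differ from run to run.
>>>
>>>
>>> ------------------ Original Message ------------------
>>> From: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
>>> Sent: Tuesday, September 8, 2020, 10:47 AM
>>> To: "user-zh"<us...@qq.com>;
>>> Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>>>
>>> Is this the only exception log? Is there anything more detailed?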
>>
>>
>>
>> --
>> Best regards!
>> Rui Li
>>
>
>
> --
> Best regards!
> Rui Li
>
--
Best regards!
Rui Li
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by Rui Li <li...@gmail.com>.
Did the job fail over? Or does the job finish successfully while some partitions are never committed?
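One way to answer the second question from the TaskManager log is to pair up the two messages quoted below: "is ready to be committed" and "Committed partition ... to metastore". The following is a rough sketch under stated assumptions, not Flink code: `UncommittedPartitions` is a hypothetical helper, the patterns are taken from the log text quoted in this thread, each log message is assumed to sit on a single line (unlike the mail-wrapped copies here), and the `hour=20` sample line is made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UncommittedPartitions {
    // Patterns mirror the log messages quoted in this thread.
    private static final Pattern READY =
        Pattern.compile("Partition (\\{[^}]*\\}) of table .* is ready to be committed");
    private static final Pattern COMMITTED =
        Pattern.compile("Committed partition (\\{[^}]*\\}) to metastore");

    // Returns partitions that were announced as ready but never reported as committed.
    public static Set<String> find(List<String> logLines) {
        Set<String> pending = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher ready = READY.matcher(line);
            if (ready.find()) {
                pending.add(ready.group(1));
                continue;
            }
            Matcher committed = COMMITTED.matcher(line);
            if (committed.find()) {
                pending.remove(committed.group(1));
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "INFO AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `t` is ready to be committed",
            "INFO MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} to metastore",
            // Hypothetical line: a partition that is announced but never committed.
            "INFO AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=20} of table `t` is ready to be committed");
        System.out.println(find(lines));
    }
}
```

A partition that shows up in neither message, which is what the reply quoted below describes, would not be returned by this scan; that case suggests the committer never attempted the commit at all.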
On Tue, Sep 8, 2020 at 5:20 PM MuChen <93...@qq.com> wrote:
> Hi Rui Li,
> As you said, there are indeed logs like these, but only for the partitions that were added successfully. There are no such logs for the failed partitions:
> 2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> 2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} to metastore
> 2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} with success file
> 2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
> 2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} to metastore
> 2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} with success file
>
> The logs for writing to HDFS are all present:
> 2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
> 2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
>
> ------------------ Original Message ------------------
> *From:* "Rui Li" <li...@gmail.com>;
> *Sent:* Tuesday, September 8, 2020, 12:09 PM
> *To:* "user-zh"<us...@dingtalk.com>;
> *Cc:* "MuChen"<93...@qq.com>;
> *Subject:* Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>
> Before committing a partition, the streaming file committer prints a log like this:
>
> LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
>
> After a partition is committed successfully, the partition commit policy prints logs like these:
>
> LOG.info("Committed partition {} to metastore", partitionSpec);
>
> LOG.info("Committed partition {} with success file", context.partitionSpec());
>
> You can check for these logs to see whether it is getting stuck somewhere.
>
> --
> Best regards!
> Rui Li
>
--
Best regards!
Rui Li
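Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by MuChen <93...@qq.com>.
hi, Rui Li:
As you said, there are indeed logs like that, but only for the partitions that were added successfully; there are none for the partitions that failed: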
2020-09-04 17:17:10,548 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=18} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
2020-09-04 17:17:10,716 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} to metastore
2020-09-04 17:17:10,720 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=18} with success file
2020-09-04 17:17:19,652 INFO org.apache.flink.streaming.api.operators.AbstractStreamOperator [] - Partition {dt=2020-08-22, hour=19} of table `hive_catalog`.`rt_dwd`.`dwd_music_copyright_test` is ready to be committed
2020-09-04 17:17:19,820 INFO org.apache.flink.table.filesystem.MetastoreCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} to metastore
2020-09-04 17:17:19,824 INFO org.apache.flink.table.filesystem.SuccessFileCommitPolicy [] - Committed partition {dt=2020-08-22, hour=19} with success file
The logs for writing to HDFS are all present:
2020-09-04 17:16:04,100 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=07/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1140.inprogress.1631ac6c-a07c-4ad7-86ff-cf0d4375d1de
2020-09-04 17:16:04,126 INFO org.apache.hadoop.hive.ql.io.parquet.write.ParquetRecordWriterWrapper [] - creating real writer to write at hdfs://Ucluster/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test/dt=2020-08-22/hour=19/.part-b7d8f3c6-f1f3-40d4-a269-1ccf2c9a7720-0-1142.inprogress.2700eded-5ed0-4794-8ee9-21721c0c2ffd
------------------ Original Message ------------------
From: "Rui Li" <lirui.fudan@gmail.com>;
Sent: Tuesday, September 8, 2020, 12:09 PM
To: "user-zh"<user-zh@flink.apache.org>;"夏帅"<jkillers@dingtalk.com>;
Cc: "MuChen"<9329748@qq.com>;
Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Before committing a partition, the streaming file committer prints a log like this:
LOG.info("Partition {} of table {} is ready to be committed", partSpec, tableIdentifier);
After a partition is committed successfully, the partition commit policy prints logs like these:
LOG.info("Committed partition {} to metastore", partitionSpec);
LOG.info("Committed partition {} with success file", context.partitionSpec());
You can check for these logs to see whether it is getting stuck somewhere.
On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jkillers@dingtalk.com.invalid> wrote:
Judging from the logs you provided the second time, it looks like a problem with your namenode.
------------------------------------------------------------------
From: MuChen <9329748@qq.com>
Sent: Tuesday, September 8, 2020, 10:56
To: user-zh@flink.apache.org 夏帅 <jkillers@dingtalk.com>; user-zh <user-zh@flink.apache.org>
Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
At the time the checkpoint failed, the TM also logged some INFO- and WARN-level messages:
2020-09-04 17:17:59,520 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
java.io.IOException: java.lang.InterruptedException
at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
... 38 more
2020-09-04 17:17:59,522 WARN org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
... 38 more
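Addendum: across multiple runs of the program, some partitions always fail to be created, and the failing partitions differ from run to run.
------------------ Original Message ------------------
From: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID>;
Sent: Tuesday, September 8, 2020, 10:47 AM
To: "user-zh"<user-zh@flink.apache.org>;"MuChen"<9329748@qq.com>;
Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Are these the only exception logs? Do you have anything more detailed?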
--
Best regards!
Rui Li
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by Rui Li <li...@gmail.com>.
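Before committing a partition, the streaming file committer prints a log line like this:
LOG.info("Partition {} of table {} is ready to be committed",
partSpec, tableIdentifier);
After a partition has been committed successfully, the partition commit policies print log lines like these:
LOG.info("Committed partition {} to metastore", partitionSpec);
LOG.info("Committed partition {} with success file", context.partitionSpec());
You can check for these log lines to see whether the commit is getting stuck somewhere.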
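The check described above could be automated roughly as follows. This is only a sketch: it assumes each log entry sits on a single line and uses the two message formats quoted above, and the class and method names (PartitionCommitLogCheck, uncommitted) are hypothetical helpers, not part of Flink.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flink: finds partitions that were logged as
// "ready to be committed" but never logged as committed to the metastore.
final class PartitionCommitLogCheck {

    // Message printed by the streaming file committer before the commit.
    private static final Pattern READY =
            Pattern.compile("Partition (\\{[^}]*\\}) of table .* is ready to be committed");

    // Message printed by the metastore commit policy after a successful commit.
    private static final Pattern TO_METASTORE =
            Pattern.compile("Committed partition (\\{[^}]*\\}) to metastore");

    /** Returns the partition specs that are ready but have no metastore commit line. */
    static List<String> uncommitted(List<String> logLines) {
        Set<String> ready = new LinkedHashSet<>();
        Set<String> committed = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher readyMatcher = READY.matcher(line);
            if (readyMatcher.find()) {
                ready.add(readyMatcher.group(1));
            }
            Matcher committedMatcher = TO_METASTORE.matcher(line);
            if (committedMatcher.find()) {
                committed.add(committedMatcher.group(1));
            }
        }
        // Keep the order in which partitions first became ready.
        List<String> missing = new ArrayList<>(ready);
        missing.removeAll(committed);
        return missing;
    }
}
```

Feeding it the lines of a TaskManager log (for example via java.nio.file.Files.readAllLines) would list the partition specs whose commit never reached the metastore, which narrows down where to look in the surrounding log.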
On Tue, Sep 8, 2020 at 11:02 AM 夏帅 <jk...@dingtalk.com.invalid> wrote:
> Judging from the second batch of logs you provided, the problem seems to be with your NameNode.
>
>
> ------------------------------------------------------------------
> From: MuChen <93...@qq.com>
> Sent: Tuesday, September 8, 2020, 10:56
> To: user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <
> user-zh@flink.apache.org>
> Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>
> Around the time the checkpoint failed, the TM also logged some INFO- and WARN-level messages:
> 2020-09-04 17:17:59,520 INFO
> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> invoking create of class ClientNamenodeProtocolTranslatorPB over
> uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts.
> Trying to fail over immediately.
> java.io.IOException: java.lang.InterruptedException
> at org.apache.hadoop.ipc.Client.call(Client.java:1449)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_144]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> Caused by: java.lang.InterruptedException
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> ~[?:1.8.0_144]
> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> ~[?:1.8.0_144]
> at
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1443)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> ... 38 more
> 2020-09-04 17:17:59,522 WARN
> org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while
> invoking class
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create
> over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because
> failovers (15) exceeded maximum allowed (15)
> java.io.IOException: Failed on local exception:
> java.nio.channels.ClosedByInterruptException; Host Details : local host is:
> "uhadoop-op3raf-core13/10.42.99.178"; destination host is:
> "uhadoop-op3raf-master1":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1474)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_144]
> at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37)
> ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144)
> ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
> at
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
> [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
> Caused by: java.nio.channels.ClosedByInterruptException
> at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> ~[?:1.8.0_144]
> at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659)
> ~[?:1.8.0_144]
> at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at
> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> at org.apache.hadoop.ipc.Client.call(Client.java:1440)
> ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
> ... 38 more
>
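> Addendum: across multiple runs of the program, some partitions always fail to be created, and the failing partitions differ from run to run.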
>
>
> ------------------ Original Message ------------------
> From: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
> Sent: Tuesday, September 8, 2020, 10:47 AM
> To: "user-zh"<us...@qq.com>;
> Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
>
> Are these the only exception logs? Do you have anything more detailed?
--
Best regards!
Rui Li
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by 夏帅 <jk...@dingtalk.com.INVALID>.
Judging from the second batch of logs you provided, the problem seems to be with your NameNode.
------------------------------------------------------------------
From: MuChen <93...@qq.com>
Sent: Tuesday, September 8, 2020, 10:56
To: user-zh@flink.apache.org 夏帅 <jk...@dingtalk.com>; user-zh <us...@flink.apache.org>
Subject: Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Around the time the checkpoint failed, the TM also logged some INFO- and WARN-level messages:
2020-09-04 17:17:59,520 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
java.io.IOException: java.lang.InterruptedException
at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
... 38 more
2020-09-04 17:17:59,522 WARN org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
... 38 more
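Addendum: across multiple runs of the program, some partitions always fail to be created, and the failing partitions differ from run to run.
------------------ Original Message ------------------
From: "user-zh@flink.apache.org 夏帅" <jk...@dingtalk.com.INVALID>;
Sent: Tuesday, September 8, 2020, 10:47 AM
To: "user-zh"<us...@qq.com>;
Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Are these the only exception logs? Do you have anything more detailed?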
Re: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Posted by MuChen <93...@qq.com>.
Around the time the checkpoint failed, the TM also logged some INFO- and WARN-level messages:
2020-09-04 17:17:59,520 INFO org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking create of class ClientNamenodeProtocolTranslatorPB over uhadoop-op3raf-master2/10.42.52.202:8020 after 14 fail over attempts. Trying to fail over immediately.
java.io.IOException: java.lang.InterruptedException
at org.apache.hadoop.ipc.Client.call(Client.java:1449) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_144]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_144]
at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1048) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1443) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
... 38 more
2020-09-04 17:17:59,522 WARN org.apache.hadoop.io.retry.RetryInvocationHandler [] - Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create over uhadoop-op3raf-master1/10.42.31.63:8020. Not retrying because failovers (15) exceeded maximum allowed (15)
java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "uhadoop-op3raf-core13/10.42.99.178"; destination host is: "uhadoop-op3raf-master1":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1474) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1401) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy26.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_144]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at com.sun.proxy.$Proxy27.create(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1721) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1657) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1582) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:397) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:393) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:337) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:141) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.hive.shaded.fs.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:37) ~[flink-sql-connector-hive-1.2.2_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.SuccessFileCommitPolicy.commit(SuccessFileCommitPolicy.java:45) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.commitPartitions(StreamingFileCommitter.java:167) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.table.filesystem.stream.StreamingFileCommitter.processElement(StreamingFileCommitter.java:144) ~[flink-table-blink_2.11-1.11.0.jar:1.11.0]
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:161) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:178) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:153) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:345) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxStep(MailboxProcessor.java:191) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:181) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:558) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:530) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:721) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546) [music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_144]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:659) ~[?:1.8.0_144]
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:609) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1523) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1440) ~[music_copyright-1.0-SNAPSHOT-jar-with-dependencies.jar:?]
... 38 more
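Additional note: I have run the program several times. Every run fails to create some of the partitions, and the partitions that fail are different on each run.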
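As a stopgap while the cause is investigated, the partitions that were written to the filesystem but never registered could be added by hand. The following is only a minimal sketch, assuming the list of partition directories and the list of partitions already in the metastore have been collected separately (for example from a directory listing and from SHOW PARTITIONS). The helper names `parse_partition` and `missing_partition_ddl` are hypothetical and not part of Flink or Hive; the script only prints Hive DDL and does not talk to either system.

```python
def parse_partition(path):
    """Extract key=value path segments, e.g. '.../dt=2020-08-19/hour=07'."""
    spec = {}
    for segment in path.strip("/").split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            spec[key] = value
    return spec


def missing_partition_ddl(table, fs_dirs, registered):
    """Return ALTER TABLE statements for directories absent from the metastore.

    table      -- qualified table name (assumed input)
    fs_dirs    -- partition directory paths found on the filesystem
    registered -- partition specs (dicts) already known to the metastore
    """
    registered_keys = {tuple(sorted(p.items())) for p in registered}
    statements = []
    for directory in fs_dirs:
        spec = parse_partition(directory)
        if not spec:
            continue  # not a partition directory
        if tuple(sorted(spec.items())) in registered_keys:
            continue  # already committed to the metastore
        cols = ", ".join(f"{k}='{v}'" for k, v in spec.items())
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({cols});"
        )
    return statements


if __name__ == "__main__":
    base = "/user/hive/warehouse/rt_dwd.db/dwd_music_copyright_test"
    dirs = [f"{base}/dt=2020-08-19/hour=07", f"{base}/dt=2020-08-22/hour=18"]
    known = [{"dt": "2020-08-22", "hour": "18"}]
    for ddl in missing_partition_ddl("rt_dwd.dwd_music_copyright_test", dirs, known):
        print(ddl)
```

Because the statements use ADD IF NOT EXISTS, running them again for a partition that was committed in the meantime should be harmless. This does not address why the commit was interrupted in the first place.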
------------------ Original message ------------------
From: "user-zh@flink.apache.org 夏帅" <jkillers@dingtalk.com.INVALID>;
Sent: Tuesday, September 8, 2020, 10:47 AM
To: "user-zh"<user-zh@flink.apache.org>;"MuChen"<9329748@qq.com>;
Subject: Re: Adding partitions to Hive metadata with StreamingFileSink partially fails
Are these the only exception logs? Do you have anything more detailed?