Posted to user-zh@flink.apache.org by RS <ti...@163.com> on 2022/07/21 06:52:51 UTC

Hive partition commit fails: Caused by: java.io.FileNotFoundException: /tmp/... (No such file or directory)

Hi,


Environment:
flink-1.15.1 session cluster on K8S
hive3
A Flink job writing to Hive, configured to commit partitions periodically


Symptoms:
1. Checkpoints run every 30 s
2. Data files do appear on HDFS
3. No partition information shows up in Hive
4. The job fails; after the automatic restart it fails again at the next checkpoint
5. Some of the Hive-writing jobs keep running fine, while others hit this exception
6. Stopping the job and creating it again restores normal operation


Exception log:
java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/tm_10.244.25.164:6122-5b9301/blobStorage/job_ac640bf7276279f0452642918561670e/blob_p-d9755afa943119c325c059ddc70a45c904d9e4bd-d3a2ca82ca3a2429095708d4908ae184 (No such file or directory)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3021)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2973)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1460)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5001)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5074)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5161)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5114)
    at org.apache.flink.connectors.hive.util.HiveConfUtils.create(HiveConfUtils.java:38)
    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory$HiveTableMetaStore.<init>(HiveTableMetaStoreFactory.java:72)
    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory$HiveTableMetaStore.<init>(HiveTableMetaStoreFactory.java:64)
    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory.createTableMetaStore(HiveTableMetaStoreFactory.java:61)
    at org.apache.flink.connectors.hive.HiveTableMetaStoreFactory.createTableMetaStore(HiveTableMetaStoreFactory.java:43)
    at org.apache.flink.connector.file.table.stream.PartitionCommitter.commitPartitions(PartitionCommitter.java:159)
    at org.apache.flink.connector.file.table.stream.PartitionCommitter.processElement(PartitionCommitter.java:145)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:233)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:519)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:203)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: /tmp/tm_10.244.25.164:6122-5b9301/blobStorage/job_ac640bf7276279f0452642918561670e/blob_p-d9755afa943119c325c059ddc70a45c904d9e4bd-d3a2ca82ca3a2429095708d4908ae184 (No such file or directory)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:228)
    at java.util.zip.ZipFile.<init>(ZipFile.java:157)
    at java.util.jar.JarFile.<init>(JarFile.java:171)
    at java.util.jar.JarFile.<init>(JarFile.java:108)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2943)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3034)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995)
    ... 27 more



Thanks

Re: Hive partition commit fails: Caused by: java.io.FileNotFoundException: /tmp/... (No such file or directory)

Posted by RS <ti...@163.com>.
Hi,
The earlier problem is still unresolved, but the symptoms are a bit clearer now.


Version: flink-1.15.1
Scenario: when writing data to Hive, the partition commit that follows the write fails


Error log:

Caused by: java.io.FileNotFoundException: /tmp/jm_253c182f914fb67750844d2e71864a5a/blobStorage/job_615800b00c211de674f17e46938daeb7/blob_p-a813f094892f1c71b7884d0aec7972edbeae08e3-65d1205985504738577e6a7d90385f17 (No such file or directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:228)
at java.util.zip.ZipFile.<init>(ZipFile.java:157)
at java.util.jar.JarFile.<init>(JarFile.java:171)
at java.util.jar.JarFile.<init>(JarFile.java:108)
at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:93)
at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:69)
at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:99)
at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:152)
at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2943)
at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3034)
at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995)
... 54 more




This appears to be an attempt to read hive-site: the JobManager looks for hive-site under the local path /tmp/jm_253c182f914fb67750844d2e71864a5a/blobStorage/job_615800b00c211de674f17e46938daeb7/blob_p-a813f094892f1c71b7884d0aec7972edbeae08e3-65d1205985504738577e6a7d90385f17, but that path does not exist, which triggers the exception.
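The bottom frames of the trace (JarFile -> ZipFile.open) can be reproduced in isolation with nothing but the JDK. A minimal sketch; StaleBlobRepro and the file name are hypothetical stand-ins for the cleaned-up blob file:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.jar.JarFile;

public class StaleBlobRepro {
    // Opens a jar that has been deleted and returns the resulting error
    // message, mirroring the bottom frames of the stack trace above.
    static String openDeletedJar(File staleBlob) throws IOException {
        try (JarFile jar = new JarFile(staleBlob)) {
            return "unexpectedly opened " + jar.getName();
        } catch (FileNotFoundException e) {
            return e.getMessage(); // e.g. "<path> (No such file or directory)"
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the deleted blob; the real file lived
        // under the blobStorage directory of a finished job.
        File staleBlob = new File(System.getProperty("java.io.tmpdir"), "gone-hive-conf.jar");
        staleBlob.delete(); // make sure it is absent, like the cleaned-up blob dir
        System.out.println("Reproduced: " + openDeletedJar(staleBlob));
    }
}
```

This is the same failure Hadoop's Configuration.loadResource surfaces when the jar: URL it holds for hive-site points at a file that has since been removed.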




That path did exist at one point: the job id in it, 615800b00c211de674f17e46938daeb7, belongs to a job that ran earlier; once that job finished, the job_615800b00c211de674f17e46938daeb7 directory was removed.




Yet every job started afterwards still looks for the Hive configuration under that same path, so they all hit this exception.




If the cluster is restarted, submitting the same job works without error, so it looks like a probabilistic event. What could be causing this?
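One plausible mechanism (my assumption, not confirmed in this thread): in a session cluster the JobManager/TaskManager JVMs are shared across jobs, so a JVM-wide static field that caches the hive-site location from the first job would outlive that job's blob files, and every later job in the same JVM would resolve the stale path. A contrived sketch of that failure pattern; StaticConfCache is hypothetical, not Flink or Hive code:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Sketch of the suspected pattern: a static, register-once cache keeps the
// config location of the first job, outliving that job's temp files.
public class StaticConfCache {
    private static File cachedHiveSite; // set once, shared by all later jobs in the JVM

    static void registerIfAbsent(File hiveSite) {
        if (cachedHiveSite == null) { // later registrations are silently ignored
            cachedHiveSite = hiveSite;
        }
    }

    static boolean loadable() {
        return cachedHiveSite != null && cachedHiveSite.exists();
    }

    public static void main(String[] args) throws IOException {
        File jobATmp = Files.createTempFile("job-a-hive-site", ".xml").toFile();
        registerIfAbsent(jobATmp);   // job A registers its blob-local copy
        jobATmp.delete();            // job A finishes; its blob dir is cleaned up
        registerIfAbsent(new File("job-b-hive-site.xml")); // job B's copy is ignored
        System.out.println("loadable=" + loadable());      // prints loadable=false
    }
}
```

This would also explain why a full cluster restart clears the problem (fresh JVMs, empty cache) and why only some jobs are affected (only those landing on a JVM whose cached path is already gone).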




Thanks





