Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/04/28 01:38:56 UTC
[GitHub] [incubator-hudi] tieke1121 edited a comment on issue #1568: [SUPPORT] java.lang.reflect.InvocationTargetException when upsert
tieke1121 edited a comment on issue #1568:
URL: https://github.com/apache/incubator-hudi/issues/1568#issuecomment-620324107
I've set it up like this:
```
dataFrame.writeStream
.format("org.apache.hudi")
.option("path", conf.getString("hudi.basePath"))
.option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, conf.getString("hudi.recordkey"))
.option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, conf.getString("hudi.precombineKey"))
.option(HoodieWriteConfig.TABLE_NAME, conf.getString("hudi.tableName"))
.option("checkpointLocation", conf.getString("hudi.checkpoinPath"))
.option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY, conf.getString("hive.database"))
.option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, conf.getString("hive.table"))
.option(DataSourceWriteOptions.HIVE_URL_OPT_KEY, conf.getString("hive.url"))
.option(DataSourceWriteOptions.HIVE_USER_OPT_KEY, conf.getString("hive.username"))
.option(DataSourceWriteOptions.HIVE_PASS_OPT_KEY, conf.getString("hive.password"))
.option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true")
.option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY,classOf[NonpartitionedKeyGenerator].getCanonicalName)
.option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY, classOf[NonPartitionedExtractor].getCanonicalName)
.outputMode(OutputMode.Append())
.start()
```
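For reference, here is a self-contained sketch of the same kind of non-partitioned streaming upsert. It is only a sketch: the imports assume the package layout of recent Hudi releases (org.apache.hudi.keygen.NonpartitionedKeyGenerator, org.apache.hudi.hive.NonPartitionedExtractor), and the literal paths, field names (deviceid as record key, ts as precombine field), table name, and Hive connection settings are placeholders standing in for the conf.getString(...) lookups above.
```
import org.apache.hudi.DataSourceWriteOptions
import org.apache.hudi.config.HoodieWriteConfig
import org.apache.hudi.hive.NonPartitionedExtractor
import org.apache.hudi.keygen.NonpartitionedKeyGenerator
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.{OutputMode, StreamingQuery}

// Placeholder values standing in for the conf.getString(...) lookups above.
val basePath       = "/wap-olap/data/device/status/data_1"
val checkpointPath = "/wap-olap/checkpoints/device_status"

def startHudiStream(dataFrame: DataFrame): StreamingQuery =
  dataFrame.writeStream
    .format("org.apache.hudi")
    .option("path", basePath)
    .option("checkpointLocation", checkpointPath)
    // Record key and precombine field of the incoming stream (placeholders).
    .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "deviceid")
    .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "ts")
    .option(HoodieWriteConfig.TABLE_NAME, "device_status_hudi_1")
    // Non-partitioned table: the key generator and the Hive partition extractor must match.
    .option(DataSourceWriteOptions.KEYGENERATOR_CLASS_OPT_KEY,
      classOf[NonpartitionedKeyGenerator].getCanonicalName)
    .option(DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY,
      classOf[NonPartitionedExtractor].getCanonicalName)
    // Sync the table definition into the Hive metastore after each commit.
    .option(DataSourceWriteOptions.HIVE_SYNC_ENABLED_OPT_KEY, "true")
    .option(DataSourceWriteOptions.HIVE_DATABASE_OPT_KEY, "default")
    .option(DataSourceWriteOptions.HIVE_TABLE_OPT_KEY, "device_status_hudi_1")
    .option(DataSourceWriteOptions.HIVE_URL_OPT_KEY, "jdbc:hive2://localhost:10000")
    .option(DataSourceWriteOptions.HIVE_USER_OPT_KEY, "hive")
    .option(DataSourceWriteOptions.HIVE_PASS_OPT_KEY, "hive")
    .outputMode(OutputMode.Append())
    .start()
```
The caller would typically keep the returned StreamingQuery and call awaitTermination() on it.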
and the HDFS path contains:
```
-rw-r--r-- 3 root supergroup 737196 2020-04-28 01:16 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-1068-281898_20200428011554.parquet
-rw-r--r-- 3 root supergroup 745158 2020-04-28 01:16 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-1120-295461_20200428011603.parquet
-rw-r--r-- 3 root supergroup 750006 2020-04-28 01:16 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-1168-309014_20200428011613.parquet
-rw-r--r-- 3 root supergroup 755947 2020-04-28 01:16 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-1217-322579_20200428011624.parquet
-rw-r--r-- 3 root supergroup 765879 2020-04-28 01:16 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-1267-336149_20200428011634.parquet
-rw-r--r-- 3 root supergroup 690225 2020-04-28 01:14 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-770-200500_20200428011449.parquet
-rw-r--r-- 3 root supergroup 698213 2020-04-28 01:15 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-819-214064_20200428011500.parquet
-rw-r--r-- 3 root supergroup 705870 2020-04-28 01:15 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-870-227637_20200428011511.parquet
-rw-r--r-- 3 root supergroup 713830 2020-04-28 01:15 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-918-241200_20200428011521.parquet
-rw-r--r-- 3 root supergroup 720687 2020-04-28 01:15 /wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-967-254767_20200428011532.parquet
```
When I run a simple Hive query such as `select deviceid from device_status_hudi_1;`, it works. But a more complex Hive query, `select deviceid from device_status_hudi_1 group by deviceid having count(deviceid)>1;`, fails with:
```
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
```
The map task log shows:
```
2020-04-28 01:29:16,698 INFO [main] org.apache.hadoop.hive.conf.HiveConf: Found configuration file null
2020-04-28 01:29:16,880 INFO [main] org.apache.hadoop.hive.ql.exec.SerializationUtilities: Deserializing MapWork using kryo
2020-04-28 01:29:17,045 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:271)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:217)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:345)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:702)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:257)
... 11 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://sap-namenode1:8020/wap-olap/data/device/status/data_1/b33868cc-6609-47a3-8e93-bdd248deb21e-0_0-4288-1150066_20200428012704.parquet
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1500)
at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1493)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1508)
at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:39)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:413)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:400)
at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:79)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:78)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:63)
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:75)
at org.apache.hudi.hadoop.HoodieParquetInputFormat.getRecordReader(HoodieParquetInputFormat.java:297)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
... 16 more
```
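In case it helps to isolate whether the problem is specific to the Hive/MapReduce read path, here is a minimal sketch of running the equivalent duplicate-key check through Spark on the same table. The base path, the deviceid column, and the trailing path glob are assumptions based on the listing above, not something from the original report.
```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hudi-duplicate-check").getOrCreate()
import spark.implicits._

// Snapshot-read the Hudi table directly; older Hudi versions may need a path glob
// (e.g. basePath + "/*") instead of the bare base path.
val df = spark.read
  .format("org.apache.hudi")
  .load("/wap-olap/data/device/status/data_1/*")

// Same duplicate-key check as the failing Hive query.
df.groupBy("deviceid")
  .count()
  .filter($"count" > 1)
  .show(false)
```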
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org