Posted to issues@iceberg.apache.org by "ideal (via GitHub)" <gi...@apache.org> on 2023/05/30 13:52:02 UTC
[GitHub] [iceberg] ideal opened a new issue, #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql
ideal opened a new issue, #7739:
URL: https://github.com/apache/iceberg/issues/7739
### Apache Iceberg version
1.2.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
With Spark 3.2.4 in standalone mode and the following table:
```
> desc extended my_test_table;
23/05/30 20:21:47 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
23/05/30 20:21:47 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
a string a
b string b
c string c
# Detailed Table Information
Database my_test_iceberg
Table my_test_table
Owner root
Created Time Tue May 30 16:28:17 CST 2023
Last Access UNKNOWN
Created By Spark 2.2 or prior
Type EXTERNAL
Provider hive
Comment my_test_table
Table Properties [current-schema={"type":"struct","schema-id":0,"fields":[{"id":1,"name":"a","required":false,"type":"string","doc":"a"},{"id":2,"name":"b","required":false,"type":"string","doc":"b"},{"id":3,"name":"c","required":false,"type":"string","doc":"c"}]}, current-snapshot-id=1142796867698349657, current-snapshot-summary={"spark.app.id":"app-20230530113600-0013","added-data-files":"1","added-records":"1","added-files-size":"860","changed-partition-count":"1","total-records":"24","total-files-size":"20654","total-data-files":"24","total-delete-files":"0","total-position-deletes":"0","total-equality-deletes":"0"}, current-snapshot-timestamp-ms=1685438806641, default-partition-spec={"spec-id":0,"fields":[{"name":"c","transform":"identity","source-id":3,"field-id":1000}]}, engine.hive.enabled=true, external.table.purge=TRUE, metadata_location=hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table/metadata/00024-665493b0-a47d-4861-8ebd-767f868f8fda.metadata.json, previous_metadata_location=hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table/metadata/00023-87b8cb31-15ab-45cb-9b73-d8085549e2c1.metadata.json, snapshot-count=24, storage_handler=org.apache.iceberg.mr.hive.HiveIcebergStorageHandler, table_type=ICEBERG, transient_lastDdlTime=1685435297, uuid=5d212398-0457-4058-b400-936e0533fcd6]
Statistics 20654 bytes, 24 rows
Location hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table
Serde Library org.apache.iceberg.mr.hive.HiveIcebergSerDe
InputFormat org.apache.iceberg.mr.hive.HiveIcebergInputFormat
OutputFormat org.apache.iceberg.mr.hive.HiveIcebergOutputFormat
Partition Provider Catalog
```
And running:
```
bin/spark-sql --master spark://${spark-master}:7077 --conf spark.driver.host=${current-host-ip} --conf spark.hive.metastore.uris=thrift://${metastore-service}:9083 --conf spark.sql.catalog.hive_prod=org.apache.iceberg.spark.SparkCatalog --conf spark.sql.catalog.hive_prod.type=hive --conf spark.sql.catalog.hive_prod.warehouse=hdfs://xxxx/hivewarehouse/iceberg --jars iceberg-hive-runtime-1.2.1.jar,iceberg-spark-runtime-3.2_2.12-1.2.1.jar
> insert into my_test_table values ('a1','b1','c1');
```
The INSERT fails with the following exception:
```
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:274)
at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:132)
at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:105)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:161)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:146)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:290)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$16(FileFormatWriter.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.NullPointerException
at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper.<init>(TezUtil.java:105)
at org.apache.iceberg.mr.hive.TezUtil.taskAttemptWrapper(TezUtil.java:78)
at org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.writer(HiveIcebergOutputFormat.java:73)
at org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.getHiveRecordWriter(HiveIcebergOutputFormat.java:58)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:286)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:271)
... 14 more
```
Has anyone had this problem before? Thanks.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] ideal closed issue #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql
Posted by "ideal (via GitHub)" <gi...@apache.org>.
ideal closed issue #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql
URL: https://github.com/apache/iceberg/issues/7739
[GitHub] [iceberg] ideal commented on issue #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql
Posted by "ideal (via GitHub)" <gi...@apache.org>.
ideal commented on issue #7739:
URL: https://github.com/apache/iceberg/issues/7739#issuecomment-1569539732
It seems that after adding `--conf spark.sql.defaultCatalog=${the catalog}` to the spark-sql command, the problem disappeared.
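For readers hitting the same error, here is a minimal sketch of the original command with that setting added. It assumes the catalog to use is `hive_prod`, the one registered in the original command; the comment above does not say which name was actually used. The host values and the `spark_sql_command` helper are hypothetical placeholders, and the script only prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Placeholder values (assumptions); replace them with your own environment.
SPARK_MASTER="spark-master.example.com"
DRIVER_HOST="192.0.2.10"
METASTORE_HOST="metastore.example.com"
# Assumption: hive_prod is the Iceberg catalog registered in the original command.
CATALOG="hive_prod"

# Print the spark-sql invocation from the report, plus the
# spark.sql.defaultCatalog setting described in the comment above.
spark_sql_command() {
  printf '%s ' bin/spark-sql \
    --master "spark://${SPARK_MASTER}:7077" \
    --conf "spark.driver.host=${DRIVER_HOST}" \
    --conf "spark.hive.metastore.uris=thrift://${METASTORE_HOST}:9083" \
    --conf "spark.sql.catalog.${CATALOG}=org.apache.iceberg.spark.SparkCatalog" \
    --conf "spark.sql.catalog.${CATALOG}.type=hive" \
    --conf "spark.sql.catalog.${CATALOG}.warehouse=hdfs://xxxx/hivewarehouse/iceberg" \
    --conf "spark.sql.defaultCatalog=${CATALOG}" \
    --jars iceberg-hive-runtime-1.2.1.jar,iceberg-spark-runtime-3.2_2.12-1.2.1.jar
  printf '\n'
}

spark_sql_command
```

A likely explanation, consistent with the stack trace: without a default catalog, an unqualified table name resolves through Spark's built-in session catalog, so the write goes through Spark's Hive path (`HiveFileFormat`) into `HiveIcebergOutputFormat` instead of through the Iceberg `SparkCatalog`. Qualifying the table name with the catalog (for example `hive_prod.my_test_iceberg.my_test_table`) may have the same effect, though that was not tested in this thread.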