Posted to issues@iceberg.apache.org by "ideal (via GitHub)" <gi...@apache.org> on 2023/05/30 13:52:02 UTC

[GitHub] [iceberg] ideal opened a new issue, #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql

ideal opened a new issue, #7739:
URL: https://github.com/apache/iceberg/issues/7739

   ### Apache Iceberg version
   
   1.2.1 (latest release)
   
   ### Query engine
   
   Spark
   
   ### Please describe the bug 🐞
   
   With Spark 3.2.4 in standalone mode, and the table:
   
   ```
   > desc extended my_test_table;
   23/05/30 20:21:47 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
   23/05/30 20:21:47 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
   a                       string                  a                   
   b                       string                  b                   
   c                       string                  c                   
                                                                       
   # Detailed Table Information                                                
   Database                my_test_iceberg                                
   Table                   my_test_table                            
   Owner                   root                                        
   Created Time            Tue May 30 16:28:17 CST 2023                        
   Last Access             UNKNOWN                                     
   Created By              Spark 2.2 or prior                          
   Type                    EXTERNAL                                    
   Provider                hive                                        
   Comment                 my_test_table                            
   Table Properties        [current-schema={"type":"struct","schema-id":0,"fields":[{"id":1,"name":"a","required":false,"type":"string","doc":"a"},{"id":2,"name":"b","required":false,"type":"string","doc":"b"},{"id":3,"name":"c","required":false,"type":"string","doc":"c"}]}, current-snapshot-id=1142796867698349657, current-snapshot-summary={"spark.app.id":"app-20230530113600-0013","added-data-files":"1","added-records":"1","added-files-size":"860","changed-partition-count":"1","total-records":"24","total-files-size":"20654","total-data-files":"24","total-delete-files":"0","total-position-deletes":"0","total-equality-deletes":"0"}, current-snapshot-timestamp-ms=1685438806641, default-partition-spec={"spec-id":0,"fields":[{"name":"c","transform":"identity","source-id":3,"field-id":1000}]}, engine.hive.enabled=true, external.table.purge=TRUE, metadata_location=hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table/metadata/00024-665493b0-a47d-4861-8ebd-767f868f8fda.metadata.json, previous_metadata_location=hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table/metadata/00023-87b8cb31-15ab-45cb-9b73-d8085549e2c1.metadata.json, snapshot-count=24, storage_handler=org.apache.iceberg.mr.hive.HiveIcebergStorageHandler, table_type=ICEBERG, transient_lastDdlTime=1685435297, uuid=5d212398-0457-4058-b400-936e0533fcd6]
   Statistics              20654 bytes, 24 rows                        
   Location                hdfs://xxxx/user/warehouse/my_test_iceberg/my_test_table               
   Serde Library           org.apache.iceberg.mr.hive.HiveIcebergSerDe                         
   InputFormat             org.apache.iceberg.mr.hive.HiveIcebergInputFormat                           
   OutputFormat            org.apache.iceberg.mr.hive.HiveIcebergOutputFormat                          
   Partition Provider      Catalog
   ```
   
   And running:
   ```
   bin/spark-sql --master spark://${spark-master}:7077 --conf spark.driver.host=${current-host-ip} --conf spark.hive.metastore.uris=thrift://${metastore-service}:9083 --conf spark.sql.catalog.hive_prod=org.apache.iceberg.spark.SparkCatalog --conf spark.sql.catalog.hive_prod.type=hive   --conf spark.sql.catalog.hive_prod.warehouse=hdfs://xxxx/hivewarehouse/iceberg --jars iceberg-hive-runtime-1.2.1.jar,iceberg-spark-runtime-3.2_2.12-1.2.1.jar
   
   > insert into my_test_table values ('a1','b1','c1');
   ```
   
   The INSERT fails with this exception:
   ```
   org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
           at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:274)
           at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:132)
           at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:105)
           at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:161)
           at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:146)
           at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:290)
           at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$16(FileFormatWriter.scala:229)
           at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
           at org.apache.spark.scheduler.Task.run(Task.scala:131)
           at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
           at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
           at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:750)
   Caused by: java.lang.NullPointerException
           at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper.<init>(TezUtil.java:105)
           at org.apache.iceberg.mr.hive.TezUtil.taskAttemptWrapper(TezUtil.java:78)
           at org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.writer(HiveIcebergOutputFormat.java:73)
           at org.apache.iceberg.mr.hive.HiveIcebergOutputFormat.getHiveRecordWriter(HiveIcebergOutputFormat.java:58)
           at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:286)
           at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:271)
           ... 14 more
   ```
   
   Has anyone had this problem before? Thanks.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] ideal closed issue #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql

Posted by "ideal (via GitHub)" <gi...@apache.org>.

ideal closed issue #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql
URL: https://github.com/apache/iceberg/issues/7739



[GitHub] [iceberg] ideal commented on issue #7739: NullPointerException at org.apache.iceberg.mr.hive.TezUtil$TaskAttemptWrapper when INSERT with spark-sql

Posted by "ideal (via GitHub)" <gi...@apache.org>.

ideal commented on issue #7739:
URL: https://github.com/apache/iceberg/issues/7739#issuecomment-1569539732

   It seems that after adding `--conf spark.sql.defaultCatalog=${the catalog}` to the spark-sql command, the problem disappeared.
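
For anyone landing here with the same trace, here is a rough sketch of the original command with that one extra setting. The trace goes through `HiveFileFormat` and `HiveIcebergOutputFormat`, which suggests (not confirmed in this thread) that the unqualified table name was resolved by Spark's built-in session catalog instead of the `hive_prod` Iceberg catalog. The host names and IP below are hypothetical placeholders, and the script only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Hypothetical placeholder values; substitute the ones for your own cluster.
SPARK_MASTER="${SPARK_MASTER:-spark-master.example}"
DRIVER_HOST="${DRIVER_HOST:-10.0.0.1}"
METASTORE="${METASTORE:-metastore.example}"
CATALOG="hive_prod"   # catalog name taken from the original command

# Collect the spark-sql arguments as positional parameters (POSIX-safe).
set -- \
  --master "spark://${SPARK_MASTER}:7077" \
  --conf "spark.driver.host=${DRIVER_HOST}" \
  --conf "spark.hive.metastore.uris=thrift://${METASTORE}:9083" \
  --conf "spark.sql.catalog.${CATALOG}=org.apache.iceberg.spark.SparkCatalog" \
  --conf "spark.sql.catalog.${CATALOG}.type=hive" \
  --conf "spark.sql.catalog.${CATALOG}.warehouse=hdfs://xxxx/hivewarehouse/iceberg" \
  --conf "spark.sql.defaultCatalog=${CATALOG}" \
  --jars "iceberg-hive-runtime-1.2.1.jar,iceberg-spark-runtime-3.2_2.12-1.2.1.jar"

# Dry run: print the full command. Replace `echo` with `exec` to launch it.
echo bin/spark-sql "$@"
```

Addressing the table through the catalog name in the statement itself (for example `hive_prod.my_test_iceberg.my_test_table`) may have the same effect without changing the default catalog, though that was not tried in this thread.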
   

