Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/07/29 17:38:11 UTC

[GitHub] [iceberg] massdosage commented on pull request #1267: Single jar for input formats

massdosage commented on pull request #1267:
URL: https://github.com/apache/iceberg/pull/1267#issuecomment-665760682


   @guilload I've taken the `iceberg-mr-all.jar` that the above produces and added it to a Hive client's classpath by doing the following:
   ```
   hive> add jar /home/hadoop/iceberg/0.9.0-SNAPSHOT/iceberg-mr-all.jar;
   Added [/home/hadoop/iceberg/0.9.0-SNAPSHOT/iceberg-mr-all.jar] to class path
   ```
   I've then created a Hive table on top of an existing Iceberg table by doing the following:
   ```
   CREATE EXTERNAL TABLE default.iceberg_table_a STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler' LOCATION 'hdfs://host:port/hiveberg/table_a';
   ```
   I can successfully run a `SELECT *` against this table, but if I add an `ORDER BY` clause (which forces a MapReduce job to execute), it fails with the following error:
   ```
   Error: java.io.IOException: java.lang.NullPointerException: Table cannot be null
           at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
           at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
           at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:379)
           at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:678)
           at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:170)
           at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:433)
           at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
           at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
           at java.security.AccessController.doPrivileged(Native Method)
           at javax.security.auth.Subject.doAs(Subject.java:422)
           at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
           at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
   Caused by: java.lang.NullPointerException: Table cannot be null
           at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:897)
           at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.forwardConfigSettings(HiveIcebergInputFormat.java:76)
           at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.getRecordReader(HiveIcebergInputFormat.java:63)
           at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376)
           ... 9 more
   ```
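   For reference, the two queries were along these lines (the table is the one from the `CREATE` statement above; the `ORDER BY` column name is just a placeholder, not the actual column):
   ```
   -- returns rows successfully (no MapReduce job needed)
   SELECT * FROM default.iceberg_table_a;

   -- fails with the NPE above once a MapReduce job is launched
   -- (column name is a placeholder)
   SELECT * FROM default.iceberg_table_a ORDER BY some_column;
   ```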
   I haven't had time to look into this in depth, but I know we had this working in Hiveberg, so something in the new `InputFormat` is failing. I tried removing the null checks and guarding the code below them with "if not null" checks instead, but the NPE then just moves further down in the code:
   ```
   Caused by: java.lang.NullPointerException
           at java.util.Objects.requireNonNull(Objects.java:203)
           at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2296)
           at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:111)
           at com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:54)
           at org.apache.iceberg.SchemaParser.fromJson(SchemaParser.java:247)
           at org.apache.iceberg.mr.mapreduce.IcebergInputFormat$IcebergRecordReader.initialize(IcebergInputFormat.java:181)
           at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat$MapredIcebergRecordReader.<init>(MapredIcebergInputFormat.java:92)
           at org.apache.iceberg.mr.mapred.MapredIcebergInputFormat.getRecordReader(MapredIcebergInputFormat.java:78)
           at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.getRecordReader(HiveIcebergInputFormat.java:64)
           at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:376)
           ... 26 more
   ```
   You said you had this working via a Hive client; what have I done differently that I'm running into all these exceptions?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org