Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/13 09:28:39 UTC

[GitHub] [iceberg] waterlx opened a new issue #1604: Import Hive table (using ORC) is blocked by "ORC schema does not contain Iceberg IDs"

waterlx opened a new issue #1604:
URL: https://github.com/apache/iceberg/issues/1604


   I am trying to use SparkTableUtil.importSparkTable to import a Hive table (file format is ORC) as an Iceberg table, but it fails with the following error:
   ```
   Exception in thread "main" java.lang.IllegalArgumentException: ORC schema does not contain Iceberg IDs
   	at org.apache.iceberg.orc.ORCSchemaUtil.convert(ORCSchemaUtil.java:221)
   	at org.apache.iceberg.orc.OrcMetrics.buildOrcMetrics(OrcMetrics.java:100)
   	at org.apache.iceberg.orc.OrcMetrics.fromInputFile(OrcMetrics.java:83)
   	at org.apache.iceberg.orc.OrcMetrics.fromInputFile(OrcMetrics.java:78)
   	at org.apache.iceberg.spark.SparkTableUtil.lambda$listOrcPartition$8(SparkTableUtil.java:407)
   	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
   	at org.apache.iceberg.spark.SparkTableUtil.listOrcPartition(SparkTableUtil.java:421)
   	at org.apache.iceberg.spark.SparkTableUtil.listPartition(SparkTableUtil.java:326)
   	at org.apache.iceberg.spark.SparkTableUtil.importUnpartitionedSparkTable(SparkTableUtil.java:545)
   	at org.apache.iceberg.spark.SparkTableUtil.importSparkTable(SparkTableUtil.java:519)
   ```
   Because the ORC data files were written by Hive, they do not carry Iceberg field IDs, so the import is blocked by #
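
   To illustrate why Hive-written files trip this check, here is a toy model in plain Java. It is not Iceberg's code: `OrcIdCheckSketch`, `Column`, `fieldId`, and `convert` are hypothetical names, and the `iceberg.id` attribute key is an assumption about how field IDs are attached to ORC columns. The sketch only shows the general shape of the failure: columns without an ID are skipped, and a schema in which no column has an ID is rejected with the message from the stack trace above.

   ```java
   import java.util.Arrays;
   import java.util.Collections;
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;
   import java.util.Optional;

   public class OrcIdCheckSketch {
       // Assumed attribute key for the field ID; hypothetical for this sketch.
       static final String ID_ATTRIBUTE = "iceberg.id";

       // Toy stand-in for one ORC column: a name plus string attributes.
       static final class Column {
           final String name;
           final Map<String, String> attributes;

           Column(String name, Map<String, String> attributes) {
               this.name = name;
               this.attributes = attributes;
           }
       }

       // Returns the field ID if the column carries one, otherwise empty.
       static Optional<Integer> fieldId(Column column) {
           String value = column.attributes.get(ID_ATTRIBUTE);
           return value == null ? Optional.empty() : Optional.of(Integer.parseInt(value));
       }

       // Maps column names to field IDs, skipping columns without an ID.
       // Throws if no column has an ID at all.
       static Map<String, Integer> convert(List<Column> columns) {
           Map<String, Integer> ids = new LinkedHashMap<>();
           for (Column column : columns) {
               fieldId(column).ifPresent(id -> ids.put(column.name, id));
           }
           if (ids.isEmpty()) {
               throw new IllegalArgumentException("ORC schema does not contain Iceberg IDs");
           }
           return ids;
       }

       public static void main(String[] args) {
           // A file whose writer attached IDs to the columns converts fine.
           List<Column> withIds = Arrays.asList(
               new Column("id", Collections.singletonMap(ID_ATTRIBUTE, "1")),
               new Column("data", Collections.singletonMap(ID_ATTRIBUTE, "2")));
           System.out.println(convert(withIds));

           // A file written without ID attributes (the Hive case here) is rejected.
           List<Column> withoutIds = Arrays.asList(
               new Column("id", Collections.<String, String>emptyMap()),
               new Column("data", Collections.<String, String>emptyMap()));
           try {
               convert(withoutIds);
           } catch (IllegalArgumentException e) {
               System.out.println("Rejected: " + e.getMessage());
           }
       }
   }
   ```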


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] waterlx commented on issue #1604: Importing Hive table (using ORC) is blocked by "ORC schema does not contain Iceberg IDs"

Posted by GitBox <gi...@apache.org>.
waterlx commented on issue #1604:
URL: https://github.com/apache/iceberg/issues/1604#issuecomment-707715436


   This issue can be closed: it is fixed by PR #1399. I was not running the latest code, so the change from #1399 had not been picked up.




[GitHub] [iceberg] waterlx closed issue #1604: Importing Hive table (using ORC) is blocked by "ORC schema does not contain Iceberg IDs"

Posted by GitBox <gi...@apache.org>.
waterlx closed issue #1604:
URL: https://github.com/apache/iceberg/issues/1604


   

