Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/12/07 23:26:01 UTC

[GitHub] [iceberg] 1ambda commented on issue #3295: Can spark operates iceberg table created by hive?

1ambda commented on issue #3295:
URL: https://github.com/apache/iceberg/issues/3295#issuecomment-988336942


   Hi, 
   
   I am trying to read an Iceberg table created by PrestoDB (0.266 with Iceberg 0.12) from Spark (3.2.0 with an Iceberg 0.13 SNAPSHOT).
   - Presto can read Iceberg tables created by Spark without any errors.
   - However, Spark throws an error when reading Iceberg tables created by Presto.
   
   ```scala
   // Spark SQL
   // printSchema() returns Unit, so it cannot be chained with show()
   val df = spark.sql("SELECT * FROM hive_prod.test_db.test_table LIMIT 10")
   df.printSchema()
   df.show()
   
   // ERROR
   Caused by: java.lang.ClassCastException: [B cannot be cast to org.apache.spark.unsafe.types.UTF8String
     at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String(rows.scala:46)
     at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String$(rows.scala:46)
     at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
   ```
   
   Here are the DDL and DML statements used in Presto to create the Iceberg table and insert data into it.
   
   ```sql
   CREATE TABLE iceberg.test_db.test_table (
       col_a bigint comment '...' ,
       col_b varchar comment '...',
       p_ymd varchar
   )
   WITH (
       format = 'PARQUET',
       partitioning = ARRAY['p_ymd'],
       location = 's3://...'
   )
    
   INSERT INTO iceberg.test_db.test_table
   SELECT col_a, col_b, '20211201' as p_ymd
   FROM existing_table
   ```
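   
   One way to narrow this down, as a minimal sketch only: compare the schema Spark derives from the Iceberg metadata with the physical schema of one Parquet data file written by Presto. It assumes `spark` is the active session in spark-shell, that the Iceberg `files` metadata table is reachable through the same `hive_prod` catalog, and that Spark can read the S3 location directly. If `col_b` or `p_ymd` shows up as `binary` in the second output, the file may lack the string annotation Spark expects; that is a guess, not a confirmed cause.
   
   ```scala
   // Schema as Spark sees it through the Iceberg catalog
   spark.table("hive_prod.test_db.test_table").printSchema()
   
   // Pick one data file path from the Iceberg `files` metadata table
   val dataFile = spark
     .sql("SELECT file_path FROM hive_prod.test_db.test_table.files LIMIT 1")
     .head()
     .getString(0)
   
   // Physical schema read straight from that Parquet file, bypassing Iceberg
   spark.read.parquet(dataFile).printSchema()
   ```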
   
   @RussellSpitzer Could the different versions of the Iceberg library used in each compute engine affect the read query?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
