Posted to issues@paimon.apache.org by "melin (via GitHub)" <gi...@apache.org> on 2023/11/08 10:54:01 UTC

[I] SparkGenericCatalog does not support Iceberg [incubator-paimon]

melin opened a new issue, #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/incubator-paimon/issues) and found nothing similar.
   
   
   ### Paimon version
   
   0.5.0-incubating
   
   ### Compute Engine
   
   ```
           val spark = SparkSession.builder()
             .master("local")
             .enableHiveSupport()
             .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
   
             .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
             .config("spark.sql.catalog.hudi_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
   
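             // Both calls below set the same key, so the later SparkSessionCatalog value overrides SparkCatalog.
             .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkCatalog")
             .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")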
             .config("spark.sql.catalog.iceberg_catalog.type", "hive")
             .config("spark.sql.catalog.iceberg_catalog.uri", "thrift://cdh2:9083")
   
             .config("spark.sql.catalog.hive_metastore", "org.apache.paimon.spark.SparkGenericCatalog")
             .getOrCreate()
   
           println("hello hudi")
           spark.sql("select * from hive_metastore.bigdata.hudi_sample_1").show
           println("hello paimon")
           spark.sql("select * from hive_metastore.bigdata.paimon_sample").show
           println("hello iceberg")
           spark.sql("select * from hive_metastore.bigdata.iceberg_sample_1").show
   ```
   
   ### Minimal reproduce step
   
   ```
   hello hudi
   +-------------------+--------------------+------------------+----------------------+-----------------+---+----+
   |_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name| id|data|
   +-------------------+--------------------+------------------+----------------------+-----------------+---+----+
   +-------------------+--------------------+------------------+----------------------+-----------------+---+----+
   
   hello paimon
   +---+---+
   |  k|  v|
   +---+---+
   +---+---+
   
   hello iceberg
   Exception in thread "main" java.lang.RuntimeException: java.lang.InstantiationException
   	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:135)
   	at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:194)
   	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:208)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:505)
   	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:487)
   	at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:61)
   	at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:4217)
   	at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:3201)
   	at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:4207)
   	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:526)
   	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:4205)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
   	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:4205)
   	at org.apache.spark.sql.Dataset.head(Dataset.scala:3201)
   	at org.apache.spark.sql.Dataset.take(Dataset.scala:3422)
   	at org.apache.spark.sql.Dataset.getRows(Dataset.scala:283)
   	at org.apache.spark.sql.Dataset.showString(Dataset.scala:322)
   	at org.apache.spark.sql.Dataset.show(Dataset.scala:808)
   	at org.apache.spark.sql.Dataset.show(Dataset.scala:767)
   	at org.apache.spark.sql.Dataset.show(Dataset.scala:776)
   	at org.example.spark.datasource.hudi.PaimonTest1$$anon$1.run(PaimonTest1.scala:45)
   	at org.example.spark.datasource.hudi.PaimonTest1$$anon$1.run(PaimonTest1.scala:22)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
   	at org.example.spark.datasource.hudi.PaimonTest1$.main(PaimonTest1.scala:22)
   	at org.example.spark.datasource.hudi.PaimonTest1.main(PaimonTest1.scala)
   Caused by: java.lang.InstantiationException
   	at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
   	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
   	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
   	... 53 more
   ```
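   
   A note on reading the trace above: the `Caused by` section shows `Constructor.newInstance` failing inside `InstantiationExceptionConstructorAccessorImpl`, which on JDK 8 is the accessor used for constructors of abstract classes. That suggests the input format class that `HadoopRDD.getInputFormat` passes to `ReflectionUtils.newInstance` is abstract; the trace does not say which class that is. The sketch below only reproduces the reflective failure mode in isolation. `AbstractInputFormat`, `ConcreteInputFormat` and `tryInstantiate` are hypothetical names made up for the illustration, not Hadoop, Spark, Iceberg or Paimon types.
   
   ```scala
   // Hypothetical stand-ins; these are NOT real Hadoop or Iceberg classes.
   abstract class AbstractInputFormat
   class ConcreteInputFormat extends AbstractInputFormat

   object InstantiationDemo {
     // Looks up the no-arg constructor and invokes it reflectively, which is
     // the same pattern that fails in the trace. Returns the exception class
     // name on the left if the class cannot be instantiated.
     def tryInstantiate(cls: Class[_]): Either[String, Any] =
       try {
         Right(cls.getDeclaredConstructor().newInstance())
       } catch {
         case e: InstantiationException => Left(e.getClass.getName)
       }

     def main(args: Array[String]): Unit = {
       // Abstract class: reflection refuses to construct it.
       println(tryInstantiate(classOf[AbstractInputFormat]))
       // Concrete subclass: construction succeeds.
       println(tryInstantiate(classOf[ConcreteInputFormat]).isRight)
     }
   }
   ```
   
   If this reading is right, the Iceberg query is being planned as a plain Hive table scan rather than being routed to an Iceberg reader, but that is an assumption drawn from the trace and not something confirmed here.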
   
   ### What doesn't meet your expectations?
   
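   I expect the `hive_metastore` catalog (`SparkGenericCatalog`) to support access to Paimon, Hudi, and Iceberg tables alike. Here the Hudi and Paimon queries succeed, but the Iceberg query fails with the exception above.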
   
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org