Posted to issues@paimon.apache.org by "melin (via GitHub)" <gi...@apache.org> on 2023/11/08 10:54:01 UTC

[I] SparkGenericCatalog not support Iceberg [incubator-paimon]

melin opened a new issue, #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292

   ### Search before asking
   
   - [X] I searched in the [issues](https://github.com/apache/incubator-paimon/issues) and found nothing similar.
   
   
   ### Paimon version
   
   0.5.0-incubating
   
   ### Compute Engine
   
   ```scala
           import org.apache.spark.sql.SparkSession

           val spark = SparkSession.builder()
             .master("local")
             .enableHiveSupport()
             .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
   
             // Hudi and Iceberg SQL extensions, plus a dedicated Hudi catalog
             .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
             .config("spark.sql.catalog.hudi_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
   
             // The same key is set twice; the second value (SparkSessionCatalog) overrides the first
             .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkCatalog")
             .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")
             .config("spark.sql.catalog.iceberg_catalog.type", "hive")
             .config("spark.sql.catalog.iceberg_catalog.uri", "thrift://cdh2:9083")
   
             // Paimon's SparkGenericCatalog, registered under the name hive_metastore
             .config("spark.sql.catalog.hive_metastore", "org.apache.paimon.spark.SparkGenericCatalog")
             .getOrCreate()
   
           // All three queries go through the hive_metastore catalog
           println("hello hudi")
           spark.sql("select * from hive_metastore.bigdata.hudi_sample_1").show()
           println("hello paimon")
           spark.sql("select * from hive_metastore.bigdata.paimon_sample").show()
           println("hello iceberg")
           spark.sql("select * from hive_metastore.bigdata.iceberg_sample_1").show()
   ```
   
   ### Minimal reproduce step
   
   ```
   hello hudi
   +-------------------+--------------------+------------------+----------------------+-----------------+---+----+
   |_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name| id|data|
   +-------------------+--------------------+------------------+----------------------+-----------------+---+----+
   +-------------------+--------------------+------------------+----------------------+-----------------+---+----+
   
   hello paimon
   +---+---+
   |  k|  v|
   +---+---+
   +---+---+
   
   hello iceberg
   Exception in thread "main" java.lang.RuntimeException: java.lang.InstantiationException
   	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:135)
   	at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:194)
   	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:208)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
   	at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
   	at scala.Option.getOrElse(Option.scala:189)
   	at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
   	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:505)
   	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:487)
   	at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:61)
   	at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:4217)
   	at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:3201)
   	at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:4207)
   	at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:526)
   	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:4205)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
   	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:4205)
   	at org.apache.spark.sql.Dataset.head(Dataset.scala:3201)
   	at org.apache.spark.sql.Dataset.take(Dataset.scala:3422)
   	at org.apache.spark.sql.Dataset.getRows(Dataset.scala:283)
   	at org.apache.spark.sql.Dataset.showString(Dataset.scala:322)
   	at org.apache.spark.sql.Dataset.show(Dataset.scala:808)
   	at org.apache.spark.sql.Dataset.show(Dataset.scala:767)
   	at org.apache.spark.sql.Dataset.show(Dataset.scala:776)
   	at org.example.spark.datasource.hudi.PaimonTest1$$anon$1.run(PaimonTest1.scala:45)
   	at org.example.spark.datasource.hudi.PaimonTest1$$anon$1.run(PaimonTest1.scala:22)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
   	at org.example.spark.datasource.hudi.PaimonTest1$.main(PaimonTest1.scala:22)
   	at org.example.spark.datasource.hudi.PaimonTest1.main(PaimonTest1.scala)
   Caused by: java.lang.InstantiationException
   	at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
   	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
   	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
   	... 53 more
   ```
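   
   The trace shows the Iceberg query being read through `HadoopRDD.getInputFormat`, which instantiates the table's input format class reflectively via `ReflectionUtils.newInstance`. A `java.lang.InstantiationException` is what Java reflection throws when the target class is abstract or an interface. One plausible reading, not confirmed anywhere in this thread, is that the input format recorded for the table in the metastore is not a concrete class. The sketch below only reproduces that reflection failure in isolation; `AbstractFormat` and `InstantiationDemo` are hypothetical stand-ins, not Hadoop, Spark, or Iceberg classes.
   
   ```scala
   // Hypothetical stand-in for an input format class that cannot be instantiated.
   abstract class AbstractFormat

   object InstantiationDemo {
     // Mirrors the reflective "get constructor, call newInstance" pattern.
     def tryInstantiate(cls: Class[_]): Either[ReflectiveOperationException, Any] =
       try Right(cls.getDeclaredConstructor().newInstance())
       catch { case e: ReflectiveOperationException => Left(e) }

     def main(args: Array[String]): Unit = {
       // Abstract class: reflection fails with InstantiationException.
       println(tryInstantiate(classOf[AbstractFormat]))
       // Concrete class with a no-arg constructor: reflection succeeds.
       println(tryInstantiate(classOf[java.lang.StringBuilder]))
     }
   }
   ```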
   
   ### What doesn't meet your expectations?
   
   The `hive_metastore` catalog (SparkGenericCatalog) is expected to support reading Paimon, Hudi, and Iceberg tables alike.
   
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "Paper-plane123 (via GitHub)" <gi...@apache.org>.
Paper-plane123 commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1825063051

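   Here's a joke: not everyone wearing a qipao at the gate of the college entrance exam hall is a beautiful woman; occasionally it is a middle-aged man.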




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "melin (via GitHub)" <gi...@apache.org>.
melin closed issue #2292: SparkGenericCatalog not support Iceberg
URL: https://github.com/apache/incubator-paimon/issues/2292




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1804028361

   @melin How did you use SparkGenericCatalog to create an Iceberg table?




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1803613143

   @JingsongLi 




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1813678670

   > > @melin How did you use SparkGenericCatalog to create an Iceberg table?
   > 
   > <img alt="image" width="728" src="https://user-images.githubusercontent.com/1145830/281783918-1b31c16a-c4e8-4da9-8886-1db89680e350.png">
   > Creating a Hudi table works fine.
   
   Creating a Hudi table does not strictly depend on the catalog.




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1803466261

   With iceberg_catalog, the following query runs successfully:
   select * from iceberg_catalog.bigdata.iceberg_sample_1
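   
   To compare the two catalogs side by side, a small helper along these lines could run the same query against each qualified table name and record which one fails. This is only a sketch: `CatalogProbe` and `probe` are hypothetical names, and the helper takes the query runner as a function so it does not depend on Spark itself.
   
   ```scala
   import scala.util.{Failure, Success, Try}

   object CatalogProbe {
     // Runs `query` for each table name; returns the name paired with the error text, if any.
     def probe(tables: Seq[String])(query: String => Unit): Seq[(String, Option[String])] =
       tables.map { t =>
         Try(query(t)) match {
           case Success(_) => t -> None
           case Failure(e) => t -> Some(e.toString)
         }
       }
   }

   // Usage with the session from the issue (assumes `spark` is in scope):
   //   CatalogProbe.probe(Seq(
   //     "hive_metastore.bigdata.iceberg_sample_1",
   //     "iceberg_catalog.bigdata.iceberg_sample_1"
   //   )) { t => spark.sql(s"select * from $t").show() }
   ```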




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1804056502

   > @melin How did you use SparkGenericCatalog to create an Iceberg table?
   
   <img width="728" alt="image" src="https://github.com/apache/incubator-paimon/assets/1145830/1b31c16a-c4e8-4da9-8886-1db89680e350">
   
   
   -----
   Creating a Hudi table works fine.
   




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1803653625

   The Iceberg table created through SparkGenericCatalog is broken: bigdata.iceberg_sample_1.
   bigdata.iceberg_sample_2 was created through the Iceberg catalog and is correct.
   ![image](https://github.com/apache/incubator-paimon/assets/1145830/4fc7cef5-754b-45dd-8334-cad51912eab0)
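   
   The screenshot is not reproduced here, so the exact statements used are unknown. As a rough illustration of the difference being described, the catalog prefix in the table name decides which catalog implementation handles the `CREATE TABLE`. The column list below is a guess and `IcebergDdl` is a hypothetical helper; only the `USING iceberg` clause is standard Iceberg Spark DDL.
   
   ```scala
   object IcebergDdl {
     // Builds a CREATE TABLE statement addressed to an explicit catalog.
     // The (id, data) schema is an assumption made for illustration only.
     def createTable(catalog: String, db: String, table: String): String =
       s"CREATE TABLE $catalog.$db.$table (id BIGINT, data STRING) USING iceberg"
   }

   // Handled by SparkGenericCatalog (the case reported as broken):
   //   spark.sql(IcebergDdl.createTable("hive_metastore", "bigdata", "iceberg_sample_1"))
   // Handled by the Iceberg catalog (the case reported as correct):
   //   spark.sql(IcebergDdl.createTable("iceberg_catalog", "bigdata", "iceberg_sample_2"))
   ```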
   
   @JingsongLi 
   




Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]

Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1859610032

   > > > @melin How did you use SparkGenericCatalog to create an Iceberg table?
   > > 
   > > 
   > > <img alt="image" width="728" src="https://user-images.githubusercontent.com/1145830/281783918-1b31c16a-c4e8-4da9-8886-1db89680e350.png">
   > > Creating a Hudi table works fine.
   > 
   > Creating a Hudi table does not strictly depend on the catalog.
   
   
   Can this problem be fixed?

