Posted to issues@paimon.apache.org by "melin (via GitHub)" <gi...@apache.org> on 2023/11/08 10:54:01 UTC
[I] SparkGenericCatalog not support Iceberg [incubator-paimon]
melin opened a new issue, #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292
### Search before asking
- [X] I searched in the [issues](https://github.com/apache/incubator-paimon/issues) and found nothing similar.
### Paimon version
0.5.0-incubating
### Compute Engine
```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local")
  .enableHiveSupport()
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.hudi_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
  // The same key is set twice; the later value (SparkSessionCatalog) is the one that takes effect.
  .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkSessionCatalog")
  .config("spark.sql.catalog.iceberg_catalog.type", "hive")
  .config("spark.sql.catalog.iceberg_catalog.uri", "thrift://cdh2:9083")
  // Paimon's generic catalog, registered under the name hive_metastore.
  .config("spark.sql.catalog.hive_metastore", "org.apache.paimon.spark.SparkGenericCatalog")
  .getOrCreate()

// All three tables are read through the hive_metastore catalog.
println("hello hudi")
spark.sql("select * from hive_metastore.bigdata.hudi_sample_1").show
println("hello paimon")
spark.sql("select * from hive_metastore.bigdata.paimon_sample").show
println("hello iceberg")
spark.sql("select * from hive_metastore.bigdata.iceberg_sample_1").show
```
### Minimal reproduce step
```
hello hudi
+-------------------+--------------------+------------------+----------------------+-----------------+---+----+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name| id|data|
+-------------------+--------------------+------------------+----------------------+-----------------+---+----+
+-------------------+--------------------+------------------+----------------------+-----------------+---+----+
hello paimon
+---+---+
| k| v|
+---+---+
+---+---+
hello iceberg
Exception in thread "main" java.lang.RuntimeException: java.lang.InstantiationException
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:135)
at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:194)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:208)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD.$anonfun$partitions$2(RDD.scala:291)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:287)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:505)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:487)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:61)
at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:4217)
at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:3201)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:4207)
at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:526)
at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:4205)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:4205)
at org.apache.spark.sql.Dataset.head(Dataset.scala:3201)
at org.apache.spark.sql.Dataset.take(Dataset.scala:3422)
at org.apache.spark.sql.Dataset.getRows(Dataset.scala:283)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:322)
at org.apache.spark.sql.Dataset.show(Dataset.scala:808)
at org.apache.spark.sql.Dataset.show(Dataset.scala:767)
at org.apache.spark.sql.Dataset.show(Dataset.scala:776)
at org.example.spark.datasource.hudi.PaimonTest1$$anon$1.run(PaimonTest1.scala:45)
at org.example.spark.datasource.hudi.PaimonTest1$$anon$1.run(PaimonTest1.scala:22)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.example.spark.datasource.hudi.PaimonTest1$.main(PaimonTest1.scala:22)
at org.example.spark.datasource.hudi.PaimonTest1.main(PaimonTest1.scala)
Caused by: java.lang.InstantiationException
at sun.reflect.InstantiationExceptionConstructorAccessorImpl.newInstance(InstantiationExceptionConstructorAccessorImpl.java:48)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
... 53 more
```
### What doesn't meet your expectations?
The hive_metastore catalog should support access to all three table types: Paimon, Hudi, and Iceberg.
### Anything else?
_No response_
### Are you willing to submit a PR?
- [ ] I'm willing to submit a PR!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "Paper-plane123 (via GitHub)" <gi...@apache.org>.
Paper-plane123 commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1825063051
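Here's a joke: the people wearing a qipao outside the gaokao exam hall are not necessarily all beautiful women; occasionally there is a middle-aged man too.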
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "melin (via GitHub)" <gi...@apache.org>.
melin closed issue #2292: SparkGenericCatalog not support Iceberg
URL: https://github.com/apache/incubator-paimon/issues/2292
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1804028361
@melin How can you use SparkGenericCatalog to create an Iceberg table?
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1803613143
@JingsongLi
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1813678670
> > @melin How can you use SparkGenericCatalog to create an Iceberg table?
>
> <img alt="image" width="728" src="https://user-images.githubusercontent.com/1145830/281783918-1b31c16a-c4e8-4da9-8886-1db89680e350.png">
> Creating a Hudi table works fine.
Creating a Hudi table does not strictly depend on the catalog.
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1803466261
Using iceberg_catalog, the query runs successfully:
select * from iceberg_catalog.bigdata.iceberg_sample_1
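
As an interim workaround, queries could be routed to a catalog by table format. The following is only a minimal sketch: `CatalogRouting` and `qualified` are hypothetical names, not part of Paimon, Iceberg, or Spark, and the mapping simply assumes the catalog names configured in the session from the issue description.

```scala
// Hypothetical helper for illustration only; not an API of Paimon, Iceberg, or Spark.
object CatalogRouting {
  // Assumed mapping, based on which catalog could read each table in this issue.
  private val catalogByFormat: Map[String, String] = Map(
    "paimon"  -> "hive_metastore",
    "hudi"    -> "hive_metastore",
    "iceberg" -> "iceberg_catalog"
  )

  /** Builds a fully qualified table name (catalog.db.table) for a table format. */
  def qualified(format: String, db: String, table: String): String = {
    val catalog = catalogByFormat.getOrElse(
      format.toLowerCase,
      throw new IllegalArgumentException(s"unknown table format: $format"))
    s"$catalog.$db.$table"
  }
}
```

With the `spark` session from the issue, the Iceberg query would then be issued as `spark.sql(s"select * from ${CatalogRouting.qualified("iceberg", "bigdata", "iceberg_sample_1")}").show`. This only sidesteps the problem; it does not make hive_metastore read Iceberg tables.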
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1804056502
> @melin How can you use SparkGenericCatalog to create an Iceberg table?
<img width="728" alt="image" src="https://github.com/apache/incubator-paimon/assets/1145830/1b31c16a-c4e8-4da9-8886-1db89680e350">
-----
Creating a Hudi table works fine.
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1803653625
The Iceberg table created through SparkGenericCatalog has a problem: bigdata.iceberg_sample_1
bigdata.iceberg_sample_2 was created through the Iceberg catalog and is correct.
![image](https://github.com/apache/incubator-paimon/assets/1145830/4fc7cef5-754b-45dd-8334-cad51912eab0)
@JingsongLi
Re: [I] SparkGenericCatalog not support Iceberg [incubator-paimon]
Posted by "melin (via GitHub)" <gi...@apache.org>.
melin commented on issue #2292:
URL: https://github.com/apache/incubator-paimon/issues/2292#issuecomment-1859610032
> > > @melin How can you use SparkGenericCatalog to create an Iceberg table?
> >
> >
> > <img alt="image" width="728" src="https://user-images.githubusercontent.com/1145830/281783918-1b31c16a-c4e8-4da9-8886-1db89680e350.png">
> > Creating a Hudi table works fine.
>
> Creating a Hudi table does not strictly depend on the catalog.
Can this problem be solved?