Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/07 08:36:32 UTC

[GitHub] [hudi] felixYyu opened a new issue, #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

felixYyu opened a new issue, #5779:
URL: https://github.com/apache/hudi/issues/5779

   **Describe the problem you faced**
   
   A clear and concise description of the problem.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Create the Spark session:
   ```
   import org.apache.spark.sql.SparkSession
   import org.apache.spark.sql.hudi.HoodieSparkSessionExtension

   val spark = SparkSession
         .builder()
         .master("local[*]")
         .appName("HoodieAPI")
         // Register Hudi's SQL extensions programmatically.
         .withExtensions(new HoodieSparkSessionExtension)
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .config("hoodie.insert.shuffle.parallelism", "4")
         .config("hoodie.upsert.shuffle.parallelism", "4")
         .config("hoodie.delete.shuffle.parallelism", "4")
         .config("spark.sql.warehouse.dir", "file:///D://hudi-data/testdb/")
         // Use HoodieCatalog as the session catalog.
         .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
         .getOrCreate()
   ```
   2. Create the database:
   ```
   spark.sql("create database if not exists testdb")
   spark.sql("use testdb")
   ```
   3. Create the Hudi table:
   ```
   spark.sql(
         s"""
           |create table if not exists testdb.$tableName (
           |id bigint,
           |name string,
           |dt string,
           |hh string,
           |ts timestamp
           |) using hudi
           |partitioned by (dt, hh)
           |tblproperties(
           |    hoodie.database.name = 'testdb',
           |    primaryKey = 'id',
           |    preCombineField = 'ts',
           |    hoodie.datasource.write.operation = 'upsert',
           |    hoodie.datasource.meta.sync.enable = 'true',
           |    type = '$tableType'
           | )
           | options (
           |    hoodie.metadata.enable = 'true'
           | )
           | location '$basePath'
           |""".stripMargin)
   ```
   4. Run `spark.sql("show tables").show(false)`:
   +---------+---------------+-----------+
   |namespace|tableName      |isTemporary|
   +---------+---------------+-----------+
   |testdb   |hudi_trips_mor2|false      |
   +---------+---------------+-----------+
   
   5. **Running only `show tables` again from a new Spark session returns an empty result. How should the SparkSession be configured?**
   +---------+---------------+-----------+
   |namespace|tableName      |isTemporary|
   +---------+---------------+-----------+
   +---------+---------------+-----------+
   
   **Expected behavior**
   
   A clear and concise description of what you expected to happen.
   
   **Environment Description**
   
   * Hudi version : 0.11.0
   
   * Spark version : 3.2.1
   
   * Hive version : 3.1.3
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) : local
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   Add any other context about the problem here.
   
   **Stacktrace**
   
   ```Add the stacktrace of the error.```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] nsivabalan commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1149126184

   CC @leesf @XuQianJin-Stars 




[GitHub] [hudi] leesf commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
leesf commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1159646409

   > Table should be visible within the same spark session it was created
   
   @felixYyu I agree with kazdy's comment. Feel free to reopen the issue if you run into any other problems.




[GitHub] [hudi] nsivabalan commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1149125960

   I am not sure what `.withExtensions(new HoodieSparkSessionExtension)` does, but can you try adding the Spark config below when launching Spark?
   ```
   'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
   ```
   




[GitHub] [hudi] nsivabalan commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1152899383

   @minihippo can you help triage this, please?




[GitHub] [hudi] leesf commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
leesf commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1152947593

   @felixYyu did the table sync to HMS successfully? I think the catalog is only in memory, so after Spark restarts the table that was created is no longer found.
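   
   A minimal way to check this, offered as a sketch rather than a confirmed diagnosis: Spark exposes the session catalog backend through the static setting `spark.sql.catalogImplementation`, which defaults to `in-memory` and becomes `hive` only when Hive support is enabled. `CatalogCheck` and its method names are made up for illustration; they are not part of Spark or Hudi.
   
   ```scala
   import org.apache.spark.sql.SparkSession

   // Illustrative helper (hypothetical name, not a Spark or Hudi API).
   object CatalogCheck {
     // Returns "in-memory" or "hive", Spark's two built-in catalog backends.
     def implementation(spark: SparkSession): String =
       spark.conf.get("spark.sql.catalogImplementation")

     // Table definitions outlive the JVM only when a metastore backs the catalog.
     def survivesRestart(spark: SparkSession): Boolean =
       implementation(spark) == "hive"
   }
   ```
   
   Calling `CatalogCheck.implementation(spark)` on the session built in this issue would be expected to return `in-memory`, because the builder never calls `enableHiveSupport()`. In that case only the catalog entry is lost on restart; the files under the table location are not touched.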




[GitHub] [hudi] leesf closed issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
leesf closed issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]
URL: https://github.com/apache/hudi/issues/5779




[GitHub] [hudi] kazdy commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
kazdy commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1159537984

   Table should be visible within the same spark session it was created




[GitHub] [hudi] felixYyu commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
felixYyu commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1149355274

   After adding the Spark config below, the result is still empty. I think the SparkSession is not loading the metadata?
   
   ```
   val spark = SparkSession
         .builder()
         .master("local[*]")
         .appName("HoodieAPI")
   //      .withExtensions(new HoodieSparkSessionExtension)
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
         .config("hoodie.insert.shuffle.parallelism", "4")
         .config("hoodie.upsert.shuffle.parallelism", "4")
         .config("hoodie.delete.shuffle.parallelism", "4")
         .config("spark.sql.warehouse.dir", "file:///D://hudi-data/testdb/")
         .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.hudi.catalog.HoodieCatalog")
         .getOrCreate()
   
   spark.sql("show tables").show(false)
   ```




[GitHub] [hudi] felixYyu commented on issue #5779: [SUPPORT]Spark session operation hudi table via HoodieCatalog[Table or view not found]

Posted by GitBox <gi...@apache.org>.
felixYyu commented on issue #5779:
URL: https://github.com/apache/hudi/issues/5779#issuecomment-1153372252

   So the table is only stored in the local in-memory catalog. Is there a way to store the metadata locally for unit testing? @leesf 
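   
   One possible setup for this, as an untested sketch: calling `enableHiveSupport()` makes Spark use a Hive metastore for the session catalog, and when no `hive-site.xml` is on the classpath Spark falls back to an embedded Derby database (`metastore_db`) in the working directory, so table definitions are stored locally and outlive the JVM. This assumes the `spark-hive` module is on the classpath and that `HoodieCatalog` in Hudi 0.11.0 behaves the same on top of the Hive-backed session catalog; neither is confirmed in this thread. `PersistentCatalogSession` is a hypothetical name, and the settings are copied from the session posted in this issue.
   
   ```scala
   import org.apache.spark.sql.SparkSession

   // Hypothetical wrapper for illustration; not part of Spark or Hudi.
   object PersistentCatalogSession {
     // Hudi-related settings taken from the session in this issue.
     val hudiSettings: Map[String, String] = Map(
       "spark.serializer" -> "org.apache.spark.serializer.KryoSerializer",
       "spark.sql.extensions" -> "org.apache.spark.sql.hudi.HoodieSparkSessionExtension",
       "spark.sql.catalog.spark_catalog" -> "org.apache.spark.sql.hudi.catalog.HoodieCatalog",
       "spark.sql.warehouse.dir" -> "file:///D://hudi-data/testdb/"
     )

     def build(): SparkSession = {
       val builder = SparkSession
         .builder()
         .master("local[*]")
         .appName("HoodieAPI")
         // Assumption: switches the catalog backend from in-memory to hive,
         // using an embedded Derby metastore when no hive-site.xml is present.
         .enableHiveSupport()
       // Apply every Hudi setting to the builder, then create the session.
       hudiSettings
         .foldLeft(builder) { case (b, (key, value)) => b.config(key, value) }
         .getOrCreate()
     }
   }
   ```
   
   Note that `getOrCreate()` returns an already running session if there is one, so `build()` has to be the first place the session is created for the Hive setting to take effect.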

