Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/09/15 03:22:31 UTC

[GitHub] [hudi] kelvin-qin opened a new issue #3662: [SUPPORT] Hudi's CTAS

kelvin-qin opened a new issue #3662:
URL: https://github.com/apache/hudi/issues/3662




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] xushiyan closed issue #3662: [SUPPORT] Error on the spark version in the desc information of the hudi CTAS Table

Posted by GitBox <gi...@apache.org>.
xushiyan closed issue #3662:
URL: https://github.com/apache/hudi/issues/3662


   





[GitHub] [hudi] xushiyan commented on issue #3662: [SUPPORT] Error on the spark version in the desc information of the hudi CTAS Table

Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #3662:
URL: https://github.com/apache/hudi/issues/3662#issuecomment-950438362


   @kelvin-qin thanks for reproducing this! I can reproduce it too: the Spark version info is wrong when a table is created via CTAS from a Hudi table, because the version info is not propagated correctly. It'd be a nice fix. Filing a JIRA now. If you're keen, please feel free to take it.
   https://issues.apache.org/jira/browse/HUDI-2610





[GitHub] [hudi] kelvin-qin commented on issue #3662: [SUPPORT] Error on the spark version in the desc information of the hudi CTAS Table

Posted by GitBox <gi...@apache.org>.
kelvin-qin commented on issue #3662:
URL: https://github.com/apache/hudi/issues/3662#issuecomment-938489457


   @xushiyan Thanks. I was away on vacation. I tested it again directly with spark-sql:
   ## start command line:
   /path-to-spark3/spark-3.0.3/bin/spark-sql --master yarn --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
   
   ## spark version info:
   21/10/08 17:01:41 INFO SparkContext: Running Spark version 3.0.3
   21/10/08 17:01:41 INFO ResourceUtils: =====================
   21/10/08 17:01:41 INFO ResourceUtils: Resources for spark.driver:
   
   ## create hudi table cmd:
   spark-sql> create table if not exists hudi_table0 (
     id int, 
     name string, 
     price double
   ) using hudi
   options (
     type = 'cow',
     primaryKey = 'id'
   ); 
   ## CTAS cmd:
   spark-sql> create table  h0 using hudi options (type = 'cow',primaryKey = 'id') as select id,name,price from hudi_table0; 
   
   ## Show tables
   spark-sql> show tables;
   default h0      false
   default hudi_table0     false
   
   ## DESC tables:
   1. hudi_table0
   spark-sql> desc formatted hudi_table0;
   ---
   Database        default
   Table   hudi_table0
   Owner   hive
   Created Time    Fri Oct 08 17:06:04 CST 2021
   Last Access     UNKNOWN
   Created By      Spark 3.0.3
   Type    MANAGED
   Provider        hudi
   
   2. h0
   spark-sql> desc formatted h0;
   ---
   Database        default
   Table   h0
   Created Time    Fri Oct 08 17:08:09 CST 2021
   Last Access     UNKNOWN
   Created By      Spark 2.2 or prior
   Type    EXTERNAL
   Provider        hudi
   
   # So h0 shows "Created By: Spark 2.2 or prior", even though the session is running Spark 3.0.3.





[GitHub] [hudi] xushiyan commented on issue #3662: [SUPPORT] Error on the spark version in the desc information of the hudi CTAS Table

Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #3662:
URL: https://github.com/apache/hudi/issues/3662#issuecomment-928965831


   @kelvin-qin I can't reproduce this. CTAS from this unit test gave the correct info:
    `org.apache.spark.sql.hudi.TestHoodieSqlBase#test("Test Create Table As Select")`
   
   ```scala
   spark.sql(
     s"""describe table extended $tableName1"""
   ).show(100)
   ```
   
   ```
   |          Created By|         Spark 2.4.4|       |
   |                Type|            EXTERNAL|       |
   |            Provider|                hudi|       |
   ```
   
   Can you post your SQL and also print `spark.version` in your Spark shell to double-check?
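   
   A minimal sketch of that check, assuming a `spark-shell` session (where `spark` is the predefined `SparkSession`) and using `h0` as a placeholder name for the table created by CTAS. It prints the session's Spark version and then shows only the `Created By` row of the table metadata, so the two values can be compared side by side:
   
   ```scala
   // Assumption: run inside spark-shell, where `spark` is already defined.
   // Assumption: `h0` is a placeholder; substitute the name of your CTAS table.
   import org.apache.spark.sql.functions.col
   
   // Version of the Spark runtime that this session is actually using.
   println(s"Session Spark version: ${spark.version}")
   
   // DESCRIBE TABLE EXTENDED returns rows of (col_name, data_type, comment);
   // keep only the metadata row that records which Spark version created the table.
   spark.sql("describe table extended h0")
     .filter(col("col_name") === "Created By")
     .show(truncate = false)
   ```
   
   If the `Created By` value differs from the printed session version, that points to the version info being lost when the table is created rather than to a mismatched Spark installation.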




