Posted to commits@hudi.apache.org by "xuzifu666 (via GitHub)" <gi...@apache.org> on 2023/03/09 01:43:46 UTC

[GitHub] [hudi] xuzifu666 opened a new issue, #8138: [SUPPORT] Merge into not get the right value in update action

xuzifu666 opened a new issue, #8138:
URL: https://github.com/apache/hudi/issues/8138

   **Describe the problem you faced**
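       import org.apache.spark.SparkConf
       import org.apache.spark.sql.SparkSession
   
       // Assumed definition: the report does not show this case class. The fields are
       // inferred from the rows and the SELECT below. Declare it at top level so toDF can derive an encoder.
       case class HudiDataWithData(id: Int, name: String, data: Int, country: String, ts: Long)
   
       val conf = new SparkConf().setAppName("insertDatasToHudi").setMaster("local[*]")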
       val spark = SparkSession.builder().config(conf)
         .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
         .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
         .getOrCreate()
       import spark.implicits._
   
       val incDF = Seq(
         HudiDataWithData(1, "lb", 8, "shu", 1646643412L),
         HudiDataWithData(1, "lb", 2, "shu", 1646643412L),
         HudiDataWithData(1, "lb", 7, "shu", 1646643412L),
         HudiDataWithData(2, "gy", 12, "shu", 1646643193L),
         HudiDataWithData(1, "cc", 22, "wei", 1646643193L),
         HudiDataWithData(2, "xy", 23, "wei", 1646643193L)
       ).toDF
       incDF.createOrReplaceTempView("inc_table")
   
   
       spark.sql(
         s"""
            |create table hudi_cow_pt_tbl (
            |  id int,
            |  name string,
            |  data int,
            |  country string,
            |  ts bigint
            |) using hudi
            |tblproperties (
            |  type = 'cow',
            |  primaryKey = 'id',
            |  preCombineField = 'ts'
            | )
            |partitioned by (country)
            |location 'D:/tmp/hudi_data/hudi_merge_test01'
            |""".stripMargin)
   
       spark.sql(
         s"""
            |merge into hudi_cow_pt_tbl as target
            |using (
            |	select id, name, data, country, ts from inc_table
            |) source
            |on source.id = target.id
            |when matched and source.data > target.data then
            |update set target.data = source.data, target.ts = source.ts
            |when not matched then
            |insert *
            |""".stripMargin)
   
   When we query the table after the merge, the record with name 'lb' has data = 7, not the expected 8:
   ------------
   [20230308151658945,20230308151658945_0_0,1,country=wei,c0d7ce20-c40f-4064-8e9c-22d4dd2b1e2a-0_0-14-48_20230309094305111.parquet,1,cc,44,1646643193,wei]
   ------------
   [20230308151658945,20230308151658945_0_1,2,country=wei,c0d7ce20-c40f-4064-8e9c-22d4dd2b1e2a-0_0-14-48_20230309094305111.parquet,2,xy,46,1646643193,wei]
   ------------
   [20230308214428068,20230308214428068_1_0,2,country=shu,28617435-0f34-4c3f-a9e1-859611b68094-0_1-14-49_20230309094305111.parquet,2,gy,12,1646643193,shu]
   ------------
   [20230309094305111,20230309094305111_1_1,1,country=shu,28617435-0f34-4c3f-a9e1-859611b68094-0_1-14-49_20230309094305111.parquet,1,lb,7,1646643412,shu]
   ------------
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] xuzifu666 closed issue #8138: [SUPPORT] Merge into not get the right value in update action

Posted by "xuzifu666 (via GitHub)" <gi...@apache.org>.
xuzifu666 closed issue #8138: [SUPPORT] Merge into not get the right value in update action
URL: https://github.com/apache/hudi/issues/8138



[GitHub] [hudi] KnightChess commented on issue #8138: [SUPPORT] Merge into not get the right value in update action

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on issue #8138:
URL: https://github.com/apache/hudi/issues/8138#issuecomment-1461175607

   Incoming records from the source table are deduplicated before the update is applied, so make sure the deduplication produces the result you expect. Because the records for the same id share the same ts, which one survives is probably determined by the order of the data.
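
To illustrate the point about ties, here is a minimal plain-Scala sketch. It is a simplified model of a "keep one row per key by ts" step, not Hudi's actual implementation. The names `Rec`, `DedupSketch`, `dedupByTs` and `dedupByTsThenData` are hypothetical. The rows are the three source rows for id 1 from the report.

```scala
// Simplified model of a pre-merge dedup on the precombine field; not Hudi's code.
final case class Rec(id: Int, name: String, data: Int, country: String, ts: Long)

object DedupSketch {
  // The three source rows for id = 1 from the issue; all share the same ts.
  val source: Seq[Rec] = Seq(
    Rec(1, "lb", 8, "shu", 1646643412L),
    Rec(1, "lb", 2, "shu", 1646643412L),
    Rec(1, "lb", 7, "shu", 1646643412L)
  )

  // Keep one row per id by comparing only ts. On a tie the later row in the
  // sequence replaces the earlier one, so the survivor depends on input order.
  def dedupByTs(rows: Seq[Rec]): Map[Int, Rec] =
    rows.groupBy(_.id).map { case (id, rs) =>
      id -> rs.reduce((a, b) => if (b.ts >= a.ts) b else a)
    }

  // Deterministic alternative: break ts ties on data, so the largest data wins
  // regardless of input order.
  def dedupByTsThenData(rows: Seq[Rec]): Map[Int, Rec] =
    rows.groupBy(_.id).map { case (id, rs) =>
      id -> rs.maxBy(r => (r.ts, r.data))
    }
}
```

In this model the ts-only dedup returns a different row when the input order changes, while the tie-broken version always keeps the row with the largest data. Applied to the report, one possible approach (an assumption, not confirmed in this thread) is to make the `using` subquery return a single row per id, or to give the rows distinct ts values, so the result no longer depends on ordering.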



[GitHub] [hudi] xuzifu666 commented on issue #8138: [SUPPORT] Merge into not get the right value in update action

Posted by "xuzifu666 (via GitHub)" <gi...@apache.org>.
xuzifu666 commented on issue #8138:
URL: https://github.com/apache/hudi/issues/8138#issuecomment-1465565014

   OK, finished.

