Posted to commits@hudi.apache.org by "zouxxyy (Jira)" <ji...@apache.org> on 2023/03/31 02:59:00 UTC

[jira] [Updated] (HUDI-6007) When using the MOR table with flink, hudi savepoint may be invalid which lead to consistency issue

     [ https://issues.apache.org/jira/browse/HUDI-6007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zouxxyy updated HUDI-6007:
--------------------------
    Description: 
Currently, Hudi's savepoint records only base files, and the cleaner skips the files recorded in it when cleaning.

But when using a MOR table with Flink, the savepoint may be invalid, because a file group (FG) may contain no base file.

Files in the partition:
{code:java}
.4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.1_0-1-0
.4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.2_0-1-0
.hoodie_partition_metadata {code}
20230330235205548.savepoint:
{code:java}
{
  "savepointedBy": "",
  "savepointedAt": 1680191759153,
  "comments": "",
  "partitionMetadata": {
    "dt=2021-12-09/hh=10": {
      "partitionPath": "dt=2021-12-09/hh=10",
      "savepointDataFile": []
    }
  },
  "version": {
    "int": 1
  }
} {code}
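
The gap described above can be illustrated with a minimal sketch. It is not Hudi's implementation: the helper names `savepointed_files` and `unprotected_log_files` are hypothetical, and the assumption that a file is protected only when it is listed under `savepointDataFile` is a simplification of the cleaner's behavior. The metadata and file names are copied from this report.

```python
import json

# Savepoint metadata copied from the report above.
SAVEPOINT_JSON = """
{
  "savepointedBy": "",
  "savepointedAt": 1680191759153,
  "comments": "",
  "partitionMetadata": {
    "dt=2021-12-09/hh=10": {
      "partitionPath": "dt=2021-12-09/hh=10",
      "savepointDataFile": []
    }
  },
  "version": {"int": 1}
}
"""

# Files present in the partition, copied from the listing above.
PARTITION_FILES = [
    ".4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.1_0-1-0",
    ".4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.2_0-1-0",
    ".hoodie_partition_metadata",
]


def savepointed_files(savepoint, partition):
    """Return the file names the savepoint records for one partition."""
    meta = savepoint["partitionMetadata"].get(partition, {})
    return set(meta.get("savepointDataFile", []))


def unprotected_log_files(files, protected):
    """Hypothetical check: log files that the savepoint does not list."""
    return [f for f in files if ".log." in f and f not in protected]


savepoint = json.loads(SAVEPOINT_JSON)
protected = savepointed_files(savepoint, "dt=2021-12-09/hh=10")
exposed = unprotected_log_files(PARTITION_FILES, protected)
print(len(protected), len(exposed))
```

Under these assumptions the savepoint lists no files for the partition, so both log files fall outside its protection even though they hold all of the file group's data.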
 

  was:
Currently hudi's savepoint only saves the base file, and filter the files in it when clean.

But when using the MOR table with flink, hudi's savepoint is may invalid, because there may no base file in a FG.

file:
{code:java}
.4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.1_0-1-0
.4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.2_0-1-0
.hoodie_partition_metadata {code}
20230330235205548.savepoint:
{code:java}
{
  "savepointedBy": "",
  "savepointedAt": 1680191759153,
  "comments": "",
  "partitionMetadata": {
    "dt=2021-12-09/hh=10": {
      "partitionPath": "dt=2021-12-09/hh=10",
      "savepointDataFile": []
    }
  },
  "version": {
    "int": 1
  }
} {code}
 


> When using the MOR table with flink, hudi savepoint may be invalid which lead to consistency issue
> --------------------------------------------------------------------------------------------------
>
>                 Key: HUDI-6007
>                 URL: https://issues.apache.org/jira/browse/HUDI-6007
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: zouxxyy
>            Priority: Major
>
> Currently, Hudi's savepoint records only base files, and the cleaner skips the files recorded in it when cleaning.
> But when using a MOR table with Flink, the savepoint may be invalid, because a file group (FG) may contain no base file.
> Files in the partition:
> {code:java}
> .4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.1_0-1-0
> .4e164104-da91-4c39-89b8-a81890cf2c8c_20230330235205548.log.2_0-1-0
> .hoodie_partition_metadata {code}
> 20230330235205548.savepoint:
> {code:java}
> {
>   "savepointedBy": "",
>   "savepointedAt": 1680191759153,
>   "comments": "",
>   "partitionMetadata": {
>     "dt=2021-12-09/hh=10": {
>       "partitionPath": "dt=2021-12-09/hh=10",
>       "savepointDataFile": []
>     }
>   },
>   "version": {
>     "int": 1
>   }
> } {code}
>  


