Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/11 13:11:43 UTC

[GitHub] [hudi] hbgstc123 opened a new issue, #6924: [SUPPORT]Flink clustering does not reserve commit metadata

hbgstc123 opened a new issue, #6924:
URL: https://github.com/apache/hudi/issues/6924

   When appending data to Hudi with Flink and inline async clustering enabled,
   the _hoodie_commit_time field is updated to the commit time of the replacecommit.
   This change to _hoodie_commit_time results in duplicate data when stream reading from this Hudi table.
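
A minimal sketch of why a rewritten _hoodie_commit_time can cause duplicates, assuming the streaming reader tracks its progress by commit instant and emits every row stamped with a newer instant. This is a toy model in plain Python, not the Hudi or Flink API; the function names (`incremental_read`, `cluster`, `simulate`) and the instants "001" to "003" are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def incremental_read(table: List[Row], last_instant: str) -> Tuple[List[Row], str]:
    """Toy streaming read: emit rows stamped after last_instant.

    Returns the emitted rows and the new high-water mark.
    """
    fresh = [r for r in table if r["_hoodie_commit_time"] > last_instant]
    new_mark = max([last_instant] + [r["_hoodie_commit_time"] for r in fresh])
    return fresh, new_mark


def cluster(table: List[Row], replace_instant: str, preserve_commit_time: bool) -> List[Row]:
    """Toy clustering: rewrite every row into new files.

    If preserve_commit_time is False, each row is restamped with the
    replacecommit instant, which is the behavior reported in this issue.
    """
    rewritten = []
    for row in table:
        copy = dict(row)
        if not preserve_commit_time:
            copy["_hoodie_commit_time"] = replace_instant
        rewritten.append(copy)
    return rewritten


def simulate(preserve_commit_time: bool) -> List[int]:
    """Read, cluster, read again; return the ids the reader has seen in order."""
    table = [
        {"id": 1, "_hoodie_commit_time": "001"},
        {"id": 2, "_hoodie_commit_time": "002"},
    ]
    first, mark = incremental_read(table, "000")
    table = cluster(table, "003", preserve_commit_time)
    second, _ = incremental_read(table, mark)
    return [r["id"] for r in first + second]


if __name__ == "__main__":
    print("restamped:", simulate(preserve_commit_time=False))
    print("preserved:", simulate(preserve_commit_time=True))
```

In this model the restamped run sees each id twice, while the run that preserves the original commit time sees each id once.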
   
   Steps to reproduce the behavior:
   1. Stream-write data to Hudi.
   2. Enable inline clustering.
   3. Check the _hoodie_commit_time field of the data after the replacecommit completes.
   
   * Hudi version : 0.12
   
   * Flink version : 1.13
   
   table ddl
   ```sql
   CREATE TEMPORARY TABLE target_hudi_table1
   (
       imp_date string,
       tag string,
       id bigint,
       name string,
       score double,
       ts timestamp(3)
   ) PARTITIONED BY (imp_date)
   WITH
   (
       -- Hudi settings
       'connector' = 'hudi',
       'path' = 'hdfs://...',
       'write.operation' = 'insert',
       'table.type' = 'COPY_ON_WRITE',
       'hoodie.table.keygenerator.class' = 'org.apache.hudi.keygen.SimpleKeyGenerator',
       'hoodie.datasource.write.recordkey.field' = 'id',
       'write.precombine.field' = 'ts',

       'hive_sync.partition_extractor_class' = 'org.apache.hudi.hive.MultiPartKeysValueExtractor',
       'hoodie.datasource.write.hive_style_partitioning' = 'true',

       'hoodie.metadata.enable' = 'false',
       'clean.retain_commits' = '5',
       'clustering.async.enabled' = 'true',
       'clustering.schedule.enabled' = 'true',
       'clustering.delta_commits' = '3'
   );

   insert into hudi_stream_write_append_mode_updatetest
   select imp_date, tag, id, name, score, ts
   from data_source;
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 closed issue #6924: [SUPPORT]Flink clustering does not reserve commit metadata

Posted by GitBox <gi...@apache.org>.
danny0405 closed issue #6924: [SUPPORT]Flink clustering does not reserve commit metadata
URL: https://github.com/apache/hudi/issues/6924




[GitHub] [hudi] danny0405 commented on issue #6924: [SUPPORT]Flink clustering does not reserve commit metadata

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6924:
URL: https://github.com/apache/hudi/issues/6924#issuecomment-1275878246

   And a PR here: https://github.com/apache/hudi/pull/6929




[GitHub] [hudi] danny0405 commented on issue #6924: [SUPPORT]Flink clustering does not reserve commit metadata

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6924:
URL: https://github.com/apache/hudi/issues/6924#issuecomment-1277056897

   Closing because we already have a PR to resolve this issue.




[GitHub] [hudi] hbgstc123 commented on issue #6924: [SUPPORT]Flink clustering does not reserve commit metadata

Posted by GitBox <gi...@apache.org>.
hbgstc123 commented on issue #6924:
URL: https://github.com/apache/hudi/issues/6924#issuecomment-1275975196

   thanks




[GitHub] [hudi] danny0405 commented on issue #6924: [SUPPORT]Flink clustering does not reserve commit metadata

Posted by GitBox <gi...@apache.org>.
danny0405 commented on issue #6924:
URL: https://github.com/apache/hudi/issues/6924#issuecomment-1275523729

   Thanks for the feedback. I have created a JIRA issue here: https://issues.apache.org/jira/browse/HUDI-5016

