Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/13 13:52:34 UTC

[GitHub] [hudi] liujinhui1994 opened a new pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

liujinhui1994 opened a new pull request #2438:
URL: https://github.com/apache/hudi/pull/2438


   DeltaStreamer kafka source supports consuming from specified timestamp
   
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   DeltaStreamer kafka source supports consuming from specified timestamp
   
   ## Brief change log
   
   org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen
   
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563









[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719


   I think we can simplify things. Let me walk through the relevant pseudocode.
   
   Within DeltaSync.read():
   ```
   // set the right checkpoint value
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
      checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
      checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
      checkpoint = Option.empty();
   }
   ```
   Note that the first `if` condition checks Checkpoint_RESET_Key, whereas the `else if` condition checks Checkpoint_Key.
   
   Within write():
   ```
   // towards the end
   commitMetadata.put(Checkpoint_Key, <updated checkpoint after writing>);
   if (cfg.checkpoint != null) {
     commitMetadata.put(Checkpoint_RESET_Key, cfg.checkpoint);
   }
   ```
   
   If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is read from commitMetadata.
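
To make the first-batch-only behavior concrete, here is a minimal Java sketch of the read and write steps described above. It is not the actual DeltaSync code: the class name `CheckpointSketch`, the string values of the two keys, and the use of a plain `Map` for the commit metadata are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the checkpoint handling discussed above.
final class CheckpointSketch {
  // Placeholder key names; the real constants live in the DeltaStreamer code.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

  // read(): choose the checkpoint to resume from.
  static Optional<String> resolve(String cfgCheckpoint, Map<String, String> lastCommitMetadata) {
    if (cfgCheckpoint != null && !lastCommitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      // The user-supplied checkpoint has not been consumed yet: honor it.
      return Optional.of(cfgCheckpoint);
    } else if (lastCommitMetadata.containsKey(CHECKPOINT_KEY)) {
      // Otherwise resume from what the previous commit recorded.
      return Optional.of(lastCommitMetadata.get(CHECKPOINT_KEY));
    }
    return Optional.empty();
  }

  // write(): build the metadata stored with the commit.
  static Map<String, String> commitMetadata(String cfgCheckpoint, String updatedCheckpoint) {
    Map<String, String> metadata = new HashMap<>();
    metadata.put(CHECKPOINT_KEY, updatedCheckpoint);
    if (cfgCheckpoint != null) {
      // Marks the user-supplied checkpoint as already applied.
      metadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
    return metadata;
  }
}
```

With an empty previous commit, `resolve` returns the configured checkpoint. Once `commitMetadata` has been written for that batch, the same call returns the checkpoint recorded under `CHECKPOINT_KEY` instead.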
   
   With this PR, the only addition is a new checkpoint type. Let me propose a simple add-on to the above code that would work for us.
   
   Within DeltaSync.read():
   ```
   // set the right checkpoint value
   boolean resetCheckpointType = true; // New addition
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
      checkpoint = cfg.checkpoint;
      resetCheckpointType = false; // New addition
   } else if (commitMetadata.contains(Checkpoint_Key)) {
      checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
      checkpoint = Option.empty();
   }
   // New addition
   if (resetCheckpointType) {
     // reset checkpoint type if set
   }
   ```
   
   No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, and no Checkpoint_RESET_Key is present yet. From the second batch onward, it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata, yet we can still decide whether to use the checkpoint type.
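
The proposal can be sketched in Java roughly as follows. Treat this as an illustration under stated assumptions rather than the PR's implementation: `CheckpointTypeSketch`, the `Resolution` holder, the key string values, and passing the configured type in as a plain string are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: the checkpoint type is honored only while the
// user-supplied checkpoint itself is being honored.
final class CheckpointTypeSketch {
  // Placeholder key names for illustration only.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

  // Simple holder for the resolved checkpoint and the effective type.
  static final class Resolution {
    final Optional<String> checkpoint;
    final Optional<String> checkpointType;

    Resolution(Optional<String> checkpoint, Optional<String> checkpointType) {
      this.checkpoint = checkpoint;
      this.checkpointType = checkpointType;
    }
  }

  static Resolution resolve(String cfgCheckpoint, String configuredType,
                            Map<String, String> lastCommitMetadata) {
    boolean resetCheckpointType = true;
    Optional<String> checkpoint;
    if (cfgCheckpoint != null && !lastCommitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      checkpoint = Optional.of(cfgCheckpoint);
      resetCheckpointType = false; // first batch: keep the configured type
    } else if (lastCommitMetadata.containsKey(CHECKPOINT_KEY)) {
      checkpoint = Optional.of(lastCommitMetadata.get(CHECKPOINT_KEY));
    } else {
      checkpoint = Optional.empty();
    }
    // Later batches ignore the configured type and fall back to the default handling.
    Optional<String> type = resetCheckpointType
        ? Optional.<String>empty()
        : Optional.ofNullable(configuredType);
    return new Resolution(checkpoint, type);
  }
}
```

On the first batch (no reset key in the metadata) the configured type such as `"timestamp"` is returned alongside cfg.checkpoint. On later batches the type comes back empty and the checkpoint is taken from the commit metadata, so nothing extra needs to be persisted.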
   
   
   
   
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799512747


   no problem
   
   
   
   
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c705ce5) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **increase** coverage by `0.15%`.
   > The diff coverage is `89.09%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   + Coverage     47.72%   47.88%   +0.15%     
   - Complexity     5528     5580      +52     
   ============================================
     Files           934      936       +2     
     Lines         41457    41665     +208     
     Branches       4166     4193      +27     
   ============================================
   + Hits          19786    19950     +164     
   - Misses        19914    19947      +33     
   - Partials       1757     1768      +11     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.51% <ø> (+0.05%)` | :arrow_up: |
   | hudicommon | `48.67% <ø> (+0.11%)` | :arrow_up: |
   | hudiflink | `59.68% <ø> (-0.35%)` | :arrow_down: |
   | hudihadoopmr | `52.02% <ø> (+0.73%)` | :arrow_up: |
   | hudisparkdatasource | `67.59% <ø> (-0.07%)` | :arrow_down: |
   | hudisync | `55.97% <ø> (+1.46%)` | :arrow_up: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.77% <89.09%> (+0.50%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `88.13% <91.48%> (+0.45%)` | :arrow_up: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `94.64% <0.00%> (-5.36%)` | :arrow_down: |
   | [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `64.00% <0.00%> (-3.80%)` | :arrow_down: |
   | [.../hudi/common/util/collection/LazyFileIterable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvY29sbGVjdGlvbi9MYXp5RmlsZUl0ZXJhYmxlLmphdmE=) | `71.73% <0.00%> (-2.68%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
   | [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `74.77% <0.00%> (-0.46%)` | :arrow_down: |
   | ... and [38 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...c705ce5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
       "triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "triggerType" : "PUSH"
     }, {
       "hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
       "triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
       "triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
       "triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
       "triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
       "triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565) 
   * 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r657692451



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -461,7 +465,7 @@ public void refreshTimeline() throws IOException {
     if (!hasErrors || cfg.commitOnErrors) {
       HashMap<String, String> checkpointCommitMetadata = new HashMap<>();
       checkpointCommitMetadata.put(CHECKPOINT_KEY, checkpointStr);
-      if (cfg.checkpoint != null) {
+      if (cfg.checkpoint != null && !"timestamp".equals(props.getString("hoodie.deltastreamer.source.kafka.checkpoint.type"))) {

Review comment:
       Your understanding is correct. At the time I wanted to store the timestamp under CHECKPOINT_RESET_KEY, but since it seemed to serve no practical purpose, I dropped it. After hearing your thoughts, I agree that adding it is more appropriate.







[GitHub] [hudi] wangxianghu commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-786606533


   > I will add the unit test, and then please review
   
   ack, will review soon





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffd30f5) into [master](https://codecov.io/gh/apache/hudi/commit/0b57483a8e41742689a1362aa94aabb94a1361b3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0b57483) will **decrease** coverage by `6.60%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     45.76%   39.15%   -6.61%     
   + Complexity     5261     3482    -1779     
   ============================================
     Files           909      661     -248     
     Lines         39353    28070   -11283     
     Branches       4239     2817    -1422     
   ============================================
   - Hits          18010    10991    -7019     
   + Misses        19499    15978    -3521     
   + Partials       1844     1101     -743     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.95% <ø> (ø)` | |
   | hudiclient | `16.45% <ø> (-13.95%)` | :arrow_down: |
   | hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
   | hudiflink | `61.26% <ø> (+0.45%)` | :arrow_up: |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/sink/compact/CompactFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL2NvbXBhY3QvQ29tcGFjdEZ1bmN0aW9uLmphdmE=) | `86.66% <0.00%> (-13.34%)` | :arrow_down: |
   | [...e/hudi/sink/partitioner/profile/WriteProfiles.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlcy5qYXZh) | `50.00% <0.00%> (-5.89%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `55.00% <0.00%> (-1.42%)` | :arrow_down: |
   | [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `79.22% <0.00%> (-1.30%)` | :arrow_down: |
   | [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `84.34% <0.00%> (-0.66%)` | :arrow_down: |
   | [...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=) | `87.50% <0.00%> (-0.50%)` | :arrow_down: |
   | [...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh) | `96.37% <0.00%> (-0.05%)` | :arrow_down: |
   | [...va/org/apache/hudi/metadata/BaseTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvQmFzZVRhYmxlTWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [264 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [0b57483...ffd30f5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **increase** coverage by `0.13%`.
   > The diff coverage is `85.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   + Coverage     47.72%   47.85%   +0.13%     
   - Complexity     5528     5576      +48     
   ============================================
     Files           934      936       +2     
     Lines         41457    41665     +208     
     Branches       4166     4193      +27     
   ============================================
   + Hits          19786    19940     +154     
   - Misses        19914    19958      +44     
   - Partials       1757     1767      +10     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.51% <ø> (+0.05%)` | :arrow_up: |
   | hudicommon | `48.71% <ø> (+0.15%)` | :arrow_up: |
   | hudiflink | `59.68% <ø> (-0.35%)` | :arrow_down: |
   | hudihadoopmr | `52.02% <ø> (+0.73%)` | :arrow_up: |
   | hudisparkdatasource | `67.23% <ø> (-0.43%)` | :arrow_down: |
   | hudisync | `55.97% <ø> (+1.46%)` | :arrow_up: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
   | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `94.64% <0.00%> (-5.36%)` | :arrow_down: |
   | [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `64.00% <0.00%> (-3.80%)` | :arrow_down: |
   | [.../hudi/common/util/collection/LazyFileIterable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvY29sbGVjdGlvbi9MYXp5RmlsZUl0ZXJhYmxlLmphdmE=) | `71.73% <0.00%> (-2.68%)` | :arrow_down: |
   | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
   | [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `74.77% <0.00%> (-0.46%)` | :arrow_down: |
   | ... and [39 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] wangxianghu commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759873110


   @liujinhui1994 IMO, we can accept both offset and timestamp checkpoints via `--checkpoint`, and add a new param named checkpointType (defaulting to the offset type if not configured) to tell Hudi which checkpoint type the user is using. WDYT?
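
A minimal sketch of the "default to offset if not configured" behaviour suggested above. The enum `CheckpointType`, its values, and the property key `checkpointType` are hypothetical names chosen for illustration; they are assumptions and not taken from the Hudi code base.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical enum for illustration only; not Hudi's actual API. */
enum CheckpointType {
  OFFSET,
  TIMESTAMP;

  // Hypothetical property key, assumed only for this sketch.
  static final String KEY = "checkpointType";

  /**
   * Resolves the checkpoint type from the given properties.
   * Falls back to OFFSET when the key is absent or blank.
   * Throws IllegalArgumentException for an unrecognised value.
   */
  static CheckpointType fromProps(Properties props) {
    String raw = props.getProperty(KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return OFFSET;
    }
    return valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }
}
```

With this shape, existing jobs that never set the new parameter would keep the offset-based behaviour unchanged.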





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r557069306



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -165,6 +169,7 @@ public KafkaOffsetGen(TypedProperties props) {
     }
     DataSourceUtils.checkRequiredProperties(props, Collections.singletonList(Config.KAFKA_TOPIC_NAME));
     topicName = props.getString(Config.KAFKA_TOPIC_NAME);
+    kafkaCheckpointTimestamp = props.getString(Config.KAFKA_CHECKPOINT_TIMESTAMP);

Review comment:
       Yes, I will correct it.







[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604840433



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, FileSystem fs, Config
           "'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to ensure updates are not missed.");
 
       this.props = properties.get();
+      String kafkaCheckpointTimestamp = props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");

Review comment:
       Let me think more on this. I wonder whether we should rely only on the existing "HoodieDeltaStreamer.Config.checkpoint" and add another config named "checkpoint.type" or similar, which could be set to timestamp for this purpose.







[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847









[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-856862390


   @liujinhui1994 : here is what we can do.
   If someone is running it just once, this should not be an issue. The issue arises when someone runs deltastreamer in continuous mode.
   
   So, the user is expected to set HoodieDeltaStreamer.Config.checkpoint or an InitialCheckpointProvider.
   The user also sets the new config (hoodie.deltastreamer.source.kafka.checkpoint.type) to timestamp.
   
   KafkaOffsetGen should be capable of parsing the checkpoint as a timestamp.
   At the end of the write, DeltaSync should reset this (...kafka.checkpoint.type) config (similar to how we reset the checkpoint).
   So, for subsequent runs, this (...kafka.checkpoint.type) config value will not be set, and KafkaOffsetGen should parse the checkpoint and fetch from the source as a regular checkpoint.
   
   Let me know if the approach is clear and whether it makes sense.
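
To make the two parsing paths concrete, here is a rough sketch that treats the checkpoint string either as a regular offset checkpoint or as a timestamp. The class and method names are hypothetical. The offset format `topic,partition:offset,...` and the assumption that the timestamp is epoch milliseconds are assumptions of this sketch, not confirmed by the patch. With a real Kafka client, the parsed timestamp could then be resolved to per-partition offsets through `KafkaConsumer.offsetsForTimes`; that step is left out so the example stays dependency-free.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Hudi. */
final class CheckpointParseSketch {

  private CheckpointParseSketch() {
  }

  /** Returns true when the configured type asks for timestamp semantics. */
  static boolean isTimestampType(String checkpointType) {
    return "timestamp".equalsIgnoreCase(checkpointType);
  }

  /** Parses a timestamp checkpoint, assumed here to be epoch milliseconds. */
  static long parseTimestamp(String checkpoint) {
    return Long.parseLong(checkpoint.trim());
  }

  /** Parses "topic,partition:offset,partition:offset,..." into partition -> offset. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    // parts[0] is the topic name; the remaining entries are partition:offset pairs.
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      offsets.put(Integer.parseInt(pair[0].trim()), Long.parseLong(pair[1].trim()));
    }
    return offsets;
  }
}
```

On the first run the caller would check `isTimestampType` and take the timestamp path; once the type config has been reset, every later run goes through `parseOffsets` only.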
   
   
   
   
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-782589121


   I have verified this; please help review.
   @wangxianghu @yanghua @nsivabalan 
   





[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719


   Guess we can simplify things. Let me go over some pseudo code of interest. 
   
   Code before this patch. 
   
   within DeltaSync.read()
   ```
   // set the right checkpoint value
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
     checkpoint = Option.empty();
   }
   ```
   Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the `else if` condition deals with Checkpoint_Key.
   I have simplified some exception cases, but this should give you the gist.
   
   within write() 
   ```
   // towards the end
   commitMetadata.put(Checkpoint_Key, <updated checkpoint after writing>);
   if (cfg.checkpoint != null) {
     commitMetadata.add(Checkpoint_RESET_Key);
   }
   ```
   
   If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so for subsequent batches the checkpoint is parsed from commitMetadata.
   
   With this PR, the only addition is that we are introducing a new checkpoint type. Let me propose a simple add-on to the above code that would work for us.
   
   within DeltaSync.read()
   ```
   // set the right checkpoint value
   boolean resetCheckpointType = true; // New addition
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = cfg.checkpoint;
     resetCheckpointType = false; // New addition
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
     checkpoint = Option.empty();
   }
   // New addition
   if (resetCheckpointType) {
     // reset checkpoint type if set
   }
   ```
   
   No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, but no Checkpoint_RESET_Key is set yet. From the second batch onward it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata, and can still decide whether to use the checkpoint type or not.
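
The proposal above can be modelled in plain Java roughly as follows. This is a simplified sketch under stated assumptions, not the DeltaSync implementation: the class `CheckpointResolutionSketch`, the `Resolved` holder, the two key strings, and the use of a `Map` as commit metadata are all hypothetical stand-ins, and the exception cases mentioned earlier are ignored.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical, simplified model of the proposed flow; not Hudi code. */
final class CheckpointResolutionSketch {
  // Placeholder key names, assumed only for this sketch.
  static final String CHECKPOINT_KEY = "checkpoint";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset";

  private CheckpointResolutionSketch() {
  }

  /** Outcome of one read(): where to resume and whether the checkpoint type applies. */
  static final class Resolved {
    final Optional<String> checkpoint;
    final boolean useCheckpointType;

    Resolved(Optional<String> checkpoint, boolean useCheckpointType) {
      this.checkpoint = checkpoint;
      this.useCheckpointType = useCheckpointType;
    }
  }

  /** Mirrors the read() pseudocode: cfg.checkpoint wins only until the reset key exists. */
  static Resolved resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
    boolean resetCheckpointType = true;
    Optional<String> checkpoint;
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      checkpoint = Optional.of(cfgCheckpoint);
      resetCheckpointType = false;
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      checkpoint = Optional.ofNullable(commitMetadata.get(CHECKPOINT_KEY));
    } else {
      checkpoint = Optional.empty();
    }
    // The checkpoint type is honored only when it was not reset.
    return new Resolved(checkpoint, !resetCheckpointType);
  }

  /** Mirrors the end of write(): store the new checkpoint and mark cfg.checkpoint as consumed. */
  static void recordCommit(String cfgCheckpoint, String newCheckpoint,
                           Map<String, String> commitMetadata) {
    commitMetadata.put(CHECKPOINT_KEY, newCheckpoint);
    if (cfgCheckpoint != null) {
      commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
  }
}
```

Calling `resolve` before any commit returns the configured checkpoint with the type in effect; after `recordCommit`, the same call returns the stored checkpoint with the type ignored, which is the first-batch versus later-batch behaviour described in the comment.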
   
   
   
   
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `1.15%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.61%   46.46%   -1.16%     
   + Complexity     5487     5030     -457     
   ============================================
     Files           924      866      -58     
     Lines         41206    38565    -2641     
     Branches       4133     3837     -296     
   ============================================
   - Hits          19619    17918    -1701     
   + Misses        19844    19060     -784     
   + Partials       1743     1587     -156     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.20% <ø> (-0.38%)` | :arrow_down: |
   | hudicommon | `48.58% <ø> (+0.02%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.32% <ø> (-0.01%)` | :arrow_down: |
   | hudisync | `50.55% <ø> (-3.93%)` | :arrow_down: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
   | [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `67.74% <0.00%> (-3.54%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `39.57% <0.00%> (-3.31%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
   | [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
   | [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/table/HoodieTableSink.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZVNpbmsuamF2YQ==) | `10.52% <0.00%> (ø)` | |
   | [.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | ... and [100 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5022f1d) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `44.87%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2438       +/-   ##
   ============================================
   - Coverage     47.72%   2.85%   -44.88%     
   + Complexity     5528      85     -5443     
   ============================================
     Files           934     283      -651     
     Lines         41457   11751    -29706     
     Branches       4166     966     -3200     
   ============================================
   - Hits          19786     335    -19451     
   + Misses        19914   11390     -8524     
   + Partials       1757      26     -1731     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [771 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...5022f1d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393270



##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
     return upsert(WriteOperationType.UPSERT);
   }
 
-  public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+  public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {

Review comment:
       Okay, I'll add this class to this PR







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799571386


   A few high-level questions:
   1. Why don't we leverage DeltaStreamerConfig.checkpoint to pass in a timestamp for the Kafka source? Or do we expect the format of this config to be "topic_name,partition_num:offset,partition_num:offset,...", and hence need a new config for a timestamp-based checkpoint?
   2. If yes to (1), did we think about parsing the checkpoint config, determining whether it is in the above format or a timestamp, and then proceeding from there? Just trying to avoid introducing new configs if possible.
   3. Checkpointing in DeltaStreamer in general is getting too complicated. I definitely see a benefit in this patch, but is there a way we can abstract it out based on the source? The new config introduced as part of this PR is very specific to Kafka, so I am trying to see if we can keep it abstracted out from DeltaStreamer if possible.
   4. I see KafkaConsumer.offsetsForTimes() could return null for partitions with messages in the old format. What is the expected behavior for such partitions? Do we resume from the earliest offset?
   
   @n3nash @vinothchandar: open to hearing your thoughts, if any. One of my suggestions above could potentially add APIs to Source, hence CCing you.
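
Question 2 above (telling an offset-style checkpoint apart from a bare timestamp) could be approached roughly as follows. This is only a sketch under assumptions: the class name `CheckpointFormatSketch`, the `Kind` enum, and the `classify` method are hypothetical and not part of Hudi, and the topic-name character set is assumed to be letters, digits, `.`, `_` and `-`.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not Hudi code: classifies a checkpoint string by its shape.
final class CheckpointFormatSketch {

  enum Kind { OFFSETS, TIMESTAMP, UNKNOWN }

  // "topic_name,partition_num:offset,partition_num:offset,..." with at least one partition entry.
  private static final Pattern OFFSETS = Pattern.compile("[a-zA-Z0-9._-]+(,\\d+:\\d+)+");

  // A bare run of digits is treated as a timestamp (assumption: epoch milliseconds).
  private static final Pattern TIMESTAMP = Pattern.compile("\\d+");

  static Kind classify(String checkpoint) {
    if (checkpoint == null) {
      return Kind.UNKNOWN;
    }
    String trimmed = checkpoint.trim();
    if (OFFSETS.matcher(trimmed).matches()) {
      return Kind.OFFSETS;
    }
    if (TIMESTAMP.matcher(trimmed).matches()) {
      return Kind.TIMESTAMP;
    }
    return Kind.UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(classify("topic_name,0:123,1:456"));
    System.out.println(classify("1610545954000"));
  }
}
```

Shape-based detection avoids a new config, but it relies on the two formats never overlapping; an explicit type config, as discussed elsewhere in this thread, trades an extra setting for less ambiguity.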





[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799511127


   Hey folks, may I know the status of this PR? I see this could benefit others in the community as well. Do you think we can take it across the finish line by this weekend, so that we have it for the upcoming release?





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b77b639) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `2.00%`.
   > The diff coverage is `85.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.72%   45.71%   -2.01%     
   + Complexity     5528     4738     -790     
   ============================================
     Files           934      832     -102     
     Lines         41457    38264    -3193     
     Branches       4166     3832     -334     
   ============================================
   - Hits          19786    17493    -2293     
   + Misses        19914    19150     -764     
   + Partials       1757     1621     -136     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `22.27% <ø> (-12.18%)` | :arrow_down: |
   | hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.55% <ø> (+0.26%)` | :arrow_up: |
   | hudisparkdatasource | `67.32% <ø> (-0.34%)` | :arrow_down: |
   | hudisync | `54.51% <ø> (ø)` | |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `29.45% <0.00%> (-13.36%)` | :arrow_down: |
   | [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
   | [...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==) | `80.98% <0.00%> (-0.16%)` | :arrow_down: |
   | [...va/org/apache/hudi/client/SparkRDDWriteClient.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L1NwYXJrUkREV3JpdGVDbGllbnQuamF2YQ==) | | |
   | ... and [104 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...b77b639](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373) 
   * ffd30f564c780a25ddccf8c5bc819d4eed9b437a UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (739a252) into [master](https://codecov.io/gh/apache/hudi/commit/7fed7352bd506e20e5316bb0b3ed9e5c1e9c76df?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7fed735) will **decrease** coverage by `1.31%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     54.96%   53.65%   -1.32%     
   + Complexity     3844     3253     -591     
   ============================================
     Files           485      407      -78     
     Lines         23437    19762    -3675     
     Branches       2494     2085     -409     
   ============================================
   - Hits          12882    10603    -2279     
   + Misses         9401     8221    -1180     
   + Partials       1154      938     -216     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.29% <ø> (ø)` | |
   | hudiflink | `63.41% <ø> (ø)` | |
   | hudihadoopmr | `51.54% <ø> (ø)` | |
   | hudisparkdatasource | `73.33% <ø> (ø)` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...java/org/apache/hudi/utilities/sources/Source.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvU291cmNlLmphdmE=) | | |
   | [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | | |
   | [.../apache/hudi/timeline/service/TimelineService.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvVGltZWxpbmVTZXJ2aWNlLmphdmE=) | | |
   | [...di/utilities/sources/helpers/IncrSourceHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9JbmNyU291cmNlSGVscGVyLmphdmE=) | | |
   | [.../apache/hudi/hive/MultiPartKeysValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTXVsdGlQYXJ0S2V5c1ZhbHVlRXh0cmFjdG9yLmphdmE=) | | |
   | [...ities/checkpointing/InitialCheckPointProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrUG9pbnRQcm92aWRlci5qYXZh) | | |
   | [...udi/timeline/service/handlers/TimelineHandler.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvVGltZWxpbmVIYW5kbGVyLmphdmE=) | | |
   | [...alCheckpointFromAnotherHoodieTimelineProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrcG9pbnRGcm9tQW5vdGhlckhvb2RpZVRpbWVsaW5lUHJvdmlkZXIuamF2YQ==) | | |
   | [...java/org/apache/hudi/hive/util/HiveSchemaUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvdXRpbC9IaXZlU2NoZW1hVXRpbC5qYXZh) | | |
   | ... and [65 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   





[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830456199



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * This method returns the checkpoint format based on the timestamp.
+   * example:
+   * 1. input: timestamp, etc.
+   * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+   *
+   * @param consumer
+   * @param topicName
+   * @param timestamp
+   * @return
+   */
+  private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+                                               String topicName, Long timestamp) {
+
+    Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+                                                    .map(x -> new TopicPartition(x.topic(), x.partition()))
+                                                    .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+    Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+    Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+    StringBuilder sb = new StringBuilder();
+    sb.append(topicName + ",");
+    for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+      if (map.getValue() != null) {
+        sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+      } else {
+        sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");

Review comment:
       @liujinhui1994 @nsivabalan Can you help me understand why we are adding this value from earliestOffsets here? From what I understand, the whole point of consuming from a specified timestamp is that we do not want to consume records whose timestamp is earlier than the specified timestamp. Let us take an example of topic A with 3 partitions: 0, 1 and 2. The offsets are as below:
   partition 0 - 100 (ts-210), 101 (ts-220), 102 (ts-230), 103 (ts-240) .....
   partition 1 - 50 (ts-200), 51 (ts-205), 52 (ts-225) ....
   partition 2 - 51 (ts-100), 60 (ts-150) (only 2 records present in this)
   
   Now suppose the timestamp is passed as 220. The expected results from the consumer API will be:
   partition 0 -> 101
   partition 1 -> 52
   partition 2 -> null
   
   As per the code, we return:
   partition 0 -> 101
   partition 1 -> 52
   partition 2 -> 51
   
   I want to understand why we are populating this value for partition 2. If all the offsets in partition 2 have timestamps less than 220, this implies these offsets have either already been consumed or the records are not needed at all for ingestion into the Hudi table. Ideally, no offset should be returned from this method for partition 2.
   
   Even if this functionality is added only for a one-time initial bootstrap, consuming the records from partition 2 above still does not make sense. Please let me know the thought process behind this logic.
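
The example above can be reproduced without a broker. The sketch below is a plain-Java simulation, not Kafka or Hudi code: `TimestampLookupSketch` and its methods are hypothetical names, `offsetForTime` imitates the documented contract of `KafkaConsumer.offsetsForTimes` for a single partition (earliest offset whose timestamp is greater than or equal to the target, else null), and `withEarliestFallback` imitates the fallback in the diff under review.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of the per-partition lookup discussed in the review comment.
final class TimestampLookupSketch {

  // Builds one partition's log as offset -> timestamp from (offset, timestamp) pairs.
  static TreeMap<Long, Long> partition(long... offsetTimestampPairs) {
    TreeMap<Long, Long> log = new TreeMap<>();
    for (int i = 0; i < offsetTimestampPairs.length; i += 2) {
      log.put(offsetTimestampPairs[i], offsetTimestampPairs[i + 1]);
    }
    return log;
  }

  // Earliest offset whose timestamp is >= target, or null when no such record exists.
  static Long offsetForTime(TreeMap<Long, Long> log, long target) {
    for (Map.Entry<Long, Long> entry : log.entrySet()) {
      if (entry.getValue() >= target) {
        return entry.getKey();
      }
    }
    return null;
  }

  // Behavior questioned in the review: fall back to the partition's earliest offset on null.
  static long withEarliestFallback(TreeMap<Long, Long> log, long target) {
    Long found = offsetForTime(log, target);
    return found != null ? found : log.firstKey();
  }

  public static void main(String[] args) {
    Map<Integer, TreeMap<Long, Long>> topicA = new LinkedHashMap<>();
    topicA.put(0, partition(100, 210, 101, 220, 102, 230, 103, 240));
    topicA.put(1, partition(50, 200, 51, 205, 52, 225));
    topicA.put(2, partition(51, 100, 60, 150));

    for (Map.Entry<Integer, TreeMap<Long, Long>> p : topicA.entrySet()) {
      System.out.println("partition " + p.getKey()
          + " lookup=" + offsetForTime(p.getValue(), 220L)
          + " withFallback=" + withEarliestFallback(p.getValue(), 220L));
    }
  }
}
```

With the fallback, partition 2 restarts from offset 51 and re-reads records older than the requested timestamp, which is the behavior the comment is questioning.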







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-856862390


   @liujinhui1994: here is what we can do.
   If someone is running it just once, this should not be an issue. The issue arises when someone runs DeltaStreamer continuously.
   
   So, the user is expected to set HoodieDeltaStreamer.Config.checkpoint or an InitialCheckPointProvider.
   The user also sets the new config (hoodie.deltastreamer.source.kafka.checkpoint.type) to timestamp.
   
   KafkaOffsetGen should be capable of parsing the checkpoint as a timestamp.
   At the end of the write, DeltaSync should reset this config (similar to how we reset the checkpoint).
   So, for subsequent runs, this config value will not be set, and KafkaOffsetGen should parse the checkpoint and fetch from the source as a regular checkpoint.
   
   Let me know if you can understand the approach, and if it makes sense.
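
A rough sketch of the flow proposed above, assuming nothing about the final implementation: the config key is the one named in the comment, while `CheckpointTypeResetSketch`, `resolveCheckpoint`, `afterSuccessfulWrite`, and the `timestampToOffsets` callback are hypothetical stand-ins for what KafkaOffsetGen and DeltaSync would do.

```java
import java.util.Properties;
import java.util.function.LongFunction;

// Hypothetical illustration of "parse as timestamp once, then reset the config".
final class CheckpointTypeResetSketch {

  static final String CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  // When the type is "timestamp", translate the checkpoint into the regular
  // "topic,partition:offset,..." form; otherwise use the checkpoint as is.
  static String resolveCheckpoint(Properties props, String checkpoint,
                                  LongFunction<String> timestampToOffsets) {
    if ("timestamp".equals(props.getProperty(CHECKPOINT_TYPE_KEY))) {
      return timestampToOffsets.apply(Long.parseLong(checkpoint));
    }
    return checkpoint;
  }

  // Called at the end of a write so later rounds treat checkpoints as regular offsets.
  static void afterSuccessfulWrite(Properties props) {
    props.remove(CHECKPOINT_TYPE_KEY);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(CHECKPOINT_TYPE_KEY, "timestamp");

    // Stub lookup; a real one would query the Kafka consumer for offsets by time.
    LongFunction<String> lookup = ts -> "topicA,0:101,1:52";

    System.out.println(resolveCheckpoint(props, "220", lookup));
    afterSuccessfulWrite(props);
    System.out.println(resolveCheckpoint(props, "topicA,0:150,1:80", lookup));
  }
}
```

Resetting the property after the first write matters mostly for continuous runs, where the same in-memory configuration is reused across sync rounds.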
   
   
   
   
   





[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564


   Good point.
   Tell me if my understanding of how timestamp-based checkpointing will be used is right.
   The user would like to use timestamp-based checkpointing in DeltaStreamer only for the bootstrap case,
   and from then on, checkpointing will use the regular Kafka checkpoint format of "topicName,0:123,1:456".
   
   If my understanding (stated above) is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second run onward we fall back to the regular checkpoint parsing mechanism.
   
   I see we have InitialCheckPointProvider. Let me think about how to go about this and get back to you. For now, this is what I can think of:
   InitialCheckPointProvider will expose a getCheckpointType() method,
   and we add it as a property to the configs if an InitialCheckPointProvider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132).
   Within readFromSource() in DeltaSync, if the checkpoint is fetched from commit metadata, we may not honor this checkpoint type, or we will clear the checkpoint type property if it is set.
   But if it is fetched from cfg.checkpoint, we will leave the property as is and let KafkaOffsetGen handle the checkpoint parsing.
   
   Let me think through this more. In the meantime, it would be great if you could confirm my understanding of the usage of timestamp-based checkpointing.
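
The decision described above, where the checkpoint's origin determines whether the checkpoint type is honored, might look like the following. Everything here is an assumption for illustration: `CheckpointSourceSketch`, `pickCheckpoint`, and the property key `example.checkpoint.type` are hypothetical, and the real DeltaSync logic is not shown in this thread.

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical illustration: honor the checkpoint type only for user-supplied checkpoints.
final class CheckpointSourceSketch {

  // Placeholder key for illustration only; not a real Hudi config name.
  static final String CHECKPOINT_TYPE_KEY = "example.checkpoint.type";

  static String pickCheckpoint(Optional<String> fromCommitMetadata,
                               String fromConfig,
                               Properties props) {
    if (fromCommitMetadata.isPresent()) {
      // A checkpoint stored by a previous commit is already in the regular
      // format, so the type hint is cleared before the source sees it.
      props.remove(CHECKPOINT_TYPE_KEY);
      return fromCommitMetadata.get();
    }
    // A checkpoint taken from the user's config keeps the type property,
    // leaving the parsing decision to the source.
    return fromConfig;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(CHECKPOINT_TYPE_KEY, "timestamp");

    System.out.println(pickCheckpoint(Optional.empty(), "220", props));
    System.out.println(props.getProperty(CHECKPOINT_TYPE_KEY));

    System.out.println(pickCheckpoint(Optional.of("topicName,0:123,1:456"), "220", props));
    System.out.println(props.getProperty(CHECKPOINT_TYPE_KEY));
  }
}
```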
   
   
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565) 
   * 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838) 
   * 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839) 
   






[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffd30f5) into [master](https://codecov.io/gh/apache/hudi/commit/0b57483a8e41742689a1362aa94aabb94a1361b3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0b57483) will **decrease** coverage by `0.75%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     45.76%   45.00%   -0.76%     
   + Complexity     5261     4858     -403     
   ============================================
     Files           909      849      -60     
     Lines         39353    36710    -2643     
     Branches       4239     3955     -284     
   ============================================
   - Hits          18010    16522    -1488     
   + Misses        19499    18474    -1025     
   + Partials       1844     1714     -130     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.95% <ø> (ø)` | |
   | hudiclient | `30.45% <ø> (+0.06%)` | :arrow_up: |
   | hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
   | hudiflink | `61.26% <ø> (+0.45%)` | :arrow_up: |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.00% <ø> (+0.52%)` | :arrow_up: |
   | hudisync | `47.11% <ø> (-4.35%)` | :arrow_down: |
   | huditimelineservice | `64.36% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [.../org/apache/hudi/sink/compact/CompactFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL2NvbXBhY3QvQ29tcGFjdEZ1bmN0aW9uLmphdmE=) | `86.66% <0.00%> (-13.34%)` | :arrow_down: |
   | [...e/hudi/sink/partitioner/profile/WriteProfiles.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlcy5qYXZh) | `50.00% <0.00%> (-5.89%)` | :arrow_down: |
   | [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `75.22% <0.00%> (-2.23%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `55.00% <0.00%> (-1.42%)` | :arrow_down: |
   | [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `79.22% <0.00%> (-1.30%)` | :arrow_down: |
   | [...metadata/SparkHoodieBackedTableMetadataWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvU3BhcmtIb29kaWVCYWNrZWRUYWJsZU1ldGFkYXRhV3JpdGVyLmphdmE=) | `72.36% <0.00%> (-0.88%)` | :arrow_down: |
   | [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `84.34% <0.00%> (-0.66%)` | :arrow_down: |
   | [...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=) | `87.50% <0.00%> (-0.50%)` | :arrow_down: |
   | ... and [94 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [0b57483...ffd30f5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373) 
   



[GitHub] [hudi] liujinhui1994 closed pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 closed pull request #2438:
URL: https://github.com/apache/hudi/pull/2438


   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 00d29d85f32f376ef44cb99d49f605a4af6f798c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269) 
   * ea5ed9da433064022a69e06c98f58fc10c09e8b6 UNKNOWN
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775) 
   * a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638) 
   



[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671607330



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
       Set<TopicPartition> topicPartitions = partitionInfoList.stream()
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
+      if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+        lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+      }

Review comment:
       ok, I get it now. makes sense. 







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839) 
   * 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719


   I think we can simplify things. Let me walk through the relevant pseudocode.
   
   Within DeltaSync.read():
   ```
   // set the right checkpoint value
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = Option.of(cfg.checkpoint);
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = Option.of(commitMetadata.get(Checkpoint_Key));
   } else {
     checkpoint = Option.empty();
   }
   ```
   Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the `else if` condition deals with Checkpoint_Key.
   
   Within write():
   ```
   // towards the end
   commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // checkpoint after writing
   if (cfg.checkpoint != null) {
     commitMetadata.add(Checkpoint_RESET_Key);
   }
   ```
   
   If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is read from commitMetadata.
   
   The only addition in this PR is a new checkpoint type. Let me propose a simple add-on to the code above that would work for us.
   
   Within DeltaSync.read():
   ```
   // set the right checkpoint value
   boolean resetCheckpointType = true;
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = Option.of(cfg.checkpoint);
     resetCheckpointType = false;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = Option.of(commitMetadata.get(Checkpoint_Key));
   } else {
     checkpoint = Option.empty();
   }
   if (resetCheckpointType) {
     // reset the checkpoint type if it is set
   }
   ```
   
   No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, and Checkpoint_RESET_Key is not yet present. From the second batch onward, it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata, yet we can still decide whether to use the checkpoint type.
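
The proposal above can be sketched as a small, self-contained Java model. This is only an illustration under stated assumptions: the commit metadata is modeled as a plain `Map`, the key strings and the `Resolved` holder are hypothetical, and storing `cfg.checkpoint` as the value of the reset key is an assumption of this sketch. It is not the actual DeltaSync code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the proposal; names and key strings are illustrative.
class CheckpointResolutionSketch {
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

  // Checkpoint to resume from, plus whether the configured checkpoint type applies.
  static final class Resolved {
    final Optional<String> checkpoint;
    final boolean useConfiguredType;

    Resolved(Optional<String> checkpoint, boolean useConfiguredType) {
      this.checkpoint = checkpoint;
      this.useConfiguredType = useConfiguredType;
    }
  }

  // Mirrors the read() pseudocode: the configured checkpoint wins only until
  // the reset key has been committed; otherwise the committed checkpoint is used.
  static Resolved resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      return new Resolved(Optional.of(cfgCheckpoint), true);
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      return new Resolved(Optional.of(commitMetadata.get(CHECKPOINT_KEY)), false);
    }
    return new Resolved(Optional.empty(), false);
  }

  // Mirrors the write() pseudocode: always record the new checkpoint, and
  // record the reset key once a configured checkpoint has been consumed.
  static void recordCommit(String cfgCheckpoint, String updatedCheckpoint,
                           Map<String, String> commitMetadata) {
    commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
    if (cfgCheckpoint != null) {
      commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
  }

  public static void main(String[] args) {
    Map<String, String> metadata = new HashMap<>();
    Resolved first = resolve("1621947081", metadata);
    System.out.println(first.checkpoint.get() + " type honored: " + first.useConfiguredType);
    recordCommit("1621947081", "hudi_topic,0:100,1:101", metadata);
    Resolved second = resolve("1621947081", metadata);
    System.out.println(second.checkpoint.get() + " type honored: " + second.useConfiguredType);
  }
}
```

In this model the first batch resumes from the configured timestamp with the type honored, and the second batch resumes from the committed offset string with the type ignored, without the type ever being stored in the commit metadata.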
   
   
   
   
   








[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856) 
   * 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] wangxianghu commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-787616261


   > I will add the unit test, and then please review
   
   Hi @liujinhui1994, sorry for the delay.
   Can we keep all these changes in `KafkaOffsetGen`? That seems more elegant.





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r642183230



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
     @Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
     public String checkpoint = null;
 
+    /**
+     * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+     * 2. timestamp: kafka offset timestamp
+     * example
+     * 1. hudi_topic,0:100,1:101,2:201
+     * 2. 1621947081
+     */
+    @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+    public String checkpointType = "string";

Review comment:
       I have not finished working on this PR yet, so it is currently a work in progress. I will ping you when it is done.
   @nsivabalan 
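
To make the two checkpoint formats in the Javadoc above concrete, here is a minimal sketch of how they could be told apart and parsed. The class and method names are hypothetical and are not taken from the PR. Treating an all-digit value as a timestamp is an assumption made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; not part of HoodieDeltaStreamer or KafkaOffsetGen.
class StringCheckpointSketch {

  // A timestamp checkpoint is assumed to be a bare run of digits, e.g. "1621947081".
  static boolean looksLikeTimestamp(String checkpoint) {
    return checkpoint.matches("\\d+");
  }

  // The first comma-separated field of a string checkpoint is the topic name.
  static String topicOf(String checkpoint) {
    return checkpoint.split(",")[0];
  }

  // Parses "topic,0:100,1:101" into {0=100, 1=101}, ordered by partition.
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new TreeMap<>();
    for (int i = 1; i < parts.length; i++) {
      String[] partitionAndOffset = parts[i].split(":");
      offsets.put(Integer.parseInt(partitionAndOffset[0]),
          Long.parseLong(partitionAndOffset[1]));
    }
    return offsets;
  }

  public static void main(String[] args) {
    String checkpoint = "hudi_topic,0:100,1:101,2:201";
    System.out.println(topicOf(checkpoint) + " -> " + parseOffsets(checkpoint));
    System.out.println(looksLikeTimestamp("1621947081"));
  }
}
```

A stricter version would validate each `partition:offset` pair and reject malformed input instead of letting `NumberFormatException` or `ArrayIndexOutOfBoundsException` propagate.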







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b77b639) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `0.27%`.
   > The diff coverage is `85.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.72%   47.44%   -0.28%     
   - Complexity     5528     5536       +8     
   ============================================
     Files           934      934              
     Lines         41457    41768     +311     
     Branches       4166     4187      +21     
   ============================================
   + Hits          19786    19818      +32     
   - Misses        19914    20189     +275     
   - Partials       1757     1761       +4     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `33.75% <ø> (-0.70%)` | :arrow_down: |
   | hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.55% <ø> (+0.26%)` | :arrow_up: |
   | hudisparkdatasource | `67.32% <ø> (-0.34%)` | :arrow_down: |
   | hudisync | `54.51% <ø> (ø)` | |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `29.45% <0.00%> (-13.36%)` | :arrow_down: |
   | [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
   | [...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==) | `80.98% <0.00%> (-0.16%)` | :arrow_down: |
   | ... and [2 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...b77b639](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-844060742


   @liujinhui1994: were you able to make progress on this? It would be nice to have this in before the next release.





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b77b639) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `0.27%`.
   > The diff coverage is `85.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.72%   47.44%   -0.28%     
   - Complexity     5528     5536       +8     
   ============================================
     Files           934      934              
     Lines         41457    41768     +311     
     Branches       4166     4187      +21     
   ============================================
   + Hits          19786    19818      +32     
   - Misses        19914    20189     +275     
   - Partials       1757     1761       +4     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `33.75% <ø> (-0.70%)` | :arrow_down: |
   | hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.55% <ø> (+0.26%)` | :arrow_up: |
   | hudisparkdatasource | `67.32% <ø> (-0.34%)` | :arrow_down: |
   | hudisync | `54.51% <ø> (ø)` | |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `29.45% <0.00%> (-13.36%)` | :arrow_down: |
   | [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
   | [...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==) | `80.98% <0.00%> (-0.16%)` | :arrow_down: |
   | ... and [2 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...b77b639](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671380922



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
       Set<TopicPartition> topicPartitions = partitionInfoList.stream()
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
+      if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+        lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+      }

Review comment:
       When the timestamp checkpoint type is not used, lastCheckpointStr has the format topic_name,partition_num:offset,partition_num:offset.
   
   When getOffsetsByTimestamp is used, we convert a lastCheckpointStr that holds a timestamp into that same topic_name,partition_num:offset,partition_num:offset format.
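
A minimal sketch of the two checkpoint shapes described above, for illustration only. The class and method names are hypothetical and are not part of Hudi; the assumption that a purely numeric string means "timestamp" is also mine, since the PR's own isValidCheckpointType is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not Hudi code: reads "topic,partition:offset,..." checkpoints.
class CheckpointFormatSketch {

  // Returns partition -> offset, in the order listed in the checkpoint string.
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    // parts[0] is the topic name; every later element is a partition:offset pair.
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      offsets.put(Integer.parseInt(pair[0]), Long.parseLong(pair[1]));
    }
    return offsets;
  }

  // The topic name is everything before the first comma.
  static String topicOf(String checkpoint) {
    return checkpoint.split(",")[0];
  }

  // Assumption: a string made only of digits is a timestamp, not an offset checkpoint.
  static boolean looksLikeTimestamp(String checkpoint) {
    return checkpoint.matches("\\d+");
  }

  public static void main(String[] args) {
    String checkpoint = "hudi_topic,0:100,1:101,2:201";
    System.out.println(topicOf(checkpoint) + " -> " + parseOffsets(checkpoint));
    System.out.println(looksLikeTimestamp("1621947081"));
  }
}
```

The example values are the ones quoted in the `--checkpoint-type` Javadoc in this PR.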







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564


   Good point. 
   Tell me if my general understanding of how timestamp-based checkpointing will be used is right. 
   Users would use timestamp-based checkpointing in DeltaStreamer only for the bootstrap case. 
   From then on, checkpointing uses the regular Kafka checkpoint format of "topicName,0:123,1:456". 
   
   If that understanding is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second run onward we fall back to the regular checkpoint parsing mechanism. 
   
   I see we have InitialCheckPointProvider. Let me think about how to go about this and get back to you. For now, this is what I can think of: 
   InitialCheckPointProvider will expose a getCheckpointType() method, 
   and we add it as a property to the configs if an initial checkpoint provider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132). 
   Within readFromSource in DeltaSync, if the checkpoint is fetched from commit metadata, we may not honor this checkpoint type, or we clear the checkpoint type property if it is set. 
   But if it is fetched from cfg.checkpoint, we leave the property as is and let KafkaOffsetGen handle checkpoint parsing. 
   
   Let me think this through some more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used. 
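
The rule proposed above could be sketched roughly as follows. This is only an illustration of the idea under discussion, not the change that was merged: the class, method, and constant names are hypothetical, and the two type values ("string" and "timestamp") are taken from the `--checkpoint-type` description in this PR.

```java
import java.util.Optional;

// Hypothetical sketch, not Hudi code: decides how the Kafka source should parse a checkpoint.
class CheckpointTypeResolutionSketch {

  static final String TYPE_STRING = "string";       // regular "topicName,0:123,1:456" form
  static final String TYPE_TIMESTAMP = "timestamp"; // one-time bootstrap form

  // A checkpoint read back from commit metadata is always in the regular format,
  // so the configured type is honored only when no committed checkpoint exists yet.
  static String effectiveType(Optional<String> checkpointFromCommitMetadata, String configuredType) {
    if (checkpointFromCommitMetadata.isPresent()) {
      return TYPE_STRING;
    }
    return configuredType;
  }

  public static void main(String[] args) {
    // First run (bootstrap): nothing committed yet, so the timestamp type applies.
    System.out.println(effectiveType(Optional.empty(), TYPE_TIMESTAMP));
    // Later runs: the committed checkpoint wins and is parsed as a regular offset string.
    System.out.println(effectiveType(Optional.of("topicName,0:123,1:456"), TYPE_TIMESTAMP));
  }
}
```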
   
   
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839) 
   





[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830456199



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * This method returns the checkpoint format based on the timestamp.
+   * example:
+   * 1. input: timestamp, etc.
+   * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+   *
+   * @param consumer
+   * @param topicName
+   * @param timestamp
+   * @return
+   */
+  private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+                                               String topicName, Long timestamp) {
+
+    Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+                                                    .map(x -> new TopicPartition(x.topic(), x.partition()))
+                                                    .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+    Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+    Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+    StringBuilder sb = new StringBuilder();
+    sb.append(topicName + ",");
+    for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+      if (map.getValue() != null) {
+        sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+      } else {
+        sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");

Review comment:
       @liujinhui1994 @nsivabalan Can you help me understand why we are adding this value from earliestOffsets here? From what I understand, the whole point of consuming from a specified timestamp is that we do not want to consume records whose timestamp is earlier than the specified timestamp. Take an example of topic A with 3 partitions 0, 1, 2. The offsets are as below: 
   partition 0 - 100 (ts 210), 101 (ts 220), 102 (ts 230), 103 (ts 240) ...
   partition 1 - 50 (ts 200), 51 (ts 205), 52 (ts 225) ...
   partition 2 - 51 (ts 100), 60 (ts 150) (only 2 records present in this partition)
   
   Now suppose the timestamp passed is 220. The expected results from the consumer API are: 
   partition 0 -> 101
   partition 1 -> 52
   partition 2 -> null
   
   As per the code, we return: 
   partition 0 -> 101
   partition 1 -> 52
   partition 2 -> 51 (earliest offset)
   
   I want to understand why we are populating this value for partition 2. If all the offsets in partition 2 have timestamps earlier than 220, then these records have either already been consumed or are not needed at all for ingestion into the Hudi table. Ideally, this method should return no offset for partition 2. 
   
   Even if this functionality is added only for a one-time initial bootstrap, consuming the records from partition 2 above still does not make sense. Please let me know the thought process behind this logic. 
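
The example above can be reproduced without a broker. The sketch below is a simplified in-memory model, not the PR's code: each partition is a map of offset to timestamp, and the lookup follows the documented contract of KafkaConsumer#offsetsForTimes (earliest offset whose timestamp is greater than or equal to the target, or null if there is none). All class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model of the three-partition example; not Hudi or Kafka code.
class TimestampLookupSketch {

  // Builds an offset -> timestamp log from (offset, timestamp) pairs.
  static TreeMap<Long, Long> log(long... offsetTimestampPairs) {
    TreeMap<Long, Long> records = new TreeMap<>();
    for (int i = 0; i < offsetTimestampPairs.length; i += 2) {
      records.put(offsetTimestampPairs[i], offsetTimestampPairs[i + 1]);
    }
    return records;
  }

  // Earliest offset whose timestamp is >= target, or null when no such record exists.
  static Long offsetForTime(TreeMap<Long, Long> records, long target) {
    for (Map.Entry<Long, Long> e : records.entrySet()) {
      if (e.getValue() >= target) {
        return e.getKey();
      }
    }
    return null;
  }

  // Builds "topic,partition:offset,...". When fallbackToEarliest is false,
  // partitions with no matching record are left out of the checkpoint.
  static String buildCheckpoint(String topic, Map<Integer, TreeMap<Long, Long>> partitions,
                                long target, boolean fallbackToEarliest) {
    StringBuilder sb = new StringBuilder(topic);
    for (Map.Entry<Integer, TreeMap<Long, Long>> p : partitions.entrySet()) {
      Long offset = offsetForTime(p.getValue(), target);
      if (offset == null && fallbackToEarliest) {
        offset = p.getValue().firstKey(); // what the code under review does
      }
      if (offset != null) {
        sb.append(',').append(p.getKey()).append(':').append(offset);
      }
    }
    return sb.toString();
  }

  // Topic A from the review comment.
  static Map<Integer, TreeMap<Long, Long>> example() {
    Map<Integer, TreeMap<Long, Long>> partitions = new LinkedHashMap<>();
    partitions.put(0, log(100, 210, 101, 220, 102, 230, 103, 240));
    partitions.put(1, log(50, 200, 51, 205, 52, 225));
    partitions.put(2, log(51, 100, 60, 150));
    return partitions;
  }

  public static void main(String[] args) {
    System.out.println(buildCheckpoint("A", example(), 220L, true));  // A,0:101,1:52,2:51
    System.out.println(buildCheckpoint("A", example(), 220L, false)); // A,0:101,1:52
  }
}
```

The two calls differ only in how partition 2 is handled, which is the behavior being questioned. Whether a partition can safely be left out of the checkpoint depends on how the rest of KafkaOffsetGen treats missing partitions, which this sketch does not model.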







[GitHub] [hudi] n3nash commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r642690647



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
     @Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
     public String checkpoint = null;
 
+    /**
+     * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+     * 2. timestamp: kafka offset timestamp
+     * example
+     * 1. hudi_topic,0:100,1:101,2:201
+     * 2. 1621947081
+     */
+    @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+    public String checkpointType = "string";

Review comment:
       @nsivabalan Do we need to introduce something explicit here? Can we instead introduce another property such as `hoodie.deltastreamer.source.kafka.checkpoint.type` and not expose this change as a top-level option? This checkpoint type seems very specific to a Kafka use case, and I would like to reduce confusion in the top-level configs for users of other source types. We should document this property so that folks can set it in the properties file.
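
A rough sketch of reading such a source-level property with a default, using plain java.util.Properties rather than Hudi's TypedProperties. The property key is the one named in the comment above and the default value "string" is the one in the diff; the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch, not Hudi code: resolves the Kafka checkpoint type from properties.
class KafkaCheckpointTypeConfigSketch {

  static final String CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";
  static final String DEFAULT_TYPE = "string";

  // Falls back to the regular offset-string type when the property is absent.
  static String checkpointType(Properties props) {
    return props.getProperty(CHECKPOINT_TYPE_KEY, DEFAULT_TYPE);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(checkpointType(props)); // property not set, so the default applies

    // Equivalent to adding the key to the properties file passed to the job.
    props.setProperty(CHECKPOINT_TYPE_KEY, "timestamp");
    System.out.println(checkpointType(props));
  }
}
```

Keeping the setting in the source's property namespace means users of non-Kafka sources never see it among the top-level options.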







[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r641925742



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
     @Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
     public String checkpoint = null;
 
+    /**
+     * 1. string: topicName,partition number 0:offset value,partition number 1:offset value

Review comment:
       This format is specific to Kafka. Let's call that out, since other sources could represent their checkpoints differently. 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
     @Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
     public String checkpoint = null;
 
+    /**
+     * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+     * 2. timestamp: kafka offset timestamp
+     * example
+     * 1. hudi_topic,0:100,1:101,2:201
+     * 2. 1621947081
+     */
+    @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+    public String checkpointType = "string";

Review comment:
       I am deciding between "string", "default", and "regular" as the default checkpoint type. @n3nash: any thoughts? We are looking to introduce a new config called checkpoint type, and it needs some default value. This patch adds a new checkpoint type, "timestamp", for the Kafka source. 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -38,6 +38,7 @@
 import org.apache.hudi.common.table.timeline.HoodieTimeline;
 import org.apache.hudi.common.util.Option;
 import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.common.util.StringUtils;

Review comment:
       Can we revert the unintended changes in this file? 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
     @Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
     public String checkpoint = null;
 
+    /**
+     * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+     * 2. timestamp: kafka offset timestamp
+     * example
+     * 1. hudi_topic,0:100,1:101,2:201
+     * 2. 1621947081
+     */
+    @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+    public String checkpointType = "string";

Review comment:
       Sorry, why do we have this config in two places? It is defined as a top-level config in HoodieDeltaStreamer.Config, but in KafkaOffsetGen I see you are accessing it as "hoodie.deltastreamer.source.kafka.checkpoint.type". Maybe we should rely on that config param and remove it from the top level, since this applies only to Kafka for now. 







[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `44.90%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2438       +/-   ##
   ============================================
   - Coverage     47.72%   2.82%   -44.91%     
   + Complexity     5528      85     -5443     
   ============================================
     Files           934     284      -650     
     Lines         41457   11869    -29588     
     Branches       4166     986     -3180     
   ============================================
   - Hits          19786     335    -19451     
   + Misses        19914   11508     -8406     
   + Partials       1757      26     -1731     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [778 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] yanghua commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-771572476


   CI is still failing; please check it again.





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633) 
   * a39570dfe0493bcd23edf911f6256e90d3b22907 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881278532


   @nsivabalan I have completed the changes you requested. Please take a look.
   Thank you very much for your help!





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5022f1d) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `3.74%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.72%   43.97%   -3.75%     
   + Complexity     5528     5119     -409     
   ============================================
     Files           934      934              
     Lines         41457    41498      +41     
     Branches       4166     4171       +5     
   ============================================
   - Hits          19786    18250    -1536     
   - Misses        19914    21625    +1711     
   + Partials       1757     1623     -134     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.45% <ø> (ø)` | |
   | hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.44% <ø> (-0.22%)` | :arrow_down: |
   | hudisync | `54.51% <ø> (ø)` | |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [45 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...5022f1d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-875975684


   Working on it...





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-811856655


   > Nishith and I discussed this. Here is our proposal.
   > Let's rely on Deltastreamer.Config.checkpoint to pass in any type of checkpoint.
   > We can add another config called "checkpoint.type", which could default to "string" for all default checkpoints. For the checkpoint this PR is interested in, we could set the value of this new config to "timestamp".
   > 
   > With this, it's up to each source to parse and interpret the checkpoint value, and DeltaSync does not need to deal with different checkpoint formats.
   > 
   > Having said this, DeltaSync readFromSource() should not have any changes in this diff.
   > KafkaOffsetGen should have the logic to parse different checkpoint values, based on two configs (deltastreamer.config.checkpoint and checkpoint.type).
   > 
   > This also keeps source-specific checkpointing logic within the source-specific class and does not leak it into DeltaSync, which should be agnostic to the different sources.
   > 
   > @liujinhui1994: Let me know what you think. Happy to chat more on this.
   
   Great, I will modify this PR based on this proposal.
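
The proposal quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the class `CheckpointTypeSketch`, the `OffsetLookup` interface, and the method names are hypothetical, the timestamp is assumed to be epoch milliseconds, and the `topic,partition:offset,...` output format is assumed to match what the Kafka source expects as an offset checkpoint.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names come from Hudi or Kafka.
class CheckpointTypeSketch {

  /** Stand-in for the Kafka lookup that maps a timestamp to one offset per partition. */
  interface OffsetLookup {
    Map<Integer, Long> offsetsForTimestamp(String topic, long epochMillis);
  }

  /**
   * Interprets the configured checkpoint according to its type.
   * "timestamp": resolve the value to per-partition offsets and format them.
   * anything else (default "string"): the value is already an offset string.
   */
  static String resolveCheckpoint(String checkpoint, String checkpointType,
                                  String topic, OffsetLookup lookup) {
    if ("timestamp".equals(checkpointType)) {
      long epochMillis = Long.parseLong(checkpoint.trim());
      // TreeMap gives a stable, partition-ordered output string.
      Map<Integer, Long> offsets = new TreeMap<>(lookup.offsetsForTimestamp(topic, epochMillis));
      String parts = offsets.entrySet().stream()
          .map(e -> e.getKey() + ":" + e.getValue())
          .collect(Collectors.joining(","));
      return topic + "," + parts;
    }
    return checkpoint;
  }

  public static void main(String[] args) {
    Map<Integer, Long> fakeOffsets = new HashMap<>();
    fakeOffsets.put(1, 40L);
    fakeOffsets.put(0, 25L);
    OffsetLookup fake = (topic, ts) -> fakeOffsets;

    // prints "events,0:25,1:40"
    System.out.println(resolveCheckpoint("1610545954000", "timestamp", "events", fake));
    // prints "events,0:10,1:12"
    System.out.println(resolveCheckpoint("events,0:10,1:12", "string", "events", fake));
  }
}
```

Keeping the dispatch inside the source-side helper means the caller only ever sees an offset string, which is the separation the proposal asks for between DeltaSync and the Kafka source.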





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775) 
   * a688be727d6d6beff51a3f347b9e596d982610b5 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-786452099


   I will add unit tests first; please review after that.





[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604841403



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * @param consumer
+   * @param topicName
+   * @param timestamp
+   * @return
+   */
+  private String getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, String topicName, Long timestamp) {

Review comment:
       Can we add tests for the new code? I don't see any.
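
A test for this path would mainly need to pin down the lookup semantics. Kafka's `KafkaConsumer.offsetsForTimes` returns, per partition, the earliest offset whose timestamp is greater than or equal to the requested one, and no entry value when no such record exists. The sketch below models that rule without a broker so the expected values are easy to see. It is a simplified illustration, not the PR's implementation: `OffsetsByTimestampSketch` and its method are hypothetical, each partition is represented as an in-memory timestamp-to-offset index, and falling back to the partition's end offset when nothing matches is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory model of a timestamp-to-offset lookup.
class OffsetsByTimestampSketch {

  /**
   * For each partition, picks the earliest offset whose record timestamp is
   * at or after targetMillis. If the partition has no such record, falls back
   * to the partition's end offset (an assumed policy, labeled as such).
   */
  static Map<Integer, Long> offsetsAtOrAfter(
      Map<Integer, NavigableMap<Long, Long>> timestampToOffsetByPartition,
      Map<Integer, Long> endOffsets,
      long targetMillis) {
    Map<Integer, Long> result = new TreeMap<>();
    for (Map.Entry<Integer, NavigableMap<Long, Long>> partition
        : timestampToOffsetByPartition.entrySet()) {
      // ceilingEntry returns the smallest key >= targetMillis, or null if none.
      Map.Entry<Long, Long> hit = partition.getValue().ceilingEntry(targetMillis);
      Long offset = (hit != null)
          ? hit.getValue()
          : endOffsets.getOrDefault(partition.getKey(), 0L);
      result.put(partition.getKey(), offset);
    }
    return result;
  }

  public static void main(String[] args) {
    NavigableMap<Long, Long> p0 = new TreeMap<>();
    p0.put(1000L, 0L);
    p0.put(2000L, 5L);
    p0.put(3000L, 9L);
    NavigableMap<Long, Long> p1 = new TreeMap<>();
    p1.put(1500L, 0L);

    Map<Integer, NavigableMap<Long, Long>> index = new HashMap<>();
    index.put(0, p0);
    index.put(1, p1);
    Map<Integer, Long> endOffsets = new HashMap<>();
    endOffsets.put(0, 12L);
    endOffsets.put(1, 3L);

    // Partition 0 has a record at exactly 2000 (offset 5);
    // partition 1 has nothing at or after 2000, so its end offset 3 is used.
    System.out.println(offsetsAtOrAfter(index, endOffsets, 2000L)); // prints {0=5, 1=3}
  }
}
```

The two partitions cover the cases a real test would most likely want: an exact timestamp match and a timestamp newer than every record in the partition.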







[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881203019


   > Let's try to land this by the weekend. It's been hanging for quite some time.
   
   OK.
   Sorry for the delay; I'll take care of it now.
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c705ce5) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `20.21%`.
   > The diff coverage is `89.09%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2438       +/-   ##
   =============================================
   - Coverage     47.72%   27.50%   -20.22%     
   + Complexity     5528     1302     -4226     
   =============================================
     Files           934      386      -548     
     Lines         41457    15377    -26080     
     Branches       4166     1343     -2823     
   =============================================
   - Hits          19786     4230    -15556     
   + Misses        19914    10842     -9072     
   + Partials       1757      305     -1452     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.91% <ø> (-13.54%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.77% <89.09%> (+0.50%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `88.13% <91.48%> (+0.45%)` | :arrow_up: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [630 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...c705ce5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
       "triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "triggerType" : "PUSH"
     }, {
       "hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
       "triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
       "triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
       "triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
       "triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
       "triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
       "triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
       "triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838",
       "triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839",
       "triggerID" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856",
       "triggerID" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858",
       "triggerID" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863",
       "triggerID" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b77b63994db2e91853a06d3a5c4c129a21feefcf Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863) 
   * e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
       "triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "triggerType" : "PUSH"
     }, {
       "hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
       "triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
       "triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
       "triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
       "triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-852970149


   I have run into a problem and would like to hear your opinion.
   Once we add this type, hoodie.deltastreamer.source.kafka.checkpoint.type=timestamp,
   should deltastreamer.checkpoint.key keep its current format, i.e. topicName,0:123,1:456?
   If we keep that format, then when a user specifies, for example, --checkpoint 1622635064, we need to reconcile commitMetadata.getMetadata(CHECKPOINT_KEY) with --checkpoint 1622635064 in org.apache.hudi.utilities.deltastreamer.DeltaSync#readFromSource. That seems to contradict the outcome of our earlier discussion: do not add Kafka-dependent code to DeltaSync.
   
   Do you have any suggestions for this? Thanks.
   @nsivabalan 
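
To make the two checkpoint shapes in this question concrete, here is a minimal sketch, assuming the distinction is made inside a Kafka-specific helper so that DeltaSync stays source-agnostic. `CheckpointSketch`, `isTimestampCheckpoint`, `topicOf`, and `parseOffsets` are hypothetical names for illustration, not Hudi APIs, and treating any digits-only string as a timestamp is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: tells a timestamp-style checkpoint
// ("1622635064") apart from an offset-style one ("topicName,0:123,1:456").
public class CheckpointSketch {

    // Assumption: a checkpoint consisting only of digits is an epoch timestamp.
    public static boolean isTimestampCheckpoint(String checkpoint) {
        return checkpoint != null && checkpoint.matches("\\d+");
    }

    // Returns the topic name, i.e. the first comma-separated field.
    public static String topicOf(String checkpoint) {
        return checkpoint.split(",")[0];
    }

    // Parses the "partition:offset" pairs that follow the topic name.
    public static Map<Integer, Long> parseOffsets(String checkpoint) {
        String[] parts = checkpoint.split(",");
        Map<Integer, Long> offsets = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].split(":");
            offsets.put(Integer.parseInt(pair[0]), Long.parseLong(pair[1]));
        }
        return offsets;
    }

    public static void main(String[] args) {
        for (String cp : new String[] {"1622635064", "topicName,0:123,1:456"}) {
            if (isTimestampCheckpoint(cp)) {
                System.out.println(cp + " -> timestamp, resolve to offsets in the Kafka source");
            } else {
                System.out.println(cp + " -> topic " + topicOf(cp) + ", offsets " + parseOffsets(cp));
            }
        }
    }
}
```

With a check like this kept on the Kafka side, the stored checkpoint could stay in the offset format while a timestamp passed via --checkpoint is resolved to offsets before anything is committed. Whether that matches the direction agreed for this PR is not confirmed by the thread.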





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
       "triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "triggerType" : "PUSH"
     }, {
       "hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
       "triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
       "triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
       "triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
       "triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
       "triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
       "triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
       "triggerType" : "PUSH"
     }, {
       "hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
       "triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838",
       "triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839",
       "triggerID" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856",
       "triggerID" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858",
       "triggerID" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863",
       "triggerID" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959",
       "triggerID" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
       "triggerType" : "PUSH"
     }, {
       "hash" : "c705ce5d409b139a14f22bef3ecdc189fa90f562",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "c705ce5d409b139a14f22bef3ecdc189fa90f562",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959) 
   * c705ce5d409b139a14f22bef3ecdc189fa90f562 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (739a252) into [master](https://codecov.io/gh/apache/hudi/commit/7fed7352bd506e20e5316bb0b3ed9e5c1e9c76df?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7fed735) will **decrease** coverage by `1.49%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     54.96%   53.47%   -1.50%     
   + Complexity     3844     3459     -385     
   ============================================
     Files           485      431      -54     
     Lines         23437    21421    -2016     
     Branches       2494     2253     -241     
   ============================================
   - Hits          12882    11454    -1428     
   + Misses         9401     8947     -454     
   + Partials       1154     1020     -134     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.55% <ø> (ø)` | |
   | hudiclient | `∅ <ø> (∅)` | |
   | hudicommon | `50.29% <ø> (ø)` | |
   | hudiflink | `63.41% <ø> (ø)` | |
   | hudihadoopmr | `51.54% <ø> (ø)` | |
   | hudisparkdatasource | `73.33% <ø> (ø)` | |
   | hudisync | `46.44% <ø> (ø)` | |
   | huditimelineservice | `64.36% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...di/utilities/sources/helpers/IncrSourceHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9JbmNyU291cmNlSGVscGVyLmphdmE=) | | |
   | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | | |
   | [...i/utilities/deltastreamer/SourceFormatAdapter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvU291cmNlRm9ybWF0QWRhcHRlci5qYXZh) | | |
   | [...s/exception/HoodieIncrementalPullSQLException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxTUUxFeGNlcHRpb24uamF2YQ==) | | |
   | [...alCheckpointFromAnotherHoodieTimelineProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrcG9pbnRGcm9tQW5vdGhlckhvb2RpZVRpbWVsaW5lUHJvdmlkZXIuamF2YQ==) | | |
   | [...g/apache/hudi/utilities/schema/SchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlci5qYXZh) | | |
   | [...ities/schema/NullTargetSchemaRegistryProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9OdWxsVGFyZ2V0U2NoZW1hUmVnaXN0cnlQcm92aWRlci5qYXZh) | | |
   | [...ties/exception/HoodieIncrementalPullException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxFeGNlcHRpb24uamF2YQ==) | | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | | |
   | [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | | |
   | ... and [41 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
       "triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "triggerType" : "PUSH"
     }, {
       "hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
       "triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
       "triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
       "triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
       "triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400) 
   * 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] yanghua commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-771572476


   CI is still failing; please check it again.





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-845791067


   > @liujinhui1994 : were you able to make progress on this. would be nice to have this in before next release.
   
   Sorry, I was too busy with work earlier. I have just worked through the overall idea of this PR and clarified its goal, and I will start on it soon.





[GitHub] [hudi] codecov-io edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc) (87125e7) into [master](https://codecov.io/gh/apache/hudi/commit/c4afd179c1983a382b8a5197d800b0f5dba254de?el=desc) (c4afd17) will **decrease** coverage by `6.14%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     50.18%   44.03%   -6.15%     
   + Complexity     3050     2741     -309     
   ============================================
     Files           419      419              
     Lines         18931    18949      +18     
     Branches       1948     1953       +5     
   ============================================
   - Hits           9500     8345    -1155     
   - Misses         8656     9949    +1293     
   + Partials        775      655     -120     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `37.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `51.47% <ø> (-0.03%)` | `0.00 <ø> (ø)` | |
   | hudiflink | `0.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisparkdatasource | `65.85% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudisync | `48.61% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `66.49% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `9.59% <0.00%> (-59.84%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-70.51%)` | `0.00 <0.00> (-50.00)` | |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-88.78%)` | `0.00 <0.00> (-16.00)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | ... and [33 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 00d29d85f32f376ef44cb99d49f605a4af6f798c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269) 
   * ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605394316



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * @param consumer
+   * @param topicName
+   * @param timestamp
+   * @return
+   */
+  private String getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, String topicName, Long timestamp) {

Review comment:
       Once the implementation plan is confirmed, I will promptly add tests.
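
To make the timestamp lookup under discussion concrete, here is a minimal sketch under stated assumptions. It is not the PR's `getOffsetsByTimestamp`: plain Java maps stand in for the Kafka consumer, and the class name, the per-partition timestamp index, and the fallback to the end offset are hypothetical choices made only for illustration. The lookup rule (earliest offset whose timestamp is at or after the target) mirrors what Kafka's `offsetsForTimes` is documented to return, and the result uses the `topicName,partition:offset,...` checkpoint string shape.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for illustration; not the Hudi implementation.
final class TimestampCheckpointSketch {

  /**
   * Builds a "topic,partition:offset,..." checkpoint string from a target timestamp.
   *
   * @param timestampIndex per partition: record timestamp -> earliest offset with that timestamp
   *                       (assumes timestamps do not decrease as offsets grow)
   * @param endOffsets     per partition: offset used when no record is at or after the timestamp
   *                       (assumed to contain every partition in timestampIndex)
   */
  static String offsetsByTimestamp(String topic,
                                   Map<Integer, TreeMap<Long, Long>> timestampIndex,
                                   Map<Integer, Long> endOffsets,
                                   long timestamp) {
    // Sort by partition number so the checkpoint string is deterministic.
    TreeMap<Integer, TreeMap<Long, Long>> sorted = new TreeMap<>(timestampIndex);
    StringBuilder checkpoint = new StringBuilder(topic);
    for (Map.Entry<Integer, TreeMap<Long, Long>> partition : sorted.entrySet()) {
      // Earliest entry whose timestamp is >= the requested timestamp, or null if none.
      Map.Entry<Long, Long> hit = partition.getValue().ceilingEntry(timestamp);
      long offset = (hit != null) ? hit.getValue() : endOffsets.get(partition.getKey());
      checkpoint.append(',').append(partition.getKey()).append(':').append(offset);
    }
    return checkpoint.toString();
  }
}
```

Falling back to the end offset means a partition with no data at or after the timestamp starts from its tail; whether that or the earliest offset is the right default is a design decision for the PR, not something this sketch settles.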
   








[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bf50481) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `1.00%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.61%   46.60%   -1.01%     
   + Complexity     5487     5009     -478     
   ============================================
     Files           924      862      -62     
     Lines         41206    38238    -2968     
     Branches       4133     3814     -319     
   ============================================
   - Hits          19619    17822    -1797     
   + Misses        19844    18830    -1014     
   + Partials       1743     1586     -157     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.60% <ø> (+0.02%)` | :arrow_up: |
   | hudicommon | `48.57% <ø> (+<0.01%)` | :arrow_up: |
   | hudiflink | `59.58% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.23% <ø> (-0.10%)` | :arrow_down: |
   | hudisync | `50.59% <ø> (-3.90%)` | :arrow_down: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
   | [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
   | [...org/apache/hudi/HoodieDatasetBulkInsertHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvSG9vZGllRGF0YXNldEJ1bGtJbnNlcnRIZWxwZXIuamF2YQ==) | `96.77% <0.00%> (-0.20%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.88% <0.00%> (ø)` | |
   | [...n/java/org/apache/hudi/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0RlZmF1bHRTb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/spark3/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9EZWZhdWx0U291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...spark3/internal/HoodieDataSourceInternalTable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxUYWJsZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0hvb2RpZUJ1bGtJbnNlcnREYXRhSW50ZXJuYWxXcml0ZXJGYWN0b3J5LmphdmE=) | `100.00% <0.00%> (ø)` | |
   | [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVCdWxrSW5zZXJ0RGF0YUludGVybmFsV3JpdGVyRmFjdG9yeS5qYXZh) | `100.00% <0.00%> (ø)` | |
   | ... and [75 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719


   I think we can simplify this. Let me walk through the relevant pseudocode.
   
   Within DeltaSync.read():
   ```
   // set right checkpoint value 
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
      checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
      checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
      checkpoint = Option.empty(); // no checkpoint to resume from
   }
   ```
   Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the `else if` condition deals with Checkpoint_Key.
   I have left out some exception cases, but this should give you the gist.
   
   within write() 
   ```
   // towards the end
   commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // checkpoint after writing this batch
   if (cfg.checkpoint != null) {
     commitMetadata.add(Checkpoint_RESET_Key);
   }
   ```
   
   If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is parsed from commitMetadata.
   
   With this PR, the only addition is a new checkpoint type. Let me propose a simple add-on to the code above that would work for us.
   
   Code before this patch, with the proposed additions marked:
   Within DeltaSync.read():
   ```
   // set right checkpoint value 
   boolean resetCheckpointType = true; // New addition
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
      checkpoint = cfg.checkpoint;
      resetCheckpointType = false; // New addition
   } else if (commitMetadata.contains(Checkpoint_Key)) {
      checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
      checkpoint = Option.empty(); // no checkpoint to resume from
   }
   // New addition
   if (resetCheckpointType) {
     // reset the checkpoint type, if set
   }
   ```
   
   No other changes are required. This rests on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, but no Checkpoint_RESET_Key will be present. From the second batch onward, it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we do not need to add the checkpoint type to commitMetadata, yet we can still decide whether or not to use the checkpoint type.
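
The read/write flow described above can be sketched in plain Java roughly as follows. This is only an illustration of the pseudocode, not the actual DeltaSync code: the class, the `Resolved` holder, and the two key strings are hypothetical names, commit metadata is modeled as a simple `Map`, and (as an assumption) the reset key stores the configured checkpoint as its value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of the checkpoint resolution discussed above.
final class CheckpointResolutionSketch {
  // Assumed key names, chosen only for this sketch.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

  /** Checkpoint to resume from, plus whether the configured checkpoint type still applies. */
  static final class Resolved {
    final Optional<String> checkpoint;
    final boolean useCheckpointType;

    Resolved(Optional<String> checkpoint, boolean useCheckpointType) {
      this.checkpoint = checkpoint;
      this.useCheckpointType = useCheckpointType;
    }
  }

  /** Mirrors read(): cfg.checkpoint wins only while no reset key has been committed. */
  static Resolved read(String cfgCheckpoint, Map<String, String> commitMetadata) {
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      // First batch: honor the user-supplied checkpoint and its type.
      return new Resolved(Optional.of(cfgCheckpoint), true);
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      // Later batches: resume from the committed checkpoint; the type is reset.
      return new Resolved(Optional.of(commitMetadata.get(CHECKPOINT_KEY)), false);
    }
    return new Resolved(Optional.empty(), false);
  }

  /** Mirrors write(): record the new checkpoint and mark cfg.checkpoint as consumed. */
  static Map<String, String> write(String cfgCheckpoint, String updatedCheckpoint) {
    Map<String, String> commitMetadata = new HashMap<>();
    commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
    if (cfgCheckpoint != null) {
      commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
    return commitMetadata;
  }
}
```

Calling `read` with a configured checkpoint and empty metadata returns that checkpoint with the type honored; calling it again with the metadata produced by `write` returns the committed checkpoint with the type ignored, which is the hand-in-hand behavior the comment relies on.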
   
   
   
   
   





[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-851604627


   sure, sounds good. 





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `20.23%`.
   > The diff coverage is `85.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2438       +/-   ##
   =============================================
   - Coverage     47.72%   27.49%   -20.24%     
   + Complexity     5528     1301     -4227     
   =============================================
     Files           934      386      -548     
     Lines         41457    15377    -26080     
     Branches       4166     1343     -2823     
   =============================================
   - Hits          19786     4228    -15558     
   + Misses        19914    10843     -9071     
   + Partials       1757      306     -1451     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `20.91% <ø> (-13.54%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [630 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   





[GitHub] [hudi] n3nash commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r642690647



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
     @Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
     public String checkpoint = null;
 
+    /**
+     * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+     * 2. timestamp: kafka offset timestamp
+     * example
+     * 1. hudi_topic,0:100,1:101,2:201
+     * 2. 1621947081
+     */
+    @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+    public String checkpointType = "string";

Review comment:
       @nsivabalan Do we need to introduce a top-level option here? Could we instead introduce a source-level property such as `hoodie.deltastreamer.source.kafka.checkpoint.type`? This checkpoint type seems specific to a Kafka use case, and I would like to avoid confusing users of other source types with an extra top-level config.
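
As a rough sketch of what the property-based alternative could look like, the snippet below reads the checkpoint type from source-level properties and parses the string-offset format from the Javadoc above (`hudi_topic,0:100,1:101,2:201`). The property key is the one suggested in this comment and is not confirmed to exist in Hudi; the class and method names are hypothetical, and `java.util.Properties` stands in for whatever properties object the source receives.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration; names are not taken from the Hudi code base.
final class KafkaCheckpointTypeSketch {
  // Key proposed in the review comment; assumed here, not an existing config.
  static final String CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  /** Returns true when the source is configured to treat --checkpoint as a timestamp. */
  static boolean isTimestampCheckpoint(Properties props) {
    // Default to the string-offset form, matching the default in the diff above.
    return "timestamp".equalsIgnoreCase(props.getProperty(CHECKPOINT_TYPE_KEY, "string"));
  }

  /** Parses "topic,partition:offset,..." into partition -> offset, keeping input order. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    // parts[0] is the topic name; the remaining entries are partition:offset pairs.
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      offsets.put(Integer.parseInt(pair[0]), Long.parseLong(pair[1]));
    }
    return offsets;
  }
}
```

Keeping the switch in the source's own properties means non-Kafka sources never see it, which is the concern raised here.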







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] yanghua commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759520009


   @wangxianghu Could you review this PR first?





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839) 
   * 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856) 
   * 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670998409



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -282,6 +301,36 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * @param consumer

Review comment:
       OK, I will add it immediately







[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `0.92%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.61%   46.68%   -0.93%     
   + Complexity     5487     5039     -448     
   ============================================
     Files           924      867      -57     
     Lines         41206    38791    -2415     
     Branches       4133     3927     -206     
   ============================================
   - Hits          19619    18110    -1509     
   + Misses        19844    19079     -765     
   + Partials       1743     1602     -141     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.20% <ø> (-0.38%)` | :arrow_down: |
   | hudicommon | `48.58% <ø> (+0.02%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `68.36% <ø> (+1.03%)` | :arrow_up: |
   | hudisync | `50.55% <ø> (-3.93%)` | :arrow_down: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
   | [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `67.74% <0.00%> (-3.54%)` | :arrow_down: |
   | [...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh) | `81.53% <0.00%> (-3.37%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `39.57% <0.00%> (-3.31%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
   | [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
   | [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/table/HoodieTableSink.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZVNpbmsuamF2YQ==) | `10.52% <0.00%> (ø)` | |
   | ... and [103 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-782588952


   @yanghua @wangxianghu @nsivabalan  
   
   I have verified the change; please help review it.
   





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r557069074



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -182,6 +187,10 @@ public KafkaOffsetGen(TypedProperties props) {
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
       // Determine the offset ranges to read from
+      if (kafkaCheckpointTimestamp != null) {
+        lastCheckpointStr = Option.of(getOffsetsByTimestamp(consumer, partitionInfoList, topicName, Long.parseLong(kafkaCheckpointTimestamp)));
+      }
+
       if (lastCheckpointStr.isPresent() && !lastCheckpointStr.get().isEmpty()) {

Review comment:
       I will deal with it now.







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864418016


   Actually, we can make it even simpler. 
   ```
   // Set the right checkpoint value.
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
     checkpoint = Option.empty();
   }
   // New addition.
   if (commitMetadata.contains(Checkpoint_RESET_Key)) {
     // Reset the checkpoint type if set.
   }
   ```
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959) 
   * c705ce5d409b139a14f22bef3ecdc189fa90f562 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=962) 
   





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671069986



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestKafkaOffsetGen.java
##########
@@ -64,7 +63,7 @@ public void teardown() throws Exception {
 
   private TypedProperties getConsumerConfigs(String autoOffsetReset) {
     TypedProperties props = new TypedProperties();
-    props.put(Config.KAFKA_AUTO_OFFSET_RESET, autoOffsetReset);
+    props.put("auto.offset.reset", autoOffsetReset);

Review comment:
       It is necessary, and I have already added it.
   This better guarantees the correctness of the procedure.







[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564


   Good point. 
   Tell me if my general understanding of how timestamp-based checkpointing would be used is right. 
   A user would want timestamp-based checkpointing in DeltaStreamer only for the bootstrap case. 
   From then on, checkpointing would use the regular Kafka checkpoint format of "topicName,0:123,1:456". 
   
   If that understanding is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second run onward we fall back to the regular checkpoint parsing mechanism. 
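
To make the two checkpoint shapes concrete, here is a minimal Java sketch that tells a bare timestamp apart from the regular "topicName,0:123,1:456" format and parses the latter into a partition-to-offset map. The class and method names (`CheckpointFormatSketch`, `isTimestamp`, `parseOffsets`) are hypothetical and are not part of KafkaOffsetGen. Treating "all digits" as a timestamp is an assumption made only for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: illustrates the two checkpoint shapes.
class CheckpointFormatSketch {

  // Assumption: a bootstrap checkpoint is a bare epoch timestamp such as "1621234567890".
  static boolean isTimestamp(String checkpoint) {
    return checkpoint.matches("\\d+");
  }

  // Parses "topicName,0:123,1:456" into partition -> offset, skipping the topic name.
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    for (int i = 1; i < parts.length; i++) {
      String[] partitionAndOffset = parts[i].split(":");
      offsets.put(Integer.parseInt(partitionAndOffset[0]), Long.parseLong(partitionAndOffset[1]));
    }
    return offsets;
  }

  public static void main(String[] args) {
    String regular = "topicName,0:123,1:456";
    System.out.println(isTimestamp(regular));
    System.out.println(parseOffsets(regular));
    System.out.println(isTimestamp("1621234567890"));
  }
}
```

On the first run the source would take the timestamp branch and resolve offsets from it; on later runs the committed checkpoint is always in the regular format, so only `parseOffsets` would apply.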
   
   I see we have InitialCheckPointProvider. Let me think about how to go about this and get back to you. For now, this is what I can think of: 
   InitialCheckPointProvider would expose a getCheckpointType() method, 
   and we would add it as a property to the configs if initialCheckpointProvider is set around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132). 
   Within readFromSource in DeltaSync, if the checkpoint is fetched from commit metadata, we would not honor this checkpoint type, or we would clear the checkpoint type property if it is set. 
   But if it is fetched from cfg.checkpoint, we would leave the property as is and let KafkaOffsetGen handle the checkpoint parsing. 
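
The rule described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not DeltaSync code: the class `CheckpointTypeSketch`, the method `resolve`, and the property name `example.checkpoint.type` are all made up for the example, and the sketch assumes that a checkpoint found in commit metadata takes precedence over the configured one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not Hudi code: picks the checkpoint and decides whether the type hint survives.
class CheckpointTypeSketch {

  // Made-up property name, used only in this example.
  static final String CHECKPOINT_TYPE_PROP = "example.checkpoint.type";

  static Optional<String> resolve(String cfgCheckpoint,
                                  Optional<String> committedCheckpoint,
                                  Map<String, String> props) {
    if (committedCheckpoint.isPresent()) {
      // Resuming from commit metadata: the checkpoint is in the regular format, so clear the type hint.
      props.remove(CHECKPOINT_TYPE_PROP);
      return committedCheckpoint;
    }
    // First run: leave the type hint in place so the source can parse cfgCheckpoint accordingly.
    return Optional.ofNullable(cfgCheckpoint);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(CHECKPOINT_TYPE_PROP, "timestamp");
    System.out.println(resolve("1621234567890", Optional.empty(), props) + " " + props);
    System.out.println(resolve("1621234567890", Optional.of("topicName,0:123,1:456"), props) + " " + props);
  }
}
```

Keeping the decision in one place means the Kafka source only ever sees a type hint when the checkpoint really came from the user-supplied configuration.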
   
   Let me think this through some more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used. 
   
   CC @n3nash @bvaradar 
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638) 
   * bf50481b923dbaa14be994bd0cc45bbe22ff8524 UNKNOWN
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bf50481) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `1.37%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.61%   46.23%   -1.38%     
   + Complexity     5487     4760     -727     
   ============================================
     Files           924      833      -91     
     Lines         41206    36348    -4858     
     Branches       4133     3623     -510     
   ============================================
   - Hits          19619    16805    -2814     
   + Misses        19844    18052    -1792     
   + Partials       1743     1491     -252     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.60% <ø> (+0.02%)` | :arrow_up: |
   | hudicommon | `48.57% <ø> (+<0.01%)` | :arrow_up: |
   | hudiflink | `59.58% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.23% <ø> (-0.10%)` | :arrow_down: |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
   | [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
   | [...org/apache/hudi/HoodieDatasetBulkInsertHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvSG9vZGllRGF0YXNldEJ1bGtJbnNlcnRIZWxwZXIuamF2YQ==) | `96.77% <0.00%> (-0.20%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.88% <0.00%> (ø)` | |
   | [...n/java/org/apache/hudi/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0RlZmF1bHRTb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/spark3/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9EZWZhdWx0U291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...spark3/internal/HoodieDataSourceInternalTable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxUYWJsZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0hvb2RpZUJ1bGtJbnNlcnREYXRhSW50ZXJuYWxXcml0ZXJGYWN0b3J5LmphdmE=) | `100.00% <0.00%> (ø)` | |
   | [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVCdWxrSW5zZXJ0RGF0YUludGVybmFsV3JpdGVyRmFjdG9yeS5qYXZh) | `100.00% <0.00%> (ø)` | |
   | [...nal/HoodieDataSourceInternalBatchWriteBuilder.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxCYXRjaFdyaXRlQnVpbGRlci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | ... and [103 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...bf50481](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-787662595


   The current implementation is mainly in KafkaOffsetGen @wangxianghu 





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * c705ce5d409b139a14f22bef3ecdc189fa90f562 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=962) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670997475



##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestKafkaSource.java
##########
@@ -193,7 +193,7 @@ public void testJsonKafkaSourceWithDefaultUpperCap() {
 
     Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics);
     SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource);
-    Config.maxEventsFromKafkaSource = 500;
+    //props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");

Review comment:
       The commented-out line `//props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500")`
   should not appear here. Sorry, my mistake.
   







[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671353717



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
       Set<TopicPartition> topicPartitions = partitionInfoList.stream()
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
+      if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+        lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+      }

Review comment:
       I'm not sure I understand. This is what I have in mind:
   ```
   if (timestamp-based checkpoint)
       lastCheckpoint = getOffsetsByTimestamp()
   else if (regular checkpoint type)
       lastCheckpoint = fetchValidOffsets()
   else
       reset based on auto.offset.reset
   ```
   
   Am I misunderstanding anything here? Can you help me understand, please?
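
The three-way branch in the pseudocode above can be sketched in plain Java. This is only one reading of the comment, not the actual `KafkaOffsetGen` code: the class name, the `Strategy` enum, the `"timestamp"` type string, and the digits-only check are all hypothetical stand-ins chosen for illustration.

```java
import java.util.Optional;

// Hypothetical sketch: none of these names come from the Hudi code base.
final class CheckpointResolutionSketch {

  /** The three outcomes named in the review pseudocode. */
  enum Strategy { OFFSETS_BY_TIMESTAMP, VALIDATE_STORED_OFFSETS, AUTO_OFFSET_RESET }

  /** Assumption: a timestamp checkpoint consists of ASCII digits only. */
  static boolean looksLikeTimestamp(String checkpoint) {
    return !checkpoint.isEmpty()
        && checkpoint.chars().allMatch(c -> c >= '0' && c <= '9');
  }

  /** Picks a strategy from the configured type and the last checkpoint, if any. */
  static Strategy choose(String checkpointType, Optional<String> lastCheckpoint) {
    if ("timestamp".equals(checkpointType)
        && lastCheckpoint.isPresent()
        && looksLikeTimestamp(lastCheckpoint.get())) {
      // Timestamp-based checkpoint: translate the timestamp into offsets.
      return Strategy.OFFSETS_BY_TIMESTAMP;
    } else if (lastCheckpoint.isPresent() && !lastCheckpoint.get().isEmpty()) {
      // Regular checkpoint: validate the stored offsets.
      return Strategy.VALIDATE_STORED_OFFSETS;
    } else {
      // No usable checkpoint: fall back to auto.offset.reset.
      return Strategy.AUTO_OFFSET_RESET;
    }
  }
}
```

In this sketch a non-numeric checkpoint falls through to the regular offset path even when the type is set to timestamp, which mirrors the combined condition in the diff under review.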







[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffd30f5) into [master](https://codecov.io/gh/apache/hudi/commit/0b57483a8e41742689a1362aa94aabb94a1361b3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0b57483) will **decrease** coverage by `3.02%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     45.76%   42.74%   -3.03%     
   + Complexity     5261     4070    -1191     
   ============================================
     Files           909      753     -156     
     Lines         39353    33259    -6094     
     Branches       4239     3603     -636     
   ============================================
   - Hits          18010    14215    -3795     
   + Misses        19499    17467    -2032     
   + Partials       1844     1577     -267     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.95% <ø> (ø)` | |
   | hudiclient | `16.45% <ø> (-13.95%)` | :arrow_down: |
   | hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
   | hudiflink | `61.26% <ø> (+0.45%)` | :arrow_up: |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.00% <ø> (+0.52%)` | :arrow_up: |
   | hudisync | `47.11% <ø> (-4.35%)` | :arrow_down: |
   | huditimelineservice | `64.36% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [.../org/apache/hudi/sink/compact/CompactFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL2NvbXBhY3QvQ29tcGFjdEZ1bmN0aW9uLmphdmE=) | `86.66% <0.00%> (-13.34%)` | :arrow_down: |
   | [...e/hudi/sink/partitioner/profile/WriteProfiles.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlcy5qYXZh) | `50.00% <0.00%> (-5.89%)` | :arrow_down: |
   | [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `75.22% <0.00%> (-2.23%)` | :arrow_down: |
   | [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
   | [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `55.00% <0.00%> (-1.42%)` | :arrow_down: |
   | [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `79.22% <0.00%> (-1.30%)` | :arrow_down: |
   | [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `84.34% <0.00%> (-0.66%)` | :arrow_down: |
   | [...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=) | `87.50% <0.00%> (-0.50%)` | :arrow_down: |
   | [...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh) | `96.37% <0.00%> (-0.05%)` | :arrow_down: |
   | ... and [189 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [0b57483...ffd30f5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] wangxianghu edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759873110


   @liujinhui1994 IMO, we can accept both offset and timestamp checkpoints through `--checkpoint`, and add a new parameter named `checkpointType` (defaulting to the offset type if not configured) to tell Hudi which checkpoint type the user supplied. WDYT?
   Also, please check why CI failed.
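
A minimal sketch of the proposed parameter, assuming it is read from a properties object and falls back to the offset type when absent. The property key, the enum, and the class name below are hypothetical and are not taken from the PR.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch of a checkpoint-type setting with an offset default.
final class CheckpointTypeConfigSketch {

  enum CheckpointType { OFFSET, TIMESTAMP }

  // Assumed key, used here only for illustration.
  static final String CHECKPOINT_TYPE_KEY = "example.checkpoint.type";

  /** Returns OFFSET when the key is missing or blank; otherwise parses the value. */
  static CheckpointType fromProps(Properties props) {
    String raw = props.getProperty(CHECKPOINT_TYPE_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return CheckpointType.OFFSET;
    }
    // valueOf throws IllegalArgumentException for an unknown type name.
    return CheckpointType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }
}
```

Failing fast on an unknown value is a design choice of this sketch; silently falling back to the offset type would also satisfy the suggestion as written.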





[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r663377708



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -282,6 +301,36 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * @param consumer

Review comment:
       Can you please add some documentation on what's happening here (the format, etc.)? An example would be great.
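
As an illustration of the kind of example requested, here is a sketch of the last step of such a method: turning per-partition offsets into a checkpoint string. It assumes the checkpoint layout is `topic,partition:offset,partition:offset,...`, which this thread does not state, and it takes the offsets as a plain map; in the real code they would presumably come from the Kafka consumer's `offsetsForTimes` lookup. The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch; the checkpoint layout is an assumption.
final class TimestampCheckpointSketch {

  /**
   * Builds "topic,partition:offset,..." with partitions in ascending order.
   * Returns just the topic name when there are no partitions.
   */
  static String toCheckpoint(String topic, Map<Integer, Long> offsetsByPartition) {
    // TreeMap sorts by partition id so the output is deterministic.
    String parts = new TreeMap<>(offsetsByPartition).entrySet().stream()
        .map(e -> e.getKey() + ":" + e.getValue())
        .collect(Collectors.joining(","));
    return parts.isEmpty() ? topic : topic + "," + parts;
  }
}
```

For a topic `events` with offset 42 on partition 0 and offset 57 on partition 1, the sketch yields `events,0:42,1:57`.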

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -270,6 +280,15 @@ public KafkaOffsetGen(TypedProperties props) {
     return checkpointOffsetReseter ? earliestOffsets : checkpointOffsets;
   }
 
+  private Boolean checkLastCheckpointType(Option<String> lastCheckpointStr) {

Review comment:
       Should we name this `isValidCheckpointType` or something similar? Also, can you add Javadoc describing what validation we are doing here?
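
One possible shape for the renamed, documented method is sketched below. The validation rule (digits only, 10 or 13 characters for epoch seconds or milliseconds) is an assumption made for illustration, `java.util.Optional` stands in for Hudi's `Option`, and the method name differs slightly from the one suggested above.

```java
import java.util.Optional;

// Hypothetical sketch; the validation rule is assumed, not taken from the PR.
final class CheckpointValidationSketch {

  /**
   * Checks whether the last checkpoint can be read as an epoch timestamp.
   *
   * <p>Assumed rule: the value must be present, consist of ASCII digits only,
   * and be 10 characters (seconds) or 13 characters (milliseconds) long.
   * An offset-style checkpoint such as "topic,0:42" is therefore rejected.
   */
  static boolean isValidTimestampCheckpoint(Optional<String> lastCheckpoint) {
    if (!lastCheckpoint.isPresent()) {
      return false;
    }
    String value = lastCheckpoint.get();
    boolean digitsOnly = !value.isEmpty()
        && value.chars().allMatch(c -> c >= '0' && c <= '9');
    return digitsOnly && (value.length() == 10 || value.length() == 13);
  }
}
```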







[GitHub] [hudi] codecov-io commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc) (9c37f30) into [master](https://codecov.io/gh/apache/hudi/commit/e3d3677b7e7899705b624925666317f0c074f7c7?el=desc) (e3d3677) will **decrease** coverage by `41.11%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2438       +/-   ##
   ============================================
   - Coverage     50.73%   9.61%   -41.12%     
   + Complexity     3064      48     -3016     
   ============================================
     Files           419      53      -366     
     Lines         18797    1944    -16853     
     Branches       1922     233     -1689     
   ============================================
   - Hits           9536     187     -9349     
   + Misses         8487    1744     -6743     
   + Partials        774      13      -761     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `?` | `?` | |
   | hudiflink | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | hudisparkdatasource | `?` | `?` | |
   | hudisync | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `9.61% <0.00%> (-59.87%)` | `0.00 <0.00> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-88.78%)` | `0.00 <0.00> (-16.00)` | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | ... and [397 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more) | |
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
       "triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
       "triggerType" : "PUSH"
     }, {
       "hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
       "triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
       "triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
       "triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373) 
   * ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **increase** coverage by `19.04%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2438       +/-   ##
   =============================================
   + Coverage     47.61%   66.65%   +19.04%     
   + Complexity     5487      798     -4689     
   =============================================
     Files           924      100      -824     
     Lines         41206     3488    -37718     
     Branches       4133      353     -3780     
   =============================================
   - Hits          19619     2325    -17294     
   + Misses        19844     1024    -18820     
   + Partials       1743      139     -1604     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `66.65% <ø> (+32.07%)` | :arrow_up: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
   | [...e/hudi/table/format/mor/MergeOnReadTableState.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvbW9yL01lcmdlT25SZWFkVGFibGVTdGF0ZS5qYXZh) | | |
   | [.../main/java/org/apache/hudi/util/AvroConvertor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL0F2cm9Db252ZXJ0b3IuamF2YQ==) | | |
   | [...he/hudi/table/format/cow/AbstractColumnReader.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvY293L0Fic3RyYWN0Q29sdW1uUmVhZGVyLmphdmE=) | | |
   | [...di/utilities/sources/helpers/IncrSourceHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9JbmNyU291cmNlSGVscGVyLmphdmE=) | | |
   | [...rg/apache/hudi/table/action/commit/BucketType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21taXQvQnVja2V0VHlwZS5qYXZh) | | |
   | [...apache/hudi/timeline/service/handlers/Handler.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvSGFuZGxlci5qYXZh) | | |
   | [.../apache/hudi/keygen/constant/KeyGeneratorType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2tleWdlbi9jb25zdGFudC9LZXlHZW5lcmF0b3JUeXBlLmphdmE=) | | |
   | [...apache/hudi/client/utils/LazyIterableIterator.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC91dGlscy9MYXp5SXRlcmFibGVJdGVyYXRvci5qYXZh) | | |
   | [...hudi/table/action/commit/AbstractDeleteHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21taXQvQWJzdHJhY3REZWxldGVIZWxwZXIuamF2YQ==) | | |
   | ... and [819 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `0.94%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.61%   46.66%   -0.95%     
   + Complexity     5487     5027     -460     
   ============================================
     Files           924      864      -60     
     Lines         41206    38317    -2889     
     Branches       4133     3824     -309     
   ============================================
   - Hits          19619    17880    -1739     
   + Misses        19844    18850     -994     
   + Partials       1743     1587     -156     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `34.55% <ø> (-0.03%)` | :arrow_down: |
   | hudicommon | `48.59% <ø> (+0.02%)` | :arrow_up: |
   | hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.32% <ø> (-0.01%)` | :arrow_down: |
   | hudisync | `50.55% <ø> (-3.93%)` | :arrow_down: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
   | [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
   | [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
   | [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/table/HoodieTableSink.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZVNpbmsuamF2YQ==) | `10.52% <0.00%> (ø)` | |
   | [.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [.../org/apache/hudi/streamer/HoodieFlinkStreamer.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9Ib29kaWVGbGlua1N0cmVhbWVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...ain/java/org/apache/hudi/io/FlinkAppendHandle.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1mbGluay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vRmxpbmtBcHBlbmRIYW5kbGUuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ain/java/org/apache/hudi/io/FlinkCreateHandle.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1mbGluay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vRmxpbmtDcmVhdGVIYW5kbGUuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | ... and [94 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `31.70%`.
   > The diff coverage is `85.45%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2438       +/-   ##
   =============================================
   - Coverage     47.72%   16.02%   -31.71%     
   + Complexity     5528      502     -5026     
   =============================================
     Files           934      284      -650     
     Lines         41457    11869    -29588     
     Branches       4166      986     -3180     
   =============================================
   - Hits          19786     1902    -17884     
   + Misses        19914     9802    -10112     
   + Partials       1757      165     -1592     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [731 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719


   Guess we can simplify things. Let me go over some pseudo code of interest. 
   
   within DeltaSync.read()
   ```
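   // pick the checkpoint to resume from
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     // user-supplied checkpoint that has not been consumed yet
     checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     // resume from the checkpoint stored with the last commit
     checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
     checkpoint = Option.empty();
   }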
   ```
   Note that the first if condition deals with Checkpoint_RESET_Key, whereas the second (else if) condition deals with Checkpoint_Key. 
   I have left out some exception cases, but this should give you the gist.
   
   within write() 
   ```
   // towards the end
   commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // updated checkpoint after writing
   if(cfg.checkpoint != null) {
     commitMetadata.add(Checkpoint_RESET_Key);
   }
   ```
   
   If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is parsed from commitMetadata. 
   
   With this PR, the only addition is that we are introducing a new checkpoint type. Let me propose a simple add-on to the above code that would work for us. 
   
   Code with the proposed additions: 
   
   within DeltaSync.read()
   ```
   // pick the checkpoint to resume from
   boolean resetCheckpointType = true; // New addition
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = cfg.checkpoint;
     resetCheckpointType = false; // New addition
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
     checkpoint = Option.empty();
   }
   // New addition
   if (resetCheckpointType) {
     // reset the checkpoint type, if set
   }
   ```
   
   No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type could be set, and there won't be any Checkpoint_RESET_Key. From the second batch onwards, it should be the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't really need to add the checkpoint type to commitMetadata, and can still decide whether to use the checkpoint type or not. 
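
For illustration, the flow described in this comment can be sketched in plain Java. This is a minimal sketch, not the actual DeltaSync code: the class `CheckpointResolution`, the `Resolved` holder, both key strings, and the use of a `Map` for the commit metadata are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch only: key names and class names are illustrative, not Hudi's actual API.
final class CheckpointResolution {
  static final String CHECKPOINT_KEY = "checkpoint.key";             // assumed name
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset.key"; // assumed name

  /** Outcome of read(): the checkpoint to resume from and whether the checkpoint type still applies. */
  static final class Resolved {
    final Optional<String> checkpoint;
    final boolean useCheckpointType;

    Resolved(Optional<String> checkpoint, boolean useCheckpointType) {
      this.checkpoint = checkpoint;
      this.useCheckpointType = useCheckpointType;
    }
  }

  /** Mirrors the if / else-if / else chain from the pseudocode. */
  static Resolved resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      // First batch: honor the user-supplied checkpoint and its type.
      return new Resolved(Optional.of(cfgCheckpoint), true);
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      // Later batches: resume from the committed checkpoint and drop the type.
      return new Resolved(Optional.of(commitMetadata.get(CHECKPOINT_KEY)), false);
    }
    return new Resolved(Optional.empty(), false);
  }

  /** Mirrors the end of write(): store the new checkpoint and mark the config checkpoint as consumed. */
  static void recordCommit(String cfgCheckpoint, String updatedCheckpoint, Map<String, String> commitMetadata) {
    commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
    if (cfgCheckpoint != null) {
      commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
  }

  public static void main(String[] args) {
    Map<String, String> commitMetadata = new HashMap<>();
    String cfgCheckpoint = "1626000000000"; // a timestamp-style checkpoint

    Resolved first = resolve(cfgCheckpoint, commitMetadata);
    System.out.println(first.checkpoint + " useType=" + first.useCheckpointType);

    recordCommit(cfgCheckpoint, "topicA,0:101,1:52", commitMetadata);
    Resolved second = resolve(cfgCheckpoint, commitMetadata);
    System.out.println(second.checkpoint + " useType=" + second.useCheckpointType);
  }
}
```

In this sketch the first call returns the configured timestamp with the type still in effect, and the second call returns the committed offsets with the type dropped, which is the hand-in-hand behavior the comment relies on.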
   
   
   
   
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838) 
   * 67041c2d836e61355aea26bd24f91548ec5e92ce UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670996945



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
       Set<TopicPartition> topicPartitions = partitionInfoList.stream()
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
+      if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+        lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+      }

Review comment:
       I don't think an "else if" is needed here.
   If kafkaCheckpointType is the timestamp type, lastCheckpointStr carries a timestamp, which we convert to offsets using the getOffsetsByTimestamp method.
   If it is not the timestamp type, we treat it as a regular string checkpoint and leave it unprocessed.







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839) 
   * 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856) 
   





[GitHub] [hudi] nsivabalan merged pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #2438:
URL: https://github.com/apache/hudi/pull/2438


   





[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830456199



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * This method returns the checkpoint format based on the timestamp.
+   * example:
+   * 1. input: timestamp, etc.
+   * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+   *
+   * @param consumer
+   * @param topicName
+   * @param timestamp
+   * @return
+   */
+  private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+                                               String topicName, Long timestamp) {
+
+    Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+                                                    .map(x -> new TopicPartition(x.topic(), x.partition()))
+                                                    .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+    Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+    Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+    StringBuilder sb = new StringBuilder();
+    sb.append(topicName + ",");
+    for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+      if (map.getValue() != null) {
+        sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+      } else {
+        sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");

Review comment:
       @liujinhui1994 @nsivabalan Can you help me understand why we are adding this value here from earliestOffsets? From what I understand, the whole point of consuming from a specified timestamp is that we do not want to consume records whose timestamp is less than the specified timestamp. Let us take an example of topic A with 3 partitions: 0, 1 and 2. Offsets are as below - 
   `partition 0 - 100 (ts 210), 101 (ts 220), 102 (ts 230), 103 (ts 240) ...
   partition 1 - 50 (ts 200), 51 (ts 205), 52 (ts 225) ...
   partition 2 - 51 (ts 100), 60 (ts 150) (only 2 records present in this)`
   
   Now suppose the timestamp passed is 220; the expected results from the consumer API will be - 
   `partition 0 -> 101
   partition 1 -> 52
   partition 2 -> null`
   
   As per the code, we return - 
   `partition 0 -> 101
   partition 1 -> 52
   partition 2 -> 51`
   
   I want to understand why we are populating this value for partition 2. If all the offsets in partition 2 have timestamps less than 220, then these offsets have either already been consumed or the records are not needed at all for ingestion into the Hudi table. Ideally, no offset should be returned from this method for partition 2. 
   
   Even if this functionality is added only for a one-time initial bootstrap, consuming the records from partition 2 above still does not make sense. Please let me know the thought process behind this logic. 
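
To make the example above concrete, here is a small, self-contained Java sketch that reproduces the numbers without a Kafka broker. The class `TimestampOffsetExample` and its helpers are hypothetical; `offsetForTime` only imitates the documented behavior of `KafkaConsumer.offsetsForTimes` (earliest offset whose timestamp is greater than or equal to the target, or null if none). The "skip to end offset" branch is an assumed alternative for comparison, not something the PR implements.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: simulates per-partition lookups in memory; every partition is assumed non-empty.
final class TimestampOffsetExample {

  /** Earliest offset whose timestamp is >= target, or null if the partition has none. */
  static Long offsetForTime(TreeMap<Long, Long> offsetToTimestamp, long target) {
    for (Map.Entry<Long, Long> e : offsetToTimestamp.entrySet()) {
      if (e.getValue() >= target) {
        return e.getKey();
      }
    }
    return null;
  }

  /** Builds "topic,partition:offset,..." using one of two fallbacks for partitions without a match. */
  static String checkpoint(String topic, Map<Integer, TreeMap<Long, Long>> partitions,
                           long target, boolean fallbackToEarliest) {
    StringBuilder sb = new StringBuilder(topic);
    for (Map.Entry<Integer, TreeMap<Long, Long>> p : partitions.entrySet()) {
      Long offset = offsetForTime(p.getValue(), target);
      if (offset == null) {
        offset = fallbackToEarliest
            ? p.getValue().firstKey()      // behavior questioned in the review
            : p.getValue().lastKey() + 1;  // assumed alternative: skip to the end offset
      }
      sb.append(',').append(p.getKey()).append(':').append(offset);
    }
    return sb.toString();
  }

  /** Topic A from the review comment: partition -> (offset -> timestamp). */
  static Map<Integer, TreeMap<Long, Long>> exampleTopic() {
    Map<Integer, TreeMap<Long, Long>> partitions = new LinkedHashMap<>();
    TreeMap<Long, Long> p0 = new TreeMap<>();
    p0.put(100L, 210L); p0.put(101L, 220L); p0.put(102L, 230L); p0.put(103L, 240L);
    TreeMap<Long, Long> p1 = new TreeMap<>();
    p1.put(50L, 200L); p1.put(51L, 205L); p1.put(52L, 225L);
    TreeMap<Long, Long> p2 = new TreeMap<>();
    p2.put(51L, 100L); p2.put(60L, 150L);
    partitions.put(0, p0);
    partitions.put(1, p1);
    partitions.put(2, p2);
    return partitions;
  }

  public static void main(String[] args) {
    System.out.println(checkpoint("A", exampleTopic(), 220L, true));  // A,0:101,1:52,2:51
    System.out.println(checkpoint("A", exampleTopic(), 220L, false)); // A,0:101,1:52,2:61
  }
}
```

With the earliest-offset fallback, both records in partition 2 would be consumed even though they predate the requested timestamp; with the end-offset alternative, partition 2 contributes nothing until new records arrive.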







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-822457134


   @liujinhui1994 : ping me here once the PR is ready to be reviewed again





[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799571386


   Thanks for your contribution. This is going to be useful to the community. 
   A few high-level questions.
   1. Why don't we leverage DeltaStreamerConfig.checkpoint to pass in a timestamp for the Kafka source? Or do we expect the format of this config to be "topic_name,partition_num:offset,partition_num:offset,....", and hence need a new config for a timestamp-based checkpoint? 
   2. If yes to (1), did we think about parsing the checkpoint config, determining whether it is in the above format or a timestamp, and proceeding from there? Just trying to avoid introducing new configs if possible. 
   3. Checkpointing in deltastreamer in general is getting too complicated. I definitely see a benefit in this patch, but is there a way we can abstract it out based on the source? The new config introduced as part of this PR is very specific to Kafka, so I am trying to see if we can keep it abstracted out from deltastreamer if possible. 
   4. I see KafkaConsumer.offsetsForTimes() could return null for partitions with messages in the old format. What is the expected behavior for such partitions? Do we resume from the earliest offset? 
   
   @n3nash @vinothchandar : open to hearing your thoughts, if any. One of my suggestions above could potentially add APIs to Source, hence CCing you. 
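
As a rough illustration of question 2, the checkpoint string itself could be inspected to tell the two formats apart. The sketch below is an assumption about how such detection might look; the class `CheckpointFormat`, the `Kind` enum, and the regular expressions are hypothetical and not part of this PR.

```java
import java.util.regex.Pattern;

// Sketch only: distinguishes "topic,partition:offset,..." from a bare numeric timestamp.
final class CheckpointFormat {
  enum Kind { OFFSETS, TIMESTAMP, UNKNOWN }

  // topic_name,partition_num:offset,partition_num:offset,...
  private static final Pattern OFFSETS = Pattern.compile("[^,:]+(,\\d+:\\d+)+");
  // A bare run of digits is treated as a timestamp (assumed to be epoch milliseconds).
  private static final Pattern TIMESTAMP = Pattern.compile("\\d+");

  static Kind detect(String checkpoint) {
    if (checkpoint == null) {
      return Kind.UNKNOWN;
    }
    if (TIMESTAMP.matcher(checkpoint).matches()) {
      return Kind.TIMESTAMP;
    }
    if (OFFSETS.matcher(checkpoint).matches()) {
      return Kind.OFFSETS;
    }
    return Kind.UNKNOWN;
  }

  public static void main(String[] args) {
    System.out.println(detect("1626000000000"));       // TIMESTAMP
    System.out.println(detect("topicA,0:100,1:101")); // OFFSETS
    System.out.println(detect("topicA"));             // UNKNOWN
  }
}
```

The trade-off is that format sniffing avoids a new config but makes the meaning of the checkpoint implicit, whereas an explicit checkpoint-type config keeps intent visible at the cost of one more setting.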





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * b77b63994db2e91853a06d3a5c4c129a21feefcf Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863) 
   * e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959) 
   





[GitHub] [hudi] wangxianghu commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r556990030



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -165,6 +169,7 @@ public KafkaOffsetGen(TypedProperties props) {
     }
     DataSourceUtils.checkRequiredProperties(props, Collections.singletonList(Config.KAFKA_TOPIC_NAME));
     topicName = props.getString(Config.KAFKA_TOPIC_NAME);
+    kafkaCheckpointTimestamp = props.getString(Config.KAFKA_CHECKPOINT_TIMESTAMP);

Review comment:
       If `Config.KAFKA_CHECKPOINT_TIMESTAMP` is not set, an exception will be thrown. That is not expected when the user wants to checkpoint by providing offsets.
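
One way to avoid the exception is to treat the timestamp as optional. The following is a minimal sketch using `java.util.Properties` rather than Hudi's `TypedProperties`; the class `OptionalTimestampConfig` and the property key string are assumptions for illustration only.

```java
import java.util.Optional;
import java.util.Properties;

// Sketch only: reads an optional timestamp without failing when the key is absent.
final class OptionalTimestampConfig {
  // Hypothetical key; the real config name is defined in KafkaOffsetGen.Config.
  static final String KAFKA_CHECKPOINT_TIMESTAMP = "example.kafka.checkpoint.timestamp";

  static Optional<Long> checkpointTimestamp(Properties props) {
    String raw = props.getProperty(KAFKA_CHECKPOINT_TIMESTAMP); // null when the key is missing
    if (raw == null || raw.trim().isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(Long.parseLong(raw.trim()));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(checkpointTimestamp(props)); // Optional.empty

    props.setProperty(KAFKA_CHECKPOINT_TIMESTAMP, "220");
    System.out.println(checkpointTimestamp(props)); // Optional[220]
  }
}
```

Callers that checkpoint by offsets then simply see an empty value and skip the timestamp lookup.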

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -182,6 +187,10 @@ public KafkaOffsetGen(TypedProperties props) {
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
       // Determine the offset ranges to read from
+      if (kafkaCheckpointTimestamp != null) {
+        lastCheckpointStr = Option.of(getOffsetsByTimestamp(consumer, partitionInfoList, topicName, Long.parseLong(kafkaCheckpointTimestamp)));
+      }
+
       if (lastCheckpointStr.isPresent() && !lastCheckpointStr.get().isEmpty()) {

Review comment:
       Here we cannot simply overwrite `lastCheckpointStr`. If the user has configured `Config.KAFKA_CHECKPOINT_TIMESTAMP`, Hudi will always consume from `Config.KAFKA_CHECKPOINT_TIMESTAMP` and cannot move on, right?







[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r594877281



##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
     return upsert(WriteOperationType.UPSERT);
   }
 
-  public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+  public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {

Review comment:
       Shall I wait until your PR is finished and then continue with this in the next PR?
   @nsivabalan 







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881200974


   Let's try to land this by the weekend. It's been hanging for quite some time. 





[GitHub] [hudi] liujinhui1994 removed a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 removed a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-782588952


   @yanghua @wangxianghu @nsivabalan  
   
   I have verified, please help review
   





[GitHub] [hudi] codecov-io edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298









[GitHub] [hudi] liujinhui1994 closed pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 closed pull request #2438:
URL: https://github.com/apache/hudi/pull/2438


   





[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604828288



##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
     return upsert(WriteOperationType.UPSERT);
   }
 
-  public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+  public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {

Review comment:
       Actually, my PR was closed as it was invalid, but [here](https://github.com/nsivabalan/hudi/blob/f7439e2e28748bf7b713fb72ba611f8af7bb97a1/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/ReadBatch.java) is the class that I added. Maybe you can add it in this patch itself. 







[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r594471195



##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
     return upsert(WriteOperationType.UPSERT);
   }
 
-  public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+  public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {

Review comment:
       This is getting out of hand (two Pairs within a Pair). We can't keep adding more Pairs here. I am adding a class to hold the return value in one of my PRs. Let's see if we can rebase once the other PR lands.
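
To illustrate the kind of holder class suggested here, below is a minimal sketch, not the actual ReadBatch class linked elsewhere in this thread. The name `FetchResult`, its field names, and the reading of the two strings as a checkpoint plus a checkpoint type are all assumptions; the type parameters `S` and `R` stand in for `SchemaProvider` and `JavaRDD<HoodieRecord>` so that the sketch compiles without Hudi or Spark on the classpath.

```java
import java.util.Objects;

/**
 * Hypothetical holder replacing
 * Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>>.
 * S stands in for SchemaProvider, R for JavaRDD<HoodieRecord>.
 */
final class FetchResult<S, R> {
  private final S schemaProvider;
  private final R records;
  // Assumption: the two strings are the checkpoint and its type.
  private final String checkpoint;
  private final String checkpointType;

  FetchResult(S schemaProvider, R records, String checkpoint, String checkpointType) {
    this.schemaProvider = Objects.requireNonNull(schemaProvider, "schemaProvider");
    this.records = Objects.requireNonNull(records, "records");
    this.checkpoint = checkpoint;
    this.checkpointType = checkpointType;
  }

  S getSchemaProvider() {
    return schemaProvider;
  }

  R getRecords() {
    return records;
  }

  String getCheckpoint() {
    return checkpoint;
  }

  String getCheckpointType() {
    return checkpointType;
  }
}
```

With named accessors, a caller reads `result.getCheckpoint()` instead of `pair.getRight().getLeft()`, and adding a further field later does not change the method signature again.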







[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670978737



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
       Set<TopicPartition> topicPartitions = partitionInfoList.stream()
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
+      if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+        lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+      }

Review comment:
       I was expecting an else-if block after this line. Can you please clarify? If there isn't one, might we go into the else block? 

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestKafkaOffsetGen.java
##########
@@ -64,7 +63,7 @@ public void teardown() throws Exception {
 
   private TypedProperties getConsumerConfigs(String autoOffsetReset) {
     TypedProperties props = new TypedProperties();
-    props.put(Config.KAFKA_AUTO_OFFSET_RESET, autoOffsetReset);
+    props.put("auto.offset.reset", autoOffsetReset);

Review comment:
       Do you think we can add some tests to this class for the timestamp type? 

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestKafkaSource.java
##########
@@ -193,7 +193,7 @@ public void testJsonKafkaSourceWithDefaultUpperCap() {
 
     Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics);
     SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource);
-    Config.maxEventsFromKafkaSource = 500;
+    //props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");

Review comment:
       Why is this commented out? 

##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestKafkaSource.java
##########
@@ -193,7 +193,7 @@ public void testJsonKafkaSourceWithDefaultUpperCap() {
 
     Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics);
     SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource);
-    Config.maxEventsFromKafkaSource = 500;
+    //props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");

Review comment:
       I tried your patch locally. The test fails if I uncomment this line. I don't understand why. 







[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393953



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, FileSystem fs, Config
           "'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to ensure updates are not missed.");
 
       this.props = properties.get();
+      String kafkaCheckpointTimestamp = props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");

Review comment:
       I think KAFKA_CHECKPOINT_TIMESTAMP is just a way to make it easier for users to set the checkpoint.







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881817246


   Appreciate your perseverance in addressing all the feedback. You are the best! :) Thanks for your contribution!





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c705ce5) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `44.90%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #2438       +/-   ##
   ============================================
   - Coverage     47.72%   2.82%   -44.91%     
   + Complexity     5528      85     -5443     
   ============================================
     Files           934     284      -650     
     Lines         41457   11869    -29588     
     Branches       4166     986     -3180     
   ============================================
   - Hits          19786     335    -19451     
   + Misses        19914   11508     -8406     
   + Partials       1757      26     -1731     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
   | [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [778 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...c705ce5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864418016


   Actually, we can make it even simpler. 
   DeltaSync.read()
   ```
   // set right checkpoint value
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
     checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
     checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
     checkpoint = Option.empty();
   }
   // New addition
   if (commitMetadata.contains(Checkpoint_RESET_Key)) {
     // reset checkpoint type if set
   }
   ```
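
The branch order in the pseudocode above can be sketched as runnable Java. This is only an illustration under stated assumptions: the commit metadata is modeled as a plain `Map<String, String>`, the key names are hypothetical placeholders rather than Hudi's real constants, and the final "reset checkpoint type" step is left out because the thread does not spell out what it does.

```java
import java.util.Map;
import java.util.Optional;

final class CheckpointResolver {
  // Hypothetical metadata keys; the real key names are not shown in this thread.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset.key";

  /** Picks the checkpoint to resume from, using the same branch order as the pseudocode. */
  static Optional<String> resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      // A configured checkpoint wins only while no reset key has been recorded.
      return Optional.of(cfgCheckpoint);
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      // Otherwise resume from the checkpoint stored with the last commit.
      return Optional.ofNullable(commitMetadata.get(CHECKPOINT_KEY));
    }
    // Nothing configured and nothing committed.
    return Optional.empty();
  }
}
```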
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-1073504190


   The purpose of introducing timestamps: when users wanted to consume from a specific position, DeltaStreamer previously only let them specify explicit checkpoint positions. For example, a Kafka topic may have 50+ partitions, and users had to configure the checkpoint string manually. Introducing timestamps simplifies this operation.
   
   Regarding your example: I think you are right, and I agree with your idea. Partition 2 should not be populated with this value.
   At the time, the main goal of this PR was to solve the problem of complex user configuration and to make consuming data as simple as possible. The partition 2 behavior in this example makes sense for some businesses, but it may conflict with your current scenario. I feel we can improve it and make it better.





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-1073504409


   @pratyakshsharma 





[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830768428



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
     return delayCount;
   }
 
+  /**
+   * Get the checkpoint by timestamp.
+   * This method returns the checkpoint format based on the timestamp.
+   * example:
+   * 1. input: timestamp, etc.
+   * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+   *
+   * @param consumer
+   * @param topicName
+   * @param timestamp
+   * @return
+   */
+  private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+                                               String topicName, Long timestamp) {
+
+    Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+                                                    .map(x -> new TopicPartition(x.topic(), x.partition()))
+                                                    .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+    Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+    Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+    StringBuilder sb = new StringBuilder();
+    sb.append(topicName + ",");
+    for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+      if (map.getValue() != null) {
+        sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+      } else {
+        sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");

Review comment:
       Created a JIRA for this: https://issues.apache.org/jira/browse/HUDI-3671
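
For readers following the diff above, here is a small self-contained sketch of how the checkpoint string is assembled. It is an illustration only: plain maps keyed by partition number stand in for the results of `consumer.offsetsForTimes(...)` and `consumer.beginningOffsets(...)`, the class and method names are hypothetical, and joining without a trailing comma is a choice made for this sketch. A `null` offset models a partition with no record at or after the timestamp, which falls back to the earliest offset as in the else branch of the diff.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class TimestampCheckpoint {
  /**
   * Builds "topic,partition:offset,..." in partition order.
   * offsetsAtTimestamp may map a partition to null when no record
   * exists at or after the requested timestamp.
   */
  static String build(String topic,
                      Map<Integer, Long> offsetsAtTimestamp,
                      Map<Integer, Long> earliestOffsets) {
    // Sort partitions so the output is deterministic.
    Map<Integer, Long> sortedEarliest = new TreeMap<>(earliestOffsets);
    String partitions = sortedEarliest.entrySet().stream()
        .map(e -> {
          Long found = offsetsAtTimestamp.get(e.getKey());
          // Fall back to the earliest offset when the lookup found nothing.
          Long offset = found != null ? found : e.getValue();
          return e.getKey() + ":" + offset;
        })
        .collect(Collectors.joining(","));
    return topic + "," + partitions;
  }
}
```

Whether the earliest offset is the right fallback for a partition with no matching record is exactly the question raised in this review thread; swapping in the latest offset would only change the fallback line.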







[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-811855554


   Nishith and I discussed this. Here is our proposal. 
   Let's rely on Deltastreamer.Config.checkpoint to pass in any type of checkpoint. 
   We can add another config called "checkpoint.type", which could default to string for all default checkpoints. For the checkpoint of interest in this PR, we could set the value of this new config to "timestamp". 
   
   With this, it's up to each source to parse and interpret the checkpoint value, and DeltaSync does not need to deal with different checkpointing formats. 
   
   Having said this, DeltaSync readFromSource() should not have any changes in this diff. 
   KafkaOffsetGen should have the logic to parse different checkpoint values, based on two values (deltastreamer.config.checkpoint and checkpoint.type). 
   
   With this, we also keep source-specific checkpointing logic within the source-specific class and do not leak it into DeltaSync, which should be agnostic to the different sources. 
   
   @liujinhui1994: Let me know what you think. Happy to chat more on this. 
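
A rough sketch of how a source could interpret the checkpoint based on "checkpoint.type" is shown below. It is not the code that landed in KafkaOffsetGen; the class name, the constants, and the `offsetsForTimestamp` callback (which stands in for the Kafka lookup) are assumptions made for illustration.

```java
import java.util.Optional;
import java.util.function.LongFunction;

final class CheckpointTypeDispatch {
  // Assumed values for the proposed "checkpoint.type" config.
  static final String TYPE_STRING = "string";
  static final String TYPE_TIMESTAMP = "timestamp";

  /**
   * Returns an offset-style checkpoint. A numeric checkpoint of type
   * "timestamp" is translated through the callback; anything else is
   * passed through unchanged.
   */
  static Optional<String> toOffsetCheckpoint(Optional<String> checkpoint,
                                             String checkpointType,
                                             LongFunction<String> offsetsForTimestamp) {
    boolean isTimestamp = TYPE_TIMESTAMP.equals(checkpointType)
        && checkpoint.isPresent()
        && checkpoint.get().matches("\\d{1,18}"); // digits only, small enough for a long
    if (isTimestamp) {
      long timestamp = Long.parseLong(checkpoint.get());
      return Optional.of(offsetsForTimestamp.apply(timestamp));
    }
    return checkpoint;
  }
}
```

Keeping this translation inside the source-side helper means the caller only ever sees offset-style checkpoints, which matches the goal stated above of keeping DeltaSync agnostic to the source.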
   





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a39570d) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `0.96%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.51%   46.54%   -0.97%     
   + Complexity     5429     4951     -478     
   ============================================
     Files           922      855      -67     
     Lines         40968    37983    -2985     
     Branches       4105     3785     -320     
   ============================================
   - Hits          19464    17678    -1786     
   + Misses        19780    18741    -1039     
   + Partials       1724     1564     -160     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `40.00% <ø> (ø)` | |
   | hudiclient | `34.58% <ø> (ø)` | |
   | hudicommon | `48.39% <ø> (+0.01%)` | :arrow_up: |
   | hudiflink | `60.07% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.10% <ø> (ø)` | |
   | hudisync | `50.10% <ø> (-3.95%)` | :arrow_down: |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
   | [...s/exception/HoodieIncrementalPullSQLException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxTUUxFeGNlcHRpb24uamF2YQ==) | | |
   | [...udi/utilities/transform/FlatteningTransformer.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9GbGF0dGVuaW5nVHJhbnNmb3JtZXIuamF2YQ==) | | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | | |
   | [...i/utilities/deltastreamer/SourceFormatAdapter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvU291cmNlRm9ybWF0QWRhcHRlci5qYXZh) | | |
   | [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | | |
   | [.../hudi/utilities/schema/RowBasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9Sb3dCYXNlZFNjaGVtYVByb3ZpZGVyLmphdmE=) | | |
   | [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | | |
   | [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | | |
   | ... and [57 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6eca06d...a39570d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400) 
   * 5e8ab52b0e139333c4c003932c55ff6e88302206 UNKNOWN
   



[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r657945070



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -312,13 +313,13 @@ public void refreshTimeline() throws IOException {
       if (lastCommit.isPresent()) {
         HoodieCommitMetadata commitMetadata = HoodieCommitMetadata
             .fromBytes(commitTimelineOpt.get().getInstantDetails(lastCommit.get()).get(), HoodieCommitMetadata.class);
-        if (cfg.checkpoint != null && !cfg.checkpoint.equals(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))) {
-          resumeCheckpointStr = Option.of(cfg.checkpoint);
-        } else if (commitMetadata.getMetadata(CHECKPOINT_KEY) != null) {
-          //if previous checkpoint is an empty string, skip resume use Option.empty()
-          if (!commitMetadata.getMetadata(CHECKPOINT_KEY).isEmpty()) {
-            resumeCheckpointStr = Option.of(commitMetadata.getMetadata(CHECKPOINT_KEY));
+        if (cfg.checkpoint != null) {

Review comment:
       We could combine both of these into a single if condition.
   ```
   if (cfg.checkpoint != null && (StringUtils.isNullOrEmpty(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))
                     || !cfg.checkpoint.equals(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY)))) {
     resumeCheckpointStr = Option.of(cfg.checkpoint);
   }
   ```

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -330,6 +331,9 @@ public void refreshTimeline() throws IOException {
                   + commitTimelineOpt.get().getInstants().collect(Collectors.toList()) + ", CommitMetadata="
                   + commitMetadata.toJsonString());
         }
+        if (!StringUtils.isNullOrEmpty(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))) {
+          props.put("hoodie.deltastreamer.source.kafka.checkpoint.type", "string");

Review comment:
       Actually, the better thing to do here is to remove the entry from props. What do you think?

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -330,6 +331,9 @@ public void refreshTimeline() throws IOException {
                   + commitTimelineOpt.get().getInstants().collect(Collectors.toList()) + ", CommitMetadata="
                   + commitMetadata.toJsonString());
         }
+        if (!StringUtils.isNullOrEmpty(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))) {
+          props.put("hoodie.deltastreamer.source.kafka.checkpoint.type", "string");

Review comment:
       Rather than hardcoding the config key here, can we use a constant, please?
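
The review suggestions on DeltaSync.java can be sketched together as follows. This is a hedged illustration rather than the code that was merged: `CheckpointPropsSketch`, `KAFKA_CHECKPOINT_TYPE_KEY`, and both helper methods are hypothetical names, plain `String` arguments stand in for `cfg.checkpoint` and `commitMetadata.getMetadata(CHECKPOINT_RESET_KEY)`, and only the key string itself is taken from the diff above.

```java
import java.util.Properties;

/** Hypothetical sketch of the review suggestions on DeltaSync. */
final class CheckpointPropsSketch {

  /** Hypothetical constant; the key string is the one hardcoded in the diff. */
  static final String KAFKA_CHECKPOINT_TYPE_KEY =
      "hoodie.deltastreamer.source.kafka.checkpoint.type";

  /** Local stand-in for a null-or-empty string check. */
  static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  /**
   * Single combined condition: use the configured checkpoint when one is set and
   * it differs from (or there is no) previously recorded reset key.
   */
  static boolean shouldUseConfiguredCheckpoint(String cfgCheckpoint, String lastResetKey) {
    return cfgCheckpoint != null
        && (isNullOrEmpty(lastResetKey) || !cfgCheckpoint.equals(lastResetKey));
  }

  /**
   * Once a reset key has been recorded, remove the type entry instead of
   * overwriting it, so the source falls back to its default handling.
   */
  static void clearCheckpointType(Properties props, String lastResetKey) {
    if (!isNullOrEmpty(lastResetKey)) {
      props.remove(KAFKA_CHECKPOINT_TYPE_KEY);
    }
  }
}
```

Removing the entry relies on the assumption, taken from the earlier proposal, that an absent "checkpoint.type" is treated as the default string format.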







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633) 
   * a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856) 
   * 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858) 
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638) 
   * bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775) 
   



[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a688be7) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **increase** coverage by `19.04%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2438       +/-   ##
   =============================================
   + Coverage     47.61%   66.65%   +19.04%     
   + Complexity     5487      798     -4689     
   =============================================
     Files           924      100      -824     
     Lines         41206     3488    -37718     
     Branches       4133      353     -3780     
   =============================================
   - Hits          19619     2325    -17294     
   + Misses        19844     1024    -18820     
   + Partials       1743      139     -1604     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `66.65% <ø> (+32.07%)` | :arrow_up: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
   | [...in/java/org/apache/hudi/index/JavaHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9pbmRleC9KYXZhSG9vZGllSW5kZXguamF2YQ==) | | |
   | [...pache/hudi/cli/commands/FileSystemViewCommand.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0ZpbGVTeXN0ZW1WaWV3Q29tbWFuZC5qYXZh) | | |
   | [...metadata/HoodieMetadataMergedLogRecordScanner.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllTWV0YWRhdGFNZXJnZWRMb2dSZWNvcmRTY2FubmVyLmphdmE=) | | |
   | [...apache/hudi/common/fs/inline/InLineFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9JbkxpbmVGaWxlU3lzdGVtLmphdmE=) | | |
   | [...va/org/apache/hudi/sink/utils/PayloadCreation.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3V0aWxzL1BheWxvYWRDcmVhdGlvbi5qYXZh) | | |
   | [.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=) | | |
   | [...apache/hudi/common/model/WriteConcurrencyMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL1dyaXRlQ29uY3VycmVuY3lNb2RlLmphdmE=) | | |
   | [...he/hudi/common/model/BootstrapBaseFileMapping.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0Jvb3RzdHJhcEJhc2VGaWxlTWFwcGluZy5qYXZh) | | |
   | [...di/common/table/log/block/HoodieAvroDataBlock.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9ibG9jay9Ib29kaWVBdnJvRGF0YUJsb2NrLmphdmE=) | | |
   | ... and [819 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...a688be7](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564


   Good point. 
   Please tell me whether my general understanding of timestamp-based checkpointing is right. 
   A user would use timestamp-based checkpointing in DeltaStreamer only for the bootstrap case. 
   From then on, checkpointing uses the regular Kafka checkpoint format of "topicName,0:123,1:456". 
   
   If that understanding is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second time onward we fall back to the regular checkpoint parsing mechanism. 
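   
   To make the two checkpoint forms concrete, here is a minimal Java sketch that tells a bare timestamp apart from the regular "topicName,0:123,1:456" format and parses the latter into partition offsets. The class `CheckpointFormatSketch` and its methods are hypothetical names made up for illustration, not Hudi APIs, and treating any all-digit string as a timestamp is an assumption of this sketch. 
   
   ```java
   import java.util.LinkedHashMap;
   import java.util.Map;

   /** Hypothetical helper for illustration only; not part of Hudi. */
   final class CheckpointFormatSketch {

     /** Assumption: a checkpoint made only of digits is a timestamp. */
     static boolean looksLikeTimestamp(String checkpoint) {
       return checkpoint != null && checkpoint.matches("\\d+");
     }

     /** Returns the topic name, i.e. the part before the first comma. */
     static String topicOf(String checkpoint) {
       return checkpoint.split(",")[0];
     }

     /** Parses "topicName,0:123,1:456" into a partition -> offset map. */
     static Map<Integer, Long> parseOffsets(String checkpoint) {
       String[] parts = checkpoint.split(",");
       if (parts.length < 2) {
         throw new IllegalArgumentException("Expected topic,partition:offset,... but got: " + checkpoint);
       }
       Map<Integer, Long> offsets = new LinkedHashMap<>();
       for (int i = 1; i < parts.length; i++) {
         String[] pair = parts[i].split(":");
         if (pair.length != 2) {
           throw new IllegalArgumentException("Malformed partition:offset entry: " + parts[i]);
         }
         offsets.put(Integer.parseInt(pair[0].trim()), Long.parseLong(pair[1].trim()));
       }
       return offsets;
     }
   }
   ```
   
   With such a helper, the first (bootstrap) run could branch on `looksLikeTimestamp` and every later run would go straight to `parseOffsets`. How a timestamp is then turned into per-partition offsets is outside the scope of this sketch. 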
   
   I see we have InitialCheckPointProvider. Let me think about how to go about this and get back to you. For now, this is what I can think of. 
   InitialCheckPointProvider will expose a getCheckpointType() method, 
   and we add it as a property to the configs if initialCheckpointProvider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132). 
   Within readFromSource in DeltaSync, if the checkpoint is fetched from commit metadata, we do not honor this checkpoint type, or we clear the checkpoint type property if it is set. 
   But if it is fetched from cfg.checkpoint, we leave the property as is and let KafkaOffsetGen handle checkpoint parsing. 
   
   Let me think this through some more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used. 
   
   CC @n3nash @bvaradar 
   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863968623


   DeltaSync should reset this configuration (...kafka.checkpoint.type), similar to how we reset checkpoints. 
   For that, we may need to store it in the metadata file; changing it only in memory carries a greater risk. I have submitted my latest implementation. Please take a look and see whether it is feasible. 
   @nsivabalan 





[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a39570d) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `3.51%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2438      +/-   ##
   ============================================
   - Coverage     47.51%   43.99%   -3.52%     
   + Complexity     5429     3918    -1511     
   ============================================
     Files           922      730     -192     
     Lines         40968    32657    -8311     
     Branches       4105     3245     -860     
   ============================================
   - Hits          19464    14366    -5098     
   + Misses        19780    16957    -2823     
   + Partials       1724     1334     -390     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `40.00% <ø> (ø)` | |
   | hudiclient | `22.94% <ø> (-11.65%)` | :arrow_down: |
   | hudicommon | `48.39% <ø> (+0.01%)` | :arrow_up: |
   | hudiflink | `60.07% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.10% <ø> (ø)` | |
   | hudisync | `?` | |
   | huditimelineservice | `?` | |
   | hudiutilities | `?` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | | |
   | [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | | |
   | [...ties/exception/HoodieIncrementalPullException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxFeGNlcHRpb24uamF2YQ==) | | |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | | |
   | [...src/main/java/org/apache/hudi/dla/DLASyncTool.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktZGxhLXN5bmMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZGxhL0RMQVN5bmNUb29sLmphdmE=) | | |
   | [...i/utilities/deltastreamer/SourceFormatAdapter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvU291cmNlRm9ybWF0QWRhcHRlci5qYXZh) | | |
   | [...a/org/apache/hudi/metrics/DistributedRegistry.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9EaXN0cmlidXRlZFJlZ2lzdHJ5LmphdmE=) | | |
   | [...g/apache/hudi/keygen/GlobalDeleteKeyGenerator.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkva2V5Z2VuL0dsb2JhbERlbGV0ZUtleUdlbmVyYXRvci5qYXZh) | | |
   | [.../hudi/utilities/sources/helpers/AvroConvertor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9BdnJvQ29udmVydG9yLmphdmE=) | | |
   | [...llback/SparkMergeOnReadRollbackActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL3JvbGxiYWNrL1NwYXJrTWVyZ2VPblJlYWRSb2xsYmFja0FjdGlvbkV4ZWN1dG9yLmphdmE=) | | |
   | ... and [181 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6eca06d...a39570d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   





[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719


   I think we can simplify things. Let me walk through the relevant pseudocode. 
   
   Within DeltaSync.read():
   ```
   // set the right checkpoint value 
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
      checkpoint = cfg.checkpoint;
   } else if (commitMetadata.contains(Checkpoint_Key)) {
      checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
      checkpoint = Option.empty();
   }
   ```
   Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the `else if` condition deals with Checkpoint_Key. 
   I have left out some exception cases, but this should give you the gist.
   
   within write() 
   ```
   // towards the end
   commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // the checkpoint after writing
   if (cfg.checkpoint != null) {
     commitMetadata.put(Checkpoint_RESET_Key, cfg.checkpoint);
   }
   ```
   
   If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is read from commitMetadata. 
   
   With this PR, the only addition is a new checkpoint type. Let me propose a simple add-on to the code above that would work for us. 
   
   within DeltaSync.read()
   ```
   // set the right checkpoint value 
   boolean resetCheckpointType = true; // New addition
   if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
      checkpoint = cfg.checkpoint;
      resetCheckpointType = false; // New addition
   } else if (commitMetadata.contains(Checkpoint_Key)) {
      checkpoint = commitMetadata.get(Checkpoint_Key);
   } else {
      checkpoint = Option.empty();
   }
   // New addition
   if (resetCheckpointType) {
     // reset the checkpoint type if it is set
   }
   ```
   
   No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, but no Checkpoint_RESET_Key is present. From the second batch on, it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata and can still decide whether to honor the checkpoint type. 
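   
   The two-batch behavior described above can be sketched as a small self-contained Java simulation. This only illustrates the pseudocode under stated assumptions: `CheckpointResolutionSketch`, `Resolved`, `resolve`, and `commit` are hypothetical names, the key strings are placeholders, and a plain `Map` stands in for the commit metadata. It is not the actual DeltaSync code. 
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   import java.util.Optional;

   /** Hypothetical stand-in for the DeltaSync pseudocode; not actual Hudi code. */
   final class CheckpointResolutionSketch {
     // Placeholder key names, assumed for this sketch only.
     static final String CHECKPOINT_KEY = "checkpoint_key";
     static final String CHECKPOINT_RESET_KEY = "checkpoint_reset_key";

     /** Outcome of the read-side decision. */
     static final class Resolved {
       final Optional<String> checkpoint;
       final boolean resetCheckpointType;

       Resolved(Optional<String> checkpoint, boolean resetCheckpointType) {
         this.checkpoint = checkpoint;
         this.resetCheckpointType = resetCheckpointType;
       }
     }

     /** Read side: cfg.checkpoint wins only while no reset key has been committed. */
     static Resolved resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
       if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
         // First batch: honor cfg.checkpoint and keep the checkpoint type.
         return new Resolved(Optional.of(cfgCheckpoint), false);
       } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
         // Later batches: resume from the committed checkpoint and reset the type.
         return new Resolved(Optional.of(commitMetadata.get(CHECKPOINT_KEY)), true);
       }
       return new Resolved(Optional.empty(), true);
     }

     /** Write side: always store the new checkpoint; add the reset key if cfg.checkpoint was set. */
     static Map<String, String> commit(String cfgCheckpoint, String updatedCheckpoint) {
       Map<String, String> commitMetadata = new HashMap<>();
       commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
       if (cfgCheckpoint != null) {
         commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
       }
       return commitMetadata;
     }

     public static void main(String[] args) {
       String cfgCheckpoint = "1610545954000"; // example timestamp-style checkpoint
       Resolved first = resolve(cfgCheckpoint, new HashMap<>());
       Map<String, String> metadata = commit(cfgCheckpoint, "topicName,0:123,1:456");
       Resolved second = resolve(cfgCheckpoint, metadata);
       System.out.println(first.checkpoint + " resetType=" + first.resetCheckpointType);
       System.out.println(second.checkpoint + " resetType=" + second.resetCheckpointType);
     }
   }
   ```
   
   In this simulation the first batch resolves to cfg.checkpoint with the checkpoint type kept, and the second batch resolves to the committed checkpoint with the type reset, without the type ever being written to the metadata map. 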
   
   
   
   
   





[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r657546921



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -461,7 +465,7 @@ public void refreshTimeline() throws IOException {
     if (!hasErrors || cfg.commitOnErrors) {
       HashMap<String, String> checkpointCommitMetadata = new HashMap<>();
       checkpointCommitMetadata.put(CHECKPOINT_KEY, checkpointStr);
-      if (cfg.checkpoint != null) {
+      if (cfg.checkpoint != null && !"timestamp".equals(props.getString("hoodie.deltastreamer.source.kafka.checkpoint.type"))) {

Review comment:
       Can you help me understand why we need this? My understanding is that the user sets cfg.checkpoint for the first batch, along with the checkpoint type (timestamp). But for any checkpoint type, we should still add the checkpoint_reset_key here at the end of the first batch. Am I missing something? 







[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563


   ## CI report:
   
   * 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858) 
   





[GitHub] [hudi] liujinhui1994 closed pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 closed pull request #2438:
URL: https://github.com/apache/hudi/pull/2438


   





[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881820983


   @nsivabalan Thank you for your attention and your patient help!

