Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/01/13 13:52:34 UTC
[GitHub] [hudi] liujinhui1994 opened a new pull request #2438: [HUDI-1147] DeltaStreamer kafka source supports consuming from specified timestamp
liujinhui1994 opened a new pull request #2438:
URL: https://github.com/apache/hudi/pull/2438
DeltaStreamer kafka source supports consuming from specified timestamp
## What is the purpose of the pull request
The DeltaStreamer Kafka source supports consuming from a specified timestamp.
## Brief change log
org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen
## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
Guess we can simplify things. Let me go over some pseudo code of interest.
within DeltaSync.read()
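```
// set the right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  // cfg.checkpoint has not been consumed by an earlier batch, so honor it
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  // resume from the checkpoint recorded by the last commit
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```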
Note that the first if condition deals with Checkpoint_RESET_Key, whereas the else-if condition deals with Checkpoint_Key.
within write()
```
// towards the end
commitMetadata.put(Checkpoint_Key, <updated checkpoint after writing>);
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is parsed from commitMetadata.
With this PR, the only addition is that we are introducing a new checkpoint type. Let me propose a simple add-on to the above code that would work for us.
within DeltaSync.read()
```
// set the right checkpoint value
boolean resetCheckpointType = true; // New addition
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false; // New addition
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (resetCheckpointType) {
  // reset checkpoint type if set
}
```
No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type could be set, and there won't be any Checkpoint_RESET_Key set. From the second batch onward, it should be the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't really need to add the checkpoint type to commitMetadata, but can still decide whether to use the checkpoint type or not.
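To make the two-batch behaviour concrete, here is a minimal, self-contained Java sketch of the read and write steps described above. It is not the DeltaSync code: the class name `CheckpointSketch`, the `Resolved` holder, the key strings, and the sample checkpoint values are all hypothetical, and a plain `Map` stands in for the commit metadata. Storing `cfg.checkpoint` as the value of the reset key is also an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the checkpoint handling discussed above; not Hudi's actual code. */
final class CheckpointSketch {
  // Placeholder key names chosen for this sketch only.
  static final String CHECKPOINT_KEY = "checkpoint";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset";

  /** Outcome of the read step: which checkpoint to use and whether the checkpoint type applies. */
  static final class Resolved {
    final Optional<String> checkpoint;
    final boolean useCheckpointType;

    Resolved(Optional<String> checkpoint, boolean useCheckpointType) {
      this.checkpoint = checkpoint;
      this.useCheckpointType = useCheckpointType;
    }
  }

  /** Read step: cfg.checkpoint wins only while no reset key has been committed yet. */
  static Resolved resolve(String cfgCheckpoint, Map<String, String> lastCommitMetadata) {
    if (cfgCheckpoint != null && !lastCommitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      return new Resolved(Optional.of(cfgCheckpoint), true);
    }
    if (lastCommitMetadata.containsKey(CHECKPOINT_KEY)) {
      // The checkpoint type is reset here: a stored checkpoint is used as-is.
      return new Resolved(Optional.of(lastCommitMetadata.get(CHECKPOINT_KEY)), false);
    }
    return new Resolved(Optional.empty(), false);
  }

  /** Write step: always record the new checkpoint; record the reset key if cfg.checkpoint was set. */
  static Map<String, String> commit(String cfgCheckpoint, String updatedCheckpoint) {
    Map<String, String> metadata = new HashMap<>();
    metadata.put(CHECKPOINT_KEY, updatedCheckpoint);
    if (cfgCheckpoint != null) {
      metadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint); // stored value is an assumption
    }
    return metadata;
  }

  public static void main(String[] args) {
    String cfgCheckpoint = "1610000000000"; // made-up timestamp checkpoint
    Resolved first = resolve(cfgCheckpoint, new HashMap<>());
    Map<String, String> metadata = commit(cfgCheckpoint, "topic,0:42"); // made-up offsets
    Resolved second = resolve(cfgCheckpoint, metadata);
    System.out.println(first.checkpoint + " useType=" + first.useCheckpointType);
    System.out.println(second.checkpoint + " useType=" + second.useCheckpointType);
  }
}
```

In this sketch the first call picks up `cfg.checkpoint` together with the checkpoint type, and the second call, made after a commit, falls back to the stored checkpoint with the type reset, which is the hand-in-hand behaviour assumed above.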
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799512747
no problem
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c705ce5) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **increase** coverage by `0.15%`.
> The diff coverage is `89.09%`.
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
+ Coverage 47.72% 47.88% +0.15%
- Complexity 5528 5580 +52
============================================
Files 934 936 +2
Lines 41457 41665 +208
Branches 4166 4193 +27
============================================
+ Hits 19786 19950 +164
- Misses 19914 19947 +33
- Partials 1757 1768 +11
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.51% <ø> (+0.05%)` | :arrow_up: |
| hudicommon | `48.67% <ø> (+0.11%)` | :arrow_up: |
| hudiflink | `59.68% <ø> (-0.35%)` | :arrow_down: |
| hudihadoopmr | `52.02% <ø> (+0.73%)` | :arrow_up: |
| hudisparkdatasource | `67.59% <ø> (-0.07%)` | :arrow_down: |
| hudisync | `55.97% <ø> (+1.46%)` | :arrow_up: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.77% <89.09%> (+0.50%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `88.13% <91.48%> (+0.45%)` | :arrow_up: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `94.64% <0.00%> (-5.36%)` | :arrow_down: |
| [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `64.00% <0.00%> (-3.80%)` | :arrow_down: |
| [.../hudi/common/util/collection/LazyFileIterable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvY29sbGVjdGlvbi9MYXp5RmlsZUl0ZXJhYmxlLmphdmE=) | `71.73% <0.00%> (-2.68%)` | :arrow_down: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
| [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `74.77% <0.00%> (-0.46%)` | :arrow_down: |
| ... and [38 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...c705ce5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565)
* 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633)
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r657692451
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -461,7 +465,7 @@ public void refreshTimeline() throws IOException {
if (!hasErrors || cfg.commitOnErrors) {
HashMap<String, String> checkpointCommitMetadata = new HashMap<>();
checkpointCommitMetadata.put(CHECKPOINT_KEY, checkpointStr);
- if (cfg.checkpoint != null) {
+ if (cfg.checkpoint != null && !"timestamp".equals(props.getString("hoodie.deltastreamer.source.kafka.checkpoint.type"))) {
Review comment:
Your understanding is correct. I originally wanted to write the timestamp to CHECKPOINT_RESET_KEY, but I dropped that because it did not seem to serve a practical purpose. After hearing your thoughts, adding it seems more appropriate.
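For readers following this hunk, the sketch below contrasts the guard shown in the diff (skip CHECKPOINT_RESET_KEY when the checkpoint type is "timestamp") with the variant this reply leans toward (always record it when cfg.checkpoint is set). Only the property name `hoodie.deltastreamer.source.kafka.checkpoint.type` and the `"timestamp"` literal come from the diff; the class `ResetKeyGuardSketch`, the key strings, the `skipForTimestamp` switch, the use of `java.util.Properties` in place of the `props` object, and the value stored under the reset key are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical illustration of the commit-metadata guard discussed in this hunk. */
final class ResetKeyGuardSketch {
  // Placeholder key names; the property name below is the one shown in the diff.
  static final String CHECKPOINT_KEY = "checkpoint";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset";
  static final String CHECKPOINT_TYPE_PROP = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  /**
   * Builds the metadata written with a commit.
   * skipForTimestamp = true mirrors the guard in the diff;
   * skipForTimestamp = false always records the reset key when cfgCheckpoint is set.
   */
  static Map<String, String> commitMetadata(String cfgCheckpoint, String checkpointStr,
                                            Properties props, boolean skipForTimestamp) {
    Map<String, String> metadata = new HashMap<>();
    metadata.put(CHECKPOINT_KEY, checkpointStr);
    // Constant-first equals keeps this null-safe when the property is absent.
    boolean isTimestamp = "timestamp".equals(props.getProperty(CHECKPOINT_TYPE_PROP));
    if (cfgCheckpoint != null && !(skipForTimestamp && isTimestamp)) {
      metadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint); // stored value is an assumption
    }
    return metadata;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(CHECKPOINT_TYPE_PROP, "timestamp");
    // Made-up checkpoint values, for illustration only.
    System.out.println(commitMetadata("1610000000000", "topic,0:5", props, true));
    System.out.println(commitMetadata("1610000000000", "topic,0:5", props, false));
  }
}
```

With the guard from the diff, a timestamp-typed run never records the reset key, so a later batch has no marker that cfg.checkpoint was already consumed; recording it unconditionally keeps the marker available.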
[GitHub] [hudi] wangxianghu commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-786606533
> I will add the unit test, and then please review
ack, will review soon
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffd30f5) into [master](https://codecov.io/gh/apache/hudi/commit/0b57483a8e41742689a1362aa94aabb94a1361b3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0b57483) will **decrease** coverage by `6.60%`.
> The diff coverage is `n/a`.
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 45.76% 39.15% -6.61%
+ Complexity 5261 3482 -1779
============================================
Files 909 661 -248
Lines 39353 28070 -11283
Branches 4239 2817 -1422
============================================
- Hits 18010 10991 -7019
+ Misses 19499 15978 -3521
+ Partials 1844 1101 -743
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.95% <ø> (ø)` | |
| hudiclient | `16.45% <ø> (-13.95%)` | :arrow_down: |
| hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
| hudiflink | `61.26% <ø> (+0.45%)` | :arrow_up: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `?` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/sink/compact/CompactFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL2NvbXBhY3QvQ29tcGFjdEZ1bmN0aW9uLmphdmE=) | `86.66% <0.00%> (-13.34%)` | :arrow_down: |
| [...e/hudi/sink/partitioner/profile/WriteProfiles.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlcy5qYXZh) | `50.00% <0.00%> (-5.89%)` | :arrow_down: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `55.00% <0.00%> (-1.42%)` | :arrow_down: |
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `79.22% <0.00%> (-1.30%)` | :arrow_down: |
| [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `84.34% <0.00%> (-0.66%)` | :arrow_down: |
| [...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=) | `87.50% <0.00%> (-0.50%)` | :arrow_down: |
| [...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh) | `96.37% <0.00%> (-0.05%)` | :arrow_down: |
| [...va/org/apache/hudi/metadata/BaseTableMetadata.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvQmFzZVRhYmxlTWV0YWRhdGEuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| ... and [264 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [0b57483...ffd30f5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **increase** coverage by `0.13%`.
> The diff coverage is `85.45%`.
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
+ Coverage 47.72% 47.85% +0.13%
- Complexity 5528 5576 +48
============================================
Files 934 936 +2
Lines 41457 41665 +208
Branches 4166 4193 +27
============================================
+ Hits 19786 19940 +154
- Misses 19914 19958 +44
- Partials 1757 1767 +10
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.51% <ø> (+0.05%)` | :arrow_up: |
| hudicommon | `48.71% <ø> (+0.15%)` | :arrow_up: |
| hudiflink | `59.68% <ø> (-0.35%)` | :arrow_down: |
| hudihadoopmr | `52.02% <ø> (+0.73%)` | :arrow_up: |
| hudisparkdatasource | `67.23% <ø> (-0.43%)` | :arrow_down: |
| hudisync | `55.97% <ø> (+1.46%)` | :arrow_up: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
| [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `94.64% <0.00%> (-5.36%)` | :arrow_down: |
| [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `64.00% <0.00%> (-3.80%)` | :arrow_down: |
| [.../hudi/common/util/collection/LazyFileIterable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvY29sbGVjdGlvbi9MYXp5RmlsZUl0ZXJhYmxlLmphdmE=) | `71.73% <0.00%> (-2.68%)` | :arrow_down: |
| [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
| [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `74.77% <0.00%> (-0.46%)` | :arrow_down: |
| ... and [39 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] wangxianghu commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759873110
@liujinhui1994 IMO, we can accept both offset and timestamp checkpoints via `--checkpoint`, and add a new param named checkpointType (defaulting to the offset type if not configured) to tell Hudi which checkpoint type the user supplied. WDYT?
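To make the suggestion concrete, here is a minimal sketch of how such a parameter could be read with an offset default. The enum `CheckpointType`, the property key `checkpointType`, and the method `fromProps` are hypothetical names used for illustration only; none of them is taken from the PR.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical checkpoint kinds; the names are illustrative, not taken from the PR. */
enum CheckpointType {
  OFFSET,
  TIMESTAMP;

  /** Hypothetical property key for the proposed "checkpointType" parameter. */
  static final String KEY = "checkpointType";

  /** Returns the configured type, falling back to OFFSET when the key is absent or blank. */
  static CheckpointType fromProps(Properties props) {
    String raw = props.getProperty(KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return OFFSET;
    }
    // valueOf throws IllegalArgumentException for unknown values, which surfaces typos early.
    return CheckpointType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }
}
```

With this shape, existing jobs that never set the parameter would keep the offset behaviour, and only jobs that opt in would have `--checkpoint` interpreted as a timestamp.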
----------------------------------------------------------------
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r557069306
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -165,6 +169,7 @@ public KafkaOffsetGen(TypedProperties props) {
}
DataSourceUtils.checkRequiredProperties(props, Collections.singletonList(Config.KAFKA_TOPIC_NAME));
topicName = props.getString(Config.KAFKA_TOPIC_NAME);
+ kafkaCheckpointTimestamp = props.getString(Config.KAFKA_CHECKPOINT_TIMESTAMP);
Review comment:
Yes, I will correct this.
----------------------------------------------------------------
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604840433
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, FileSystem fs, Config
"'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to ensure updates are not missed.");
this.props = properties.get();
+ String kafkaCheckpointTimestamp = props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");
Review comment:
Let me think more on this. I am wondering whether we should rely only on the existing "HoodieDeltaStreamer.Config.checkpoint" and add another config named "checkpoint.type" or similar, which could be set to timestamp for this purpose.
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
--
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-856862390
@liujinhui1994 : here is what we can do.
If someone is running it just once, this should not be an issue. The issue arises when someone runs deltastreamer continuously.
So, the user is expected to set HoodieDeltaStreamer.Config.checkpoint or InitialCheckpointProvider.
The user also sets the new config (hoodie.deltastreamer.source.kafka.checkpoint.type) to timestamp.
KafkaOffsetGen should be capable of parsing the checkpoint as a timestamp.
At the end of the write, deltaSync should reset this (...kafka.checkpoint.type) config (similar to how we reset the checkpoint).
So, for subsequent runs, this (...kafka.checkpoint.type) config value will not be set, and KafkaOffsetGen should parse the checkpoint and fetch from the source as with a regular checkpoint.
Let me know if you can understand the approach, and if it makes sense.
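The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class `TimestampCheckpoint`, its methods, and the `offsetsForTimestamp` lookup are hypothetical (the lookup stands in for whatever Kafka call resolves a timestamp to per-partition offsets), and the regular checkpoint is assumed to have the layout "topic,partition:offset,partition:offset,...". Only the config key is taken from the comment.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongFunction;
import java.util.stream.Collectors;

/** Hypothetical helper; not the actual KafkaOffsetGen code. */
final class TimestampCheckpoint {
  /** Config key quoted in the comment above. */
  static final String TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  /** True only when the type config is present and set to "timestamp". */
  static boolean isTimestampCheckpoint(Map<String, String> props) {
    return "timestamp".equalsIgnoreCase(props.get(TYPE_KEY));
  }

  /** Builds the assumed regular checkpoint string "topic,partition:offset,...". */
  static String toOffsetCheckpoint(String topic, Map<Integer, Long> offsetsByPartition) {
    // TreeMap keeps partitions in ascending order so the output is deterministic.
    String parts = new TreeMap<Integer, Long>(offsetsByPartition).entrySet().stream()
        .map(e -> e.getKey() + ":" + e.getValue())
        .collect(Collectors.joining(","));
    return parts.isEmpty() ? topic : topic + "," + parts;
  }

  /**
   * First run (type config set): treat the checkpoint as a timestamp and convert it.
   * Later runs (type config reset): return the checkpoint unchanged.
   */
  static String resolve(Map<String, String> props, String topic, String checkpoint,
                        LongFunction<Map<Integer, Long>> offsetsForTimestamp) {
    if (!isTimestampCheckpoint(props)) {
      return checkpoint;
    }
    long timestamp = Long.parseLong(checkpoint.trim());
    return toOffsetCheckpoint(topic, offsetsForTimestamp.apply(timestamp));
  }
}
```

Once the type config has been reset after the first write, `resolve` falls through to the first branch, which matches the "regular checkpoint" behaviour described for subsequent runs.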
--
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-782589121
I have verified, please help review
@wangxianghu @yanghua @nsivabalan
----------------------------------------------------------------
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
Guess we can simplify things. Let me go over some pseudo code of interest.
Code before this patch.
within DeltaSync.read()
```
// set right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```
Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the second `else if` condition deals with Checkpoint_Key.
I have simplified some exception cases, but this should give you the gist.
within write()
```
// towards the end
commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // updated checkpoint after writing
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commitMetadata, and hence in subsequent batches the checkpoint is parsed from commitMetadata.
With this PR, the only addition is that we are introducing a new checkpoint type. Let me propose a simple add-on to the above code that would work for us.
within DeltaSync.read()
```
// set right checkpoint value
boolean resetCheckpointType = true; // New addition
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false; // New addition
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (resetCheckpointType) {
  // reset checkpoint type if set
}
```
No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type could be set, and there won't be any Checkpoint_RESET_Key set. From the second batch onwards, it should be the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't really need to add the checkpoint type to commitMetadata, and can still decide whether to use the checkpoint type or not.
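For readers who want to try the decision logic in isolation, the pseudocode above can be written as a small self-contained Java sketch. The class `CheckpointResolution` and the two metadata key strings are hypothetical placeholders, and a plain `Map` stands in for the commit metadata; this is not the DeltaSync implementation.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical result holder mirroring the pseudocode above. */
final class CheckpointResolution {
  // Placeholder key names standing in for Checkpoint_Key and Checkpoint_RESET_Key.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset.key";

  final Optional<String> checkpoint;
  /** True when the checkpoint type config should be cleared for this batch. */
  final boolean resetCheckpointType;

  private CheckpointResolution(Optional<String> checkpoint, boolean resetCheckpointType) {
    this.checkpoint = checkpoint;
    this.resetCheckpointType = resetCheckpointType;
  }

  static CheckpointResolution resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
    // First batch: a user-supplied checkpoint exists and no reset key has been recorded yet.
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      return new CheckpointResolution(Optional.of(cfgCheckpoint), false);
    }
    // Later batches: resume from the checkpoint stored with the last commit.
    if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      return new CheckpointResolution(
          Optional.ofNullable(commitMetadata.get(CHECKPOINT_KEY)), true);
    }
    // Nothing to resume from.
    return new CheckpointResolution(Optional.empty(), true);
  }
}
```

The `resetCheckpointType` flag is false only in the first branch, which is the one case where the user-supplied checkpoint (and therefore its type) is actually consumed.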
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `1.15%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.61% 46.46% -1.16%
+ Complexity 5487 5030 -457
============================================
Files 924 866 -58
Lines 41206 38565 -2641
Branches 4133 3837 -296
============================================
- Hits 19619 17918 -1701
+ Misses 19844 19060 -784
+ Partials 1743 1587 -156
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.20% <ø> (-0.38%)` | :arrow_down: |
| hudicommon | `48.58% <ø> (+0.02%)` | :arrow_up: |
| hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.32% <ø> (-0.01%)` | :arrow_down: |
| hudisync | `50.55% <ø> (-3.93%)` | :arrow_down: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
| [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `67.74% <0.00%> (-3.54%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `39.57% <0.00%> (-3.31%)` | :arrow_down: |
| [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
| [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
| [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
| [...in/java/org/apache/hudi/table/HoodieTableSink.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZVNpbmsuamF2YQ==) | `10.52% <0.00%> (ø)` | |
| [.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| ... and [100 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5022f1d) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `44.87%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.72% 2.85% -44.88%
+ Complexity 5528 85 -5443
============================================
Files 934 283 -651
Lines 41457 11751 -29706
Branches 4166 966 -3200
============================================
- Hits 19786 335 -19451
+ Misses 19914 11390 -8524
+ Partials 1757 26 -1731
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [771 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...5022f1d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393270
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
return upsert(WriteOperationType.UPSERT);
}
- public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+ public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {
Review comment:
Okay, I'll add this class to this PR
--
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799571386
A few high-level questions.
1. Why don't we leverage DeltaStreamerConfig.checkpoint to pass in a timestamp for the Kafka source? Or do we expect this config to have the format "topic_name,partition_num:offset,partition_num:offset,...." and hence need a new config for a timestamp-based checkpoint?
2. If yes to (1), did we consider parsing the checkpoint config, determining whether it is in the above format or a timestamp, and proceeding from there? I'm just trying to avoid introducing new configs if possible.
3. Checkpointing in deltastreamer is getting too complicated in general. I definitely see a benefit in this patch, but is there a way we can abstract it out based on the source? The new config introduced in this PR is very specific to Kafka, so I'm trying to see if we can keep it abstracted out of deltastreamer.
4. I see that KafkaConsumer.offsetsForTimes() could return null for partitions with messages in the old format. What is the expected behavior for such partitions? Do we resume from the earliest offset?
@n3nash @vinothchandar: open to hearing your thoughts, if any. One of my suggestions above could potentially add APIs to Source, hence CCing you.
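To make question 2 concrete, here is a minimal sketch of how a single checkpoint string could be classified as either a timestamp or the regular "topic_name,partition_num:offset,..." format. The class `CheckpointFormat`, its `Kind` enum, and the `detect` method are hypothetical names used only for illustration; they are not part of Hudi, and the sketch assumes a timestamp is passed as a plain run of digits.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hudi: classifies a checkpoint string.
final class CheckpointFormat {
    enum Kind { TIMESTAMP, TOPIC_OFFSETS, UNKNOWN }

    // Assumed regular format: topic_name,partition_num:offset[,partition_num:offset...]
    private static final Pattern TOPIC_OFFSETS = Pattern.compile("[^,:]+(,\\d+:\\d+)+");
    // Assumed timestamp format: digits only (e.g. epoch milliseconds).
    private static final Pattern TIMESTAMP = Pattern.compile("\\d+");

    static Kind detect(String checkpoint) {
        if (checkpoint == null) {
            return Kind.UNKNOWN;
        }
        String s = checkpoint.trim();
        if (TIMESTAMP.matcher(s).matches()) {
            return Kind.TIMESTAMP;
        }
        if (TOPIC_OFFSETS.matcher(s).matches()) {
            return Kind.TOPIC_OFFSETS;
        }
        return Kind.UNKNOWN;
    }
}
```

Because the regular format always contains at least one comma and colon, a digits-only string cannot be mistaken for it, which is what would let one config carry both forms under these assumptions.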
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799511127
Hey folks, may I know the status of this PR? I see this could benefit others in the community as well. Do you think we can take it across the finish line by this weekend, so that we have it for the upcoming release?
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b77b639) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `2.00%`.
> The diff coverage is `85.45%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.72% 45.71% -2.01%
+ Complexity 5528 4738 -790
============================================
Files 934 832 -102
Lines 41457 38264 -3193
Branches 4166 3832 -334
============================================
- Hits 19786 17493 -2293
+ Misses 19914 19150 -764
+ Partials 1757 1621 -136
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `22.27% <ø> (-12.18%)` | :arrow_down: |
| hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.55% <ø> (+0.26%)` | :arrow_up: |
| hudisparkdatasource | `67.32% <ø> (-0.34%)` | :arrow_down: |
| hudisync | `54.51% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `29.45% <0.00%> (-13.36%)` | :arrow_down: |
| [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
| [...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==) | `80.98% <0.00%> (-0.16%)` | :arrow_down: |
| [...va/org/apache/hudi/client/SparkRDDWriteClient.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L1NwYXJrUkREV3JpdGVDbGllbnQuamF2YQ==) | | |
| ... and [104 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...b77b639](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373)
* ffd30f564c780a25ddccf8c5bc819d4eed9b437a UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] codecov-commenter commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (739a252) into [master](https://codecov.io/gh/apache/hudi/commit/7fed7352bd506e20e5316bb0b3ed9e5c1e9c76df?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7fed735) will **decrease** coverage by `1.31%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 54.96% 53.65% -1.32%
+ Complexity 3844 3253 -591
============================================
Files 485 407 -78
Lines 23437 19762 -3675
Branches 2494 2085 -409
============================================
- Hits 12882 10603 -2279
+ Misses 9401 8221 -1180
+ Partials 1154 938 -216
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.55% <ø> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | |
| hudicommon | `50.29% <ø> (ø)` | |
| hudiflink | `63.41% <ø> (ø)` | |
| hudihadoopmr | `51.54% <ø> (ø)` | |
| hudisparkdatasource | `73.33% <ø> (ø)` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...java/org/apache/hudi/utilities/sources/Source.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvU291cmNlLmphdmE=) | | |
| [...ies/exception/HoodieSnapshotExporterException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVTbmFwc2hvdEV4cG9ydGVyRXhjZXB0aW9uLmphdmE=) | | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | | |
| [.../apache/hudi/timeline/service/TimelineService.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvVGltZWxpbmVTZXJ2aWNlLmphdmE=) | | |
| [...di/utilities/sources/helpers/IncrSourceHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9JbmNyU291cmNlSGVscGVyLmphdmE=) | | |
| [.../apache/hudi/hive/MultiPartKeysValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTXVsdGlQYXJ0S2V5c1ZhbHVlRXh0cmFjdG9yLmphdmE=) | | |
| [...ities/checkpointing/InitialCheckPointProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrUG9pbnRQcm92aWRlci5qYXZh) | | |
| [...udi/timeline/service/handlers/TimelineHandler.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvVGltZWxpbmVIYW5kbGVyLmphdmE=) | | |
| [...alCheckpointFromAnotherHoodieTimelineProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrcG9pbnRGcm9tQW5vdGhlckhvb2RpZVRpbWVsaW5lUHJvdmlkZXIuamF2YQ==) | | |
| [...java/org/apache/hudi/hive/util/HiveSchemaUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvdXRpbC9IaXZlU2NoZW1hVXRpbC5qYXZh) | | |
| ... and [65 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830456199
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * This method returns the checkpoint format based on the timestamp.
+ * example:
+ * 1. input: timestamp, etc.
+ * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+ *
+ * @param consumer
+ * @param topicName
+ * @param timestamp
+ * @return
+ */
+ private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+ String topicName, Long timestamp) {
+
+ Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+ .map(x -> new TopicPartition(x.topic(), x.partition()))
+ .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+ Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+ Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+ StringBuilder sb = new StringBuilder();
+ sb.append(topicName + ",");
+ for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+ if (map.getValue() != null) {
+ sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+ } else {
+ sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");
Review comment:
@liujinhui1994 @nsivabalan Can you help me understand why we are adding this value from earliestOffsets here? From what I understand, the whole point of consuming from a specified timestamp is that we do not want to consume records whose timestamp is earlier than the specified timestamp. Take an example of topic A with 3 partitions 0, 1, 2. The offsets are as below:
partition 0 - 100 (ts-210), 101 (ts-220), 102 (ts-230), 103 (ts-240) .....
partition 1 - 50 (ts-200), 51 (ts-205), 52 (ts-225) ....
partition 2 - 51 (ts-100), 60 (ts-150) (only 2 records present in this partition)
Now suppose the timestamp is passed as 220. The expected results from the consumer API will be:
partition 0 -> 101
partition 1 -> 52
partition 2 -> null
As per the code, we return:
partition 0 -> 101
partition 1 -> 52
partition 2 -> 51
I want to understand why we are populating this value for partition 2. If all the offsets in partition 2 have timestamps earlier than 220, then those offsets have either already been consumed or the records are not needed at all for ingestion into the hudi table. Ideally, this method should return no offset for partition 2.
Even if this functionality is added only for a one-time initial bootstrap, consuming the records from partition 2 above still does not make sense. Please let me know the thought process behind this logic.
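The example above can be reproduced with a small, Kafka-free simulation. This is only a sketch: `TimestampOffsetExample` and its methods are hypothetical, `offsetForTime` merely mimics the lookup described above (earliest offset whose timestamp is at or after the target, else null), and the `fallBackToEarliest` flag switches between the behavior in the diff and one possible reading of the suggestion to return no offset for partition 2.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of the example; it does not call the Kafka client.
final class TimestampOffsetExample {

    // Topic A from the example: partition -> (offset -> record timestamp).
    static Map<Integer, TreeMap<Long, Long>> exampleTopic() {
        Map<Integer, TreeMap<Long, Long>> topic = new LinkedHashMap<>();
        topic.put(0, records(100, 210, 101, 220, 102, 230, 103, 240));
        topic.put(1, records(50, 200, 51, 205, 52, 225));
        topic.put(2, records(51, 100, 60, 150));
        return topic;
    }

    // Builds an offset-ordered map from alternating (offset, timestamp) values.
    private static TreeMap<Long, Long> records(long... offsetThenTimestamp) {
        TreeMap<Long, Long> partition = new TreeMap<>();
        for (int i = 0; i < offsetThenTimestamp.length; i += 2) {
            partition.put(offsetThenTimestamp[i], offsetThenTimestamp[i + 1]);
        }
        return partition;
    }

    // Earliest offset whose timestamp is >= the target, or null if none exists.
    static Long offsetForTime(TreeMap<Long, Long> partition, long timestamp) {
        for (Map.Entry<Long, Long> record : partition.entrySet()) {
            if (record.getValue() >= timestamp) {
                return record.getKey();
            }
        }
        return null;
    }

    // Builds "topic,partition:offset,..."; a null lookup either falls back to
    // the partition's earliest offset (as in the diff) or omits the partition.
    static String buildCheckpoint(String topicName, Map<Integer, TreeMap<Long, Long>> topic,
                                  long timestamp, boolean fallBackToEarliest) {
        StringBuilder sb = new StringBuilder(topicName);
        for (Map.Entry<Integer, TreeMap<Long, Long>> partition : topic.entrySet()) {
            Long offset = offsetForTime(partition.getValue(), timestamp);
            if (offset == null) {
                if (!fallBackToEarliest) {
                    continue;
                }
                offset = partition.getValue().firstKey();
            }
            sb.append(',').append(partition.getKey()).append(':').append(offset);
        }
        return sb.toString();
    }
}
```

With timestamp 220, `buildCheckpoint("A", exampleTopic(), 220, true)` yields `A,0:101,1:52,2:51`, while passing `false` yields `A,0:101,1:52`. How a partition that is missing from the checkpoint should be treated on later runs is a separate question that this sketch does not answer.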
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-856862390
@liujinhui1994: here is what we can do.
If someone runs it just once, this should not be an issue. The issue arises when someone runs deltastreamer continuously.
So, the user is expected to set HoodieDeltaStreamer.Config.checkpoint or InitialCheckPointProvider.
The user also sets the new config (hoodie.deltastreamer.source.kafka.checkpoint.type) to timestamp.
KafkaOffsetGen should be capable of parsing the checkpoint as a timestamp.
At the end of the write, DeltaSync should reset this config (similar to how we reset the checkpoint).
For subsequent runs, this config value will then not be set, so KafkaOffsetGen should parse the checkpoint and fetch from the source as with a regular checkpoint.
Let me know if the approach is clear and whether it makes sense.
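The lifecycle described above could look roughly like the following. This is a sketch under stated assumptions, not the actual DeltaSync or KafkaOffsetGen code: `OneShotCheckpointType` and `syncOnce` are hypothetical, and only the config key is taken from this comment.

```java
import java.util.Properties;

// Hypothetical illustration of a checkpoint type that applies to one run only.
final class OneShotCheckpointType {
    // Config key named in the comment above.
    static final String CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";

    static boolean isTimestampCheckpoint(Properties props) {
        return "timestamp".equalsIgnoreCase(props.getProperty(CHECKPOINT_TYPE_KEY));
    }

    // Simulates one sync round: interpret the checkpoint, then reset the type
    // so that later rounds treat checkpoints as regular offset strings.
    static String syncOnce(Properties props, String checkpoint) {
        String interpretation = isTimestampCheckpoint(props)
            ? "resolve offsets for timestamp " + Long.parseLong(checkpoint)
            : "resume from offsets " + checkpoint;
        props.remove(CHECKPOINT_TYPE_KEY); // reset at the end of the write
        return interpretation;
    }
}
```

The first call with the type set to `timestamp` takes the timestamp branch; any later call with the same `Properties` object takes the regular branch because the key has been cleared.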
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564
Good point.
Tell me if my understanding of how timestamp-based checkpointing is used is right in general.
The user would like to use timestamp-based checkpointing in deltastreamer only for the bootstrap case.
From then on, checkpointing will use the regular Kafka checkpoint format of "topicName,0:123,1:456".
If that understanding is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second time on we fall back to the regular checkpoint parsing mechanism.
I see we have InitialCheckPointProvider. Let me think about how to go about this and get back to you. For now, this is what I can think of.
InitialCheckPointProvider will expose a getCheckpointType() method,
and we add it as a property to the configs if initialCheckpointProvider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132).
Within readFromSource in DeltaSync, if the checkpoint is fetched from commit metadata, we may not honor this checkpoint type, or we will clear the checkpoint type property if it is set.
But if it is fetched from cfg.checkpoint, we will leave the property as is and let KafkaOffsetGen handle checkpoint parsing.
Let me think this through some more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used.
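As a rough illustration of the origin-based rule sketched above (a checkpoint from commit metadata clears the type, a checkpoint from the user config keeps it), consider the following. `CheckpointResolution` and `resolve` are hypothetical names, the real readFromSource logic is not shown in this thread, and the property key is the one named elsewhere in this discussion.

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch: decide which checkpoint to use and whether the
// checkpoint type property should survive for the source to interpret.
final class CheckpointResolution {
    static final String CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";

    static Optional<String> resolve(Optional<String> fromCommitMetadata,
                                    Optional<String> fromConfig,
                                    Properties props) {
        if (fromCommitMetadata.isPresent()) {
            // A committed checkpoint is assumed to be in the regular offset
            // format, so the type property is dropped.
            props.remove(CHECKPOINT_TYPE_KEY);
            return fromCommitMetadata;
        }
        // A user-supplied checkpoint keeps the type property untouched.
        return fromConfig;
    }
}
```

On a first run with no commit metadata, the configured value and its type pass through unchanged; once a commit exists, its checkpoint wins and the type is cleared.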
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565)
* 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838)
* 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffd30f5) into [master](https://codecov.io/gh/apache/hudi/commit/0b57483a8e41742689a1362aa94aabb94a1361b3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0b57483) will **decrease** coverage by `0.75%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 45.76% 45.00% -0.76%
+ Complexity 5261 4858 -403
============================================
Files 909 849 -60
Lines 39353 36710 -2643
Branches 4239 3955 -284
============================================
- Hits 18010 16522 -1488
+ Misses 19499 18474 -1025
+ Partials 1844 1714 -130
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.95% <ø> (ø)` | |
| hudiclient | `30.45% <ø> (+0.06%)` | :arrow_up: |
| hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
| hudiflink | `61.26% <ø> (+0.45%)` | :arrow_up: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.00% <ø> (+0.52%)` | :arrow_up: |
| hudisync | `47.11% <ø> (-4.35%)` | :arrow_down: |
| huditimelineservice | `64.36% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [.../org/apache/hudi/sink/compact/CompactFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL2NvbXBhY3QvQ29tcGFjdEZ1bmN0aW9uLmphdmE=) | `86.66% <0.00%> (-13.34%)` | :arrow_down: |
| [...e/hudi/sink/partitioner/profile/WriteProfiles.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlcy5qYXZh) | `50.00% <0.00%> (-5.89%)` | :arrow_down: |
| [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `75.22% <0.00%> (-2.23%)` | :arrow_down: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `55.00% <0.00%> (-1.42%)` | :arrow_down: |
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `79.22% <0.00%> (-1.30%)` | :arrow_down: |
| [...metadata/SparkHoodieBackedTableMetadataWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvU3BhcmtIb29kaWVCYWNrZWRUYWJsZU1ldGFkYXRhV3JpdGVyLmphdmE=) | `72.36% <0.00%> (-0.88%)` | :arrow_down: |
| [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `84.34% <0.00%> (-0.66%)` | :arrow_down: |
| [...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=) | `87.50% <0.00%> (-0.50%)` | :arrow_down: |
| ... and [94 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [0b57483...ffd30f5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373)
[GitHub] [hudi] liujinhui1994 closed pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 closed pull request #2438:
URL: https://github.com/apache/hudi/pull/2438
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 00d29d85f32f376ef44cb99d49f605a4af6f798c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269)
* ea5ed9da433064022a69e06c98f58fc10c09e8b6 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400)
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
}, {
"hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
"triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"triggerType" : "PUSH"
}, {
"hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
"triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"triggerType" : "PUSH"
}, {
"hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
"triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"triggerType" : "PUSH"
}, {
"hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838",
"triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775)
* a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838)
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
}, {
"hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
"triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"triggerType" : "PUSH"
}, {
"hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
"triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638)
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671607330
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
Set<TopicPartition> topicPartitions = partitionInfoList.stream()
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
+ if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+ lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+ }
Review comment:
OK, I get it now. Makes sense.
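For context, the hunk above turns a timestamp-style checkpoint into per-partition offsets before consumption starts. The following is only a minimal, standard-library sketch of that idea, not the PR's implementation: the class and method names are hypothetical, the broker lookup is replaced by in-memory lists of record timestamps (assumed non-decreasing, with offsets starting at 0), and both the `topic,partition:offset,...` checkpoint layout and the fallback to the end offset are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a timestamp-to-offset lookup; no Kafka client is used here.
class TimestampCheckpointSketch {

    // Returns the offset of the first record whose timestamp is >= targetTimestamp.
    // If no such record exists, returns the end offset (the list size), which is an assumed fallback.
    static long offsetForTimestamp(List<Long> recordTimestamps, long targetTimestamp) {
        int lo = 0;
        int hi = recordTimestamps.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (recordTimestamps.get(mid) < targetTimestamp) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Builds an offset-style checkpoint string, with partitions in ascending order.
    static String toOffsetCheckpoint(String topic, Map<Integer, List<Long>> partitions, long targetTimestamp) {
        StringBuilder sb = new StringBuilder(topic);
        for (Map.Entry<Integer, List<Long>> e : new TreeMap<>(partitions).entrySet()) {
            sb.append(',').append(e.getKey()).append(':')
              .append(offsetForTimestamp(e.getValue(), targetTimestamp));
        }
        return sb.toString();
    }
}
```

Once the timestamp has been translated this way, the rest of the pipeline can keep working with ordinary offset checkpoints.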
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
}, {
"hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
"triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"triggerType" : "PUSH"
}, {
"hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
"triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"triggerType" : "PUSH"
}, {
"hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
"triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"triggerType" : "PUSH"
}, {
"hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838",
"triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"triggerType" : "PUSH"
}, {
"hash" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839",
"triggerID" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
"triggerType" : "PUSH"
}, {
"hash" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839)
* 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 UNKNOWN
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
I think we can simplify things. Let me walk through the relevant pseudocode.
Within DeltaSync.read():
```
// set the right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```
Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the `else if` condition deals with Checkpoint_Key.
Within write():
```
// towards the end
commitMetadata.put(Checkpoint_Key, <updated checkpoint after writing>);
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is read from commitMetadata.
With this PR, the only addition is a new checkpoint type. Let me propose a simple add-on to the above code that would work for us.
Within DeltaSync.read():
```
// set the right checkpoint value
boolean resetCheckpointType = true;
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
if (resetCheckpointType) {
  // reset the checkpoint type if it is set
}
```
No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, and no Checkpoint_RESET_Key is present yet. From the second batch onward it is the reverse: the checkpoint type should not be applied, but Checkpoint_RESET_Key is part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata, yet we can still decide whether to use the checkpoint type or not.
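The proposal above can be sketched as a small, self-contained Java example. This is only an illustration under stated assumptions, not DeltaSync's actual code: the class, the `Resolved` holder, the key strings, and the use of a plain `Map` for commit metadata are all hypothetical, and `honorCheckpointType` is simply the inverse of the `resetCheckpointType` flag in the pseudocode.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the read()/write() checkpoint handling described above.
class CheckpointResolution {
    // Assumed key names standing in for Checkpoint_Key and Checkpoint_RESET_Key.
    static final String CHECKPOINT_KEY = "checkpoint.key";
    static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

    // Holds the checkpoint to resume from and whether the configured checkpoint type still applies.
    static final class Resolved {
        final Optional<String> checkpoint;
        final boolean honorCheckpointType;

        Resolved(Optional<String> checkpoint, boolean honorCheckpointType) {
            this.checkpoint = checkpoint;
            this.honorCheckpointType = honorCheckpointType;
        }
    }

    // read(): the configured checkpoint wins only while no reset key has been committed.
    static Resolved resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
        if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
            return new Resolved(Optional.of(cfgCheckpoint), true);
        } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
            return new Resolved(Optional.of(commitMetadata.get(CHECKPOINT_KEY)), false);
        }
        return new Resolved(Optional.empty(), false);
    }

    // write(): record the updated checkpoint, and mark the configured one as consumed.
    static void recordCommit(String cfgCheckpoint, String updatedCheckpoint, Map<String, String> commitMetadata) {
        commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
        if (cfgCheckpoint != null) {
            commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
        }
    }
}
```

With an empty metadata map, `resolve` returns the configured checkpoint and keeps the checkpoint type. After one `recordCommit` call, the same configuration resolves to the committed checkpoint and the checkpoint type is no longer applied, which matches the first-batch versus later-batch behavior described in the comment.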
[GitHub] [hudi] codecov-io edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856)
* 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 UNKNOWN
[GitHub] [hudi] wangxianghu commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-787616261
> I will add the unit test, and then please review
Hi @liujinhui1994, sorry for the delay.
Can we keep all these changes in `KafkaOffsetGen`? That seems more elegant.
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r642183230
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
@Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
public String checkpoint = null;
+ /**
+ * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+ * 2. timestamp: kafka offset timestamp
+ * example
+ * 1. hudi_topic,0:100,1:101,2:201
+ * 2. 1621947081
+ */
+ @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+ public String checkpointType = "string";
Review comment:
I have not finished working on this PR, so it is still a work in progress. I will ping you when it is ready.
@nsivabalan
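The string checkpoint format described in the Javadoc above (`topicName,partition:offset,...`) can be illustrated with a small parser. This is a minimal sketch under stated assumptions, not Hudi's actual parsing code: the class `OffsetCheckpointSketch` and its methods are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Hudi): reads "topic,partition:offset,..." checkpoints. */
final class OffsetCheckpointSketch {

  /** Returns the topic name, i.e. the first comma-separated field. */
  static String topicOf(String checkpoint) {
    return checkpoint.split(",")[0];
  }

  /** Returns partition -> offset, in the order the pairs appear in the checkpoint. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    if (parts.length < 2) {
      throw new IllegalArgumentException("Expected topic,partition:offset,... but got: " + checkpoint);
    }
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    // Index 0 is the topic name; every later field is one partition:offset pair.
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      if (pair.length != 2) {
        throw new IllegalArgumentException("Malformed partition:offset pair: " + parts[i]);
      }
      offsets.put(Integer.parseInt(pair[0]), Long.parseLong(pair[1]));
    }
    return offsets;
  }
}
```

For the example value `hudi_topic,0:100,1:101,2:201`, `parseOffsets` yields partitions 0, 1 and 2 mapped to offsets 100, 101 and 201. A bare timestamp such as `1621947081` contains no comma, so this parser rejects it, which is why the two checkpoint types need to be told apart before parsing.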
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775)
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b77b639) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `0.27%`.
> The diff coverage is `85.45%`.
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.72% 47.44% -0.28%
- Complexity 5528 5536 +8
============================================
Files 934 934
Lines 41457 41768 +311
Branches 4166 4187 +21
============================================
+ Hits 19786 19818 +32
- Misses 19914 20189 +275
- Partials 1757 1761 +4
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `33.75% <ø> (-0.70%)` | :arrow_down: |
| hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.55% <ø> (+0.26%)` | :arrow_up: |
| hudisparkdatasource | `67.32% <ø> (-0.34%)` | :arrow_down: |
| hudisync | `54.51% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `29.45% <0.00%> (-13.36%)` | :arrow_down: |
| [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <0.00%> (-10.40%)` | :arrow_down: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [.../org/apache/hudi/MergeOnReadSnapshotRelation.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL01lcmdlT25SZWFkU25hcHNob3RSZWxhdGlvbi5zY2FsYQ==) | `90.59% <0.00%> (-0.56%)` | :arrow_down: |
| [...c/main/scala/org/apache/hudi/HoodieFileIndex.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUZpbGVJbmRleC5zY2FsYQ==) | `80.98% <0.00%> (-0.16%)` | :arrow_down: |
| ... and [2 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-844060742
@liujinhui1994: were you able to make progress on this? It would be nice to have this in before the next release.
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671380922
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
Set<TopicPartition> topicPartitions = partitionInfoList.stream()
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
+ if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+ lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+ }
Review comment:
To put it another way: when the timestamp checkpoint type is not used, the format of lastCheckpointStr is topic_name,partition_num:offset,partition_num:offset.
When the getOffsetsByTimestamp method is used, it converts lastCheckpointStr from a timestamp into that same topic_name,partition_num:offset,partition_num:offset format.
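The conversion described above ends by serializing per-partition offsets back into the regular checkpoint string. The following is a minimal sketch of only that last step, assuming the offsets for the requested timestamp have already been looked up (for example with the Kafka consumer's `offsetsForTimes`); the body of the PR's `getOffsetsByTimestamp` is not shown in this thread, and `TimestampCheckpointSketch` is a hypothetical name.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Hudi): builds "topic,partition:offset,..." strings. */
final class TimestampCheckpointSketch {

  /**
   * Formats already-resolved offsets as a regular checkpoint string.
   * Partitions are sorted so the output is deterministic.
   */
  static String toCheckpointString(String topic, Map<Integer, Long> offsetsByPartition) {
    if (offsetsByPartition.isEmpty()) {
      throw new IllegalArgumentException("No partition offsets resolved for topic " + topic);
    }
    String pairs = new TreeMap<>(offsetsByPartition).entrySet().stream()
        .map(e -> e.getKey() + ":" + e.getValue())
        .collect(Collectors.joining(","));
    return topic + "," + pairs;
  }
}
```

Once the timestamp has been replaced by a string of this shape, the rest of the offset-generation code can treat it exactly like a checkpoint read from commit metadata.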
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564
Good point.
Tell me if my general understanding of how timestamp-based checkpointing is used is right.
A user would want timestamp-based checkpointing in DeltaStreamer only for the bootstrap case.
From then on, checkpointing will use the regular Kafka checkpoint format of "topicName,0:123,1:456".
If that understanding is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second time onward we fall back to the regular checkpoint parsing mechanism.
I see we have InitialCheckPointProvider. Let me think about how to go about this and I will get back to you. For now, this is what I can think of.
InitialCheckpointProvider will expose a getCheckpointType() method.
We then add it as a property to the configs if initialCheckpointProvider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132).
Within readFromSource in DeltaSync(), if the checkpoint is fetched from commit metadata, we do not honor this checkpoint type, or we clear the checkpoint type property if it is set.
But if it is fetched from cfg.checkPoint, we leave the property as is and let KafkaOffsetGen handle the checkpoint parsing.
Let me think through this more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used.
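The rule proposed above (honor the checkpoint type only when the checkpoint comes from the user's config, not from commit metadata) can be sketched as a small pure function. This is an illustration of the idea only, not the code that landed in DeltaSync; the enum, class and method names are hypothetical.

```java
import java.util.Optional;

/** Hypothetical sketch (not part of Hudi) of when a configured checkpoint type applies. */
final class CheckpointTypeSketch {

  /** Where the checkpoint for the current batch was read from. */
  enum Source { COMMIT_METADATA, CONFIG }

  /**
   * Returns the checkpoint type the offset generator should see.
   * Empty means "parse as a regular topic,partition:offset checkpoint".
   */
  static Optional<String> effectiveCheckpointType(Source source, String configuredType) {
    if (source == Source.COMMIT_METADATA) {
      // Checkpoints stored in commit metadata are always in the regular format,
      // so any configured type is ignored from the second batch onward.
      return Optional.empty();
    }
    // First batch: the checkpoint came from the user, so their declared type applies.
    return Optional.ofNullable(configuredType);
  }
}
```

With this shape, a job started with a timestamp checkpoint resolves it once during bootstrap, and every later batch falls back to regular parsing without the type having to be stored in commit metadata.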
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830456199
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * This method returns the checkpoint format based on the timestamp.
+ * example:
+ * 1. input: timestamp, etc.
+ * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+ *
+ * @param consumer
+ * @param topicName
+ * @param timestamp
+ * @return
+ */
+ private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+ String topicName, Long timestamp) {
+
+ Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+ .map(x -> new TopicPartition(x.topic(), x.partition()))
+ .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+ Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+ Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+ StringBuilder sb = new StringBuilder();
+ sb.append(topicName + ",");
+ for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+ if (map.getValue() != null) {
+ sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+ } else {
+ sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");
Review comment:
@liujinhui1994 @nsivabalan Can you help me understand why we are adding the value from earliestOffsets here? As I understand it, the whole point of consuming from a specified timestamp is that we do not want to consume records whose timestamp is earlier than the specified one. Take a topic A with 3 partitions (0, 1, 2) and the following offsets:
partition 0 - 100 (ts-210), 101 (ts-220), 102 (ts-230), 103 (ts-240) .....
partition 1 - 50 (ts-200), 51 (ts-205), 52 (ts-225) ....
partition 2 - 51 (ts-100), 60 (ts-150) (only 2 records present in this)
Now suppose the timestamp passed is 220. The expected results from the consumer API will be:
partition 0 -> 101
partition 1 -> 52
partition 2 -> null
As per the code, we return:
partition 0 -> 101
partition 1 -> 52
partition 2 -> 51 (earliest offset)
I want to understand why we are populating this value for partition 2. If all the offsets in partition 2 have timestamps less than 220, then these records have either already been consumed or are not needed for ingestion into the Hudi table. Ideally, this method should return no offset for partition 2.
Even if this functionality is added only for a one-time initial bootstrap, consuming the records from partition 2 above still does not make sense. Please let me know the thought process behind this logic.
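To make the null case above concrete, here is a minimal, self-contained Java sketch. It does not call Kafka: plain maps stand in for the results of `beginningOffsets`, `endOffsets`, and `offsetsForTimes`. The class `TimestampCheckpointSketch`, the `NullOffsetPolicy` enum, and the end offset 61 for partition 2 (assuming offset 60 is the last record) are hypothetical names and values used only for illustration; none of them are part of the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of KafkaOffsetGen. */
final class TimestampCheckpointSketch {

  /** What to use for a partition that has no record at or after the timestamp. */
  enum NullOffsetPolicy { EARLIEST, LATEST }

  /**
   * Builds a "topic,partition:offset,..." checkpoint string.
   *
   * @param offsetsForTimestamp partition -> first offset at or after the timestamp,
   *                            or null when the partition has no such record
   */
  static String buildCheckpoint(String topic,
                                Map<Integer, Long> offsetsForTimestamp,
                                Map<Integer, Long> earliestOffsets,
                                Map<Integer, Long> latestOffsets,
                                NullOffsetPolicy policy) {
    StringBuilder sb = new StringBuilder(topic);
    // Copy into a TreeMap so partitions are emitted in ascending order.
    for (Map.Entry<Integer, Long> e : new TreeMap<>(offsetsForTimestamp).entrySet()) {
      Long offset = e.getValue();
      if (offset == null) {
        // No record at or after the timestamp: fall back according to the policy.
        Map<Integer, Long> fallback =
            policy == NullOffsetPolicy.EARLIEST ? earliestOffsets : latestOffsets;
        offset = fallback.get(e.getKey());
      }
      sb.append(',').append(e.getKey()).append(':').append(offset);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Values taken from the example above (timestamp 220).
    Map<Integer, Long> byTimestamp = new HashMap<>();
    byTimestamp.put(0, 101L);
    byTimestamp.put(1, 52L);
    byTimestamp.put(2, null);

    Map<Integer, Long> earliest = new HashMap<>();
    earliest.put(2, 51L);
    Map<Integer, Long> latest = new HashMap<>();
    latest.put(2, 61L); // assumed end offset: last record (60) + 1

    // Fallback used in the diff: partition 2 restarts from its earliest offset.
    System.out.println(buildCheckpoint("A", byTimestamp, earliest, latest, NullOffsetPolicy.EARLIEST));
    // prints A,0:101,1:52,2:51

    // One possible alternative: skip the old records in partition 2 entirely.
    System.out.println(buildCheckpoint("A", byTimestamp, earliest, latest, NullOffsetPolicy.LATEST));
    // prints A,0:101,1:52,2:61
  }
}
```

The `EARLIEST` policy mirrors what the diff does today. `LATEST` is only one way to avoid re-reading records older than the timestamp; whether it is the right behavior is exactly the open question in this comment.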
[GitHub] [hudi] n3nash commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r642690647
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
@Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
public String checkpoint = null;
+ /**
+ * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+ * 2. timestamp: kafka offset timestamp
+ * example
+ * 1. hudi_topic,0:100,1:101,2:201
+ * 2. 1621947081
+ */
+ @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+ public String checkpointType = "string";
Review comment:
@nsivabalan Do we need to introduce something explicit here? Could we instead introduce a property such as `hoodie.deltastreamer.source.kafka.checkpoint.type` and not expose this change as a top-level option? This checkpoint type seems very specific to a Kafka use case, and I would like to reduce confusion in the top-level configs for users of other source types. We should document this property so that folks can set it in the properties file.
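As a rough illustration of the property-based approach suggested here, the sketch below reads the checkpoint type from a properties object and falls back to "string" when it is unset. It uses plain `java.util.Properties` as a stand-in for Hudi's own properties class, and the class and method names (`KafkaCheckpointTypeSketch`, `isTimestampCheckpoint`) are hypothetical; only the property key and the two type values come from this thread.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of the PR. */
final class KafkaCheckpointTypeSketch {

  // Property key proposed in the review comment.
  static final String CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";
  // Default mirrors the "string" default shown in the diff.
  static final String DEFAULT_CHECKPOINT_TYPE = "string";

  /** Returns true only when the property is explicitly set to "timestamp". */
  static boolean isTimestampCheckpoint(Properties props) {
    String type = props.getProperty(CHECKPOINT_TYPE_KEY, DEFAULT_CHECKPOINT_TYPE);
    return "timestamp".equalsIgnoreCase(type.trim());
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // Property absent: the regular string checkpoint is assumed.
    System.out.println(isTimestampCheckpoint(props)); // prints false

    // Equivalent to adding the line
    //   hoodie.deltastreamer.source.kafka.checkpoint.type=timestamp
    // to the properties file.
    props.setProperty(CHECKPOINT_TYPE_KEY, "timestamp");
    System.out.println(isTimestampCheckpoint(props)); // prints true
  }
}
```

Keeping the switch in the source-level properties means users of non-Kafka sources never see it on the command line, which is the concern raised above.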
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r641925742
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
@Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
public String checkpoint = null;
+ /**
+ * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
Review comment:
This format is specific to Kafka; let's call that out. Other sources could represent their checkpoints differently.
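For reference, the Kafka string checkpoint described in the Javadoc above (`topicName,partition:offset,...`) can be parsed as in the following sketch. This is an illustrative, stand-alone parser under the assumption that the topic name contains no commas; the class `KafkaCheckpointParserSketch` and its methods are hypothetical and are not the parsing code used by KafkaOffsetGen.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for illustration; not the code used by KafkaOffsetGen. */
final class KafkaCheckpointParserSketch {

  /** Returns the topic name, i.e. everything before the first comma. */
  static String topicOf(String checkpoint) {
    return checkpoint.split(",")[0];
  }

  /** Parses "topic,0:100,1:101" into {0=100, 1=101}, keeping the input order. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    if (parts.length < 2) {
      throw new IllegalArgumentException("Expected topic,partition:offset,... but got: " + checkpoint);
    }
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    // parts[0] is the topic name; every remaining entry is partition:offset.
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      if (pair.length != 2) {
        throw new IllegalArgumentException("Malformed partition entry: " + parts[i]);
      }
      offsets.put(Integer.parseInt(pair[0].trim()), Long.parseLong(pair[1].trim()));
    }
    return offsets;
  }

  public static void main(String[] args) {
    String checkpoint = "hudi_topic,0:100,1:101,2:201";
    System.out.println(topicOf(checkpoint));      // prints hudi_topic
    System.out.println(parseOffsets(checkpoint)); // prints {0=100, 1=101, 2=201}
  }
}
```

A bare value such as `1621947081` has no `partition:offset` entries and is rejected by this parser, which is one reason the two checkpoint types need to be distinguished explicitly.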
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
@Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
public String checkpoint = null;
+ /**
+ * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+ * 2. timestamp: kafka offset timestamp
+ * example
+ * 1. hudi_topic,0:100,1:101,2:201
+ * 2. 1621947081
+ */
+ @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+ public String checkpointType = "string";
Review comment:
I am torn between "string", "default", and "regular" as the default checkpoint type. @n3nash: any thoughts? We are looking to introduce a new config called checkpoint type, and it needs some default value. This patch adds a new checkpoint type, "timestamp", for the Kafka source.
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -38,6 +38,7 @@
import org.apache.hudi.common.table.timeline.HoodieTimeline;
import org.apache.hudi.common.util.Option;
import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.common.util.StringUtils;
Review comment:
Can we revert the unintended changes in this file?
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
@Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
public String checkpoint = null;
+ /**
+ * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+ * 2. timestamp: kafka offset timestamp
+ * example
+ * 1. hudi_topic,0:100,1:101,2:201
+ * 2. 1621947081
+ */
+ @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+ public String checkpointType = "string";
Review comment:
Sorry, why do we have this config in two places? It is defined as a top-level config in HoodieDeltaStreamer.Config, but in KafkaOffsetGen I see you are accessing it as "hoodie.deltastreamer.source.kafka.checkpoint.type". Maybe we should rely on this config param and remove the top-level one, since it applies only to Kafka for now.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc) (5804ad8) will **decrease** coverage by `44.90%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.72% 2.82% -44.91%
+ Complexity 5528 85 -5443
============================================
Files 934 284 -650
Lines 41457 11869 -29588
Branches 4166 986 -3180
============================================
- Hits 19786 335 -19451
+ Misses 19914 11508 -8406
+ Partials 1757 26 -1731
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [778 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [hudi] yanghua commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-771572476
CI is still failing; please check it again.
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633)
* a39570dfe0493bcd23edf911f6256e90d3b22907 UNKNOWN
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881278532
@nsivabalan I have completed the changes as you requested, please take a look~
Thank you very much for your help!
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc) (5022f1d) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc) (5804ad8) will **decrease** coverage by `3.74%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.72% 43.97% -3.75%
+ Complexity 5528 5119 -409
============================================
Files 934 934
Lines 41457 41498 +41
Branches 4166 4171 +5
============================================
- Hits 19786 18250 -1536
- Misses 19914 21625 +1711
+ Partials 1757 1623 -134
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.45% <ø> (ø)` | |
| hudicommon | `48.56% <ø> (+<0.01%)` | :arrow_up: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.44% <ø> (-0.22%)` | :arrow_down: |
| hudisync | `54.51% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [45 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer). Last update [5804ad8...5022f1d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-875975684
working....
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-811856655
> Nishith and I discussed this. Here is our proposal.
> Let's rely on Deltastreamer.Config.checkpoint to pass in any type of checkpoint.
> We can add another config called "checkpoint.type", which could default to "string" for all default checkpoints. For the checkpoint of interest in this PR, we could set the value of this new config to "timestamp".
>
> With this, it's up to each source to parse and interpret the checkpoint value, and DeltaSync does not need to deal with different checkpointing formats.
>
> Having said this, DeltaSync readFromSource() should not have any changes in this diff.
> KafkaOffsetGen should have the logic to parse different checkpoint values, based on two values (deltastreamer.config.checkpoint and checkpoint.type).
>
> With this, we also move source-specific checkpointing logic into the source-specific class and do not leak it into DeltaSync, which should be agnostic to the different sources.
>
> @liujinhui1994 : Let me know what you think. Happy to chat more on this.
Great, I will modify this PR accordingly.
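To make the proposal above concrete, here is a minimal sketch of a source interpreting one checkpoint string according to a type config. It is not code from this PR: the class `CheckpointTypeSketch`, its `Kind` enum, and the returned description strings are hypothetical names chosen for illustration, and only the "string" and "timestamp" values come from the discussion.

```java
import java.util.Locale;

/** Hypothetical sketch: a source interprets the checkpoint by its declared type. */
final class CheckpointTypeSketch {

  /** Illustrative checkpoint kinds; STRING mirrors the default discussed above. */
  enum Kind { STRING, TIMESTAMP }

  /** Maps the (assumed) checkpoint.type value to a Kind, defaulting to STRING. */
  static Kind kindOf(String configuredType) {
    if (configuredType == null || configuredType.trim().isEmpty()) {
      return Kind.STRING;
    }
    return Kind.valueOf(configuredType.trim().toUpperCase(Locale.ROOT));
  }

  /** Describes how the source would treat the checkpoint value. */
  static String interpret(String checkpoint, String configuredType) {
    switch (kindOf(configuredType)) {
      case TIMESTAMP: {
        // A timestamp checkpoint is a single number, e.g. 1622635064.
        long timestamp = Long.parseLong(checkpoint.trim());
        return "seek-to-timestamp:" + timestamp;
      }
      case STRING:
      default:
        // The default checkpoint is passed through for the source to parse.
        return "resume-from-offsets:" + checkpoint;
    }
  }
}
```

With this shape, the caller only forwards the two values, and all format knowledge stays inside the source-side class.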
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
}, {
"hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
"triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"triggerType" : "PUSH"
}, {
"hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
"triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"triggerType" : "PUSH"
}, {
"hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
"triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"triggerType" : "PUSH"
}, {
"hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775)
* a688be727d6d6beff51a3f347b9e596d982610b5 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-786452099
I will add the unit test, and then please review
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604841403
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * @param consumer
+ * @param topicName
+ * @param timestamp
+ * @return
+ */
+ private String getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, String topicName, Long timestamp) {
Review comment:
Can we add tests for the newly added code? I don't see any.
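As a rough illustration of what such a test could check, the sketch below models the timestamp lookup without a Kafka broker. It assumes the lookup follows the semantics of Kafka's `offsetsForTimes` (the earliest offset whose record timestamp is greater than or equal to the target). The class `TimestampCheckpointSketch`, its methods, and the fallback to the log end when no record qualifies are assumptions made for illustration, not the PR's implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

/** Hypothetical in-memory model of resolving a timestamp to per-partition offsets. */
final class TimestampCheckpointSketch {

  /**
   * Returns the earliest offset whose timestamp is >= target.
   * Falls back to the log end (list size) when no record qualifies; this
   * fallback is an assumption of the sketch.
   */
  static long firstOffsetAtOrAfter(List<Long> timestampsByOffset, long target) {
    for (int offset = 0; offset < timestampsByOffset.size(); offset++) {
      if (timestampsByOffset.get(offset) >= target) {
        return offset;
      }
    }
    return timestampsByOffset.size();
  }

  /** Builds a checkpoint string of the form topicName,0:123,1:456. */
  static String checkpointFor(String topic, Map<Integer, List<Long>> partitions, long target) {
    StringJoiner joiner = new StringJoiner(",");
    joiner.add(topic);
    // Sort by partition id so the output is deterministic.
    for (Map.Entry<Integer, List<Long>> entry : new TreeMap<>(partitions).entrySet()) {
      joiner.add(entry.getKey() + ":" + firstOffsetAtOrAfter(entry.getValue(), target));
    }
    return joiner.toString();
  }
}
```

A test along these lines would cover a timestamp inside the log, one before the first record, and one after the last record for each partition.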
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881203019
> Let's try to land this by the weekend. It's been hanging for quite some time.
OK.
Sorry for the delay, I'll deal with it now.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c705ce5) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `20.21%`.
> The diff coverage is `89.09%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
=============================================
- Coverage 47.72% 27.50% -20.22%
+ Complexity 5528 1302 -4226
=============================================
Files 934 386 -548
Lines 41457 15377 -26080
Branches 4166 1343 -2823
=============================================
- Hits 19786 4230 -15556
+ Misses 19914 10842 -9072
+ Partials 1757 305 -1452
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.91% <ø> (-13.54%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.77% <89.09%> (+0.50%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `88.13% <91.48%> (+0.45%)` | :arrow_up: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [630 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...c705ce5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
}, {
"hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
"triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"triggerType" : "PUSH"
}, {
"hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
"triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"triggerType" : "PUSH"
}, {
"hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
"triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"triggerType" : "PUSH"
}, {
"hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838",
"triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"triggerType" : "PUSH"
}, {
"hash" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839",
"triggerID" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
"triggerType" : "PUSH"
}, {
"hash" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856",
"triggerID" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
"triggerType" : "PUSH"
}, {
"hash" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858",
"triggerID" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
"triggerType" : "PUSH"
}, {
"hash" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863",
"triggerID" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
"triggerType" : "PUSH"
}, {
"hash" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* b77b63994db2e91853a06d3a5c4c129a21feefcf Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863)
* e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-852970149
I am currently facing a problem and would like to hear your opinion.
After we add this config, hoodie.deltastreamer.source.kafka.checkpoint.type=timestamp, should deltastreamer.checkpoint.key keep its current format? The format is still: topicName,0:123,1:456
If we keep that format, then when a user specifies, for example, --checkpoint 1622635064, we need to determine the relationship between commitMetadata.getMetadata(CHECKPOINT_KEY) and --checkpoint 1622635064 in org.apache.hudi.utilities.deltastreamer.DeltaSync#readFromSource. This seems to contradict the outcome of our discussion, which was not to add Kafka-dependent code to DeltaSync.
Do you have any suggestions for this? Thanks.
@nsivabalan
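For reference, the two checkpoint shapes in question can be told apart by form alone, which is one way the parsing could stay on the Kafka side. The sketch below is only an illustration of that idea under stated assumptions: the class `OffsetCheckpointSketch` and its methods are hypothetical, and nothing here is taken from the PR or from DeltaSync.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helpers for the two checkpoint shapes mentioned above. */
final class OffsetCheckpointSketch {

  /** True when the value is a bare number such as 1622635064. */
  static boolean looksLikeTimestamp(String checkpoint) {
    return checkpoint != null && checkpoint.trim().matches("\\d+");
  }

  /** Parses topicName,0:123,1:456 into a partition-to-offset map. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new TreeMap<>();
    // parts[0] is the topic name; the remaining entries are partition:offset.
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      if (pair.length != 2) {
        throw new IllegalArgumentException("Malformed entry: " + parts[i]);
      }
      offsets.put(Integer.parseInt(pair[0].trim()), Long.parseLong(pair[1].trim()));
    }
    return offsets;
  }
}
```

Whether shape detection or the explicit checkpoint.type config should decide the interpretation is a design choice for the PR; the config is the more explicit of the two.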
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
}, {
"hash" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633",
"triggerID" : "1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657",
"triggerType" : "PUSH"
}, {
"hash" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638",
"triggerID" : "a39570dfe0493bcd23edf911f6256e90d3b22907",
"triggerType" : "PUSH"
}, {
"hash" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775",
"triggerID" : "bf50481b923dbaa14be994bd0cc45bbe22ff8524",
"triggerType" : "PUSH"
}, {
"hash" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838",
"triggerID" : "a688be727d6d6beff51a3f347b9e596d982610b5",
"triggerType" : "PUSH"
}, {
"hash" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839",
"triggerID" : "67041c2d836e61355aea26bd24f91548ec5e92ce",
"triggerType" : "PUSH"
}, {
"hash" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856",
"triggerID" : "8bc0333e4fc14158b126da1f7b14f6c43a3abfb8",
"triggerType" : "PUSH"
}, {
"hash" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858",
"triggerID" : "5022f1d97e4e9b140d8e41b5b49c034ceb9ae601",
"triggerType" : "PUSH"
}, {
"hash" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863",
"triggerID" : "b77b63994db2e91853a06d3a5c4c129a21feefcf",
"triggerType" : "PUSH"
}, {
"hash" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959",
"triggerID" : "e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00",
"triggerType" : "PUSH"
}, {
"hash" : "c705ce5d409b139a14f22bef3ecdc189fa90f562",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "c705ce5d409b139a14f22bef3ecdc189fa90f562",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959)
* c705ce5d409b139a14f22bef3ecdc189fa90f562 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (739a252) into [master](https://codecov.io/gh/apache/hudi/commit/7fed7352bd506e20e5316bb0b3ed9e5c1e9c76df?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7fed735) will **decrease** coverage by `1.49%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 54.96% 53.47% -1.50%
+ Complexity 3844 3459 -385
============================================
Files 485 431 -54
Lines 23437 21421 -2016
Branches 2494 2253 -241
============================================
- Hits 12882 11454 -1428
+ Misses 9401 8947 -454
+ Partials 1154 1020 -134
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.55% <ø> (ø)` | |
| hudiclient | `∅ <ø> (∅)` | |
| hudicommon | `50.29% <ø> (ø)` | |
| hudiflink | `63.41% <ø> (ø)` | |
| hudihadoopmr | `51.54% <ø> (ø)` | |
| hudisparkdatasource | `73.33% <ø> (ø)` | |
| hudisync | `46.44% <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...di/utilities/sources/helpers/IncrSourceHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9JbmNyU291cmNlSGVscGVyLmphdmE=) | | |
| [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | | |
| [...i/utilities/deltastreamer/SourceFormatAdapter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvU291cmNlRm9ybWF0QWRhcHRlci5qYXZh) | | |
| [...s/exception/HoodieIncrementalPullSQLException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxTUUxFeGNlcHRpb24uamF2YQ==) | | |
| [...alCheckpointFromAnotherHoodieTimelineProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2NoZWNrcG9pbnRpbmcvSW5pdGlhbENoZWNrcG9pbnRGcm9tQW5vdGhlckhvb2RpZVRpbWVsaW5lUHJvdmlkZXIuamF2YQ==) | | |
| [...g/apache/hudi/utilities/schema/SchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlci5qYXZh) | | |
| [...ities/schema/NullTargetSchemaRegistryProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9OdWxsVGFyZ2V0U2NoZW1hUmVnaXN0cnlQcm92aWRlci5qYXZh) | | |
| [...ties/exception/HoodieIncrementalPullException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxFeGNlcHRpb24uamF2YQ==) | | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | | |
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | | |
| ... and [41 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
--
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
}, {
"hash" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400",
"triggerID" : "ffd30f564c780a25ddccf8c5bc819d4eed9b437a",
"triggerType" : "PUSH"
}, {
"hash" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565",
"triggerID" : "5e8ab52b0e139333c4c003932c55ff6e88302206",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400)
* 5e8ab52b0e139333c4c003932c55ff6e88302206 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=565)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
--
[GitHub] [hudi] yanghua commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-771572476
CI still failed, check it again.
----------------------------------------------------------------
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-845791067
> @liujinhui1994 : were you able to make progress on this. would be nice to have this in before next release.
Sorry, I was too busy with work before. I have just sorted out the whole idea of this PR and clarified the goal, and will start soon.
--
[GitHub] [hudi] codecov-io edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc) (87125e7) into [master](https://codecov.io/gh/apache/hudi/commit/c4afd179c1983a382b8a5197d800b0f5dba254de?el=desc) (c4afd17) will **decrease** coverage by `6.14%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 50.18% 44.03% -6.15%
+ Complexity 3050 2741 -309
============================================
Files 419 419
Lines 18931 18949 +18
Branches 1948 1953 +5
============================================
- Hits 9500 8345 -1155
- Misses 8656 9949 +1293
+ Partials 775 655 -120
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `37.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `51.47% <ø> (-0.03%)` | `0.00 <ø> (ø)` | |
| hudiflink | `0.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `65.85% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `48.61% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `66.49% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `9.59% <0.00%> (-59.84%)` | `0.00 <0.00> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-70.51%)` | `0.00 <0.00> (-50.00)` | |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-88.78%)` | `0.00 <0.00> (-16.00)` | |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| ... and [33 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=257",
"triggerID" : "70101da19852dbb2e2d850c59942b4395ceaa390",
"triggerType" : "PUSH"
}, {
"hash" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269",
"triggerID" : "00d29d85f32f376ef44cb99d49f605a4af6f798c",
"triggerType" : "PUSH"
}, {
"hash" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373",
"triggerID" : "ea5ed9da433064022a69e06c98f58fc10c09e8b6",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 00d29d85f32f376ef44cb99d49f605a4af6f798c Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=269)
* ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
--
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605394316
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -247,6 +266,32 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * @param consumer
+ * @param topicName
+ * @param timestamp
+ * @return
+ */
+ private String getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, String topicName, Long timestamp) {
Review comment:
Once the implementation plan is confirmed, I will add tests promptly.
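For readers following the review, here is a minimal, self-contained sketch of the idea behind `getOffsetsByTimestamp`: for each partition, pick the earliest offset whose record timestamp is at or after the requested timestamp, then join the results into a checkpoint string. It does not use the Kafka client. The class name, the in-memory timestamp index, the fallback to the end offset, and the `topic,partition:offset,...` string layout are all assumptions made for illustration, not the code in this PR.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Hudi or of this PR.
class TimestampCheckpointSketch {

    // Earliest offset whose timestamp is >= the requested timestamp.
    // Falls back to endOffset when no such record exists (an assumption).
    static long offsetForTime(NavigableMap<Long, Long> timestampToOffset,
                              long timestamp, long endOffset) {
        Map.Entry<Long, Long> entry = timestampToOffset.ceilingEntry(timestamp);
        return entry != null ? entry.getValue() : endOffset;
    }

    // Builds "topic,partition:offset,partition:offset" in partition order.
    // The string layout is assumed for this sketch.
    static String checkpointFor(String topic,
                                Map<Integer, NavigableMap<Long, Long>> partitions,
                                Map<Integer, Long> endOffsets,
                                long timestamp) {
        Map<Integer, NavigableMap<Long, Long>> sorted = new TreeMap<>(partitions);
        StringJoiner joiner = new StringJoiner(",");
        joiner.add(topic);
        for (Map.Entry<Integer, NavigableMap<Long, Long>> p : sorted.entrySet()) {
            long offset = offsetForTime(p.getValue(), timestamp, endOffsets.get(p.getKey()));
            joiner.add(p.getKey() + ":" + offset);
        }
        return joiner.toString();
    }
}
```

In a real implementation the per-partition lookup would be delegated to the Kafka consumer rather than to an in-memory map; the sketch only shows the shape of the computation that tests for this method could assert on.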
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bf50481) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `1.00%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.61% 46.60% -1.01%
+ Complexity 5487 5009 -478
============================================
Files 924 862 -62
Lines 41206 38238 -2968
Branches 4133 3814 -319
============================================
- Hits 19619 17822 -1797
+ Misses 19844 18830 -1014
+ Partials 1743 1586 -157
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.60% <ø> (+0.02%)` | :arrow_up: |
| hudicommon | `48.57% <ø> (+<0.01%)` | :arrow_up: |
| hudiflink | `59.58% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.23% <ø> (-0.10%)` | :arrow_down: |
| hudisync | `50.59% <ø> (-3.90%)` | :arrow_down: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
| [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
| [...org/apache/hudi/HoodieDatasetBulkInsertHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvSG9vZGllRGF0YXNldEJ1bGtJbnNlcnRIZWxwZXIuamF2YQ==) | `96.77% <0.00%> (-0.20%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.88% <0.00%> (ø)` | |
| [...n/java/org/apache/hudi/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0RlZmF1bHRTb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/spark3/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9EZWZhdWx0U291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...spark3/internal/HoodieDataSourceInternalTable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxUYWJsZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0hvb2RpZUJ1bGtJbnNlcnREYXRhSW50ZXJuYWxXcml0ZXJGYWN0b3J5LmphdmE=) | `100.00% <0.00%> (ø)` | |
| [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVCdWxrSW5zZXJ0RGF0YUludGVybmFsV3JpdGVyRmFjdG9yeS5qYXZh) | `100.00% <0.00%> (ø)` | |
| ... and [75 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...bf50481](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
Guess we can simplify things. Let me go over some pseudo code of interest.
within DeltaSync.read()
```
// set right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```
Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the second (`else if`) condition deals with Checkpoint_Key.
I have left out some exception cases, but this should give you the gist.
within write()
```
// towards the end
commitMetadata.put(Checkpoint_Key, <updated checkpoint after writing>);
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is parsed from commitMetadata.
With this PR, the only addition is a new checkpoint type. Let me propose a simple add-on to the above code that would work for us.
Proposed code with this patch:
within DeltaSync.read()
```
// set right checkpoint value
boolean resetCheckpointType = true; // New addition
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false; // New addition
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (resetCheckpointType) {
  // reset checkpoint type if set
}
```
No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, and no Checkpoint_RESET_Key is set yet. From the second batch onward, it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata, and can still decide whether to use the checkpoint type.
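The read and write steps described in this comment can be sketched as a small, self-contained Java example. This is only an illustration of the proposed decision logic under simplifying assumptions: commit metadata is modeled as a plain `Map<String, String>`, the key names and the `Resolved` holder are hypothetical, and the reset key stores the configured checkpoint as its value. It is not the actual DeltaSync code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; names do not come from Hudi.
class CheckpointResolution {
    static final String CHECKPOINT_KEY = "checkpoint.key";
    static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

    // Outcome of the read-side decision.
    static final class Resolved {
        final Optional<String> checkpoint;
        final boolean useCheckpointType;

        Resolved(Optional<String> checkpoint, boolean useCheckpointType) {
            this.checkpoint = checkpoint;
            this.useCheckpointType = useCheckpointType;
        }
    }

    // Read side: honor cfg.checkpoint (and its type) only while no reset key
    // has been committed; otherwise resume from the committed checkpoint.
    static Resolved resolve(String cfgCheckpoint, Map<String, String> commitMetadata) {
        if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
            return new Resolved(Optional.of(cfgCheckpoint), true);
        } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
            return new Resolved(Optional.of(commitMetadata.get(CHECKPOINT_KEY)), false);
        }
        return new Resolved(Optional.empty(), false);
    }

    // Write side: record the new checkpoint, and mark the configured one as consumed.
    static Map<String, String> commit(String cfgCheckpoint, String updatedCheckpoint) {
        Map<String, String> commitMetadata = new HashMap<>();
        commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
        if (cfgCheckpoint != null) {
            commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
        }
        return commitMetadata;
    }
}
```

With an empty commit history, `resolve` returns the configured checkpoint with `useCheckpointType` set. After one `commit`, the same configuration resolves to the committed checkpoint with the type ignored, which is the hand-in-hand behavior the comment relies on.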
--
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-851604627
sure, sounds good.
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `20.23%`.
> The diff coverage is `85.45%`.
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
=============================================
- Coverage 47.72% 27.49% -20.24%
+ Complexity 5528 1301 -4227
=============================================
Files 934 386 -548
Lines 41457 15377 -26080
Branches 4166 1343 -2823
=============================================
- Hits 19786 4228 -15558
+ Misses 19914 10843 -9071
+ Partials 1757 306 -1451
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `20.91% <ø> (-13.54%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [630 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] n3nash commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r642690647
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -326,6 +328,16 @@ private boolean onDeltaSyncShutdown(boolean error) {
@Parameter(names = {"--checkpoint"}, description = "Resume Delta Streamer from this checkpoint.")
public String checkpoint = null;
+ /**
+ * 1. string: topicName,partition number 0:offset value,partition number 1:offset value
+ * 2. timestamp: kafka offset timestamp
+ * example
+ * 1. hudi_topic,0:100,1:101,2:201
+ * 2. 1621947081
+ */
+ @Parameter(names = {"--checkpoint-type"}, description = "Checkpoint type, divided into timestamp or string offset")
+ public String checkpointType = "string";
Review comment:
@nsivabalan Do we need to introduce something explicit here? Can we instead introduce another property, such as `hoodie.deltastreamer.source.kafka.checkpoint.type`, and not expose this change as a top-level option? This checkpoint type seems very specific to a Kafka use case, and I would like to reduce confusion in the top-level configs for users of other source types.
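To illustrate the property-based alternative, here is a minimal sketch that reads the checkpoint type from source-level properties instead of a top-level CLI flag. The property name is the one suggested in the comment, and the values `string` and `timestamp` come from the diff above; the class, enum, and method names are hypothetical, and plain `java.util.Properties` stands in for whatever properties type the source would actually receive.

```java
import java.util.Locale;
import java.util.Properties;

/** Sketch only: a source-scoped property lookup, not the PR's implementation. */
final class KafkaCheckpointTypeSketch {
  // Property name as proposed in the review comment.
  static final String CHECKPOINT_TYPE_PROP = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  /** Hypothetical enum covering the two types named in the diff. */
  enum CheckpointType { STRING, TIMESTAMP }

  static CheckpointType checkpointType(Properties props) {
    // Default to "string", matching the default of the proposed --checkpoint-type flag.
    String raw = props.getProperty(CHECKPOINT_TYPE_PROP, "string");
    // valueOf throws IllegalArgumentException for unsupported values.
    return CheckpointType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(checkpointType(props)); // prints "STRING"
    props.setProperty(CHECKPOINT_TYPE_PROP, "timestamp");
    System.out.println(checkpointType(props)); // prints "TIMESTAMP"
  }
}
```

Keeping the setting under the Kafka source namespace means users of other sources never see it, which is the concern raised in the comment.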
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633)
[GitHub] [hudi] yanghua commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759520009
@wangxianghu Can you review this PR first?
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839)
* 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856)
* 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 UNKNOWN
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670998409
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -282,6 +301,36 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * @param consumer
Review comment:
OK, I will add it immediately
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `0.92%`.
> The diff coverage is `n/a`.
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.61% 46.68% -0.93%
+ Complexity 5487 5039 -448
============================================
Files 924 867 -57
Lines 41206 38791 -2415
Branches 4133 3927 -206
============================================
- Hits 19619 18110 -1509
+ Misses 19844 19079 -765
+ Partials 1743 1602 -141
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.20% <ø> (-0.38%)` | :arrow_down: |
| hudicommon | `48.58% <ø> (+0.02%)` | :arrow_up: |
| hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `68.36% <ø> (+1.03%)` | :arrow_up: |
| hudisync | `50.55% <ø> (-3.93%)` | :arrow_down: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
| [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `67.74% <0.00%> (-3.54%)` | :arrow_down: |
| [...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh) | `81.53% <0.00%> (-3.37%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `39.57% <0.00%> (-3.31%)` | :arrow_down: |
| [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
| [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
| [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
| [...in/java/org/apache/hudi/table/HoodieTableSink.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZVNpbmsuamF2YQ==) | `10.52% <0.00%> (ø)` | |
| ... and [103 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-782588952
@yanghua @wangxianghu @nsivabalan
I have verified this; please help review.
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r557069074
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -182,6 +187,10 @@ public KafkaOffsetGen(TypedProperties props) {
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
// Determine the offset ranges to read from
+ if (kafkaCheckpointTimestamp != null) {
+ lastCheckpointStr = Option.of(getOffsetsByTimestamp(consumer, partitionInfoList, topicName, Long.parseLong(kafkaCheckpointTimestamp)));
+ }
+
if (lastCheckpointStr.isPresent() && !lastCheckpointStr.get().isEmpty()) {
Review comment:
I'll deal with it now.
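For readers following the hunk above, here is a rough sketch of how a timestamp could be turned into the `topicName,partition:offset,...` checkpoint string that the rest of the offset logic expects. It is not the PR's `getOffsetsByTimestamp`: the in-memory timestamp arrays are a stand-in for a broker lookup (Kafka's consumer offers `offsetsForTimes` for this), all names are hypothetical, and falling back to the end offset when no record is new enough is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: timestamp -> per-partition offsets -> checkpoint string. */
final class TimestampCheckpointSketch {

  /**
   * For each partition, finds the first offset whose record timestamp is >= the target.
   * Array index plays the role of the offset. Assumption: if no record qualifies,
   * the end offset (array length) is used.
   */
  static Map<Integer, Long> offsetsAtOrAfter(Map<Integer, long[]> timestampsByPartition, long timestamp) {
    Map<Integer, Long> offsets = new TreeMap<>();
    for (Map.Entry<Integer, long[]> entry : timestampsByPartition.entrySet()) {
      long[] recordTimestamps = entry.getValue();
      int offset = 0;
      while (offset < recordTimestamps.length && recordTimestamps[offset] < timestamp) {
        offset++;
      }
      offsets.put(entry.getKey(), (long) offset);
    }
    return offsets;
  }

  /** Formats offsets as "topic,0:100,1:101", with partitions in ascending order. */
  static String toCheckpoint(String topic, Map<Integer, Long> offsets) {
    StringBuilder sb = new StringBuilder(topic);
    for (Map.Entry<Integer, Long> entry : new TreeMap<>(offsets).entrySet()) {
      sb.append(',').append(entry.getKey()).append(':').append(entry.getValue());
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<Integer, long[]> log = new TreeMap<>();
    log.put(0, new long[] {10L, 20L, 30L});
    log.put(1, new long[] {5L, 25L});
    System.out.println(toCheckpoint("hudi_topic", offsetsAtOrAfter(log, 20L))); // prints "hudi_topic,0:1,1:1"
  }
}
```

Producing the same string format as a regular checkpoint lets the existing `lastCheckpointStr` branch handle both cases without further changes.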
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864418016
Actually, we can make it even simpler.
```
// set right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (commitMetadata.contains(Checkpoint_RESET_Key)) {
  // reset checkpoint type if set
}
```
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959)
* c705ce5d409b139a14f22bef3ecdc189fa90f562 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=962)
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671069986
##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestKafkaOffsetGen.java
##########
@@ -64,7 +63,7 @@ public void teardown() throws Exception {
private TypedProperties getConsumerConfigs(String autoOffsetReset) {
TypedProperties props = new TypedProperties();
- props.put(Config.KAFKA_AUTO_OFFSET_RESET, autoOffsetReset);
+ props.put("auto.offset.reset", autoOffsetReset);
Review comment:
This is necessary and has already been added.
It better guarantees the correctness of the procedure.
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564
Good point.
Tell me if my understanding of how timestamp-based checkpointing is used is right in general.
The user would like to use timestamp-based checkpointing in DeltaStreamer only for the bootstrap case.
From then on, checkpointing will use the regular Kafka checkpoint format of "topicName,0:123,1:456".
If that understanding is correct, then within KafkaOffsetGen we might have to parse the checkpoint as a timestamp the first time (bootstrap), but from the second run onward we fall back to the regular checkpoint parsing mechanism.
I see we have InitialCheckPointProvider. Let me think about how to go about this and I will get back to you. For now, this is what I can think of.
InitialCheckPointProvider will expose a getCheckpointType() method,
and we add it as a property to the configs if an InitialCheckPointProvider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132).
Within readFromSource() in DeltaSync, if the checkpoint is fetched from commit metadata, we may not honor this checkpoint type, or we will clear the checkpoint type property if it is set.
But if it is fetched from cfg.checkpoint, we will leave the property as is and let KafkaOffsetGen handle checkpoint parsing.
Let me think this through some more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used.
CC @n3nash @bvaradar
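To make the two checkpoint shapes discussed above concrete, here is a small Java sketch that tells a bare timestamp apart from the regular "topicName,0:123,1:456" string and parses the latter into a partition-to-offset map. The class `KafkaCheckpointFormats` and its methods are hypothetical helpers written for this illustration; they are not the parsing code in KafkaOffsetGen, and treating an all-digit string as epoch milliseconds is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical helpers, not Hudi's checkpoint parsing code.
final class KafkaCheckpointFormats {

    /** Assumption: a checkpoint made only of digits is a timestamp (e.g. epoch millis). */
    static boolean looksLikeTimestamp(String checkpoint) {
        return checkpoint != null && checkpoint.matches("\\d+");
    }

    /** Returns the topic name from a regular checkpoint such as "topicName,0:123,1:456". */
    static String topicOf(String checkpoint) {
        return checkpoint.split(",")[0];
    }

    /** Parses the "partition:offset" pairs of a regular checkpoint, in the order they appear. */
    static Map<Integer, Long> parseOffsets(String checkpoint) {
        String[] parts = checkpoint.split(",");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected topic,partition:offset[,...] but got: " + checkpoint);
        }
        Map<Integer, Long> offsets = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            String[] pair = parts[i].split(":");
            if (pair.length != 2) {
                throw new IllegalArgumentException("Malformed partition:offset pair: " + parts[i]);
            }
            offsets.put(Integer.parseInt(pair[0]), Long.parseLong(pair[1]));
        }
        return offsets;
    }

    public static void main(String[] args) {
        String regular = "topicName,0:123,1:456";
        System.out.println(looksLikeTimestamp(regular));
        System.out.println(topicOf(regular) + " -> " + parseOffsets(regular));
    }
}
```

Under this sketch, a first (bootstrap) run would pass a value for which `looksLikeTimestamp` is true, while every later run would carry a string accepted by `parseOffsets`.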
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638)
* bf50481b923dbaa14be994bd0cc45bbe22ff8524 UNKNOWN
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (bf50481) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `1.37%`.
> The diff coverage is `n/a`.
```diff
@@             Coverage Diff              @@
##             master    #2438      +/-   ##
============================================
- Coverage     47.61%   46.23%   -1.38%
+ Complexity     5487     4760     -727
============================================
  Files           924      833      -91
  Lines         41206    36348    -4858
  Branches       4133     3623     -510
============================================
- Hits          19619    16805    -2814
+ Misses        19844    18052    -1792
+ Partials       1743     1491     -252
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.60% <ø> (+0.02%)` | :arrow_up: |
| hudicommon | `48.57% <ø> (+<0.01%)` | :arrow_up: |
| hudiflink | `59.58% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.23% <ø> (-0.10%)` | :arrow_down: |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
| [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
| [...org/apache/hudi/HoodieDatasetBulkInsertHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvSG9vZGllRGF0YXNldEJ1bGtJbnNlcnRIZWxwZXIuamF2YQ==) | `96.77% <0.00%> (-0.20%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.88% <0.00%> (ø)` | |
| [...n/java/org/apache/hudi/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0RlZmF1bHRTb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/spark3/internal/DefaultSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9EZWZhdWx0U291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...spark3/internal/HoodieDataSourceInternalTable.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxUYWJsZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0hvb2RpZUJ1bGtJbnNlcnREYXRhSW50ZXJuYWxXcml0ZXJGYWN0b3J5LmphdmE=) | `100.00% <0.00%> (ø)` | |
| [...nal/HoodieBulkInsertDataInternalWriterFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVCdWxrSW5zZXJ0RGF0YUludGVybmFsV3JpdGVyRmFjdG9yeS5qYXZh) | `100.00% <0.00%> (ø)` | |
| [...nal/HoodieDataSourceInternalBatchWriteBuilder.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxCYXRjaFdyaXRlQnVpbGRlci5qYXZh) | `0.00% <0.00%> (ø)` | |
| ... and [103 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...bf50481](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-787662595
The current implementation is mainly in KafkaOffsetGen @wangxianghu
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* c705ce5d409b139a14f22bef3ecdc189fa90f562 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=962)
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670997475
##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestKafkaSource.java
##########
@@ -193,7 +193,7 @@ public void testJsonKafkaSourceWithDefaultUpperCap() {
Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics);
SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource);
- Config.maxEventsFromKafkaSource = 500;
+ //props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");
Review comment:
`//props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500")`
should not appear here.
Sorry, my mistake.
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671353717
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
Set<TopicPartition> topicPartitions = partitionInfoList.stream()
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
+ if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+ lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+ }
Review comment:
Not sure I understand. This is what I am thinking:
```
if (timestamp-based checkpoint)
  lastCheckpoint = getOffsetsByTimestamp()
else if (regular checkpoint type)
  lastCheckpoint = fetchValidOffsets()
else
  reset based on auto.offset.reset
```
Am I misunderstanding anything here?
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r671353717
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
Set<TopicPartition> topicPartitions = partitionInfoList.stream()
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
+ if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+ lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+ }
Review comment:
Not sure I understand. This is what I am thinking:
```
if (timestamp-based checkpoint)
  lastCheckpoint = getOffsetsByTimestamp()
else if (regular checkpoint type)
  lastCheckpoint = fetchValidOffsets()
else
  reset based on auto.offset.reset
```
Am I misunderstanding anything here? Can you help me understand, please?
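The three-way decision proposed in this review comment can be sketched as a small, self-contained Java selector. This is only an illustration of the control flow: the enum, the class name `OffsetStrategySelector`, and the literal `"timestamp"` used for the checkpoint type are assumptions, and no Kafka consumer is involved, so the sketch only names which path would be taken.

```java
import java.util.Optional;

// Illustrative sketch only: names and the "timestamp" literal are hypothetical,
// and the actual offset lookups against Kafka are deliberately left out.
final class OffsetStrategySelector {

    enum Strategy {
        OFFSETS_BY_TIMESTAMP,          // resolve offsets from a timestamp checkpoint
        VALID_OFFSETS_FROM_CHECKPOINT, // use the regular "topic,partition:offset" checkpoint
        AUTO_OFFSET_RESET              // no checkpoint: fall back to auto.offset.reset
    }

    static Strategy select(String checkpointType, Optional<String> lastCheckpoint) {
        if (!lastCheckpoint.isPresent()) {
            return Strategy.AUTO_OFFSET_RESET;
        }
        // Assumption: a timestamp checkpoint is a string of digits only.
        boolean isTimestamp = lastCheckpoint.get().matches("\\d+");
        if ("timestamp".equals(checkpointType) && isTimestamp) {
            return Strategy.OFFSETS_BY_TIMESTAMP;
        }
        return Strategy.VALID_OFFSETS_FROM_CHECKPOINT;
    }

    public static void main(String[] args) {
        System.out.println(select("timestamp", Optional.of("1626000000000")));
        System.out.println(select(null, Optional.of("topicName,0:123,1:456")));
        System.out.println(select(null, Optional.<String>empty()));
    }
}
```

Keeping the selection separate from the lookups makes each branch easy to unit test before any consumer call is wired in.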
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (ffd30f5) into [master](https://codecov.io/gh/apache/hudi/commit/0b57483a8e41742689a1362aa94aabb94a1361b3?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0b57483) will **decrease** coverage by `3.02%`.
> The diff coverage is `n/a`.
```diff
@@             Coverage Diff              @@
##             master    #2438      +/-   ##
============================================
- Coverage     45.76%   42.74%   -3.03%
+ Complexity     5261     4070    -1191
============================================
  Files           909      753     -156
  Lines         39353    33259    -6094
  Branches       4239     3603     -636
============================================
- Hits          18010    14215    -3795
+ Misses        19499    17467    -2032
+ Partials       1844     1577     -267
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.95% <ø> (ø)` | |
| hudiclient | `16.45% <ø> (-13.95%)` | :arrow_down: |
| hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
| hudiflink | `61.26% <ø> (+0.45%)` | :arrow_up: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.00% <ø> (+0.52%)` | :arrow_up: |
| hudisync | `47.11% <ø> (-4.35%)` | :arrow_down: |
| huditimelineservice | `64.36% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [.../org/apache/hudi/sink/compact/CompactFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL2NvbXBhY3QvQ29tcGFjdEZ1bmN0aW9uLmphdmE=) | `86.66% <0.00%> (-13.34%)` | :arrow_down: |
| [...e/hudi/sink/partitioner/profile/WriteProfiles.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlcy5qYXZh) | `50.00% <0.00%> (-5.89%)` | :arrow_down: |
| [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | `75.22% <0.00%> (-2.23%)` | :arrow_down: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [...c/main/java/org/apache/hudi/util/StreamerUtil.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL1N0cmVhbWVyVXRpbC5qYXZh) | `55.00% <0.00%> (-1.42%)` | :arrow_down: |
| [...i/common/table/timeline/HoodieDefaultTimeline.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL0hvb2RpZURlZmF1bHRUaW1lbGluZS5qYXZh) | `79.22% <0.00%> (-1.30%)` | :arrow_down: |
| [...java/org/apache/hudi/sink/StreamWriteFunction.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL1N0cmVhbVdyaXRlRnVuY3Rpb24uamF2YQ==) | `84.34% <0.00%> (-0.66%)` | :arrow_down: |
| [...he/hudi/sink/partitioner/profile/WriteProfile.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3BhcnRpdGlvbmVyL3Byb2ZpbGUvV3JpdGVQcm9maWxlLmphdmE=) | `87.50% <0.00%> (-0.50%)` | :arrow_down: |
| [...va/org/apache/hudi/configuration/FlinkOptions.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9jb25maWd1cmF0aW9uL0ZsaW5rT3B0aW9ucy5qYXZh) | `96.37% <0.00%> (-0.05%)` | :arrow_down: |
| ... and [189 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [0b57483...ffd30f5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] wangxianghu edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759873110
@liujinhui1994 IMO, we can accept both offset and timestamp checkpoints through `--checkpoint`, and add a new param named `checkpointType` (defaulting to the offset type if not configured) to tell Hudi which checkpoint type the user supplied. WDYT?
Also, please check why CI failed, BTW.
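A minimal sketch of the defaulting behaviour suggested above, assuming a hypothetical `CheckpointType` enum and `resolve` helper (neither name comes from the PR): an unset or blank value falls back to the offset type, and an unknown value is rejected.

```java
import java.util.Locale;

public class CheckpointTypeSketch {

    /** Hypothetical checkpoint kinds; the names are illustrative, not taken from the PR. */
    public enum CheckpointType { OFFSET, TIMESTAMP }

    /**
     * Resolves the configured checkpoint type.
     * A null or blank value defaults to OFFSET; an unrecognised value
     * makes Enum.valueOf throw IllegalArgumentException.
     */
    public static CheckpointType resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return CheckpointType.OFFSET;
        }
        return CheckpointType.valueOf(configured.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));        // prints OFFSET
        System.out.println(resolve("timestamp")); // prints TIMESTAMP
    }
}
```

Failing fast on an unknown type is one possible design choice here; silently falling back to the offset type would hide a misconfigured job.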
----------------------------------------------------------------
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r663377708
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -282,6 +301,36 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * @param consumer
Review comment:
Can you please add some documentation on what's happening here (format, etc.)? An example would be great.
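As one illustration of the kind of example requested, here is a self-contained sketch of the lookup semantics, not the PR's code. It assumes the timestamp is resolved per partition to the earliest offset whose record timestamp is greater than or equal to the requested one (the behaviour Kafka documents for `KafkaConsumer.offsetsForTimes`), and that the result is rendered as a `topic,partition:offset,...` checkpoint string. The class and method names are hypothetical, and the in-memory arrays stand in for a real partition.

```java
import java.util.Map;
import java.util.TreeMap;

public class TimestampCheckpointSketch {

    /**
     * Simulates a per-partition timestamp lookup.
     * timestamps[i] is the timestamp of the record at offset firstOffset + i.
     * Returns the earliest offset whose timestamp is >= targetTs, or -1 if none
     * (the real Kafka client reports "no match" as a null entry instead).
     */
    public static long earliestOffsetAtOrAfter(long[] timestamps, long firstOffset, long targetTs) {
        for (int i = 0; i < timestamps.length; i++) {
            if (timestamps[i] >= targetTs) {
                return firstOffset + i;
            }
        }
        return -1L;
    }

    /** Renders offsets as "topic,partition:offset,..." ordered by partition id. */
    public static String toCheckpoint(String topic, Map<Integer, Long> offsets) {
        StringBuilder sb = new StringBuilder(topic);
        for (Map.Entry<Integer, Long> e : new TreeMap<>(offsets).entrySet()) {
            sb.append(',').append(e.getKey()).append(':').append(e.getValue());
        }
        return sb.toString();
    }
}
```

A linear scan is used rather than a binary search because record timestamps within a partition are not guaranteed to be monotonic.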
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -270,6 +280,15 @@ public KafkaOffsetGen(TypedProperties props) {
return checkpointOffsetReseter ? earliestOffsets : checkpointOffsets;
}
+ private Boolean checkLastCheckpointType(Option<String> lastCheckpointStr) {
Review comment:
Should we name this `isValidCheckpointType` or something? Also, can you add Javadocs describing what validation we are doing here?
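A hedged sketch of what such a validation might check, assuming the offset checkpoint has the form `topic,partition:offset[,partition:offset...]` and a timestamp checkpoint is a bare epoch-millisecond number. The class, method names, and regular expression are illustrative assumptions, not the code under review.

```java
import java.util.regex.Pattern;

public class CheckpointValidationSketch {

    // Assumed offset format: a topic name followed by one or more ",partition:offset" pairs.
    private static final Pattern OFFSET_CHECKPOINT = Pattern.compile("[^,:]+(,\\d+:\\d+)+");

    // Assumed timestamp format: digits only (for example, epoch milliseconds).
    private static final Pattern TIMESTAMP_CHECKPOINT = Pattern.compile("\\d+");

    /** Returns true if the string looks like "topic,0:100,1:250". */
    public static boolean isOffsetCheckpoint(String checkpoint) {
        return checkpoint != null && OFFSET_CHECKPOINT.matcher(checkpoint).matches();
    }

    /** Returns true if the string is a bare number such as "1610545954000". */
    public static boolean isTimestampCheckpoint(String checkpoint) {
        return checkpoint != null && TIMESTAMP_CHECKPOINT.matcher(checkpoint).matches();
    }
}
```

Naming each predicate after the format it accepts keeps the call site readable, which is the intent behind the rename suggested above.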
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc) (9c37f30) into [master](https://codecov.io/gh/apache/hudi/commit/e3d3677b7e7899705b624925666317f0c074f7c7?el=desc) (e3d3677) will **decrease** coverage by `41.11%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 50.73% 9.61% -41.12%
+ Complexity 3064 48 -3016
============================================
Files 419 53 -366
Lines 18797 1944 -16853
Branches 1922 233 -1689
============================================
- Hits 9536 187 -9349
+ Misses 8487 1744 -6743
+ Partials 774 13 -761
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.61% <0.00%> (-59.87%)` | `0.00 <0.00> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-88.78%)` | `0.00 <0.00> (-16.00)` | |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| ... and [397 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* ea5ed9da433064022a69e06c98f58fc10c09e8b6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=373)
* ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **increase** coverage by `19.04%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
=============================================
+ Coverage 47.61% 66.65% +19.04%
+ Complexity 5487 798 -4689
=============================================
Files 924 100 -824
Lines 41206 3488 -37718
Branches 4133 353 -3780
=============================================
- Hits 19619 2325 -17294
+ Misses 19844 1024 -18820
+ Partials 1743 139 -1604
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `66.65% <ø> (+32.07%)` | :arrow_up: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
| [...e/hudi/table/format/mor/MergeOnReadTableState.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvbW9yL01lcmdlT25SZWFkVGFibGVTdGF0ZS5qYXZh) | | |
| [.../main/java/org/apache/hudi/util/AvroConvertor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS91dGlsL0F2cm9Db252ZXJ0b3IuamF2YQ==) | | |
| [...he/hudi/table/format/cow/AbstractColumnReader.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9mb3JtYXQvY293L0Fic3RyYWN0Q29sdW1uUmVhZGVyLmphdmE=) | | |
| [...di/utilities/sources/helpers/IncrSourceHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9JbmNyU291cmNlSGVscGVyLmphdmE=) | | |
| [...rg/apache/hudi/table/action/commit/BucketType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21taXQvQnVja2V0VHlwZS5qYXZh) | | |
| [...apache/hudi/timeline/service/handlers/Handler.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvSGFuZGxlci5qYXZh) | | |
| [.../apache/hudi/keygen/constant/KeyGeneratorType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2tleWdlbi9jb25zdGFudC9LZXlHZW5lcmF0b3JUeXBlLmphdmE=) | | |
| [...apache/hudi/client/utils/LazyIterableIterator.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC91dGlscy9MYXp5SXRlcmFibGVJdGVyYXRvci5qYXZh) | | |
| [...hudi/table/action/commit/AbstractDeleteHelper.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21taXQvQWJzdHJhY3REZWxldGVIZWxwZXIuamF2YQ==) | | |
| ... and [819 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67041c2) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **decrease** coverage by `0.94%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.61% 46.66% -0.95%
+ Complexity 5487 5027 -460
============================================
Files 924 864 -60
Lines 41206 38317 -2889
Branches 4133 3824 -309
============================================
- Hits 19619 17880 -1739
+ Misses 19844 18850 -994
+ Partials 1743 1587 -156
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `34.55% <ø> (-0.03%)` | :arrow_down: |
| hudicommon | `48.59% <ø> (+0.02%)` | :arrow_up: |
| hudiflink | `60.03% <ø> (+0.44%)` | :arrow_up: |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.32% <ø> (-0.01%)` | :arrow_down: |
| hudisync | `50.55% <ø> (-3.93%)` | :arrow_down: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
| [...java/org/apache/hudi/table/HoodieTableFactory.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZUZhY3RvcnkuamF2YQ==) | `84.61% <0.00%> (-7.06%)` | :arrow_down: |
| [...n/java/org/apache/hudi/common/model/FileSlice.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0ZpbGVTbGljZS5qYXZh) | `73.80% <0.00%> (-2.39%)` | :arrow_down: |
| [.../org/apache/hudi/common/model/HoodieFileGroup.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVHcm91cC5qYXZh) | `83.92% <0.00%> (-0.56%)` | :arrow_down: |
| [...in/java/org/apache/hudi/table/HoodieTableSink.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS90YWJsZS9Ib29kaWVUYWJsZVNpbmsuamF2YQ==) | `10.52% <0.00%> (ø)` | |
| [.../org/apache/hudi/streamer/FlinkStreamerConfig.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9GbGlua1N0cmVhbWVyQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [.../org/apache/hudi/streamer/HoodieFlinkStreamer.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zdHJlYW1lci9Ib29kaWVGbGlua1N0cmVhbWVyLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...ain/java/org/apache/hudi/io/FlinkAppendHandle.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1mbGluay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vRmxpbmtBcHBlbmRIYW5kbGUuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ain/java/org/apache/hudi/io/FlinkCreateHandle.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1mbGluay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vRmxpbmtDcmVhdGVIYW5kbGUuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| ... and [94 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...67041c2](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e98b8e4) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `31.70%`.
> The diff coverage is `85.45%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
=============================================
- Coverage 47.72% 16.02% -31.71%
+ Complexity 5528 502 -5026
=============================================
Files 934 284 -650
Lines 41457 11869 -29588
Branches 4166 986 -3180
=============================================
- Hits 19786 1902 -17884
+ Misses 19914 9802 -10112
+ Partials 1757 165 -1592
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.70% <85.45%> (+0.44%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `72.72% <83.33%> (+1.15%)` | :arrow_up: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `87.00% <87.23%> (-0.68%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `100.00% <100.00%> (ø)` | |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [731 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...e98b8e4](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
Guess we can simplify things. Let me go over some pseudo code of interest.
within DeltaSync.read()
```
// set right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```
Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the second `else if` condition deals with Checkpoint_Key.
I have simplified some exception cases, but this should give you the gist.
within write()
```
// towards the end
commitMetadata.put(Checkpoint_Key, <updated checkpoint after writing>);
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is parsed from commitMetadata.
With this PR, the only addition is that we are introducing a new checkpoint type. Let me propose a simple add-on to the code above (which is the code before this patch) that would work for us.
within DeltaSync.read()
```
// set right checkpoint value
boolean resetCheckpointType = true; // New addition
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false; // New addition
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (resetCheckpointType) {
  // reset checkpoint type if set
}
```
No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, and no Checkpoint_RESET_Key is set yet. From the second batch onward it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't really need to add the checkpoint type to commitMetadata, and can still decide whether or not to use the checkpoint type.
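The pseudocode above can be sketched in plain Java as follows. This is only an illustrative sketch under stated assumptions, not the DeltaSync implementation: the class `CheckpointResolutionSketch`, the two key strings, the `Resolved` holder, and the use of a `Map<String, String>` as commit metadata are all hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the checkpoint handling discussed above.
class CheckpointResolutionSketch {
  // Assumed key names; the real constants may differ.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset.key";

  /** Checkpoint and checkpoint type to use for one batch. */
  static final class Resolved {
    final Optional<String> checkpoint;
    final Optional<String> checkpointType;

    Resolved(Optional<String> checkpoint, Optional<String> checkpointType) {
      this.checkpoint = checkpoint;
      this.checkpointType = checkpointType;
    }
  }

  /** Read side: decide which checkpoint (and type) applies to this batch. */
  static Resolved resolve(String cfgCheckpoint, String cfgCheckpointType,
                          Map<String, String> commitMetadata) {
    boolean resetCheckpointType = true;
    Optional<String> checkpoint;
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      // First batch: honor the user-supplied checkpoint and keep its type.
      checkpoint = Optional.of(cfgCheckpoint);
      resetCheckpointType = false;
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      // Later batches: resume from what the previous commit recorded.
      checkpoint = Optional.of(commitMetadata.get(CHECKPOINT_KEY));
    } else {
      checkpoint = Optional.empty();
    }
    Optional<String> type = resetCheckpointType
        ? Optional.empty()
        : Optional.ofNullable(cfgCheckpointType);
    return new Resolved(checkpoint, type);
  }

  /** Write side: record the new checkpoint and mark the config value as consumed. */
  static void recordAfterWrite(String cfgCheckpoint, String updatedCheckpoint,
                               Map<String, String> commitMetadata) {
    commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
    if (cfgCheckpoint != null) {
      // Assumption: the reset key stores the configured value.
      commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
  }
}
```

With empty commit metadata, `resolve` returns the configured checkpoint together with its type. After `recordAfterWrite` has run once, the next call returns the stored checkpoint and an empty type, which matches the "hand in hand" assumption described above.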
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838)
* 67041c2d836e61355aea26bd24f91548ec5e92ce UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670996945
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
Set<TopicPartition> topicPartitions = partitionInfoList.stream()
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
+ if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+ lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+ }
Review comment:
I don't think there needs to be an "else if" here.
If you are using the timestamp kafkaCheckpointType, lastCheckpointStr will hold a timestamp, which we convert to offsets using the getOffsetsByTimestamp method.
If it is not the timestamp type, we treat it as a regular string-type checkpoint and do not process it.
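A minimal sketch of that dispatch, assuming a timestamp checkpoint is a plain run of digits. The names `CheckpointTypeSketch`, `looksLikeTimestamp`, and `normalize` are hypothetical, and the `LongFunction` parameter merely stands in for the timestamp-to-offsets lookup; none of this is the PR's actual code.

```java
import java.util.Optional;
import java.util.function.LongFunction;

class CheckpointTypeSketch {
  // Assumed value of the checkpoint type setting.
  static final String TIMESTAMP_TYPE = "timestamp";

  // Assumption: a timestamp checkpoint consists only of digits (epoch seconds or millis).
  static boolean looksLikeTimestamp(Optional<String> checkpoint) {
    return checkpoint.isPresent() && checkpoint.get().matches("\\d{1,18}");
  }

  /**
   * Converts a timestamp checkpoint into an offset-style checkpoint.
   * Anything else is passed through untouched, so no else branch is needed.
   */
  static Optional<String> normalize(String checkpointType,
                                    Optional<String> lastCheckpoint,
                                    LongFunction<String> offsetsByTimestamp) {
    if (TIMESTAMP_TYPE.equals(checkpointType) && looksLikeTimestamp(lastCheckpoint)) {
      long timestamp = Long.parseLong(lastCheckpoint.get());
      return Optional.of(offsetsByTimestamp.apply(timestamp));
    }
    return lastCheckpoint;
  }
}
```

An offset-style value such as `topic,0:100` fails the digit check and is returned unchanged, even when the type is set to timestamp.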
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 67041c2d836e61355aea26bd24f91548ec5e92ce Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=839)
* 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] nsivabalan merged pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #2438:
URL: https://github.com/apache/hudi/pull/2438
[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830456199
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * This method returns the checkpoint format based on the timestamp.
+ * example:
+ * 1. input: timestamp, etc.
+ * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+ *
+ * @param consumer
+ * @param topicName
+ * @param timestamp
+ * @return
+ */
+ private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+ String topicName, Long timestamp) {
+
+ Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+ .map(x -> new TopicPartition(x.topic(), x.partition()))
+ .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+ Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+ Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+ StringBuilder sb = new StringBuilder();
+ sb.append(topicName + ",");
+ for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+ if (map.getValue() != null) {
+ sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+ } else {
+ sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");
Review comment:
@liujinhui1994 @nsivabalan Can you help me understand why we are adding this value from earliestOffsets here? From what I understand, the whole point of consuming from a specified timestamp is that we do not want to consume records whose timestamp is earlier than the specified timestamp. Let us take an example of topic A with 3 partitions 0, 1, 2. Offsets are as below -
`partition 0 - 100 (ts-210),101 (ts-220),102 (ts-230),103 (ts-240) .....
partition 1 - 50 (ts 200), 51 (ts-205), 52 (ts-225) ....
partition 2 - 51 (ts - 100), 60 (ts - 150) (only 2 records present in this)`
Now suppose the timestamp is passed as 220; the expected results from the consumer API will be -
`partition 0 -> 101
partition 1 -> 52
partition 2 -> null`
As per the code, we return -
`partition 0 -> 101
partition 1 -> 52
partition 2 -> 51`
I want to understand why we are populating this value for partition 2. If all the offsets in partition 2 have timestamps less than 220, this implies that these offsets have either already been consumed or that the records are not needed at all for ingestion into the Hudi table. Ideally, no offset should be returned from this method for partition 2.
Even if this functionality is added only for a one-time initial bootstrap, consuming the records from partition 2 above still does not make sense. Please let me know the thought process behind this logic.
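To make the example concrete, here is a small self-contained sketch of the fallback being discussed. It deliberately avoids the Kafka client: plain maps keyed by partition number stand in for the results of `offsetsForTimes` and `beginningOffsets`, and `OffsetsByTimestampSketch` and `toCheckpoint` are hypothetical names, not part of the PR.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class OffsetsByTimestampSketch {
  /**
   * Builds a checkpoint string of the form "topic,partition:offset,...".
   * A partition whose timestamp lookup returned null takes its offset
   * from fallbackOffsets instead.
   */
  static String toCheckpoint(String topic,
                             Map<Integer, Long> offsetsForTimestamp,
                             Map<Integer, Long> fallbackOffsets) {
    // TreeMap gives a stable partition order in the output string.
    String partitions = new TreeMap<>(fallbackOffsets).entrySet().stream()
        .map(e -> {
          Long found = offsetsForTimestamp.get(e.getKey());
          return e.getKey() + ":" + (found != null ? found : e.getValue());
        })
        .collect(Collectors.joining(","));
    return topic + "," + partitions;
  }
}
```

With the lookup results `{0=101, 1=52, 2=null}` and earliest offsets `{0=100, 1=50, 2=51}` from the example, this yields `A,0:101,1:52,2:51`, which is the behavior questioned above. Passing a different fallback map (for instance end offsets, purely as an illustration) changes only the value chosen for partition 2.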
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-822457134
@liujinhui1994 : ping me here once the PR is ready to be reviewed again
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-799571386
Thanks for your contribution. This is going to be useful to the community.
A few high-level questions.
1. Why don't we leverage DeltaStreamerConfig.checkpoint to pass in a timestamp for the Kafka source? Or do we expect the format of this config to be "topic_name,partition_num:offset,partition_num:offset,...." and hence need a new config for a timestamp-based checkpoint?
2. If yes to (1), did we think about parsing the checkpoint config, determining whether it is in the above format or a timestamp, and proceeding from there? Just trying to avoid introducing new configs if possible.
3. Checkpointing in deltastreamer in general is getting too complicated. I definitely see a benefit in this patch. But is there a way we can abstract it out based on source? The new config introduced as part of this PR is very specific to Kafka, so I am trying to see if we can keep it abstracted out from deltastreamer if possible.
4. I see KafkaConsumer.offsetsForTimes() could return null for partitions with messages of the old format. So what is the expected behavior for such partitions? Do we resume from the earliest offset?
@n3nash @vinothchandar : open to hearing your thoughts, if any. One of my suggestions above could potentially add APIs to Source, hence CCing you.
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* b77b63994db2e91853a06d3a5c4c129a21feefcf Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=863)
* e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] wangxianghu commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r556990030
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -165,6 +169,7 @@ public KafkaOffsetGen(TypedProperties props) {
}
DataSourceUtils.checkRequiredProperties(props, Collections.singletonList(Config.KAFKA_TOPIC_NAME));
topicName = props.getString(Config.KAFKA_TOPIC_NAME);
+ kafkaCheckpointTimestamp = props.getString(Config.KAFKA_CHECKPOINT_TIMESTAMP);
Review comment:
If `Config.KAFKA_CHECKPOINT_TIMESTAMP` is not set, an exception will be thrown. That is unexpected when the user wants to checkpoint by providing offsets.
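One possible way to address this, sketched with plain `java.util.Properties` rather than Hudi's `TypedProperties`: read the optional key with a default so that a missing value does not throw. The class name and the key string below are hypothetical and only serve the illustration.

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch: treat the timestamp setting as optional instead of required.
final class OptionalTimestampConfig {
  // Assumed key name for illustration only; not taken from the patch.
  static final String CHECKPOINT_TIMESTAMP_KEY = "example.kafka.checkpoint.timestamp";

  // Returns the configured timestamp, or empty when the key is absent or blank.
  static Optional<Long> readTimestamp(Properties props) {
    String raw = props.getProperty(CHECKPOINT_TIMESTAMP_KEY, "").trim();
    if (raw.isEmpty()) {
      return Optional.empty();
    }
    return Optional.of(Long.parseLong(raw));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // No key set: the lookup yields an empty Optional instead of throwing.
    System.out.println(readTimestamp(props).isPresent());
    props.setProperty(CHECKPOINT_TIMESTAMP_KEY, "1610545954000");
    System.out.println(readTimestamp(props).get());
  }
}
```

With this shape, a user who checkpoints by offsets simply leaves the key unset.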
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -182,6 +187,10 @@ public KafkaOffsetGen(TypedProperties props) {
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
// Determine the offset ranges to read from
+ if (kafkaCheckpointTimestamp != null) {
+ lastCheckpointStr = Option.of(getOffsetsByTimestamp(consumer, partitionInfoList, topicName, Long.parseLong(kafkaCheckpointTimestamp)));
+ }
+
if (lastCheckpointStr.isPresent() && !lastCheckpointStr.get().isEmpty()) {
Review comment:
Here we cannot simply overwrite `lastCheckpointStr`. If the user has configured `Config.KAFKA_CHECKPOINT_TIMESTAMP`, Hudi will always consume from `Config.KAFKA_CHECKPOINT_TIMESTAMP` and can never move on, right?
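To make the concern concrete, here is a minimal sketch of one way to avoid it: an existing checkpoint always wins, and the timestamp is only used to seed the very first read. `CheckpointResolver` and the `offsetsForTimestamp` callback are hypothetical names, and `java.util.Optional` stands in for Hudi's `Option`; this is not the code from the patch.

```java
import java.util.Optional;
import java.util.function.LongFunction;

// Hypothetical sketch: a stored checkpoint takes precedence over the configured timestamp.
final class CheckpointResolver {
  static Optional<String> resolve(Optional<String> lastCheckpoint,
                                  Optional<Long> configuredTimestamp,
                                  LongFunction<String> offsetsForTimestamp) {
    // A non-empty checkpoint from a previous commit is kept, so consumption moves forward.
    if (lastCheckpoint.isPresent() && !lastCheckpoint.get().isEmpty()) {
      return lastCheckpoint;
    }
    // Otherwise fall back to the timestamp, if one was configured.
    return configuredTimestamp.map(ts -> offsetsForTimestamp.apply(ts));
  }

  public static void main(String[] args) {
    // Stand-in lookup; a real one would query the Kafka consumer.
    LongFunction<String> lookup = ts -> "topic,0:" + ts;
    System.out.println(resolve(Optional.of("topic,0:42"), Optional.of(5L), lookup).get());
    System.out.println(resolve(Optional.empty(), Optional.of(5L), lookup).get());
  }
}
```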
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r594877281
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
return upsert(WriteOperationType.UPSERT);
}
- public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+ public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {
Review comment:
After your PR is over, continue with the next PR?
@nsivabalan
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881200974
Let's try to land this by the weekend. It's been hanging for quite some time.
[GitHub] [hudi] liujinhui1994 removed a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 removed a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-782588952
@yanghua @wangxianghu @nsivabalan
I have verified, please help review
[GitHub] [hudi] codecov-io edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298
[GitHub] [hudi] liujinhui1994 closed pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 closed pull request #2438:
URL: https://github.com/apache/hudi/pull/2438
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r604828288
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
return upsert(WriteOperationType.UPSERT);
}
- public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+ public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {
Review comment:
Actually, my PR was closed as it was invalid, but [here](https://github.com/nsivabalan/hudi/blob/f7439e2e28748bf7b713fb72ba611f8af7bb97a1/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/ReadBatch.java) is the class that I added. Maybe you can add it in this patch itself.
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r594471195
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
return upsert(WriteOperationType.UPSERT);
}
- public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+ public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {
Review comment:
This is getting out of hand (two pairs within a pair). We can't keep adding more Pairs here. I am adding a class to hold the return value in one of my PRs. Let's see if we can rebase once the other PR lands.
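A small value class is one way to avoid the nested `Pair`s described here. The sketch below is hypothetical: `FetchResult`, its field names, and the meaning assigned to the two strings are assumptions for illustration, and generic type parameters stand in for `SchemaProvider` and `JavaRDD<HoodieRecord>` so the example compiles on its own. It is not the class from the reviewer's other PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical holder replacing Pair<Pair<S, R>, Pair<String, String>>.
final class FetchResult<S, R> {
  private final S schemaProvider;
  private final R records;
  private final String checkpoint;      // assumed meaning of the first String
  private final String checkpointType;  // assumed meaning of the second String

  FetchResult(S schemaProvider, R records, String checkpoint, String checkpointType) {
    this.schemaProvider = Objects.requireNonNull(schemaProvider, "schemaProvider");
    this.records = Objects.requireNonNull(records, "records");
    this.checkpoint = checkpoint;
    this.checkpointType = checkpointType;
  }

  S getSchemaProvider() { return schemaProvider; }
  R getRecords() { return records; }
  String getCheckpoint() { return checkpoint; }
  String getCheckpointType() { return checkpointType; }

  public static void main(String[] args) {
    FetchResult<String, List<String>> result =
        new FetchResult<>("schema", Arrays.asList("r1", "r2"), "topic,0:100", "timestamp");
    // Named accessors replace chains such as getLeft().getRight().
    System.out.println(result.getCheckpoint());
  }
}
```

Callers then read `result.getCheckpoint()` instead of unpacking positional pairs, and a new field can be added later without changing the method's return type.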
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r670978737
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -212,6 +234,9 @@ public KafkaOffsetGen(TypedProperties props) {
Set<TopicPartition> topicPartitions = partitionInfoList.stream()
.map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
+ if (Config.KAFKA_CHECKPOINT_TYPE_TIMESTAMP.equals(kafkaCheckpointType) && isValidCheckpointType(lastCheckpointStr)) {
+ lastCheckpointStr = getOffsetsByTimestamp(consumer, partitionInfoList, topicPartitions, topicName, Long.parseLong(lastCheckpointStr.get()));
+ }
Review comment:
I was expecting an `else if` block after this line. Can you please clarify? Otherwise, we might go into the else block.
##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/helpers/TestKafkaOffsetGen.java
##########
@@ -64,7 +63,7 @@ public void teardown() throws Exception {
private TypedProperties getConsumerConfigs(String autoOffsetReset) {
TypedProperties props = new TypedProperties();
- props.put(Config.KAFKA_AUTO_OFFSET_RESET, autoOffsetReset);
+ props.put("auto.offset.reset", autoOffsetReset);
Review comment:
Do you think we can add some tests to this class for the timestamp type?
##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestKafkaSource.java
##########
@@ -193,7 +193,7 @@ public void testJsonKafkaSourceWithDefaultUpperCap() {
Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics);
SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource);
- Config.maxEventsFromKafkaSource = 500;
+ //props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");
Review comment:
Why is this commented out?
##########
File path: hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestKafkaSource.java
##########
@@ -193,7 +193,7 @@ public void testJsonKafkaSourceWithDefaultUpperCap() {
Source jsonSource = new JsonKafkaSource(props, jsc, sparkSession, schemaProvider, metrics);
SourceFormatAdapter kafkaSource = new SourceFormatAdapter(jsonSource);
- Config.maxEventsFromKafkaSource = 500;
+ //props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");
Review comment:
I tried your patch locally. The test fails if I uncomment this line. I don't understand why.
[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r605393953
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -553,6 +555,11 @@ public DeltaSyncService(Config cfg, JavaSparkContext jssc, FileSystem fs, Config
"'--filter-dupes' needs to be disabled when '--op' is 'UPSERT' to ensure updates are not missed.");
this.props = properties.get();
+ String kafkaCheckpointTimestamp = props.getString(KafkaOffsetGen.Config.KAFKA_CHECKPOINT_TIMESTAMP, "");
Review comment:
I think `KAFKA_CHECKPOINT_TIMESTAMP` is just a way to make it easier for users to set the checkpoint.
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* e98b8e407f1bbcd0f0219d2f2d65f4e95f663c00 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=959)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881817246
Appreciate your perseverance in addressing all the feedback. You are the best! :) Thanks for your contribution!
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c705ce5) into [master](https://codecov.io/gh/apache/hudi/commit/5804ad8e32ae05758ebc5e47f5d4fb4db371ab52?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5804ad8) will **decrease** coverage by `44.90%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.72% 2.82% -44.91%
+ Complexity 5528 85 -5443
============================================
Files 934 284 -650
Lines 41457 11869 -29588
Branches 4166 986 -3180
============================================
- Hits 19786 335 -19451
+ Misses 19914 11508 -8406
+ Partials 1757 26 -1731
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-34.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `4.88% <ø> (-49.64%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `8.99% <0.00%> (-50.27%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.57%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/AvroKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb0thZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...hudi/utilities/sources/helpers/KafkaOffsetGen.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9LYWZrYU9mZnNldEdlbi5qYXZh) | `0.00% <0.00%> (-87.69%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [778 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [5804ad8...c705ce5](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864418016
Actually, we can make it even simpler.
DeltaSync.read()
```
// set the right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (commitMetadata.contains(Checkpoint_RESET_Key)) {
  // reset the checkpoint type if set
}
```
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-1073504190
The purpose of introducing timestamps: when users want to consume from a certain position, DeltaStreamer previously only let them specify checkpoint offsets. For example, Kafka may have 50+ partitions, and users need to manually configure the checkpoint string. Introducing timestamps simplifies this operation.
Regarding your example: I think you are right, and I agree with your idea. Partition 2 should not be populated with this value.
At that time, the main goal of this PR was to reduce the complexity of user configuration and make consuming data as simple as possible. The behavior for partition 2 in this example makes sense for some businesses, although it may conflict with your current scenario. I feel we can improve it and make it better.
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-1073504409
@pratyakshsharma
[GitHub] [hudi] pratyakshsharma commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r830768428
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -283,6 +323,41 @@ private Long delayOffsetCalculation(Option<String> lastCheckpointStr, Set<TopicP
return delayCount;
}
+ /**
+ * Get the checkpoint by timestamp.
+ * This method returns the checkpoint format based on the timestamp.
+ * example:
+ * 1. input: timestamp, etc.
+ * 2. output: topicName,partition_num_0:100,partition_num_1:101,partition_num_2:102.
+ *
+ * @param consumer
+ * @param topicName
+ * @param timestamp
+ * @return
+ */
+ private Option<String> getOffsetsByTimestamp(KafkaConsumer consumer, List<PartitionInfo> partitionInfoList, Set<TopicPartition> topicPartitions,
+ String topicName, Long timestamp) {
+
+ Map<TopicPartition, Long> topicPartitionsTimestamp = partitionInfoList.stream()
+ .map(x -> new TopicPartition(x.topic(), x.partition()))
+ .collect(Collectors.toMap(Function.identity(), x -> timestamp));
+
+ Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+ Map<TopicPartition, OffsetAndTimestamp> offsetAndTimestamp = consumer.offsetsForTimes(topicPartitionsTimestamp);
+
+ StringBuilder sb = new StringBuilder();
+ sb.append(topicName + ",");
+ for (Map.Entry<TopicPartition, OffsetAndTimestamp> map : offsetAndTimestamp.entrySet()) {
+ if (map.getValue() != null) {
+ sb.append(map.getKey().partition()).append(":").append(map.getValue().offset()).append(",");
+ } else {
+ sb.append(map.getKey().partition()).append(":").append(earliestOffsets.get(map.getKey())).append(",");
Review comment:
Created a JIRA for this: https://issues.apache.org/jira/browse/HUDI-3671
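For readers following the hunk above, here is a minimal sketch of the fallback it implements: a partition whose timestamp lookup returns null falls back to its earliest offset. Plain maps stand in for the results of `offsetsForTimes` and `beginningOffsets`, and the class and method names are hypothetical, not part of the PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

/** Hypothetical helper; plain maps stand in for the Kafka consumer results. */
final class TimestampCheckpointSketch {

  /**
   * Builds "topic,partition:offset,..." from per-partition lookups.
   * A null value in offsetsAtTimestamp models a partition with no record
   * at or after the requested timestamp; its earliest offset is used instead.
   */
  static String buildCheckpoint(String topic,
                                Map<Integer, Long> offsetsAtTimestamp,
                                Map<Integer, Long> earliestOffsets) {
    StringJoiner joiner = new StringJoiner(",");
    joiner.add(topic);
    // Sort by partition number so the output is deterministic.
    for (Map.Entry<Integer, Long> e : new TreeMap<>(offsetsAtTimestamp).entrySet()) {
      Long offset = e.getValue() != null ? e.getValue() : earliestOffsets.get(e.getKey());
      if (offset == null) {
        throw new IllegalStateException("No offset known for partition " + e.getKey());
      }
      joiner.add(e.getKey() + ":" + offset);
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    Map<Integer, Long> found = new HashMap<>();
    found.put(0, 100L);
    found.put(1, null); // no record at or after the timestamp
    found.put(2, 102L);
    Map<Integer, Long> earliest = new HashMap<>();
    earliest.put(0, 5L);
    earliest.put(1, 7L);
    earliest.put(2, 9L);
    System.out.println(buildCheckpoint("topicName", found, earliest));
    // prints topicName,0:100,1:7,2:102
  }
}
```

Using a `StringJoiner` also avoids the trailing comma that appending "," after every entry would leave behind.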
[GitHub] [hudi] codecov-io edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-759677298
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r594471195
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/HoodieDeltaStreamerWrapper.java
##########
@@ -65,7 +65,7 @@ public void scheduleCompact() throws Exception {
return upsert(WriteOperationType.UPSERT);
}
- public Pair<SchemaProvider, Pair<String, JavaRDD<HoodieRecord>>> fetchSource() throws Exception {
+ public Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>> fetchSource() throws Exception {
Review comment:
This is getting out of hand (two pairs within a pair). We can't keep adding more Pairs here. I am adding a class to hold the return value in one of my PRs. Let's see if we can rebase once the other PR lands.
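To illustrate the idea of replacing the nested Pairs with a dedicated class, a minimal sketch follows. It is not the class from the other PR: the class name and field names are hypothetical, generic type parameters stand in for `SchemaProvider` and `JavaRDD<HoodieRecord>`, and the thread does not say what the two String values mean, so `checkpoint` and `checkpointType` are guesses for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/**
 * Hypothetical value holder replacing
 * Pair<Pair<SchemaProvider, JavaRDD<HoodieRecord>>, Pair<String, String>>.
 * S stands in for SchemaProvider, List<R> for JavaRDD<HoodieRecord>.
 */
final class FetchedSourceSketch<S, R> {
  private final S schemaProvider;
  private final List<R> records;
  private final String checkpoint;     // assumed meaning of the first String
  private final String checkpointType; // assumed meaning of the second String

  FetchedSourceSketch(S schemaProvider, List<R> records, String checkpoint, String checkpointType) {
    this.schemaProvider = Objects.requireNonNull(schemaProvider, "schemaProvider");
    this.records = Objects.requireNonNull(records, "records");
    this.checkpoint = checkpoint;
    this.checkpointType = checkpointType;
  }

  S getSchemaProvider() {
    return schemaProvider;
  }

  List<R> getRecords() {
    return records;
  }

  String getCheckpoint() {
    return checkpoint;
  }

  String getCheckpointType() {
    return checkpointType;
  }

  public static void main(String[] args) {
    FetchedSourceSketch<String, String> batch =
        new FetchedSourceSketch<>("schema-v1", Arrays.asList("r1", "r2"), "topicName,0:100", "string");
    // Named accessors replace chains like getRight().getLeft().
    System.out.println(batch.getCheckpoint() + " / " + batch.getRecords().size() + " records");
  }
}
```

With named accessors, adding another return value later means adding a field rather than wrapping the result in yet another Pair.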
[GitHub] [hudi] nsivabalan commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-811855554
Nishith and I discussed this. Here is our proposal.
Let's rely on Deltastreamer.Config.checkpoint to pass in any type of checkpoint.
We can add another config called "checkpoint.type", which could default to "string" for all default checkpoints. For the checkpoint of interest in this PR, we could set the value of this new config to "timestamp".
With this, it's up to each source to parse and interpret the checkpoint value, and DeltaSync does not need to deal with different checkpoint formats.
Having said this, DeltaSync readFromSource() should not need any changes in this diff.
KafkaOffsetGen should have the logic to parse different checkpoint values, based on two values (deltastreamer.config.checkpoint and checkpoint.type).
With this, source-specific checkpointing logic also stays within the source-specific class and does not leak into DeltaSync, which should be agnostic to the different sources.
@liujinhui1994: Let me know what you think. Happy to chat more on this.
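A rough sketch of how a source might interpret the checkpoint according to the proposed "checkpoint.type" config is shown below. Everything in it is an assumption for illustration: the class, enum, and method names are hypothetical, the short key "checkpoint.type" is taken from the proposal as written, and the "topic,partition:offset,..." layout follows the Javadoc in the PR diff.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

/** Hypothetical sketch of source-side checkpoint interpretation. */
final class CheckpointTypeSketch {

  enum CheckpointType { STRING, TIMESTAMP }

  // Short key as written in the proposal; the final key name may differ.
  static final String CHECKPOINT_TYPE_KEY = "checkpoint.type";

  /** Resolves the checkpoint type, defaulting to STRING when the key is unset. */
  static CheckpointType resolveType(Properties props) {
    String raw = props.getProperty(CHECKPOINT_TYPE_KEY, "string");
    return CheckpointType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
  }

  /** Parses "topic,0:100,1:101" into partition -> offset, skipping the topic name. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    for (int i = 1; i < parts.length; i++) {
      String[] kv = parts[i].split(":");
      offsets.put(Integer.parseInt(kv[0]), Long.parseLong(kv[1]));
    }
    return offsets;
  }

  /** The source, not DeltaSync, decides how to read the checkpoint value. */
  static String describe(Properties props, String checkpoint) {
    switch (resolveType(props)) {
      case TIMESTAMP:
        return "seek to timestamp " + Long.parseLong(checkpoint);
      default:
        return "resume from offsets " + parseOffsets(checkpoint);
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    System.out.println(describe(props, "topicName,0:100,1:101"));
    props.setProperty(CHECKPOINT_TYPE_KEY, "timestamp");
    System.out.println(describe(props, "1610545954000"));
  }
}
```

Because the branching lives in the source-side helper, the caller only passes the raw checkpoint string through, which matches the goal of keeping DeltaSync agnostic to the source.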
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a39570d) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `0.96%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.51% 46.54% -0.97%
+ Complexity 5429 4951 -478
============================================
Files 922 855 -67
Lines 40968 37983 -2985
Branches 4105 3785 -320
============================================
- Hits 19464 17678 -1786
+ Misses 19780 18741 -1039
+ Partials 1724 1564 -160
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `40.00% <ø> (ø)` | |
| hudiclient | `34.58% <ø> (ø)` | |
| hudicommon | `48.39% <ø> (+0.01%)` | :arrow_up: |
| hudiflink | `60.07% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.10% <ø> (ø)` | |
| hudisync | `50.10% <ø> (-3.95%)` | :arrow_down: |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...ache/hudi/hive/HiveMetastoreBasedLockProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZU1ldGFzdG9yZUJhc2VkTG9ja1Byb3ZpZGVyLmphdmE=) | `0.00% <0.00%> (-60.22%)` | :arrow_down: |
| [...s/exception/HoodieIncrementalPullSQLException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxTUUxFeGNlcHRpb24uamF2YQ==) | | |
| [...udi/utilities/transform/FlatteningTransformer.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9GbGF0dGVuaW5nVHJhbnNmb3JtZXIuamF2YQ==) | | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | | |
| [...i/utilities/deltastreamer/SourceFormatAdapter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvU291cmNlRm9ybWF0QWRhcHRlci5qYXZh) | | |
| [...org/apache/hudi/utilities/HoodieClusteringJob.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hvb2RpZUNsdXN0ZXJpbmdKb2IuamF2YQ==) | | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | | |
| [.../hudi/utilities/schema/RowBasedSchemaProvider.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9Sb3dCYXNlZFNjaGVtYVByb3ZpZGVyLmphdmE=) | | |
| [...org/apache/hudi/utilities/HDFSParquetImporter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0hERlNQYXJxdWV0SW1wb3J0ZXIuamF2YQ==) | | |
| [...in/java/org/apache/hudi/utilities/UtilHelpers.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1V0aWxIZWxwZXJzLmphdmE=) | | |
| ... and [57 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6eca06d...a39570d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* a688be727d6d6beff51a3f347b9e596d982610b5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=838)
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* ffd30f564c780a25ddccf8c5bc819d4eed9b437a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=400)
* 5e8ab52b0e139333c4c003932c55ff6e88302206 UNKNOWN
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r657945070
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -312,13 +313,13 @@ public void refreshTimeline() throws IOException {
if (lastCommit.isPresent()) {
HoodieCommitMetadata commitMetadata = HoodieCommitMetadata
.fromBytes(commitTimelineOpt.get().getInstantDetails(lastCommit.get()).get(), HoodieCommitMetadata.class);
- if (cfg.checkpoint != null && !cfg.checkpoint.equals(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))) {
- resumeCheckpointStr = Option.of(cfg.checkpoint);
- } else if (commitMetadata.getMetadata(CHECKPOINT_KEY) != null) {
- //if previous checkpoint is an empty string, skip resume use Option.empty()
- if (!commitMetadata.getMetadata(CHECKPOINT_KEY).isEmpty()) {
- resumeCheckpointStr = Option.of(commitMetadata.getMetadata(CHECKPOINT_KEY));
+ if (cfg.checkpoint != null) {
Review comment:
We could combine both of these into a single if condition.
```
if (cfg.checkpoint != null && (StringUtils.isNullOrEmpty(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))
    || !cfg.checkpoint.equals(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY)))) {
  resumeCheckpointStr = Option.of(cfg.checkpoint);
}
```
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -330,6 +331,9 @@ public void refreshTimeline() throws IOException {
+ commitTimelineOpt.get().getInstants().collect(Collectors.toList()) + ", CommitMetadata="
+ commitMetadata.toJsonString());
}
+ if (!StringUtils.isNullOrEmpty(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))) {
+ props.put("hoodie.deltastreamer.source.kafka.checkpoint.type", "string");
Review comment:
Actually, a better thing to do here is to remove the entry from props. What do you think?
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -330,6 +331,9 @@ public void refreshTimeline() throws IOException {
+ commitTimelineOpt.get().getInstants().collect(Collectors.toList()) + ", CommitMetadata="
+ commitMetadata.toJsonString());
}
+ if (!StringUtils.isNullOrEmpty(commitMetadata.getMetadata(CHECKPOINT_RESET_KEY))) {
+ props.put("hoodie.deltastreamer.source.kafka.checkpoint.type", "string");
Review comment:
Rather than hardcoding the config key here, can we use a variable, please?
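The two suggestions on this hunk (use a named constant, and remove the entry instead of overwriting it) could look roughly like the sketch below. The key string is the one in the diff; the class, constant, and method names are hypothetical, and `java.util.Properties` stands in for whatever props type DeltaSync actually uses.

```java
import java.util.Properties;

/** Hypothetical sketch; not the code that landed in the PR. */
final class CheckpointPropsSketch {

  // Key string copied from the diff; the constant name is an assumption.
  static final String KAFKA_CHECKPOINT_TYPE_KEY = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  /**
   * If the last commit already recorded a reset checkpoint, drop the type
   * override so the source falls back to its default interpretation.
   */
  static void clearCheckpointTypeIfReset(Properties props, String resetKeyValue) {
    if (resetKeyValue != null && !resetKeyValue.isEmpty()) {
      props.remove(KAFKA_CHECKPOINT_TYPE_KEY);
    }
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(KAFKA_CHECKPOINT_TYPE_KEY, "timestamp");
    clearCheckpointTypeIfReset(props, "1610545954000");
    System.out.println(props.containsKey(KAFKA_CHECKPOINT_TYPE_KEY)); // prints false
  }
}
```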
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 1bbcdb44cbb0ab9ac84c95a48fea5d7f38a8f657 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=633)
* a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638)
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 8bc0333e4fc14158b126da1f7b14f6c43a3abfb8 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=856)
* 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858)
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* a39570dfe0493bcd23edf911f6256e90d3b22907 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=638)
* bf50481b923dbaa14be994bd0cc45bbe22ff8524 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=775)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a688be7) into [master](https://codecov.io/gh/apache/hudi/commit/990820476a41b318017ba63dd446911141c929ce?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9908204) will **increase** coverage by `19.04%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
=============================================
+ Coverage 47.61% 66.65% +19.04%
+ Complexity 5487 798 -4689
=============================================
Files 924 100 -824
Lines 41206 3488 -37718
Branches 4133 353 -3780
=============================================
- Hits 19619 2325 -17294
+ Misses 19844 1024 -18820
+ Partials 1743 139 -1604
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `66.65% <ø> (+32.07%)` | :arrow_up: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...n/java/org/apache/hudi/index/SparkHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvU3BhcmtIb29kaWVJbmRleC5qYXZh) | `56.52% <0.00%> (-30.15%)` | :arrow_down: |
| [...in/java/org/apache/hudi/index/JavaHoodieIndex.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9pbmRleC9KYXZhSG9vZGllSW5kZXguamF2YQ==) | | |
| [...pache/hudi/cli/commands/FileSystemViewCommand.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpL2NvbW1hbmRzL0ZpbGVTeXN0ZW1WaWV3Q29tbWFuZC5qYXZh) | | |
| [...metadata/HoodieMetadataMergedLogRecordScanner.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0YWRhdGEvSG9vZGllTWV0YWRhdGFNZXJnZWRMb2dSZWNvcmRTY2FubmVyLmphdmE=) | | |
| [...apache/hudi/common/fs/inline/InLineFileSystem.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9JbkxpbmVGaWxlU3lzdGVtLmphdmE=) | | |
| [...va/org/apache/hudi/sink/utils/PayloadCreation.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1mbGluay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9zaW5rL3V0aWxzL1BheWxvYWRDcmVhdGlvbi5qYXZh) | | |
| [.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=) | | |
| [...apache/hudi/common/model/WriteConcurrencyMode.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL1dyaXRlQ29uY3VycmVuY3lNb2RlLmphdmE=) | | |
| [...he/hudi/common/model/BootstrapBaseFileMapping.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0Jvb3RzdHJhcEJhc2VGaWxlTWFwcGluZy5qYXZh) | | |
| [...di/common/table/log/block/HoodieAvroDataBlock.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9ibG9jay9Ib29kaWVBdnJvRGF0YUJsb2NrLmphdmE=) | | |
| ... and [819 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9908204...a688be7](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-853866564
Good point.
Tell me if my general understanding of how timestamp-based checkpointing will be used is right.
The user would use timestamp-based checkpointing in DeltaStreamer only for the bootstrap case.
From then on, checkpointing uses the regular Kafka checkpoint format of "topicName,0:123,1:456".
If that understanding is correct, then within KafkaOffsetGen we have to parse the checkpoint as a timestamp the first time (bootstrap), and from the second time on we fall back to the regular checkpoint parsing mechanism.
I see we have InitialCheckPointProvider. Let me think about how to go about this and get back to you. For now, this is what I can think of:
InitialCheckPointProvider will expose a getCheckpointType() method.
We add the checkpoint type as a property to the configs if an InitialCheckPointProvider is set, around [here](https://github.com/apache/hudi/blob/f6eee77636223077cfd2ce516f1b8805dfa6e35e/hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java#L132).
Within readFromSource() in DeltaSync, if the checkpoint is fetched from commit metadata, we would not honor this checkpoint type, or we would clear the checkpoint type property if it is set.
But if the checkpoint is fetched from cfg.checkpoint, we leave the property as is and let KafkaOffsetGen handle checkpoint parsing.
Let me think this through some more. In the meantime, it would be great if you could confirm my understanding of how timestamp-based checkpointing is used.
CC @n3nash @bvaradar
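To make the two checkpoint shapes discussed above concrete, here is a minimal sketch in plain Java. It is not code from this PR or from KafkaOffsetGen: the class `CheckpointFormats` and its methods are hypothetical names, and treating "all digits" as the timestamp form is an assumption made only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Hudi. */
final class CheckpointFormats {

  private CheckpointFormats() { }

  /** Assumption: a bare run of digits is treated as an epoch timestamp checkpoint. */
  static boolean looksLikeTimestamp(String checkpoint) {
    return checkpoint != null && checkpoint.matches("\\d+");
  }

  /** Returns the topic name from a checkpoint such as "topicName,0:123,1:456". */
  static String topicOf(String checkpoint) {
    return checkpoint.split(",")[0];
  }

  /** Parses "topicName,0:123,1:456" into partition -> offset, keeping the listed order. */
  static Map<Integer, Long> parseOffsets(String checkpoint) {
    String[] parts = checkpoint.split(",");
    if (parts.length < 2) {
      throw new IllegalArgumentException(
          "Expected a topic followed by partition:offset pairs: " + checkpoint);
    }
    Map<Integer, Long> offsets = new LinkedHashMap<>();
    for (int i = 1; i < parts.length; i++) {
      String[] pair = parts[i].split(":");
      if (pair.length != 2) {
        throw new IllegalArgumentException("Bad partition:offset pair: " + parts[i]);
      }
      offsets.put(Integer.parseInt(pair[0].trim()), Long.parseLong(pair[1].trim()));
    }
    return offsets;
  }
}
```

With this sketch, a bootstrap run could check `looksLikeTimestamp(...)` once, and every later run would go through `parseOffsets(...)` on the checkpoint stored in the commit metadata.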
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863968623
DeltaSync should reset this (...kafka.checkpoint.type) configuration (similar to how we reset checkpoints).
With that approach, we may need to store the checkpoint type in the metadata file. If it is only modified in memory, the risk is greater. I have pushed my latest implementation; please take a look and see whether it is feasible.
@nsivabalan
[GitHub] [hudi] codecov-commenter edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-850284847
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#2438](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a39570d) into [master](https://codecov.io/gh/apache/hudi/commit/6eca06d074520140d7bc67b48bd2b9a5b76f0a87?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6eca06d) will **decrease** coverage by `3.51%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2438/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #2438 +/- ##
============================================
- Coverage 47.51% 43.99% -3.52%
+ Complexity 5429 3918 -1511
============================================
Files 922 730 -192
Lines 40968 32657 -8311
Branches 4105 3245 -860
============================================
- Hits 19464 14366 -5098
+ Misses 19780 16957 -2823
+ Partials 1724 1334 -390
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `40.00% <ø> (ø)` | |
| hudiclient | `22.94% <ø> (-11.65%)` | :arrow_down: |
| hudicommon | `48.39% <ø> (+0.01%)` | :arrow_up: |
| hudiflink | `60.07% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.10% <ø> (ø)` | |
| hudisync | `?` | |
| huditimelineservice | `?` | |
| hudiutilities | `?` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | | |
| [...he/hudi/utilities/transform/AWSDmsTransformer.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3RyYW5zZm9ybS9BV1NEbXNUcmFuc2Zvcm1lci5qYXZh) | | |
| [...ties/exception/HoodieIncrementalPullException.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2V4Y2VwdGlvbi9Ib29kaWVJbmNyZW1lbnRhbFB1bGxFeGNlcHRpb24uamF2YQ==) | | |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | | |
| [...src/main/java/org/apache/hudi/dla/DLASyncTool.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktZGxhLXN5bmMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZGxhL0RMQVN5bmNUb29sLmphdmE=) | | |
| [...i/utilities/deltastreamer/SourceFormatAdapter.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvU291cmNlRm9ybWF0QWRhcHRlci5qYXZh) | | |
| [...a/org/apache/hudi/metrics/DistributedRegistry.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9EaXN0cmlidXRlZFJlZ2lzdHJ5LmphdmE=) | | |
| [...g/apache/hudi/keygen/GlobalDeleteKeyGenerator.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkva2V5Z2VuL0dsb2JhbERlbGV0ZUtleUdlbmVyYXRvci5qYXZh) | | |
| [.../hudi/utilities/sources/helpers/AvroConvertor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvaGVscGVycy9BdnJvQ29udmVydG9yLmphdmE=) | | |
| [...llback/SparkMergeOnReadRollbackActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL3JvbGxiYWNrL1NwYXJrTWVyZ2VPblJlYWRSb2xsYmFja0FjdGlvbkV4ZWN1dG9yLmphdmE=) | | |
| ... and [181 more](https://codecov.io/gh/apache/hudi/pull/2438/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6eca06d...a39570d](https://codecov.io/gh/apache/hudi/pull/2438?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] nsivabalan edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
I guess we can simplify things. Let me go over the relevant pseudocode.
Within DeltaSync.read():
```
// set the right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```
Note that the first `if` condition deals with Checkpoint_RESET_Key, whereas the second (`else if`) condition deals with Checkpoint_Key.
I have simplified some exception cases, but this should give you the gist.
Within write():
```
// towards the end
commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // checkpoint after writing
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so in subsequent batches the checkpoint is parsed from commitMetadata.
The only addition in this PR is that we are introducing a new checkpoint type. Let me propose a simple add-on to the code above that would work for us.
Within DeltaSync.read():
```
// set the right checkpoint value
boolean resetCheckpointType = true; // New addition
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false; // New addition
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (resetCheckpointType) {
  // reset checkpoint type if set
}
```
No other changes are required. This is based on the assumption that Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first batch, the checkpoint type may be set, but no Checkpoint_RESET_Key is set yet. From the second batch on, it is the reverse: the checkpoint type should not be set, but Checkpoint_RESET_Key should be part of the commit metadata. Given this assumption, we don't need to add the checkpoint type to commitMetadata and can still decide whether to use the checkpoint type.
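The read-side proposal above can be sketched as self-contained Java. This is an illustration under stated assumptions, not the DeltaSync implementation: `CheckpointResolver` is a hypothetical class, plain `Map`s stand in for the commit metadata and the properties, the two key strings are placeholders rather than the real constants, and "reset checkpoint type" is modelled as removing the property. Only the property name `hoodie.deltastreamer.source.kafka.checkpoint.type` is taken from the PR diff.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the proposed read-side logic; not Hudi code. */
final class CheckpointResolver {

  // Placeholder key names for this sketch; the real constants live in DeltaSync.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";
  // Property name as it appears in the PR diff.
  static final String CHECKPOINT_TYPE_PROP = "hoodie.deltastreamer.source.kafka.checkpoint.type";

  private CheckpointResolver() { }

  static Optional<String> resolve(String cfgCheckpoint,
                                  Map<String, String> commitMetadata,
                                  Map<String, String> props) {
    Optional<String> checkpoint;
    boolean resetCheckpointType = true;
    if (cfgCheckpoint != null && !commitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
      // First batch: honor the user-supplied checkpoint and keep its type.
      checkpoint = Optional.of(cfgCheckpoint);
      resetCheckpointType = false;
    } else if (commitMetadata.containsKey(CHECKPOINT_KEY)) {
      // Later batches: resume from the checkpoint stored with the last commit.
      checkpoint = Optional.of(commitMetadata.get(CHECKPOINT_KEY));
    } else {
      checkpoint = Optional.empty();
    }
    if (resetCheckpointType) {
      // Assumption: "reset" means dropping the property so regular parsing applies.
      props.remove(CHECKPOINT_TYPE_PROP);
    }
    return checkpoint;
  }
}
```

In this sketch the checkpoint type survives only while the user-supplied checkpoint is the one being used. Once the commit metadata carries the reset key, the type is dropped and the stored checkpoint is parsed the regular way.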
[GitHub] [hudi] nsivabalan commented on a change in pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r657546921
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -461,7 +465,7 @@ public void refreshTimeline() throws IOException {
if (!hasErrors || cfg.commitOnErrors) {
HashMap<String, String> checkpointCommitMetadata = new HashMap<>();
checkpointCommitMetadata.put(CHECKPOINT_KEY, checkpointStr);
- if (cfg.checkpoint != null) {
+ if (cfg.checkpoint != null && !"timestamp".equals(props.getString("hoodie.deltastreamer.source.kafka.checkpoint.type"))) {
Review comment:
Can you help me understand why we need this? My understanding is that the user sets cfg.checkpoint for the first batch and sets the checkpoint type (to timestamp) as well. But for any checkpoint type, we should still add the checkpoint_reset_key here at the end of the first batch. Am I missing something?
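The write-side behavior this comment argues for can be sketched as follows. This is a hypothetical illustration, not the DeltaSync code: `CheckpointCommitMetadata` and its `build` method are made-up names, the key strings are placeholders, and storing the cfg.checkpoint value under the reset key is an assumption. The point is only that the reset key is written whenever cfg.checkpoint is set, whatever the checkpoint type.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the write-side metadata; not Hudi code. */
final class CheckpointCommitMetadata {

  // Placeholder key names for this sketch; the real constants live in DeltaSync.
  static final String CHECKPOINT_KEY = "checkpoint.key";
  static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

  private CheckpointCommitMetadata() { }

  /**
   * Builds the extra commit metadata written at the end of a batch.
   * The reset key is added for every checkpoint type, timestamp included.
   */
  static Map<String, String> build(String checkpointStr, String cfgCheckpoint) {
    Map<String, String> metadata = new HashMap<>();
    metadata.put(CHECKPOINT_KEY, checkpointStr);
    if (cfgCheckpoint != null) {
      // Assumption: the user-supplied checkpoint is stored as the value.
      metadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
    }
    return metadata;
  }
}
```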
[GitHub] [hudi] hudi-bot edited a comment on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-863310563
## CI report:
* 5022f1d97e4e9b140d8e41b5b49c034ceb9ae601 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=858)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] liujinhui1994 closed pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 closed pull request #2438:
URL: https://github.com/apache/hudi/pull/2438
[GitHub] [hudi] liujinhui1994 commented on pull request #2438: [HUDI-1447] DeltaStreamer kafka source supports consuming from specified timestamp
Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-881820983
@nsivabalan Thank you for your concern and patience to help!