Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/06/21 08:43:13 UTC
[GitHub] [hudi] danny0405 opened a new pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
danny0405 opened a new pull request #3122:
URL: https://github.com/apache/hudi/pull/3122
… NPE for file group that has only logs
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] zhangjun0x01 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
zhangjun0x01 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r655213190
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -165,16 +165,20 @@
       Map<String, List<HoodieBaseFile>> groupedInputSplits = partitionsToParquetSplits.get(partitionPath).stream()
           .collect(Collectors.groupingBy(file -> FSUtils.getFileId(file.getFileStatus().getPath().getName())));
       latestFileSlices.forEach(fileSlice -> {
-        List<HoodieBaseFile> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
-        dataFileSplits.forEach(split -> {
-          try {
-            List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
-                .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
-            resultMap.put(split, logFilePaths);
-          } catch (Exception e) {
-            throw new HoodieException("Error creating hoodie real time split ", e);
-          }
-        });
+        final String fileId = fileSlice.getFileId();
+        // filter out the file group that has only logs (say the index is global).
+        if (groupedInputSplits.containsKey(fileId)) {
+          List<HoodieBaseFile> dataFileSplits = groupedInputSplits.get(fileId);
+          dataFileSplits.forEach(split -> {
+            try {
+              List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+                  .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
+              resultMap.put(split, logFilePaths);
+            } catch (Exception e) {
+              throw new HoodieException("Error creating hoodie real time split ", e);
Review comment:
I mean modifying the message text, for example 'Error creating hoodie real time split for group logs by BaseFile', because the exception messages in the `groupLogsByBaseFile` and `getRealtimeSplits` methods are identical.
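One way to make the two failure sites distinguishable is to put the name of the failing method into the message. The following is a minimal sketch, not the change made in this PR: `SplitErrorMessages`, `SplitException` (a stand-in for `HoodieException`) and `wrap` are hypothetical names introduced only for illustration.

```java
public class SplitErrorMessages {

    // Hypothetical stand-in for HoodieException so the sketch compiles on its own.
    public static class SplitException extends RuntimeException {
        public SplitException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Builds an exception whose message names the method that failed,
    // so the two call sites no longer produce identical text.
    public static RuntimeException wrap(String methodName, Throwable cause) {
        return new SplitException("Error creating hoodie real time split in " + methodName, cause);
    }

    public static void main(String[] args) {
        RuntimeException fromGrouping = wrap("groupLogsByBaseFile", new IllegalStateException("boom"));
        RuntimeException fromSplits = wrap("getRealtimeSplits", new IllegalStateException("boom"));
        System.out.println(fromGrouping.getMessage());
        System.out.println(fromSplits.getMessage());
    }
}
```

The original cause is passed through unchanged, so the stack trace still points at the underlying failure.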
[GitHub] [hudi] danny0405 closed pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 closed pull request #3122:
URL: https://github.com/apache/hudi/pull/3122
[GitHub] [hudi] vinothchandar commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-874493680
Does #3203 solve this?
[GitHub] [hudi] danny0405 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r655208566
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
Review comment:
The exception is already thrown; what kind of exception do you suggest it should be?
Review comment:
> When we query the Hudi table with Hive, the same NPE may also occur [here](https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java#L105); should we add the same check?
Oops, I guess we should.
[GitHub] [hudi] codecov-commenter commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9d1fc32) into [master](https://codecov.io/gh/apache/hudi/commit/cdb9b48170ef98634babd8954392efb1c1b90fcf?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cdb9b48) will **decrease** coverage by `42.71%`.
> The diff coverage is `n/a`.
```diff
@@ Coverage Diff @@
## master #3122 +/- ##
============================================
- Coverage 45.85% 3.13% -42.72%
+ Complexity 5269 82 -5187
============================================
Files 908 274 -634
Lines 39332 10650 -28682
Branches 4239 1088 -3151
============================================
- Hits 18036 334 -17702
+ Misses 19451 10290 -9161
+ Partials 1845 26 -1819
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.45%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `6.72% <ø> (-45.01%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.52% <ø> (-47.12%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [744 more](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [cdb9b48...9d1fc32](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] zhangjun0x01 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
zhangjun0x01 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864873076
When we query the Hudi table with Hive, the same NPE may also occur [here](https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java#L105); should we add the same check?
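The failure mode is that `Map.get` returns `null` for a file ID that has no base file, and the following `forEach` then dereferences it. A minimal sketch of a lookup that both call sites could share is shown here, assuming plain strings in place of `HoodieBaseFile`; `NullSafeGroupLookup` and `splitsFor` are hypothetical names, not Hudi APIs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullSafeGroupLookup {

    // Returns the base-file splits for a file ID, or an empty list when the
    // file group has no base file (for example, a group that has only logs).
    // Assumes the map never stores null values, as with Collectors.groupingBy.
    public static <T> List<T> splitsFor(Map<String, List<T>> groupedSplits, String fileId) {
        return groupedSplits.getOrDefault(fileId, Collections.emptyList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> grouped = new HashMap<>();
        grouped.put("file-1", Arrays.asList("file-1_base.parquet"));

        // "file-2" stands for a log-only file group: iterating is a no-op, not an NPE.
        for (String fileId : Arrays.asList("file-1", "file-2")) {
            splitsFor(grouped, fileId).forEach(split -> System.out.println(fileId + " -> " + split));
        }
    }
}
```

Note that a guard of this kind only avoids the exception; the records held in a log-only file group are still left out of the result.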
[GitHub] [hudi] zhangjun0x01 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
zhangjun0x01 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r655204771
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
Review comment:
nit: Should we modify the exception message so that, when an exception is thrown, we can distinguish it from the one in the `getRealtimeSplits` method?
[GitHub] [hudi] xiarixiaoyao commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-869514439
Hi @danny0405, yes, we will fix this problem in HUDI-2086, so it is OK to close this one.
[GitHub] [hudi] xiarixiaoyao commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-869486590
Hi @vinothchandar @danny0405, I think we should not filter out those logs directly, because they contain data that we need. We encountered this problem and solved it in our products. It is sub_problem3 of HUDI-2086, and the corresponding PR will follow in the next few days.
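To illustrate the objection, the sketch below keeps a log-only file group as a split with no base file instead of dropping it. This is not the HUDI-2086 implementation, which this thread does not describe; `KeepLogOnlyGroups`, `SplitPlan` and `plan` are hypothetical names, plain strings stand in for Hudi's file types, and only file groups that have at least one log file are considered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class KeepLogOnlyGroups {

    // Hypothetical split description: the base file may be absent, the logs are always kept.
    public static final class SplitPlan {
        public final Optional<String> baseFile;
        public final List<String> logFiles;

        public SplitPlan(Optional<String> baseFile, List<String> logFiles) {
            this.baseFile = baseFile;
            this.logFiles = logFiles;
        }
    }

    // Pairs every base file with the logs of its file group; a group without
    // any base file still yields one plan that carries only its logs.
    public static List<SplitPlan> plan(Map<String, List<String>> baseFilesByFileId,
                                       Map<String, List<String>> logFilesByFileId) {
        List<SplitPlan> plans = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : logFilesByFileId.entrySet()) {
            List<String> baseFiles =
                baseFilesByFileId.getOrDefault(entry.getKey(), Collections.emptyList());
            if (baseFiles.isEmpty()) {
                plans.add(new SplitPlan(Optional.empty(), entry.getValue()));
            } else {
                for (String baseFile : baseFiles) {
                    plans.add(new SplitPlan(Optional.of(baseFile), entry.getValue()));
                }
            }
        }
        return plans;
    }

    public static void main(String[] args) {
        Map<String, List<String>> baseFiles = new TreeMap<>();
        baseFiles.put("file-1", Arrays.asList("file-1_base.parquet"));
        Map<String, List<String>> logFiles = new TreeMap<>();
        logFiles.put("file-1", Arrays.asList(".file-1.log.1"));
        logFiles.put("file-2", Arrays.asList(".file-2.log.1", ".file-2.log.2"));

        for (SplitPlan p : plan(baseFiles, logFiles)) {
            System.out.println(p.baseFile.orElse("<no base file>") + " + " + p.logFiles);
        }
    }
}
```

A reader consuming such plans would need to handle the empty-base case explicitly, which is the part a simple `containsKey` guard sidesteps.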
[GitHub] [hudi] hudi-bot commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 9d1fc32406d8633cef61672457c5448cf5aa4dfa UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] zhangjun0x01 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
zhangjun0x01 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864873076
When we use Hive to query the Hudi table, the same NPE may also occur [here](https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java#L105). Should we add the same check there?
[GitHub] [hudi] danny0405 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-874503060
> Does #3203 solve this?
I think it should, would just close this one.
[GitHub] [hudi] danny0405 closed pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 closed pull request #3122:
URL: https://github.com/apache/hudi/pull/3122
[GitHub] [hudi] vinothchandar commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r664267191
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
##########
@@ -209,6 +210,12 @@ public Configuration getConf() {
colNamesWithTypesForExternal.size(),
true);
}
+ } else if (split instanceof RealtimeSplit) {
Review comment:
Can we do this in the subclass? I'm not sure we want to make this class aware of RealtimeSplit.
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
##########
@@ -253,4 +260,34 @@ private BootstrapBaseFileSplit makeExternalFileSplit(PathWithBootstrapFileStatus
throw new HoodieIOException(e.getMessage(), e);
}
}
+
+ /**
+ * Dummy record reader that outputs nothing.
+ */
+ public static class DummyRecordReader implements RecordReader<NullWritable, ArrayWritable> {
Review comment:
rename: NoopRecordReader
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -102,28 +105,37 @@
HoodieTimeline.ROLLBACK_ACTION, HoodieTimeline.DELTA_COMMIT_ACTION, HoodieTimeline.REPLACE_COMMIT_ACTION))
.filterCompletedInstants().lastInstant().get().getTimestamp();
latestFileSlices.forEach(fileSlice -> {
+ List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+ .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
List<FileSplit> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
- dataFileSplits.forEach(split -> {
- try {
- List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
- .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
- if (split instanceof BootstrapBaseFileSplit) {
- BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
- String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
- .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
- String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
- .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
- FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
- hosts, inMemoryHosts);
- rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
- logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
- } else {
- rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+ if (dataFileSplits != null) {
+ dataFileSplits.forEach(split -> {
+ try {
+ if (split instanceof BootstrapBaseFileSplit) {
+ BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
+ String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+ .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
+ String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+ .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
+ FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
+ hosts, inMemoryHosts);
+ rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
+ logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
+ } else {
+ rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+ }
+ } catch (IOException e) {
+ throw new HoodieIOException("Error creating hoodie real time split ", e);
}
+ });
+ } else {
+ // the file group has only logs (say the index is global).
+ try {
+ rtSplits.add(new HoodieRealtimeFileSplit(DummyInputSplit.INSTANCE, metaClient.getBasePath(), logFilePaths, maxCommitTime));
Review comment:
This code seems intended solely to avoid reading this split if it only has logs? Can't we just skip this fileId/file slice instead of adding a dummy split and record reader? They add quite a bit of complexity here. I wonder if we can fix this in a localized manner, if all we want to do is avoid the NPE.
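The localized fix suggested above could look roughly like the following sketch. It is a simplified model, not the Hudi code: `SkipLogOnlySlices` is a hypothetical class, and file IDs and paths are plain strings standing in for Hudi's `FileSlice` and `FileSplit` types.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model; none of these names exist in Hudi.
final class SkipLogOnlySlices {

  /** Pairs each base file path with the log paths of its file slice. */
  static Map<String, List<String>> groupLogsByBaseFile(
      Map<String, List<String>> baseFilesByFileId,
      Map<String, List<String>> logFilesByFileId) {
    Map<String, List<String>> result = new HashMap<>();
    for (Map.Entry<String, List<String>> slice : logFilesByFileId.entrySet()) {
      List<String> baseFiles = baseFilesByFileId.get(slice.getKey());
      if (baseFiles == null) {
        // Log-only file group: Map.get returned null, so skip the slice
        // instead of calling forEach on the null list (the source of the NPE).
        continue;
      }
      for (String baseFile : baseFiles) {
        result.put(baseFile, slice.getValue());
      }
    }
    return result;
  }
}
```

The guard removes the NPE with a one-branch change, but records that exist only in the skipped logs are not returned to the reader.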
[GitHub] [hudi] danny0405 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r655208566
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -165,16 +165,20 @@
Map<String, List<HoodieBaseFile>> groupedInputSplits = partitionsToParquetSplits.get(partitionPath).stream()
.collect(Collectors.groupingBy(file -> FSUtils.getFileId(file.getFileStatus().getPath().getName())));
latestFileSlices.forEach(fileSlice -> {
- List<HoodieBaseFile> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
- dataFileSplits.forEach(split -> {
- try {
- List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
- .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
- resultMap.put(split, logFilePaths);
- } catch (Exception e) {
- throw new HoodieException("Error creating hoodie real time split ", e);
- }
- });
+ final String fileId = fileSlice.getFileId();
+ // filter out the file group that has only logs (say the index is global).
+ if (groupedInputSplits.containsKey(fileId)) {
+ List<HoodieBaseFile> dataFileSplits = groupedInputSplits.get(fileId);
+ dataFileSplits.forEach(split -> {
+ try {
+ List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+ .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
+ resultMap.put(split, logFilePaths);
+ } catch (Exception e) {
+ throw new HoodieException("Error creating hoodie real time split ", e);
Review comment:
The exception is already thrown; what kind of exception do you suggest it should be?
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
}, {
"hash" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307",
"triggerID" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"triggerType" : "PUSH"
}, {
"hash" : "84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 530787e9a14f8ac180399f4641a58e5eff6ada11 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307)
* 84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
}, {
"hash" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307",
"triggerID" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"triggerType" : "PUSH"
}, {
"hash" : "84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=555",
"triggerID" : "84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=555)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] danny0405 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r664272560
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -102,28 +105,37 @@
HoodieTimeline.ROLLBACK_ACTION, HoodieTimeline.DELTA_COMMIT_ACTION, HoodieTimeline.REPLACE_COMMIT_ACTION))
.filterCompletedInstants().lastInstant().get().getTimestamp();
latestFileSlices.forEach(fileSlice -> {
+ List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+ .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
List<FileSplit> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
- dataFileSplits.forEach(split -> {
- try {
- List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
- .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
- if (split instanceof BootstrapBaseFileSplit) {
- BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
- String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
- .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
- String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
- .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
- FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
- hosts, inMemoryHosts);
- rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
- logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
- } else {
- rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+ if (dataFileSplits != null) {
+ dataFileSplits.forEach(split -> {
+ try {
+ if (split instanceof BootstrapBaseFileSplit) {
+ BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
+ String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+ .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
+ String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+ .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
+ FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
+ hosts, inMemoryHosts);
+ rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
+ logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
+ } else {
+ rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+ }
+ } catch (IOException e) {
+ throw new HoodieIOException("Error creating hoodie real time split ", e);
}
+ });
+ } else {
+ // the file group has only logs (say the index is global).
+ try {
+ rtSplits.add(new HoodieRealtimeFileSplit(DummyInputSplit.INSTANCE, metaClient.getBasePath(), logFilePaths, maxCommitTime));
Review comment:
Personally, I want to fix the NPE first; reading only the file groups that have parquet files is better than throwing an exception. If you think reading the log-only file groups adds quite a bit of complexity, we can avoid that.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (530787e) into [master](https://codecov.io/gh/apache/hudi/commit/cdb9b48170ef98634babd8954392efb1c1b90fcf?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cdb9b48) will **increase** coverage by `0.15%`.
> The diff coverage is `22.58%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3122/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3122 +/- ##
============================================
+ Coverage 45.85% 46.01% +0.15%
- Complexity 5269 5306 +37
============================================
Files 908 911 +3
Lines 39332 39480 +148
Branches 4239 4256 +17
============================================
+ Hits 18036 18167 +131
- Misses 19451 19458 +7
- Partials 1845 1855 +10
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.95% <ø> (ø)` | |
| hudiclient | `30.44% <ø> (ø)` | |
| hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
| hudiflink | `61.33% <ø> (ø)` | |
| hudihadoopmr | `51.23% <22.58%> (-0.06%)` | :arrow_down: |
| hudisparkdatasource | `66.52% <ø> (ø)` | |
| hudisync | `51.73% <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | |
| hudiutilities | `58.37% <ø> (+1.73%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...i/hadoop/utils/HoodieRealtimeInputFormatUtils.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZVJlYWx0aW1lSW5wdXRGb3JtYXRVdGlscy5qYXZh) | `34.45% <22.58%> (-0.33%)` | :arrow_down: |
| [...ache/hudi/table/action/rollback/RollbackUtils.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9yb2xsYmFjay9Sb2xsYmFja1V0aWxzLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [.../org/apache/hudi/utilities/sources/JdbcSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSmRiY1NvdXJjZS5qYXZh) | `92.00% <0.00%> (ø)` | |
| [...apache/hudi/exception/HoodieMetadataException.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZU1ldGFkYXRhRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/utilities/SqlQueryBuilder.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL1NxbFF1ZXJ5QnVpbGRlci5qYXZh) | `92.50% <0.00%> (ø)` | |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `72.83% <0.00%> (+0.39%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [cdb9b48...530787e](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
}, {
"hash" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307",
"triggerID" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 9d1fc32406d8633cef61672457c5448cf5aa4dfa Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302)
* 530787e9a14f8ac180399f4641a58e5eff6ada11 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] danny0405 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-871114694
@vinothchandar I have modified the PR to support reading file groups that contain only logs. Can you take another look? Thanks.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 9d1fc32406d8633cef61672457c5448cf5aa4dfa Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] danny0405 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864876572
Hi @vinothchandar, can you also take a look? I guess this is also a blocker, because the global index causes this problem (a file group that has no parquet file triggers the NPE).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 9d1fc32406d8633cef61672457c5448cf5aa4dfa Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] zhangjun0x01 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
zhangjun0x01 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r655204771
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -165,16 +165,20 @@
Map<String, List<HoodieBaseFile>> groupedInputSplits = partitionsToParquetSplits.get(partitionPath).stream()
.collect(Collectors.groupingBy(file -> FSUtils.getFileId(file.getFileStatus().getPath().getName())));
latestFileSlices.forEach(fileSlice -> {
- List<HoodieBaseFile> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
- dataFileSplits.forEach(split -> {
- try {
- List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
- .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
- resultMap.put(split, logFilePaths);
- } catch (Exception e) {
- throw new HoodieException("Error creating hoodie real time split ", e);
- }
- });
+ final String fileId = fileSlice.getFileId();
+ // filter out the file group that has only logs (say the index is global).
+ if (groupedInputSplits.containsKey(fileId)) {
+ List<HoodieBaseFile> dataFileSplits = groupedInputSplits.get(fileId);
+ dataFileSplits.forEach(split -> {
+ try {
+ List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+ .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
+ resultMap.put(split, logFilePaths);
+ } catch (Exception e) {
+ throw new HoodieException("Error creating hoodie real time split ", e);
Review comment:
nit: Should we modify the exception message so that, when the exception is thrown, we can distinguish it from the one in the `getRealtimeSplits` method?
Review comment:
I mean we should modify this message, for example to 'Error creating hoodie real time split for group logs by BaseFile', because the exception messages of the `groupLogsByBaseFile` and `getRealtimeSplits` methods are the same.
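A small Java sketch of that suggestion, assuming a stand-in exception class rather than the real `HoodieException`. The class `SplitErrorMessages` and its two factory methods are hypothetical; the message strings are the ones quoted in this thread.

```java
import java.io.IOException;

// Hypothetical illustration; SplitException is a stand-in for HoodieException.
class SplitErrorMessages {

  static class SplitException extends RuntimeException {
    SplitException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  // Message currently used at both call sites according to the review comment.
  static RuntimeException inGetRealtimeSplits(Exception cause) {
    return new SplitException("Error creating hoodie real time split ", cause);
  }

  // Message proposed in the review so that the failing method can be identified.
  static RuntimeException inGroupLogsByBaseFile(Exception cause) {
    return new SplitException(
        "Error creating hoodie real time split for group logs by BaseFile", cause);
  }

  public static void main(String[] args) {
    Exception cause = new IOException("simulated failure");
    System.out.println(inGetRealtimeSplits(cause).getMessage());
    System.out.println(inGroupLogsByBaseFile(cause).getMessage());
  }
}
```

With distinct messages, a stack trace is no longer needed to tell which of the two methods failed.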
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (84bfc2d) into [master](https://codecov.io/gh/apache/hudi/commit/202887b8ca27eb6de808ba7a2e737b13ae9eb8c0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (202887b) will **decrease** coverage by `29.88%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3122/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3122 +/- ##
=============================================
- Coverage 46.19% 16.30% -29.89%
+ Complexity 5385 474 -4911
=============================================
Files 921 280 -641
Lines 40040 10875 -29165
Branches 4294 1106 -3188
=============================================
- Hits 18495 1773 -16722
+ Misses 19661 8944 -10717
+ Partials 1884 158 -1726
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.44% <ø> (+0.03%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/hudi/keygen/constant/KeyGeneratorType.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2tleWdlbi9jb25zdGFudC9LZXlHZW5lcmF0b3JUeXBlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/hudi/client/utils/ConcatenatingIterator.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC91dGlscy9Db25jYXRlbmF0aW5nSXRlcmF0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../hudi/execution/bulkinsert/BulkInsertSortMode.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2V4ZWN1dGlvbi9idWxraW5zZXJ0L0J1bGtJbnNlcnRTb3J0TW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...able/action/compact/CompactionTriggerStrategy.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21wYWN0L0NvbXBhY3Rpb25UcmlnZ2VyU3RyYXRlZ3kuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [707 more](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [202887b...84bfc2d](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
## CI report:
* 530787e9a14f8ac180399f4641a58e5eff6ada11 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307)
[GitHub] [hudi] codecov-commenter commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9d1fc32) into [master](https://codecov.io/gh/apache/hudi/commit/cdb9b48170ef98634babd8954392efb1c1b90fcf?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cdb9b48) will **decrease** coverage by `42.71%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3122/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3122 +/- ##
============================================
- Coverage 45.85% 3.13% -42.72%
+ Complexity 5269 82 -5187
============================================
Files 908 274 -634
Lines 39332 10650 -28682
Branches 4239 1088 -3151
============================================
- Hits 18036 334 -17702
+ Misses 19451 10290 -9161
+ Partials 1845 26 -1819
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.45%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `6.72% <ø> (-45.01%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.52% <ø> (-47.12%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [744 more](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [cdb9b48...9d1fc32](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] danny0405 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r655226296
Review comment:
> When we use Hive to query the Hudi table, the same NPE may also occur [here](https://github.com/apache/hudi/blob/master/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java#L105). Should we add the same check?
Oops, I guess we should.
[GitHub] [hudi] danny0405 commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r664272560
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -102,28 +105,37 @@
HoodieTimeline.ROLLBACK_ACTION, HoodieTimeline.DELTA_COMMIT_ACTION, HoodieTimeline.REPLACE_COMMIT_ACTION))
.filterCompletedInstants().lastInstant().get().getTimestamp();
latestFileSlices.forEach(fileSlice -> {
+ List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+ .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
List<FileSplit> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
- dataFileSplits.forEach(split -> {
- try {
- List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
- .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
- if (split instanceof BootstrapBaseFileSplit) {
- BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
- String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
- .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
- String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
- .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
- FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
- hosts, inMemoryHosts);
- rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
- logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
- } else {
- rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+ if (dataFileSplits != null) {
+ dataFileSplits.forEach(split -> {
+ try {
+ if (split instanceof BootstrapBaseFileSplit) {
+ BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
+ String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+ .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
+ String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+ .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
+ FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
+ hosts, inMemoryHosts);
+ rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
+ logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
+ } else {
+ rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+ }
+ } catch (IOException e) {
+ throw new HoodieIOException("Error creating hoodie real time split ", e);
}
+ });
+ } else {
+ // the file group has only logs (say the index is global).
+ try {
+ rtSplits.add(new HoodieRealtimeFileSplit(DummyInputSplit.INSTANCE, metaClient.getBasePath(), logFilePaths, maxCommitTime));
Review comment:
Personally, I want to fix the NPE first: reading only the file groups that have parquet files is better than throwing an exception. If you think reading the log-only file groups adds quite a bit of complexity, we can avoid that.
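The two options discussed here (skip log-only file groups, or emit a split for them) can be sketched in plain Java as follows. This is a simplified model under stated assumptions: `LogOnlySplitPlanner`, `Slice`, `plan`, the `includeLogOnly` flag, and the `<no-base-file>` placeholder are all hypothetical names used for illustration, and splits are represented as strings rather than Hadoop `FileSplit` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the split planning; not the Hudi implementation.
class LogOnlySplitPlanner {

  // Simplified file slice: a file ID plus the paths of its log files.
  static final class Slice {
    final String fileId;
    final List<String> logPaths;

    Slice(String fileId, List<String> logPaths) {
      this.fileId = fileId;
      this.logPaths = logPaths;
    }
  }

  // Builds one description per split. Slices without a base file are skipped when
  // includeLogOnly is false, and emitted with a placeholder base file when it is true.
  static List<String> plan(Map<String, List<String>> baseFilesById,
                           List<Slice> slices,
                           boolean includeLogOnly) {
    List<String> splits = new ArrayList<>();
    for (Slice slice : slices) {
      List<String> baseFiles = baseFilesById.get(slice.fileId);
      if (baseFiles != null) {
        for (String base : baseFiles) {
          splits.add(base + " + " + slice.logPaths);
        }
      } else if (includeLogOnly) {
        splits.add("<no-base-file> + " + slice.logPaths);
      }
    }
    return splits;
  }

  public static void main(String[] args) {
    Map<String, List<String>> baseFilesById = new HashMap<>();
    baseFilesById.put("fg-1", Arrays.asList("fg-1.parquet"));
    List<Slice> slices = Arrays.asList(
        new Slice("fg-1", Arrays.asList(".fg-1.log.1")),
        new Slice("fg-2", Arrays.asList(".fg-2.log.1"))); // fg-2 has only logs

    System.out.println(plan(baseFilesById, slices, false));
    System.out.println(plan(baseFilesById, slices, true));
  }
}
```

Skipping keeps the change small at the cost of not reading records that exist only in logs; emitting a placeholder split reads them but requires the downstream reader to handle a split with no base file.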
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
## CI report:
* 530787e9a14f8ac180399f4641a58e5eff6ada11 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=307)
* 84bfc2d2c3bea3d088e3c1f8b0a390c586d80e48 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=555)
[GitHub] [hudi] danny0405 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-874503060
> Does #3203 solve this?
I think it should. I will just close this one.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (84bfc2d) into [master](https://codecov.io/gh/apache/hudi/commit/202887b8ca27eb6de808ba7a2e737b13ae9eb8c0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (202887b) will **decrease** coverage by `43.12%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3122/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@              Coverage Diff              @@
##             master    #3122       +/-   ##
============================================
- Coverage     46.19%    3.07%   -43.13%
+ Complexity     5385       82     -5303
============================================
  Files           921      280      -641
  Lines         40040    10875    -29165
  Branches       4294     1106     -3188
============================================
- Hits          18495      334    -18161
+ Misses        19661    10515     -9146
+ Partials       1884       26     -1858
```
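The summary figures in the table above can be cross-checked by hand. The sketch below assumes that coverage is hits divided by lines and that values are rounded down to two decimals. The rounding rule is inferred from the numbers in this report rather than taken from Codecov documentation, and the class and method names are illustrative only.

```java
final class CoverageCheck {
    /**
     * Coverage in hundredths of a percent, rounded down (4619 means 46.19%).
     * Assumed formula: hits / lines.
     */
    static long coverageHundredths(long hits, long lines) {
        return Math.floorDiv(hits * 10_000L, lines);
    }

    /**
     * Change in coverage (head minus base) in hundredths of a percentage point,
     * rounded down, computed exactly with integer arithmetic.
     */
    static long deltaHundredths(long baseHits, long baseLines, long headHits, long headLines) {
        long numerator = 10_000L * (headHits * baseLines - baseHits * headLines);
        return Math.floorDiv(numerator, baseLines * headLines);
    }

    public static void main(String[] args) {
        // master: 18495 hits out of 40040 lines
        System.out.println(coverageHundredths(18495, 40040));           // 4619, i.e. 46.19%
        // this PR's report: 334 hits out of 10875 lines
        System.out.println(coverageHundredths(334, 10875));             // 307, i.e. 3.07%
        // change between the two
        System.out.println(deltaHundredths(18495, 40040, 334, 10875));  // -4313, i.e. -43.13
    }
}
```

Under these assumptions the table's `-43.13%` is the rounded-down difference of the unrounded ratios, while the `43.12%` in the headline matches the difference of the two already rounded percentages.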
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <ø> (-30.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.46% <ø> (-48.95%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [753 more](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [202887b...84bfc2d](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] danny0405 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864876572
Hi @vinothchandar, can you also take a look? I think this is also a blocker, because the global index causes this problem: a file group that has no parquet base file triggers the NPE.
--
[GitHub] [hudi] vinothchandar commented on a change in pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#discussion_r664267191
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
##########
@@ -209,6 +210,12 @@ public Configuration getConf() {
               colNamesWithTypesForExternal.size(),
               true);
         }
+      } else if (split instanceof RealtimeSplit) {
Review comment:
Can we do this in the subclass? I'm not sure we want to make this class aware of `RealtimeSplit`.
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieParquetInputFormat.java
##########
@@ -253,4 +260,34 @@ private BootstrapBaseFileSplit makeExternalFileSplit(PathWithBootstrapFileStatus
       throw new HoodieIOException(e.getMessage(), e);
     }
   }
+
+  /**
+   * Dummy record reader that outputs nothing.
+   */
+  public static class DummyRecordReader implements RecordReader<NullWritable, ArrayWritable> {
Review comment:
rename: NoopRecordReader
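For reference, a no-op reader in the sense of this rename could look like the sketch below. It does not depend on Hadoop: `MiniRecordReader` is a hypothetical stand-in that borrows the method names of `org.apache.hadoop.mapred.RecordReader`, and returning `1.0f` from `getProgress()` is an assumption, not something taken from the PR.

```java
import java.io.IOException;

/** Hypothetical stand-in with the same method names as Hadoop's mapred RecordReader. */
interface MiniRecordReader<K, V> {
    boolean next(K key, V value) throws IOException;
    K createKey();
    V createValue();
    long getPos() throws IOException;
    float getProgress() throws IOException;
    void close() throws IOException;
}

/** A reader that never yields a record, so a consumer loop exits immediately. */
final class NoopRecordReader<K, V> implements MiniRecordReader<K, V> {
    @Override public boolean next(K key, V value) { return false; } // no records, ever
    @Override public K createKey() { return null; }
    @Override public V createValue() { return null; }
    @Override public long getPos() { return 0L; }
    @Override public float getProgress() { return 1.0f; }           // assumed: nothing left to read
    @Override public void close() { }                               // nothing to release
}

final class NoopRecordReaderDemo {
    /** Drains a reader the way a typical consumer would and counts the records seen. */
    static <K, V> int countRecords(MiniRecordReader<K, V> reader) throws IOException {
        K key = reader.createKey();
        V value = reader.createValue();
        int count = 0;
        while (reader.next(key, value)) {
            count++;
        }
        reader.close();
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countRecords(new NoopRecordReader<String, String>())); // prints 0
    }
}
```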
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeInputFormatUtils.java
##########
@@ -102,28 +105,37 @@
             HoodieTimeline.ROLLBACK_ACTION, HoodieTimeline.DELTA_COMMIT_ACTION, HoodieTimeline.REPLACE_COMMIT_ACTION))
             .filterCompletedInstants().lastInstant().get().getTimestamp();
         latestFileSlices.forEach(fileSlice -> {
+          List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
+              .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
           List<FileSplit> dataFileSplits = groupedInputSplits.get(fileSlice.getFileId());
-          dataFileSplits.forEach(split -> {
-            try {
-              List<String> logFilePaths = fileSlice.getLogFiles().sorted(HoodieLogFile.getLogFileComparator())
-                  .map(logFile -> logFile.getPath().toString()).collect(Collectors.toList());
-              if (split instanceof BootstrapBaseFileSplit) {
-                BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
-                String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
-                    .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
-                String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
-                    .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
-                FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
-                    hosts, inMemoryHosts);
-                rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
-                    logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
-              } else {
-                rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+          if (dataFileSplits != null) {
+            dataFileSplits.forEach(split -> {
+              try {
+                if (split instanceof BootstrapBaseFileSplit) {
+                  BootstrapBaseFileSplit eSplit = (BootstrapBaseFileSplit) split;
+                  String[] hosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+                      .filter(x -> !x.isInMemory()).toArray(String[]::new) : new String[0];
+                  String[] inMemoryHosts = split.getLocationInfo() != null ? Arrays.stream(split.getLocationInfo())
+                      .filter(SplitLocationInfo::isInMemory).toArray(String[]::new) : new String[0];
+                  FileSplit baseSplit = new FileSplit(eSplit.getPath(), eSplit.getStart(), eSplit.getLength(),
+                      hosts, inMemoryHosts);
+                  rtSplits.add(new RealtimeBootstrapBaseFileSplit(baseSplit, metaClient.getBasePath(),
+                      logFilePaths, maxCommitTime, eSplit.getBootstrapFileSplit()));
+                } else {
+                  rtSplits.add(new HoodieRealtimeFileSplit(split, metaClient.getBasePath(), logFilePaths, maxCommitTime));
+                }
+              } catch (IOException e) {
+                throw new HoodieIOException("Error creating hoodie real time split ", e);
               }
+            });
+          } else {
+            // the file group has only logs (say the index is global).
+            try {
+              rtSplits.add(new HoodieRealtimeFileSplit(DummyInputSplit.INSTANCE, metaClient.getBasePath(), logFilePaths, maxCommitTime));
Review comment:
This code seems intended solely to avoid reading this split when it has only logs. Can't we just skip this fileId/file slice instead of adding a dummy split and record reader? They add quite a bit of complexity here. I'm wondering whether we can fix this in a localized manner, if all we want to do is avoid the NPE.
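The two options in play here, skipping a log-only file group or still emitting a split for it, can be sketched as follows. This is a simplified illustration and not the PR's code: plain strings stand in for Hudi's file slices and splits, and `buildSplits`, `keepLogOnlyGroups` and the `<no-base-file>` marker are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LogOnlyFileGroupSketch {
    /**
     * Builds one "fileId|baseFile|logCount" descriptor per base-file split.
     * A file group with logs but no base-file split has no map entry, so the
     * lookup returns null; that case is guarded instead of dereferenced.
     */
    static List<String> buildSplits(Map<String, List<String>> baseSplitsByFileId,
                                    Map<String, List<String>> logsByFileId,
                                    boolean keepLogOnlyGroups) {
        List<String> splits = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : logsByFileId.entrySet()) {
            String fileId = entry.getKey();
            List<String> logs = entry.getValue();
            List<String> baseSplits = baseSplitsByFileId.get(fileId); // null for log-only groups
            if (baseSplits != null) {
                for (String baseFile : baseSplits) {
                    splits.add(fileId + "|" + baseFile + "|" + logs.size());
                }
            } else if (keepLogOnlyGroups && !logs.isEmpty()) {
                // Option A: keep the group so its log records stay readable.
                splits.add(fileId + "|<no-base-file>|" + logs.size());
            }
            // Option B (keepLogOnlyGroups == false): skip the group entirely.
        }
        return splits;
    }
}
```

Skipping is the smaller change, but it drops whatever records live only in those log files, so which option is right depends on whether readers need that data.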
--
[GitHub] [hudi] hudi-bot edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864851929
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"status" : "SUCCESS",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302",
"triggerID" : "9d1fc32406d8633cef61672457c5448cf5aa4dfa",
"triggerType" : "PUSH"
}, {
"hash" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "530787e9a14f8ac180399f4641a58e5eff6ada11",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 9d1fc32406d8633cef61672457c5448cf5aa4dfa Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=302)
* 530787e9a14f8ac180399f4641a58e5eff6ada11 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
--
[GitHub] [hudi] danny0405 commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
danny0405 commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-869492247
> Hi @vinothchandar @danny0405, I think we should not filter out those logs directly; they contain data that we need. We encountered this problem and solved it in our products. It is sub_problem3 of HUDI-2086, and the corresponding PR will follow in the next few days.
That's cool. So you mean you would fix that in HUDI-2086? Can we close this one instead?
--
[GitHub] [hudi] vinothchandar commented on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-874493680
Does #3203 solve this?
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (84bfc2d) into [master](https://codecov.io/gh/apache/hudi/commit/202887b8ca27eb6de808ba7a2e737b13ae9eb8c0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (202887b) will **decrease** coverage by `17.71%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3122/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@              Coverage Diff               @@
##             master    #3122       +/-    ##
=============================================
- Coverage     46.19%   28.47%   -17.72%
+ Complexity     5385     1262     -4123
=============================================
  Files           921      376      -545
  Lines         40040    14326    -25714
  Branches       4294     1458     -2836
=============================================
- Hits          18495     4080    -14415
+ Misses        19661     9951     -9710
+ Partials       1884      295     -1589
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `22.29% <ø> (-8.16%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.44% <ø> (+0.03%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../org/apache/hudi/hive/NonPartitionedExtractor.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvTm9uUGFydGl0aW9uZWRFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/metrics/MetricsReporterType.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvTWV0cmljc1JlcG9ydGVyVHlwZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...rg/apache/hudi/client/bootstrap/BootstrapMode.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9ib290c3RyYXAvQm9vdHN0cmFwTW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...he/hudi/hive/HiveStylePartitionValueExtractor.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN0eWxlUGFydGl0aW9uVmFsdWVFeHRyYWN0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../apache/hudi/keygen/constant/KeyGeneratorType.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2tleWdlbi9jb25zdGFudC9LZXlHZW5lcmF0b3JUeXBlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...pache/hudi/client/utils/ConcatenatingIterator.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC91dGlscy9Db25jYXRlbmF0aW5nSXRlcmF0b3IuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [.../hudi/execution/bulkinsert/BulkInsertSortMode.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2V4ZWN1dGlvbi9idWxraW5zZXJ0L0J1bGtJbnNlcnRTb3J0TW9kZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...able/action/compact/CompactionTriggerStrategy.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RhYmxlL2FjdGlvbi9jb21wYWN0L0NvbXBhY3Rpb25UcmlnZ2VyU3RyYXRlZ3kuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [611 more](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [202887b...84bfc2d](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3122: [HUDI-2048] HoodieRealtimeInputFormatUtils#groupLogsByBaseFile throws…
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3122:
URL: https://github.com/apache/hudi/pull/3122#issuecomment-864855800
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3122](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (84bfc2d) into [master](https://codecov.io/gh/apache/hudi/commit/202887b8ca27eb6de808ba7a2e737b13ae9eb8c0?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (202887b) will **increase** coverage by `0.00%`.
> The diff coverage is `36.95%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3122/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@            Coverage Diff            @@
##             master    #3122   +/-   ##
=========================================
  Coverage     46.19%   46.19%
- Complexity     5385     5386    +1
=========================================
  Files           921      921
  Lines         40040    40062   +22
  Branches       4294     4297    +3
=========================================
+ Hits          18495    18506   +11
- Misses        19661    19669    +8
- Partials       1884     1887    +3
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.95% <ø> (ø)` | |
| hudiclient | `30.45% <ø> (ø)` | |
| hudicommon | `47.56% <ø> (-0.02%)` | :arrow_down: |
| hudiflink | `59.91% <ø> (ø)` | |
| hudihadoopmr | `51.33% <36.95%> (+0.03%)` | :arrow_up: |
| hudisparkdatasource | `67.06% <ø> (ø)` | |
| hudisync | `54.05% <ø> (ø)` | |
| huditimelineservice | `64.36% <ø> (ø)` | |
| hudiutilities | `58.44% <ø> (+0.03%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...i/hadoop/utils/HoodieRealtimeInputFormatUtils.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL3V0aWxzL0hvb2RpZVJlYWx0aW1lSW5wdXRGb3JtYXRVdGlscy5qYXZh) | `34.40% <23.52%> (-0.39%)` | :arrow_down: |
| [...g/apache/hudi/hadoop/HoodieParquetInputFormat.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0hvb2RpZVBhcnF1ZXRJbnB1dEZvcm1hdC5qYXZh) | `44.66% <75.00%> (+4.00%)` | :arrow_up: |
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.12% <0.00%> (-1.57%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3122/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.28% <0.00%> (+0.33%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [202887b...84bfc2d](https://codecov.io/gh/apache/hudi/pull/3122?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--