You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/10 23:16:24 UTC
[GitHub] [hudi] prashantwason opened a new pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
prashantwason opened a new pull request #2565:
URL: https://github.com/apache/hudi/pull/2565
## What is the purpose of the pull request
During the bootstrap of the Metadata Table, all the directories which contain the partition metadata directory are assumed to be partitions and are added to the metadata table.
In our HDFS clusters, we have directories like .backup, .temp which are used by various teams for non-hoodie purposes (e.g. .backup may be keeping a snapshot of the dataset). During bootstrap, Metadata Table ends up containing all those paths also as partitions.
In this patch, I would like to introduce a configuration for HoodieMetadataConfig to filter out some directories based on a regular expression string.
## Brief change log
Added a config to HoodieMetadataConfig.
Added a unit test.
## Verify this pull request
This change added tests and can be verified as follows:
Extended the unit test TestHoodieBackedMetadata#testOnlyValidPartitionsAdded
## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io edited a comment on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-777120159
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=h1) Report
> Merging [#2565](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=desc) (6cb664d) into [master](https://codecov.io/gh/apache/hudi/commit/26da4f546275e8ab6496537743efe73510cb723d?el=desc) (26da4f5) will **increase** coverage by `0.11%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2565/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2565 +/- ##
============================================
+ Coverage 50.92% 51.04% +0.11%
- Complexity 3168 3174 +6
============================================
Files 433 433
Lines 19812 19901 +89
Branches 2033 2065 +32
============================================
+ Hits 10090 10159 +69
- Misses 8902 8919 +17
- Partials 820 823 +3
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `36.90% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `51.36% <0.00%> (-0.03%)` | `0.00 <0.00> (ø)` | |
| hudiflink | `43.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `69.73% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `48.61% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `66.49% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `70.00% <ø> (+0.49%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...pache/hudi/common/config/HoodieMetadataConfig.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2NvbmZpZy9Ib29kaWVNZXRhZGF0YUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `73.27% <0.00%> (+2.41%)` | `57.00% <0.00%> (+6.00%)` | |
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] n3nash commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-780972942
@prashantwason can you take care of @nsivabalan comment and concern, we can merge then.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] nbalajee commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
nbalajee commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-779996470
LGTM.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] n3nash closed pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
n3nash closed pull request #2565:
URL: https://github.com/apache/hudi/pull/2565
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io edited a comment on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-777120159
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=h1) Report
> Merging [#2565](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=desc) (9e7e18b) into [master](https://codecov.io/gh/apache/hudi/commit/26da4f546275e8ab6496537743efe73510cb723d?el=desc) (26da4f5) will **decrease** coverage by `41.24%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2565/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2565 +/- ##
============================================
- Coverage 50.92% 9.68% -41.25%
+ Complexity 3168 48 -3120
============================================
Files 433 53 -380
Lines 19812 1931 -17881
Branches 2033 230 -1803
============================================
- Hits 10090 187 -9903
+ Misses 8902 1731 -7171
+ Partials 820 13 -807
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudiflink | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `9.68% <ø> (-59.84%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| ... and [409 more](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] prashantwason commented on a change in pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
prashantwason commented on a change in pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#discussion_r578801683
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -326,9 +327,15 @@ private void bootstrapFromFilesystem(HoodieEngineContext engineContext, HoodieTa
}, listingParallelism);
pathsToList.clear();
+
Review comment:
removed
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] prashantwason commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
prashantwason commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-781686174
> just curious. actual bootstrap should have some config on these lines right? while bootstrapping data to hudi, filter directories based on some predicate. can't we reuse the same?
> CC @bvaradar @n3nash
I did not find any such config in HoodieBootstrapConfig. I feel keeping it separate may be better:
1. so as to not be confusing the two features which work independently (also the configs in bootstrap have a prefix hoodie.bootstrap)
2. building the HoodieWriteConfig will be awkward with having to provide a HoodieBootstrapConfig when enabling Metadata Table
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] n3nash merged pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
n3nash merged pull request #2565:
URL: https://github.com/apache/hudi/pull/2565
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] nsivabalan commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-780556494
just curious. actual bootstrap should have some config on these lines right? while bootstrapping data to hudi, filter directories based on some predicate. can't we reuse the same?
CC @bvaradar @n3nash
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io edited a comment on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-777120159
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=h1) Report
> Merging [#2565](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=desc) (6cb664d) into [master](https://codecov.io/gh/apache/hudi/commit/26da4f546275e8ab6496537743efe73510cb723d?el=desc) (26da4f5) will **decrease** coverage by `1.48%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2565/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2565 +/- ##
============================================
- Coverage 50.92% 49.44% -1.49%
+ Complexity 3168 3002 -166
============================================
Files 433 399 -34
Lines 19812 18382 -1430
Branches 2033 1848 -185
============================================
- Hits 10090 9089 -1001
+ Misses 8902 8569 -333
+ Partials 820 724 -96
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `36.90% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `51.36% <0.00%> (-0.03%)` | `0.00 <0.00> (ø)` | |
| hudiflink | `43.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `48.61% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `66.49% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.46% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...pache/hudi/common/config/HoodieMetadataConfig.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2NvbmZpZy9Ib29kaWVNZXRhZGF0YUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `70.50% <0.00%> (-0.36%)` | `50.00% <0.00%> (-1.00%)` | |
| [...in/scala/org/apache/hudi/IncrementalRelation.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0luY3JlbWVudGFsUmVsYXRpb24uc2NhbGE=) | | | |
| [...main/scala/org/apache/hudi/HoodieWriterUtils.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVdyaXRlclV0aWxzLnNjYWxh) | | | |
| [...nal/HoodieDataSourceInternalBatchWriteBuilder.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmszL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3NwYXJrMy9pbnRlcm5hbC9Ib29kaWVEYXRhU291cmNlSW50ZXJuYWxCYXRjaFdyaXRlQnVpbGRlci5qYXZh) | | | |
| [...e/hudi/exception/HoodieDeltaStreamerException.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZURlbHRhU3RyZWFtZXJFeGNlcHRpb24uamF2YQ==) | | | |
| [...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh) | | | |
| [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | | | |
| [.../main/scala/org/apache/hudi/HoodieSparkUtils.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrVXRpbHMuc2NhbGE=) | | | |
| [...i/internal/HoodieBulkInsertDataInternalWriter.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3BhcmsyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2ludGVybmFsL0hvb2RpZUJ1bGtJbnNlcnREYXRhSW50ZXJuYWxXcml0ZXIuamF2YQ==) | | | |
| ... and [26 more](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-777120159
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=h1) Report
> Merging [#2565](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=desc) (6cb664d) into [master](https://codecov.io/gh/apache/hudi/commit/26da4f546275e8ab6496537743efe73510cb723d?el=desc) (26da4f5) will **decrease** coverage by `1.81%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2565/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2565 +/- ##
============================================
- Coverage 50.92% 49.10% -1.82%
+ Complexity 3168 2819 -349
============================================
Files 433 378 -55
Lines 19812 16948 -2864
Branches 2033 1714 -319
============================================
- Hits 10090 8323 -1767
+ Misses 8902 7967 -935
+ Partials 820 658 -162
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `36.90% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `51.36% <0.00%> (-0.03%)` | `0.00 <0.00> (ø)` | |
| hudiflink | `43.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `?` | `?` | |
| hudisync | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `69.46% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...pache/hudi/common/config/HoodieMetadataConfig.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2NvbmZpZy9Ib29kaWVNZXRhZGF0YUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `70.50% <0.00%> (-0.36%)` | `50.00% <0.00%> (-1.00%)` | |
| [...main/java/org/apache/hudi/hive/HiveSyncConfig.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvSGl2ZVN5bmNDb25maWcuamF2YQ==) | | | |
| [...udi/timeline/service/handlers/BaseFileHandler.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvQmFzZUZpbGVIYW5kbGVyLmphdmE=) | | | |
| [...in/java/org/apache/hudi/hive/SchemaDifference.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvU2NoZW1hRGlmZmVyZW5jZS5qYXZh) | | | |
| [...di/timeline/service/handlers/FileSliceHandler.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvRmlsZVNsaWNlSGFuZGxlci5qYXZh) | | | |
| [...va/org/apache/hudi/hive/util/ColumnNameXLator.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktaGl2ZS1zeW5jL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2hpdmUvdXRpbC9Db2x1bW5OYW1lWExhdG9yLmphdmE=) | | | |
| [.../src/main/java/org/apache/hudi/dla/util/Utils.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zeW5jL2h1ZGktZGxhLXN5bmMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZGxhL3V0aWwvVXRpbHMuamF2YQ==) | | | |
| [...in/scala/org/apache/hudi/HoodieEmptyRelation.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZUVtcHR5UmVsYXRpb24uc2NhbGE=) | | | |
| [...src/main/scala/org/apache/hudi/DefaultSource.scala](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0RlZmF1bHRTb3VyY2Uuc2NhbGE=) | | | |
| ... and [47 more](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] nbalajee commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
nbalajee commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-779996283
> @nbalajee Can you review this ?
Reviewed the changes. LGTM.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] nsivabalan commented on a change in pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#discussion_r577606882
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -326,9 +327,15 @@ private void bootstrapFromFilesystem(HoodieEngineContext engineContext, HoodieTa
}, listingParallelism);
pathsToList.clear();
+
Review comment:
can we remove the extra line breaks.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] n3nash commented on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-779609442
@nbalajee Can you review this ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io edited a comment on pull request #2565: [HUDI-1611] Added a configuration to allow specific directories to be filtered out during Metadata Table bootstrap.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2565:
URL: https://github.com/apache/hudi/pull/2565#issuecomment-777120159
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=h1) Report
> Merging [#2565](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=desc) (6cb664d) into [master](https://codecov.io/gh/apache/hudi/commit/26da4f546275e8ab6496537743efe73510cb723d?el=desc) (26da4f5) will **decrease** coverage by `0.01%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2565/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2565 +/- ##
============================================
- Coverage 50.92% 50.91% -0.02%
+ Complexity 3168 3167 -1
============================================
Files 433 433
Lines 19812 19816 +4
Branches 2033 2034 +1
============================================
- Hits 10090 10089 -1
- Misses 8902 8906 +4
- Partials 820 821 +1
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `36.90% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `51.36% <0.00%> (-0.03%)` | `0.00 <0.00> (ø)` | |
| hudiflink | `43.21% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.16% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisparkdatasource | `69.73% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudisync | `48.61% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `66.49% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.46% <ø> (-0.06%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2565?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...pache/hudi/common/config/HoodieMetadataConfig.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2NvbmZpZy9Ib29kaWVNZXRhZGF0YUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | `0.00 <0.00> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/2565/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `70.50% <0.00%> (-0.36%)` | `50.00% <0.00%> (-1.00%)` | |
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org