Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/27 00:15:30 UTC
[GitHub] [hudi] nbalajee opened a new pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
nbalajee opened a new pull request #2045:
URL: https://github.com/apache/hudi/pull/2045
…d timestamps
## What is the purpose of the pull request
Modify GenericRecordFullPayloadGenerator to generate valid timestamps
## Brief change log
- Hudi-test-suite uses the GenericRecordFullPayloadGenerator for generating test data at scale. With this change,
the number of partitions to use during test data generation is configurable. Generated records are distributed
equally among the requested number of partitions.
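To make the "distributed equally" behaviour concrete, here is a minimal sketch. It is not the code from this pull request: the class `PartitionSpreadSketch`, its method names, the round-robin assignment, and the one-day-per-partition timestamp offset are all assumptions chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of Hudi. */
final class PartitionSpreadSketch {

  /** Round-robin: record i (assumed non-negative) lands in partition i mod numPartitions. */
  static int partitionIndexFor(long recordIndex, int numPartitions) {
    if (numPartitions <= 0) {
      throw new IllegalArgumentException("numPartitions must be positive");
    }
    return (int) (recordIndex % numPartitions);
  }

  /** Assumed mapping: partition k gets a timestamp k whole days after baseEpochSeconds. */
  static long timestampFor(long baseEpochSeconds, int partitionIndex) {
    return baseEpochSeconds + TimeUnit.DAYS.toSeconds(partitionIndex);
  }

  /** Counts how many of numRecords land in each partition. */
  static long[] countsPerPartition(long numRecords, int numPartitions) {
    long[] counts = new long[numPartitions];
    for (long i = 0; i < numRecords; i++) {
      counts[partitionIndexFor(i, numPartitions)]++;
    }
    return counts;
  }

  public static void main(String[] args) {
    long[] counts = countsPerPartition(10, 3);
    for (int p = 0; p < counts.length; p++) {
      System.out.println("partition " + p + ": " + counts[p] + " records, ts="
          + timestampFor(1_600_000_000L, p));
    }
  }
}
```

With round-robin assignment the per-partition counts differ by at most one record, which is one reasonable reading of "equally" when the record count is not a multiple of the partition count.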
## Verify this pull request
This change added tests and can be verified as follows:
- Added testUpdatePayloadGeneratorWithTimestamp to verify the scenario.
## Committer checklist
- [x] Has a corresponding JIRA in PR title & commit
- [x] Commit message is descriptive of the change
- [x] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-751953148
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=h1) Report
> Merging [#2045](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=desc) (27726a9) into [master](https://codecov.io/gh/apache/hudi/commit/da51aa64fcaf8cd3099ef9c085c207283999306f?el=desc) (da51aa6) will **decrease** coverage by `42.17%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2045/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree)
```diff
@@              Coverage Diff              @@
##             master    #2045       +/-   ##
=============================================
- Coverage     52.21%   10.04%   -42.18%
+ Complexity     2662       48     -2614
=============================================
  Files           335       52      -283
  Lines         14983     1852    -13131
  Branches       1506      223     -1283
=============================================
- Hits           7824      186     -7638
+ Misses         6535     1653     -4882
+ Partials        624       13      -611
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `10.04% <ø> (-59.62%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| ... and [312 more](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
[GitHub] [hudi] codecov-io edited a comment on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-751953148
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=h1) Report
> Merging [#2045](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=desc) (0eea618) into [master](https://codecov.io/gh/apache/hudi/commit/da51aa64fcaf8cd3099ef9c085c207283999306f?el=desc) (da51aa6) will **decrease** coverage by `42.17%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2045/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree)
```diff
@@              Coverage Diff              @@
##             master    #2045       +/-   ##
=============================================
- Coverage     52.21%   10.04%   -42.18%
+ Complexity     2662       48     -2614
=============================================
  Files           335       52      -283
  Lines         14983     1852    -13131
  Branches       1506      223     -1283
=============================================
- Hits           7824      186     -7638
+ Misses         6535     1653     -4882
+ Partials        624       13      -611
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `?` | `?` | |
| hudiclient | `?` | `?` | |
| hudicommon | `?` | `?` | |
| hudihadoopmr | `?` | `?` | |
| huditimelineservice | `?` | `?` | |
| hudiutilities | `10.04% <ø> (-59.62%)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
| [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
| [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
| [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| ... and [312 more](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree-more) | |
----------------------------------------------------------------
[GitHub] [hudi] nsivabalan commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-751319030
@nbalajee: can you address the comments, rebase, and let me know?
----------------------------------------------------------------
[GitHub] [hudi] nsivabalan commented on a change in pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#discussion_r513659886
##########
File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/testsuite/generator/TestGenericRecordPayloadGenerator.java
##########
@@ -25,11 +25,13 @@
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
+import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.hudi.avro.HoodieAvroUtils;
import org.apache.hudi.utilities.testutils.UtilitiesTestBase;
+import org.junit.Assert;
Review comment:
+1
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/FlexibleSchemaRecordGenerationIterator.java
##########
@@ -60,11 +60,16 @@ public boolean hasNext() {
   public GenericRecord next() {
     this.counter--;
     if (lastRecord == null) {
-      GenericRecord record = this.generator.getNewPayload();
+      GenericRecord record = this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0
+          ? this.generator.getNewPayloadWithTimestamp(this.partitionPathFieldNames.get(0))
+          : this.generator.getNewPayload();
       lastRecord = record;
       return record;
     } else {
-      return this.generator.randomize(lastRecord, this.partitionPathFieldNames);
+      return this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0
+          ? this.generator.getUpdatePayloadWithTimestamp(lastRecord,
+              this.partitionPathFieldNames, this.partitionPathFieldNames.get(0))
+          : this.generator.getUpdatePayload(lastRecord, this.partitionPathFieldNames);
Review comment:
+1
----------------------------------------------------------------
[GitHub] [hudi] codecov-io edited a comment on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-751953148
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=h1) Report
> Merging [#2045](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=desc) (27726a9) into [master](https://codecov.io/gh/apache/hudi/commit/da51aa64fcaf8cd3099ef9c085c207283999306f?el=desc) (da51aa6) will **increase** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2045/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree)
```diff
@@             Coverage Diff              @@
##             master    #2045      +/-   ##
============================================
+ Coverage     52.21%   52.23%   +0.01%
  Complexity     2662     2662
============================================
  Files           335      335
  Lines         14983    14983
  Branches       1506     1506
============================================
+ Hits           7824     7826       +2
+ Misses         6535     6534       -1
+ Partials        624      623       -1
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `38.83% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `54.76% <ø> (+0.02%)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.52% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `65.30% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.65% <ø> (ø)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.81% <0.00%> (+1.69%)` | `23.00% <0.00%> (ø%)` | |
----------------------------------------------------------------
[GitHub] [hudi] n3nash commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-718033902
@nsivabalan can you please do a pass on this?
----------------------------------------------------------------
[GitHub] [hudi] xushiyan commented on a change in pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
xushiyan commented on a change in pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#discussion_r487451803
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/GenericRecordFullPayloadGenerator.java
##########
@@ -45,14 +45,16 @@
*/
public class GenericRecordFullPayloadGenerator implements Serializable {
+ private static Logger log = LoggerFactory.getLogger(GenericRecordFullPayloadGenerator.class);
Review comment:
```suggestion
private static final Logger LOG = LoggerFactory.getLogger(GenericRecordFullPayloadGenerator.class);
```
##########
File path: hudi-integ-test/src/test/java/org/apache/hudi/integ/testsuite/generator/TestGenericRecordPayloadGenerator.java
##########
@@ -25,11 +25,13 @@
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
+import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.hudi.avro.HoodieAvroUtils;
import org.apache.hudi.utilities.testutils.UtilitiesTestBase;
+import org.junit.Assert;
Review comment:
Can we change this to the JUnit 5 APIs? Also, to reduce verbosity, could we static-import the assertion functions?
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/DeltaGenerator.java
##########
@@ -93,14 +93,15 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, JavaSparkContext jsc, Spark
}
public JavaRDD<GenericRecord> generateInserts(Config operation) {
- long recordsPerPartition = operation.getNumRecordsInsert();
int numPartitions = operation.getNumInsertPartitions();
+ long recordsPerPartition = operation.getNumRecordsInsert();
int minPayloadSize = operation.getRecordSize();
JavaRDD<GenericRecord> inputBatch = jsc.parallelize(Collections.EMPTY_LIST)
.repartition(operation.getNumInsertPartitions()).mapPartitions(p -> {
return new LazyRecordGeneratorIterator(new FlexibleSchemaRecordGenerationIterator(recordsPerPartition,
minPayloadSize, schemaStr, partitionPathFieldNames, numPartitions));
- });
+
+ });
Review comment:
Could you revert these diffs, please? They seem to be unnecessary changes.
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/FlexibleSchemaRecordGenerationIterator.java
##########
@@ -60,11 +60,16 @@ public boolean hasNext() {
   public GenericRecord next() {
     this.counter--;
     if (lastRecord == null) {
-      GenericRecord record = this.generator.getNewPayload();
+      GenericRecord record = this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0
+          ? this.generator.getNewPayloadWithTimestamp(this.partitionPathFieldNames.get(0))
+          : this.generator.getNewPayload();
       lastRecord = record;
       return record;
     } else {
-      return this.generator.randomize(lastRecord, this.partitionPathFieldNames);
+      return this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0
+          ? this.generator.getUpdatePayloadWithTimestamp(lastRecord,
+              this.partitionPathFieldNames, this.partitionPathFieldNames.get(0))
+          : this.generator.getUpdatePayload(lastRecord, this.partitionPathFieldNames);
Review comment:
Looks like `this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0` deserves to be a local boolean variable with a good name to improve readability.
Also, can we avoid the unnecessary `this.` references?
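One possible shape for the refactor suggested in this comment is sketched below. It is an illustration under assumptions, not the code that was merged: `NextRecordSketch`, the variable name `hasPartitionPathField`, and the four string-returning stub methods are hypothetical stand-ins for the iterator and generator calls in the diff.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the iterator in the diff; the generator is stubbed with strings. */
final class NextRecordSketch {
  private final List<String> partitionPathFieldNames;
  private String lastRecord;

  NextRecordSketch(List<String> partitionPathFieldNames) {
    this.partitionPathFieldNames = partitionPathFieldNames;
  }

  String next() {
    // Evaluate the repeated condition once and give it a descriptive name.
    boolean hasPartitionPathField =
        partitionPathFieldNames != null && !partitionPathFieldNames.isEmpty();
    if (lastRecord == null) {
      lastRecord = hasPartitionPathField
          ? newPayloadWithTimestamp(partitionPathFieldNames.get(0))
          : newPayload();
      return lastRecord;
    }
    return hasPartitionPathField
        ? updatePayloadWithTimestamp(lastRecord, partitionPathFieldNames.get(0))
        : updatePayload(lastRecord);
  }

  // Hypothetical stubs standing in for the generator methods named in the diff.
  private static String newPayload() {
    return "new";
  }

  private static String newPayloadWithTimestamp(String field) {
    return "new@" + field;
  }

  private static String updatePayload(String last) {
    return "update(" + last + ")";
  }

  private static String updatePayloadWithTimestamp(String last, String field) {
    return "update(" + last + ")@" + field;
  }

  public static void main(String[] args) {
    NextRecordSketch it = new NextRecordSketch(Arrays.asList("ts"));
    System.out.println(it.next());
    System.out.println(it.next());
  }
}
```

Naming the condition also drops most of the `this.` qualifiers, since the field is read in only one place per branch.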
##########
File path: hudi-integ-test/src/main/java/org/apache/hudi/integ/testsuite/generator/FlexibleSchemaRecordGenerationIterator.java
##########
@@ -60,11 +60,16 @@ public boolean hasNext() {
   public GenericRecord next() {
     this.counter--;
     if (lastRecord == null) {
-      GenericRecord record = this.generator.getNewPayload();
+      GenericRecord record = this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0
+          ? this.generator.getNewPayloadWithTimestamp(this.partitionPathFieldNames.get(0))
+          : this.generator.getNewPayload();
       lastRecord = record;
       return record;
     } else {
-      return this.generator.randomize(lastRecord, this.partitionPathFieldNames);
+      return this.partitionPathFieldNames != null && this.partitionPathFieldNames.size() > 0
+          ? this.generator.getUpdatePayloadWithTimestamp(lastRecord,
+              this.partitionPathFieldNames, this.partitionPathFieldNames.get(0))
+          : this.generator.getUpdatePayload(lastRecord, this.partitionPathFieldNames);
Review comment:
Is it possible to make sure `partitionPathFieldNames` is not null so that we don't have to do a null check here?
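A common way to guarantee this is to normalize the argument once, in the constructor. The sketch below assumes that approach. `PartitionFieldsHolder` and `hasPartitionPathField` are hypothetical names, and whether the real iterator can take a defensive copy like this is not confirmed by the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical holder showing null-normalization at construction time. */
final class PartitionFieldsHolder {
  private final List<String> partitionPathFieldNames;

  PartitionFieldsHolder(List<String> partitionPathFieldNames) {
    // Normalize once: null becomes an empty immutable list, anything else is copied defensively.
    this.partitionPathFieldNames = partitionPathFieldNames == null
        ? Collections.<String>emptyList()
        : Collections.unmodifiableList(new ArrayList<>(partitionPathFieldNames));
  }

  /** No null check needed here because the constructor never stores null. */
  boolean hasPartitionPathField() {
    return !partitionPathFieldNames.isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(new PartitionFieldsHolder(null).hasPartitionPathField());
    System.out.println(new PartitionFieldsHolder(Arrays.asList("ts")).hasPartitionPathField());
  }
}
```

After this, callers only need to ask whether the list is empty, so the `!= null` half of the condition in the diff can be dropped.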
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-io edited a comment on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-751953148
# [Codecov](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=h1) Report
> Merging [#2045](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=desc) (0eea618) into [master](https://codecov.io/gh/apache/hudi/commit/da51aa64fcaf8cd3099ef9c085c207283999306f?el=desc) (da51aa6) will **increase** coverage by `0.01%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2045/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree)
```diff
@@             Coverage Diff              @@
##             master    #2045      +/-   ##
============================================
+ Coverage     52.21%   52.23%   +0.01%
  Complexity     2662     2662
============================================
  Files           335      335
  Lines         14983    14983
  Branches       1506     1506
============================================
+ Hits           7824     7826       +2
+ Misses         6535     6534       -1
+ Partials        624      623       -1
```
| Flag | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| hudicli | `38.83% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudicommon | `54.76% <ø> (+0.02%)` | `0.00 <ø> (ø)` | |
| hudihadoopmr | `33.52% <ø> (ø)` | `0.00 <ø> (ø)` | |
| huditimelineservice | `65.30% <ø> (ø)` | `0.00 <ø> (ø)` | |
| hudiutilities | `69.65% <ø> (ø)` | `0.00 <ø> (ø)` | |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2045?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...e/hudi/common/table/log/HoodieLogFormatWriter.java](https://codecov.io/gh/apache/hudi/pull/2045/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL2xvZy9Ib29kaWVMb2dGb3JtYXRXcml0ZXIuamF2YQ==) | `78.81% <0.00%> (+1.69%)` | `23.00% <0.00%> (ø%)` | |
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-716076506
@n3nash @nsivabalan is one of you able to take this home?
----------------------------------------------------------------
[GitHub] [hudi] n3nash commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-689318428
@nbalajee can you please rebase?
----------------------------------------------------------------
[GitHub] [hudi] nsivabalan merged pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #2045:
URL: https://github.com/apache/hudi/pull/2045
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-683889521
@xushiyan can you please review and see this home?
----------------------------------------------------------------
[GitHub] [hudi] xushiyan commented on pull request #2045: [HUDI-1147] Modify GenericRecordFullPayloadGenerator to generate vali…
Posted by GitBox <gi...@apache.org>.
xushiyan commented on pull request #2045:
URL: https://github.com/apache/hudi/pull/2045#issuecomment-685828225
OK, checking... Meanwhile, @nbalajee, could you please resolve the conflicts? Thanks.
----------------------------------------------------------------