Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/12/19 08:03:47 UTC

[GitHub] [incubator-hudi] yanghua opened a new pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

yanghua opened a new pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115
 
 
   
   ## What is the purpose of the pull request
   
   *Introduce DistributedTestDataSource to generate test data*
   
   ## Brief change log
   
     - *Introduce DistributedTestDataSource to generate test data*
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   
   This change added tests and can be verified as follows:
   
   
     - *TestHoodieTestSuiteJob#testDistributeSourceInsert*
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-570727891
 
 
   @yanghua I was on a holiday break, apologies for the late response. Have you tried running the test suite? If the current data generation methodology meets our needs, we might not need `DistributedTestDataSource`. If not, we can tweak the current implementation or bring in the distributed source. WDYT?


[GitHub] [hudi] n3nash commented on pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on pull request #1115:
URL: https://github.com/apache/hudi/pull/1115#issuecomment-668905276


   @yanghua Is it okay to close this now?





[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-568866184
 
 
   @n3nash I think the whole workload generation is a bit confusing right now. I am rethinking how to refactor it. Do you think the generator you implemented can replace `DistributedTestDataSource`?


[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-568455662
 
 
   Hi @vinothchandar, WDYT about `DistributedTestDataSource`? It seems this class is not used anywhere; it is only tested in `TestHoodieDeltaStreamer`. Can we move it into the `hudi-test-suite` module?


[GitHub] [incubator-hudi] n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-571389232
 
 
   @yanghua Okay, it's good to hear that you were able to try out the test suite. Maybe we need to prepare some more elaborate test suite DAGs that cover all use cases and code paths/APIs?
   
   I'm open to any refactoring ideas you might have for the data generation; let me know when those thoughts are more concrete and shareable.
   
   Integrating Azure Pipelines with the test suite would be a good way to close the loop on a first version of the test suite. Let's continue to focus on that (and Hudi with Flink of course :) ).
   Can you do another pass over the PR and see if there are any glaring open items (apart from the data generation refactor, which I will let you do) that need work? I can then take those up this week, so hopefully in the next few days we have a PR ready to go through a final review.
   


[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#discussion_r361274338
 
 

 ##########
 File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/dag/nodes/DistributedUpsertNode.java
 ##########
 @@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testsuite.dag.nodes;
+
+import org.apache.hudi.testsuite.configuration.DeltaConfig.Config;
+import org.apache.hudi.testsuite.generator.DeltaGenerator;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.spark.api.java.JavaRDD;
+
+import java.io.Serializable;
+
+/**
+ * A insert node which used {@link org.apache.hudi.utilities.sources.DistributedTestDataSource}
 
 Review comment:
   Is this supposed to generate inserts or upserts? The class name says otherwise.
   Also, the name of the node is slightly confusing: the existing upsertNode also generates data in a distributed manner, since it also uses RDD-based logic. Maybe name the new class `UpsertNodeUsingDistributedGenerator` or something along those lines?


[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-575522467
 
 
   > @yanghua Are we close to calling this as a first version of the test suite ?
   
   I agree. However, I ran into dependency conflicts after upgrading the Spark version, and I have not had time to figure them out yet. Can you take a look?


[GitHub] [incubator-hudi] yanghua commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#discussion_r361101497
 
 

 ##########
 File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/generator/DeltaGenerator.java
 ##########
 @@ -108,6 +114,17 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, JavaSparkContext jsc, Spark
     return inputBatch;
   }
 
+  public JavaRDD<GenericRecord> generateUpsertsWithDistributedSource(Config operation) {
 
 Review comment:
   I have debugged the call chain of the relevant methods.
   
   The key call chain is:
   
   ```
   DistributedTestDataSource#fetchNext
     DistributedTestDataSource#fetchNewData
       DistributedTestDataSource#fetchNextBatch
   ```
   
   `DistributedTestDataSource#fetchNextBatch` calculates the number of insert and update records. The core logic is:
   
   ```
   int numExistingKeys = dataGenerator.getNumExistingKeys();
   
   int numUpdates = Math.min(numExistingKeys, sourceLimit / 2);
   int numInserts = sourceLimit - numUpdates;
   ```
   
   The `sourceLimit` value is supplied by the caller (here it is 10000000). The `numExistingKeys` value, however, is always `0`: it only changes after some of the `HoodieTestDataGenerator` methods that generate insert records have been called, and in our scenario those methods are never invoked. So here:
   
   ```
   numUpdates = 0;
   numInserts = sourceLimit;
   ```
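   
   To make the arithmetic concrete, here is a minimal, self-contained sketch of the same split. It is not Hudi code: the class `BatchSplitSketch` and the method `split` are hypothetical names, and plain `int` values are assumed for both inputs.
   
   ```java
   // Hypothetical stand-alone sketch; it only reproduces the split quoted above.
   class BatchSplitSketch {

       // Returns {numUpdates, numInserts} for one batch.
       static int[] split(int numExistingKeys, int sourceLimit) {
           // Updates are capped by the keys that already exist and by half the batch.
           int numUpdates = Math.min(numExistingKeys, sourceLimit / 2);
           // Whatever is left of the batch becomes inserts.
           int numInserts = sourceLimit - numUpdates;
           return new int[] {numUpdates, numInserts};
       }

       public static void main(String[] args) {
           // No keys generated yet: the whole batch is inserts.
           int[] fresh = split(0, 10_000_000);
           System.out.println(fresh[0] + " updates, " + fresh[1] + " inserts");

           // Assumed example with 4,000,000 existing keys: the updates use all of them.
           int[] warm = split(4_000_000, 10_000_000);
           System.out.println(warm[0] + " updates, " + warm[1] + " inserts");
       }
   }
   ```
   
   With no existing keys the whole batch becomes inserts, which matches the result above; once keys exist, updates are capped at half of `sourceLimit`.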
   


[GitHub] [incubator-hudi] n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-575381731
 
 
   @yanghua Are we close to calling this a first version of the test suite?


[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-570743819
 
 
   > @yanghua I was on a holiday break, apologies for the late response. Have you tried to run the test-suite ? If the current data generation methodology meets our needs, we might not require the DistributedTestDataSource. If not, we can tweek the current implementation or bring in the DistributedSource, wdyt ?
   
   Hi @n3nash, no need to apologize, happy holidays. Yes, I have run the test suite several times. It works fine.
   
   IMO, `DistributedTestDataSource` will not block the test suite. Actually, I think the test payload generation is a little confusing currently. I was thinking about how to refactor it. However, that work was interrupted by other things: integrating with Azure Pipelines and designing how to integrate Hudi with Flink.
   
   The more details about integrating with Azure can be found here:
    - https://github.com/apachehudi-ci/incubator-hudi/blob/master/azure-pipelines.yml
    - https://dev.azure.com/vinoyang/Hudi/_build?definitionId=2
   
   It has not been finished yet.
   
   cc @vinothchandar 


[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#discussion_r360987666
 
 

 ##########
 File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/generator/DeltaGenerator.java
 ##########
 @@ -108,6 +114,17 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, JavaSparkContext jsc, Spark
     return inputBatch;
   }
 
+  public JavaRDD<GenericRecord> generateUpsertsWithDistributedSource(Config operation) {
 
 Review comment:
   @yanghua Yes, we should refactor those parts.
   
   For (5), what I mean is: when we call `distributedTestDataSource.fetchNext(Option.empty(), 10000000)`, does it return a mix of updates and inserts, or just inserts?


[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#discussion_r361026575
 
 

 ##########
 File path: hudi-client/src/test/java/org/apache/hudi/common/HoodieTestDataGenerator.java
 ##########
 @@ -114,20 +114,20 @@ public static void writePartitionMetadata(FileSystem fs, String[] partitionPaths
    * Generates a new avro record of the above schema format, retaining the key if optionally provided.
    */
   public static TestRawTripPayload generateRandomValue(HoodieKey key, String commitTime) throws IOException {
-    GenericRecord rec = generateGenericRecord(key.getRecordKey(), "rider-" + commitTime, "driver-" + commitTime, 0.0);
+    GenericRecord rec = generateGenericRecord(key.getRecordKey(), "rider-" + commitTime, "driver-" + commitTime, 0);
     return new TestRawTripPayload(rec.toString(), key.getRecordKey(), key.getPartitionPath(), TRIP_EXAMPLE_SCHEMA);
   }
 
   /**
    * Generates a new avro record of the above schema format, retaining the key if optionally provided.
    */
   public static HoodieAvroPayload generateAvroPayload(HoodieKey key, String commitTime) throws IOException {
-    GenericRecord rec = generateGenericRecord(key.getRecordKey(), "rider-" + commitTime, "driver-" + commitTime, 0.0);
+    GenericRecord rec = generateGenericRecord(key.getRecordKey(), "rider-" + commitTime, "driver-" + commitTime, 0);
     return new HoodieAvroPayload(Option.of(rec));
   }
 
   public static GenericRecord generateGenericRecord(String rowKey, String riderName, String driverName,
-      double timestamp) {
+      long timestamp) {
 
 Review comment:
   Okay, we can fix that, shouldn't be difficult


[GitHub] [incubator-hudi] vinothchandar commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-568557318
 
 
   @yanghua IIUC, @bvaradar actually uses it to run a test job that generates random data on the cluster. So maybe leave it in `hoodie-utilities` so that the bundle also has it. In general, it's a nice way to start running deltastreamer with some fake data.


[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-568662558
 
 
   > @yanghua IIUC, @bvaradar uses it actually to run a test job that generates random data on the cluster.. 
   
   I did not find any place that uses `DistributedTestDataSource` in the master branch.
   
   > So, may be leave it in `hoodie-utilities` so that the bundle also has it.. Its in general, I nice way to start running deltastreamer with some fake data.
   
   We can leave it in the `hoodie-utilities` module. However, it lives in the test package. As @n3nash mentioned, we had better avoid using test code from another module.
   


[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#discussion_r361274426
 
 

 ##########
 File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/generator/DeltaGenerator.java
 ##########
 @@ -108,6 +114,17 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, JavaSparkContext jsc, Spark
     return inputBatch;
   }
 
+  public JavaRDD<GenericRecord> generateUpsertsWithDistributedSource(Config operation) {
 
 Review comment:
   Okay, so I'm still unclear: can we pass the exact number of inserts/upserts to create using the above logic? If not, this might not be that useful.
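   
   One way to reason about this question is to check which exact (updates, inserts) pairs the split quoted earlier in this thread (`numUpdates = Math.min(numExistingKeys, sourceLimit / 2)`) can produce at all. The sketch below is only an illustration under that assumption: `ExactSplitSketch` and `isReachable` are hypothetical names, it uses plain `int` arithmetic, and it does not call any Hudi class.
   
   ```java
   // Hypothetical stand-alone sketch; assumes the split formula quoted earlier in the thread.
   class ExactSplitSketch {

       // True if a batch of exactly wantUpdates + wantInserts records would be
       // split into exactly wantUpdates updates (and therefore wantInserts inserts).
       static boolean isReachable(int numExistingKeys, int wantUpdates, int wantInserts) {
           // The caller only controls the total batch size.
           int sourceLimit = wantUpdates + wantInserts;
           // The update count is derived, not requested.
           int numUpdates = Math.min(numExistingKeys, sourceLimit / 2);
           return numUpdates == wantUpdates;
       }

       public static void main(String[] args) {
           System.out.println(isReachable(0, 0, 1000));     // inserts only, no keys yet
           System.out.println(isReachable(0, 100, 900));    // updates requested, no keys yet
           System.out.println(isReachable(500, 100, 900));  // more keys than requested updates
           System.out.println(isReachable(100, 100, 900));  // keys equal requested updates
       }
   }
   ```
   
   Under this assumption the caller chooses only the batch size; the update count follows from the number of existing keys, so an arbitrary exact mix cannot be requested directly.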


[GitHub] [hudi] yanghua commented on pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #1115:
URL: https://github.com/apache/hudi/pull/1115#issuecomment-668957475


   > @yanghua Is it okay to close this now ?
   
   Yes, closing...





[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-571537763
 
 
   @n3nash OK, will try to review the whole test suite again to see if I can find some issues.


[GitHub] [hudi] yanghua closed pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua closed pull request #1115:
URL: https://github.com/apache/hudi/pull/1115


   





[GitHub] [incubator-hudi] n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on a change in pull request #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#discussion_r361274506
 
 

 ##########
 File path: hudi-test-suite/src/main/java/org/apache/hudi/testsuite/generator/DeltaGenerator.java
 ##########
 @@ -108,6 +114,17 @@ public DeltaGenerator(DeltaConfig deltaOutputConfig, JavaSparkContext jsc, Spark
     return inputBatch;
   }
 
+  public JavaRDD<GenericRecord> generateUpsertsWithDistributedSource(Config operation) {
+    TypedProperties props = new TypedProperties();
+    props.setProperty(TestSourceConfig.MAX_UNIQUE_RECORDS_PROP, String.valueOf(operation.getNumRecordsInsert()));
+    props.setProperty(TestSourceConfig.NUM_SOURCE_PARTITIONS_PROP, String.valueOf(operation.getNumInsertPartitions()));
+    props.setProperty(TestSourceConfig.USE_ROCKSDB_FOR_TEST_DATAGEN_KEYS, "true");
+    DistributedTestDataSource distributedTestDataSource = new DistributedTestDataSource(
 
 Review comment:
   I still see "test" names in the core logic. Rename it and move it to a utils folder so that it can be used from the src code.


[GitHub] [incubator-hudi] n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-575743720
 
 
   @yanghua Are there other issues you are working on? If yes, can you please prioritize this over the others? (I don't have time either, but I can help if you are unable to find time.)


[GitHub] [incubator-hudi] yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-568688009
 
 
   The Travis build is green now.


[GitHub] [incubator-hudi] n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #1115: [HUDI-392] Introduce DIstributedTestDataSource to generate test data
URL: https://github.com/apache/incubator-hudi/pull/1115#issuecomment-579385938
 
 
   @yanghua Please take a look at this after you are back from the New Year holidays; we should probably merge or close this very soon.
