You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/11/25 15:58:50 UTC

[GitHub] [hudi] garyli1019 opened a new pull request #2281: WIP[HUDI-1418] set up flink client unit test infra

garyli1019 opened a new pull request #2281:
URL: https://github.com/apache/hudi/pull/2281


   ## What is the purpose of the pull request
   
   Add unit test infra to flink client
   
   ## Brief change log
   
     - Added HoodieFlinkTestHarness
     - Refactor HoodieWriteableTestTable to remove Spark components
   
   ## Verify this pull request
   
   TODO
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] liujinhui1994 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
liujinhui1994 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r544948270



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+
+  public static final String TABLE_NAME_PROP = HoodieWriteConfig.TABLE_NAME;
+  public static final String PRECOMBINE_FIELD_PROP = "hoodie.datasource.write.precombine.field";
+  public static final String RECORDKEY_FIELD_PROP = KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY;
+  public static final String PARTITIONPATH_FIELD_PROP = KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY;
+
+  public static final String WRITE_PAYLOAD_CLASS = "hoodie.datasource.write.payload.class";
+  public static final String DEFAULT_WRITE_PAYLOAD_CLASS = OverwriteWithLatestAvroPayload.class.getName();
+  public static final String KEYGENERATOR_CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
+  public static final String DEFAULT_KEYGENERATOR_CLASS = SimpleAvroKeyGenerator.class.getName();

Review comment:
       @garyli1019 @wangxianghu  
   
   I'm almost done, I will post it as soon as possible




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua merged pull request #2281: [HUDI-1418] Set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua merged pull request #2281:
URL: https://github.com/apache/hudi/pull/2281


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r547280900



##########
File path: hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/testutils/HoodieFlinkClientTestHarness.java
##########
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testutils;
+
+import org.apache.hudi.client.HoodieFlinkWriteClient;
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.view.HoodieTableFileSystemView;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.common.testutils.HoodieTestUtils;
+import org.apache.hudi.common.testutils.minicluster.HdfsTestService;
+
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.TestInfo;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+public class HoodieFlinkClientTestHarness extends HoodieCommonTestHarness implements Serializable {
+
+  protected static final Logger LOG = LogManager.getLogger(HoodieFlinkClientTestHarness.class);
+  private String testMethodName;
+  protected transient Configuration hadoopConf = null;
+  protected transient FileSystem fs;
+  protected transient HoodieFlinkWriteClient writeClient;

Review comment:
       sure, will do




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r547281475



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+
+  public static final String TABLE_NAME_PROP = HoodieWriteConfig.TABLE_NAME;
+  public static final String PRECOMBINE_FIELD_PROP = "hoodie.datasource.write.precombine.field";
+  public static final String RECORDKEY_FIELD_PROP = KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY;
+  public static final String PARTITIONPATH_FIELD_PROP = KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY;
+
+  public static final String WRITE_PAYLOAD_CLASS = "hoodie.datasource.write.payload.class";
+  public static final String DEFAULT_WRITE_PAYLOAD_CLASS = OverwriteWithLatestAvroPayload.class.getName();
+  public static final String KEYGENERATOR_CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
+  public static final String DEFAULT_KEYGENERATOR_CLASS = SimpleAvroKeyGenerator.class.getName();

Review comment:
       hi @liujinhui1994 any update on refactoring `DataSourceOptions` ticket? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] wangxianghu commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-742901897


   > @wangxianghu Can you help to review this PR when you have time? thanks.
   
   ack, will review soon


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r550190548



##########
File path: hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/testutils/HoodieFlinkClientTestHarness.java
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testutils;
+
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.common.testutils.HoodieTestUtils;
+
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.TestInfo;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+public class HoodieFlinkClientTestHarness extends HoodieCommonTestHarness implements Serializable {
+
+  protected static final Logger LOG = LogManager.getLogger(HoodieFlinkClientTestHarness.class);
+  private String testMethodName;
+  protected transient Configuration hadoopConf = null;
+  protected transient FileSystem fs;
+  protected transient MiniClusterWithClientResource flinkCluster = null;
+
+  @BeforeEach
+  public void setTestMethodName(TestInfo testInfo) {
+    if (testInfo.getTestMethod().isPresent()) {
+      testMethodName = testInfo.getTestMethod().get().getName();
+    } else {
+      testMethodName = "Unknown";
+    }
+  }
+
+  protected void initFlinkMiniCluster() {
+    flinkCluster = new MiniClusterWithClientResource(
+        new MiniClusterResourceConfiguration.Builder()
+            .setNumberSlotsPerTaskManager(2)
+            .setNumberTaskManagers(1)
+            .build());
+  }
+
+  protected void initFileSystem() {
+    hadoopConf = new Configuration();
+    initFileSystemWithConfiguration(hadoopConf);
+  }
+
+  private void initFileSystemWithConfiguration(Configuration configuration) {
+    if (basePath == null) {
+      throw new IllegalStateException("The base path has not been initialized.");
+    }
+    fs = FSUtils.getFs(basePath, configuration);
+    if (fs instanceof LocalFileSystem) {
+      LocalFileSystem lfs = (LocalFileSystem) fs;
+      // With LocalFileSystem, with checksum disabled, fs.open() returns an inputStream which is FSInputStream
+      // This causes ClassCastExceptions in LogRecordScanner (and potentially other places) calling fs.open
+      // So, for the tests, we enforce checksum verification to circumvent the problem
+      lfs.setVerifyChecksum(true);
+    }
+  }
+
+  /**
+   * Initializes an instance of {@link HoodieTableMetaClient} with a special table type specified by
+   * {@code getTableType()}.
+   *
+   * @throws IOException
+   */
+  protected void initMetaClient() throws IOException {
+    initMetaClient(getTableType());
+  }
+
+  protected void initMetaClient(HoodieTableType tableType) throws IOException {
+    if (basePath == null) {
+      throw new IllegalStateException("The base path has not been initialized.");
+    }
+    metaClient = HoodieTestUtils.init(hadoopConf, basePath, tableType);
+  }
+
+
+  /**
+   * Cleanups file system.
+   *
+   * @throws IOException
+   */
+  protected void cleanupFileSystem() throws IOException {
+    if (fs != null) {
+      LOG.warn("Closing file-system instance used in previous test-run");
+      fs.close();
+      fs = null;
+    }
+  }
+
+  protected void cleanupFlinkMiniCluster() {
+    if (flinkCluster != null) {
+      flinkCluster.after();
+      flinkCluster = null;
+    }
+  }
+
+  public static class SimpleSink implements SinkFunction<HoodieRecord> {

Review comment:
       yes, agree




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r550216289



##########
File path: hudi-client/hudi-client-common/src/test/java/org/apache/hudi/testutils/HoodieWriteableTestTable.java
##########
@@ -104,29 +80,7 @@ public HoodieWriteableTestTable forCommit(String instantTime) {
     return (HoodieWriteableTestTable) super.forCommit(instantTime);
   }
 
-  public String getFileIdWithInserts(String partition) throws Exception {
-    return getFileIdWithInserts(partition, new HoodieRecord[0]);
-  }
-
-  public String getFileIdWithInserts(String partition, HoodieRecord... records) throws Exception {
-    return getFileIdWithInserts(partition, Arrays.asList(records));
-  }
-
-  public String getFileIdWithInserts(String partition, List<HoodieRecord> records) throws Exception {
-    String fileId = UUID.randomUUID().toString();
-    withInserts(partition, fileId, records);
-    return fileId;
-  }
-
-  public HoodieWriteableTestTable withInserts(String partition, String fileId) throws Exception {
-    return withInserts(partition, fileId, new HoodieRecord[0]);
-  }
-
-  public HoodieWriteableTestTable withInserts(String partition, String fileId, HoodieRecord... records) throws Exception {
-    return withInserts(partition, fileId, Arrays.asList(records));
-  }
-
-  public HoodieWriteableTestTable withInserts(String partition, String fileId, List<HoodieRecord> records) throws Exception {
+  public HoodieWriteableTestTable withInserts(String partition, String fileId, List<HoodieRecord> records, TaskContextSupplier contextSupplier) throws Exception {

Review comment:
       This line is too long, it would be better if we could break it. But it does not matter, let's refactor it next time.

##########
File path: hudi-client/hudi-client-common/src/test/java/org/apache/hudi/testutils/HoodieWriteableTestTable.java
##########
@@ -104,29 +80,7 @@ public HoodieWriteableTestTable forCommit(String instantTime) {
     return (HoodieWriteableTestTable) super.forCommit(instantTime);
   }
 
-  public String getFileIdWithInserts(String partition) throws Exception {
-    return getFileIdWithInserts(partition, new HoodieRecord[0]);
-  }
-
-  public String getFileIdWithInserts(String partition, HoodieRecord... records) throws Exception {
-    return getFileIdWithInserts(partition, Arrays.asList(records));
-  }
-
-  public String getFileIdWithInserts(String partition, List<HoodieRecord> records) throws Exception {
-    String fileId = UUID.randomUUID().toString();
-    withInserts(partition, fileId, records);
-    return fileId;
-  }
-
-  public HoodieWriteableTestTable withInserts(String partition, String fileId) throws Exception {
-    return withInserts(partition, fileId, new HoodieRecord[0]);
-  }
-
-  public HoodieWriteableTestTable withInserts(String partition, String fileId, HoodieRecord... records) throws Exception {
-    return withInserts(partition, fileId, Arrays.asList(records));
-  }
-
-  public HoodieWriteableTestTable withInserts(String partition, String fileId, List<HoodieRecord> records) throws Exception {
+  public HoodieWriteableTestTable withInserts(String partition, String fileId, List<HoodieRecord> records, TaskContextSupplier contextSupplier) throws Exception {

Review comment:
       This line is too long, it would be better if we could break it. But it does not matter, let's refactor it next time.

##########
File path: hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/testutils/HoodieFlinkClientTestHarness.java
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testutils;
+
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.common.testutils.HoodieTestUtils;
+
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.TestInfo;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+public class HoodieFlinkClientTestHarness extends HoodieCommonTestHarness implements Serializable {
+
+  protected static final Logger LOG = LogManager.getLogger(HoodieFlinkClientTestHarness.class);

Review comment:
       It would be better to split the non-static and static fields, moreover, it is a logger.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r550194878



##########
File path: hudi-flink/pom.xml
##########
@@ -173,5 +173,102 @@
       <artifactId>bijection-avro_${scala.binary.version}</artifactId>
       <version>0.9.7</version>
     </dependency>
+
+    <!-- Test -->

Review comment:
       changed to `Junit Test Suite`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-742519681


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=h1) Report
   > Merging [#2281](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=desc) (3f75fa5) into [master](https://codecov.io/gh/apache/hudi/commit/38b9264dd041db955ffa4c0f10273e1b5fa21cfb?el=desc) (38b9264) will **decrease** coverage by `42.16%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2281/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2281       +/-   ##
   =============================================
   - Coverage     52.20%   10.04%   -42.17%     
   + Complexity     2659       48     -2611     
   =============================================
     Files           335       52      -283     
     Lines         14981     1852    -13129     
     Branches       1505      223     -1282     
   =============================================
   - Hits           7821      186     -7635     
   + Misses         6535     1653     -4882     
   + Partials        625       13      -612     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `10.04% <ø> (-59.62%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | ... and [312 more](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-752603991


   > @garyli1019 Similar to the flink client, the java client also needs a similar test framework. Can it be added together?
   
   Hi @shenh062326 We should definitely add tests for the java client. I am targeting to land this before the release cut, so I am not sure we can add it to this PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r550194999



##########
File path: hudi-flink/src/main/java/org/apache/hudi/source/JsonStringToHoodieRecordMapFunction.java
##########
@@ -40,32 +43,47 @@
  */
 public class JsonStringToHoodieRecordMapFunction implements MapFunction<String, HoodieRecord> {
 
-  private final HoodieFlinkStreamer.Config cfg;
+  private TypedProperties props;
   private KeyGenerator keyGenerator;
   private AvroConvertor avroConvertor;
+  private Option<String> schemaStr = Option.empty();
+  private String payloadClassName;
+  private String orderingField;
 
-  public JsonStringToHoodieRecordMapFunction(HoodieFlinkStreamer.Config cfg) {
-    this.cfg = cfg;
+  public JsonStringToHoodieRecordMapFunction(TypedProperties props) {
+    this.props = props;
+    init();
+  }
+
+  public JsonStringToHoodieRecordMapFunction(TypedProperties props, String schemaStr) {

Review comment:
       Done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] wangxianghu commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r544866146



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+
+  public static final String TABLE_NAME_PROP = HoodieWriteConfig.TABLE_NAME;
+  public static final String PRECOMBINE_FIELD_PROP = "hoodie.datasource.write.precombine.field";
+  public static final String RECORDKEY_FIELD_PROP = KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY;
+  public static final String PARTITIONPATH_FIELD_PROP = KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY;
+
+  public static final String WRITE_PAYLOAD_CLASS = "hoodie.datasource.write.payload.class";
+  public static final String DEFAULT_WRITE_PAYLOAD_CLASS = OverwriteWithLatestAvroPayload.class.getName();
+  public static final String KEYGENERATOR_CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
+  public static final String DEFAULT_KEYGENERATOR_CLASS = SimpleAvroKeyGenerator.class.getName();

Review comment:
       Most of these options have been defined in `DataSourceOptions`, I have been planning to move them to `hudi-client-common` (https://issues.apache.org/jira/browse/HUDI-1438), till then we can reuse these options.
   Do you mind waiting for a while, we can continue this PR when HUDI-1438 is finished
   WDYT ?

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+

Review comment:
       Hi @garyli1019 , thanks for adding flink unit test infra.
   how about enhancing `HoodieWriteConfig` instead of adding `HoodieDataSourceConfig`, it seems they have lots in common




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-752601886


   > @garyli1019 Left some minor comments.
   
   @yanghua Thanks for reviewing. Addressed the comments


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r550163481



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieCommonTestHarness.java
##########
@@ -52,6 +53,24 @@ protected void initPath() {
     }
   }
 
+  /**
+   * Initializes a test data generator which used to generate test datas.
+   *
+   */
+  protected void initTestDataGenerator() {

Review comment:
       Why do we need to move this code snippet from `HoodieClientTestHarness` in this PR?

##########
File path: hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/testutils/HoodieFlinkClientTestHarness.java
##########
@@ -0,0 +1,136 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testutils;
+
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.common.testutils.HoodieTestUtils;
+
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.TestInfo;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+public class HoodieFlinkClientTestHarness extends HoodieCommonTestHarness implements Serializable {
+
+  protected static final Logger LOG = LogManager.getLogger(HoodieFlinkClientTestHarness.class);
+  private String testMethodName;
+  protected transient Configuration hadoopConf = null;
+  protected transient FileSystem fs;
+  protected transient MiniClusterWithClientResource flinkCluster = null;
+
+  @BeforeEach
+  public void setTestMethodName(TestInfo testInfo) {
+    if (testInfo.getTestMethod().isPresent()) {
+      testMethodName = testInfo.getTestMethod().get().getName();
+    } else {
+      testMethodName = "Unknown";
+    }
+  }
+
+  protected void initFlinkMiniCluster() {
+    flinkCluster = new MiniClusterWithClientResource(
+        new MiniClusterResourceConfiguration.Builder()
+            .setNumberSlotsPerTaskManager(2)
+            .setNumberTaskManagers(1)
+            .build());
+  }
+
+  protected void initFileSystem() {
+    hadoopConf = new Configuration();
+    initFileSystemWithConfiguration(hadoopConf);
+  }
+
+  private void initFileSystemWithConfiguration(Configuration configuration) {
+    if (basePath == null) {
+      throw new IllegalStateException("The base path has not been initialized.");
+    }
+    fs = FSUtils.getFs(basePath, configuration);
+    if (fs instanceof LocalFileSystem) {
+      LocalFileSystem lfs = (LocalFileSystem) fs;
+      // With LocalFileSystem, with checksum disabled, fs.open() returns an inputStream which is FSInputStream
+      // This causes ClassCastExceptions in LogRecordScanner (and potentially other places) calling fs.open
+      // So, for the tests, we enforce checksum verification to circumvent the problem
+      lfs.setVerifyChecksum(true);
+    }
+  }
+
+  /**
+   * Initializes an instance of {@link HoodieTableMetaClient} with a special table type specified by
+   * {@code getTableType()}.
+   *
+   * @throws IOException
+   */
+  protected void initMetaClient() throws IOException {
+    initMetaClient(getTableType());
+  }
+
+  protected void initMetaClient(HoodieTableType tableType) throws IOException {
+    if (basePath == null) {
+      throw new IllegalStateException("The base path has not been initialized.");
+    }
+    metaClient = HoodieTestUtils.init(hadoopConf, basePath, tableType);
+  }
+
+
+  /**
+   * Cleanups file system.
+   *
+   * @throws IOException
+   */
+  protected void cleanupFileSystem() throws IOException {
+    if (fs != null) {
+      LOG.warn("Closing file-system instance used in previous test-run");
+      fs.close();
+      fs = null;
+    }
+  }
+
+  protected void cleanupFlinkMiniCluster() {
+    if (flinkCluster != null) {
+      flinkCluster.after();
+      flinkCluster = null;
+    }
+  }
+
+  public static class SimpleSink implements SinkFunction<HoodieRecord> {

Review comment:
       `SimpleTestSinkFunction`  sounds better?

##########
File path: hudi-flink/pom.xml
##########
@@ -173,5 +173,102 @@
       <artifactId>bijection-avro_${scala.binary.version}</artifactId>
       <version>0.9.7</version>
     </dependency>
+
+    <!-- Test -->

Review comment:
       `Test suite` or `Test frameworks` sounds better?

##########
File path: hudi-flink/src/main/java/org/apache/hudi/source/JsonStringToHoodieRecordMapFunction.java
##########
@@ -40,32 +43,47 @@
  */
 public class JsonStringToHoodieRecordMapFunction implements MapFunction<String, HoodieRecord> {
 
-  private final HoodieFlinkStreamer.Config cfg;
+  private TypedProperties props;
   private KeyGenerator keyGenerator;
   private AvroConvertor avroConvertor;
+  private Option<String> schemaStr = Option.empty();
+  private String payloadClassName;
+  private String orderingField;
 
-  public JsonStringToHoodieRecordMapFunction(HoodieFlinkStreamer.Config cfg) {
-    this.cfg = cfg;
+  public JsonStringToHoodieRecordMapFunction(TypedProperties props) {
+    this.props = props;
+    init();
+  }
+
+  public JsonStringToHoodieRecordMapFunction(TypedProperties props, String schemaStr) {

Review comment:
       Can we reuse the first constructor?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r550191182



##########
File path: hudi-common/src/test/java/org/apache/hudi/common/testutils/HoodieCommonTestHarness.java
##########
@@ -52,6 +53,24 @@ protected void initPath() {
     }
   }
 
+  /**
+   * Initializes a test data generator which used to generate test datas.
+   *
+   */
+  protected void initTestDataGenerator() {

Review comment:
       This can be reuse by both Spark and Flink client test harness and only depend on hudi's code.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-740613311


   Hi @wangxianghu @leesf @yanghua , if any of you have time, please help to review this PR, thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-742519681


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=h1) Report
   > Merging [#2281](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=desc) (5b5e2fd) into [master](https://codecov.io/gh/apache/hudi/commit/bd9cceccb582ede88b989824241498e8c32d4f13?el=desc) (bd9ccec) will **decrease** coverage by `41.88%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2281/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2281       +/-   ##
   =============================================
   - Coverage     52.29%   10.40%   -41.89%     
   + Complexity     2630       48     -2582     
   =============================================
     Files           329       51      -278     
     Lines         14743     1787    -12956     
     Branches       1484      213     -1271     
   =============================================
   - Hits           7710      186     -7524     
   + Misses         6419     1588     -4831     
   + Partials        614       13      -601     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `10.40% <ø> (-59.71%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | ... and [306 more](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-752032727


   Ping one more time, @wangxianghu @liujinhui1994 @yanghua any of you guys could review this PR one more time? Thanks a lot!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-752642375


   > @garyli1019 Similar to the flink client, the java client also needs a similar test framework. Can it be added together?
   
   IMO, we can open a new PR for the java client.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-742519681


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=h1) Report
   > Merging [#2281](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=desc) (47e7bff) into [master](https://codecov.io/gh/apache/hudi/commit/38b9264dd041db955ffa4c0f10273e1b5fa21cfb?el=desc) (38b9264) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2281/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##             master    #2281   +/-   ##
   =========================================
     Coverage     52.20%   52.20%           
     Complexity     2659     2659           
   =========================================
     Files           335      335           
     Lines         14981    14981           
     Branches       1505     1505           
   =========================================
     Hits           7821     7821           
     Misses         6535     6535           
     Partials        625      625           
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `38.83% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiclient | `100.00% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudicommon | `54.76% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudihadoopmr | `33.29% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | huditimelineservice | `65.30% <ø> (ø)` | `0.00 <ø> (ø)` | |
   | hudiutilities | `69.65% <ø> (ø)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-742566398


   @wangxianghu Can you help to review this PR when you have time? thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] wangxianghu commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r544888525



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+
+  public static final String TABLE_NAME_PROP = HoodieWriteConfig.TABLE_NAME;
+  public static final String PRECOMBINE_FIELD_PROP = "hoodie.datasource.write.precombine.field";
+  public static final String RECORDKEY_FIELD_PROP = KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY;
+  public static final String PARTITIONPATH_FIELD_PROP = KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY;
+
+  public static final String WRITE_PAYLOAD_CLASS = "hoodie.datasource.write.payload.class";
+  public static final String DEFAULT_WRITE_PAYLOAD_CLASS = OverwriteWithLatestAvroPayload.class.getName();
+  public static final String KEYGENERATOR_CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
+  public static final String DEFAULT_KEYGENERATOR_CLASS = SimpleAvroKeyGenerator.class.getName();

Review comment:
       > I agree that we should move `DataSourceOptions` to `hudi-client-common`. Are we looking for moving them into`HoodieWriteConfig`? If so I can do it on this PR.
   > IMO we need to write tests for every single Flink-related feature actually, otherwise, the features are not trustworthy, so I think we should land this asap and force unit tests for the upcoming Flink features.
   
   I think `DataSourceOptions` should be independent of `HoodieWriteConfig` just as it is before.
   @liujinhui1994 do you have time for HUDI-1438 recently? if not, do you mind resign it to @garyli1019 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-750374816


   @yanghua @wangxianghu @liujinhui1994 I slightly change the config, shouldn't block the config refactoring I think. Please review again. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r544985094



##########
File path: hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/testutils/HoodieFlinkClientTestHarness.java
##########
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testutils;
+
+import org.apache.hudi.client.HoodieFlinkWriteClient;
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.view.HoodieTableFileSystemView;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.common.testutils.HoodieTestUtils;
+import org.apache.hudi.common.testutils.minicluster.HdfsTestService;
+
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.TestInfo;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+public class HoodieFlinkClientTestHarness extends HoodieCommonTestHarness implements Serializable {
+
+  protected static final Logger LOG = LogManager.getLogger(HoodieFlinkClientTestHarness.class);
+  private String testMethodName;
+  protected transient Configuration hadoopConf = null;
+  protected transient FileSystem fs;
+  protected transient HoodieFlinkWriteClient writeClient;

Review comment:
       There are many fields that have not been used. Can we add them when we would use them?

##########
File path: hudi-client/hudi-flink-client/src/test/java/org/apache/hudi/testutils/HoodieFlinkClientTestHarness.java
##########
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.testutils;
+
+import org.apache.hudi.client.HoodieFlinkWriteClient;
+import org.apache.hudi.common.fs.FSUtils;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.table.view.HoodieTableFileSystemView;
+import org.apache.hudi.common.testutils.HoodieCommonTestHarness;
+import org.apache.hudi.common.testutils.HoodieTestUtils;
+import org.apache.hudi.common.testutils.minicluster.HdfsTestService;
+
+import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
+import org.apache.flink.streaming.api.functions.sink.SinkFunction;
+import org.apache.flink.test.util.MiniClusterWithClientResource;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.hdfs.DistributedFileSystem;
+import org.apache.hadoop.hdfs.MiniDFSCluster;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.TestInfo;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.List;
+
+public class HoodieFlinkClientTestHarness extends HoodieCommonTestHarness implements Serializable {
+
+  protected static final Logger LOG = LogManager.getLogger(HoodieFlinkClientTestHarness.class);
+  private String testMethodName;
+  protected transient Configuration hadoopConf = null;
+  protected transient FileSystem fs;
+  protected transient HoodieFlinkWriteClient writeClient;
+  protected transient HoodieTableFileSystemView tableView;
+  protected transient MiniClusterWithClientResource flinkCluster = null;
+
+  // dfs
+  protected String dfsBasePath;
+  protected transient HdfsTestService hdfsTestService;
+  protected transient MiniDFSCluster dfsCluster;
+  protected transient DistributedFileSystem dfs;
+
+  @BeforeEach
+  public void setTestMethodName(TestInfo testInfo) {
+    if (testInfo.getTestMethod().isPresent()) {
+      testMethodName = testInfo.getTestMethod().get().getName();
+    } else {
+      testMethodName = "Unknown";
+    }
+  }
+
+  protected void initFlinkMiniCluster() {
+    flinkCluster = new MiniClusterWithClientResource(
+        new MiniClusterResourceConfiguration.Builder()
+            .setNumberSlotsPerTaskManager(2)
+            .setNumberTaskManagers(1)
+            .build());
+  }
+
+  protected void initFileSystem() {
+    hadoopConf = new Configuration();
+    initFileSystemWithConfiguration(hadoopConf);
+  }
+
+  private void initFileSystemWithConfiguration(Configuration configuration) {
+    if (basePath == null) {
+      throw new IllegalStateException("The base path has not been initialized.");
+    }
+    fs = FSUtils.getFs(basePath, configuration);
+    if (fs instanceof LocalFileSystem) {
+      LocalFileSystem lfs = (LocalFileSystem) fs;
+      // With LocalFileSystem, with checksum disabled, fs.open() returns an inputStream which is FSInputStream
+      // This causes ClassCastExceptions in LogRecordScanner (and potentially other places) calling fs.open
+      // So, for the tests, we enforce checksum verification to circumvent the problem
+      lfs.setVerifyChecksum(true);
+    }
+  }
+
+  /**
+   * Initializes an instance of {@link HoodieTableMetaClient} with a special table type specified by
+   * {@code getTableType()}.
+   *
+   * @throws IOException
+   */
+  protected void initMetaClient() throws IOException {
+    initMetaClient(getTableType());
+  }
+
+  protected void initMetaClient(HoodieTableType tableType) throws IOException {
+    if (basePath == null) {
+      throw new IllegalStateException("The base path has not been initialized.");
+    }
+    metaClient = HoodieTestUtils.init(hadoopConf, basePath, tableType);
+  }
+
+
+  /**
+   * Cleanups file system.
+   *
+   * @throws IOException
+   */
+  protected void cleanupFileSystem() throws IOException {
+    if (fs != null) {
+      LOG.warn("Closing file-system instance used in previous test-run");
+      fs.close();
+      fs = null;
+    }
+  }
+
+  protected void cleanupFlinkMiniCluster() {
+    if (flinkCluster != null) {
+      flinkCluster.after();
+      flinkCluster = null;
+    }
+  }
+
+  public static class SimpleSink implements SinkFunction<HoodieRecord> {
+
+    // must be static
+    public static List<HoodieRecord> valuesList = new ArrayList<>();
+
+    @Override
+    public synchronized void invoke(HoodieRecord value) throws Exception {

Review comment:
       Since this method has been deprecated. Can we use the normal one?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] yanghua commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
yanghua commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-752121027


   > Ping one more time, @wangxianghu @liujinhui1994 @yanghua any of you guys could review this PR one more time? Thanks a lot!
   
   Will review it tomorrow.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-io edited a comment on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-742519681


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=h1) Report
   > Merging [#2281](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=desc) (f58a7b7) into [master](https://codecov.io/gh/apache/hudi/commit/bd9cceccb582ede88b989824241498e8c32d4f13?el=desc) (bd9ccec) will **decrease** coverage by `41.88%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/2281/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #2281       +/-   ##
   =============================================
   - Coverage     52.29%   10.40%   -41.89%     
   + Complexity     2630       48     -2582     
   =============================================
     Files           329       51      -278     
     Lines         14743     1787    -12956     
     Branches       1484      213     -1271     
   =============================================
   - Hits           7710      186     -7524     
   + Misses         6419     1588     -4831     
   + Partials        614       13      -601     
   ```
   
   | Flag | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | hudicli | `?` | `?` | |
   | hudiclient | `?` | `?` | |
   | hudicommon | `?` | `?` | |
   | hudihadoopmr | `?` | `?` | |
   | huditimelineservice | `?` | `?` | |
   | hudiutilities | `10.40% <ø> (-59.71%)` | `0.00 <ø> (ø)` | |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/2281?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [.../org/apache/hudi/utilities/sources/AvroSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQXZyb1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [.../org/apache/hudi/utilities/sources/JsonSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvblNvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
   | [...rg/apache/hudi/utilities/sources/CsvDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvQ3N2REZTU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-10.00%)` | |
   | [...g/apache/hudi/utilities/sources/JsonDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkRGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | [...apache/hudi/utilities/sources/JsonKafkaSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvSnNvbkthZmthU291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-6.00%)` | |
   | [...pache/hudi/utilities/sources/ParquetDFSSource.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUGFycXVldERGU1NvdXJjZS5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-5.00%)` | |
   | [...lities/schema/SchemaProviderWithPostProcessor.java](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFQcm92aWRlcldpdGhQb3N0UHJvY2Vzc29yLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
   | ... and [306 more](https://codecov.io/gh/apache/hudi/pull/2281/diff?src=pr&el=tree-more) | |
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r544952178



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+
+  public static final String TABLE_NAME_PROP = HoodieWriteConfig.TABLE_NAME;
+  public static final String PRECOMBINE_FIELD_PROP = "hoodie.datasource.write.precombine.field";
+  public static final String RECORDKEY_FIELD_PROP = KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY;
+  public static final String PARTITIONPATH_FIELD_PROP = KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY;
+
+  public static final String WRITE_PAYLOAD_CLASS = "hoodie.datasource.write.payload.class";
+  public static final String DEFAULT_WRITE_PAYLOAD_CLASS = OverwriteWithLatestAvroPayload.class.getName();
+  public static final String KEYGENERATOR_CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
+  public static final String DEFAULT_KEYGENERATOR_CLASS = SimpleAvroKeyGenerator.class.getName();

Review comment:
       @liujinhui1994 awesome!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] garyli1019 commented on a change in pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
garyli1019 commented on a change in pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#discussion_r544885347



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieDataSourceConfig.java
##########
@@ -0,0 +1,102 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.config;
+
+import org.apache.hudi.common.config.DefaultHoodieConfig;
+import org.apache.hudi.common.model.OverwriteWithLatestAvroPayload;
+import org.apache.hudi.keygen.SimpleAvroKeyGenerator;
+import org.apache.hudi.keygen.constant.KeyGeneratorOptions;
+
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.Properties;
+
+public class HoodieDataSourceConfig extends DefaultHoodieConfig {
+
+  public static final String TABLE_NAME_PROP = HoodieWriteConfig.TABLE_NAME;
+  public static final String PRECOMBINE_FIELD_PROP = "hoodie.datasource.write.precombine.field";
+  public static final String RECORDKEY_FIELD_PROP = KeyGeneratorOptions.RECORDKEY_FIELD_OPT_KEY;
+  public static final String PARTITIONPATH_FIELD_PROP = KeyGeneratorOptions.PARTITIONPATH_FIELD_OPT_KEY;
+
+  public static final String WRITE_PAYLOAD_CLASS = "hoodie.datasource.write.payload.class";
+  public static final String DEFAULT_WRITE_PAYLOAD_CLASS = OverwriteWithLatestAvroPayload.class.getName();
+  public static final String KEYGENERATOR_CLASS_PROP = "hoodie.datasource.write.keygenerator.class";
+  public static final String DEFAULT_KEYGENERATOR_CLASS = SimpleAvroKeyGenerator.class.getName();

Review comment:
       I agree that we should move `DataSourceOptions` to `hudi-client-common`. Are we looking for moving them into`HoodieWriteConfig`? If so I can do it on this PR. 
   IMO we need to write tests for every single Flink-related feature actually, otherwise, the features are not trustworthy, so I think we should land this asap and force unit tests for the upcoming Flink features.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] shenh062326 commented on pull request #2281: [HUDI-1418] set up flink client unit test infra

Posted by GitBox <gi...@apache.org>.
shenh062326 commented on pull request #2281:
URL: https://github.com/apache/hudi/pull/2281#issuecomment-752517398


   @garyli1019 Similar to the flink client, the java client also needs a similar test framework. Can it be added together?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org