Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/12/29 03:58:22 UTC

[GitHub] [incubator-hudi] dengziming opened a new pull request #1151: Hudi-476: Add hudi-examples module

dengziming opened a new pull request #1151: Hudi-476: Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151
 
 
   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   This PR adds a hudi-examples module; see [HUDI-476](https://issues.apache.org/jira/browse/HUDI-476).
   
   ## Brief change log
   
     - add hudi-examples module and add other modules as dependencies in pom
     - add scala dependencies and scala-maven-plugin in parent pom
   
   ## Verify this pull request
   
   *(Please pick either of the following options)*
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r383479608
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   This ties back to how we let users run the examples. Another way is to not have a fat jar here, and instead have a `run_hudi_example.sh` script that just uses the spark-bundle/utilities-bundle after ksql is built.
   
   This way, we don't have to maintain this bundle separately. Users will be using the bundles under `packaging` in production anyway, so why not just reuse them?
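
To make the suggestion concrete, here is a minimal sketch of the `spark-submit` command line such a run script might assemble. It is written in Java to match the rest of the module and only prints the command. The jar paths are hypothetical placeholders, not the real artifact names; the main class and the two application arguments are taken from the `HoodieWriteClientExample` Javadoc in this pull request.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: builds (but does not execute) a spark-submit command that runs
 * an example class against a prebuilt bundle jar instead of a dedicated fat jar.
 */
class RunExampleCommandSketch {

  static List<String> sparkSubmitCommand(String bundleJar, String examplesJar,
                                         String mainClass, String... appArgs) {
    List<String> cmd = new ArrayList<>();
    cmd.add("spark-submit");
    cmd.add("--master");
    cmd.add("local[2]");          // same master the example Javadoc suggests for local runs
    cmd.add("--class");
    cmd.add(mainClass);
    cmd.add("--jars");
    cmd.add(bundleJar);           // reuse the bundle built under `packaging`
    cmd.add(examplesJar);         // thin jar that holds only the example classes
    for (String arg : appArgs) {
      cmd.add(arg);
    }
    return cmd;
  }

  public static void main(String[] args) {
    List<String> cmd = sparkSubmitCommand(
        "/path/to/hudi-spark-bundle.jar",   // hypothetical path
        "/path/to/hudi-examples.jar",       // hypothetical path
        "org.apache.hudi.examples.spark.HoodieWriteClientExample",
        "file:///tmp/hoodie/sample-table", "hoodie_rt");
    System.out.println(String.join(" ", cmd));
  }
}
```

With this layout the examples module would ship only a thin jar, and the bundle would stay the single artifact to maintain.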


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375657392
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/spark/HoodieWriteClientExample.java
 ##########
 @@ -0,0 +1,133 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.spark;
+
+import org.apache.hudi.HoodieWriteClient;
+import org.apache.hudi.WriteStatus;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.util.FSUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.index.HoodieIndex;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+
+/**
+ * Simple examples of #{@link HoodieWriteClient}.
+ *
+ * To run this example, you should
+ *   1. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   2. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieWriteClientExample <tablePath> <tableName>
+ * <tablePath> and <tableName> describe root path of hudi and table name
+ * for example, `HoodieWriteClientExample file:///tmp/hoodie/sample-table hoodie_rt`
+ */
+public class HoodieWriteClientExample {
 
 Review comment:
   Please feel free to remove `HoodieClientExample` in hudi-client/src/test/java in favor of this (we may need to confirm that the integ-test does not depend on it).


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-614795517
 
 
   >we can just delete the 3 XxxDeltaStreamerExample and replace them with a run_examples.sh and some config files?
   
   Yes. Based on your previous comment, that would probably make sense.


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375657650
 
 

 ##########
 File path: pom.xml
 ##########
 @@ -408,6 +418,13 @@
         <version>${log4j.version}</version>
       </dependency>
 
+      <!-- Scala -->
 
 Review comment:
   Why is this needed at the parent pom level?


[GitHub] [incubator-hudi] codecov-io commented on issue #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-593277561
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-hudi/commit/2d040145810b8b14c59c5882f9115698351039d1?src=pr&el=desc) will **decrease** coverage by `66.45%`.
   > The diff coverage is `0%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-hudi/pull/1151/graphs/tree.svg?width=650&token=VTTXabwbs2&height=150&src=pr)](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #1151       +/-   ##
   ============================================
   - Coverage     67.09%   0.64%   -66.46%     
   + Complexity      223       2      -221     
   ============================================
     Files           333     287       -46     
     Lines         16216   14320     -1896     
     Branches       1659    1465      -194     
   ============================================
   - Hits          10880      92    -10788     
   - Misses         4598   14225     +9627     
   + Partials        738       3      -735
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `0% <0%> (-84.62%)` | `0 <0> (ø)` | |
   | [...che/hudi/common/table/timeline/dto/LogFileDTO.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL2R0by9Mb2dGaWxlRFRPLmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [...apache/hudi/common/model/HoodieDeltaWriteStat.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZURlbHRhV3JpdGVTdGF0LmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [...org/apache/hudi/common/model/HoodieFileFormat.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUZpbGVGb3JtYXQuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [...g/apache/hudi/execution/BulkInsertMapFunction.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhlY3V0aW9uL0J1bGtJbnNlcnRNYXBGdW5jdGlvbi5qYXZh) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [.../common/util/queue/IteratorBasedQueueProducer.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvcXVldWUvSXRlcmF0b3JCYXNlZFF1ZXVlUHJvZHVjZXIuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [...rg/apache/hudi/index/bloom/KeyRangeLookupTree.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvYmxvb20vS2V5UmFuZ2VMb29rdXBUcmVlLmphdmE=) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [...e/hudi/common/table/timeline/dto/FileGroupDTO.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL3RpbWVsaW5lL2R0by9GaWxlR3JvdXBEVE8uamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [...apache/hudi/timeline/service/handlers/Handler.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS10aW1lbGluZS1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL3RpbWVsaW5lL3NlcnZpY2UvaGFuZGxlcnMvSGFuZGxlci5qYXZh) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | [.../common/util/queue/FunctionBasedQueueProducer.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvcXVldWUvRnVuY3Rpb25CYXNlZFF1ZXVlUHJvZHVjZXIuamF2YQ==) | `0% <0%> (-100%)` | `0% <0%> (ø)` | |
   | ... and [287 more](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=footer). Last update [2d04014...3c3703e](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


[GitHub] [incubator-hudi] dengziming commented on issue #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569496952
 
 
   @yanghua thank you, I just rebased the code. Also cc @vinothchandar, PTAL.


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375656811
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerKafkaSourceExample.java
 ##########
 @@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonKafkaSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonKafkaSource}.
+ *
+ * To run this example, you should
+ *    1. Start Zookeeper and the Kafka demo server
+ *    2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *    3. For running in shell, using `spark-submit`
+ *    4. produce some data to hoodie-source-topic configured by `hoodie.deltastreamer.source.kafka.topic`
+ *
+ * Usage: HoodieDeltaStreamerKafkaSourceExample \
+ *        --target-base-path /tmp/hoodie/kafkadeltatable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table kafkadeltatable
+ */
+public class HoodieDeltaStreamerKafkaSourceExample {
+
+  public static void main(String[] args) throws Exception {
+
+    final HoodieDeltaStreamer.Config cfg = defaultKafkaDeltaStreamerConfig();
+    new JCommander(cfg).parse(args);
+
+    SparkConf sparkConf = HoodieExampleSparkUtils.defaultSparkConf("hoodie-delta-streamer-kafka-source-example");
+    JavaSparkContext jsc = new JavaSparkContext(sparkConf);
+
+    try {
+      new HoodieDeltaStreamer(cfg, jsc).sync();
+    } finally {
+      jsc.stop();
+    }
+  }
+
+  /**
+   * also see #{@link HoodieDeltaStreamer.Config} for more params.
+   * @return default params for Kafka DeltaStreamer
+   */
+  private static HoodieDeltaStreamer.Config defaultKafkaDeltaStreamerConfig() {
+
+    HoodieDeltaStreamer.Config cfg = new HoodieDeltaStreamer.Config();
+
+    cfg.configs.add(String.format("%s=uuid", DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY()));
 
 Review comment:
   Could we add a few more comments on the important configs to guide the user? For example, for this line:
   `    cfg.configs.add("bootstrap.servers=localhost:9092");`
   
   we could add `// The kafka cluster we want to ingest from`.
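
As an illustration of the kind of commenting being asked for, here is a dependency-free sketch. The `configs` list is a stand-in for `cfg.configs` on `HoodieDeltaStreamer.Config`, and the record key string is assumed to be the value returned by `DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY()`; the topic name and its config key come from the example's Javadoc.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only: a plain list stands in for HoodieDeltaStreamer.Config#configs
 * so the snippet compiles without Hudi on the classpath.
 */
class CommentedKafkaConfigSketch {

  // Assumed value of DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY()
  static final String RECORDKEY_FIELD = "hoodie.datasource.write.recordkey.field";

  static List<String> defaultKafkaConfigs() {
    List<String> configs = new ArrayList<>();

    // The field in each incoming record that uniquely identifies it
    configs.add(String.format("%s=uuid", RECORDKEY_FIELD));

    // The kafka cluster we want to ingest from
    configs.add("bootstrap.servers=localhost:9092");

    // The Kafka topic the example reads from
    configs.add("hoodie.deltastreamer.source.kafka.topic=hoodie-source-topic");

    return configs;
  }

  public static void main(String[] args) {
    defaultKafkaConfigs().forEach(System.out::println);
  }
}
```

One comment per `configs.add(...)` call keeps the explanation next to the setting it describes, which is what a reader copying the example is most likely to need.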


[GitHub] [incubator-hudi] dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569914887
 
 
   Hi, @vinothchandar @yanghua 
   I just updated the PR and added 2 examples: the HoodieWriteClient API and the datasource API. For other examples, such as HoodieDeltaStreamer and the Structured Streaming use case, I think it's better to add more subtasks so that each one is easier to code review.


[GitHub] [hudi] codecov-commenter edited a comment on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-634017621


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/2a56f82908a8b8f788a7547d3c707c144696c1df&el=desc) will **decrease** coverage by `53.42%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1151/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #1151       +/-   ##
   =============================================
   - Coverage     71.65%   18.22%   -53.43%     
   - Complexity      294      857      +563     
   =============================================
     Files           378      348       -30     
     Lines         16541    15332     -1209     
     Branches       1670     1523      -147     
   =============================================
   - Hits          11852     2794     -9058     
   - Misses         3957    12181     +8224     
   + Partials        732      357      -375     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `0.00% <0.00%> (-84.62%)` | `0.00 <0.00> (ø)` | |
   | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/common/model/ActionType.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0FjdGlvblR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...java/org/apache/hudi/io/HoodieRangeInfoHandle.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vSG9vZGllUmFuZ2VJbmZvSGFuZGxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/hadoop/InputPathHandler.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0lucHV0UGF0aEhhbmRsZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...a/org/apache/hudi/exception/HoodieIOException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUlPRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...org/apache/hudi/table/action/commit/SmallFile.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9TbWFsbEZpbGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...rg/apache/hudi/index/bloom/KeyRangeLookupTree.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvYmxvb20vS2V5UmFuZ2VMb29rdXBUcmVlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...g/apache/hudi/exception/HoodieInsertException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUluc2VydEV4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [330 more](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=footer). Last update [2a56f82...59fc6d7](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r383495270
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
+ *        --target-base-path /tmp/hoodie/dfsdeltatable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table dfsdeltatable
+ *
+ */
+public class HoodieDeltaStreamerDfsSourceExample {
+
+  public static void main(String[] args) throws Exception {
+
+    final HoodieDeltaStreamer.Config cfg = defaultDfsStreamerConfig();
 
 Review comment:
   I'll actually let you pick :)


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375654298
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>copy-dependencies</id>
+            <phase>prepare-package</phase>
+            <goals>
+              <goal>copy-dependencies</goal>
+            </goals>
+            <configuration>
+              <outputDirectory>${project.build.directory}/lib</outputDirectory>
+              <overWriteReleases>true</overWriteReleases>
+              <overWriteSnapshots>true</overWriteSnapshots>
+              <overWriteIfNewer>true</overWriteIfNewer>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>net.alchim31.maven</groupId>
+        <artifactId>scala-maven-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>scala-compile-first</id>
+            <phase>process-resources</phase>
+            <goals>
+              <goal>add-source</goal>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <executions>
+          <execution>
+            <phase>compile</phase>
+            <goals>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals>
+              <goal>test-jar</goal>
+            </goals>
+            <phase>test-compile</phase>
+          </execution>
+        </executions>
+        <configuration>
+          <skip>false</skip>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.scala-lang</groupId>
+      <artifactId>scala-library</artifactId>
+      <version>${scala.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-common</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-cli</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-client</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-utilities_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-spark_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-hadoop-mr</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-hive</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-timeline-service</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <!-- Spark -->
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-sql_${scala.binary.version}</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-avro_${scala.binary.version}</artifactId>
+      <scope>provided</scope>
 
 Review comment:
   Why mark this specifically as `provided` here, given it's already in that scope in the parent pom?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-hudi] dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-581785306
 
 
   > @dengziming are you still working on this?
   
   I tried to reuse more code but can't improve it any further. I think it's ready for review, thank you!


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375653917
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   IIUC.. we will be building a fat jar for the purpose of running the examples from the command line? 


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375654859
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/common/HoodieExampleDataGenerator.java
 ##########
 @@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.common;
+
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.util.HoodieAvroUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.TypedProperties;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import java.util.UUID;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+import java.util.stream.Stream;
+
+
+/**
+ * Class to be used to generate test data.
+ */
+public class HoodieExampleDataGenerator<T extends HoodieRecordPayload<T>> {
 
 Review comment:
   I think we had to do this for QuickStartUtils as well.. cc @bhasudha .. Maybe we can create a code cleanup JIRA to consolidate this data generation into a common module inside Hudi and reuse it consistently? 


[GitHub] [incubator-hudi] dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-576130421
 
 
   @vinothchandar hi, vinoth, I have added the DeltaStreamExample.
   I ran `mvn test -B` successfully locally, but the Travis CI build failed with:
   ```
   [ERROR] Failed to execute goal on project hudi-examples: Could not resolve dependencies for project org.apache.hudi:hudi-examples:jar:0.5.1-SNAPSHOT: The following artifacts could not be resolved: org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT, org.apache.hudi:hudi-spark:jar:0.5.1-SNAPSHOT: Failure to find org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT in https://oss.sonatype.org/content/repositories/snapshots/ was cached in the local repository, resolution will not be reattempted until the update interval of sonatype-snapshots has elapsed or updates are forced -> [Help 1]
   ```
   I searched for this error and found that it can be solved by deleting the cached file in the local repository, but I don't have the privilege to do that. Could you help me solve this problem?
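
   For anyone hitting the same message locally: Maven records a failed lookup in a `*.lastUpdated` marker file inside the local repository, and running with `-U` forces a fresh check. The sketch below only lists those markers so they can be inspected or deleted. The class name `StaleResolutionMarkers` is hypothetical, and it assumes the default `~/.m2/repository` location; it does not claim that caching is the actual cause of the CI failure above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative helper (not part of Hudi or Maven): finds cached "resolution failed" markers. */
final class StaleResolutionMarkers {

  /** Returns every regular file under localRepo whose name ends with ".lastUpdated". */
  static List<Path> find(Path localRepo) {
    try (Stream<Path> paths = Files.walk(localRepo)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".lastUpdated"))
          .sorted()
          .collect(Collectors.toList());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Assumption: the local repository lives in the default location.
    Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
    if (Files.isDirectory(repo)) {
      find(repo).forEach(System.out::println);
    }
  }
}
```

   Deleting the listed files, or re-running the build with `mvn -U`, makes Maven retry the lookup instead of reusing the cached failure.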


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r383479608
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   This ties back to how we let the users run the examples. Another way is to not have a fat jar here, but just have a `run_hudi_example.sh` script that uses the spark-bundle/utilities-bundle after hudi is built.
   
   This way, we don't have to also maintain this bundle separately.. Users will be using the bundles under `packaging` in production anyway. So just reuse them?
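
   A rough sketch of what such a launcher could assemble: the thin examples jar is submitted with `spark-submit`, and a prebuilt bundle is passed via `--jars`. The class `RunHudiExampleSketch` and both jar paths are assumptions for illustration only; `--master`, `--class` and `--jars` are standard `spark-submit` options.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: builds (but does not execute) a spark-submit command line. */
final class RunHudiExampleSketch {

  // Hypothetical jar locations; the real names depend on the build output.
  static final String ASSUMED_EXAMPLES_JAR = "hudi-examples/target/hudi-examples.jar";
  static final String ASSUMED_BUNDLE_JAR = "packaging/hudi-utilities-bundle/target/hudi-utilities-bundle.jar";

  /** Orders the pieces as: spark-submit, Spark options, application jar, example arguments. */
  static List<String> buildCommand(String examplesJar, String bundleJar, String mainClass,
                                   String master, String... exampleArgs) {
    List<String> cmd = new ArrayList<>();
    cmd.add("spark-submit");
    cmd.add("--master");
    cmd.add(master);
    cmd.add("--class");
    cmd.add(mainClass);
    cmd.add("--jars");
    cmd.add(bundleJar);
    cmd.add(examplesJar);
    cmd.addAll(Arrays.asList(exampleArgs));
    return cmd;
  }

  public static void main(String[] args) {
    List<String> cmd = buildCommand(
        ASSUMED_EXAMPLES_JAR,
        ASSUMED_BUNDLE_JAR,
        "org.apache.hudi.examples.deltastreamer.HoodieDeltaStreamerDfsSourceExample",
        "local[2]",
        "--target-base-path", "/tmp/hoodie/dfsdeltatable",
        "--table-type", "MERGE_ON_READ",
        "--target-table", "dfsdeltatable");
    System.out.println(String.join(" ", cmd));
  }
}
```

   Keeping the bundle outside the examples artifact means the examples module only carries its own classes, which is the maintenance saving described above.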


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375656391
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
+ *        --target-base-path /tmp/hoodie/dfsdeltatable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table dfsdeltatable
+ *
+ */
+public class HoodieDeltaStreamerDfsSourceExample {
+
+  public static void main(String[] args) throws Exception {
+
+    final HoodieDeltaStreamer.Config cfg = defaultDfsStreamerConfig();
 
 Review comment:
   Since a typical user will provide configs to delta streamer via property files or the command line, can we follow the same approach instead of constructing the deltastreamer config object programmatically? (The programmatic approach is awesome for the spark datasource, where users typically supply options in code.)
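
   To make the suggestion concrete, here is a minimal sketch of the file-plus-override pattern using only `java.util.Properties`. The class `ExampleConfigSketch`, the sample values, and the `key=value` override syntax are assumptions for illustration; this is not how `HoodieDeltaStreamer` itself parses its arguments, and the property keys should be checked against the Hudi configuration docs.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Properties;

/** Illustrative only: file-based defaults with command-line style overrides. */
final class ExampleConfigSketch {

  /** Writes a small sample properties file; keys and values are for illustration. */
  static Path writeSampleProps(Path file) {
    try {
      return Files.write(file, Arrays.asList(
          "hoodie.datasource.write.recordkey.field=uuid",
          "hoodie.deltastreamer.source.dfs.root=/tmp/hudi-examples/dfsdeltastreamer/input"));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Loads the file first, then applies each "key=value" override in order. */
  static Properties load(Path propsFile, String... overrides) {
    Properties props = new Properties();
    try (Reader reader = Files.newBufferedReader(propsFile)) {
      props.load(reader);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    for (String override : overrides) {
      int eq = override.indexOf('=');
      if (eq <= 0) {
        throw new IllegalArgumentException("Expected key=value but got: " + override);
      }
      props.setProperty(override.substring(0, eq).trim(), override.substring(eq + 1).trim());
    }
    return props;
  }
}
```

   With this shape, an example's `main` only needs a path to a properties file plus optional overrides, so the same file can be reused when the user moves from the example to a real job.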


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r390827545
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
 
 Review comment:
   This is a good idea, I will try to make the data prep part of the examples themselves.


[GitHub] [incubator-hudi] dengziming commented on issue #1151: Hudi-476: Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: Hudi-476: Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569474270
 
 
   @leesf Hi, PTAL.


[GitHub] [incubator-hudi] dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-573487458
 
 
   > Could we also provide delta streamer examples?
   
   @vinothchandar hi, I am working on it. `HoodieDeltaStreamer` seems more complex, so I need some time to debug and review the code, but it will be completed soon.


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r386132370
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   this is probably the biggest item we need to decide on? 


[GitHub] [hudi] lamber-ken commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
lamber-ken commented on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-633929289


   **Changes:**
   
   1. Remove `Deprecated` usages.
   2. Update scripts.





[GitHub] [hudi] lamber-ken merged pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
lamber-ken merged pull request #1151:
URL: https://github.com/apache/hudi/pull/1151


   





[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r383477929
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
 
 Review comment:
   a shell script to run any Example class is a better idea.. I was just referring to having a command that users can just copy paste to a terminal and run.. 
   
   >>data prep part of the examples themselves and then also provide sane defaults for input/output paths
   
   Thoughts on this? 
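
   As a sketch of what "data prep inside the example" could look like, the class below writes a few JSON records to a default input directory unless a path is passed in. `ExampleDataPrepSketch`, the file name `sample.json` and the record fields are made up for illustration; only the two default paths come from the earlier review comment.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: an example that prepares its own input under default paths. */
final class ExampleDataPrepSketch {

  // Defaults suggested in the review; callers may override them via args.
  static final String DEFAULT_INPUT = "/tmp/hudi-examples/dfsdeltastreamer/input";
  static final String DEFAULT_OUTPUT = "/tmp/hudi-examples/dfsdeltastreamer/output";

  /** Returns args[index] as a path when present, otherwise the fallback. */
  static Path pathOrDefault(String[] args, int index, String fallback) {
    return Paths.get(args.length > index ? args[index] : fallback);
  }

  /** Writes numRecords JSON lines (made-up fields) to inputDir/sample.json. */
  static Path prepareInput(Path inputDir, int numRecords) {
    List<String> lines = new ArrayList<>();
    for (int i = 0; i < numRecords; i++) {
      lines.add("{\"id\": \"key-" + i + "\", \"ts\": " + (1000L + i) + "}");
    }
    try {
      Files.createDirectories(inputDir);
      return Files.write(inputDir.resolve("sample.json"), lines);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path input = pathOrDefault(args, 0, DEFAULT_INPUT);
    Path output = pathOrDefault(args, 1, DEFAULT_OUTPUT);
    Path file = prepareInput(input, 10);
    System.out.println("Sample input written to " + file);
    System.out.println("The example would then write its table under " + output);
  }
}
```

   Running it with no arguments uses the defaults, so a copy-paste command works out of the box while still allowing both paths to be overridden.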
   


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r390802033
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   I think it's better to not have a fat jar here and add a `run_hudi_example.sh`.


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375655701
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
 
 Review comment:
   Can we provide a full working command for this and all the other examples? Also, as a general comment, examples are great when the user can just hit run or execute a command, and the example takes care of things like data prep (step 1).
   
   Can we make data prep part of the examples themselves and also provide sane defaults for the input/output paths, e.g. `/tmp/hudi-examples/dfsdeltastreamer/input` and `/tmp/hudi-examples/dfsdeltastreamer/output`?
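
A minimal sketch of the suggestion above, assuming a hypothetical helper class `ExampleDataPrep` that is not part of the PR. The default paths are the ones proposed in the comment; the record layout and the file name `sample.json` are placeholders chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PR: names and record layout are illustrative.
final class ExampleDataPrep {

  // Default locations proposed in the review comment.
  static final String DEFAULT_INPUT = "/tmp/hudi-examples/dfsdeltastreamer/input";
  static final String DEFAULT_OUTPUT = "/tmp/hudi-examples/dfsdeltastreamer/output";

  // Use args[index] when the caller supplied it, otherwise fall back to the default.
  static String argOrDefault(String[] args, int index, String fallback) {
    return (args != null && args.length > index) ? args[index] : fallback;
  }

  // Write `count` small JSON records, one per line, so no manual data prep step is needed.
  static Path prepareInput(Path inputDir, int count) throws IOException {
    Files.createDirectories(inputDir);
    List<String> lines = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      lines.add("{\"id\": \"key-" + i + "\", \"ts\": " + i + "}");
    }
    return Files.write(inputDir.resolve("sample.json"), lines, StandardCharsets.UTF_8);
  }
}
```

An example's `main` could then resolve its input path with `argOrDefault(args, 0, DEFAULT_INPUT)` and call `prepareInput` before starting the job, so running it with no arguments still works.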


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r379895663
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
 
 Review comment:
   You mean, we just add a shell script file to run all these examples?


[GitHub] [incubator-hudi] dengziming commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-613764273
 
 
   Thanks for your reply, @vinothchandar. Do you mean we can just delete the 3 XxxDeltaStreamerExample classes and replace them with a run_examples.sh and some config files?


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-613743806
 
 
   @dengziming thanks for getting back.
   
   I think we can solve both problems for deltastreamer by simply reusing the deltastreamer from the utilities-bundle and providing the command line arguments and config files via the run_examples.sh script.
   
   DataSource examples can hopefully be dealt with easily. It may even be worth getting those working first, if that helps us move forward.
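
A rough sketch of the command such a script would assemble, written in Java to stay in the language of the module. `DeltaStreamerCommand` is a hypothetical name, and the bundle jar path is a parameter because its real location depends on the build. The main class and the three `--target-*`/`--table-type` options are the ones shown elsewhere in this thread; any config-file option is left to `extraArgs` rather than assumed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: assembles the spark-submit invocation a run script would issue.
final class DeltaStreamerCommand {

  static List<String> build(String bundleJar, String basePath, String tableType,
                            String tableName, List<String> extraArgs) {
    List<String> cmd = new ArrayList<>(Arrays.asList(
        "spark-submit",
        "--class", "org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer",
        bundleJar,                        // utilities bundle produced by the hudi build
        "--target-base-path", basePath,
        "--table-type", tableType,
        "--target-table", tableName));
    cmd.addAll(extraArgs);                // e.g. options pointing at example config files
    return cmd;
  }
}
```

The resulting list could be handed to `new ProcessBuilder(cmd).inheritIO().start()`, or simply printed so that the same line can be pasted into a shell script.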


[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r368602223
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/spark/HoodieWriteClientExample.java
 ##########
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.spark;
+
+import org.apache.hudi.HoodieWriteClient;
+import org.apache.hudi.WriteStatus;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.util.FSUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.index.HoodieIndex;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+
+/**
+ * Simple examples of #{@link HoodieWriteClient}.
+ *
+ * To run this example, you should
+ *   1. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   2. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieWriteClientExample <tablePath> <tableName>
+ * <tablePath> and <tableName> describe root path of hudi and table name
+ * for example, `HoodieWriteClientExample file:///tmp/hoodie/sample-table hoodie_rt`
+ */
+public class HoodieWriteClientExample {
+
+  private static final Logger LOG = LogManager.getLogger(HoodieWriteClientExample.class);
+
+  private static String tableType = HoodieTableType.COPY_ON_WRITE.name();
+
+  public static void main(String[] args) throws Exception {
+    if (args.length < 2) {
+      System.err.println("Usage: HoodieWriteClientExample <tablePath> <tableName>");
+      System.exit(1);
+    }
+    String tablePath = args[0];
+    String tableName = args[1];
+    SparkConf sparkConf = new SparkConf().setAppName("hoodie-client-example");
+    sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
+    sparkConf.set("spark.kryoserializer.buffer.max", "512m");
+    sparkConf.set("spark.some.config.option", "some-value");
 
 Review comment:
   Can we expose a function to get the sparkConf in some utility class in this module? I see this code duplicated in every class. @dengziming
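
One possible shape for such a utility, as a sketch only. `ExampleSparkDefaults` is a hypothetical name, and the method returns a plain map so the snippet runs without Spark on the classpath. The serializer and buffer settings are the ones hard coded in the diff above.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical utility: one place for the settings every example currently repeats.
final class ExampleSparkDefaults {

  static Map<String, String> defaultSparkSettings(String appName) {
    Map<String, String> settings = new LinkedHashMap<>();
    settings.put("spark.app.name", appName);
    settings.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
    settings.put("spark.kryoserializer.buffer.max", "512m");
    // Read-only view so individual examples cannot silently change the shared defaults.
    return Collections.unmodifiableMap(settings);
  }
}
```

Inside the module, each entry could be applied to a `SparkConf` with `set(key, value)`, which leaves every example with a single call instead of a copied block.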


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r383479608
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   This ties back to how we let the users run the examples. Another way is to not have a fat jar here, and instead have a `run_hudi_example.sh` script that just uses the spark-bundle/utilities-bundle after hudi is built.
   
   This way, we don't have to maintain this bundle separately. Users will be using the bundles under `packaging` in production anyway, so why not just reuse them?


[GitHub] [incubator-hudi] dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569914887
 
 
   Hi, @vinothchandar @yanghua 
   I just updated the PR and added 2 examples: the HoodieWriteClient API and the datasource API. For the other examples, such as HoodieDeltaStreamer and the structured streaming use case, I think it's better to add more subtasks to make code review easier, since those examples are relatively complex.


[GitHub] [incubator-hudi] yanghua commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569851801
 
 
   Agreed, let's initialize the module with at least some example code. At the very least, it shows that the contributor has the ability and energy to drive this work.


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375656925
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerKafkaSourceExample.java
 ##########
 @@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonKafkaSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonKafkaSource}.
 
 Review comment:
   Also, can we add more javadoc descriptions here on what each example is trying to achieve?


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r379895474
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
+ *        --target-base-path /tmp/hoodie/dfsdeltatable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table dfsdeltatable
+ *
+ */
+public class HoodieDeltaStreamerDfsSourceExample {
+
+  public static void main(String[] args) throws Exception {
+
+    final HoodieDeltaStreamer.Config cfg = defaultDfsStreamerConfig();
 
 Review comment:
   Some configs are indispensable to run the code; for example, we hard code `configs.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer")` in every example. So I just hard coded some of the required configs, and they can be overridden by property files or the command line. Alternatively, we can remove the hard-coded values and document them in the javadoc descriptions. Which one do you prefer?
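
The override order described here (hard-coded defaults, then property file, then command line) can be sketched as follows. `LayeredExampleConfig` is a hypothetical name and the snippet uses only the JDK; it illustrates the precedence and is not the code in the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: later sources override earlier ones.
final class LayeredExampleConfig {

  static Map<String, String> resolve(Map<String, String> hardCodedDefaults,
                                     Properties fileProps,
                                     Map<String, String> commandLineOverrides) {
    // 1. Start from the defaults that let the example run out of the box.
    Map<String, String> merged = new LinkedHashMap<>(hardCodedDefaults);
    // 2. Values from a property file replace the defaults.
    for (String name : fileProps.stringPropertyNames()) {
      merged.put(name, fileProps.getProperty(name));
    }
    // 3. Command-line values have the final say.
    merged.putAll(commandLineOverrides);
    return merged;
  }
}
```

With this ordering, a required setting such as `spark.serializer` stays in place unless the user deliberately overrides it, which keeps the examples runnable without extra setup.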


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r390818477
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
+ *        --target-base-path /tmp/hoodie/dfsdeltatable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table dfsdeltatable
+ *
+ */
+public class HoodieDeltaStreamerDfsSourceExample {
+
+  public static void main(String[] args) throws Exception {
+
+    final HoodieDeltaStreamer.Config cfg = defaultDfsStreamerConfig();
 
 Review comment:
   The advantage of adding these configs in code is that developers and users can run the examples directly. The main purpose of the examples is to give users a tutorial, and they will remove the hard-coded values when developing their own applications, so we can keep them.


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r368929527
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/common/HoodieExampleDataGenerator.java
 ##########
 @@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.common;
+
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.util.HoodieAvroUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.TypedProperties;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import java.util.UUID;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+import java.util.stream.Stream;
+
+
+/**
+ * Class to be used to generate test data.
+ */
+public class HoodieExampleDataGenerator<T extends HoodieRecordPayload<T>> {
 
 Review comment:
   I tried to use HoodieTestDataGenerator, but that class is in the `hudi-client/src/test` directory and depends on some other test classes, so it's difficult to move it out.
   So I think the best way is to copy some code from `HoodieTestDataGenerator`. After we move the example code in `hudi-client/src/test` to the `hudi-examples` module, we can delete that code from `HoodieTestDataGenerator`. This involves a lot of work, so we can add it to the sub-tasks of HUDI-475.
   What do you think? Suggestions are welcome.


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-573503338
 
 
   No worries.. please take your time!


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r375657083
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerSimpleExample.java
 ##########
 @@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.table.timeline.HoodieActiveTimeline;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.TypedProperties;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.InputBatch;
+import org.apache.hudi.utilities.sources.JsonSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SparkSession;
+
+import java.util.List;
+
+/**
+ * Simple examples of {@link HoodieDeltaStreamer}.
+ * this class use data from a mock {@link HoodieExampleDataGenerator}.
+ *
+ * To run this example, you should
+ *    1. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *    2. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerSimpleExample \
+ *        --target-base-path /tmp/hoodie/deltastreamertable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table deltastreamertable
+ */
+public class HoodieDeltaStreamerSimpleExample {
 
 Review comment:
   This can actually be an example for a custom data source.


[GitHub] [hudi] codecov-commenter edited a comment on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-634017621


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/2a56f82908a8b8f788a7547d3c707c144696c1df&el=desc) will **decrease** coverage by `53.42%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1151/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #1151       +/-   ##
   =============================================
   - Coverage     71.65%   18.22%   -53.43%     
   - Complexity      294      857      +563     
   =============================================
     Files           378      348       -30     
     Lines         16541    15332     -1209     
     Branches       1670     1523      -147     
   =============================================
   - Hits          11852     2794     -9058     
   - Misses         3957    12181     +8224     
   + Partials        732      357      -375     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `0.00% <0.00%> (-84.62%)` | `0.00 <0.00> (ø)` | |
   | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/common/model/ActionType.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0FjdGlvblR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...java/org/apache/hudi/io/HoodieRangeInfoHandle.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vSG9vZGllUmFuZ2VJbmZvSGFuZGxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/hadoop/InputPathHandler.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0lucHV0UGF0aEhhbmRsZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...a/org/apache/hudi/exception/HoodieIOException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUlPRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...org/apache/hudi/table/action/commit/SmallFile.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9TbWFsbEZpbGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...rg/apache/hudi/index/bloom/KeyRangeLookupTree.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvYmxvb20vS2V5UmFuZ2VMb29rdXBUcmVlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...g/apache/hudi/exception/HoodieInsertException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUluc2VydEV4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [330 more](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=footer). Last update [2a56f82...59fc6d7](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r379894998
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>copy-dependencies</id>
+            <phase>prepare-package</phase>
+            <goals>
+              <goal>copy-dependencies</goal>
+            </goals>
+            <configuration>
+              <outputDirectory>${project.build.directory}/lib</outputDirectory>
+              <overWriteReleases>true</overWriteReleases>
+              <overWriteSnapshots>true</overWriteSnapshots>
+              <overWriteIfNewer>true</overWriteIfNewer>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>net.alchim31.maven</groupId>
+        <artifactId>scala-maven-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>scala-compile-first</id>
+            <phase>process-resources</phase>
+            <goals>
+              <goal>add-source</goal>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <executions>
+          <execution>
+            <phase>compile</phase>
+            <goals>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals>
+              <goal>test-jar</goal>
+            </goals>
+            <phase>test-compile</phase>
+          </execution>
+        </executions>
+        <configuration>
+          <skip>false</skip>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.scala-lang</groupId>
+      <artifactId>scala-library</artifactId>
+      <version>${scala.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-common</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-cli</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-client</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-utilities_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-spark_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-hadoop-mr</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-hive</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-timeline-service</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <!-- Spark -->
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-core_${scala.binary.version}</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-sql_${scala.binary.version}</artifactId>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.spark</groupId>
+      <artifactId>spark-avro_${scala.binary.version}</artifactId>
+      <scope>provided</scope>
 
 Review comment:
   fixed


[GitHub] [incubator-hudi] dengziming commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [WIP][HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-612881257
 
 
   Hello @vinothchandar, sorry for the late reply. I tried to address some of your comments and ran into two issues:
   1. I tried making data prep part of the deltastreamer examples themselves and providing the same defaults for input/output paths, but in the end my code was the same as `org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer#main`. It seems we don't need to add deltastreamer examples, because the HoodieDeltaStreamer class is already a complete deltastreamer example. Maybe we just need to add a few simple examples rather than one complete, unified example.
   2. I wrote a `run_hoodie_examples.sh` script to run the examples and reuse spark-bundle/utilities-bundle instead of building a fat jar. However, the build of hudi-utilities-bundle relocates `com.beust.jcommander.` to `org.apache.hudi.com.beust.jcommander.`, and my example depends on `com.beust.jcommander.`, so spark-shell failed. Should I also add a relocation to the pom.xml of hudi-examples?
   What do you think about these two problems?
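
   For reference, a relocation of the kind asked about in point 2 might look roughly like the sketch below. It assumes hudi-examples would use maven-shade-plugin; the plugin placement and the `package` phase binding are illustrative assumptions, not taken from this PR. Only the two package patterns come from the comment above.

   ```xml
   <!-- Sketch only: rewrite references to com.beust.jcommander so they
        match the relocated classes shipped in hudi-utilities-bundle. -->
   <plugin>
     <groupId>org.apache.maven.plugins</groupId>
     <artifactId>maven-shade-plugin</artifactId>
     <executions>
       <execution>
         <!-- Assumed binding; adjust to the module's actual build. -->
         <phase>package</phase>
         <goals>
           <goal>shade</goal>
         </goals>
         <configuration>
           <relocations>
             <relocation>
               <pattern>com.beust.jcommander.</pattern>
               <shadedPattern>org.apache.hudi.com.beust.jcommander.</shadedPattern>
             </relocation>
           </relocations>
         </configuration>
       </execution>
     </executions>
   </plugin>
   ```

   With a relocation like this, the example classes would refer to the shaded package name and so should resolve against the bundle at runtime. Whether the module should also restrict which artifacts get shaded is left open here.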


[GitHub] [incubator-hudi] dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-576130421
 
 
   Hi @vinothchandar,
   I ran `mvn test -B` successfully locally, but the Travis CI build failed with:
   ```
   [ERROR] Failed to execute goal on project hudi-examples: Could not resolve dependencies for project org.apache.hudi:hudi-examples:jar:0.5.1-SNAPSHOT: The following artifacts could not be resolved: org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT, org.apache.hudi:hudi-spark:jar:0.5.1-SNAPSHOT: Failure to find org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT in https://oss.sonatype.org/content/repositories/snapshots/ was cached in the local repository, resolution will not be reattempted until the update interval of sonatype-snapshots has elapsed or updates are forced -> [Help 1]
   ```
   I searched for this error and found that it can be solved by deleting the file cached in the local repository, but I don't have the privilege to do that. Could you help me solve this problem?


[GitHub] [incubator-hudi] dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming edited a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569914887
 
 
   Hi @vinothchandar @yanghua,
   I just updated the PR and added 2 examples: the HoodieWriteClient API and the datasource API. For the other examples, such as HoodieDeltaStreamer and the structured streaming use case, I think it's better to add separate subtasks so that they are easier to code review, since they are relatively complex.


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r379894990
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   This is just a copy from hudi-spark. Do you have a better solution?


[GitHub] [hudi] codecov-commenter edited a comment on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-634017621


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/2a56f82908a8b8f788a7547d3c707c144696c1df&el=desc) will **decrease** coverage by `53.42%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1151/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #1151       +/-   ##
   =============================================
   - Coverage     71.65%   18.22%   -53.43%     
   - Complexity      294      857      +563     
   =============================================
     Files           378      348       -30     
     Lines         16541    15333     -1208     
     Branches       1670     1523      -147     
   =============================================
   - Hits          11852     2795     -9057     
   - Misses         3957    12181     +8224     
   + Partials        732      357      -375     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `0.00% <0.00%> (-84.62%)` | `0.00 <0.00> (ø)` | |
   | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/common/model/ActionType.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0FjdGlvblR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...java/org/apache/hudi/io/HoodieRangeInfoHandle.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vSG9vZGllUmFuZ2VJbmZvSGFuZGxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/hadoop/InputPathHandler.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0lucHV0UGF0aEhhbmRsZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...a/org/apache/hudi/exception/HoodieIOException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUlPRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...org/apache/hudi/table/action/commit/SmallFile.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9TbWFsbEZpbGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...rg/apache/hudi/index/bloom/KeyRangeLookupTree.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvYmxvb20vS2V5UmFuZ2VMb29rdXBUcmVlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...g/apache/hudi/exception/HoodieInsertException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUluc2VydEV4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [330 more](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=footer). Last update [2a56f82...4b33b38](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [incubator-hudi] pratyakshsharma commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r368608157
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/common/HoodieExampleDataGenerator.java
 ##########
 @@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.common;
+
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.util.HoodieAvroUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.TypedProperties;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import java.util.UUID;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+import java.util.stream.Stream;
+
+
+/**
+ * Class to be used to generate test data.
+ */
+public class HoodieExampleDataGenerator<T extends HoodieRecordPayload<T>> {
 
 Review comment:
   Can you see if we can do away with duplicate code here by extending HoodieTestDataGenerator?


[GitHub] [incubator-hudi] dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-576130421
 
 
   @vinothchandar why it 
   I ran `mvn test -B` successfully locally, but the Travis CI build failed with:
   ```
   [ERROR] Failed to execute goal on project hudi-examples: Could not resolve dependencies for project org.apache.hudi:hudi-examples:jar:0.5.1-SNAPSHOT: The following artifacts could not be resolved: org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT, org.apache.hudi:hudi-spark:jar:0.5.1-SNAPSHOT: Failure to find org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT in https://oss.sonatype.org/content/repositories/snapshots/ was cached in the local repository, resolution will not be reattempted until the update interval of sonatype-snapshots has elapsed or updates are forced -> [Help 1]
   ```
   I searched for this error and found that it can be solved by deleting the file cached in the local repository, but I don't have the privilege to do that. Could you help me solve this problem?


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r390802033
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   I think it's better to not have a fat jar here.


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-596350356
 
 
   @dengziming please let me know when you have addressed some of the comments..  


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-581780977
 
 
   @dengziming are you still working on this?


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-598032938
 
 
   @dengziming please ping me when you are ready for another review pass 


[GitHub] [hudi] lamber-ken commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
lamber-ken commented on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-634637408


   @vinothchandar I tested these examples locally and in yarn-cluster mode, and they worked fine. I will merge this.





[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r368794985
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/spark/HoodieWriteClientExample.java
 ##########
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.spark;
+
+import org.apache.hudi.HoodieWriteClient;
+import org.apache.hudi.WriteStatus;
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieTableType;
+import org.apache.hudi.common.table.HoodieTableMetaClient;
+import org.apache.hudi.common.util.FSUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.config.HoodieCompactionConfig;
+import org.apache.hudi.config.HoodieIndexConfig;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.index.HoodieIndex;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.stream.Collectors;
+
+
+/**
+ * Simple examples of #{@link HoodieWriteClient}.
+ *
+ * To run this example, you should
+ *   1. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   2. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieWriteClientExample <tablePath> <tableName>
+ * <tablePath> and <tableName> describe root path of hudi and table name
+ * for example, `HoodieWriteClientExample file:///tmp/hoodie/sample-table hoodie_rt`
+ */
+public class HoodieWriteClientExample {
+
+  private static final Logger LOG = LogManager.getLogger(HoodieWriteClientExample.class);
+
+  private static String tableType = HoodieTableType.COPY_ON_WRITE.name();
+
+  public static void main(String[] args) throws Exception {
+    if (args.length < 2) {
+      System.err.println("Usage: HoodieWriteClientExample <tablePath> <tableName>");
+      System.exit(1);
+    }
+    String tablePath = args[0];
+    String tableName = args[1];
+    SparkConf sparkConf = new SparkConf().setAppName("hoodie-client-example");
+    sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
+    sparkConf.set("spark.kryoserializer.buffer.max", "512m");
+    sparkConf.set("spark.some.config.option", "some-value");
 
 Review comment:
   You are right, thank you. Fixed.


[GitHub] [incubator-hudi] vinothchandar commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-573471639
 
 
   Could we also provide delta streamer examples?


[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r379895098
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerKafkaSourceExample.java
 ##########
 @@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonKafkaSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonKafkaSource}.
+ *
+ * To run this example, you should
+ *    1. Start Zookeeper and the Kafka demo server
+ *    2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *    3. For running in shell, using `spark-submit`
+ *    4. produce some data to hoodie-source-topic configured by `hoodie.deltastreamer.source.kafka.topic`
+ *
+ * Usage: HoodieDeltaStreamerKafkaSourceExample \
+ *        --target-base-path /tmp/hoodie/kafkadeltatable \
+ *        --table-type MERGE_ON_READ \
+ *        --target-table kafkadeltatable
+ */
+public class HoodieDeltaStreamerKafkaSourceExample {
+
+  public static void main(String[] args) throws Exception {
+
+    final HoodieDeltaStreamer.Config cfg = defaultKafkaDeltaStreamerConfig();
+    new JCommander(cfg).parse(args);
+
+    SparkConf sparkConf = HoodieExampleSparkUtils.defaultSparkConf("hoodie-delta-streamer-kafka-source-example");
+    JavaSparkContext jsc = new JavaSparkContext(sparkConf);
+
+    try {
+      new HoodieDeltaStreamer(cfg, jsc).sync();
+    } finally {
+      jsc.stop();
+    }
+  }
+
+  /**
+   * also see #{@link HoodieDeltaStreamer.Config} for more params.
+   * @return default params for Kafka DeltaStreamer
+   */
+  private static HoodieDeltaStreamer.Config defaultKafkaDeltaStreamerConfig() {
+
+    HoodieDeltaStreamer.Config cfg = new HoodieDeltaStreamer.Config();
+
+    cfg.configs.add(String.format("%s=uuid", DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY()));
 
 Review comment:
   ok, good.


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r383477929
 
 

 ##########
 File path: hudi-examples/src/main/java/org/apache/hudi/examples/deltastreamer/HoodieDeltaStreamerDfsSourceExample.java
 ##########
 @@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.deltastreamer;
+
+import org.apache.hudi.DataSourceWriteOptions;
+import org.apache.hudi.examples.common.HoodieExampleDataGenerator;
+import org.apache.hudi.examples.common.HoodieExampleSparkUtils;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer;
+import org.apache.hudi.utilities.sources.JsonDFSSource;
+import org.apache.hudi.utilities.transform.IdentityTransformer;
+
+import com.beust.jcommander.JCommander;
+import org.apache.spark.SparkConf;
+import org.apache.spark.api.java.JavaSparkContext;
+
+
+/**
+ * Simple examples of #{@link HoodieDeltaStreamer} from #{@link JsonDFSSource}.
+ *
+ * To run this example, you should
+ *   1. prepare sample data as `hudi-examples/src/main/resources/dfs-delta-streamer`
+ *   2. For running in IDE, set VM options `-Dspark.master=local[2]`
+ *   3. For running in shell, using `spark-submit`
+ *
+ * Usage: HoodieDeltaStreamerDfsSourceExample \
 
 Review comment:
   A shell script that can run any Example class is a better idea. I was just referring to having a command that users can copy and paste into a terminal and run.
   
   >> data prep part of the examples themselves and then also provide sane defaults for input/output paths
   
   Thoughts on this?
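
A minimal sketch of such a copy-paste runner, not the script from this PR. `run_example` and the `SPARK_SUBMIT` and `EXAMPLES_JAR` variables are hypothetical names. The jar glob follows the pattern used by the PR's own script, and the default table path and name are taken from the HoodieWriteClientExample Javadoc quoted in this thread, so the fallback only suits examples with that `<tablePath> <tableName>` usage.

```shell
#!/usr/bin/env bash

# run_example <fully.qualified.ExampleClass> [example args...]
# Hypothetical helper: submits one example class from the hudi-examples jar.
run_example() {
  if [ "$#" -lt 1 ]; then
    echo "Usage: run_example <example-class> [args...]" >&2
    return 1
  fi
  local example_class="$1"
  shift

  # Assumed defaults (from the HoodieWriteClientExample Javadoc) when no args are given.
  if [ "$#" -eq 0 ]; then
    set -- "file:///tmp/hoodie/sample-table" "hoodie_rt"
  fi

  # EXAMPLES_JAR is an assumed override; otherwise glob the module's target dir.
  local jar="${EXAMPLES_JAR:-$(echo hudi-examples/target/hudi-examples-*-SNAPSHOT.jar)}"

  # SPARK_SUBMIT is an assumed override, handy for dry runs (SPARK_SUBMIT=echo).
  "${SPARK_SUBMIT:-${SPARK_HOME}/bin/spark-submit}" \
    --class "${example_class}" \
    "${jar}" \
    "$@"
}
```

For instance, `run_example org.apache.hudi.examples.spark.HoodieWriteClientExample` would submit the write-client example against the default path, and prefixing the call with `SPARK_SUBMIT=echo` only prints the arguments instead of launching Spark.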
   


[GitHub] [incubator-hudi] yanghua commented on issue #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
yanghua commented on issue #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-569479377
 
 
   Hi @dengziming, thanks for your contribution! Please follow the [new contribution guidance](http://hudi.apache.org/contributing.html#life-of-a-contributor) on how to title the PR.
   
   First, please squash your commits into a single commit.
   
   Regarding adding a new module, we should hear @vinothchandar's opinion first. I agree that it's better to have an example module.


[GitHub] [hudi] vinothchandar commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-634973383


   Thank you @dengziming for this really great contribution! We will keep improving this.





[GitHub] [incubator-hudi] dengziming removed a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming removed a comment on issue #1151: [WIP] [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-576130421
 
 
   @vinothchandar Hi Vinoth, I have added the DeltaStreamExample.
   `mvn test -B` runs successfully locally, but the Travis CI build failed with:
   ```
   [ERROR] Failed to execute goal on project hudi-examples: Could not resolve dependencies for project org.apache.hudi:hudi-examples:jar:0.5.1-SNAPSHOT: The following artifacts could not be resolved: org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT, org.apache.hudi:hudi-spark:jar:0.5.1-SNAPSHOT: Failure to find org.apache.hudi:hudi-utilities:jar:0.5.1-SNAPSHOT in https://oss.sonatype.org/content/repositories/snapshots/ was cached in the local repository, resolution will not be reattempted until the update interval of sonatype-snapshots has elapsed or updates are forced -> [Help 1]
   ```
   I searched for this error and found that it can be solved by deleting the file cached in the local repository, but I don't have the privileges to do that. Could you help me solve this problem?
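
One way to act on that error on a local machine, as a rough sketch only. Maven records a failed lookup in `*.lastUpdated` marker files inside the local repository, and the `-U` flag forces a re-check of snapshots. `list_cached_failures` is a hypothetical helper, and the default path `~/.m2/repository` is an assumption about the machine; on a CI service the cache lives elsewhere and may need to be cleared through the service itself.

```shell
#!/usr/bin/env bash

# List Maven's cached-failure markers for Hudi artifacts.
# $1: local repository root (assumed default: ~/.m2/repository).
list_cached_failures() {
  local repo="${1:-${HOME}/.m2/repository}"
  find "${repo}/org/apache/hudi" -name '*.lastUpdated' 2>/dev/null | sort
}

# After reviewing the list, either delete those files or force a re-check:
#   mvn -U -B test
```

Running the build from the project root, so that sibling modules such as hudi-utilities are built in the same reactor, may also avoid the remote lookup altogether; whether that was the cause here is not stated in the thread.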


[GitHub] [hudi] codecov-commenter commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1151:
URL: https://github.com/apache/hudi/pull/1151#issuecomment-634017621


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/2a56f82908a8b8f788a7547d3c707c144696c1df&el=desc) will **decrease** coverage by `53.42%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1151/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #1151       +/-   ##
   =============================================
   - Coverage     71.65%   18.22%   -53.43%     
   - Complexity      294      857      +563     
   =============================================
     Files           378      348       -30     
     Lines         16541    15333     -1208     
     Branches       1670     1523      -147     
   =============================================
   - Hits          11852     2795     -9057     
   - Misses         3957    12181     +8224     
   + Partials        732      357      -375     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `0.00% <0.00%> (-84.62%)` | `0.00 <0.00> (ø)` | |
   | [.../java/org/apache/hudi/client/HoodieReadClient.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVJlYWRDbGllbnQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/metrics/MetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/common/model/ActionType.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0FjdGlvblR5cGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...java/org/apache/hudi/io/HoodieRangeInfoHandle.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW8vSG9vZGllUmFuZ2VJbmZvSGFuZGxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [.../java/org/apache/hudi/hadoop/InputPathHandler.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1oYWRvb3AtbXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaGFkb29wL0lucHV0UGF0aEhhbmRsZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...a/org/apache/hudi/exception/HoodieIOException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUlPRXhjZXB0aW9uLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...org/apache/hudi/table/action/commit/SmallFile.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NvbW1pdC9TbWFsbEZpbGUuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...rg/apache/hudi/index/bloom/KeyRangeLookupTree.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvaW5kZXgvYmxvb20vS2V5UmFuZ2VMb29rdXBUcmVlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...g/apache/hudi/exception/HoodieInsertException.java](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhjZXB0aW9uL0hvb2RpZUluc2VydEV4Y2VwdGlvbi5qYXZh) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [330 more](https://codecov.io/gh/apache/hudi/pull/1151/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=footer). Last update [2a56f82...4b33b38](https://codecov.io/gh/apache/hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r391427811
 
 

 ##########
 File path: hudi-examples/pom.xml
 ##########
 @@ -0,0 +1,206 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.5.2-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
 
 Review comment:
   Cool, let's reuse the current bundles then!


[GitHub] [incubator-hudi] vinothchandar commented on a change in pull request #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r416731242



##########
File path: hudi-examples/src/main/scripts/delta-streamer-cluster
##########
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+JAR_FILE="$(echo packaging/hudi-utilities-bundle/target/hudi-utilities-bundle*-SNAPSHOT.jar | tr ' ' ',')"
+EXAMPLES_JARS="$(echo hudi-examples/target/hudi-examples-*-SNAPSHOT.jar)"
+
+exec "${SPARK_HOME}"/bin/spark-submit \
+--master yarn \

Review comment:
       Should master/deploy-mode be configurable via env vars, with yarn/cluster as the defaults?
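
A minimal sketch of that suggestion, assuming the variable names `SPARK_MASTER` and `SPARK_DEPLOY_MODE` (they are not defined by the PR) and a hypothetical `submit_with_defaults` wrapper. `SPARK_SUBMIT` is likewise an assumed override that makes a dry run possible. Shell parameter expansion with `:-` supplies yarn/cluster when a variable is unset or empty.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run spark-submit with master and deploy mode read from
# the environment, defaulting to yarn / cluster.
submit_with_defaults() {
  "${SPARK_SUBMIT:-${SPARK_HOME}/bin/spark-submit}" \
    --master "${SPARK_MASTER:-yarn}" \
    --deploy-mode "${SPARK_DEPLOY_MODE:-cluster}" \
    "$@"
}
```

The remaining options of the script would then be passed as arguments to the wrapper, and a local run could look like `SPARK_MASTER='local[2]' SPARK_DEPLOY_MODE=client submit_with_defaults ...` without editing the script.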

##########
File path: hudi-examples/pom.xml
##########
@@ -0,0 +1,198 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.6.0-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>copy-dependencies</id>
+            <phase>prepare-package</phase>
+            <goals>
+              <goal>copy-dependencies</goal>
+            </goals>
+            <configuration>
+              <outputDirectory>${project.build.directory}/lib</outputDirectory>
+              <overWriteReleases>true</overWriteReleases>
+              <overWriteSnapshots>true</overWriteSnapshots>
+              <overWriteIfNewer>true</overWriteIfNewer>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>net.alchim31.maven</groupId>
+        <artifactId>scala-maven-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>scala-compile-first</id>
+            <phase>process-resources</phase>
+            <goals>
+              <goal>add-source</goal>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <executions>
+          <execution>
+            <phase>compile</phase>
+            <goals>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals>
+              <goal>test-jar</goal>
+            </goals>
+            <phase>test-compile</phase>
+          </execution>
+        </executions>
+        <configuration>
+          <skip>false</skip>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.scala-lang</groupId>
+      <artifactId>scala-library</artifactId>
+      <version>${scala.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-common</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-cli</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-client</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-utilities_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-spark_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-hadoop-mr</artifactId>
+      <version>${project.version}</version>

Review comment:
       These versions may not be necessary here; can they just be inherited from the parent?

##########
File path: hudi-examples/src/main/java/org/apache/hudi/examples/common/HoodieExampleDataGenerator.java
##########
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.examples.common;
+
+import org.apache.hudi.common.model.HoodieAvroPayload;
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.util.HoodieAvroUtils;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.common.util.TypedProperties;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.io.IOException;
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Random;
+import java.util.UUID;
+import java.util.stream.Collectors;
+import java.util.stream.IntStream;
+import java.util.stream.Stream;
+
+
+/**
+ * Class to be used to generate test data.
+ */
+public class HoodieExampleDataGenerator<T extends HoodieRecordPayload<T>> {

Review comment:
       @dengziming can you make a pass over this class and remove the fields/methods that are not actually used?







[GitHub] [incubator-hudi] dengziming commented on a change in pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on a change in pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#discussion_r425577242



##########
File path: hudi-examples/pom.xml
##########
@@ -0,0 +1,198 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+
+  <parent>
+    <artifactId>hudi</artifactId>
+    <groupId>org.apache.hudi</groupId>
+    <version>0.6.0-SNAPSHOT</version>
+  </parent>
+  <modelVersion>4.0.0</modelVersion>
+
+  <artifactId>hudi-examples</artifactId>
+  <packaging>jar</packaging>
+
+  <properties>
+    <main.basedir>${project.parent.basedir}</main.basedir>
+  </properties>
+
+  <build>
+    <resources>
+      <resource>
+        <directory>src/main/resources</directory>
+      </resource>
+    </resources>
+
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>copy-dependencies</id>
+            <phase>prepare-package</phase>
+            <goals>
+              <goal>copy-dependencies</goal>
+            </goals>
+            <configuration>
+              <outputDirectory>${project.build.directory}/lib</outputDirectory>
+              <overWriteReleases>true</overWriteReleases>
+              <overWriteSnapshots>true</overWriteSnapshots>
+              <overWriteIfNewer>true</overWriteIfNewer>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>net.alchim31.maven</groupId>
+        <artifactId>scala-maven-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>scala-compile-first</id>
+            <phase>process-resources</phase>
+            <goals>
+              <goal>add-source</goal>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <executions>
+          <execution>
+            <phase>compile</phase>
+            <goals>
+              <goal>compile</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-jar-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals>
+              <goal>test-jar</goal>
+            </goals>
+            <phase>test-compile</phase>
+          </execution>
+        </executions>
+        <configuration>
+          <skip>false</skip>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.rat</groupId>
+        <artifactId>apache-rat-plugin</artifactId>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.scala-lang</groupId>
+      <artifactId>scala-library</artifactId>
+      <version>${scala.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-common</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-cli</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-client</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-utilities_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-spark_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.hudi</groupId>
+      <artifactId>hudi-hadoop-mr</artifactId>
+      <version>${project.version}</version>

Review comment:
       The build process will fail if the versions are removed, and other modules also have `project.version` in their dependencies.
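
       For context, a minimal sketch of when a child module can omit `<version>`: Maven only allows this if the parent (or an imported BOM) manages the artifact under `<dependencyManagement>`. The fragment below is hypothetical and is not taken from this PR; it assumes the parent pom would declare `hudi-common` there, which the build failure described above suggests it does not.

       ```xml
       <!-- Hypothetical parent pom.xml fragment (an assumption, not part of this PR) -->
       <dependencyManagement>
         <dependencies>
           <dependency>
             <groupId>org.apache.hudi</groupId>
             <artifactId>hudi-common</artifactId>
             <version>${project.version}</version>
           </dependency>
         </dependencies>
       </dependencyManagement>

       <!-- Child module (e.g. hudi-examples/pom.xml): the version is then inherited -->
       <dependency>
         <groupId>org.apache.hudi</groupId>
         <artifactId>hudi-common</artifactId>
       </dependency>
       ```

       Without such a managed entry, Maven reports a missing version for the dependency, so keeping `${project.version}` in the child module is the consistent choice.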







[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-593277561


   # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-hudi/commit/2a56f82908a8b8f788a7547d3c707c144696c1df&el=desc) will **increase** coverage by `0.02%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-hudi/pull/1151/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1151      +/-   ##
   ============================================
   + Coverage     71.65%   71.67%   +0.02%     
     Complexity      294      294              
   ============================================
     Files           378      378              
     Lines         16541    16552      +11     
     Branches       1670     1670              
   ============================================
   + Hits          11852    11864      +12     
     Misses         3957     3957              
   + Partials        732      731       -1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `78.57% <0.00%> (-6.05%)` | `0.00 <0.00> (ø)` | |
   | [.../org/apache/hudi/table/HoodieCommitArchiveLog.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvSG9vZGllQ29tbWl0QXJjaGl2ZUxvZy5qYXZh) | `76.43% <0.00%> (-1.06%)` | `0.00% <0.00%> (ø%)` | |
   | [...ava/org/apache/hudi/config/HoodieMemoryConfig.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZU1lbW9yeUNvbmZpZy5qYXZh) | `70.00% <0.00%> (+10.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...org/apache/hudi/client/utils/SparkConfigUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L3V0aWxzL1NwYXJrQ29uZmlnVXRpbHMuamF2YQ==) | `96.00% <0.00%> (+10.28%)` | `0.00% <0.00%> (ø%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=footer). Last update [2a56f82...827a0f3](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [incubator-hudi] dengziming commented on pull request #1151: [WIP][HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-619303737


   @vinothchandar hello, PTAL.
   I deleted the 3 DeltaStreamerExamples and replaced them with 3 xxx-delta-streamer-example.sh scripts, and provided 2 shell scripts, `delta-streamer-cluster` and `delta-streamer-local`, to run them locally or in a cluster.





[GitHub] [incubator-hudi] vinothchandar commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-628775947


   @lamber-ken are you able to take this across the finish line? @dengziming has something that is very close to a first version. We can try to land that and then improve on it.





[GitHub] [incubator-hudi] codecov-io edited a comment on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-593277561


   # [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=h1) Report
   > Merging [#1151](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-hudi/commit/2a56f82908a8b8f788a7547d3c707c144696c1df&el=desc) will **increase** coverage by `0.08%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-hudi/pull/1151/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1151      +/-   ##
   ============================================
   + Coverage     71.65%   71.73%   +0.08%     
   - Complexity      294     1089     +795     
   ============================================
     Files           378      385       +7     
     Lines         16541    16604      +63     
     Branches       1670     1669       -1     
   ============================================
   + Hits          11852    11911      +59     
   - Misses         3957     3963       +6     
   + Partials        732      730       -2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...rg/apache/hudi/common/model/HoodieAvroPayload.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL21vZGVsL0hvb2RpZUF2cm9QYXlsb2FkLmphdmE=) | `78.57% <0.00%> (-6.05%)` | `0.00 <0.00> (ø)` | |
   | [...g/apache/hudi/metrics/InMemoryMetricsReporter.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9Jbk1lbW9yeU1ldHJpY3NSZXBvcnRlci5qYXZh) | `40.00% <0.00%> (-40.00%)` | `0.00% <0.00%> (ø%)` | |
   | [...src/main/java/org/apache/hudi/metrics/Metrics.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzLmphdmE=) | `56.75% <0.00%> (-10.82%)` | `0.00% <0.00%> (ø%)` | |
   | [...ache/hudi/common/fs/inline/InMemoryFileSystem.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL2ZzL2lubGluZS9Jbk1lbW9yeUZpbGVTeXN0ZW0uamF2YQ==) | `79.31% <0.00%> (-10.35%)` | `0.00% <0.00%> (ø%)` | |
   | [.../apache/hudi/common/table/TableSchemaResolver.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3RhYmxlL1RhYmxlU2NoZW1hUmVzb2x2ZXIuamF2YQ==) | `56.71% <0.00%> (-7.15%)` | `0.00% <0.00%> (ø%)` | |
   | [...le/action/rollback/BaseRollbackActionExecutor.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL3JvbGxiYWNrL0Jhc2VSb2xsYmFja0FjdGlvbkV4ZWN1dG9yLmphdmE=) | `70.83% <0.00%> (-6.95%)` | `0.00% <0.00%> (ø%)` | |
   | [.../main/java/org/apache/hudi/common/util/Option.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvT3B0aW9uLmphdmE=) | `75.67% <0.00%> (-5.41%)` | `17.00% <0.00%> (+17.00%)` | :arrow_down: |
   | [...ain/java/org/apache/hudi/avro/HoodieAvroUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXZyby9Ib29kaWVBdnJvVXRpbHMuamF2YQ==) | `80.95% <0.00%> (-3.87%)` | `22.00% <0.00%> (+22.00%)` | :arrow_down: |
   | [...g/apache/hudi/table/action/clean/CleanPlanner.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NsZWFuL0NsZWFuUGxhbm5lci5qYXZh) | `86.86% <0.00%> (-2.89%)` | `5.00% <0.00%> (+5.00%)` | :arrow_down: |
   | [...src/main/java/org/apache/hudi/DataSourceUtils.java](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree#diff-aHVkaS1zcGFyay9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaHVkaS9EYXRhU291cmNlVXRpbHMuamF2YQ==) | `55.55% <0.00%> (-1.15%)` | `0.00% <0.00%> (ø%)` | |
   | ... and [69 more](https://codecov.io/gh/apache/incubator-hudi/pull/1151/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=footer). Last update [2a56f82...56f5a7a](https://codecov.io/gh/apache/incubator-hudi/pull/1151?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [incubator-hudi] lamber-ken commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
lamber-ken commented on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-629076102


   > > Can you confirm if you have run these examples locally once and verified the instructions work?
   > 
   > @vinothchandar , I ran these examples locally and ensured they do work, but haven't tried them in a yarn-cluster mode.
   
   Will check, thanks 👍 





[GitHub] [incubator-hudi] dengziming commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-629044776


   > Can you confirm if you have run these examples locally once and verified the instructions work?
   
   @vinothchandar , I ran these examples locally and ensured they do work, but haven't tried them in a yarn-cluster mode.





[GitHub] [incubator-hudi] dengziming commented on pull request #1151: [HUDI-476] Add hudi-examples module

Posted by GitBox <gi...@apache.org>.
dengziming commented on pull request #1151:
URL: https://github.com/apache/incubator-hudi/pull/1151#issuecomment-628968945


   @vinothchandar sorry, I have been a little busy these days. I will address your comments in a few days.

