Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/05/25 13:36:07 UTC

[GitHub] [hudi] wangxianghu opened a new pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

wangxianghu opened a new pull request #1665:
URL: https://github.com/apache/hudi/pull/1665


   ## *Tips*
   - *Thank you very much for contributing to Apache Hudi.*
   - *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
   
   ## What is the purpose of the pull request
   
   *This pull request introduces HoodieWriteInput as the unified input format for the hudi write client.*
   
   ## Brief change log
   
     - add a new module hudi-writer-common
     - introduce HoodieWriteInput for hudi write client as the unified input format.
   
   ## Verify this pull request
   
   This pull request is a trivial rework / code cleanup without any test coverage.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633592885


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=h1) Report
   > Merging [#1665](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/802d16c8c9793156ef7fef0c59088040800fe025&el=desc) will **decrease** coverage by `0.10%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1665/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1665      +/-   ##
   ============================================
   - Coverage     18.33%   18.22%   -0.11%     
   - Complexity      855      857       +2     
   ============================================
     Files           344      348       +4     
     Lines         15167    15332     +165     
     Branches       1512     1523      +11     
   ============================================
   + Hits           2781     2795      +14     
   - Misses        12033    12180     +147     
   - Partials        353      357       +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZVdyaXRlQ29uZmlnLmphdmE=) | `40.07% <0.00%> (-2.19%)` | `48.00% <0.00%> (ø%)` | |
   | [...rg/apache/hudi/metrics/MetricsReporterFactory.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXJGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
   | [...apache/hudi/config/HoodieMetricsDatadogConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZU1ldHJpY3NEYXRhZG9nQ29uZmlnLmphdmE=) | `36.36% <0.00%> (ø)` | `2.00% <0.00%> (?%)` | |
   | [...apache/hudi/metrics/datadog/DatadogHttpClient.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dIdHRwQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
   | [...e/hudi/metrics/datadog/DatadogMetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dNZXRyaWNzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
   | [...g/apache/hudi/metrics/datadog/DatadogReporter.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dSZXBvcnRlci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
   | [...org/apache/hudi/config/HoodieCompactionConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZUNvbXBhY3Rpb25Db25maWcuamF2YQ==) | `56.00% <0.00%> (+1.09%)` | `3.00% <0.00%> (ø%)` | |
   | [...va/org/apache/hudi/config/HoodieMetricsConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZU1ldHJpY3NDb25maWcuamF2YQ==) | `38.46% <0.00%> (+2.35%)` | `3.00% <0.00%> (ø%)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=footer). Last update [802d16c...9d9e803](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [hudi] wangxianghu commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-636584997


   > 
   > 
   > @wangxianghu @leesf sorry for the delay. I was trying to understand the relation to this work
   > https://issues.apache.org/jira/browse/HUDI-538 cc @yanghua
   > It is still not clear to me where we are heading. While I agree with abstracting the RDD into a WriteInput, I would like to understand how the existing code is going to evolve further. May I ask that we first do a draft that replaces the entire code in hudi-client with these abstractions (the code need not even compile; I just want to understand the final shape we are looking at)? I am also happy to do that myself and discuss. Please let me know if that seems like a reasonable ask.
   
   Hi @vinothchandar, thanks for your suggestion. Replacing the entire code in hudi-client with the new abstractions is a good choice, I'll do it :)





[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641279534


   Hi @vinothchandar, this is the abstraction of the hudi-client module (in module hudi-client2):
   https://github.com/wangxianghu/hudi/tree/HUDI-xxx
   Please take a look when free.
   As the change is huge, if you agree with this abstraction, may I introduce a new module named "hudi-client2" (or something else) to hold the new abstraction? When hudi-client2 is ready, we can phase out or delete the hudi-client module.
   Alternatively, we could introduce a new feature branch to refactor hudi-client and merge it when ready.
   The new module structure could be like this:
   ```
   └── hudi-client
       ├── hudi-client-common
       ├── hudi-client-spark
       ├── hudi-client-flink
       └── hudi-client-java
   ```
   cc @yanghua @leesf 





[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633631006


   Is there an umbrella task that shows how all the follow-up work on this will go? For example, I am wondering what the eventual methods on `HoodieWriteInput` will be and how it will abstract away the RDD construct.





[GitHub] [hudi] codecov-commenter edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633592885


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=h1) Report
   > Merging [#1665](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/bde7a7043e100242fec8fc0111e489a269a1d997&el=desc) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1665/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##             master    #1665   +/-   ##
   =========================================
     Coverage     18.21%   18.21%           
     Complexity      856      856           
   =========================================
     Files           348      348           
     Lines         15332    15332           
     Branches       1523     1523           
   =========================================
     Hits           2792     2792           
     Misses        12183    12183           
     Partials        357      357           
   ```
   
   
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=footer). Last update [bde7a70...9f0121f](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [hudi] leesf commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
leesf commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-642060233


   > Hi @vinothchandar, this is the abstraction of the hudi-client module (not finished yet):
   > https://github.com/wangxianghu/hudi/tree/HUDI-xxx
   > Please take a look when free.
   > cc @yanghua @leesf
   
   Ack, I looked at the branch and the structure looks good to me. @vinothchandar would you please also take a look when you are free? We need to land this feature ASAP.





[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641279534


   Hi @vinothchandar, this is the abstraction of the hudi-client module (not finished yet):
   https://github.com/wangxianghu/hudi/tree/HUDI-xxx
   please take a look when free.
   cc @yanghua @leesf 





[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-637239524


   @wangxianghu Thanks for being a sport! That will give us good confidence that we can ultimately pull this off.. Happy to help along as needed..





[GitHub] [hudi] codecov-commenter edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633592885









[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633776310


   > 
   > 
   > Is there an umbrella task that shows how all the follow-up work on this will go? For example, I am wondering what the eventual methods on `HoodieWriteInput` will be and how it will abstract away the RDD construct.
   
   Yes, here it is: https://issues.apache.org/jira/browse/HUDI-909
   Currently only four subtasks are filed; together they form the foundation of the entire abstraction:
   1. Introduce HoodieWriteInput for hudi write client: https://issues.apache.org/jira/browse/HUDI-910 
   2. Introduce HoodieWriteOutput for hudi write client: https://issues.apache.org/jira/browse/HUDI-911 
   3. Introduce HoodieWriteKey for hudi write client: https://issues.apache.org/jira/browse/HUDI-912 
   4. Introduce HoodieEngineContext for hudi write client: https://issues.apache.org/jira/browse/HUDI-913 
   
   For Spark these could be:
   ```
   JavaRDD<HoodieRecord<T>> records = ... ; // read from source
   HoodieWriteInput<JavaRDD<HoodieRecord<T>>> inputRecords = new HoodieWriteInput<>(records);
   JavaRDD<HoodieRecord<T>> inputRdds = inputRecords.getInputs();
   ```
   
   ```
   JavaSparkContext jsc = ...;
   SerializableConfiguration hadoopConf = ...;
   // HoodieSparkEngineContext extends HoodieEngineContext
   HoodieEngineContext hec = new HoodieSparkEngineContext(hadoopConf, jsc);
   // Cast back to the Spark-specific context to recover the JavaSparkContext
   JavaSparkContext sparkContext = ((HoodieSparkEngineContext) hec).getContext();
   ```
   
   HoodieWriteKey and HoodieWriteOutput follow the same pattern as HoodieWriteInput.
   
   The upsert API could look like this:
   ```
   public HoodieWriteOutput<JavaRDD<WriteStatus>> upsert(HoodieWriteInput<JavaRDD<HoodieRecord<T>>> records, final String instantTime) {...}
   ```
   
   The content of the method is almost the same as before.
   
   For Java and Flink, just replace `JavaRDD` with `List`.
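
A minimal, self-contained sketch of the `List`-based variant described above. Everything in it is an assumption made for illustration: the wrapper classes are hypothetical stand-ins rather than the classes from this PR, `ListWriteClientSketch` is a made-up name, and plain `String` values stand in for `HoodieRecord` and `WriteStatus`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the input wrapper discussed in this thread; not the PR's code.
final class HoodieWriteInput<I> {
  private final I inputs;

  HoodieWriteInput(I inputs) {
    this.inputs = inputs;
  }

  // Hands the engine-specific collection (JavaRDD, List, ...) back to engine-specific code.
  I getInputs() {
    return inputs;
  }
}

// Same shape as the input wrapper, mirroring the remark that the output follows the same pattern.
final class HoodieWriteOutput<O> {
  private final O output;

  HoodieWriteOutput(O output) {
    this.output = output;
  }

  O getOutput() {
    return output;
  }
}

// Illustrative List-based client: a String stands in for both a record and a write status.
final class ListWriteClientSketch {
  static HoodieWriteOutput<List<String>> upsert(HoodieWriteInput<List<String>> records,
                                                final String instantTime) {
    List<String> statuses = new ArrayList<>();
    for (String record : records.getInputs()) {
      // A real client would write the record; here we only tag it with the instant time.
      statuses.add(instantTime + ":" + record);
    }
    return new HoodieWriteOutput<>(statuses);
  }

  public static void main(String[] args) {
    HoodieWriteInput<List<String>> input = new HoodieWriteInput<>(Arrays.asList("key1", "key2"));
    System.out.println(upsert(input, "001").getOutput());
  }
}
```

The `upsert` signature has the same shape as the Spark one; only the type argument changes from `JavaRDD<...>` to `List<...>`, which is the point of wrapping the collection in the first place.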





[GitHub] [hudi] wangxianghu closed pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu closed pull request #1665:
URL: https://github.com/apache/hudi/pull/1665


   





[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-642819377


   Sounds good. Will jump on #1727. Closing this one.





[GitHub] [hudi] vinothchandar closed pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar closed pull request #1665:
URL: https://github.com/apache/hudi/pull/1665


   





[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641279534









[GitHub] [hudi] wangxianghu commented on a change in pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#discussion_r430116222



##########
File path: hudi-writer-common/src/main/java/org/apache/hudi/format/HoodieWriteInput.java
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.format;

Review comment:
       reasonable, will do

##########
File path: hudi-writer-common/src/main/java/org/apache/hudi/format/HoodieWriteInput.java
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.format;

Review comment:
       @vinothchandar done







[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633776310









[GitHub] [hudi] wangxianghu commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633776310


   > 
   > 
   > Is there an umbrella task that shows how all the follow-up work on this will go? For example, I am wondering what the eventual methods on `HoodieWriteInput` will be and how it will abstract away the RDD construct.
   
   Yes, here it is: https://issues.apache.org/jira/browse/HUDI-909
   Currently only four subtasks are filed; together they form the foundation of the entire abstraction:
   1. Introduce HoodieWriteInput for hudi write client: https://issues.apache.org/jira/browse/HUDI-910 
   2. Introduce HoodieWriteOutput for hudi write client: https://issues.apache.org/jira/browse/HUDI-911 
   3. Introduce HoodieWriteKey for hudi write client: https://issues.apache.org/jira/browse/HUDI-912 
   4. Introduce HoodieEngineContext for hudi write client: https://issues.apache.org/jira/browse/HUDI-913 
   
   For Spark these could be:
   `JavaRDD<HoodieRecord<T>> records = ... ; // read from source`
   `HoodieWriteInput<JavaRDD<HoodieRecord<T>>> inputRecords = new HoodieWriteInput<>(records);`
   `JavaRDD<HoodieRecord<T>> inputRdds = inputRecords.getInputs();`
   
   `JavaSparkContext jsc = ...;`
   `HoodieEngineContext<JavaSparkContext> hec = new HoodieSparkEngineContext(jsc); // HoodieSparkEngineContext implements HoodieEngineContext<JavaSparkContext>`
   `JavaSparkContext sparkContext = hec.getContext();`
   
   HoodieWriteKey and HoodieWriteOutput follow the same pattern as HoodieWriteInput.
   
   The upsert API could look like this:
   
   `public HoodieWriteOutput<JavaRDD<WriteStatus>> upsert(HoodieWriteInput<JavaRDD<HoodieRecord<T>>> records, final String instantTime) {...}`
   
   The content of the method is almost the same as before.
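
A minimal sketch of the generic engine-context idea in the snippet above, assuming a hypothetical interface shape. The single `getContext()` method comes from the comment; `LocalEngineContext`, `EngineContextSketch`, and the use of a settings map as the "engine handle" are invented here so the example runs without Spark.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical generic context: C is whatever handle the engine needs
// (JavaSparkContext in the Spark case described above).
interface HoodieEngineContext<C> {
  C getContext();
}

// Illustrative local "engine" whose handle is just a map of settings; not part of Hudi.
final class LocalEngineContext implements HoodieEngineContext<Map<String, String>> {
  private final Map<String, String> settings;

  LocalEngineContext(Map<String, String> settings) {
    this.settings = settings;
  }

  @Override
  public Map<String, String> getContext() {
    return settings;
  }
}

final class EngineContextSketch {
  public static void main(String[] args) {
    Map<String, String> settings = new HashMap<>();
    settings.put("parallelism", "2");

    HoodieEngineContext<Map<String, String>> hec = new LocalEngineContext(settings);
    // The context type is part of the declared type, so no cast is needed here.
    Map<String, String> ctx = hec.getContext();
    System.out.println(ctx.get("parallelism"));
  }
}
```

Carrying the handle type as a type parameter lets callers retrieve the engine handle without a downcast, at the cost of threading the parameter through every signature that mentions the context.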





[GitHub] [hudi] wangxianghu commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641275707









[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-636382401


   This PR itself is fine. But given that we are adding a new module and this is a critical thing to get right, I am trying to understand more upfront.





[GitHub] [hudi] leesf commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
leesf commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-635387795


   @vinothchandar just a reminder on this PR.





[GitHub] [hudi] vinothchandar commented on a change in pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#discussion_r429998200



##########
File path: hudi-writer-common/src/main/java/org/apache/hudi/format/HoodieWriteInput.java
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.format;

Review comment:
       `format` is a bit misleading. How about just `org.apache.hudi.writer.common`?







[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-636382237


   @wangxianghu @leesf sorry for the delay. I was trying to understand the relation to this work
   https://issues.apache.org/jira/browse/HUDI-538 cc @yanghua
   It is still not clear to me where we are heading. While I agree with abstracting the RDD into a WriteInput, I would like to understand how the existing code is going to evolve further. May I ask that we first do a draft that replaces the entire code in hudi-client with these abstractions (the code need not even compile; I just want to understand the final shape we are looking at)? I am also happy to do that myself and discuss. Please let me know if that seems like a reasonable ask.
   
   

