Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/05/25 13:36:07 UTC
[GitHub] [hudi] wangxianghu opened a new pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
wangxianghu opened a new pull request #1665:
URL: https://github.com/apache/hudi/pull/1665
## *Tips*
- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*
## What is the purpose of the pull request
*This pull request introduces HoodieWriteInput as the unified input format for the hudi write client.*
## Brief change log
- add a new module hudi-writer-common
- introduce HoodieWriteInput for hudi write client as the unified input format.
## Verify this pull request
This pull request is a trivial rework / code cleanup without any test coverage.
## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-commenter commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633592885
# [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=h1) Report
> Merging [#1665](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/802d16c8c9793156ef7fef0c59088040800fe025&el=desc) will **decrease** coverage by `0.10%`.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1665/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #1665 +/- ##
============================================
- Coverage 18.33% 18.22% -0.11%
- Complexity 855 857 +2
============================================
Files 344 348 +4
Lines 15167 15332 +165
Branches 1512 1523 +11
============================================
+ Hits 2781 2795 +14
- Misses 12033 12180 +147
- Partials 353 357 +4
```
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZVdyaXRlQ29uZmlnLmphdmE=) | `40.07% <0.00%> (-2.19%)` | `48.00% <0.00%> (ø%)` | |
| [...rg/apache/hudi/metrics/MetricsReporterFactory.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9NZXRyaWNzUmVwb3J0ZXJGYWN0b3J5LmphdmE=) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
| [...apache/hudi/config/HoodieMetricsDatadogConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZU1ldHJpY3NEYXRhZG9nQ29uZmlnLmphdmE=) | `36.36% <0.00%> (ø)` | `2.00% <0.00%> (?%)` | |
| [...apache/hudi/metrics/datadog/DatadogHttpClient.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dIdHRwQ2xpZW50LmphdmE=) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
| [...e/hudi/metrics/datadog/DatadogMetricsReporter.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dNZXRyaWNzUmVwb3J0ZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
| [...g/apache/hudi/metrics/datadog/DatadogReporter.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvbWV0cmljcy9kYXRhZG9nL0RhdGFkb2dSZXBvcnRlci5qYXZh) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (?%)` | |
| [...org/apache/hudi/config/HoodieCompactionConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZUNvbXBhY3Rpb25Db25maWcuamF2YQ==) | `56.00% <0.00%> (+1.09%)` | `3.00% <0.00%> (ø%)` | |
| [...va/org/apache/hudi/config/HoodieMetricsConfig.java](https://codecov.io/gh/apache/hudi/pull/1665/diff?src=pr&el=tree#diff-aHVkaS1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29uZmlnL0hvb2RpZU1ldHJpY3NDb25maWcuamF2YQ==) | `38.46% <0.00%> (+2.35%)` | `3.00% <0.00%> (ø%)` | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=footer). Last update [802d16c...9d9e803](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-636584997
>
>
> @wangxianghu @leesf sorry for the delay. I was trying to understand how this relates to the following work:
> https://issues.apache.org/jira/browse/HUDI-538 cc @yanghua
> It is still not clear to me where we are heading. While I agree with abstracting RDD into a WriteInput, I would like to understand how the existing code is going to evolve further. May I ask that we first do a draft that replaces the entire code in hudi-client with these abstractions (the code need not even compile; I just want to understand the final shape we are looking at)? I am also happy to do that myself and discuss. Please let me know if that seems like a reasonable ask.
Hi @vinothchandar, thanks for your suggestion. Replacing the entire code in hudi-client with the new abstractions is a good choice; I'll do it :)
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641279534
Hi @vinothchandar, this is the abstraction of the hudi-client module (in module hudi-client2):
https://github.com/wangxianghu/hudi/tree/HUDI-xxx
Please take a look when you are free.
As the change is huge, if you agree with this abstraction, may I introduce a new module named "hudi-client2" (or something else) to hold it? When hudi-client2 is ready, we can weaken or delete the hudi-client module.
Alternatively, we could introduce a new feature branch to refactor hudi-client and merge it when ready.
The new module structure could be like this:
```
└── hudi-client
    ├── hudi-client-common
    ├── hudi-client-spark
    ├── hudi-client-flink
    └── hudi-client-java
```
cc @yanghua @leesf
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633631006
Is there an umbrella task for understanding how all the follow-up work on this will look? For example, I am wondering what the eventual methods on `HoodieWriteInput` will be and how it will abstract away the RDD construct.
----------------------------------------------------------------
[GitHub] [hudi] codecov-commenter edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633592885
# [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=h1) Report
> Merging [#1665](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=desc) into [master](https://codecov.io/gh/apache/hudi/commit/bde7a7043e100242fec8fc0111e489a269a1d997&el=desc) will **not change** coverage.
> The diff coverage is `n/a`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/1665/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2)](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #1665 +/- ##
=========================================
Coverage 18.21% 18.21%
Complexity 856 856
=========================================
Files 348 348
Lines 15332 15332
Branches 1523 1523
=========================================
Hits 2792 2792
Misses 12183 12183
Partials 357 357
```
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=footer). Last update [bde7a70...9f0121f](https://codecov.io/gh/apache/hudi/pull/1665?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
[GitHub] [hudi] leesf commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
leesf commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-642060233
> Hi @vinothchandar, this is the abstraction of the hudi-client module (not finished yet):
> https://github.com/wangxianghu/hudi/tree/HUDI-xxx
> please take a look when free.
> cc @yanghua @leesf
Ack. I looked at the branch and the structure looks good to me. @vinothchandar, would you please also take a look when you are free? We need to land this feature ASAP.
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641279534
Hi @vinothchandar, this is the abstraction of the hudi-client module (not finished yet):
https://github.com/wangxianghu/hudi/tree/HUDI-xxx
Please take a look when you are free.
cc @yanghua @leesf
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-637239524
@wangxianghu Thanks for being a sport! That will give us good confidence that we can ultimately pull this off.. Happy to help along as needed..
----------------------------------------------------------------
[GitHub] [hudi] codecov-commenter edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633592885
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633776310
>
>
> Is there an umbrella task for understanding how all the follow-up work on this will look? For example, I am wondering what the eventual methods on `HoodieWriteInput` will be and how it will abstract away the RDD construct.
Yes, here it is: https://issues.apache.org/jira/browse/HUDI-909
Currently only four subtasks are filed; they form the foundation of the entire abstraction:
1. Introduce HoodieWriteInput for hudi write client: https://issues.apache.org/jira/browse/HUDI-910
2. Introduce HoodieWriteOutput for hudi write client: https://issues.apache.org/jira/browse/HUDI-911
3. Introduce HoodieWriteKey for hudi write client: https://issues.apache.org/jira/browse/HUDI-912
4. Introduce HoodieEngineContext for hudi write client: https://issues.apache.org/jira/browse/HUDI-913
For Spark, these could be:
```
JavaRDD<HoodieRecord<T>> records = ... ; // read from source
HoodieWriteInput<JavaRDD<HoodieRecord<T>>> inputRecords = new HoodieWriteInput<>(records);
JavaRDD<HoodieRecord<T>> inputRdds = inputRecords.getInputs();
```
```
JavaSparkContext jsc = ...;
SerializableConfiguration hadoopConf = ...;
HoodieEngineContext hec = new HoodieSparkEngineContext(hadoopConf, jsc); // HoodieSparkEngineContext extends HoodieEngineContext
JavaSparkContext sparkContext = ((HoodieSparkEngineContext) hec).getContext(); // cast to recover the engine-specific context
```
HoodieWriteKey and HoodieWriteOutput follow the same pattern as HoodieWriteInput.
The upsert API could look like this:
```
public HoodieWriteOutput<JavaRDD<WriteStatus>> upsert(HoodieWriteInput<JavaRDD<HoodieRecord<T>>> records, final String instantTime) {...}
```
The content of the method is almost the same as before.
For Java and Flink, just replace `JavaRDD` with `List`.
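The wrapper types described in this comment can be sketched without any engine dependency. What follows is only a minimal sketch, assuming `HoodieWriteInput` and `HoodieWriteOutput` are plain generic holders. `getOutputs()` is a hypothetical accessor name, `ListWriteClient` and its `upsert` body are hypothetical, and `String` stands in for both `HoodieRecord<T>` and `WriteStatus`. It uses `List` in place of `JavaRDD`, as suggested for the Java engine, and is not the actual Hudi implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Engine-agnostic holder for the records to write
// (I would be JavaRDD<...> on Spark, List<...> on a plain Java engine).
class HoodieWriteInput<I> {
  private final I inputs;

  HoodieWriteInput(I inputs) {
    this.inputs = inputs;
  }

  I getInputs() {
    return inputs;
  }
}

// Assumed counterpart for write results; getOutputs() is a hypothetical name.
class HoodieWriteOutput<O> {
  private final O outputs;

  HoodieWriteOutput(O outputs) {
    this.outputs = outputs;
  }

  O getOutputs() {
    return outputs;
  }
}

// Hypothetical list-based client: Strings stand in for HoodieRecord and WriteStatus.
class ListWriteClient {
  HoodieWriteOutput<List<String>> upsert(HoodieWriteInput<List<String>> records, String instantTime) {
    List<String> statuses = new ArrayList<>();
    for (String record : records.getInputs()) {
      // A real client would index, partition, and write here; this only tags each record.
      statuses.add(record + "@" + instantTime);
    }
    return new HoodieWriteOutput<>(statuses);
  }
}

class WriteInputSketch {
  public static void main(String[] args) {
    HoodieWriteInput<List<String>> input = new HoodieWriteInput<>(Arrays.asList("r1", "r2"));
    HoodieWriteOutput<List<String>> output = new ListWriteClient().upsert(input, "001");
    System.out.println(output.getOutputs());
  }
}
```

The point of the sketch is that the `upsert` signature stays the same across engines; only the type argument inside the wrapper changes.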
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu closed pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu closed pull request #1665:
URL: https://github.com/apache/hudi/pull/1665
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-642819377
Sg.. Will jump on #1727 . Closing this one
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar closed pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar closed pull request #1665:
URL: https://github.com/apache/hudi/pull/1665
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641279534
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu commented on a change in pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on a change in pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#discussion_r430116222
##########
File path: hudi-writer-common/src/main/java/org/apache/hudi/format/HoodieWriteInput.java
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.format;
Review comment:
reasonable, will do
##########
File path: hudi-writer-common/src/main/java/org/apache/hudi/format/HoodieWriteInput.java
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.format;
Review comment:
@vinothchandar done
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu edited a comment on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu edited a comment on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633776310
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-633776310
>
>
> Is there an umbrella task for understanding how all the follow-up work on this will look? For example, I am wondering what the eventual methods on `HoodieWriteInput` will be and how it will abstract away the RDD construct.
Yes, here it is: https://issues.apache.org/jira/browse/HUDI-909
Currently only four subtasks are filed; they form the foundation of the entire abstraction:
1. Introduce HoodieWriteInput for hudi write client: https://issues.apache.org/jira/browse/HUDI-910
2. Introduce HoodieWriteOutput for hudi write client: https://issues.apache.org/jira/browse/HUDI-911
3. Introduce HoodieWriteKey for hudi write client: https://issues.apache.org/jira/browse/HUDI-912
4. Introduce HoodieEngineContext for hudi write client: https://issues.apache.org/jira/browse/HUDI-913
For Spark, these could be:
`JavaRDD<HoodieRecord<T>> records = ... ; // read from source
HoodieWriteInput<JavaRDD<HoodieRecord<T>>> inputRecords = new HoodieWriteInput<>(records);
JavaRDD<HoodieRecord<T>> inputRdds = inputRecords.getInputs();`
`JavaSparkContext jsc = ...;`
`HoodieEngineContext<JavaSparkContext> hec = new HoodieSparkEngineContext(jsc); //HoodieSparkEngineContext<JavaSparkContext> implements HoodieEngineContext`
`JavaSparkContext sparkContext = hec.getContext();`
HoodieWriteKey and HoodieWriteOutput follow the same pattern as HoodieWriteInput.
The upsert API could look like this:
`public HoodieWriteOutput<JavaRDD<WriteStatus>> upsert(HoodieWriteInput<JavaRDD<HoodieRecord<T>>> records, final String instantTime) {...}`
The content of the method is almost the same as before.
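The generic engine-context idea in this comment can be illustrated without Spark on the classpath. This is a minimal sketch, assuming `HoodieEngineContext<C>` is an interface with a single `getContext()` method, as the snippet implies. `HoodieJavaEngineContext` and the `ExecutorService` it wraps are hypothetical stand-ins for `HoodieSparkEngineContext` and `JavaSparkContext`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Engine-agnostic handle; C is the engine's native context type
// (JavaSparkContext on Spark in the comment above).
interface HoodieEngineContext<C> {
  C getContext();
}

// Hypothetical plain-Java engine: an ExecutorService plays the role of the native context.
class HoodieJavaEngineContext implements HoodieEngineContext<ExecutorService> {
  private final ExecutorService executor;

  HoodieJavaEngineContext(ExecutorService executor) {
    this.executor = executor;
  }

  @Override
  public ExecutorService getContext() {
    return executor;
  }
}

class EngineContextSketch {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    HoodieEngineContext<ExecutorService> hec = new HoodieJavaEngineContext(pool);
    // Because the interface is generic, no cast is needed to get the native context back.
    ExecutorService nativeContext = hec.getContext();
    System.out.println(nativeContext.submit(() -> "ok").get());
    pool.shutdown();
  }
}
```

With the type parameter on the interface, engine-specific code can retrieve its native context in a type-safe way, while shared code only depends on `HoodieEngineContext`.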
----------------------------------------------------------------
[GitHub] [hudi] wangxianghu commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
wangxianghu commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-641275707
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-636382401
This pr itself is fine. But given we are adding a new module and this is a critical thing to get right, trying to understand more upfront
----------------------------------------------------------------
[GitHub] [hudi] leesf commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
leesf commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-635387795
@vinothchandar just a reminder on this PR.
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on a change in pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on a change in pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#discussion_r429998200
##########
File path: hudi-writer-common/src/main/java/org/apache/hudi/format/HoodieWriteInput.java
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.format;
Review comment:
format is a bit misleading.. just `org.apache.hudi.writer.common`?
----------------------------------------------------------------
[GitHub] [hudi] vinothchandar commented on pull request #1665: [HUDI-910]Introduce HoodieWriteInput for hudi write client
Posted by GitBox <gi...@apache.org>.
vinothchandar commented on pull request #1665:
URL: https://github.com/apache/hudi/pull/1665#issuecomment-636382237
@wangxianghu @leesf sorry for the delay. I was trying to understand how this relates to the following work:
https://issues.apache.org/jira/browse/HUDI-538 cc @yanghua
It is still not clear to me where we are heading. While I agree with abstracting RDD into a WriteInput, I would like to understand how the existing code is going to evolve further. May I ask that we first do a draft that replaces the entire code in hudi-client with these abstractions (the code need not even compile; I just want to understand the final shape we are looking at)? I am also happy to do that myself and discuss. Please let me know if that seems like a reasonable ask.