Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/06/23 16:38:35 UTC
[GitHub] [hudi] codope opened a new pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
codope opened a new pull request #3142:
URL: https://github.com/apache/hudi/pull/3142
## What is the purpose of the pull request
This pull request adds async clustering support for HoodieDeltaStreamer and Spark streaming writes to a Hudi table.
## Brief change log
- Async clustering is configurable.
- Reuses the existing clustering methods in the write client and timeline.
- It therefore has the same limitation as the current clustering strategy, i.e., updates are rejected.
## Verify this pull request
This change added tests and can be verified as follows:
- Added unit tests in `TestHoodieDeltaStreamer` and `TestStructuredStreaming`.
- Manually verified the change by running a job locally.
## Committer checklist
- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `31.75%`.
> The diff coverage is `40.35%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@              Coverage Diff              @@
##             master    #3142       +/-   ##
=============================================
- Coverage     47.67%   15.91%   -31.76%
+ Complexity     5516      493     -5023
=============================================
  Files           929      283      -646
  Lines         41303    11710    -29593
  Branches       4144      961     -3183
=============================================
- Hits          19692     1864    -17828
+ Misses        19863     9683    -10180
+ Partials       1748      163     -1585
```
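The headline figures in the coverage diff above appear to follow from the raw counts: Hits + Misses + Partials equals Lines (19692 + 19863 + 1748 = 41303), and Coverage matches Hits / Lines truncated, not rounded, to two decimals. This is inferred from the numbers in this report, not taken from Codecov's documentation. `CoverageMath` below is a hypothetical helper written only to check that arithmetic.

```java
import java.util.Locale;

/** Hypothetical helper for checking the report's arithmetic; not part of Codecov or Hudi. */
final class CoverageMath {
  private CoverageMath() {}

  /**
   * Returns hits / lines as a percentage in hundredths of a percent,
   * truncated (assumption: the report truncates instead of rounding).
   */
  static long truncatedHundredths(long hits, long lines) {
    if (lines <= 0) {
      throw new IllegalArgumentException("lines must be positive");
    }
    return hits * 10_000 / lines; // integer division drops the remainder
  }

  /** Formats hundredths of a percent, e.g. 4767 becomes "47.67%". */
  static String format(long hundredths) {
    return String.format(Locale.ROOT, "%d.%02d%%", hundredths / 100, hundredths % 100);
  }
}
```

With the counts from the diff, `truncatedHundredths(19692, 41303)` gives 4767 (47.67%) for master and `truncatedHundredths(1864, 11710)` gives 1591 (15.91%) for the PR. Rounding to nearest would have produced 47.68% and 15.92% instead.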
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.58%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.08% <92.50%> (+2.25%)` | :arrow_up: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+0.41%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [724 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818)
* f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r664561678
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+/**
+ * Async clustering service that runs in a separate thread.
+ * Currently, only one clustering thread is allowed to run at any time.
+ */
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+ private static final long serialVersionUID = 1L;
+ private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+ private final int maxConcurrentClustering;
+ private transient AbstractClusteringClient clusteringClient;
+ private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+ private transient ReentrantLock queueLock = new ReentrantLock();
Review comment:
As discussed, moved the common methods and variables to `HoodieAsyncService`.
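The refactoring described in this comment can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code from the PR: `QueueBackedAsyncServiceSketch`, its methods, and the two subclasses are hypothetical names, and instants are plain strings rather than `HoodieInstant`. It only shows the idea of keeping the pending-work queue, lock, and condition in one base class so that the compaction and clustering services do not each declare their own.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical base class that owns the queue state shared by async table services. */
abstract class QueueBackedAsyncServiceSketch<T> {
  private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
  private final ReentrantLock queueLock = new ReentrantLock();
  private final Condition consumed = queueLock.newCondition();

  /** Human-readable name of the concrete service, used only for illustration. */
  abstract String serviceName();

  /** Called by the writer thread after it has scheduled a unit of work. */
  void enqueue(T item) {
    pending.add(item);
  }

  /** Called by the worker thread; returns null on timeout or interruption. */
  T fetchNext(long timeoutMillis) {
    try {
      T item = pending.poll(timeoutMillis, TimeUnit.MILLISECONDS);
      if (item != null) {
        queueLock.lock();
        try {
          consumed.signalAll(); // wake writers waiting for the backlog to shrink
        } finally {
          queueLock.unlock();
        }
      }
      return item;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return null;
    }
  }

  /** Blocks the caller until at most maxPending items remain queued. */
  void waitTillPendingAtMost(int maxPending) {
    queueLock.lock();
    try {
      while (pending.size() > maxPending) {
        consumed.awaitUninterruptibly();
      }
    } finally {
      queueLock.unlock();
    }
  }

  int pendingCount() {
    return pending.size();
  }
}

/** Hypothetical clustering service: inherits all queue handling. */
final class ClusteringQueueSketch extends QueueBackedAsyncServiceSketch<String> {
  @Override
  String serviceName() {
    return "clustering";
  }
}

/** Hypothetical compaction service: inherits the same queue handling. */
final class CompactionQueueSketch extends QueueBackedAsyncServiceSketch<String> {
  @Override
  String serviceName() {
    return "compaction";
  }
}
```

In this sketch a writer calls `enqueue` after scheduling work and may call `waitTillPendingAtMost` to apply back-pressure, while a single worker thread loops on `fetchNext`. Each subclass adds only what is specific to its table service.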
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* cea6548b53a242e93e861401564a5bb55d317f24 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812)
* 890e9822855fcd45c8387f83740975f43474cddc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413)
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cd27818) into [master](https://codecov.io/gh/apache/hudi/commit/70d9c2e7473154178bafaef40a15fc3e10a9df6a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (70d9c2e) will **decrease** coverage by `31.72%`.
> The diff coverage is `38.26%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@              Coverage Diff              @@
##             master    #3142       +/-   ##
=============================================
- Coverage     47.51%   15.78%   -31.73%
+ Complexity     5427      482     -4945
=============================================
  Files           922      282      -640
  Lines         40966    11650    -29316
  Branches       4100      954     -3146
=============================================
- Hits          19463     1839    -17624
+ Misses        19784     9650    -10134
+ Partials       1719      161     -1558
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `58.76% <89.79%> (+0.74%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.79%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.47% <75.00%> (+0.43%)` | :arrow_up: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.34% <92.10%> (+2.50%)` | :arrow_up: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+0.41%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [718 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [70d9c2e...cd27818](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659)
* 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `44.81%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
============================================
- Coverage 47.67% 2.86% -44.82%
+ Complexity 5516 85 -5431
============================================
Files 929 283 -646
Lines 41303 11710 -29593
Branches 4144 961 -3183
============================================
- Hits 19692 335 -19357
+ Misses 19863 11349 -8514
+ Partials 1748 26 -1722
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.58%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.11% <0.00%> (-49.46%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.15%)` | :arrow_down: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [770 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f86f50e) into [master](https://codecov.io/gh/apache/hudi/commit/371526789d663dee85041eb31c27c52c81ef87ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3715267) will **decrease** coverage by `24.54%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
============================================
- Coverage 27.40% 2.86% -24.55%
+ Complexity 1287 85 -1202
============================================
Files 381 283 -98
Lines 15108 11710 -3398
Branches 1305 961 -344
============================================
- Hits 4141 335 -3806
- Misses 10667 11349 +682
+ Partials 300 26 -274
```
| Flag | Coverage Δ | |
|---|---|---|
| hudiclient | `0.00% <0.00%> (-21.06%)` | :arrow_down: |
| hudisync | `5.37% <ø> (ø)` | |
| hudiutilities | `9.11% <0.00%> (-49.46%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.15%)` | :arrow_down: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [148 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3715267...f86f50e](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659)
[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r665443646
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
##########
@@ -165,4 +176,51 @@ private void monitorThreads(Function<Boolean, Boolean> onShutdownCallback) {
public boolean isRunInDaemonMode() {
return runInDaemonMode;
}
+
+ /**
+ * Wait till outstanding pending compaction/clustering reduces to the passed in value.
+ *
+ * @param numPending Maximum pending compactions/clustering allowed
+ * @throws InterruptedException
+ */
+ public void waitTillPendingActionReducesTo(int numPending) throws InterruptedException {
+ try {
+ queueLock.lock();
+ while (!isShutdown() && (pendingInstants.size() > numPending)) {
+ consumed.await();
+ }
+ } finally {
+ queueLock.unlock();
+ }
+ }
+
+ /**
+ * Enqueues new pending clustering instant.
+ * @param instant {@link HoodieInstant} to enqueue.
+ */
+ public void enqueuePendingAction(HoodieInstant instant) {
+ LOG.info("Enqueuing new pending clustering instant: " + instant.getTimestamp());
+ pendingInstants.add(instant);
+ }
+
+ /**
+ * Fetch next pending compaction/clustering instant if available.
+ *
+ * @return {@link HoodieInstant} corresponding to the next pending compaction/clustering.
+ * @throws InterruptedException
+ */
+ HoodieInstant fetchNextActionInstant() throws InterruptedException {
Review comment:
nit: rename this to `fetchNextAsyncServiceInstant`.
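For reference, the wait/consume handshake in the hunk above can be sketched in isolation as follows. This is a minimal, self-contained sketch and not the PR's code: `PendingActionQueue` is a hypothetical stand-in, instants are plain strings rather than `HoodieInstant`, the 10 ms poll interval is an assumption, and the fetch method uses the name suggested in the nit. One small difference from the hunk: `lock()` is called before the `try`, so `unlock()` only runs when the lock was actually acquired.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the pending-instant handling; not the Hudi class. */
class PendingActionQueue {
  private final BlockingQueue<String> pendingInstants = new LinkedBlockingQueue<>();
  private final ReentrantLock queueLock = new ReentrantLock();
  private final Condition consumed = queueLock.newCondition();
  private volatile boolean shutdown = false;

  /** Producer side: register a newly scheduled instant. */
  void enqueuePendingAction(String instant) {
    pendingInstants.add(instant);
  }

  /** Consumer side: take the next instant and wake up any waiting producer. */
  String fetchNextAsyncServiceInstant() throws InterruptedException {
    String instant = null;
    while (!shutdown && instant == null) {
      // Poll with a short timeout (assumed value) so shutdown is noticed promptly.
      instant = pendingInstants.poll(10, TimeUnit.MILLISECONDS);
    }
    if (instant != null) {
      queueLock.lock();
      try {
        consumed.signalAll();
      } finally {
        queueLock.unlock();
      }
    }
    return instant;
  }

  /** Block until at most numPending instants remain queued, or until shutdown. */
  void waitTillPendingActionReducesTo(int numPending) throws InterruptedException {
    queueLock.lock();
    try {
      while (!shutdown && pendingInstants.size() > numPending) {
        consumed.await();
      }
    } finally {
      queueLock.unlock();
    }
  }

  /** Stop the consumer loop and release any waiter. */
  void shutdown() {
    shutdown = true;
    queueLock.lock();
    try {
      consumed.signalAll();
    } finally {
      queueLock.unlock();
    }
  }

  int pendingCount() {
    return pendingInstants.size();
  }

  public static void main(String[] args) throws InterruptedException {
    PendingActionQueue queue = new PendingActionQueue();
    queue.enqueuePendingAction("001");
    queue.enqueuePendingAction("002");
    // A background consumer drains the queue, as the async service thread would.
    Thread consumer = new Thread(() -> {
      try {
        while (queue.fetchNextAsyncServiceInstant() != null) {
          // A real service would run compaction/clustering for the instant here.
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    consumer.start();
    queue.waitTillPendingActionReducesTo(0);
    queue.shutdown();
    consumer.join();
    System.out.println("pending after drain: " + queue.pendingCount());
  }
}
```

The waiter re-checks the queue size in a loop while holding the lock, and the consumer signals under the same lock after removing an instant, so a wake-up cannot be missed between the check and the `await()`.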
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -296,6 +297,11 @@ public static void main(String[] args) throws IOException {
+ "outstanding compactions is less than this number")
public Integer maxPendingCompactions = 5;
+ @Parameter(names = {"--max-pending-clustering"},
Review comment:
@pratyakshsharma ^
##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestStructuredStreaming.scala
##########
@@ -207,12 +209,26 @@ class TestStructuredStreaming extends HoodieClientTestBase {
metaClient.reloadActiveTimeline()
assertEquals(1, getLatestFileGroupsFileId(HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH).size)
}
- structuredStreamingForTestClusteringRunner(sourcePath, destPath, true,
+ structuredStreamingForTestClusteringRunner(sourcePath, destPath, true, false,
HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH, checkClusteringResult)
}
@Test
- def testStructuredStreamingWithoutInlineClustering(): Unit = {
+ def testStructuredStreamingWithAsyncClustering(): Unit = {
Review comment:
Not sure if this will be too hard to achieve, but is there a way to simulate resource unavailability? That is, when async clustering is scheduled, there are no resources to run it right away, but once the test opens up resources, async clustering should get triggered. Basically, this would validate that a pending async clustering gets triggered when resources become available and does not get cancelled.
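One possible shape for such a check, sketched outside of Hudi and Spark: a single-thread executor stands in for the available resources, and a latch keeps its only worker busy until the test "opens up" resources. Everything here is hypothetical (`ResourceGateDemo`, the `"clustered"` result, the 5 second timeout); it only illustrates the gate-then-release pattern and does not model Spark scheduling or the Hudi timeline.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical harness: a single-thread executor stands in for cluster resources. */
class ResourceGateDemo {

  /** Returns true if a job queued while resources are busy still runs once they free up. */
  static boolean pendingJobRunsAfterResourcesFree() throws Exception {
    ExecutorService resources = Executors.newSingleThreadExecutor();
    CountDownLatch gate = new CountDownLatch(1);
    try {
      // Occupy the only worker so nothing else can start right away.
      resources.submit(() -> {
        gate.await();
        return null;
      });
      // Stand-in for the async clustering job; it has to wait in the queue.
      Future<String> clustering = resources.submit(() -> "clustered");

      // While the gate is closed the job must be pending: neither done nor cancelled.
      boolean pendingWhileBlocked = !clustering.isDone() && !clustering.isCancelled();

      // Open up resources; the pending job should now get triggered.
      gate.countDown();
      String result = clustering.get(5, TimeUnit.SECONDS);
      return pendingWhileBlocked && "clustered".equals(result);
    } finally {
      resources.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("pending job ran after release: " + pendingJobRunsAfterResourcesFree());
  }
}
```

In an actual test, the blocking task would be replaced by whatever holds the resources in the test setup, and the final assertion would inspect the timeline for a completed clustering instant rather than a string result.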
##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -473,6 +473,11 @@ object DataSourceWriteOptions {
.defaultValue("true")
.withDocumentation("")
+ val ASYNC_CLUSTERING_ENABLE_OPT_KEY: ConfigProperty[String] = ConfigProperty
+ .key("hoodie.datasource.clustering.async.enable")
+ .defaultValue("false")
Review comment:
Can we set the min version as well?
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659)
* 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 UNKNOWN
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661474341
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -396,10 +424,10 @@ public int hashCode() {
baseFileFormat, propsFilePath, configs, sourceClassName,
sourceOrderingField, payloadClassName, schemaProviderClassName,
transformerClassNames, sourceLimit, operation, filterDupes,
- enableHiveSync, maxPendingCompactions, continuousMode,
+ enableHiveSync, maxPendingCompactions, maxPendingClustering, continuousMode,
minSyncIntervalSeconds, sparkMaster, commitOnErrors,
deltaSyncSchedulingWeight, compactSchedulingWeight, deltaSyncSchedulingMinShare,
- compactSchedulingMinShare, forceDisableCompaction, checkpoint,
+ compactSchedulingMinShare, forceDisableCompaction, forceDisableClustering, checkpoint,
Review comment:
This config will be required only if we create a separate job pool for clustering and assign weights to it. As mentioned [here](https://github.com/apache/hudi/pull/3142#issuecomment-869829116), we are not creating a separate pool.
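For readers following the hunk above: the two new fields, `maxPendingClustering` and `forceDisableClustering`, are added to the field list inside `hashCode()`. A minimal sketch of that pattern follows. The class `StreamerConfigSketch` is hypothetical and carries only four fields; it is not the real `HoodieDeltaStreamer.Config`, and the use of `Objects.hash` is an assumption about how the field list is combined.

```java
import java.util.Objects;

// Hypothetical stand-in for a streamer config; not the actual Hudi class.
final class StreamerConfigSketch {
    final int maxPendingCompactions;
    final int maxPendingClustering;       // newly added field
    final boolean forceDisableCompaction;
    final boolean forceDisableClustering; // newly added field

    StreamerConfigSketch(int maxPendingCompactions, int maxPendingClustering,
                         boolean forceDisableCompaction, boolean forceDisableClustering) {
        this.maxPendingCompactions = maxPendingCompactions;
        this.maxPendingClustering = maxPendingClustering;
        this.forceDisableCompaction = forceDisableCompaction;
        this.forceDisableClustering = forceDisableClustering;
    }

    // Every field that participates in hashCode() must also participate in equals().
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof StreamerConfigSketch)) {
            return false;
        }
        StreamerConfigSketch that = (StreamerConfigSketch) o;
        return maxPendingCompactions == that.maxPendingCompactions
            && maxPendingClustering == that.maxPendingClustering
            && forceDisableCompaction == that.forceDisableCompaction
            && forceDisableClustering == that.forceDisableClustering;
    }

    // New fields are appended to the same list, mirroring the diff.
    @Override
    public int hashCode() {
        return Objects.hash(maxPendingCompactions, maxPendingClustering,
            forceDisableCompaction, forceDisableClustering);
    }

    public static void main(String[] args) {
        StreamerConfigSketch a = new StreamerConfigSketch(5, 5, false, false);
        StreamerConfigSketch b = new StreamerConfigSketch(5, 5, false, true);
        System.out.println("equal: " + a.equals(b));
    }
}
```

Keeping `equals()` and `hashCode()` on the same field list means two configs that differ only in a clustering setting are treated as different objects.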
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2d1f238) into [master](https://codecov.io/gh/apache/hudi/commit/6e2443468280dfd768afe9ac4d17df7fbbbb51bf?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6e24434) will **decrease** coverage by `44.76%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
============================================
- Coverage 47.61% 2.85% -44.77%
+ Complexity 5487 82 -5405
============================================
Files 924 282 -642
Lines 41206 11705 -29501
Branches 4134 959 -3175
============================================
- Hits 19621 334 -19287
+ Misses 19843 11345 -8498
+ Partials 1742 26 -1716
```
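As a sanity check on the table above, the Coverage row can be recomputed from the Hits, Misses and Partials rows. The sketch below assumes coverage is hits divided by the sum of hits, misses and partials, which agrees with the Lines row in this report; the class and method names are illustrative only.

```java
// Minimal sketch: recompute the Coverage row from the Hits, Misses and Partials rows.
final class CoverageSketch {
    // Coverage percentage = hits / (hits + misses + partials) * 100.
    static double coveragePercent(long hits, long misses, long partials) {
        long lines = hits + misses + partials;
        if (lines == 0) {
            return 0.0; // sketch assumption: report 0% when there are no lines
        }
        return 100.0 * hits / lines;
    }

    public static void main(String[] args) {
        // Values copied from the report: master column, then PR column.
        double master = coveragePercent(19621, 19843, 1742);
        double pr = coveragePercent(334, 11345, 26);
        System.out.printf("master=%.4f%% pr=%.4f%% delta=%.4f%n", master, pr, pr - master);
    }
}
```

A drop of this size, together with the file count falling from 924 to 282, suggests that most modules did not upload coverage for this run rather than that the change removed tests.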
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.28% <ø> (-49.20%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.12% <0.00%> (-49.48%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.15%)` | :arrow_down: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [767 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6e24434...2d1f238](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f86f50e) into [master](https://codecov.io/gh/apache/hudi/commit/371526789d663dee85041eb31c27c52c81ef87ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3715267) will **decrease** coverage by `11.49%`.
> The diff coverage is `40.35%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
=============================================
- Coverage 27.40% 15.91% -11.50%
+ Complexity 1287 493 -794
=============================================
Files 381 283 -98
Lines 15108 11710 -3398
Branches 1305 961 -344
=============================================
- Hits 4141 1864 -2277
+ Misses 10667 9683 -984
+ Partials 300 163 -137
```
| Flag | Coverage Δ | |
|---|---|---|
| hudiclient | `0.00% <0.00%> (-21.06%)` | :arrow_down: |
| hudisync | `5.37% <ø> (ø)` | |
| hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.08% <92.50%> (+2.25%)` | :arrow_up: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+0.41%)` | :arrow_up: |
| [...n/bulkinsert/PartitionSortPartitionerWithRows.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhlY3V0aW9uL2J1bGtpbnNlcnQvUGFydGl0aW9uU29ydFBhcnRpdGlvbmVyV2l0aFJvd3MuamF2YQ==) | | |
| ... and [102 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3715267...f86f50e](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r666691523
##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestStructuredStreaming.scala
##########
@@ -207,12 +209,26 @@ class TestStructuredStreaming extends HoodieClientTestBase {
metaClient.reloadActiveTimeline()
assertEquals(1, getLatestFileGroupsFileId(HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH).size)
}
- structuredStreamingForTestClusteringRunner(sourcePath, destPath, true,
+ structuredStreamingForTestClusteringRunner(sourcePath, destPath, true, false,
HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH, checkClusteringResult)
}
@Test
- def testStructuredStreamingWithoutInlineClustering(): Unit = {
+ def testStructuredStreamingWithAsyncClustering(): Unit = {
Review comment:
I will try to simulate this as part of an integration test, tracked in https://issues.apache.org/jira/browse/HUDI-1077
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* cea6548b53a242e93e861401564a5bb55d317f24 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812)
* 890e9822855fcd45c8387f83740975f43474cddc UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744)
* cea6548b53a242e93e861401564a5bb55d317f24 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812)
* 890e9822855fcd45c8387f83740975f43474cddc UNKNOWN
[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-877330225
@hudi-bot run azure
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818)
* f86f50e817a625dc30f35a39b7495a4f359e4da5 UNKNOWN
[GitHub] [hudi] nsivabalan commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867624272
@hudi-bot run travis
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661474972
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+ private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+ private final int maxConcurrentClustering;
+ private transient AbstractClusteringClient clusteringClient;
+ private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+ private transient ReentrantLock queueLock = new ReentrantLock();
+ private transient Condition consumed = queueLock.newCondition();
+
+ public AsyncClusteringService(AbstractHoodieWriteClient writeClient) {
+ this(writeClient, false);
+ }
+
+ public AsyncClusteringService(AbstractHoodieWriteClient writeClient, boolean runInDaemonMode) {
+ super(runInDaemonMode);
+ this.clusteringClient = createClusteringClient(writeClient);
+ this.maxConcurrentClustering = 1;
+ }
+
+ protected abstract AbstractClusteringClient createClusteringClient(AbstractHoodieWriteClient client);
+
+ public void enqueuePendingClustering(HoodieInstant instant) {
+ LOG.info("Enqueuing new pending clustering instant: " + instant.getTimestamp());
+ pendingClustering.add(instant);
+ }
+
+ public void waitTillPendingClusteringReducesTo(int numPendingClustering) throws InterruptedException {
+ try {
+ queueLock.lock();
+ while (!isShutdown() && (pendingClustering.size() > numPendingClustering)) {
+ consumed.await();
+ }
+ } finally {
+ queueLock.unlock();
+ }
+ }
+
+ private HoodieInstant fetchNextClusteringInstant() throws InterruptedException {
+ LOG.info("Waiting for next clustering instant for 10 seconds");
+ HoodieInstant instant = pendingClustering.poll(10, TimeUnit.SECONDS);
+ if (instant != null) {
+ try {
+ queueLock.lock();
+ // Signal waiting thread
+ consumed.signal();
+ } finally {
+ queueLock.unlock();
+ }
+ }
+ return instant;
+ }
+
+ /**
+ * Start clustering service.
+ */
+ @Override
+ protected Pair<CompletableFuture, ExecutorService> startService() {
+ ExecutorService executor = Executors.newFixedThreadPool(maxConcurrentClustering,
+ r -> {
+ Thread t = new Thread(r, "async_clustering_thread");
+ t.setDaemon(isRunInDaemonMode());
+ return t;
+ });
+
+ return Pair.of(CompletableFuture.allOf(IntStream.range(0, maxConcurrentClustering).mapToObj(i -> CompletableFuture.supplyAsync(() -> {
+ try {
+ while (!isShutdownRequested()) {
+ final HoodieInstant instant = fetchNextClusteringInstant();
+ if (null != instant) {
+ LOG.info("Starting clustering for instant " + instant);
+ clusteringClient.cluster(instant);
+ LOG.info("Finished clustering for instant " + instant);
+ }
+ }
+ LOG.info("Clustering executor shutting down properly");
+ } catch (InterruptedException ie) {
+ LOG.warn("Clustering executor got interrupted exception! Stopping", ie);
+ } catch (IOException e) {
+ LOG.error("Clustering executor failed", e);
+ throw new HoodieIOException(e.getMessage(), e);
+ }
+ return true;
+ }, executor)).toArray(CompletableFuture[]::new)), executor);
+ }
+
+ public synchronized void updateWriteClient(AbstractHoodieWriteClient writeClient) {
Review comment:
Moved some common logic to `HoodieAsyncService`.
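For readers following the diff above, the queue-and-condition handshake it relies on can be sketched in isolation. This is only a minimal illustration, not the code in the PR: `PendingInstantQueue` and its method names are hypothetical, instants are plain strings, the shutdown checks are left out, and interrupts are swallowed (with the flag restored) to keep the sketch short.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the pending-clustering queue; not a Hudi class.
final class PendingInstantQueue {
  private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
  private final ReentrantLock queueLock = new ReentrantLock();
  private final Condition consumed = queueLock.newCondition();

  // Producer side: record a newly scheduled instant.
  void enqueue(String instantTime) {
    pending.add(instantTime);
  }

  // Consumer side: returns the next instant, or null if none arrives in time
  // or the calling thread is interrupted.
  String fetchNext(long timeoutMillis) {
    try {
      String instant = pending.poll(timeoutMillis, TimeUnit.MILLISECONDS);
      if (instant != null) {
        queueLock.lock();
        try {
          consumed.signalAll(); // wake producers waiting for the queue to drain
        } finally {
          queueLock.unlock();
        }
      }
      return instant;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }

  // Producer side: blocks until at most maxPending instants remain queued.
  // Returns false if interrupted while waiting.
  boolean waitUntilPendingAtMost(int maxPending) {
    queueLock.lock();
    try {
      while (pending.size() > maxPending) {
        consumed.await();
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      queueLock.unlock();
    }
  }

  int size() {
    return pending.size();
  }
}
```

The sketch acquires the lock before entering `try`, so `unlock()` in `finally` only runs when the lock is actually held. That is a stylistic difference from the diff, which calls `lock()` inside the `try` block.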
[GitHub] [hudi] zhangyue19921010 commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-879326501
Hi @codope, thanks for your response. What if file slice11 is created after `ClusteringPlanStrategy#getFileSlicesEligibleForClustering` has run, so that only file slice1 was picked up for clustering? After this plan executes, the new data in slice11 might be lost.
[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-877110964
@hudi-bot run azure
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823)
[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-869829116
@nsivabalan Thanks for reviewing the PR. I agree with your source code comments; there is scope for reuse. I will address them and update the PR. My responses to the high-level questions are below.
> * Now that we have both clustering and compaction, I see that you have added clustering-related code just after compaction wherever applicable. Is the higher priority for compaction intentional? Or should we have clustering followed by compaction? Or does it not matter at all?
When both clustering and compaction are enabled, compaction runs just before clustering. Compaction and clustering currently cannot run at the same time on the same file groups, and clustering can take significant time, so the intention is to let the compaction thread start first. If clustering is scheduled for file groups that are under compaction, those file groups are ignored and picked up in a subsequent run after compaction completes.
> * I came across a class named SchedulerConfGenerator. Don't we need to make any changes here for async clustering?
We would need to make changes here only if we created a separate job pool for clustering and assigned weights to the different jobs. Unlike compaction, I did not feel the need for a separate job pool for clustering. By default, each pool gets an equal share of resources, but within each pool jobs run in FIFO order.
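As a rough illustration of the ordering described above (file groups under compaction are skipped by clustering and retried later), the filtering step might look like the sketch below. This is a sketch under stated assumptions, not the PR's implementation: `ClusteringCandidateFilter` and `eligibleForClustering` are hypothetical names, and file groups are represented as plain string IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of Hudi.
final class ClusteringCandidateFilter {

  // Returns the candidates that are not claimed by a pending compaction,
  // preserving the input order. Skipped file groups are simply left for a
  // later scheduling round.
  static List<String> eligibleForClustering(List<String> candidateFileGroupIds,
                                            Set<String> fileGroupsUnderCompaction) {
    List<String> eligible = new ArrayList<>();
    for (String fileGroupId : candidateFileGroupIds) {
      if (!fileGroupsUnderCompaction.contains(fileGroupId)) {
        eligible.add(fileGroupId);
      }
    }
    return eligible;
  }
}
```

Calling the method again once the set of file groups under compaction is empty returns every candidate, which corresponds to the "picked up in a subsequent run" behaviour mentioned in the reply.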
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (890e982) into [master](https://codecov.io/gh/apache/hudi/commit/047d956e01b6d7c92320686d8321b2bbe9d2188e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (047d956) will **decrease** coverage by `28.11%`.
> The diff coverage is `40.35%`.
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
=============================================
- Coverage 44.03% 15.91% -28.12%
+ Complexity 5100 493 -4607
=============================================
Files 930 283 -647
Lines 41275 11710 -29565
Branches 4138 961 -3177
=============================================
- Hits 18174 1864 -16310
+ Misses 21487 9683 -11804
+ Partials 1614 163 -1451
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <90.19%> (+50.00%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+71.56%)` | :arrow_up: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.08% <92.50%> (+75.08%)` | :arrow_up: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+76.60%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [771 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r663257958
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -296,6 +297,11 @@ public static void main(String[] args) throws IOException {
+ "outstanding compactions is less than this number")
public Integer maxPendingCompactions = 5;
+ @Parameter(names = {"--max-pending-clustering"},
Review comment:
Out of curiosity: how come the maxPendingCompaction/Clustering properties are defined as first-class (top-level) configs for multiTableDeltaStreamer, while there are no top-level properties to enable or disable them? I assume those are fetched from the property file for each source. So why not fetch these properties from the property file as well? I know this is not specific to clustering; it is how compaction was already defined.
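One way to picture the alternative raised here is a per-table lookup that falls back to the top-level value. The sketch below is only an illustration of that idea: `PendingLimitResolver` is a hypothetical class, and the key `example.max.pending.clustering` is a made-up placeholder, not a real Hudi config name.

```java
import java.util.Properties;

// Hypothetical helper; not part of Hudi.
final class PendingLimitResolver {

  // Placeholder key used only for this sketch.
  static final String KEY = "example.max.pending.clustering";

  // Prefers the value from the per-table properties and falls back to the
  // top-level (command-line) default when the key is absent.
  static int resolve(Properties tableProps, int topLevelDefault) {
    String raw = tableProps.getProperty(KEY);
    return raw == null ? topLevelDefault : Integer.parseInt(raw.trim());
  }
}
```

With this shape, a table whose property file sets the key overrides the shared default, while tables that omit it keep the value passed at the top level.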
[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r659456141
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
Review comment:
Javadocs would be nice.
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+ private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+ private final int maxConcurrentClustering;
+ private transient AbstractClusteringClient clusteringClient;
+ private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+ private transient ReentrantLock queueLock = new ReentrantLock();
+ private transient Condition consumed = queueLock.newCondition();
+
+ public AsyncClusteringService(AbstractHoodieWriteClient writeClient) {
+ this(writeClient, false);
+ }
+
+ public AsyncClusteringService(AbstractHoodieWriteClient writeClient, boolean runInDaemonMode) {
+ super(runInDaemonMode);
+ this.clusteringClient = createClusteringClient(writeClient);
+ this.maxConcurrentClustering = 1;
+ }
+
+ protected abstract AbstractClusteringClient createClusteringClient(AbstractHoodieWriteClient client);
+
+ public void enqueuePendingClustering(HoodieInstant instant) {
+ LOG.info("Enqueuing new pending clustering instant: " + instant.getTimestamp());
+ pendingClustering.add(instant);
+ }
+
+ public void waitTillPendingClusteringReducesTo(int numPendingClustering) throws InterruptedException {
+ try {
+ queueLock.lock();
+ while (!isShutdown() && (pendingClustering.size() > numPendingClustering)) {
+ consumed.await();
+ }
+ } finally {
+ queueLock.unlock();
+ }
+ }
+
+ private HoodieInstant fetchNextClusteringInstant() throws InterruptedException {
+ LOG.info("Waiting for next clustering instant for 10 seconds");
+ HoodieInstant instant = pendingClustering.poll(10, TimeUnit.SECONDS);
+ if (instant != null) {
+ try {
+ queueLock.lock();
+ // Signal waiting thread
+ consumed.signal();
+ } finally {
+ queueLock.unlock();
+ }
+ }
+ return instant;
+ }
+
+ /**
+ * Start clustering service.
+ */
+ @Override
+ protected Pair<CompletableFuture, ExecutorService> startService() {
+ ExecutorService executor = Executors.newFixedThreadPool(maxConcurrentClustering,
+ r -> {
+ Thread t = new Thread(r, "async_clustering_thread");
+ t.setDaemon(isRunInDaemonMode());
+ return t;
+ });
+
+ return Pair.of(CompletableFuture.allOf(IntStream.range(0, maxConcurrentClustering).mapToObj(i -> CompletableFuture.supplyAsync(() -> {
+ try {
+ while (!isShutdownRequested()) {
+ final HoodieInstant instant = fetchNextClusteringInstant();
+ if (null != instant) {
+ LOG.info("Starting clustering for instant " + instant);
+ clusteringClient.cluster(instant);
+ LOG.info("Finished clustering for instant " + instant);
+ }
+ }
+ LOG.info("Clustering executor shutting down properly");
+ } catch (InterruptedException ie) {
+ LOG.warn("Clustering executor got interrupted exception! Stopping", ie);
+ } catch (IOException e) {
+ LOG.error("Clustering executor failed", e);
+ throw new HoodieIOException(e.getMessage(), e);
+ }
+ return true;
+ }, executor)).toArray(CompletableFuture[]::new)), executor);
+ }
+
+ public synchronized void updateWriteClient(AbstractHoodieWriteClient writeClient) {
Review comment:
I see a lot of commonality between this and AsyncCompactService. Can we please try to reuse code as much as possible?
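One possible direction, as a minimal sketch: move the pending-instant queue and the wait/signal logic into a shared helper, so that the compaction and clustering services differ only in the action they run. `PendingInstantQueue` and its method names are hypothetical (not existing Hudi classes), and the element type is left generic rather than tied to `HoodieInstant`.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper (not a Hudi class): queue plus back-pressure logic
// that both async services could share.
class PendingInstantQueue<T> {
  private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition consumed = lock.newCondition();

  void enqueue(T instant) {
    pending.add(instant);
  }

  int size() {
    return pending.size();
  }

  // Blocks the caller until at most maxPending instants remain queued.
  void waitTillPendingReducesTo(int maxPending) throws InterruptedException {
    lock.lock();
    try {
      while (pending.size() > maxPending) {
        consumed.await();
      }
    } finally {
      lock.unlock();
    }
  }

  // Returns the next instant, or null if none arrives within the timeout.
  T fetchNext(long timeout, TimeUnit unit) throws InterruptedException {
    T instant = pending.poll(timeout, unit);
    if (instant != null) {
      lock.lock();
      try {
        // Wake up any writer waiting for the queue to drain.
        consumed.signalAll();
      } finally {
        lock.unlock();
      }
    }
    return instant;
  }
}
```

Each service would then hold one such queue and keep only its own executor loop and action call.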
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -153,6 +155,11 @@ public Builder withInlineClusteringNumCommits(int numCommits) {
return this;
}
+ public Builder withAsyncClusteringNumCommits(int numCommits) {
Review comment:
Minor: withAsyncClustering**Max**Commits
##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -620,6 +641,17 @@ object HoodieSparkSqlWriter {
}
}
+ private def isAsyncClusteringEnabled(client: SparkRDDWriteClient[HoodieRecordPayload[Nothing]],
+ parameters: Map[String, String]) : Boolean = {
+ log.info(s"Config.asyncClusteringEnabled ? ${client.getConfig.isAsyncClusteringEnabled}")
+ if (asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled
Review comment:
Can we return this in one line?
asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled && parameters.get(ASYNC_CLUSTERING_ENABLE_OPT_KEY).exists(r => r.toBoolean)
##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/HoodieSparkClusteringClient.java
##########
@@ -0,0 +1,53 @@
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaRDD;
+
+import java.io.IOException;
+
+public class HoodieSparkClusteringClient<T extends HoodieRecordPayload> extends
Review comment:
Javadocs.
##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/async/SparkAsyncClusteringService.java
##########
@@ -0,0 +1,36 @@
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.client.HoodieSparkClusteringClient;
+
+public class SparkAsyncClusteringService extends AsyncClusteringService {
Review comment:
Javadocs.
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
Review comment:
Javadocs, please.
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+ protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
+ public AbstractClusteringClient(AbstractHoodieWriteClient<T, I, K, O> clusteringClient) {
+ this.clusteringClient = clusteringClient;
+ }
+
+ public abstract void cluster(HoodieInstant instant) throws IOException;
Review comment:
Here, too, we could think of something like an AsyncServiceClient.
Maybe we can have a common method as below.
```
public abstract void doAction(HoodieInstant instant) throws IOException;
```
The same abstract class could then serve both clustering and compaction.
I don't feel too strongly about this suggestion, though. Let's see what others have to say.
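A minimal sketch of that idea, under stated assumptions: `AsyncServiceClient`, the two subclasses, and `AsyncServiceRunner` are hypothetical names used only for illustration (not existing Hudi classes), and instants are plain strings here rather than `HoodieInstant`.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical common base for table-service clients (not an existing Hudi class).
abstract class AsyncServiceClient {
  // Runs the table service (compaction, clustering, ...) for one pending instant.
  abstract void doAction(String instantTime) throws IOException;
}

// Illustrative subclass; a real one would delegate to the write client.
class ClusteringServiceClient extends AsyncServiceClient {
  private final List<String> actions;

  ClusteringServiceClient(List<String> actions) {
    this.actions = actions;
  }

  @Override
  void doAction(String instantTime) throws IOException {
    actions.add("cluster:" + instantTime);
  }
}

// Illustrative subclass for compaction, recording what it was asked to do.
class CompactionServiceClient extends AsyncServiceClient {
  private final List<String> actions;

  CompactionServiceClient(List<String> actions) {
    this.actions = actions;
  }

  @Override
  void doAction(String instantTime) throws IOException {
    actions.add("compact:" + instantTime);
  }
}

// The async service only needs the base type, so one loop serves both.
class AsyncServiceRunner {
  static void runAll(AsyncServiceClient client, List<String> instants) throws IOException {
    for (String instant : instants) {
      client.doAction(instant);
    }
  }
}
```

The trade-off is a less descriptive method name (`doAction` instead of `cluster` or `compact`) in exchange for a single executor loop.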
##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -583,13 +594,23 @@ object HoodieSparkSqlWriter {
log.info(s"Compaction Scheduled is $compactionInstant")
+ val asyncClusteringEnabled = isAsyncClusteringEnabled(client, parameters)
+ val clusteringInstant: common.util.Option[java.lang.String] =
+ if (asyncClusteringEnabled) {
+ client.scheduleClustering(common.util.Option.of(new util.HashMap[String, String](mapAsJavaMap(metaMap))))
+ } else {
+ common.util.Option.empty()
+ }
+
+ log.info(s"Clustering Scheduled is $clusteringInstant")
+
Review comment:
Should we fix line 610 as well?
```
if(!asyncCompactionEnabled && !asyncClusteringEnabled) {
```
##########
File path: hudi-spark-datasource/hudi-spark/src/main/java/org/apache/hudi/async/SparkStreamingAsyncClusteringService.java
##########
@@ -0,0 +1,36 @@
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.client.HoodieSparkClusteringClient;
+
+public class SparkStreamingAsyncClusteringService extends AsyncClusteringService {
Review comment:
Javadocs, please.
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -396,10 +424,10 @@ public int hashCode() {
baseFileFormat, propsFilePath, configs, sourceClassName,
sourceOrderingField, payloadClassName, schemaProviderClassName,
transformerClassNames, sourceLimit, operation, filterDupes,
- enableHiveSync, maxPendingCompactions, continuousMode,
+ enableHiveSync, maxPendingCompactions, maxPendingClustering, continuousMode,
minSyncIntervalSeconds, sparkMaster, commitOnErrors,
deltaSyncSchedulingWeight, compactSchedulingWeight, deltaSyncSchedulingMinShare,
- compactSchedulingMinShare, forceDisableCompaction, checkpoint,
+ compactSchedulingMinShare, forceDisableCompaction, forceDisableClustering, checkpoint,
Review comment:
I see a config called compactSchedulingMinShare. Is there any need to create an equivalent one for clustering?
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+ protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
Review comment:
Again, a serialVersionUID would be good.
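For reference, a minimal sketch of the pattern, assuming a hypothetical stand-in class (`ExampleClusteringClient` is not a Hudi class): an explicit `serialVersionUID` pins the serialized form's version, and a `transient` field is skipped during serialization and comes back as `null`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a serializable client with a transient delegate.
class ExampleClusteringClient implements Serializable {
  // Explicit version id, so unrelated class edits do not change the computed default.
  private static final long serialVersionUID = 1L;

  private final String tableName;
  // transient: not written during serialization, null after deserialization.
  private transient Object writeClient;

  ExampleClusteringClient(String tableName, Object writeClient) {
    this.tableName = tableName;
    this.writeClient = writeClient;
  }

  String getTableName() {
    return tableName;
  }

  Object getWriteClient() {
    return writeClient;
  }

  // Serializes the object to bytes and reads it back.
  static ExampleClusteringClient roundTrip(ExampleClusteringClient in)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(in);
    }
    try (ObjectInputStream ois =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (ExampleClusteringClient) ois.readObject();
    }
  }
}
```

Because the write client is transient, any code path that deserializes such an object has to re-attach a client before use.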
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+ private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
Review comment:
Can we add a serialVersionUID as well?
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+ protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
+ public AbstractClusteringClient(AbstractHoodieWriteClient<T, I, K, O> clusteringClient) {
+ this.clusteringClient = clusteringClient;
+ }
+
+ public abstract void cluster(HoodieInstant instant) throws IOException;
Review comment:
Also, let's try to add Javadocs for all public methods. I do understand that AbstractCompactor does not have Javadocs, and that's fine. At least for the code we write, let's try to add Javadocs.
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -733,4 +743,15 @@ public Config getCfg() {
public Option<HoodieTimeline> getCommitTimelineOpt() {
return commitTimelineOpt;
}
+
+ /**
+ * Schedule clustering.
+ * Called from {@link HoodieDeltaStreamer} when async clustering is enabled.
+ *
+ * @return Requested clustering instant.
+ */
+ public Option<String> getClusteringInstant() {
Review comment:
Minor: getClusteringInstant**Opt**()
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
Review comment:
Minor: can you fix the Javadocs in HoodieAsyncService to also mention clustering? As of now, it looks like below.
```
Base Class for running clean/delta-sync/compaction in separate thread and controlling their life-cycle.
```
##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieStreamingSink.scala
##########
@@ -86,6 +93,11 @@ class HoodieStreamingSink(sqlContext: SQLContext,
asyncCompactorService.enqueuePendingCompaction(
new HoodieInstant(State.REQUESTED, HoodieTimeline.COMPACTION_ACTION, compactionInstantOps.get()))
}
+ if (clusteringInstant.isPresent) {
+ asyncClusteringService.enqueuePendingClustering(new HoodieInstant(
+ State.REQUESTED, HoodieTimeline.REPLACE_COMMIT_ACTION, clusteringInstant.get()
+ ))
+ }
Review comment:
In line 101, do we need to return clusteringInstant as well?
##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -380,6 +380,10 @@ object DataSourceWriteOptions {
val ASYNC_COMPACT_ENABLE_OPT_KEY = "hoodie.datasource.compaction.async.enable"
val DEFAULT_ASYNC_COMPACT_ENABLE_OPT_VAL = "true"
+ // Async Clustering - Enabled by default
+ val ASYNC_CLUSTERING_ENABLE_OPT_KEY = "hoodie.datasource.clustering.async.enable"
+ val DEFAULT_ASYNC_CLUSTERING_ENABLE_OPT_VAL = "true"
Review comment:
May I know why this is enabled by default?
[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113
Hi @codope, I just want to know whether this async clustering function can handle the following scenario without losing data:
There are 3 small file groups named fg1, fg2 and fg3, containing file slice1, file slice2 and file slice3 respectively.
While the async scheduler **has started building a clustering plan but has not finished**, there is an inflight or requested commit for fg1 that will create file slice11 based on file slice1. In other words, **file slice11 is being created but is not yet committed**. I believe this scenario is similar to the multi-writer case.
What will the async clustering function do in this case?
Will the clustering plan contain file slice1? If it does, I think the new data in file slice11 will be lost.
Looking forward to your reply, thanks a lot.
[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-876943613
> Tests coverage:
>
> * Do we have tests where both compaction and clustering are async triggered? (both spark streaming and deltastreamer) ?
Added such tests in `TestHoodieDeltaStreamer` and `TestStructuredStreaming`.
[GitHub] [hudi] nsivabalan merged pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #3142:
URL: https://github.com/apache/hudi/pull/3142
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `1.70%`.
> The diff coverage is `49.14%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@             Coverage Diff              @@
##             master    #3142      +/-   ##
============================================
- Coverage     47.67%   45.97%   -1.71%
+ Complexity     5516     4727     -789
============================================
  Files           929      832      -97
  Lines         41303    37953    -3350
  Branches       4144     3811     -333
============================================
- Hits          19692    17449    -2243
+ Misses        19863    18887     -976
+ Partials       1748     1617     -131
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `22.89% <9.52%> (-11.68%)` | :arrow_down: |
| hudicommon | `48.56% <0.00%> (-0.03%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.30% <56.66%> (-0.02%)` | :arrow_down: |
| hudisync | `54.51% <ø> (ø)` | |
| huditimelineservice | `64.07% <ø> (ø)` | |
| hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.80% <0.00%> (-0.08%)` | :arrow_down: |
| [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
| [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <36.66%> (+4.00%)` | :arrow_up: |
| [...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh) | `72.03% <72.00%> (+0.76%)` | :arrow_up: |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `71.28% <75.00%> (+0.01%)` | :arrow_up: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
| ... and [110 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661475846
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+ protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
+ public AbstractClusteringClient(AbstractHoodieWriteClient<T, I, K, O> clusteringClient) {
+ this.clusteringClient = clusteringClient;
+ }
+
+ public abstract void cluster(HoodieInstant instant) throws IOException;
Review comment:
> same abstract class for both clustering and compaction.
> Not too strong on this suggestion though. Let's see what others have to say.
@satishkotha what are your thoughts about this?
[GitHub] [hudi] nsivabalan commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867615903
@hudi-bot run azure
[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-869829116
@nsivabalan Thanks for reviewing the PR. I agree with your source code comments; there is scope for reuse. I will address them and update the PR. My responses to the high-level questions are below.
> * Now we have both clustering and compaction, I see that you have added clustering related code just after compaction where ever applicable. Is the higher priority for compaction intentional? or should we have clustering followed by compaction? or does it not matter at all.
When both clustering and compaction are enabled, compaction runs just before clustering. Compaction and clustering currently cannot run at the same time on the same file groups, and clustering can take significant time, so the compaction thread is started first. If clustering is scheduled for file groups that are under compaction, it is ignored for those file groups and picked up in a subsequent run after compaction completes.
> * I came across a class named SchedulerConfGenerator. Don't we need to make any changes here for async clustering?
We would need to make changes here only if we created a separate job pool for clustering and assigned weights to the different jobs. Unlike compaction, I did not feel the need for a separate job pool for clustering. By default, each pool gets an equal share of resources, but within each pool jobs run in FIFO order.
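As a rough illustration of the ordering described above (a minimal sketch, not the PR's code), clustering can be thought of as filtering out any file group that already has a pending compaction. `ClusteringPlanner` and `selectForClustering` are hypothetical names, and file groups are represented as plain strings rather than Hudi types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; this is not Hudi's scheduling code. */
final class ClusteringPlanner {
  private ClusteringPlanner() {}

  /**
   * Returns the candidate file groups that are not under compaction, in their
   * original order. Skipped file groups remain candidates for a later run.
   */
  static List<String> selectForClustering(List<String> candidates, Set<String> underCompaction) {
    List<String> selected = new ArrayList<>();
    for (String fileGroup : candidates) {
      if (!underCompaction.contains(fileGroup)) {
        selected.add(fileGroup);
      }
    }
    return selected;
  }
}
```

Because the skipped file groups are simply left out of this round, they become eligible again once compaction on them completes.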
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (890e982) into [master](https://codecov.io/gh/apache/hudi/commit/047d956e01b6d7c92320686d8321b2bbe9d2188e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (047d956) will **decrease** coverage by `41.17%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
============================================
- Coverage 44.03% 2.86% -41.18%
+ Complexity 5100 85 -5015
============================================
Files 930 283 -647
Lines 41275 11710 -29565
Branches 4138 961 -3177
============================================
- Hits 18174 335 -17839
+ Misses 21487 11349 -10138
+ Partials 1614 26 -1588
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.11% <0.00%> (-0.14%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [723 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [047d956...890e982](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r663254225
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+/**
+ * Async clustering service that runs in a separate thread.
+ * Currently, only one clustering thread is allowed to run at any time.
+ */
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+ private static final long serialVersionUID = 1L;
+ private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+ private final int maxConcurrentClustering;
+ private transient AbstractClusteringClient clusteringClient;
+ private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+ private transient ReentrantLock queueLock = new ReentrantLock();
Review comment:
let's sync up on this. I feel we can take these vars also one level up and reuse across clustering and compaction.
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
##########
@@ -165,4 +169,49 @@ private void monitorThreads(Function<Boolean, Boolean> onShutdownCallback) {
public boolean isRunInDaemonMode() {
return runInDaemonMode;
}
+
+ /**
+ * Wait till outstanding pending compaction/clustering reduces to the passed in value.
+ *
+ * @param numPending Maximum pending compactions/clustering allowed
+ * @param pendingInstants Currently enqueued pending compaction/clustering instants
+ * @param queueLock
Review comment:
java docs on params as well :)
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -296,6 +297,11 @@ public static void main(String[] args) throws IOException {
+ "outstanding compactions is less than this number")
public Integer maxPendingCompactions = 5;
+ @Parameter(names = {"--max-pending-clustering"},
Review comment:
Out of curiosity: how come maxPendingCompaction/Clustering is defined as a first-class config for multiTableDeltaStreamer, but I don't see the properties to enable/disable them? I assume those are fetched from the property file for each source. So why not fetch these properties from the property file as well? I know this is not specific to clustering; it is how compaction was defined.
##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
##########
@@ -165,4 +169,49 @@ private void monitorThreads(Function<Boolean, Boolean> onShutdownCallback) {
public boolean isRunInDaemonMode() {
return runInDaemonMode;
}
+
+ /**
+ * Wait till outstanding pending compaction/clustering reduces to the passed in value.
+ *
+ * @param numPending Maximum pending compactions/clustering allowed
+ * @param pendingInstants Currently enqueued pending compaction/clustering instants
+ * @param queueLock
+ * @param consumed
+ * @throws InterruptedException
+ */
+ public void waitTillPendingActionReducesTo(int numPending, BlockingQueue<HoodieInstant> pendingInstants,
+ ReentrantLock queueLock, Condition consumed) throws InterruptedException {
+ try {
+ queueLock.lock();
+ while (!isShutdown() && (pendingInstants.size() > numPending)) {
+ consumed.await();
+ }
+ } finally {
+ queueLock.unlock();
+ }
+ }
+
+ /**
+ * Fetch Next pending compaction/clustering instant if available.
+ *
+ * @param pendingInstants Currently enqueued pending compaction/clustering instants
+ * @param queueLock
+ * @param consumed
Review comment:
same here. also "return" as well
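The wait-and-fetch pattern in the hunk above can be sketched on its own as follows. This is a simplified stand-in under stated assumptions, not the PR's implementation: `PendingInstantQueue` is a hypothetical class, instants are plain strings instead of `HoodieInstant`, there is no shutdown check, and the fetch is non-blocking.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the shared queue state; instants are plain strings here. */
final class PendingInstantQueue {
  private final BlockingQueue<String> pendingInstants = new LinkedBlockingQueue<>();
  private final ReentrantLock queueLock = new ReentrantLock();
  private final Condition consumed = queueLock.newCondition();

  /** Adds an instant to the tail of the queue. */
  void enqueue(String instant) {
    pendingInstants.add(instant);
  }

  /** Removes and returns the oldest instant, or null if none is queued; wakes any waiter. */
  String fetchNext() {
    String instant = pendingInstants.poll();
    if (instant != null) {
      queueLock.lock();
      try {
        consumed.signalAll();
      } finally {
        queueLock.unlock();
      }
    }
    return instant;
  }

  /** Blocks until at most numPending instants remain; returns false if interrupted. */
  boolean waitTillPendingReducesTo(int numPending) {
    queueLock.lock(); // taken before the try so that finally never unlocks an unheld lock
    try {
      while (pendingInstants.size() > numPending) {
        consumed.await();
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    } finally {
      queueLock.unlock();
    }
  }

  /** Number of instants still queued. */
  int size() {
    return pendingInstants.size();
  }
}
```

Keeping the queue, lock, and condition in one object is one way to share them between compaction and clustering, which is the direction the review comments point toward.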
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -616,6 +651,7 @@ public DeltaSync getDeltaSync() {
}
} finally {
shutdownCompactor(error);
+ shutdownClusteringService(error);
Review comment:
Can we have a single method, called shutdownAsyncServices or backgroundServices, and shut down all such services within that method?
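One possible shape for that suggestion, as a minimal sketch: a small registry that stops every background service in one call. `ShutdownableService`, `BackgroundServices`, and `shutdownAll` are hypothetical names introduced for illustration; the real services would be the compaction and clustering services from the PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal view of an async service; only the shutdown hook matters here. */
interface ShutdownableService {
  void shutdown(boolean force);
}

/** Hypothetical helper that owns every background service and stops them together. */
final class BackgroundServices {
  private final List<ShutdownableService> services = new ArrayList<>();

  /** Registers a service; null is ignored so disabled services need no special casing. */
  void register(ShutdownableService service) {
    if (service != null) {
      services.add(service);
    }
  }

  /** Shuts down every registered service, even if an earlier one throws. */
  void shutdownAll(boolean error) {
    RuntimeException first = null;
    for (ShutdownableService service : services) {
      try {
        service.shutdown(error);
      } catch (RuntimeException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

With this shape, the `finally` block would make a single call, and adding another async service later would not require touching the shutdown path.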
##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieStreamingSink.scala
##########
@@ -86,6 +93,11 @@ class HoodieStreamingSink(sqlContext: SQLContext,
asyncCompactorService.enqueuePendingCompaction(
new HoodieInstant(State.REQUESTED, HoodieTimeline.COMPACTION_ACTION, compactionInstantOps.get()))
}
+ if (clusteringInstant.isPresent) {
+ asyncClusteringService.enqueuePendingClustering(new HoodieInstant(
+ State.REQUESTED, HoodieTimeline.REPLACE_COMMIT_ACTION, clusteringInstant.get()
+ ))
+ }
Review comment:
I know we synced up f2f on this, but for bookkeeping purposes, can you leave a comment as if you haven't addressed my feedback?
##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java
##########
@@ -150,6 +152,9 @@ public static HoodieWriteConfig createHoodieConfig(String schemaStr, String base
.withCompactionConfig(HoodieCompactionConfig.newBuilder()
.withPayloadClass(parameters.get(DataSourceWriteOptions.PAYLOAD_CLASS_OPT_KEY().key()))
.withInlineCompaction(inlineCompact).build())
+ .withClusteringConfig(HoodieClusteringConfig.newBuilder()
+ .withInlineClustering(!asyncClusteringEnabled)
Review comment:
Isn't this supposed to be INLINE_CLUSTERING_PROP (hoodie.clustering.inline)? What happens if this existing property and the new property (hoodie.datasource.clustering.async.enable) contradict each other?
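One way to handle such a contradiction, sketched here purely as an illustration, is to read both keys and reject a configuration that enables both modes. This is not what the PR does; `ClusteringModeResolver` and `resolveInline` are hypothetical names, and treating a missing key as `false` is an assumption.

```java
import java.util.Map;

/** Hypothetical validation helper; not part of the PR. */
final class ClusteringModeResolver {
  static final String INLINE_KEY = "hoodie.clustering.inline";
  static final String ASYNC_KEY = "hoodie.datasource.clustering.async.enable";

  private ClusteringModeResolver() {}

  /**
   * Returns true when clustering should run inline.
   * Assumes a missing key means false and rejects configs that enable both modes.
   */
  static boolean resolveInline(Map<String, String> props) {
    boolean inline = Boolean.parseBoolean(props.getOrDefault(INLINE_KEY, "false"));
    boolean async = Boolean.parseBoolean(props.getOrDefault(ASYNC_KEY, "false"));
    if (inline && async) {
      throw new IllegalArgumentException(
          "Only one of " + INLINE_KEY + " and " + ASYNC_KEY + " may be true");
    }
    return inline;
  }
}
```

Failing fast is only one option; letting one key take precedence and logging a warning would be an equally plausible policy.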
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823)
* 7d4f9812e05062092c33bb5a4718ed25aa29bac6 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] codecov-commenter commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4a651f8) into [master](https://codecov.io/gh/apache/hudi/commit/380518e232b883bb12579b3e98283659b464285a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (380518e) will **decrease** coverage by `41.04%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@             Coverage Diff              @@
##             master   #3142       +/-   ##
============================================
- Coverage     44.15%   3.10%   -41.05%
+ Complexity     4537      82     -4455
============================================
  Files           819     276      -543
  Lines         36171   10762    -25409
  Branches       3920    1101     -2819
============================================
- Hits          15970     334    -15636
+ Misses        18467   10402     -8065
+ Partials       1734      26     -1708
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-16.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `6.72% <ø> (-45.01%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `9.34% <0.00%> (-49.07%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-25.50%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-17.01%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.19%)` | :arrow_down: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [658 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [380518e...4a651f8](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
"triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"triggerType" : "PUSH"
}, {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
"triggerID" : "867615903",
"triggerType" : "MANUAL"
}, {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
"triggerID" : "867624272",
"triggerType" : "MANUAL"
}, {
"hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
"triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
"triggerType" : "PUSH"
}, {
"hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
"triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
"triggerType" : "PUSH"
}, {
"hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
"triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
"triggerType" : "PUSH"
}, {
"hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
"triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
"triggerType" : "PUSH"
}, {
"hash" : "890e9822855fcd45c8387f83740975f43474cddc",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
"triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
"triggerType" : "PUSH"
}, {
"hash" : "890e9822855fcd45c8387f83740975f43474cddc",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
"triggerID" : "877110964",
"triggerType" : "MANUAL"
}, {
"hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
"status" : "CANCELED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
"triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
"triggerType" : "PUSH"
}, {
"hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
"triggerID" : "877330225",
"triggerType" : "MANUAL"
}, {
"hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
"status" : "CANCELED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
"triggerID" : "877330225",
"triggerType" : "MANUAL"
}, {
"hash" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849",
"triggerID" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823)
* 7d4f9812e05062092c33bb5a4718ed25aa29bac6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849)
[GitHub] [hudi] zhangyue19921010 commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113
Hi @codope, I'd like to know whether this async clustering function can handle the following scenario without losing data:
There are three small file groups, fg1, fg2 and fg3, containing file slice1, file slice2 and file slice3 respectively.
When the async scheduler **has started building a clustering plan but has not finished**, there is an inflight or requested commit for fg1 that will create file slice11 based on file slice1. In other words, **file slice11 is being created but is not yet committed**. I believe this scenario is similar to the multi-writer case.
What will the async clustering function do in this case?
Will the clustering plan contain file slice1? If it does, I think the new data in file slice11 will be lost.
Looking forward to your reply, thanks a lot.
[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113
Hi @codope, I'd like to know whether this async clustering function can handle the following scenario without losing data:
There are three small file groups, fg1, fg2 and fg3, containing file slice1, file slice2 and file slice3 respectively.
When the async scheduler **has started building a clustering plan but has not finished**, there is an inflight or requested commit for fg1 that will create file slice11 based on file slice1. In other words, **file slice11 is being created but is not yet committed**. I believe this scenario is similar to the multi-writer case.
What will the async clustering function do in this case?
Will the clustering plan contain file slice1? If it does, I think the new data in file slice11 will be lost.
Looking forward to your reply, thanks a lot.
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
"triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"triggerType" : "PUSH"
}, {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
"triggerID" : "867615903",
"triggerType" : "MANUAL"
}, {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
"triggerID" : "867624272",
"triggerType" : "MANUAL"
}, {
"hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
"triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
"triggerType" : "PUSH"
}, {
"hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
"triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
"triggerType" : "PUSH"
}, {
"hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
"triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
"triggerType" : "PUSH"
}, {
"hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
"triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
"triggerType" : "PUSH"
}, {
"hash" : "890e9822855fcd45c8387f83740975f43474cddc",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
"triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
"triggerType" : "PUSH"
}, {
"hash" : "890e9822855fcd45c8387f83740975f43474cddc",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
"triggerID" : "877110964",
"triggerType" : "MANUAL"
} ]
}-->
## CI report:
* 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818)
[GitHub] [hudi] hudi-bot commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "UNKNOWN",
"url" : "TBD",
"triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 UNKNOWN
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
<!--
Meta data
{
"version" : 1,
"metaDataEntries" : [ {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
"triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"triggerType" : "PUSH"
}, {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
"triggerID" : "867615903",
"triggerType" : "MANUAL"
}, {
"hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
"status" : "DELETED",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
"triggerID" : "867624272",
"triggerType" : "MANUAL"
}, {
"hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
"status" : "FAILURE",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
"triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
"triggerType" : "PUSH"
}, {
"hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
"status" : "PENDING",
"url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
"triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
"triggerType" : "PUSH"
} ]
}-->
## CI report:
* 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568)
* cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659)
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r664562160
##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -616,6 +651,7 @@ public DeltaSync getDeltaSync() {
}
} finally {
shutdownCompactor(error);
+ shutdownClusteringService(error);
Review comment:
Done.
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cd27818) into [master](https://codecov.io/gh/apache/hudi/commit/70d9c2e7473154178bafaef40a15fc3e10a9df6a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (70d9c2e) will **decrease** coverage by `12.61%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@             Coverage Diff              @@
##             master   #3142       +/-   ##
============================================
- Coverage     15.48%   2.86%   -12.62%
+ Complexity      478      82      -396
============================================
  Files           280     282        +2
  Lines         11548   11650      +102
  Branches        945     954        +9
============================================
- Hits           1788     334     -1454
- Misses         9601   11290     +1689
+ Partials        159      26      -133
```
| Flag | Coverage Δ | |
|---|---|---|
| hudiclient | `0.00% <0.00%> (ø)` | |
| hudisync | `5.38% <ø> (ø)` | |
| hudiutilities | `9.16% <0.00%> (-48.87%)` | :arrow_down: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.05%)` | :arrow_down: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
| [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [48 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [70d9c2e...cd27818](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744)
* cea6548b53a242e93e861401564a5bb55d317f24 UNKNOWN
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744)
[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661470750
##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -583,13 +594,23 @@ object HoodieSparkSqlWriter {
  log.info(s"Compaction Scheduled is $compactionInstant")
+ val asyncClusteringEnabled = isAsyncClusteringEnabled(client, parameters)
+ val clusteringInstant: common.util.Option[java.lang.String] =
+   if (asyncClusteringEnabled) {
+     client.scheduleClustering(common.util.Option.of(new util.HashMap[String, String](mapAsJavaMap(metaMap))))
+   } else {
+     common.util.Option.empty()
+   }
+
+ log.info(s"Clustering Scheduled is $clusteringInstant")
+
Review comment:
Good catch. Done.
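
For readers skimming the thread, the control flow added in the hunk above can be sketched in plain Java. This is only an illustrative sketch, not the code in the PR: it uses `java.util.Optional` in place of Hudi's `org.apache.hudi.common.util.Option`, and `ClusteringScheduleSketch`, `scheduleIfEnabled`, and the supplier standing in for `client.scheduleClustering(...)` are hypothetical names rather than Hudi APIs.

```java
import java.util.Optional;
import java.util.function.Supplier;

class ClusteringScheduleSketch {

    // Hypothetical helper mirroring the if/else in the diff: call the scheduler
    // only when async clustering is enabled; otherwise return an empty Optional
    // without invoking the scheduler at all.
    static Optional<String> scheduleIfEnabled(boolean asyncClusteringEnabled,
                                              Supplier<Optional<String>> scheduler) {
        return asyncClusteringEnabled ? scheduler.get() : Optional.empty();
    }
}
```

Calling `scheduleIfEnabled(true, scheduler)` returns whatever the scheduler produces (possibly empty, if nothing was scheduled), while `scheduleIfEnabled(false, scheduler)` returns an empty `Optional` and never touches the scheduler.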
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568)
* cd2781856c66d1b8527b12f29c99f3faf31e3fdb UNKNOWN
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f86f50e) into [master](https://codecov.io/gh/apache/hudi/commit/371526789d663dee85041eb31c27c52c81ef87ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3715267) will **increase** coverage by `0.12%`.
> The diff coverage is `34.84%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@             Coverage Diff              @@
##             master    #3142      +/-   ##
============================================
+ Coverage     27.40%   27.53%   +0.12%
- Complexity     1287     1291       +4
============================================
  Files           381      385       +4
  Lines         15108    15214     +106
  Branches       1305     1316      +11
============================================
+ Hits           4141     4189      +48
- Misses        10667    10722      +55
- Partials        300      303       +3
```
| Flag | Coverage Δ | |
|---|---|---|
| hudiclient | `20.93% <0.00%> (-0.12%)` | :arrow_down: |
| hudisync | `5.37% <ø> (ø)` | |
| hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...apache/hudi/async/SparkAsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXN5bmMvU3BhcmtBc3luY0NsdXN0ZXJpbmdTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
| [...pache/hudi/client/HoodieSparkClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVNwYXJrQ2x1c3RlcmluZ0NsaWVudC5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...ion/cluster/SparkClusteringPlanActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NsdXN0ZXIvU3BhcmtDbHVzdGVyaW5nUGxhbkFjdGlvbkV4ZWN1dG9yLmphdmE=) | `60.00% <0.00%> (-15.00%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
| ... and [8 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3715267...f86f50e](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744)
* cea6548b53a242e93e861401564a5bb55d317f24 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413)
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413)
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0138dad) into [master](https://codecov.io/gh/apache/hudi/commit/07e93de8b49560eee23237817fc24fbe763f2891?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (07e93de) will **decrease** coverage by `29.48%`.
> The diff coverage is `42.24%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@              Coverage Diff              @@
##             master    #3142       +/-   ##
=============================================
- Coverage     46.23%   16.74%   -29.49%
+ Complexity     5396      482     -4914
=============================================
  Files           921      282      -639
  Lines         40085    10992    -29093
  Branches       4298     1120     -3178
=============================================
- Hits          18533     1841    -16692
+ Misses        19665     8989    -10676
+ Partials       1887      162     -1725
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `?` | |
| hudiclient | `0.00% <0.00%> (-30.46%)` | :arrow_down: |
| hudicommon | `?` | |
| hudiflink | `?` | |
| hudihadoopmr | `?` | |
| hudisparkdatasource | `?` | |
| hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.38% <89.09%> (+0.79%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-25.50%)` | :arrow_down: |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-17.01%)` | :arrow_down: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.38% <75.00%> (+0.43%)` | :arrow_up: |
| [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.25% <90.24%> (+2.41%)` | :arrow_up: |
| [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `77.01% <100.00%> (+0.82%)` | :arrow_up: |
| [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
| ... and [716 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [07e93de...0138dad](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413)
* 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568)
<details>
<summary>Bot commands</summary>
@hudi-bot supports the following commands:
- `@hudi-bot run travis` re-run the last Travis build
- `@hudi-bot run azure` re-run the last Azure build
</details>
--
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818)
--
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 7d4f9812e05062092c33bb5a4718ed25aa29bac6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849)
--
[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369
# [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `3.41%`.
> The diff coverage is `49.14%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
```diff
@@ Coverage Diff @@
## master #3142 +/- ##
============================================
- Coverage 47.67% 44.26% -3.42%
+ Complexity 5516 4478 -1038
============================================
Files 929 822 -107
Lines 41303 37123 -4180
Branches 4144 3758 -386
============================================
- Hits 19692 16432 -3260
+ Misses 19863 19169 -694
+ Partials 1748 1522 -226
```
| Flag | Coverage Δ | |
|---|---|---|
| hudicli | `39.97% <ø> (ø)` | |
| hudiclient | `22.89% <9.52%> (-11.68%)` | :arrow_down: |
| hudicommon | `48.56% <0.00%> (-0.03%)` | :arrow_down: |
| hudiflink | `60.03% <ø> (ø)` | |
| hudihadoopmr | `51.29% <ø> (ø)` | |
| hudisparkdatasource | `67.30% <56.66%> (-0.02%)` | :arrow_down: |
| hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
| huditimelineservice | `?` | |
| hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
| [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
| [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
| [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.80% <0.00%> (-0.08%)` | :arrow_down: |
| [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
| [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <36.66%> (+4.00%)` | :arrow_up: |
| [...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh) | `72.03% <72.00%> (+0.76%)` | :arrow_up: |
| [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `71.28% <75.00%> (+0.01%)` | :arrow_up: |
| [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
| ... and [138 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
--
[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072
## CI report:
* 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413)
* 0138dadc9c21dc753582505e03a139efe1f63884 UNKNOWN
--
[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-879223675
> Hi @codope, I just want to know: can this async clustering function handle the following scenario without losing data?
>
> There are 3 small file groups named fg1, fg2 and fg3, containing file slice1, file slice2 and file slice3 respectively.
>
> When async scheduling **has started to make a clustering plan but has not finished**, there is an inflight or requested commit for fg1 which will create file slice11 based on file slice1. In other words, **file slice11 is being created but is not yet committed**. I believe this scenario is similar to multi-writer.
>
> What will this async clustering function do?
> Will this clustering plan contain file slice1? If it does, I think the new data in file slice11 will be lost.
>
> Looking forward to your reply, thanks a lot.
@zhangyue19921010 It depends on the point during clustering planning at which file slice11 is created. If it is created before `ClusteringPlanStrategy#getFileSlicesEligibleForClustering` is invoked, then the clustering plan will not contain file slice1. So, just like with multiple writers, there is a race condition here. However, while actually clustering, the default (and currently only) strategy is to reject updates. So it will throw an exception on seeing that there is a file group with an update (in this case fg1). This should get picked up in the next run of clustering.
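To make the reject-updates behaviour concrete, here is a minimal sketch of the idea: compare the file groups in the clustering plan with the file groups that have updates, and fail on any overlap. This is a simplified model, not the actual Hudi code; `RejectUpdatesSketch`, `UpdateRejectedException` and `rejectIfOverlapping` are hypothetical names used only for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

/**
 * Simplified model of a "reject updates" check. All names here are
 * hypothetical and are not Hudi's actual classes or methods.
 */
public class RejectUpdatesSketch {

    /** Hypothetical exception raised when an update collides with clustering. */
    public static class UpdateRejectedException extends RuntimeException {
        public UpdateRejectedException(String message) {
            super(message);
        }
    }

    /**
     * Throws if any file group selected for clustering also has an update.
     *
     * @param fileGroupsInPlan     ids of file groups the clustering plan covers
     * @param fileGroupsWithUpdate ids of file groups that received updates
     */
    public static void rejectIfOverlapping(Set<String> fileGroupsInPlan,
                                           Set<String> fileGroupsWithUpdate) {
        // Intersect the two sets; TreeSet keeps the message ordering stable.
        Set<String> conflicts = new TreeSet<>(fileGroupsInPlan);
        conflicts.retainAll(fileGroupsWithUpdate);
        if (!conflicts.isEmpty()) {
            throw new UpdateRejectedException(
                "Updates not allowed for file groups pending clustering: " + conflicts);
        }
    }

    public static void main(String[] args) {
        Set<String> plan = Set.of("fg1", "fg2", "fg3");
        try {
            // fg1 is in the plan and also has a new file slice (slice11).
            rejectIfOverlapping(plan, Set.of("fg1"));
        } catch (UpdateRejectedException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        // No overlap: the check passes without throwing.
        rejectIfOverlapping(Set.of("fg2", "fg3"), Set.of("fg1"));
        System.out.println("No conflict for fg2, fg3");
    }
}
```

In the scenario from the question, fg1 appears in both sets, so the check fails instead of silently dropping the data in file slice11. If the plan had been built after slice11 became visible, fg1 would not be in the plan and the check would pass.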
--
[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming
Posted by GitBox <gi...@apache.org>.
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113
Hi @codope, I just want to know: can this async clustering function handle the following scenario without losing data?
There are 3 small file groups named fg1, fg2 and fg3, containing file slice1, file slice2 and file slice3 respectively.
When async scheduling **has started to make a clustering plan but has not finished**, there is an inflight or requested commit for fg1 which will create file slice11 based on file slice1. In other words, **file slice11 is being created but is not yet committed**. I believe this scenario is similar to multi-writer.
What will this async clustering function do?
Will this clustering plan contain file slice1? If it does, I think the new data in file slice11 will be lost.
Looking forward to your reply, thanks a lot.
--