You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/06/23 16:38:35 UTC

[GitHub] [hudi] codope opened a new pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

codope opened a new pull request #3142:
URL: https://github.com/apache/hudi/pull/3142


   ## What is the purpose of the pull request
   
   This pull request adds async clustering support for HoodieDeltaStreamer and Spark streaming writes to Hudi table.
   
   ## Brief change log
   
     - Async clustering is configurable.
     - Reuses the existing clustering methods in write client and timeline.
     - Hence, it has the same limitations that exist with current clustering strategy i.e. updates are rejected.
   
   ## Verify this pull request
   
   This change added tests and can be verified as follows:
   
     - Added unit tests in `TestHoodieDeltaStreamer` and `TestStructuredStreaming`.
     - Manually verified the change by running a job locally.
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `31.75%`.
   > The diff coverage is `40.35%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3142       +/-   ##
   =============================================
   - Coverage     47.67%   15.91%   -31.76%     
   + Complexity     5516      493     -5023     
   =============================================
     Files           929      283      -646     
     Lines         41303    11710    -29593     
     Branches       4144      961     -3183     
   =============================================
   - Hits          19692     1864    -17828     
   + Misses        19863     9683    -10180     
   + Partials       1748      163     -1585     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.58%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.08% <92.50%> (+2.25%)` | :arrow_up: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+0.41%)` | :arrow_up: |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [724 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818) 
   * f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r664561678



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+/**
+ * Async clustering service that runs in a separate thread.
+ * Currently, only one clustering thread is allowed to run at any time.
+ */
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+  private static final long serialVersionUID = 1L;
+  private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+  private final int maxConcurrentClustering;
+  private transient AbstractClusteringClient clusteringClient;
+  private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+  private transient ReentrantLock queueLock = new ReentrantLock();

Review comment:
       As discussed, moved the common methods and variables to `HoodieAsyncService`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cea6548b53a242e93e861401564a5bb55d317f24 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812) 
   * 890e9822855fcd45c8387f83740975f43474cddc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cd27818) into [master](https://codecov.io/gh/apache/hudi/commit/70d9c2e7473154178bafaef40a15fc3e10a9df6a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (70d9c2e) will **decrease** coverage by `31.72%`.
   > The diff coverage is `38.26%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3142       +/-   ##
   =============================================
   - Coverage     47.51%   15.78%   -31.73%     
   + Complexity     5427      482     -4945     
   =============================================
     Files           922      282      -640     
     Lines         40966    11650    -29316     
     Branches       4100      954     -3146     
   =============================================
   - Hits          19463     1839    -17624     
   + Misses        19784     9650    -10134     
   + Partials       1719      161     -1558     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `58.76% <89.79%> (+0.74%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.79%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.47% <75.00%> (+0.43%)` | :arrow_up: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.34% <92.10%> (+2.50%)` | :arrow_up: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+0.41%)` | :arrow_up: |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [718 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [70d9c2e...cd27818](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659) 
   * 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `44.81%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3142       +/-   ##
   ============================================
   - Coverage     47.67%   2.86%   -44.82%     
   + Complexity     5516      85     -5431     
   ============================================
     Files           929     283      -646     
     Lines         41303   11710    -29593     
     Branches       4144     961     -3183     
   ============================================
   - Hits          19692     335    -19357     
   + Misses        19863   11349     -8514     
   + Partials       1748      26     -1722     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.58%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.11% <0.00%> (-49.46%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.15%)` | :arrow_down: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [770 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f86f50e) into [master](https://codecov.io/gh/apache/hudi/commit/371526789d663dee85041eb31c27c52c81ef87ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3715267) will **decrease** coverage by `24.54%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3142       +/-   ##
   ============================================
   - Coverage     27.40%   2.86%   -24.55%     
   + Complexity     1287      85     -1202     
   ============================================
     Files           381     283       -98     
     Lines         15108   11710     -3398     
     Branches       1305     961      -344     
   ============================================
   - Hits           4141     335     -3806     
   - Misses        10667   11349      +682     
   + Partials        300      26      -274     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `0.00% <0.00%> (-21.06%)` | :arrow_down: |
   | hudisync | `5.37% <ø> (ø)` | |
   | hudiutilities | `9.11% <0.00%> (-49.46%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.15%)` | :arrow_down: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [148 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3715267...f86f50e](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r665443646



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
##########
@@ -165,4 +176,51 @@ private void monitorThreads(Function<Boolean, Boolean> onShutdownCallback) {
   public boolean isRunInDaemonMode() {
     return runInDaemonMode;
   }
+
+  /**
+   * Wait till outstanding pending compaction/clustering reduces to the passed in value.
+   *
+   * @param numPending Maximum pending compactions/clustering allowed
+   * @throws InterruptedException
+   */
+  public void waitTillPendingActionReducesTo(int numPending) throws InterruptedException {
+    try {
+      queueLock.lock();
+      while (!isShutdown() && (pendingInstants.size() > numPending)) {
+        consumed.await();
+      }
+    } finally {
+      queueLock.unlock();
+    }
+  }
+
+  /**
+   * Enqueues new pending clustering instant.
+   * @param instant {@link HoodieInstant} to enqueue.
+   */
+  public void enqueuePendingAction(HoodieInstant instant) {
+    LOG.info("Enqueuing new pending clustering instant: " + instant.getTimestamp());
+    pendingInstants.add(instant);
+  }
+
+  /**
+   * Fetch next pending compaction/clustering instant if available.
+   *
+   * @return {@link HoodieInstant} corresponding to the next pending compaction/clustering.
+   * @throws InterruptedException
+   */
+  HoodieInstant fetchNextActionInstant() throws InterruptedException {

Review comment:
       nit. fetchNextAsyncServiceInstant

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -296,6 +297,11 @@ public static void main(String[] args) throws IOException {
         + "outstanding compactions is less than this number")
     public Integer maxPendingCompactions = 5;
 
+    @Parameter(names = {"--max-pending-clustering"},

Review comment:
       @pratyakshsharma ^ 

##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestStructuredStreaming.scala
##########
@@ -207,12 +209,26 @@ class TestStructuredStreaming extends HoodieClientTestBase {
       metaClient.reloadActiveTimeline()
       assertEquals(1, getLatestFileGroupsFileId(HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH).size)
     }
-    structuredStreamingForTestClusteringRunner(sourcePath, destPath, true,
+    structuredStreamingForTestClusteringRunner(sourcePath, destPath, true, false,
       HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH, checkClusteringResult)
   }
 
   @Test
-  def testStructuredStreamingWithoutInlineClustering(): Unit = {
+  def testStructuredStreamingWithAsyncClustering(): Unit = {

Review comment:
       not sure if we this will be too hard to achieve. Is there a way to simulate resource unavailability. i.e. when async clustering is scheduled, no resources to schedule right away. But after you open up resources in your test, async clustering should get triggered. basically to validate that a pending async clustering should get triggered when resources become available and not get cancelled.

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -473,6 +473,11 @@ object DataSourceWriteOptions {
     .defaultValue("true")
     .withDocumentation("")
 
+  val ASYNC_CLUSTERING_ENABLE_OPT_KEY: ConfigProperty[String] = ConfigProperty
+    .key("hoodie.datasource.clustering.async.enable")
+    .defaultValue("false")

Review comment:
       can we set the min version as well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659) 
   * 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661474341



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -396,10 +424,10 @@ public int hashCode() {
               baseFileFormat, propsFilePath, configs, sourceClassName,
               sourceOrderingField, payloadClassName, schemaProviderClassName,
               transformerClassNames, sourceLimit, operation, filterDupes,
-              enableHiveSync, maxPendingCompactions, continuousMode,
+              enableHiveSync, maxPendingCompactions, maxPendingClustering, continuousMode,
               minSyncIntervalSeconds, sparkMaster, commitOnErrors,
               deltaSyncSchedulingWeight, compactSchedulingWeight, deltaSyncSchedulingMinShare,
-              compactSchedulingMinShare, forceDisableCompaction, checkpoint,
+              compactSchedulingMinShare, forceDisableCompaction, forceDisableClustering, checkpoint,

Review comment:
       This config will be required only if we create a separate job pool for clustering and assign weights. As mentioned [here](https://github.com/apache/hudi/pull/3142#issuecomment-869829116) we are not creating a separate pool.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2d1f238) into [master](https://codecov.io/gh/apache/hudi/commit/6e2443468280dfd768afe9ac4d17df7fbbbb51bf?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (6e24434) will **decrease** coverage by `44.76%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3142       +/-   ##
   ============================================
   - Coverage     47.61%   2.85%   -44.77%     
   + Complexity     5487      82     -5405     
   ============================================
     Files           924     282      -642     
     Lines         41206   11705    -29501     
     Branches       4134     959     -3175     
   ============================================
   - Hits          19621     334    -19287     
   + Misses        19843   11345     -8498     
   + Partials       1742      26     -1716     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.28% <ø> (-49.20%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.12% <0.00%> (-49.48%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.15%)` | :arrow_down: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [767 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [6e24434...2d1f238](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f86f50e) into [master](https://codecov.io/gh/apache/hudi/commit/371526789d663dee85041eb31c27c52c81ef87ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3715267) will **decrease** coverage by `11.49%`.
   > The diff coverage is `40.35%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3142       +/-   ##
   =============================================
   - Coverage     27.40%   15.91%   -11.50%     
   + Complexity     1287      493      -794     
   =============================================
     Files           381      283       -98     
     Lines         15108    11710     -3398     
     Branches       1305      961      -344     
   =============================================
   - Hits           4141     1864     -2277     
   + Misses        10667     9683      -984     
   + Partials        300      163      -137     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `0.00% <0.00%> (-21.06%)` | :arrow_down: |
   | hudisync | `5.37% <ø> (ø)` | |
   | hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.08% <92.50%> (+2.25%)` | :arrow_up: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+0.41%)` | :arrow_up: |
   | [...n/bulkinsert/PartitionSortPartitionerWithRows.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvZXhlY3V0aW9uL2J1bGtpbnNlcnQvUGFydGl0aW9uU29ydFBhcnRpdGlvbmVyV2l0aFJvd3MuamF2YQ==) | | |
   | ... and [102 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3715267...f86f50e](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r666691523



##########
File path: hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestStructuredStreaming.scala
##########
@@ -207,12 +209,26 @@ class TestStructuredStreaming extends HoodieClientTestBase {
       metaClient.reloadActiveTimeline()
       assertEquals(1, getLatestFileGroupsFileId(HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH).size)
     }
-    structuredStreamingForTestClusteringRunner(sourcePath, destPath, true,
+    structuredStreamingForTestClusteringRunner(sourcePath, destPath, true, false,
       HoodieTestDataGenerator.DEFAULT_FIRST_PARTITION_PATH, checkClusteringResult)
   }
 
   @Test
-  def testStructuredStreamingWithoutInlineClustering(): Unit = {
+  def testStructuredStreamingWithAsyncClustering(): Unit = {

Review comment:
       I will try to simulate this as part of integration test. https://issues.apache.org/jira/browse/HUDI-1077




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cea6548b53a242e93e861401564a5bb55d317f24 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812) 
   * 890e9822855fcd45c8387f83740975f43474cddc UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744) 
   * cea6548b53a242e93e861401564a5bb55d317f24 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812) 
   * 890e9822855fcd45c8387f83740975f43474cddc UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-877330225


   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818) 
   * f86f50e817a625dc30f35a39b7495a4f359e4da5 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867624272


   @hudi-bot run travis


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661474972



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+  private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+  private final int maxConcurrentClustering;
+  private transient AbstractClusteringClient clusteringClient;
+  private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+  private transient ReentrantLock queueLock = new ReentrantLock();
+  private transient Condition consumed = queueLock.newCondition();
+
+  public AsyncClusteringService(AbstractHoodieWriteClient writeClient) {
+    this(writeClient, false);
+  }
+
+  public AsyncClusteringService(AbstractHoodieWriteClient writeClient, boolean runInDaemonMode) {
+    super(runInDaemonMode);
+    this.clusteringClient = createClusteringClient(writeClient);
+    this.maxConcurrentClustering = 1;
+  }
+
+  protected abstract AbstractClusteringClient createClusteringClient(AbstractHoodieWriteClient client);
+
+  public void enqueuePendingClustering(HoodieInstant instant) {
+    LOG.info("Enqueuing new pending clustering instant: " + instant.getTimestamp());
+    pendingClustering.add(instant);
+  }
+
+  public void waitTillPendingClusteringReducesTo(int numPendingClustering) throws InterruptedException {
+    try {
+      queueLock.lock();
+      while (!isShutdown() && (pendingClustering.size() > numPendingClustering)) {
+        consumed.await();
+      }
+    } finally {
+      queueLock.unlock();
+    }
+  }
+
+  private HoodieInstant fetchNextClusteringInstant() throws InterruptedException {
+    LOG.info("Waiting for next clustering instant for 10 seconds");
+    HoodieInstant instant = pendingClustering.poll(10, TimeUnit.SECONDS);
+    if (instant != null) {
+      try {
+        queueLock.lock();
+        // Signal waiting thread
+        consumed.signal();
+      } finally {
+        queueLock.unlock();
+      }
+    }
+    return instant;
+  }
+
+  /**
+   * Start clustering service.
+   */
+  @Override
+  protected Pair<CompletableFuture, ExecutorService> startService() {
+    ExecutorService executor = Executors.newFixedThreadPool(maxConcurrentClustering,
+        r -> {
+          Thread t = new Thread(r, "async_clustering_thread");
+          t.setDaemon(isRunInDaemonMode());
+          return t;
+        });
+
+    return Pair.of(CompletableFuture.allOf(IntStream.range(0, maxConcurrentClustering).mapToObj(i -> CompletableFuture.supplyAsync(() -> {
+      try {
+        while (!isShutdownRequested()) {
+          final HoodieInstant instant = fetchNextClusteringInstant();
+          if (null != instant) {
+            LOG.info("Starting clustering for instant " + instant);
+            clusteringClient.cluster(instant);
+            LOG.info("Finished clustering for instant " + instant);
+          }
+        }
+        LOG.info("Clustering executor shutting down properly");
+      } catch (InterruptedException ie) {
+        LOG.warn("Clustering executor got interrupted exception! Stopping", ie);
+      } catch (IOException e) {
+        LOG.error("Clustering executor failed", e);
+        throw new HoodieIOException(e.getMessage(), e);
+      }
+      return true;
+    }, executor)).toArray(CompletableFuture[]::new)), executor);
+  }
+
+  public synchronized void updateWriteClient(AbstractHoodieWriteClient writeClient) {

Review comment:
       Moved some common logic to `HoodieAsyncService`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] zhangyue19921010 commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-879326501


   Hi @codope Thanks for your response. What if after `ClusteringPlanStrategy#getFileSlicesEligibleForClustering` file slice11 were created which means file slice1 was picked up for clustering. After this plan execute, maybe new data in slice11 was lost. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-877110964


   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-869829116


   @nsivabalan Thanks for reviewing the PR. I agree with your source code comments. There is scope for reusability. I will address them and update the PR. For the high level questions, my response is as below.
    
   > * Now we have both clustering and compaction, I see that you have added clustering related code just after compaction where ever applicable. Is the higher priority for compaction intentional? or should we have clustering followed by compaction? or does it not matter at all.
   
   In case when both clustering and compaction are enabled then compaction will run just before clustering. The intention is that since currently compaction and clustering cannot run at the same time on the same file groups and clustering could take significant time, so let compaction thread start first. When clustering is scheduled for the filegroups under compaction it would be ignored and picked up in the subsequent run after compaction completes.
   
   > * I came across a class named SchedulerConfGenerator. Don't we need to make any changes here for async clustering?
   
   We will need to make changes here if we create separate job pool for clustering and assign weights for different jobs. Unlike compaction, I did not feel the need for a separate job pool for clustering. By default, each pool gets equal share of resource but within each pool, jobs run in FIFO order. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (890e982) into [master](https://codecov.io/gh/apache/hudi/commit/047d956e01b6d7c92320686d8321b2bbe9d2188e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (047d956) will **decrease** coverage by `28.11%`.
   > The diff coverage is `40.35%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3142       +/-   ##
   =============================================
   - Coverage     44.03%   15.91%   -28.12%     
   + Complexity     5100      493     -4607     
   =============================================
     Files           930      283      -647     
     Lines         41275    11710    -29565     
     Branches       4138      961     -3177     
   =============================================
   - Hits          18174     1864    -16310     
   + Misses        21487     9683    -11804     
   + Partials       1614      163     -1451     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <90.19%> (+50.00%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+71.56%)` | :arrow_up: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.08% <92.50%> (+75.08%)` | :arrow_up: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `76.60% <100.00%> (+76.60%)` | :arrow_up: |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [771 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [047d956...890e982](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r663257958



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -296,6 +297,11 @@ public static void main(String[] args) throws IOException {
         + "outstanding compactions is less than this number")
     public Integer maxPendingCompactions = 5;
 
+    @Parameter(names = {"--max-pending-clustering"},

Review comment:
       out of curiosity. how come we have maxPendingCompaction/Clustering property defined as first class(top level) config for multiTableDeltaStreamer, but don't see the properties to enable/disable them. I assume those are fetched from property file for each source. So, why not fetch these properties also from the property file? I know this is not specific to clustering, but it was how compaction was defined. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r659456141



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {

Review comment:
       java docs would be nice. 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+  private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+  private final int maxConcurrentClustering;
+  private transient AbstractClusteringClient clusteringClient;
+  private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+  private transient ReentrantLock queueLock = new ReentrantLock();
+  private transient Condition consumed = queueLock.newCondition();
+
+  public AsyncClusteringService(AbstractHoodieWriteClient writeClient) {
+    this(writeClient, false);
+  }
+
+  public AsyncClusteringService(AbstractHoodieWriteClient writeClient, boolean runInDaemonMode) {
+    super(runInDaemonMode);
+    this.clusteringClient = createClusteringClient(writeClient);
+    this.maxConcurrentClustering = 1;
+  }
+
+  protected abstract AbstractClusteringClient createClusteringClient(AbstractHoodieWriteClient client);
+
+  public void enqueuePendingClustering(HoodieInstant instant) {
+    LOG.info("Enqueuing new pending clustering instant: " + instant.getTimestamp());
+    pendingClustering.add(instant);
+  }
+
+  public void waitTillPendingClusteringReducesTo(int numPendingClustering) throws InterruptedException {
+    try {
+      queueLock.lock();
+      while (!isShutdown() && (pendingClustering.size() > numPendingClustering)) {
+        consumed.await();
+      }
+    } finally {
+      queueLock.unlock();
+    }
+  }
+
+  private HoodieInstant fetchNextClusteringInstant() throws InterruptedException {
+    LOG.info("Waiting for next clustering instant for 10 seconds");
+    HoodieInstant instant = pendingClustering.poll(10, TimeUnit.SECONDS);
+    if (instant != null) {
+      try {
+        queueLock.lock();
+        // Signal waiting thread
+        consumed.signal();
+      } finally {
+        queueLock.unlock();
+      }
+    }
+    return instant;
+  }
+
+  /**
+   * Start clustering service.
+   */
+  @Override
+  protected Pair<CompletableFuture, ExecutorService> startService() {
+    ExecutorService executor = Executors.newFixedThreadPool(maxConcurrentClustering,
+        r -> {
+          Thread t = new Thread(r, "async_clustering_thread");
+          t.setDaemon(isRunInDaemonMode());
+          return t;
+        });
+
+    return Pair.of(CompletableFuture.allOf(IntStream.range(0, maxConcurrentClustering).mapToObj(i -> CompletableFuture.supplyAsync(() -> {
+      try {
+        while (!isShutdownRequested()) {
+          final HoodieInstant instant = fetchNextClusteringInstant();
+          if (null != instant) {
+            LOG.info("Starting clustering for instant " + instant);
+            clusteringClient.cluster(instant);
+            LOG.info("Finished clustering for instant " + instant);
+          }
+        }
+        LOG.info("Clustering executor shutting down properly");
+      } catch (InterruptedException ie) {
+        LOG.warn("Clustering executor got interrupted exception! Stopping", ie);
+      } catch (IOException e) {
+        LOG.error("Clustering executor failed", e);
+        throw new HoodieIOException(e.getMessage(), e);
+      }
+      return true;
+    }, executor)).toArray(CompletableFuture[]::new)), executor);
+  }
+
+  public synchronized void updateWriteClient(AbstractHoodieWriteClient writeClient) {

Review comment:
       I see lot of commonality between this and AsyncCompactionService. Can we please try to re-use code as much as possible? 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
##########
@@ -153,6 +155,11 @@ public Builder withInlineClusteringNumCommits(int numCommits) {
       return this;
     }
 
+    public Builder withAsyncClusteringNumCommits(int numCommits) {

Review comment:
       minor. withAsyncClustering**Max**commits

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -620,6 +641,17 @@ object HoodieSparkSqlWriter {
     }
   }
 
+  private def isAsyncClusteringEnabled(client: SparkRDDWriteClient[HoodieRecordPayload[Nothing]],
+                                       parameters: Map[String, String]) : Boolean = {
+    log.info(s"Config.asyncClusteringEnabled ? ${client.getConfig.isAsyncClusteringEnabled}")
+    if (asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled

Review comment:
       can we return in 1 line. 
   asyncClusteringTriggerFnDefined && client.getConfig.isAsyncClusteringEnabled && parameters.get(ASYNC_CLUSTERING_ENABLE_OPT_KEY).exists(r => r.toBoolean)

##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/HoodieSparkClusteringClient.java
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieKey;
+import org.apache.hudi.common.model.HoodieRecord;
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaRDD;
+
+import java.io.IOException;
+
+public class HoodieSparkClusteringClient<T extends HoodieRecordPayload> extends

Review comment:
       java docs. 

##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/async/SparkAsyncClusteringService.java
##########
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.client.HoodieSparkClusteringClient;
+
+public class SparkAsyncClusteringService extends AsyncClusteringService {

Review comment:
       java docs.

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {

Review comment:
       java docs please. 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+  protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
+  public AbstractClusteringClient(AbstractHoodieWriteClient<T, I, K, O> clusteringClient) {
+    this.clusteringClient = clusteringClient;
+  }
+
+  public abstract void cluster(HoodieInstant instant) throws IOException;

Review comment:
       Here also we can think of something like AsyncServiceClient. 
   may be we can have a common method as below. 
   ```
      public abstract void doAction(HoodieInstant instant) throw IOException;
   ```
   
   same abstract class for both clustering and compaction. 
   Not too strong on this suggestion though. Let's see what others have to say. 

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -583,13 +594,23 @@ object HoodieSparkSqlWriter {
 
       log.info(s"Compaction Scheduled is $compactionInstant")
 
+      val asyncClusteringEnabled = isAsyncClusteringEnabled(client, parameters)
+      val clusteringInstant: common.util.Option[java.lang.String] =
+        if (asyncClusteringEnabled) {
+          client.scheduleClustering(common.util.Option.of(new util.HashMap[String, String](mapAsJavaMap(metaMap))))
+        } else {
+          common.util.Option.empty()
+        }
+
+      log.info(s"Clustering Scheduled is $clusteringInstant")
+

Review comment:
       should we fix line 610 as well. 
   ```
   if(!asyncCompactionEnabled && !asyncClusteringEnabled) {
   ```

##########
File path: hudi-spark-datasource/hudi-spark/src/main/java/org/apache/hudi/async/SparkStreamingAsyncClusteringService.java
##########
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.client.HoodieSparkClusteringClient;
+
+public class SparkStreamingAsyncClusteringService extends AsyncClusteringService {

Review comment:
       java docs please. 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -396,10 +424,10 @@ public int hashCode() {
               baseFileFormat, propsFilePath, configs, sourceClassName,
               sourceOrderingField, payloadClassName, schemaProviderClassName,
               transformerClassNames, sourceLimit, operation, filterDupes,
-              enableHiveSync, maxPendingCompactions, continuousMode,
+              enableHiveSync, maxPendingCompactions, maxPendingClustering, continuousMode,
               minSyncIntervalSeconds, sparkMaster, commitOnErrors,
               deltaSyncSchedulingWeight, compactSchedulingWeight, deltaSyncSchedulingMinShare,
-              compactSchedulingMinShare, forceDisableCompaction, checkpoint,
+              compactSchedulingMinShare, forceDisableCompaction, forceDisableClustering, checkpoint,

Review comment:
       I see a config called compactSchedulingMinShare. Is there any necessity to create one for clustering? 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+  protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+

Review comment:
       again, serialVersionUUID would be good. 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+  private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+

Review comment:
       can we add serialVersionUUID as well.

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+  protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
+  public AbstractClusteringClient(AbstractHoodieWriteClient<T, I, K, O> clusteringClient) {
+    this.clusteringClient = clusteringClient;
+  }
+
+  public abstract void cluster(HoodieInstant instant) throws IOException;

Review comment:
       also, lets try to add java docs for all public methods. I do understand that AbstractCompactor does not have java docs. Its fine. atleast for the code we write, lets try to add java docs. 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/DeltaSync.java
##########
@@ -733,4 +743,15 @@ public Config getCfg() {
   public Option<HoodieTimeline> getCommitTimelineOpt() {
     return commitTimelineOpt;
   }
+
+  /**
+   * Schedule clustering.
+   * Called from {@link HoodieDeltaStreamer} when async clustering is enabled.
+   *
+   * @return Requested clustering instant.
+   */
+  public Option<String> getClusteringInstant() {

Review comment:
       minor. getClusteringInstant**Opt**()

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+public abstract class AsyncClusteringService extends HoodieAsyncService {

Review comment:
       minor. Can you fix the java docs in HoodieAsyncService to also add clustering. as of now, it looks like below.
   ```
   Base Class for running clean/delta-sync/compaction in separate thread and controlling their life-cycle.
   ```

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieStreamingSink.scala
##########
@@ -86,6 +93,11 @@ class HoodieStreamingSink(sqlContext: SQLContext,
             asyncCompactorService.enqueuePendingCompaction(
               new HoodieInstant(State.REQUESTED, HoodieTimeline.COMPACTION_ACTION, compactionInstantOps.get()))
           }
+          if (clusteringInstant.isPresent) {
+            asyncClusteringService.enqueuePendingClustering(new HoodieInstant(
+              State.REQUESTED, HoodieTimeline.REPLACE_COMMIT_ACTION, clusteringInstant.get()
+            ))
+          }

Review comment:
       in line 101, do we need to return clusteringInstant as well ?

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
##########
@@ -380,6 +380,10 @@ object DataSourceWriteOptions {
   val ASYNC_COMPACT_ENABLE_OPT_KEY = "hoodie.datasource.compaction.async.enable"
   val DEFAULT_ASYNC_COMPACT_ENABLE_OPT_VAL = "true"
 
+  // Async Clustering - Enabled by default
+  val ASYNC_CLUSTERING_ENABLE_OPT_KEY = "hoodie.datasource.clustering.async.enable"
+  val DEFAULT_ASYNC_CLUSTERING_ENABLE_OPT_VAL = "true"

Review comment:
       may I know why this is enabled by default? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113


   Hi @codope Just want to know, is this Async clustering function can handle the following scenarios and losing no data:
   
   There are 3 small file groups named fg1, fg2 and fg3 contained file slice1, file slice2 and file slices3 separately.
   
   When async schedule **start to make a cluster plan but not finished**, there is an inflight or requested commit for fg1 which will create file slice 11 based on file slice1. In other words **file slice11 is creating but not committed**  ---> I believe this is this scene is similar to multi writer.
   
   What does this async clustering function will do? 
   Will this clustering plan contains file slice1? if contained, I think the new data in file slice11 will be lost.
   
   Looking forward to your reply, thanks a lot.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-876943613


   > Tests coverage:
   > 
   > * Do we have tests where both compaction and clustering are async triggered? (both spark streaming and deltastreamer) ?
   
   Added such test in `TestHoodieDeltaStreamer` and `TestStructuredStreaming`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan merged pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan merged pull request #3142:
URL: https://github.com/apache/hudi/pull/3142


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `1.70%`.
   > The diff coverage is `49.14%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3142      +/-   ##
   ============================================
   - Coverage     47.67%   45.97%   -1.71%     
   + Complexity     5516     4727     -789     
   ============================================
     Files           929      832      -97     
     Lines         41303    37953    -3350     
     Branches       4144     3811     -333     
   ============================================
   - Hits          19692    17449    -2243     
   + Misses        19863    18887     -976     
   + Partials       1748     1617     -131     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `22.89% <9.52%> (-11.68%)` | :arrow_down: |
   | hudicommon | `48.56% <0.00%> (-0.03%)` | :arrow_down: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.30% <56.66%> (-0.02%)` | :arrow_down: |
   | hudisync | `54.51% <ø> (ø)` | |
   | huditimelineservice | `64.07% <ø> (ø)` | |
   | hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.80% <0.00%> (-0.08%)` | :arrow_down: |
   | [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
   | [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <36.66%> (+4.00%)` | :arrow_up: |
   | [...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh) | `72.03% <72.00%> (+0.76%)` | :arrow_up: |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `71.28% <75.00%> (+0.01%)` | :arrow_up: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
   | ... and [110 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661475846



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractClusteringClient.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.client;
+
+import org.apache.hudi.common.model.HoodieRecordPayload;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+
+import java.io.IOException;
+import java.io.Serializable;
+
+public abstract class AbstractClusteringClient<T extends HoodieRecordPayload, I, K, O> implements Serializable {
+
+  protected transient AbstractHoodieWriteClient<T, I, K, O> clusteringClient;
+
+  public AbstractClusteringClient(AbstractHoodieWriteClient<T, I, K, O> clusteringClient) {
+    this.clusteringClient = clusteringClient;
+  }
+
+  public abstract void cluster(HoodieInstant instant) throws IOException;

Review comment:
       > same abstract class for both clustering and compaction.
   > Not too strong on this suggestion though. Let's see what others have to say.
   
   @satishkotha what are your thoughts about this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867615903


   @hudi-bot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-869829116


   @nsivabalan Thanks for reviewing the PR. I agree with your source code comments. There is scope for reusability. I will address them and update the PR. For the high level questions, my response is as below.
    
   > * Now we have both clustering and compaction, I see that you have added clustering related code just after compaction where ever applicable. Is the higher priority for compaction intentional? or should we have clustering followed by compaction? or does it not matter at all.
   
   In case when both clustering and compaction are enabled then compaction will run just before clustering. The intention is that since currently compaction and clustering cannot run at the same time on the same file groups and clustering could take significant time, so let compaction thread start first. When clustering is scheduled for the filegroups under compaction it would be ignored and picked up in the subsequent run after compaction completes.
   
   > * I came across a class named SchedulerConfGenerator. Don't we need to make any changes here for async clustering?
   
   We will need to make changes here if we create separate job pool for clustering and assign weights for different jobs. Unlike compaction, I did not feel the need for a separate job pool for clustering. By default, each pool gets equal share of resource but within each pool, jobs run in FIFO order. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (890e982) into [master](https://codecov.io/gh/apache/hudi/commit/047d956e01b6d7c92320686d8321b2bbe9d2188e?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (047d956) will **decrease** coverage by `41.17%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3142       +/-   ##
   ============================================
   - Coverage     44.03%   2.86%   -41.18%     
   + Complexity     5100      85     -5015     
   ============================================
     Files           930     283      -647     
     Lines         41275   11710    -29565     
     Branches       4138     961     -3177     
   ============================================
   - Hits          18174     335    -17839     
   + Misses        21487   11349    -10138     
   + Partials       1614      26     -1588     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-34.59%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.11% <0.00%> (-0.14%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-71.28%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-42.89%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [723 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [047d956...890e982](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] nsivabalan commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r663254225



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/AsyncClusteringService.java
##########
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hudi.async;
+
+import org.apache.hudi.client.AbstractClusteringClient;
+import org.apache.hudi.client.AbstractHoodieWriteClient;
+import org.apache.hudi.common.table.timeline.HoodieInstant;
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieIOException;
+
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+
+import java.io.IOException;
+import java.util.concurrent.BlockingQueue;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.LinkedBlockingQueue;
+import java.util.concurrent.locks.Condition;
+import java.util.concurrent.locks.ReentrantLock;
+import java.util.stream.IntStream;
+
+/**
+ * Async clustering service that runs in a separate thread.
+ * Currently, only one clustering thread is allowed to run at any time.
+ */
+public abstract class AsyncClusteringService extends HoodieAsyncService {
+
+  private static final long serialVersionUID = 1L;
+  private static final Logger LOG = LogManager.getLogger(AsyncClusteringService.class);
+
+  private final int maxConcurrentClustering;
+  private transient AbstractClusteringClient clusteringClient;
+  private transient BlockingQueue<HoodieInstant> pendingClustering = new LinkedBlockingQueue<>();
+  private transient ReentrantLock queueLock = new ReentrantLock();

Review comment:
       let's sync up on this. I feel we can take these vars also one level up and reuse across clustering and compaction. 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
##########
@@ -165,4 +169,49 @@ private void monitorThreads(Function<Boolean, Boolean> onShutdownCallback) {
   public boolean isRunInDaemonMode() {
     return runInDaemonMode;
   }
+
+  /**
+   * Wait till outstanding pending compaction/clustering reduces to the passed in value.
+   *
+   * @param numPending Maximum pending compactions/clustering allowed
+   * @param pendingInstants Currently enqueued pending compaction/clustering instants
+   * @param queueLock

Review comment:
       java docs on params as well :) 

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieMultiTableDeltaStreamer.java
##########
@@ -296,6 +297,11 @@ public static void main(String[] args) throws IOException {
         + "outstanding compactions is less than this number")
     public Integer maxPendingCompactions = 5;
 
+    @Parameter(names = {"--max-pending-clustering"},

Review comment:
       out of curiosity. how come we have maxPendingCompaction/Clustering property defined as first class config for multiTableDeltaStreamer, but don't see the properties to enable/disable them. I assume those are fetched from property file for each source. So, why not fetch these properties also from the property file? I know this is not specific to clustering, but it was how compaction was defined. 

##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/async/HoodieAsyncService.java
##########
@@ -165,4 +169,49 @@ private void monitorThreads(Function<Boolean, Boolean> onShutdownCallback) {
   public boolean isRunInDaemonMode() {
     return runInDaemonMode;
   }
+
+  /**
+   * Wait till outstanding pending compaction/clustering reduces to the passed in value.
+   *
+   * @param numPending Maximum pending compactions/clustering allowed
+   * @param pendingInstants Currently enqueued pending compaction/clustering instants
+   * @param queueLock
+   * @param consumed
+   * @throws InterruptedException
+   */
+  public void waitTillPendingActionReducesTo(int numPending, BlockingQueue<HoodieInstant> pendingInstants,
+                                             ReentrantLock queueLock, Condition consumed) throws InterruptedException {
+    try {
+      queueLock.lock();
+      while (!isShutdown() && (pendingInstants.size() > numPending)) {
+        consumed.await();
+      }
+    } finally {
+      queueLock.unlock();
+    }
+  }
+
+  /**
+   * Fetch Next pending compaction/clustering instant if available.
+   *
+   * @param pendingInstants Currently enqueued pending compaction/clustering instants
+   * @param queueLock
+   * @param consumed

Review comment:
       same here. also "return" as well

##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -616,6 +651,7 @@ public DeltaSync getDeltaSync() {
           }
         } finally {
           shutdownCompactor(error);
+          shutdownClusteringService(error);

Review comment:
       can we have a single method and call it shutdownAsyncServices or backgroundServices and shut down all such services within that method ?

##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieStreamingSink.scala
##########
@@ -86,6 +93,11 @@ class HoodieStreamingSink(sqlContext: SQLContext,
             asyncCompactorService.enqueuePendingCompaction(
               new HoodieInstant(State.REQUESTED, HoodieTimeline.COMPACTION_ACTION, compactionInstantOps.get()))
           }
+          if (clusteringInstant.isPresent) {
+            asyncClusteringService.enqueuePendingClustering(new HoodieInstant(
+              State.REQUESTED, HoodieTimeline.REPLACE_COMMIT_ACTION, clusteringInstant.get()
+            ))
+          }

Review comment:
       I know we synced up f2f on this. but for book keeping purposes, can you leave a comment as if you haven't addressed my feedback. 

##########
File path: hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java
##########
@@ -150,6 +152,9 @@ public static HoodieWriteConfig createHoodieConfig(String schemaStr, String base
         .withCompactionConfig(HoodieCompactionConfig.newBuilder()
             .withPayloadClass(parameters.get(DataSourceWriteOptions.PAYLOAD_CLASS_OPT_KEY().key()))
             .withInlineCompaction(inlineCompact).build())
+        .withClusteringConfig(HoodieClusteringConfig.newBuilder()
+            .withInlineClustering(!asyncClusteringEnabled)

Review comment:
       isn't this supposed to be INLINE_CLUSTERING_PROP(hoodie.clustering.inline)? what happens if there is some contradiction among this existing property and the new property(hoodie.datasource.clustering.async.enable)?
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823) 
   * 7d4f9812e05062092c33bb5a4718ed25aa29bac6 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4a651f8) into [master](https://codecov.io/gh/apache/hudi/commit/380518e232b883bb12579b3e98283659b464285a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (380518e) will **decrease** coverage by `41.04%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3142       +/-   ##
   ============================================
   - Coverage     44.15%   3.10%   -41.05%     
   + Complexity     4537      82     -4455     
   ============================================
     Files           819     276      -543     
     Lines         36171   10762    -25409     
     Branches       3920    1101     -2819     
   ============================================
   - Hits          15970     334    -15636     
   + Misses        18467   10402     -8065     
   + Partials       1734      26     -1708     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-16.46%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `6.72% <ø> (-45.01%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `9.34% <0.00%> (-49.07%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-25.50%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-17.01%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.19%)` | :arrow_down: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/schema/SchemaSet.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NjaGVtYS9TY2hlbWFTZXQuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | [...a/org/apache/hudi/utilities/sources/RowSource.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL3NvdXJjZXMvUm93U291cmNlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [658 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [380518e...4a651f8](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849",
       "triggerID" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * f86f50e817a625dc30f35a39b7495a4f359e4da5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823) 
   * 7d4f9812e05062092c33bb5a4718ed25aa29bac6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] zhangyue19921010 commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113


   Hi @codope Just want to know, is this Async clustering function can handle the following scenarios and losing no data:
   
   There are 3 small file group named fg1, fg2 and fg3 contained file slice1, file slice2 and file slices3 separately.
   
   When async schedule **start to make a cluster plan but not finished**, there is an inflight or requested commit for fg1 which will create file slice 11 based on file slice1. In other words **file slice11 is creating but not committed**  ---> I believe this is this scene is similar to multi writer.
   
   What does this async clustering function will do? 
   Will this clustering plan contains file slice1? if contained, I think the new data in file slice11 will be lost.
   
   Looking forward to your reply, thanks a lot.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113


   Hi @codope Just want to know, is this Async clustering function can handle the following scenarios and losing no data:
   
   There are 3 small file groups named fg1, fg2 and fg3 contained file slice1, file slice2 and file slices3 separately.
   
   When async schedule **start to make a cluster plan but not finished**, there is an inflight or requested commit for fg1 which will create file slice 11 based on file slice1. In other words **file slice11 is creating but not committed**  ---> I believe this scene is similar to multi writer.
   
   What does this async clustering function will do? 
   Will this clustering plan contains file slice1? if contained, I think the new data in file slice11 will be lost.
   
   Looking forward to your reply, thanks a lot.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568) 
   * cd2781856c66d1b8527b12f29c99f3faf31e3fdb Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r664562160



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.java
##########
@@ -616,6 +651,7 @@ public DeltaSync getDeltaSync() {
           }
         } finally {
           shutdownCompactor(error);
+          shutdownClusteringService(error);

Review comment:
       Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cd27818) into [master](https://codecov.io/gh/apache/hudi/commit/70d9c2e7473154178bafaef40a15fc3e10a9df6a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (70d9c2e) will **decrease** coverage by `12.61%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master   #3142       +/-   ##
   ============================================
   - Coverage     15.48%   2.86%   -12.62%     
   + Complexity      478      82      -396     
   ============================================
     Files           280     282        +2     
     Lines         11548   11650      +102     
     Branches        945     954        +9     
   ============================================
   - Hits           1788     334     -1454     
   - Misses         9601   11290     +1689     
   + Partials        159      26      -133     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `0.00% <0.00%> (ø)` | |
   | hudisync | `5.38% <ø> (ø)` | |
   | hudiutilities | `9.16% <0.00%> (-48.87%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `0.00% <0.00%> (-71.05%)` | :arrow_down: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `0.00% <0.00%> (-72.84%)` | :arrow_down: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `0.00% <0.00%> (-76.20%)` | :arrow_down: |
   | [...va/org/apache/hudi/utilities/IdentitySplitter.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL0lkZW50aXR5U3BsaXR0ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [48 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [70d9c2e...cd27818](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744) 
   * cea6548b53a242e93e861401564a5bb55d317f24 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on a change in pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on a change in pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#discussion_r661470750



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -583,13 +594,23 @@ object HoodieSparkSqlWriter {
 
       log.info(s"Compaction Scheduled is $compactionInstant")
 
+      val asyncClusteringEnabled = isAsyncClusteringEnabled(client, parameters)
+      val clusteringInstant: common.util.Option[java.lang.String] =
+        if (asyncClusteringEnabled) {
+          client.scheduleClustering(common.util.Option.of(new util.HashMap[String, String](mapAsJavaMap(metaMap))))
+        } else {
+          common.util.Option.empty()
+        }
+
+      log.info(s"Clustering Scheduled is $clusteringInstant")
+

Review comment:
       Good catch. Done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568) 
   * cd2781856c66d1b8527b12f29c99f3faf31e3fdb UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (f86f50e) into [master](https://codecov.io/gh/apache/hudi/commit/371526789d663dee85041eb31c27c52c81ef87ef?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3715267) will **increase** coverage by `0.12%`.
   > The diff coverage is `34.84%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3142      +/-   ##
   ============================================
   + Coverage     27.40%   27.53%   +0.12%     
   - Complexity     1287     1291       +4     
   ============================================
     Files           381      385       +4     
     Lines         15108    15214     +106     
     Branches       1305     1316      +11     
   ============================================
   + Hits           4141     4189      +48     
   - Misses        10667    10722      +55     
   - Partials        300      303       +3     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudiclient | `20.93% <0.00%> (-0.12%)` | :arrow_down: |
   | hudisync | `5.37% <ø> (ø)` | |
   | hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...apache/hudi/async/SparkAsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvYXN5bmMvU3BhcmtBc3luY0NsdXN0ZXJpbmdTZXJ2aWNlLmphdmE=) | `0.00% <0.00%> (ø)` | |
   | [...pache/hudi/client/HoodieSparkClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY2xpZW50L0hvb2RpZVNwYXJrQ2x1c3RlcmluZ0NsaWVudC5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...ion/cluster/SparkClusteringPlanActionExecutor.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1zcGFyay1jbGllbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdGFibGUvYWN0aW9uL2NsdXN0ZXIvU3BhcmtDbHVzdGVyaW5nUGxhbkFjdGlvbkV4ZWN1dG9yLmphdmE=) | `60.00% <0.00%> (-15.00%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
   | ... and [8 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [3715267...f86f50e](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2d1f23811a18ed5b127c12391ea0fcabe50bb8a4 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744) 
   * cea6548b53a242e93e861401564a5bb55d317f24 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0138dad) into [master](https://codecov.io/gh/apache/hudi/commit/07e93de8b49560eee23237817fc24fbe763f2891?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (07e93de) will **decrease** coverage by `29.48%`.
   > The diff coverage is `42.24%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@              Coverage Diff              @@
   ##             master    #3142       +/-   ##
   =============================================
   - Coverage     46.23%   16.74%   -29.49%     
   + Complexity     5396      482     -4914     
   =============================================
     Files           921      282      -639     
     Lines         40085    10992    -29093     
     Branches       4298     1120     -3178     
   =============================================
   - Hits          18533     1841    -16692     
   + Misses        19665     8989    -10676     
   + Partials       1887      162     -1725     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `?` | |
   | hudiclient | `0.00% <0.00%> (-30.46%)` | :arrow_down: |
   | hudicommon | `?` | |
   | hudiflink | `?` | |
   | hudihadoopmr | `?` | |
   | hudisparkdatasource | `?` | |
   | hudisync | `5.38% <ø> (-48.67%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.38% <89.09%> (+0.79%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `0.00% <0.00%> (-25.50%)` | :arrow_down: |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `0.00% <0.00%> (-17.01%)` | :arrow_down: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.38% <75.00%> (+0.43%)` | :arrow_up: |
   | [...i/utilities/deltastreamer/HoodieDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllRGVsdGFTdHJlYW1lci5qYXZh) | `75.25% <90.24%> (+2.41%)` | :arrow_up: |
   | [...s/deltastreamer/HoodieMultiTableDeltaStreamer.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvSG9vZGllTXVsdGlUYWJsZURlbHRhU3RyZWFtZXIuamF2YQ==) | `77.01% <100.00%> (+0.82%)` | :arrow_up: |
   | [...main/java/org/apache/hudi/metrics/HoodieGauge.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL21ldHJpY3MvSG9vZGllR2F1Z2UuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | :arrow_down: |
   | ... and [716 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [07e93de...0138dad](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413) 
   * 0138dadc9c21dc753582505e03a139efe1f63884 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 890e9822855fcd45c8387f83740975f43474cddc Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=568",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=659",
       "triggerID" : "cd2781856c66d1b8527b12f29c99f3faf31e3fdb",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=744",
       "triggerID" : "2d1f23811a18ed5b127c12391ea0fcabe50bb8a4",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=812",
       "triggerID" : "cea6548b53a242e93e861401564a5bb55d317f24",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=814",
       "triggerID" : "890e9822855fcd45c8387f83740975f43474cddc",
       "triggerType" : "PUSH"
     }, {
       "hash" : "890e9822855fcd45c8387f83740975f43474cddc",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=818",
       "triggerID" : "877110964",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=824",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "f86f50e817a625dc30f35a39b7495a4f359e4da5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=823",
       "triggerID" : "877330225",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849",
       "triggerID" : "7d4f9812e05062092c33bb5a4718ed25aa29bac6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 7d4f9812e05062092c33bb5a4718ed25aa29bac6 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=849) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codecov-commenter edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-867078369


   # [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3142](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (7d4f981) into [master](https://codecov.io/gh/apache/hudi/commit/9b01d2a04520db6230cd16ef2b29013c013b1944?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9b01d2a) will **decrease** coverage by `3.41%`.
   > The diff coverage is `49.14%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/hudi/pull/3142/graphs/tree.svg?width=650&height=150&src=pr&token=VTTXabwbs2&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3142      +/-   ##
   ============================================
   - Coverage     47.67%   44.26%   -3.42%     
   + Complexity     5516     4478    -1038     
   ============================================
     Files           929      822     -107     
     Lines         41303    37123    -4180     
     Branches       4144     3758     -386     
   ============================================
   - Hits          19692    16432    -3260     
   + Misses        19863    19169     -694     
   + Partials       1748     1522     -226     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | hudicli | `39.97% <ø> (ø)` | |
   | hudiclient | `22.89% <9.52%> (-11.68%)` | :arrow_down: |
   | hudicommon | `48.56% <0.00%> (-0.03%)` | :arrow_down: |
   | hudiflink | `60.03% <ø> (ø)` | |
   | hudihadoopmr | `51.29% <ø> (ø)` | |
   | hudisparkdatasource | `67.30% <56.66%> (-0.02%)` | :arrow_down: |
   | hudisync | `5.37% <ø> (-49.15%)` | :arrow_down: |
   | huditimelineservice | `?` | |
   | hudiutilities | `59.26% <90.19%> (+0.69%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/hudi/async/AsyncClusteringService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ2x1c3RlcmluZ1NlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ava/org/apache/hudi/async/AsyncCompactService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0FzeW5jQ29tcGFjdFNlcnZpY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/async/HoodieAsyncService.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2FzeW5jL0hvb2RpZUFzeW5jU2VydmljZS5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/hudi/client/AbstractClusteringClient.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NsaWVudC9BYnN0cmFjdENsdXN0ZXJpbmdDbGllbnQuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...java/org/apache/hudi/config/HoodieWriteConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVXcml0ZUNvbmZpZy5qYXZh) | `42.80% <0.00%> (-0.08%)` | :arrow_down: |
   | [...a/org/apache/hudi/common/util/ClusteringUtils.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvY29tbW9uL3V0aWwvQ2x1c3RlcmluZ1V0aWxzLmphdmE=) | `88.40% <0.00%> (-1.31%)` | :arrow_down: |
   | [...in/scala/org/apache/hudi/HoodieStreamingSink.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVN0cmVhbWluZ1Npbmsuc2NhbGE=) | `28.00% <36.66%> (+4.00%)` | :arrow_up: |
   | [...n/scala/org/apache/hudi/HoodieSparkSqlWriter.scala](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1zcGFyay1kYXRhc291cmNlL2h1ZGktc3Bhcmsvc3JjL21haW4vc2NhbGEvb3JnL2FwYWNoZS9odWRpL0hvb2RpZVNwYXJrU3FsV3JpdGVyLnNjYWxh) | `72.03% <72.00%> (+0.76%)` | :arrow_up: |
   | [...org/apache/hudi/config/HoodieClusteringConfig.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS1jbGllbnQvaHVkaS1jbGllbnQtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9odWRpL2NvbmZpZy9Ib29kaWVDbHVzdGVyaW5nQ29uZmlnLmphdmE=) | `71.28% <75.00%> (+0.01%)` | :arrow_up: |
   | [...apache/hudi/utilities/deltastreamer/DeltaSync.java](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVkaS11dGlsaXRpZXMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2h1ZGkvdXRpbGl0aWVzL2RlbHRhc3RyZWFtZXIvRGVsdGFTeW5jLmphdmE=) | `71.56% <75.00%> (+0.42%)` | :arrow_up: |
   | ... and [138 more](https://codecov.io/gh/apache/hudi/pull/3142/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [9b01d2a...7d4f981](https://codecov.io/gh/apache/hudi/pull/3142?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] hudi-bot edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-866996072


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383",
       "triggerID" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867615903",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4a651f8d6b32f63838d8317c9f083508c17e4458",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413",
       "triggerID" : "867624272",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "0138dadc9c21dc753582505e03a139efe1f63884",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4a651f8d6b32f63838d8317c9f083508c17e4458 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=383) Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=413) 
   * 0138dadc9c21dc753582505e03a139efe1f63884 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] codope commented on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
codope commented on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-879223675


   > Hi @codope Just want to know, is this Async clustering function can handle the following scenarios and losing no data:
   > 
   > There are 3 small file groups named fg1, fg2 and fg3 contained file slice1, file slice2 and file slices3 separately.
   > 
   > When async schedule **start to make a cluster plan but not finished**, there is an inflight or requested commit for fg1 which will create file slice 11 based on file slice1. In other words **file slice11 is creating but not committed** ---> I believe this scene is similar to multi writers.
   > 
   > What does this async clustering function will do?
   > Will this clustering plan contains file slice1? if contained, I think the new data in file slice11 will be lost.
   > 
   > Looking forward to your reply, thanks a lot.
   
   @zhangyue19921010 It will depend on what point of time during clustering planning file slice11 is created. If it is before the `ClusteringPlanStrategy#getFileSlicesEligibleForClustering` is invoked then clustering plan will not contain file slice1. So, just like multi writers there is a race condition here. However, while actually clustering, the default (and currently only) strategy is to reject updates. So, it will throw exception after seeing that there is an a filegroup with update (in this case fg1). This should get picked up in the next run of clustering.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] zhangyue19921010 edited a comment on pull request #3142: [HUDI-1483] Support async clustering for deltastreamer and Spark streaming

Posted by GitBox <gi...@apache.org>.
zhangyue19921010 edited a comment on pull request #3142:
URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113


   Hi @codope Just want to know, is this Async clustering function can handle the following scenarios and losing no data:
   
   There are 3 small file groups named fg1, fg2 and fg3 contained file slice1, file slice2 and file slices3 separately.
   
   When async schedule **start to make a cluster plan but not finished**, there is an inflight or requested commit for fg1 which will create file slice 11 based on file slice1. In other words **file slice11 is creating but not committed**  ---> I believe this scene is similar to multi writers.
   
   What does this async clustering function will do? 
   Will this clustering plan contains file slice1? if contained, I think the new data in file slice11 will be lost.
   
   Looking forward to your reply, thanks a lot.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org