You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@celeborn.apache.org by "zhongqiangczq (via GitHub)" <gi...@apache.org> on 2023/02/13 03:31:31 UTC

[GitHub] [incubator-celeborn] zhongqiangczq opened a new pull request, #1224: [CELEBORN-291] improve code about shuffleclientimpl pushing data to m…

zhongqiangczq opened a new pull request, #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224

   …appartition
   
   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     - Make sure the PR title start w/ a JIRA ticket, e.g. '[CELEBORN-XXXX] Your PR title ...'.
     - Be sure to keep the PR description updated to reflect all changes.
     - Please write your PR title to summarize what this PR proposes.
     - If possible, provide a concise example to reproduce the issue for a faster review.
   -->
   
   ### What changes were proposed in this pull request?
   optimize  currentClient to concurrentmap to avoid that different maptask use the same currentclient
   the first int of data header in mappartition should be partitionid(subpartitionid) rather than mappid
   closefunction is called while successfullly writeandflush insteading of get successfull response
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] FMX commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1109368964


##########
common/src/main/java/org/apache/celeborn/common/network/client/TransportClient.java:
##########
@@ -338,4 +352,26 @@ protected void handleFailure(String errorMsg, Throwable cause) {
       callback.onFailure(new IOException(errorMsg, cause));
     }
   }
+
+  private class RpcChannelListenerWithCompleteCallback extends RpcChannelListener {
+    final BooleanSupplier closeCallBack;

Review Comment:
   No result is expected, the supplier can be replaced.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] FMX commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1109466026


##########
common/src/main/java/org/apache/celeborn/common/network/client/TransportClient.java:
##########
@@ -175,6 +175,19 @@ public ChannelFuture pushData(
     return channelFuture;
   }
 
+  public ChannelFuture pushDataWithCompleteCallback(

Review Comment:
   This method can be merged with previous method because listener is different.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] zhongqiangczq commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "zhongqiangczq (via GitHub)" <gi...@apache.org>.
zhongqiangczq commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1106816829


##########
client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java:
##########
@@ -131,7 +131,7 @@ private static class ReduceFileGroups {
   // key: shuffleId
   private final Map<Integer, ReduceFileGroups> reduceFileGroupsMap = new ConcurrentHashMap<>();
 
-  private TransportClient currentClient;
+  private ConcurrentHashMap<String, TransportClient> currentClient = new ConcurrentHashMap<>();

Review Comment:
   yes, it should be cleauped in cleaup function



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] zhongqiangczq commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "zhongqiangczq (via GitHub)" <gi...@apache.org>.
zhongqiangczq commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1109520080


##########
common/src/main/java/org/apache/celeborn/common/network/client/TransportClient.java:
##########
@@ -175,6 +175,19 @@ public ChannelFuture pushData(
     return channelFuture;
   }
 
+  public ChannelFuture pushDataWithCompleteCallback(

Review Comment:
   ok, merge common code



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] zhongqiangczq commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "zhongqiangczq (via GitHub)" <gi...@apache.org>.
zhongqiangczq commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1109527744


##########
common/src/main/java/org/apache/celeborn/common/network/client/TransportClient.java:
##########
@@ -175,6 +175,19 @@ public ChannelFuture pushData(
     return channelFuture;
   }
 
+  public ChannelFuture pushDataWithCompleteCallback(
+      PushData pushData, RpcResponseCallback callback, Runnable closeCallback) {
+    if (logger.isTraceEnabled()) {
+      logger.trace("Pushing data to {}", NettyUtils.getRemoteAddress(channel));
+    }
+    long requestId = requestId();
+    handler.addRpcRequest(requestId, callback);

Review Comment:
   ok, code reuse solves this problem



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] FMX commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1109461326


##########
common/src/main/java/org/apache/celeborn/common/network/client/TransportClient.java:
##########
@@ -175,6 +175,19 @@ public ChannelFuture pushData(
     return channelFuture;
   }
 
+  public ChannelFuture pushDataWithCompleteCallback(
+      PushData pushData, RpcResponseCallback callback, Runnable closeCallback) {
+    if (logger.isTraceEnabled()) {
+      logger.trace("Pushing data to {}", NettyUtils.getRemoteAddress(channel));
+    }
+    long requestId = requestId();
+    handler.addRpcRequest(requestId, callback);

Review Comment:
   Should use "addPushRequest".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] FMX merged pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX merged PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] codecov[bot] commented on pull request #1224: [CELEBORN-291] improve code about shuffleclientimpl pushing data to m…

Posted by "codecov[bot] (via GitHub)" <gi...@apache.org>.
codecov[bot] commented on PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#issuecomment-1427345432

   # [Codecov](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1224](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9eca8d8) into [main](https://codecov.io/gh/apache/incubator-celeborn/commit/adb6592d316b28fdf26cc0364779b64de96c9e6a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (adb6592) will **decrease** coverage by `0.03%`.
   > The diff coverage is `25.93%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##               main    #1224      +/-   ##
   ============================================
   - Coverage     27.18%   27.14%   -0.03%     
     Complexity      806      806              
   ============================================
     Files           212      212              
     Lines         17974    18015      +41     
     Branches       1964     1965       +1     
   ============================================
   + Hits           4885     4889       +4     
   - Misses        12766    12803      +37     
     Partials        323      323              
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...eleborn/common/network/client/TransportClient.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jb21tb24vbmV0d29yay9jbGllbnQvVHJhbnNwb3J0Q2xpZW50LmphdmE=) | `32.07% <0.00%> (-4.78%)` | :arrow_down: |
   | [.../org/apache/celeborn/client/ShuffleClientImpl.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9jbGllbnQvU2h1ZmZsZUNsaWVudEltcGwuamF2YQ==) | `18.13% <66.67%> (-1.22%)` | :arrow_down: |
   | [...cala/org/apache/celeborn/common/CelebornConf.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY29tbW9uL0NlbGVib3JuQ29uZi5zY2FsYQ==) | `81.01% <100.00%> (ø)` | |
   | [...ice/deploy/master/clustermeta/ha/HARaftServer.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9zZXJ2aWNlL2RlcGxveS9tYXN0ZXIvY2x1c3Rlcm1ldGEvaGEvSEFSYWZ0U2VydmVyLmphdmE=) | `76.58% <0.00%> (-1.35%)` | :arrow_down: |
   | [...apache/celeborn/service/deploy/worker/Worker.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-d29ya2VyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vc2VydmljZS9kZXBsb3kvd29ya2VyL1dvcmtlci5zY2FsYQ==) | `0.00% <0.00%> (ø)` | |
   | [.../celeborn/service/deploy/worker/WorkerSource.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-d29ya2VyL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vc2VydmljZS9kZXBsb3kvd29ya2VyL1dvcmtlclNvdXJjZS5zY2FsYQ==) | `100.00% <0.00%> (ø)` | |
   | [...eleborn/common/metrics/source/AbstractSource.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y29tbW9uL3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY29tbW9uL21ldHJpY3Mvc291cmNlL0Fic3RyYWN0U291cmNlLnNjYWxh) | `0.00% <0.00%> (ø)` | |
   | [...oy/worker/congestcontrol/CongestionController.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1224?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-d29ya2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9zZXJ2aWNlL2RlcGxveS93b3JrZXIvY29uZ2VzdGNvbnRyb2wvQ29uZ2VzdGlvbkNvbnRyb2xsZXIuamF2YQ==) | `76.86% <0.00%> (+9.18%)` | :arrow_up: |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] zhongqiangczq commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "zhongqiangczq (via GitHub)" <gi...@apache.org>.
zhongqiangczq commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1109374513


##########
common/src/main/java/org/apache/celeborn/common/network/client/TransportClient.java:
##########
@@ -338,4 +352,26 @@ protected void handleFailure(String errorMsg, Throwable cause) {
       callback.onFailure(new IOException(errorMsg, cause));
     }
   }
+
+  private class RpcChannelListenerWithCompleteCallback extends RpcChannelListener {
+    final BooleanSupplier closeCallBack;

Review Comment:
   yes, runnable is more  appropriate



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] RexXiong commented on a diff in pull request #1224: [CELEBORN-291] optimize shuffleclientimpl creating client and pushdata for mappartition

Posted by "RexXiong (via GitHub)" <gi...@apache.org>.
RexXiong commented on code in PR #1224:
URL: https://github.com/apache/incubator-celeborn/pull/1224#discussion_r1104061143


##########
client/src/main/java/org/apache/celeborn/client/ShuffleClientImpl.java:
##########
@@ -131,7 +131,7 @@ private static class ReduceFileGroups {
   // key: shuffleId
   private final Map<Integer, ReduceFileGroups> reduceFileGroupsMap = new ConcurrentHashMap<>();
 
-  private TransportClient currentClient;
+  private ConcurrentHashMap<String, TransportClient> currentClient = new ConcurrentHashMap<>();

Review Comment:
   need cleanup this when the map task/shuffle is finished.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org