Posted to issues@celeborn.apache.org by "RexXiong (via GitHub)" <gi...@apache.org> on 2023/02/22 04:57:26 UTC

[GitHub] [incubator-celeborn] RexXiong opened a new pull request, #1260: [CELEBORN-326)] [Flink] lifecycleManager supports flink-yarn-session mode to handle multiple Flink jobs.

RexXiong opened a new pull request, #1260:
URL: https://github.com/apache/incubator-celeborn/pull/1260

   …n mode to handle multiple Flink jobs
   
   ### What changes were proposed in this pull request?
   The lifecycleManager in FlinkJobMaster now supports flink-yarn-session mode, so that a single instance can handle multiple Flink jobs.
   
   
   ### Why are the changes needed?
   In YARN session mode, the jobManager and taskManagers are shared across multiple jobs. Using one lifecycleManager per job would cause two major problems:
   1. The (map)ShuffleClientImpl cannot be reused in the taskManager.
   2. Each registered job would open its own connections and ports, which consumes a lot of system resources.
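
The sharing idea described above can be sketched as follows. This is a minimal illustration, not the code from the PR: `FakeLifecycleManager`, `SharedShuffleMaster`, and the `"app-"` prefix are hypothetical stand-ins for Celeborn's real classes and ID format. The sketch assumes the field is declared `volatile`, which double-checked locking needs in order to be safe under the Java memory model.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for Celeborn's LifecycleManager; it only counts instances. */
final class FakeLifecycleManager {
  static final AtomicInteger CREATED = new AtomicInteger();
  final String appId;

  FakeLifecycleManager(String appId) {
    this.appId = appId;
    CREATED.incrementAndGet();
  }
}

/** Hypothetical stand-in for a shuffle master living in a shared JobManager. */
final class SharedShuffleMaster {
  // volatile makes the lazily created instance safely visible to other threads.
  private volatile FakeLifecycleManager lifecycleManager;

  FakeLifecycleManager registerJob(String jobId) {
    FakeLifecycleManager local = lifecycleManager;
    if (local == null) {
      synchronized (this) {
        local = lifecycleManager;
        if (local == null) {
          // The first job to register supplies the app id shared by all later jobs.
          local = new FakeLifecycleManager("app-" + jobId);
          lifecycleManager = local;
        }
      }
    }
    return local;
  }
}

public class SharedManagerDemo {
  public static void main(String[] args) {
    SharedShuffleMaster master = new SharedShuffleMaster();
    FakeLifecycleManager first = master.registerJob("job-1");
    FakeLifecycleManager second = master.registerJob("job-2");
    System.out.println(first == second); // both jobs see the same manager
    System.out.println(second.appId);    // id derived from the first job only
  }
}
```

With one manager per session rather than per job, later jobs reuse the connections that the first job established, which is the resource saving the description argues for.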
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Unit tests and manual testing.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@celeborn.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-celeborn] FMX commented on a diff in pull request #1260: [CELEBORN-326)] [Flink] lifecycleManager supports flink-yarn-session mode to handle multiple Flink jobs.

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX commented on code in PR #1260:
URL: https://github.com/apache/incubator-celeborn/pull/1260#discussion_r1113903442


##########
client-flink/flink-1.14/src/main/java/org/apache/celeborn/plugin/flink/RemoteShuffleMaster.java:
##########
@@ -57,28 +61,36 @@ public RemoteShuffleMaster(ShuffleMasterContext shuffleMasterContext) {
 
   @Override
   public void registerJob(JobShuffleContext context) {
-    JobID jobId = context.getJobId();
+    JobID jobID = context.getJobId();
+    if (lifecycleManager == null) {
+      synchronized (RemoteShuffleMaster.class) {
+        if (lifecycleManager == null) {
+          // use first jobID as celeborn shared appId for all other flink jobs
+          celebornAppId = FlinkUtils.toCelebornAppId(jobID);
+          CelebornConf celebornConf =
+              FlinkUtils.toCelebornConf(shuffleMasterContext.getConfiguration());
+          lifecycleManager = new LifecycleManager(celebornAppId, celebornConf);
+        }
+      }
+    }
+
     Future<?> submit =

Review Comment:
   This looks like an unnecessary async block.





[GitHub] [incubator-celeborn] RexXiong commented on pull request #1260: [CELEBORN-326)] [Flink] lifecycleManager supports flink-yarn-session mode to handle multiple Flink jobs.

Posted by "RexXiong (via GitHub)" <gi...@apache.org>.
RexXiong commented on PR #1260:
URL: https://github.com/apache/incubator-celeborn/pull/1260#issuecomment-1439500796

   @FMX @zhongqiangczq 




[GitHub] [incubator-celeborn] codecov[bot] commented on pull request #1260: [CELEBORN-326)] [Flink] lifecycleManager supports flink-yarn-session mode to handle multiple Flink jobs.

Posted by "codecov[bot] (via GitHub)" <gi...@apache.org>.
codecov[bot] commented on PR #1260:
URL: https://github.com/apache/incubator-celeborn/pull/1260#issuecomment-1439446668

   # [Codecov](https://codecov.io/gh/apache/incubator-celeborn/pull/1260?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#1260](https://codecov.io/gh/apache/incubator-celeborn/pull/1260?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (12d46df) into [main](https://codecov.io/gh/apache/incubator-celeborn/commit/cb8df62ec580297f88d78fa57355a68f818684d7?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (cb8df62) will **increase** coverage by `0.03%`.
   > The diff coverage is `n/a`.
   
   > :exclamation: Current head 12d46df differs from pull request most recent head 1e69def. Consider uploading reports for the commit 1e69def to get more accurate results
   
   ```diff
   @@             Coverage Diff              @@
   ##               main    #1260      +/-   ##
   ============================================
   + Coverage     27.11%   27.13%   +0.03%     
   - Complexity      811      814       +3     
   ============================================
     Files           214      214              
     Lines         18353    18353              
     Branches       1997     1997              
   ============================================
   + Hits           4975     4979       +4     
   + Misses        13052    13050       -2     
   + Partials        326      324       -2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-celeborn/pull/1260?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [.../org/apache/celeborn/client/LifecycleManager.scala](https://codecov.io/gh/apache/incubator-celeborn/pull/1260?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Y2xpZW50L3NyYy9tYWluL3NjYWxhL29yZy9hcGFjaGUvY2VsZWJvcm4vY2xpZW50L0xpZmVjeWNsZU1hbmFnZXIuc2NhbGE=) | `0.00% <ø> (ø)` | |
   | [...oy/worker/congestcontrol/CongestionController.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1260?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-d29ya2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9zZXJ2aWNlL2RlcGxveS93b3JrZXIvY29uZ2VzdGNvbnRyb2wvQ29uZ2VzdGlvbkNvbnRyb2xsZXIuamF2YQ==) | `77.78% <0.00%> (+0.93%)` | :arrow_up: |
   | [...ice/deploy/master/clustermeta/ha/HARaftServer.java](https://codecov.io/gh/apache/incubator-celeborn/pull/1260?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-bWFzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9jZWxlYm9ybi9zZXJ2aWNlL2RlcGxveS9tYXN0ZXIvY2x1c3Rlcm1ldGEvaGEvSEFSYWZ0U2VydmVyLmphdmE=) | `77.93% <0.00%> (+1.36%)` | :arrow_up: |
   
   




[GitHub] [incubator-celeborn] FMX merged pull request #1260: [CELEBORN-326)] [Flink] lifecycleManager supports flink-yarn-session mode to handle multiple Flink jobs.

Posted by "FMX (via GitHub)" <gi...@apache.org>.
FMX merged PR #1260:
URL: https://github.com/apache/incubator-celeborn/pull/1260

