You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "huwh (via GitHub)" <gi...@apache.org> on 2023/03/02 06:40:17 UTC

[GitHub] [flink] huwh opened a new pull request, #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

huwh opened a new pull request, #22066:
URL: https://github.com/apache/flink/pull/22066

   ## What is the purpose of the change
   adaptive scheduler will update the execution graph when parallelism changed.
   It add all execution vertices to the json plan which should only add job vertices.
   This cause the operator is repeatedly displayed on the Flink Web UI. 
   
   
   ## Brief change log
     - *Use JobVertices instead ExecutionVertices when generate new json plan*
   
   
   ## Verifying this change
   This change added tests and can be verified as follows:
     - *Enrich the unit test to verify operator in json plan is not repeat*
     - *Manually verified the change by running adaptive scheduler with parallelism = 4*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] huwh commented on pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "huwh (via GitHub)" <gi...@apache.org>.
huwh commented on PR #22066:
URL: https://github.com/apache/flink/pull/22066#issuecomment-1452070777

   @dmvk sorry for create the duplicated backports. It's my problem, I forgot to check the Github ui before create new backports.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk commented on pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk commented on PR #22066:
URL: https://github.com/apache/flink/pull/22066#issuecomment-1452063810

   @huwh Sorry for the confusion regarding backports, I thought that you'll get notified about linked PRs, since they show up in the Github UI 😢 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] flinkbot commented on pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "flinkbot (via GitHub)" <gi...@apache.org>.
flinkbot commented on PR #22066:
URL: https://github.com/apache/flink/pull/22066#issuecomment-1451378565

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4c27324d51a07484182bbca3e54c4c0d20a6b9f1",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4c27324d51a07484182bbca3e54c4c0d20a6b9f1",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4c27324d51a07484182bbca3e54c4c0d20a6b9f1 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk commented on pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk commented on PR #22066:
URL: https://github.com/apache/flink/pull/22066#issuecomment-1451517818

   And to 1.17 🙈 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] huwh commented on pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "huwh (via GitHub)" <gi...@apache.org>.
huwh commented on PR #22066:
URL: https://github.com/apache/flink/pull/22066#issuecomment-1451492667

   @dmvk good suggestion, I've updated this MR. Could you help to merge it after this change is OK.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk commented on a diff in pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk commented on code in PR #22066:
URL: https://github.com/apache/flink/pull/22066#discussion_r1122707632


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/CreatingExecutionGraph.java:
##########
@@ -128,10 +129,10 @@ private void handleExecutionGraphCreation(
                                 () ->
                                         StreamSupport.stream(
                                                         executionGraph
-                                                                .getAllExecutionVertices()
+                                                                .getVerticesTopologically()
                                                                 .spliterator(),
                                                         false)
-                                                .map(v -> v.getJobVertex().getJobVertex())
+                                                .map(ExecutionJobVertex::getJobVertex)
                                                 .iterator(),

Review Comment:
   Can we reuse `IterableUtils` here to make this slightly more readable?
   
   
   ```
   IterableUtils.toStream(
                   executionGraph.getVerticesTopologically())
           .map(ExecutionJobVertex::getJobVertex)
           .iterator(),
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk commented on a diff in pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk commented on code in PR #22066:
URL: https://github.com/apache/flink/pull/22066#discussion_r1122707632


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/CreatingExecutionGraph.java:
##########
@@ -128,10 +129,10 @@ private void handleExecutionGraphCreation(
                                 () ->
                                         StreamSupport.stream(
                                                         executionGraph
-                                                                .getAllExecutionVertices()
+                                                                .getVerticesTopologically()
                                                                 .spliterator(),
                                                         false)
-                                                .map(v -> v.getJobVertex().getJobVertex())
+                                                .map(ExecutionJobVertex::getJobVertex)
                                                 .iterator(),

Review Comment:
   Can we reuse `IterableUtils` here to make this slightly more readable?
   
   
   ```
                                           IterableUtils.toStream(
                                                           executionGraph.getVerticesTopologically())
                                                   .map(ExecutionJobVertex::getJobVertex)
                                                   .iterator(),
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk commented on a diff in pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk commented on code in PR #22066:
URL: https://github.com/apache/flink/pull/22066#discussion_r1122709873


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/CreatingExecutionGraph.java:
##########
@@ -128,10 +129,10 @@ private void handleExecutionGraphCreation(
                                 () ->
                                         StreamSupport.stream(
                                                         executionGraph
-                                                                .getAllExecutionVertices()
+                                                                .getVerticesTopologically()
                                                                 .spliterator(),
                                                         false)
-                                                .map(v -> v.getJobVertex().getJobVertex())
+                                                .map(ExecutionJobVertex::getJobVertex)
                                                 .iterator(),

Review Comment:
   nit: it would be nice to mark `updatedPlan` as final along-side the above change, to make it consistent with rest of the code



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk commented on pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk commented on PR #22066:
URL: https://github.com/apache/flink/pull/22066#issuecomment-1451514994

   Sure, I'll merge this after CI gives the green light 👍 Can you please open a PR with backport to 1.16 in the meantime?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [flink] dmvk merged pull request #22066: [FLINK-29852][Runtime] Fix AdaptiveScheduler add operator repeatedly in json plan

Posted by "dmvk (via GitHub)" <gi...@apache.org>.
dmvk merged PR #22066:
URL: https://github.com/apache/flink/pull/22066


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org