You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2021/01/05 10:53:25 UTC

[GitHub] [flink] KarmaGYZ opened a new pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

KarmaGYZ opened a new pull request #14561:
URL: https://github.com/apache/flink/pull/14561


   
   ## What is the purpose of the change
   
   In FLINK-14188 and FLINK-14189, the TaskExecutor will derive and register to ResourceManager with the total/default slot resource profile. We need to pass that information to SlotManager as it is required in fine-grained resource management.
   
   ## Brief change log
   
   - Register TaskManager with total and default slot resource profile in SlotManager
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot commented on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754563356


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 99f4497d41499d861f2da0e16c765d2ef179bebe (Tue Jan 05 10:55:53 UTC 2021)
   
   **Warnings:**
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653",
       "triggerID" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] xintongsong commented on a change in pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
xintongsong commented on a change in pull request #14561:
URL: https://github.com/apache/flink/pull/14561#discussion_r553079447



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java
##########
@@ -275,7 +275,10 @@ public void processResourceRequirements(ResourceRequirements resourceRequirement
      */
     @Override
     public boolean registerTaskManager(
-            final TaskExecutorConnection taskExecutorConnection, SlotReport initialSlotReport) {
+            final TaskExecutorConnection taskExecutorConnection,
+            SlotReport initialSlotReport,
+            ResourceProfile totalResourceProfile,
+            ResourceProfile defaultSlotResourceProfile) {

Review comment:
       JavaDoc should be updated accordingly.

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/SlotManagerImpl.java
##########
@@ -459,7 +459,10 @@ public boolean unregisterSlotRequest(AllocationID allocationId) {
      */
     @Override
     public boolean registerTaskManager(
-            final TaskExecutorConnection taskExecutorConnection, SlotReport initialSlotReport) {
+            final TaskExecutorConnection taskExecutorConnection,
+            SlotReport initialSlotReport,
+            ResourceProfile totalResourceProfile,
+            ResourceProfile defaultSlotResourceProfile) {

Review comment:
       JavaDoc should be updated accordingly.

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java
##########
@@ -464,7 +464,11 @@ private void stopResourceManagerServices() throws Exception {
                 taskExecutors.get(taskManagerResourceId);
 
         if (workerTypeWorkerRegistration.getInstanceID().equals(taskManagerRegistrationId)) {
-            if (slotManager.registerTaskManager(workerTypeWorkerRegistration, slotReport)) {
+            if (slotManager.registerTaskManager(
+                    workerTypeWorkerRegistration,
+                    slotReport,
+                    workerTypeWorkerRegistration.getTotalResourceProfile(),
+                    workerTypeWorkerRegistration.getDefaultSlotResourceProfile())) {

Review comment:
       It's a bit weird that we have to pass in `workerTypeWorkerRegistration. getTotalResourceProfile()` and `workerTypeWorkerRegistration. getDefaultSlotResourceProfile()` when we have already passed in `workerTypeWorkerRegistration`.
   
   Despite the names, I think the boundary between `TaskExecutorConnection` and `WorkerRegistration` is that, the former contains information needed in `SlotManager` while the latter contains additional information needed in `ResourceManager`. (The name `TaskExecutorConnection` is probably because previously we need nothing more than the IDs and the RPC gateway in `SlotManager`.)
   
   Since the total and default slot resource profiles are only used in `SlotManager`, we probably should move them into `TaskExecutorConnection`. We may also rename the two classes as follows to explicitly suggest their scope of usage.
   * `WorkerRegistration` -> `ResourceManagerWorkerRegistration`
   * `TaskExecutorConnection` -> `SlotManagerWorkerRegistration`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653",
       "triggerID" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2172813d98fb963cd7426c80e9de863caf46a87e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2172813d98fb963cd7426c80e9de863caf46a87e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653) 
   * 2172813d98fb963cd7426c80e9de863caf46a87e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653",
       "triggerID" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] xintongsong closed pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
xintongsong closed pull request #14561:
URL: https://github.com/apache/flink/pull/14561


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754563356


   Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress of the review.
   
   
   ## Automated Checks
   Last check on commit 2172813d98fb963cd7426c80e9de863caf46a87e (Fri May 28 07:02:18 UTC 2021)
   
   **Warnings:**
    * No documentation files were touched! Remember to keep the Flink docs up to date!
   
   
   <sub>Mention the bot in a comment to re-run the automated checks.</sub>
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process.<details>
    The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`)
    - `@flinkbot approve all` to approve all aspects
    - `@flinkbot approve-until architecture` to approve everything until `architecture`
    - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention
    - `@flinkbot disapprove architecture` to remove an approval you gave earlier
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] KarmaGYZ commented on a change in pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
KarmaGYZ commented on a change in pull request #14561:
URL: https://github.com/apache/flink/pull/14561#discussion_r553153813



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/ResourceManager.java
##########
@@ -464,7 +464,11 @@ private void stopResourceManagerServices() throws Exception {
                 taskExecutors.get(taskManagerResourceId);
 
         if (workerTypeWorkerRegistration.getInstanceID().equals(taskManagerRegistrationId)) {
-            if (slotManager.registerTaskManager(workerTypeWorkerRegistration, slotReport)) {
+            if (slotManager.registerTaskManager(
+                    workerTypeWorkerRegistration,
+                    slotReport,
+                    workerTypeWorkerRegistration.getTotalResourceProfile(),
+                    workerTypeWorkerRegistration.getDefaultSlotResourceProfile())) {

Review comment:
       After an offline discussion, we find it may need a refactor with a larger scope:
   - The current inheritance relationship between `WorkerRegistration ` and `TaskExecutorConnection` is not reasonable.
   - The `TaskManagerRegisteration` also needs to be renamed.
   We decided to have some more in-depth discussions and move it out of the scope of this PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653",
       "triggerID" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2172813d98fb963cd7426c80e9de863caf46a87e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11722",
       "triggerID" : "2172813d98fb963cd7426c80e9de863caf46a87e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d2f645553ea7f6051cb81fee377672920ef871e0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653) 
   * 2172813d98fb963cd7426c80e9de863caf46a87e Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11722) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [flink] flinkbot edited a comment on pull request #14561: [FLINK-20836] Register TaskManager with total and default slot resource profile in SlotManager

Posted by GitBox <gi...@apache.org>.
flinkbot edited a comment on pull request #14561:
URL: https://github.com/apache/flink/pull/14561#issuecomment-754574127


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11653",
       "triggerID" : "d2f645553ea7f6051cb81fee377672920ef871e0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2172813d98fb963cd7426c80e9de863caf46a87e",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11722",
       "triggerID" : "2172813d98fb963cd7426c80e9de863caf46a87e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2172813d98fb963cd7426c80e9de863caf46a87e Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=11722) 
   
   <details>
   <summary>Bot commands</summary>
     The @flinkbot bot supports the following commands:
   
    - `@flinkbot run travis` re-run the last Travis build
    - `@flinkbot run azure` re-run the last Azure build
   </details>


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org