You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "shounakmk219 (via GitHub)" <gi...@apache.org> on 2023/06/23 11:40:08 UTC

[GitHub] [pinot] shounakmk219 opened a new pull request, #10969: Instance drop check api

shounakmk219 opened a new pull request, #10969:
URL: https://github.com/apache/pinot/pull/10969

   Expose an endpoint to perform safety check validation of the instance drop operation. Right now in order to drop an instance we can use 
   
   - `/instances/{instanceName}` with `DELETE` method type
   - `/instances/{instanceName}/state` 
   
   **Problem**
   Both the APIs perform the safety check validation before dropping the instance. The user is informed about the possible reasons, in case of an unsafe instance drop, only when the drop operation is actually performed.
   
   **Proposal**
   Instead of relying on the actual drop operation API to give back the reasons on why an operation is unsafe, we can have a validation endpoint that can be consumed and user can take required actions before performing the actual operation.
   
   Endpoint : `/instances/safetyCheck/dropInstance`
   Method type : `POST`
   Query param : `instanceNames` - list of instance names to validate for drop operation
   Sample Response : 
   
   ```
   [
     {
       "instanceName": "Server_192.168.1.3_7050",
       "safe": false,
       "issueList": [
         "Instance Server_192.168.1.3_7050 is still live",
         "Instance Server_192.168.1.3_7050 exists in ideal state for airlineStats_OFFLINE",
         "Instance Server_192.168.1.3_7050 exists in ideal state for airlineStats_REALTIME"
       ]
     },
     {
       "instanceName": "Broker_192.168.1.3_8000",
       "safe": true,
       "issueList": []
     }
   ]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] saurabhd336 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "saurabhd336 (via GitHub)" <gi...@apache.org>.
saurabhd336 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1244781157


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java:
##########
@@ -3148,6 +3137,31 @@ public PinotResourceManagerResponse dropInstance(String instanceName) {
     return PinotResourceManagerResponse.success("Instance " + instanceName + " dropped");
   }
 
+  /**
+   * Utility to perform a safety check of the operation to drop an instance.
+   * If the resource is not safe to drop the utility lists all the possible reasons.
+   * @param instanceName Pinot instance name
+   * @return List of reasons why the operation is unsafe. Empty list if the operation is safe.
+   */
+  public List<String> instanceDropSafetyCheck(String instanceName) {
+    List<String> issues = new ArrayList<>();
+    // Check if the instance is live
+    if (_helixDataAccessor.getProperty(_keyBuilder.liveInstance(instanceName)) != null) {
+      issues.add(String.format("Instance %s is still live", instanceName));

Review Comment:
   These reason strings can be moved to constants.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] shounakmk219 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "shounakmk219 (via GitHub)" <gi...@apache.org>.
shounakmk219 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1251535522


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java:
##########
@@ -3148,6 +3137,31 @@ public PinotResourceManagerResponse dropInstance(String instanceName) {
     return PinotResourceManagerResponse.success("Instance " + instanceName + " dropped");
   }
 
+  /**
+   * Utility to perform a safety check of the operation to drop an instance.
+   * If the resource is not safe to drop the utility lists all the possible reasons.
+   * @param instanceName Pinot instance name
+   * @return List of reasons why the operation is unsafe. Empty list if the operation is safe.
+   */
+  public List<String> instanceDropSafetyCheck(String instanceName) {
+    List<String> issues = new ArrayList<>();
+    // Check if the instance is live
+    if (_helixDataAccessor.getProperty(_keyBuilder.liveInstance(instanceName)) != null) {
+      issues.add(String.format("Instance %s is still live", instanceName));

Review Comment:
   Updated the response structure



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] saurabhd336 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "saurabhd336 (via GitHub)" <gi...@apache.org>.
saurabhd336 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1244779760


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java:
##########
@@ -390,4 +391,33 @@ public SuccessResponse updateBrokerResource(
           Response.Status.INTERNAL_SERVER_ERROR, e);
     }
   }
+
+  @POST

Review Comment:
   +1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee merged pull request #10969: Instance drop operation validation check API

Posted by "snleee (via GitHub)" <gi...@apache.org>.
snleee merged PR #10969:
URL: https://github.com/apache/pinot/pull/10969


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] snleee commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "snleee (via GitHub)" <gi...@apache.org>.
snleee commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1247597496


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java:
##########
@@ -390,4 +391,33 @@ public SuccessResponse updateBrokerResource(
           Response.Status.INTERNAL_SERVER_ERROR, e);
     }
   }
+
+  @POST
+  @Path("/instances/safetyCheck/dropInstance")

Review Comment:
   `/instances/dropInstance/validate` maybe a better path?
   
   We do have `POST /schemas/validate` & `POST /tables/validate`



##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java:
##########
@@ -3148,6 +3137,31 @@ public PinotResourceManagerResponse dropInstance(String instanceName) {
     return PinotResourceManagerResponse.success("Instance " + instanceName + " dropped");
   }
 
+  /**
+   * Utility to perform a safety check of the operation to drop an instance.
+   * If the resource is not safe to drop the utility lists all the possible reasons.
+   * @param instanceName Pinot instance name
+   * @return List of reasons why the operation is unsafe. Empty list if the operation is safe.
+   */
+  public List<String> instanceDropSafetyCheck(String instanceName) {
+    List<String> issues = new ArrayList<>();
+    // Check if the instance is live
+    if (_helixDataAccessor.getProperty(_keyBuilder.liveInstance(instanceName)) != null) {
+      issues.add(String.format("Instance %s is still live", instanceName));

Review Comment:
   +1 on this change. We can probably use some error code or enum status. If we want to add the explanation, we can model the response as `code/status, message`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10969: Instance drop operation validation check API

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10969:
URL: https://github.com/apache/pinot/pull/10969#issuecomment-1604210200

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10969?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#10969](https://app.codecov.io/gh/apache/pinot/pull/10969?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (418edd3) into [master](https://app.codecov.io/gh/apache/pinot/commit/73830ae3ce94428b7e9991d1bd160bd62a14b7c1?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (73830ae) will **increase** coverage by `0.00%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #10969     +/-   ##
   =========================================
     Coverage    0.11%    0.11%             
   =========================================
     Files        2188     2136     -52     
     Lines      117749   115516   -2233     
     Branches    17796    17551    -245     
   =========================================
     Hits          137      137             
   + Misses     117592   115359   -2233     
     Partials       20       20             
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1temurin11 | `?` | |
   | integration1temurin17 | `?` | |
   | integration1temurin20 | `?` | |
   | integration2temurin11 | `?` | |
   | integration2temurin17 | `?` | |
   | integration2temurin20 | `?` | |
   | unittests1temurin11 | `?` | |
   | unittests1temurin17 | `?` | |
   | unittests1temurin20 | `?` | |
   | unittests2temurin11 | `?` | |
   | unittests2temurin17 | `?` | |
   | unittests2temurin20 | `0.11% <0.00%> (-0.01%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10969?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...er/api/resources/OperationSafetyCheckResponse.java](https://app.codecov.io/gh/apache/pinot/pull/10969?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9hcGkvcmVzb3VyY2VzL09wZXJhdGlvblNhZmV0eUNoZWNrUmVzcG9uc2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...er/api/resources/PinotInstanceRestletResource.java](https://app.codecov.io/gh/apache/pinot/pull/10969?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9hcGkvcmVzb3VyY2VzL1Bpbm90SW5zdGFuY2VSZXN0bGV0UmVzb3VyY2UuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ntroller/helix/core/PinotHelixResourceManager.java](https://app.codecov.io/gh/apache/pinot/pull/10969?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9oZWxpeC9jb3JlL1Bpbm90SGVsaXhSZXNvdXJjZU1hbmFnZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   
   ... and [84 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/10969/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] saurabhd336 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "saurabhd336 (via GitHub)" <gi...@apache.org>.
saurabhd336 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1244790349


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java:
##########
@@ -3148,6 +3137,31 @@ public PinotResourceManagerResponse dropInstance(String instanceName) {
     return PinotResourceManagerResponse.success("Instance " + instanceName + " dropped");
   }
 
+  /**
+   * Utility to perform a safety check of the operation to drop an instance.
+   * If the resource is not safe to drop the utility lists all the possible reasons.
+   * @param instanceName Pinot instance name
+   * @return List of reasons why the operation is unsafe. Empty list if the operation is safe.
+   */
+  public List<String> instanceDropSafetyCheck(String instanceName) {
+    List<String> issues = new ArrayList<>();
+    // Check if the instance is live
+    if (_helixDataAccessor.getProperty(_keyBuilder.liveInstance(instanceName)) != null) {
+      issues.add(String.format("Instance %s is still live", instanceName));
+    }
+    // Check if any ideal state includes the instance
+    getAllResources().forEach(resource -> {
+      IdealState idealState = _helixAdmin.getResourceIdealState(_helixClusterName, resource);
+      for (String partition : idealState.getPartitionSet()) {
+        if (idealState.getInstanceSet(partition).contains(instanceName)) {
+          issues.add(String.format("Instance %s exists in ideal state for %s", instanceName, resource));
+          break;

Review Comment:
   We're early terminating here, but not early terminating if the instance is live? Is this intentional? Or should we include all tables where this instance is part of the ideal state to give a comprehensive list of issues?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] shounakmk219 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "shounakmk219 (via GitHub)" <gi...@apache.org>.
shounakmk219 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1251536118


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java:
##########
@@ -390,4 +391,33 @@ public SuccessResponse updateBrokerResource(
           Response.Status.INTERNAL_SERVER_ERROR, e);
     }
   }
+
+  @POST
+  @Path("/instances/safetyCheck/dropInstance")

Review Comment:
   yeah that's much better. Updated the path.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] shounakmk219 commented on pull request #10969: Instance drop operation validation check API

Posted by "shounakmk219 (via GitHub)" <gi...@apache.org>.
shounakmk219 commented on PR #10969:
URL: https://github.com/apache/pinot/pull/10969#issuecomment-1623049024

   @snleee I have updated the description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] shounakmk219 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "shounakmk219 (via GitHub)" <gi...@apache.org>.
shounakmk219 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1245142369


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java:
##########
@@ -390,4 +391,33 @@ public SuccessResponse updateBrokerResource(
           Response.Status.INTERNAL_SERVER_ERROR, e);
     }
   }
+
+  @POST

Review Comment:
   Agree, will make the changes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1244528990


##########
pinot-controller/src/main/java/org/apache/pinot/controller/api/resources/PinotInstanceRestletResource.java:
##########
@@ -390,4 +391,33 @@ public SuccessResponse updateBrokerResource(
           Response.Status.INTERNAL_SERVER_ERROR, e);
     }
   }
+
+  @POST

Review Comment:
   Since this is just reading some information without applying any change to the cluster, should we make it a POST request of GET?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] shounakmk219 commented on a diff in pull request #10969: Instance drop operation validation check API

Posted by "shounakmk219 (via GitHub)" <gi...@apache.org>.
shounakmk219 commented on code in PR #10969:
URL: https://github.com/apache/pinot/pull/10969#discussion_r1245149485


##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/PinotHelixResourceManager.java:
##########
@@ -3148,6 +3137,31 @@ public PinotResourceManagerResponse dropInstance(String instanceName) {
     return PinotResourceManagerResponse.success("Instance " + instanceName + " dropped");
   }
 
+  /**
+   * Utility to perform a safety check of the operation to drop an instance.
+   * If the resource is not safe to drop the utility lists all the possible reasons.
+   * @param instanceName Pinot instance name
+   * @return List of reasons why the operation is unsafe. Empty list if the operation is safe.
+   */
+  public List<String> instanceDropSafetyCheck(String instanceName) {
+    List<String> issues = new ArrayList<>();
+    // Check if the instance is live
+    if (_helixDataAccessor.getProperty(_keyBuilder.liveInstance(instanceName)) != null) {
+      issues.add(String.format("Instance %s is still live", instanceName));
+    }
+    // Check if any ideal state includes the instance
+    getAllResources().forEach(resource -> {
+      IdealState idealState = _helixAdmin.getResourceIdealState(_helixClusterName, resource);
+      for (String partition : idealState.getPartitionSet()) {
+        if (idealState.getInstanceSet(partition).contains(instanceName)) {
+          issues.add(String.format("Instance %s exists in ideal state for %s", instanceName, resource));
+          break;

Review Comment:
   We are just early terminating the loop on resource partitions, like even if one segment is present on the instance mark it as unsafe saying that particular table is on that instance. The outer loop will still check for all the tables and give a comprehensive list of issues at table level. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] saurabhd336 commented on pull request #10969: Instance drop operation validation check API

Posted by "saurabhd336 (via GitHub)" <gi...@apache.org>.
saurabhd336 commented on PR #10969:
URL: https://github.com/apache/pinot/pull/10969#issuecomment-1622934928

   lgtm!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org