You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by "mcvsubbu (via GitHub)" <gi...@apache.org> on 2023/05/14 20:35:10 UTC

[GitHub] [pinot] mcvsubbu opened a new pull request, #10766: Compress idealstate according to estimated size

mcvsubbu opened a new pull request, #10766:
URL: https://github.com/apache/pinot/pull/10766

   Currently we compress idealstate based on the number of segments in a table. Added an alfgorithm to estimate the size of the idealstate znode while deciding to enable compression.
   
   Size threshold configurable as per zookeeper installation requirements.
   
   Instructions:
   1. The PR has to be tagged with at least one of the following labels (*):
      1. `feature`
      2. `bugfix`
      3. `performance`
      4. `ui`
      5. `backward-incompat`
      6. `release-notes` (**)
   2. Remove these instructions before publishing the PR.
    
   (*) Other labels to consider:
   - `testing`
   - `dependencies`
   - `docker`
   - `kubernetes`
   - `observability`
   - `security`
   - `code-style`
   - `extension-point`
   - `refactor`
   - `cleanup`
   
   (**) Use `release-notes` label for scenarios like:
   - New configuration options
   - Deprecation of configurations
   - Signature changes to public methods/interfaces
   - New plugins added or old plugins removed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu commented on a diff in pull request #10766: Compress idealstate according to estimated size

Posted by "mcvsubbu (via GitHub)" <gi...@apache.org>.
mcvsubbu commented on code in PR #10766:
URL: https://github.com/apache/pinot/pull/10766#discussion_r1193902948


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/helix/HelixHelper.java:
##########
@@ -163,6 +169,31 @@ public Boolean call() {
             return true;
           }
         }
+
+        private boolean shouldCompress(IdealState is) {
+          if (is.getNumPartitions() > NUM_PARTITIONS_THRESHOLD_TO_ENABLE_COMPRESSION) {
+            return true;
+          }
+
+          // Find the number of characters in one partition in idealstate, and extrapolate
+          // to estimate the number of characters.
+          Iterator<String> it = is.getPartitionSet().iterator();

Review Comment:
   > qq: can we just check the string's length of the ideal state? By doing so, even if the length is not purely from partitions the partition set, the overhead from the rest of the fields might not be that high.
   
   To check the length we will need to serialize it, I was trying to avoid that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu merged pull request #10766: Compress idealstate according to estimated size

Posted by "mcvsubbu (via GitHub)" <gi...@apache.org>.
mcvsubbu merged PR #10766:
URL: https://github.com/apache/pinot/pull/10766


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] codecov-commenter commented on pull request #10766: Compress idealstate according to estimated size

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #10766:
URL: https://github.com/apache/pinot/pull/10766#issuecomment-1547003473

   ## [Codecov](https://app.codecov.io/gh/apache/pinot/pull/10766?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#10766](https://app.codecov.io/gh/apache/pinot/pull/10766?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (8e43fc6) into [master](https://app.codecov.io/gh/apache/pinot/commit/9e1160272aec88591d9e768d15104b72521b2ffb?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (9e11602) will **decrease** coverage by `54.82%`.
   > The diff coverage is `8.69%`.
   
   ```diff
   @@              Coverage Diff              @@
   ##             master   #10766       +/-   ##
   =============================================
   - Coverage     68.51%   13.69%   -54.82%     
   + Complexity     5637      439     -5198     
   =============================================
     Files          2153     2098       -55     
     Lines        115731   113228     -2503     
     Branches      17477    17177      -300     
   =============================================
   - Hits          79290    15509    -63781     
   - Misses        30836    96447    +65611     
   + Partials       5605     1272     -4333     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | integration1 | `?` | |
   | unittests1 | `?` | |
   | unittests2 | `13.69% <8.69%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://app.codecov.io/gh/apache/pinot/pull/10766?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
   |---|---|---|
   | [...g/apache/pinot/common/utils/helix/HelixHelper.java](https://app.codecov.io/gh/apache/pinot/pull/10766?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvaGVsaXgvSGVsaXhIZWxwZXIuamF2YQ==) | `0.00% <0.00%> (-49.81%)` | :arrow_down: |
   | [...va/org/apache/pinot/controller/ControllerConf.java](https://app.codecov.io/gh/apache/pinot/pull/10766?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9Db250cm9sbGVyQ29uZi5qYXZh) | `57.54% <33.33%> (-1.68%)` | :arrow_down: |
   | [...apache/pinot/controller/BaseControllerStarter.java](https://app.codecov.io/gh/apache/pinot/pull/10766?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#diff-cGlub3QtY29udHJvbGxlci9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29udHJvbGxlci9CYXNlQ29udHJvbGxlclN0YXJ0ZXIuamF2YQ==) | `79.46% <100.00%> (-2.03%)` | :arrow_down: |
   
   ... and [1662 files with indirect coverage changes](https://app.codecov.io/gh/apache/pinot/pull/10766/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jackjlli commented on a diff in pull request #10766: Compress idealstate according to estimated size

Posted by "jackjlli (via GitHub)" <gi...@apache.org>.
jackjlli commented on code in PR #10766:
URL: https://github.com/apache/pinot/pull/10766#discussion_r1193208757


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/helix/HelixHelper.java:
##########
@@ -163,6 +169,31 @@ public Boolean call() {
             return true;
           }
         }
+
+        private boolean shouldCompress(IdealState is) {
+          if (is.getNumPartitions() > NUM_PARTITIONS_THRESHOLD_TO_ENABLE_COMPRESSION) {
+            return true;
+          }
+
+          // Find the number of characters in one partition in idealstate, and extrapolate
+          // to estimate the number of characters.
+          Iterator<String> it = is.getPartitionSet().iterator();

Review Comment:
   qq: can we just check the string's length of the ideal state? By doing so, even if the length is not purely from partitions the partition set, the overhead from the rest of the fields might not be that high.



##########
pinot-common/src/main/java/org/apache/pinot/common/utils/helix/HelixHelper.java:
##########
@@ -163,6 +169,31 @@ public Boolean call() {
             return true;
           }
         }
+
+        private boolean shouldCompress(IdealState is) {
+          if (is.getNumPartitions() > NUM_PARTITIONS_THRESHOLD_TO_ENABLE_COMPRESSION) {
+            return true;
+          }
+
+          // Find the number of characters in one partition in idealstate, and extrapolate
+          // to estimate the number of characters.
+          Iterator<String> it = is.getPartitionSet().iterator();

Review Comment:
   Also, according to the name from the variable (`MIN_NUM_CHARS_IN_IS_TO_TURN_ON_COMPRESSION`), it doesn't exclude the rest of the fields though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu commented on a diff in pull request #10766: Compress idealstate according to estimated size

Posted by "mcvsubbu (via GitHub)" <gi...@apache.org>.
mcvsubbu commented on code in PR #10766:
URL: https://github.com/apache/pinot/pull/10766#discussion_r1193904441


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/helix/HelixHelper.java:
##########
@@ -163,6 +169,31 @@ public Boolean call() {
             return true;
           }
         }
+
+        private boolean shouldCompress(IdealState is) {
+          if (is.getNumPartitions() > NUM_PARTITIONS_THRESHOLD_TO_ENABLE_COMPRESSION) {
+            return true;
+          }
+
+          // Find the number of characters in one partition in idealstate, and extrapolate
+          // to estimate the number of characters.
+          Iterator<String> it = is.getPartitionSet().iterator();

Review Comment:
   > Also, according to the name from the variable (`MIN_NUM_CHARS_IN_IS_TO_TURN_ON_COMPRESSION`), it doesn't exclude the rest of the fields though.
   
   That is correct.  I could not come up with another name though :) 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org