Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/24 07:21:33 UTC

[GitHub] [spark] mridulm commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

mridulm commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r813606907



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +45,11 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {
+    // SPARK-33206: here the offsets' capacity is multiplied by 8 as offsets stores long values.
+    // And the extra 176 bytes is the estimate of the `ShuffleIndexInformation` memory footprint
+    // which is relevant in case of small index files (i.e. storing only 2 offsets = 16 bytes).
+    return (offsets.capacity() << 3) + 176;
   }

Review comment:
       nit: pull the `176` out into a package-private constant and have the test suite depend on it as well.
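
       A minimal sketch of what this suggestion could look like. `IndexInfoSketch` is a simplified, hypothetical stand-in for `ShuffleIndexInformation`, and the constant name `INSTANCE_MEMORY_FOOTPRINT` is an assumption for illustration, not necessarily the name the PR ends up using:

```java
import java.nio.ByteBuffer;
import java.nio.LongBuffer;

// Simplified, hypothetical stand-in for ShuffleIndexInformation; not the actual Spark class.
class IndexInfoSketch {
  // Hypothetical constant name. Package-private so a test in the same package can
  // reference it instead of repeating the literal 176 from the diff.
  static final int INSTANCE_MEMORY_FOOTPRINT = 176;

  private final LongBuffer offsets;

  IndexInfoSketch(int numOffsets) {
    // Each offset is a long, so reserve Long.BYTES (8) bytes per entry.
    this.offsets = ByteBuffer.allocate(numOffsets * Long.BYTES).asLongBuffer();
  }

  int getRetainedMemorySize() {
    // capacity() counts longs; shifting left by 3 converts that count to bytes.
    return (offsets.capacity() << 3) + INSTANCE_MEMORY_FOOTPRINT;
  }

  public static void main(String[] args) {
    // Smallest case named in the diff comment: 2 offsets = 16 bytes of data.
    IndexInfoSketch info = new IndexInfoSketch(2);
    System.out.println(info.getRetainedMemorySize());
  }
}
```

       With the constant shared, a test can assert against `16 + IndexInfoSketch.INSTANCE_MEMORY_FOOTPRINT` for the two-offset case, so a later change to the estimate only has to be made in one place.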




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


