You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/17 18:45:46 UTC

[GitHub] [spark] attilapiros opened a new pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

attilapiros opened a new pull request #35559:
URL: https://github.com/apache/spark/pull/35559


   ### What changes were proposed in this pull request?
   
   Increasing the shuffle index weight with a constant number to avoid underestimating retained memory size caused by the bookkeeping objects (including Guava cache internal objects, the `File `object and the `ShuffleIndexInformation` object itself).
   
   ### Why are the changes needed?
   
   Underestimating cache entry size easily can cause OOM in the Yarn NodeManager. 
   In the following analyses of a prod issue (HPROF file) we can see the leak suspect Guava's `LocalCache$Segment` objects:
   
   <img width="943" alt="Screenshot 2022-02-17 at 18 55 40" src="https://user-images.githubusercontent.com/2017933/154541995-44014212-2046-41d6-ba7f-99369ca7d739.png">
   
   Going further we can see a `ShuffleIndexInformation` for a small index file (16 bytes) but the retained heap memory is 1192 bytes (by the Guava internal object used for bookkeeping):
   
   <img width="1635" alt="Screenshot 2022-02-17 at 19 01 33" src="https://user-images.githubusercontent.com/2017933/154547083-60b48ee7-9c79-4f4c-97d8-72a96cfb453b.png">
   
   Finally we can see this is very common within this heap dump (using MAT's Object Query Language):
    
   <img width="1418" alt="image" src="https://user-images.githubusercontent.com/2017933/154547678-44c8af34-1765-4e14-b71a-dc03d1a304aa.png">
   
   I have even exported the data to a CSV and done some calculations with `awk`:
   
   ```
   $ tail -n+2 export.csv | awk -F, 'BEGIN { numUnderEstimated=0; } { sumOldSize += $1; corrected=$1 + 1176; sumCorrectedSize += corrected; sumRetainedMem += $2; if (corrected < $2) numUnderEstimated+=1; } END { print "sum old size: " sumOldSize / 1024 / 1024   " MB, sum corrected size: " sumCorrectedSize / 1024 / 1024 " MB, sum retained memory:" sumRetainedMem / 1024 / 1024  " MB, num under estimated: " numUnderEstimated }'
   ```
   
   It gives the followings:
   ```
   sum old size: 76.8785 MB, sum corrected size: 1066.93 MB, sum retained memory:1064.47 MB, num under estimated: 0
   ```
   
   So using the old calculation we were at 7.6.8 MB way under the default cache limit (100 MB). 
   Using the correction (applying 1176 as increment to the size) we are at 1066.93 MB (~1GB) which is close to the real retained sum heap: 1064.47 MB (~1GB) and there is no entry which was underestimated.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   With the calculations above.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1048989630


   > Just check that MiMa doesn't complain about changing the public method, but seems thorough and reasonable to me
   
   Mima passed without any error.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r814355526



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +50,9 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {

Review comment:
       Shall we update the function description together?
   > * Size of the index file
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun closed pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #35559:
URL: https://github.com/apache/spark/pull/35559


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wypoon commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r813436289



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java
##########
@@ -40,7 +40,7 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
     // So we are creating a File just to get the normalized path back to intern it.
     // Finally a new File is built and returned with this interned normalized path.

Review comment:
       Update the comment so it won't cause confusion.
   "Finally a new File is built and returned with this interned normalized path." -> "We return this interned normalized path."




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r810928288



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.

Review comment:
       External Shuffle Service is running in the NodeManager so the problem is more serious.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r814358891



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +50,9 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {
+    // SPARK-33206: here the offsets' capacity is multiplied by 8 as offsets stores long values.
+    return (offsets.capacity() << 3) + INSTANCE_MEMORY_FOOTPRINT;

Review comment:
       If it's not feasible, it would be great if we add the explanation here explicitly.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r814358375



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +50,9 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {
+    // SPARK-33206: here the offsets' capacity is multiplied by 8 as offsets stores long values.
+    return (offsets.capacity() << 3) + INSTANCE_MEMORY_FOOTPRINT;

Review comment:
       Is this safe from `overflow`? Since the range of `offsets.capacity()` is `int` and the return value of `getRetainedMemorySize` is `int`, this formula may hit overflow issue theoretically. I'm wondering if there is a chance for this function to return a negative value.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
mridulm commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1045041724


   @attilapiros Agree on changing from File to absolute path for key.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] lfrancke commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
lfrancke commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1043371993


   Thank you very much for tackling this.
   It's been a while since I looked at it. I'm unsure why your number is 10 times larger than mine though.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Ngone51 commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r810845761



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.

Review comment:
       `NodeManager` -> `external shuffle service`?

##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+    final int weightCorrection = 1176;
     shuffleIndexCache = CacheBuilder.newBuilder()
       .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
-      .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+      .weigher((Weigher<File, ShuffleIndexInformation>)
+        (file, indexInfo) -> indexInfo.getSize() + weightCorrection)

Review comment:
       So, `indexInfo.getSize() + weightCorrection` doesn't include the size of key (`java.io.File`), right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r815258229



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +50,9 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {

Review comment:
       I am removing the method description as the method name is enough here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1044112522


   > Thank you very much for tackling this. It's been a while since I looked at it. I'm unsure why your number is 10 times larger than mine though.
   
   The reason must be that I have focused on the leak suspect `LocalCache$Segment` where both the key (`java.io.File` ~ 960 bytes because of storing the file path) and value is stored (`ShuffleIndexInformation` ~ 160 bytes in the pic).  
   
   <img width="851" alt="image" src="https://user-images.githubusercontent.com/2017933/154637434-d1296105-bd56-4ed7-a1ae-83f2059eac35.png">
   
   Both solution would work. In my case we have a stronger limit for full cache.
   
   But look at the Weigher interface:
   https://guava.dev/releases/18.0/api/docs/com/google/common/cache/Weigher.html
   
   It gets the `Key` too and the description mentions cache entry and not only the value:
   > Returns the weight of a cache entry. There is no unit for entry weights; rather they are simply relative to each other.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1044113741


   I am updating the description according to the new findings.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r809371495



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+    final int weightCorrection = 1176;
     shuffleIndexCache = CacheBuilder.newBuilder()
       .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
-      .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+      .weigher((Weigher<File, ShuffleIndexInformation>)
+        (file, indexInfo) -> indexInfo.getSize() + weightCorrection)

Review comment:
       I think the applied correction is very Guava related/specific this is why I kept it close to it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r812861984



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +45,11 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {
+    // SPARK-33206: here the offsets' capacity is multiplied by 8 as offsets stores long values.
+    // And the extra 176 bytes is the estimate of the `ShuffleIndexInformation` memory footprint
+    // which is relevant in case of small index files (i.e. storing only 2 offsets = 16 bytes).
+    return (offsets.capacity() << 3) + 176;

Review comment:
       960 bytes was the `java.io.File` but now we have an interned `String` (which would not count into the cache):
   
   ![image](https://user-images.githubusercontent.com/2017933/155323240-3974f3c1-a2f7-4030-9f48-5e7a7523ec34.png)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r809772091



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+    final int weightCorrection = 1176;
     shuffleIndexCache = CacheBuilder.newBuilder()
       .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
-      .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+      .weigher((Weigher<File, ShuffleIndexInformation>)
+        (file, indexInfo) -> indexInfo.getSize() + weightCorrection)

Review comment:
       Based upon my new findings https://github.com/apache/spark/pull/35559#issuecomment-1044112522 it is more like cache entry size including the key (`java.io.File`). We can separate those two and I can introduce a new method for the `ShuffleIndexInformation`'s retained memory size.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Ngone51 commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r810931758



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.

Review comment:
       I was thinking that for other cases like Standalone Worker, which isn't a NodeManager but facing the same issue from the external shuffle service.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1048123958


   I have extended the description:
   > But we can go further and get rid of java.io.File completely and store the ShuffleIndexInformation for the file path.
   This way not only the cache size estimate is improved but the its size is decreased as well.
   Here the path size is not counted into the cache size as that string is interned.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1055786367


   Could you make backporting PRs, @attilapiros ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Ngone51 commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
Ngone51 commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r812831808



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +45,11 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {
+    // SPARK-33206: here the offsets' capacity is multiplied by 8 as offsets stores long values.
+    // And the extra 176 bytes is the estimate of the `ShuffleIndexInformation` memory footprint
+    // which is relevant in case of small index files (i.e. storing only 2 offsets = 16 bytes).
+    return (offsets.capacity() << 3) + 176;

Review comment:
       Just want to double confirm the number of hard-coded extra bytes since I saw the original number was 1176.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r809369318



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+    final int weightCorrection = 1176;
     shuffleIndexCache = CacheBuilder.newBuilder()
       .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
-      .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+      .weigher((Weigher<File, ShuffleIndexInformation>)
+        (file, indexInfo) -> indexInfo.getSize() + weightCorrection)

Review comment:
       Your find looks useful. Why don't we add a new method to `ShuffleIndexInformation` for the future?

##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+    final int weightCorrection = 1176;
     shuffleIndexCache = CacheBuilder.newBuilder()
       .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
-      .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+      .weigher((Weigher<File, ShuffleIndexInformation>)
+        (file, indexInfo) -> indexInfo.getSize() + weightCorrection)

Review comment:
       Your finding looks useful. Why don't we add a new method to `ShuffleIndexInformation` for the future?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] lfrancke commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
lfrancke commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1044207510


   It's been a while since I looked at it. Let me know if there's anything I can do to help/verify.
   Making the cache entries smaller is definitely worth it. I had another suggestion in the ticket but I have to admit that I can't remember the details: 
   "[...] we can maybe get rid of the size field in ShuffleIndexInformation to save a few more bytes per entry."


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1048640173


   @Ngone51 @dongjoon-hyun @mridulm @lfrancke  May I ask another review on the current state?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r812188973



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
##########
@@ -118,9 +118,17 @@ public ShuffleIndexInformation load(File file) throws IOException {
             return new ShuffleIndexInformation(file);
           }
         };
+
+    // SPARK-33206: weightCorrection is a constant to increase the shuffle index weight for the
+    // index cache to avoid underestimating the retained memory size coming from the bookkeeping
+    // objects in case of very small index files (i.e files with two offsets are only 16 bytes
+    // in size but related bookkeeping objects retained memory is 1192 bytes) otherwise we can
+    // easily cause an OOM in the NodeManager even when the default cache size limit is used.
+    final int weightCorrection = 1176;
     shuffleIndexCache = CacheBuilder.newBuilder()
       .maximumWeight(JavaUtils.byteStringAsBytes(indexCacheSize))
-      .weigher((Weigher<File, ShuffleIndexInformation>) (file, indexInfo) -> indexInfo.getSize())
+      .weigher((Weigher<File, ShuffleIndexInformation>)
+        (file, indexInfo) -> indexInfo.getSize() + weightCorrection)

Review comment:
       It does but soon expect much better solution where File is not used as a key but the path.
   
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun edited a comment on pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1055786367


   There were some conflicts. Could you make backporting PRs per branch, @attilapiros ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1043304139


   cc @mridulm 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1044204954


   @dongjoon-hyun @lfrancke 
   We can improve this a bit further by changing the cache key from File to String (the file path).
   
   Tomorrow/beginning of next week I look into this.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r814379203



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java
##########
@@ -38,9 +38,9 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
     // Unfortunately, we cannot just call the normalization code that java.io.File
     // uses, since it is in the package-private class java.io.FileSystem.
     // So we are creating a File just to get the normalized path back to intern it.
-    // Finally a new File is built and returned with this interned normalized path.
+    // We return this interned normalized path.
     final String normalizedInternedPath = new File(notNormalizedPath).getPath().intern();
-    return new File(normalizedInternedPath);
+    return normalizedInternedPath;

Review comment:
       At this PR, shall we return it simply instead of defining `final String`?
   ```java
   - final String normalizedInternedPath = new File(notNormalizedPath).getPath().intern();
   - return normalizedInternedPath;
   + return new File(notNormalizedPath).getPath().intern();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wypoon commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
wypoon commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r813436289



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java
##########
@@ -40,7 +40,7 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
     // So we are creating a File just to get the normalized path back to intern it.
     // Finally a new File is built and returned with this interned normalized path.

Review comment:
       Update the comment so it won't cause confusion.
   "Finally a new File is built and returned with this interned normalized path." -> "We return this interned normalized path."
   Or drop the comment altogether.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] mridulm commented on a change in pull request #35559: [SPARK-33206][CORE] Fix shuffle index cache weight calculation for small index files

Posted by GitBox <gi...@apache.org>.
mridulm commented on a change in pull request #35559:
URL: https://github.com/apache/spark/pull/35559#discussion_r813606907



##########
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ShuffleIndexInformation.java
##########
@@ -46,8 +45,11 @@ public ShuffleIndexInformation(File indexFile) throws IOException {
    * Size of the index file
    * @return size
    */
-  public int getSize() {
-    return size;
+  public int getRetainedMemorySize() {
+    // SPARK-33206: here the offsets' capacity is multiplied by 8 as offsets stores long values.
+    // And the extra 176 bytes is the estimate of the `ShuffleIndexInformation` memory footprint
+    // which is relevant in case of small index files (i.e. storing only 2 offsets = 16 bytes).
+    return (offsets.capacity() << 3) + 176;
   }

Review comment:
       nit: pull the `176` as a package private constant and have the Suite depend on it as well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org