Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/18 08:10:55 UTC

[GitHub] [spark] attilapiros commented on pull request #35559: [SPARK-33206] Fix shuffle index cache weight calculation for small index files

attilapiros commented on pull request #35559:
URL: https://github.com/apache/spark/pull/35559#issuecomment-1044112522


   > Thank you very much for tackling this. It's been a while since I looked at it. I'm unsure why your number is 10 times larger than mine though.
   
   The reason must be that I focused on the leak suspect `LocalCache$Segment`, where both the key (`java.io.File`, ~960 bytes because it stores the file path) and the value (`ShuffleIndexInformation`, ~160 bytes in the picture) are stored.
   
   <img width="851" alt="image" src="https://user-images.githubusercontent.com/2017933/154637434-d1296105-bd56-4ed7-a1ae-83f2059eac35.png">
   
   Both solutions would work. In my case the limit on the full cache is stricter.
   
   But look at the Weigher interface:
   https://guava.dev/releases/18.0/api/docs/com/google/common/cache/Weigher.html
   
   It receives the key too, and the description refers to the cache entry, not only the value:
   > Returns the weight of a cache entry. There is no unit for entry weights; rather they are simply relative to each other.
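
To make the difference concrete, here is a minimal sketch contrasting a value-only weigher with one that weighs the whole entry. It is not the code from this PR. The nested `Weigher` interface is a local stand-in with the same method shape as Guava's (`int weigh(K key, V value)`) so the example runs without Guava on the classpath. `IndexInfo` is a hypothetical stand-in for `ShuffleIndexInformation`, and `estimateKeyBytes` is a crude assumption about the key footprint (2 bytes per path character), not a measured size.

```java
import java.io.File;

public class EntryWeigherSketch {

    // Local stand-in mirroring the shape of Guava's Weigher<K, V>.
    interface Weigher<K, V> {
        int weigh(K key, V value);
    }

    // Hypothetical stand-in for ShuffleIndexInformation: only carries a size.
    static final class IndexInfo {
        private final int sizeInBytes;

        IndexInfo(int sizeInBytes) {
            this.sizeInBytes = sizeInBytes;
        }

        int getSize() {
            return sizeInBytes;
        }
    }

    // Assumption: approximate the key footprint as 2 bytes per path character.
    // A real java.io.File retains more than this; the figure is illustrative only.
    static int estimateKeyBytes(File key) {
        return 2 * key.getPath().length();
    }

    // Weighs only the value and ignores the key entirely.
    static final Weigher<File, IndexInfo> VALUE_ONLY =
        (key, value) -> value.getSize();

    // Weighs the whole cache entry: estimated key footprint plus value size.
    static final Weigher<File, IndexInfo> KEY_AND_VALUE =
        (key, value) -> estimateKeyBytes(key) + value.getSize();

    public static void main(String[] args) {
        File key = new File("/tmp/shuffle_0_0_0.index"); // hypothetical path
        IndexInfo value = new IndexInfo(16);             // a small index file

        System.out.println("value only:  " + VALUE_ONLY.weigh(key, value));
        System.out.println("key + value: " + KEY_AND_VALUE.weigh(key, value));
    }
}
```

For small index files the key term dominates, so a value-only weight lets far more entries into the cache than the configured maximum weight suggests.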
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org