Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/11 17:42:41 UTC

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26828: [SPARK-30198][Core] BytesToBytesMap does not grow internal long array as expected

dongjoon-hyun commented on a change in pull request #26828: [SPARK-30198][Core] BytesToBytesMap does not grow internal long array as expected
URL: https://github.com/apache/spark/pull/26828#discussion_r356740416
 
 

 ##########
 File path: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
 ##########
 @@ -741,7 +741,9 @@ public boolean append(Object kbase, long koff, int klen, Object vbase, long voff
         longArray.set(pos * 2 + 1, keyHashcode);
         isDefined = true;
 
-        if (numKeys >= growthThreshold && longArray.size() < MAX_CAPACITY) {
+        // We use two array entries per key, so the array size is twice the capacity.
+        // We should compare the current capacity of the array, instead of its size.
+        if (numKeys >= growthThreshold && longArray.size() / 2 < MAX_CAPACITY) {
 
 Review comment:
   @viirya, I agree with the issue you raised, but `longArray.size() / 2` might be too strict because we cap with `MAX_CAPACITY`. In other words, there is still room to grow.
   ```
   // Allocate the new data structures
   allocate(Math.min(growthStrategy.nextCapacity(oldCapacity), MAX_CAPACITY));
   ```
   
   Can we have explicit test cases for these boundary conditions?
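
As a starting point for such boundary tests, the two conditions from the diff can be compared on plain numbers, without allocating a real map. This is only a minimal sketch under stated assumptions: the class `GrowthBoundarySketch`, its helper methods, the value `1 << 29` for `MAX_CAPACITY`, the doubling growth strategy, and the 0.5 load factor are all illustrative and should be checked against `BytesToBytesMap` itself.

```java
public class GrowthBoundarySketch {
    // Assumed value for illustration; verify against BytesToBytesMap.MAX_CAPACITY.
    static final int MAX_CAPACITY = 1 << 29;

    // Condition before the patch: compares the array size (two entries per key).
    static boolean shouldGrowOld(long numKeys, long growthThreshold, long arraySize) {
        return numKeys >= growthThreshold && arraySize < MAX_CAPACITY;
    }

    // Condition after the patch: compares the capacity, i.e. array size / 2.
    static boolean shouldGrowNew(long numKeys, long growthThreshold, long arraySize) {
        return numKeys >= growthThreshold && arraySize / 2 < MAX_CAPACITY;
    }

    // Hypothetical doubling strategy, capped the same way as the allocate(...) call.
    static int cappedNextCapacity(int oldCapacity) {
        return (int) Math.min((long) oldCapacity * 2, (long) MAX_CAPACITY);
    }

    public static void main(String[] args) {
        int[] capacities = {MAX_CAPACITY / 4, MAX_CAPACITY / 2, MAX_CAPACITY};
        for (int capacity : capacities) {
            long arraySize = 2L * capacity;   // two long entries per key
            long threshold = capacity / 2;    // hypothetical 0.5 load factor
            long numKeys = threshold;         // exactly at the growth threshold
            System.out.printf(
                "capacity=%d old=%b new=%b next=%d%n",
                capacity,
                shouldGrowOld(numKeys, threshold, arraySize),
                shouldGrowNew(numKeys, threshold, arraySize),
                cappedNextCapacity(capacity));
        }
    }
}
```

Under these assumptions, the two conditions differ only when the capacity is `MAX_CAPACITY / 2`: the old check refuses to grow, while the patched check allows one last growth step up to `MAX_CAPACITY`. Real test cases would need to exercise the same boundaries through the map's public API.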
