Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/11 17:42:41 UTC
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #26828:
[SPARK-30198][Core] BytesToBytesMap does not grow internal long array as
expected
URL: https://github.com/apache/spark/pull/26828#discussion_r356740416
##########
File path: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
##########
@@ -741,7 +741,9 @@ public boolean append(Object kbase, long koff, int klen, Object vbase, long voff
longArray.set(pos * 2 + 1, keyHashcode);
isDefined = true;
- if (numKeys >= growthThreshold && longArray.size() < MAX_CAPACITY) {
+ // We use two array entries per key, so the array size is twice the capacity.
+ // We should compare the current capacity of the array, instead of its size.
+ if (numKeys >= growthThreshold && longArray.size() / 2 < MAX_CAPACITY) {
Review comment:
@viirya, I agree with the issue you raised, but `longArray.size() / 2` might be too strict because we cap with `MAX_CAPACITY`. In other words, there is still room to grow.
```
// Allocate the new data structures
allocate(Math.min(growthStrategy.nextCapacity(oldCapacity), MAX_CAPACITY));
```
Can we have explicit test cases for these boundary conditions?
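For reference, the boundary cases in question can be tabulated with a small standalone sketch. It is not code from the PR: the class name, the `oldCheck`/`newCheck` helpers, and the doubling `nextCapacity` are hypothetical, and the `MAX_CAPACITY` value is an assumption that should be checked against `BytesToBytesMap`. It only compares the two conditions from the diff at capacities near the cap.

```java
// Standalone sketch; does not depend on Spark. All names here are illustrative.
class GrowthCheckSketch {
    // Assumed value for illustration; verify against the BytesToBytesMap source.
    static final long MAX_CAPACITY = 1L << 29;

    // Condition before the patch: compares the array size with MAX_CAPACITY.
    static boolean oldCheck(long arraySize) {
        return arraySize < MAX_CAPACITY;
    }

    // Condition in the patch: compares the capacity (array size / 2) with MAX_CAPACITY.
    static boolean newCheck(long arraySize) {
        return arraySize / 2 < MAX_CAPACITY;
    }

    // Hypothetical doubling strategy, capped like the allocate(...) call quoted above.
    static long nextCapacity(long oldCapacity) {
        return Math.min(oldCapacity * 2, MAX_CAPACITY);
    }

    public static void main(String[] args) {
        long[] capacities = {MAX_CAPACITY / 4, MAX_CAPACITY / 2, MAX_CAPACITY};
        for (long capacity : capacities) {
            long arraySize = capacity * 2; // two array entries per key
            System.out.println("capacity=" + capacity
                + " oldCheck=" + oldCheck(arraySize)
                + " newCheck=" + newCheck(arraySize)
                + " nextCapacity=" + nextCapacity(capacity));
        }
    }
}
```

Under these assumptions, the two conditions differ only once the capacity reaches `MAX_CAPACITY / 2`: the old check stops growth there, while the new one allows one more step up to `MAX_CAPACITY`. Tests at exactly those capacities would pin down the intended behavior.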