You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2022/04/06 15:07:20 UTC

[GitHub] [cassandra] belliottsmith commented on a diff in pull request #1462: CASSANDRA-15510 (4.0): Optimise BTree.{build,update,transform}

belliottsmith commented on code in PR #1462:
URL: https://github.com/apache/cassandra/pull/1462#discussion_r844064713


##########
src/java/org/apache/cassandra/utils/btree/BTree.java:
##########
@@ -219,103 +214,92 @@ public static Dir desc(boolean desc)
         if (height == 2)
         {
             // we use the _exact same logic_ as below, only we invoke buildLeaf
-            // denseCount: subtract the minimal (sparse) size of the tree and
-            // divide the remainder by the amount extra needed extra per dense
-            int denseCount = (size - (childCount * MIN_KEYS + keyCount + 1)) / (1+MIN_KEYS);
+            int remaining = size;
+            int threshold = MAX_KEYS + 1 + MIN_KEYS;
             int i = 0;
-            while (i < denseCount)
-            {
-                sizeMap[i] = i * (1+MAX_KEYS) + MAX_KEYS;
-                branch[keyCount + i] = buildLeafWithoutSizeTracking(source, MAX_KEYS, updateF);
-                branch[i++] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
-            }
-            int sparseCount = childCount - (1 + denseCount);
-            int remainderSize = size - (denseCount * (1+MAX_KEYS) + sparseCount * (1+MIN_KEYS));
-            {
-                branch[keyCount + i] = buildLeafWithoutSizeTracking(source, remainderSize, updateF);
-                sizeMap[i] = size = i * (1+MAX_KEYS) + remainderSize;
-            }
-            // finally build "sparse" children to consume the remaining elements
-            while (i < keyCount)
+            while (remaining >= threshold)
             {
-                branch[i++] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
-                branch[keyCount + i] = buildLeafWithoutSizeTracking(source, MIN_KEYS, updateF);
-                sizeMap[i] = size += 1 + MIN_KEYS;
+                branch[keyCount + i] = buildLeaf(source, MAX_KEYS, updateF);
+                branch[i] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
+                remaining -= MAX_KEYS + 1;
+                sizeMap[i++] = size - remaining - 1;
             }
-            if (!isSimple(updateF))
+            if (remaining > MAX_KEYS)
             {
-                updateF.onAllocated(denseCount * PERFECT_DENSE_SIZE_ON_HEAP[0]
-                                    + sparseCount * PERFECT_SPARSE_SIZE_ON_HEAP[0]
-                                    + ObjectSizes.sizeOfReferenceArray(remainderSize | 1));
+                int childSize = remaining / 2;
+                branch[keyCount + i] = buildLeaf(source, childSize, updateF);
+                branch[i] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
+                remaining -= childSize + 1;
+                sizeMap[i++] = size - remaining - 1;
             }
+            branch[keyCount + i] = buildLeaf(source, remaining, updateF);
+            sizeMap[i++] = size;
+            assert i == childCount;
         }
         else
         {
             --height;
-
-            // denseCount: subtract the minimal (sparse) size of the tree and
-            // divide the remainder by the amount extra needed extra per dense
-            int denseCount = (size - (childCount * sparseSize(height) + keyCount + 1)) / (denseSize(height) - sparseSize(height));
+            int denseChildSize = denseSize(height);
+            int denseGrandChildSize = denseSize(height - 1);
+            // The threshold is the point after which we can't add a dense child and still add another child with
+            // at least MIN_KEYS fully dense children plus at least one more key.
+            int threshold = denseChildSize + 1 + MIN_KEYS * (denseGrandChildSize + 1);
+            int remaining = size;
             int i = 0;
-            while (i < denseCount)
-            {
-                sizeMap[i] = (1 + i) * (1 + denseSize(height)) - 1;
-                branch[keyCount + i] = buildPerfectDense(source, height, updateF);
-                branch[i++] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
-            }
-            int sparseCount = childCount - (1 + denseCount);
-            int remainderSize = size - (denseCount * (1+denseSize(height)) + sparseCount * (1+sparseSize(height)));
+            // Add dense children until we reach the threshold.
+            while (remaining >= threshold)
             {
-                // need to decide our number of children either based on denseGrandChildSize,
-                // or the least number of children we can use, if we want to be maximally dense
-                int grandChildCount = remainderSize < denseSize(height)/2
-                                      ? MIN_KEYS + 1
-                                      : remainderSize / (denseSize(height-1)+1) + 1;
-
-                branch[keyCount + i] = buildMaximallyDense(source, grandChildCount, remainderSize, height, updateF);
-                sizeMap[i] = size = (1 + denseSize(height)) * i + remainderSize;
+                branch[keyCount + i] = buildPerfectDenseAndAddUsage(source, height, updateF);
+                branch[i] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
+                remaining -= denseChildSize + 1;
+                sizeMap[i++] = size - remaining - 1;
             }
-            // finally build sparse children to consume the remaining elements
-            while (i < keyCount)
+            // If the remainder does not fit one child, split it roughly in half, where the first child only has dense
+            // grandchildren.
+            if (remaining > denseChildSize)
             {
-                branch[i++] = isSimple(updateF) ? source.next() : updateF.apply(source.next());
-                branch[keyCount + i] = buildPerfectSparse(source, height, updateF);
-                sizeMap[i] = size += 1 + sparseSize(height);
-            }
-            if (!isSimple(updateF))
-            {
-                // accounts only for the perfect allocations; recursive calls of buildMaximallyDense will account for the remainder
-                updateF.onAllocated(denseCount * PERFECT_DENSE_SIZE_ON_HEAP[height - 1]
-                                    + sparseCount * PERFECT_SPARSE_SIZE_ON_HEAP[height - 1]);
+                int grandChildCount = remaining / ((denseGrandChildSize + 1) * 2);
+                assert grandChildCount >= MIN_KEYS + 1;
+                int childSize = grandChildCount * (denseGrandChildSize + 1) - 1;
+                branch[keyCount + i] = buildMaximallyDense(source, grandChildCount, childSize, height, updateF);

Review Comment:
   could we use `buildPerfectDenseAndAddUsage` here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org