You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2021/06/08 23:11:25 UTC

[GitHub] [geode] DonalEvans opened a new pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

DonalEvans opened a new pull request #6590:
URL: https://github.com/apache/geode/pull/6590


    - Extend Object2ObjectOpenCustomHashMap to allow direct access to the
    current capacity of the backing arrays
    - Use ReflectionSingleObjectSizer to calculate per-element overhead, as
    the previous implementation in SizeableObjectOpenCustomHashSet was
    effectively just reimplementing it
    - Calculate the overhead associated with resizing the backing arrays
    - Update tests to reflect the new behaviour
   
   Authored-by: Donal Evans <do...@vmware.com>
   
   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [x] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
   
   - [x] Is your initial contribution a single, squashed commit?
   
   - [x] Does `gradlew build` run cleanly?
   
   - [x] Have you written or updated unit tests to verify your changes?
   
   - [N/A] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, check Concourse for build issues and
   submit an update to your PR as soon as possible. If you need help, please send an
   email to dev@geode.apache.org.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r650208396



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -183,6 +192,50 @@ private int hash(K key) {
     return mix(strategy.hashCode(key)) & mask;
   }
 
+  @Override
+  public V put(K k, V v) {
+    V oldValue = super.put(k, v);
+    if (oldValue == null) {
+      // A create
+      arrayContentsOverhead += (int) (elementSizer.sizeof(k) + elementSizer.sizeof(v));
+    } else {
+      // An update
+      arrayContentsOverhead += (int) (elementSizer.sizeof(v) - elementSizer.sizeof(oldValue));
+    }
+    return oldValue;
+  }
+
+  @Override
+  public V remove(Object k) {
+    V oldValue = super.remove(k);
+    if (oldValue != null) {
+      arrayContentsOverhead -= (elementSizer.sizeof(k) + elementSizer.sizeof(oldValue));
+    }
+    return oldValue;
+  }
+
+  @Override
+  public int getSizeInBytes() {
+    return arrayContentsOverhead + calculateBackingArraysOverhead();
+  }
+
+  @VisibleForTesting
+  int calculateBackingArraysOverhead() {
+    // This formula determined experimentally using tests.
+    return BACKING_ARRAY_OVERHEAD_CONSTANT
+        + BACKING_ARRAY_LENGTH_COEFFICIENT * (key.length + value.length);

Review comment:
       you could use getTotalBackingArrayLength() here




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r649576384



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -29,52 +33,58 @@
  * The scan method provides the same guarantees as Redis's HSCAN, and in fact
  * uses the same algorithm.
  */
-public class Object2ObjectOpenCustomHashMapWithCursor<K, V>
-    extends Object2ObjectOpenCustomHashMap<K, V> {
+public class SizeableObject2ObjectOpenCustomHashMapWithCursor<K, V>
+    extends Object2ObjectOpenCustomHashMap<K, V> implements Sizeable {
+
+  private static final long serialVersionUID = 9079713776660851891L;
+  public static final int BACKING_ARRAY_OVERHEAD_CONSTANT = 136;
+  public static final int BACKING_ARRAY_LENGTH_COEFFICIENT = 4;
+  private static final ReflectionSingleObjectSizer elementSizer =
+      ReflectionSingleObjectSizer.getInstance();
 
+  private int keysOverhead;

Review comment:
       The length of the arrays isn't updated every time an element is added to the collection, only when there are sufficient elements to fill a percentage of it determined by the load factor, which then forces a resize of the backing arrays, so it's necessary to check the length of the arrays to determine the current overhead.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r650203570



##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObjectOpenCustomHashSetTest.java
##########
@@ -99,13 +50,15 @@ public void backingArrayOverheadCalculationTest() {
 
       // Calculate the overhead associated only with the backing array
       backingArrayOverhead = sizer.sizeof(set) - set.getMemberOverhead();
+      int expected = BACKING_ARRAY_OVERHEAD_CONSTANT
+          + BACKING_ARRAY_LENGTH_COEFFICIENT * set.getBackingArrayLength();
       softly.assertThat(backingArrayOverhead)

Review comment:
       May I ask why? For this test in particular, the use of soft assertions is helpful in determining in one run what's going wrong with the sizing, as if we just failed on the first assertion, it wouldn't be clear if the size was off by a constant amount or if there was a compounding error in the calculation.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObjectOpenCustomHashSet.java
##########
@@ -14,23 +14,23 @@
  */
 package org.apache.geode.redis.internal.collections;
 
-import static org.apache.geode.internal.size.ReflectionSingleObjectSizer.roundUpSize;
-
 import java.util.Collection;
 import java.util.Iterator;
 
 import it.unimi.dsi.fastutil.objects.ObjectCollection;
 import it.unimi.dsi.fastutil.objects.ObjectOpenCustomHashSet;
 
 import org.apache.geode.annotations.VisibleForTesting;
+import org.apache.geode.internal.size.ReflectionSingleObjectSizer;
 import org.apache.geode.internal.size.Sizeable;
 
 public class SizeableObjectOpenCustomHashSet<K> extends ObjectOpenCustomHashSet<K>
     implements Sizeable {
   private static final long serialVersionUID = 9174920505089089517L;
-  public static final int MEMBER_OVERHEAD_CONSTANT = 16;
   public static final int BACKING_ARRAY_OVERHEAD_CONSTANT = 92;
   public static final int BACKING_ARRAY_LENGTH_COEFFICIENT = 4;
+  private static final ReflectionSingleObjectSizer elementSizer =
+      ReflectionSingleObjectSizer.getInstance();

Review comment:
       Done!

##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursorTest.java
##########
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.redis.internal.collections;
+
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_LENGTH_COEFFICIENT;
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_OVERHEAD_CONSTANT;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Random;
+import java.util.stream.IntStream;
+
+import it.unimi.dsi.fastutil.Hash;
+import it.unimi.dsi.fastutil.bytes.ByteArrays;
+import org.assertj.core.api.SoftAssertions;
+import org.junit.Test;
+
+import org.apache.geode.internal.size.ReflectionObjectSizer;
+
+public class SizeableObject2ObjectOpenCustomHashMapWithCursorTest {
+  private final ReflectionObjectSizer sizer = ReflectionObjectSizer.getInstance();

Review comment:
       The `ReflectionObjectSizer` is a pre-existing singleton class that doesn't allow instantiation. Removing/refactoring that is outside the scope of this PR, so I'll leave this as is (but change the type to the `ObjectSizer` interface).

##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursorTest.java
##########
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.redis.internal.collections;
+
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_LENGTH_COEFFICIENT;
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_OVERHEAD_CONSTANT;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Random;
+import java.util.stream.IntStream;
+
+import it.unimi.dsi.fastutil.Hash;
+import it.unimi.dsi.fastutil.bytes.ByteArrays;
+import org.assertj.core.api.SoftAssertions;
+import org.junit.Test;
+
+import org.apache.geode.internal.size.ReflectionObjectSizer;
+
+public class SizeableObject2ObjectOpenCustomHashMapWithCursorTest {
+  private final ReflectionObjectSizer sizer = ReflectionObjectSizer.getInstance();
+
+  private static final Hash.Strategy<Integer> NATURAL_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o.hashCode();
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  private static Hash.Strategy<Integer> COLLIDING_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o % 5;
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  @Test
+  public void scanEntireMap_ReturnsExpectedElements() {
+    SizeableObject2ObjectOpenCustomHashMapWithCursor<Integer, String> map =
+        new SizeableObject2ObjectOpenCustomHashMapWithCursor<>(NATURAL_HASH);
+    IntStream.range(0, 10).forEach(i -> map.put(i, "value-" + i));
+
+    HashMap<Integer, String> scanned = new HashMap<>();

Review comment:
       Thanks for catching this. This code was just moved, not actually written for this PR, but I'll clean up the class while I'm here.

##########
File path: geode-core/src/main/java/org/apache/geode/internal/size/ReflectionSingleObjectSizer.java
##########
@@ -179,5 +181,8 @@ private static int sizeType(Class<?> t) {
     }
   }
 
+  public static ReflectionSingleObjectSizer getInstance() {
+    return INSTANCE;
+  }

Review comment:
       Done!

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -183,6 +192,50 @@ private int hash(K key) {
     return mix(strategy.hashCode(key)) & mask;
   }
 
+  @Override
+  public V put(K k, V v) {
+    V oldValue = super.put(k, v);
+    if (oldValue == null) {
+      // A create
+      arrayContentsOverhead += (int) (elementSizer.sizeof(k) + elementSizer.sizeof(v));
+    } else {
+      // An update
+      arrayContentsOverhead += (int) (elementSizer.sizeof(v) - elementSizer.sizeof(oldValue));
+    }
+    return oldValue;
+  }
+
+  @Override
+  public V remove(Object k) {
+    V oldValue = super.remove(k);
+    if (oldValue != null) {
+      arrayContentsOverhead -= (elementSizer.sizeof(k) + elementSizer.sizeof(oldValue));
+    }
+    return oldValue;
+  }
+
+  @Override
+  public int getSizeInBytes() {
+    return arrayContentsOverhead + calculateBackingArraysOverhead();
+  }
+
+  @VisibleForTesting
+  int calculateBackingArraysOverhead() {
+    // This formula determined experimentally using tests.
+    return BACKING_ARRAY_OVERHEAD_CONSTANT
+        + BACKING_ARRAY_LENGTH_COEFFICIENT * (key.length + value.length);

Review comment:
       Done!

##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursorTest.java
##########
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.redis.internal.collections;
+
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_LENGTH_COEFFICIENT;
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_OVERHEAD_CONSTANT;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Random;
+import java.util.stream.IntStream;
+
+import it.unimi.dsi.fastutil.Hash;
+import it.unimi.dsi.fastutil.bytes.ByteArrays;
+import org.assertj.core.api.SoftAssertions;
+import org.junit.Test;
+
+import org.apache.geode.internal.size.ReflectionObjectSizer;
+
+public class SizeableObject2ObjectOpenCustomHashMapWithCursorTest {
+  private final ReflectionObjectSizer sizer = ReflectionObjectSizer.getInstance();
+
+  private static final Hash.Strategy<Integer> NATURAL_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o.hashCode();
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  private static Hash.Strategy<Integer> COLLIDING_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o % 5;
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  @Test
+  public void scanEntireMap_ReturnsExpectedElements() {
+    SizeableObject2ObjectOpenCustomHashMapWithCursor<Integer, String> map =
+        new SizeableObject2ObjectOpenCustomHashMapWithCursor<>(NATURAL_HASH);
+    IntStream.range(0, 10).forEach(i -> map.put(i, "value-" + i));
+
+    HashMap<Integer, String> scanned = new HashMap<>();
+    int result = map.scan(0, 10000, HashMap::put, scanned);
+    assertThat(result).isEqualTo(0);
+    assertThat(scanned).isEqualTo(map);
+  }
+
+  @Test
+  public void twoScansWithNoModifications_ReturnsExpectedElements() {
+    SizeableObject2ObjectOpenCustomHashMapWithCursor<Integer, String> map =
+        new SizeableObject2ObjectOpenCustomHashMapWithCursor<>(NATURAL_HASH);
+    IntStream.range(0, 10).forEach(i -> map.put(i, "value-" + i));
+
+    HashMap<Integer, String> scanned = new HashMap<>();
+    int cursor = map.scan(0, 3, HashMap::put, scanned);
+    assertThat(scanned).hasSize(3);
+    cursor = map.scan(cursor, 3, HashMap::put, scanned);
+    assertThat(scanned).hasSize(6);
+    cursor = map.scan(cursor, 4, HashMap::put, scanned);
+    assertThat(scanned).hasSize(10);
+    cursor = map.scan(cursor, 4, HashMap::put, scanned);
+    assertThat(scanned).hasSize(10);
+    assertThat(cursor).isEqualTo(0);
+
+    assertThat(scanned).isEqualTo(map);
+  }

Review comment:
       I didn't write this test so I could be misunderstanding what it's trying to do, but I don't think it's possible to have a separate test for each assertion, as the test is showing that successive calls to scan behave as expected. I've refactored the test slightly to simplify it to just two calls.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] kirklund commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
kirklund commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r650181203



##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursorTest.java
##########
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.redis.internal.collections;
+
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_LENGTH_COEFFICIENT;
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_OVERHEAD_CONSTANT;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Random;
+import java.util.stream.IntStream;
+
+import it.unimi.dsi.fastutil.Hash;
+import it.unimi.dsi.fastutil.bytes.ByteArrays;
+import org.assertj.core.api.SoftAssertions;
+import org.junit.Test;
+
+import org.apache.geode.internal.size.ReflectionObjectSizer;
+
+public class SizeableObject2ObjectOpenCustomHashMapWithCursorTest {
+  private final ReflectionObjectSizer sizer = ReflectionObjectSizer.getInstance();

Review comment:
       Another singleton. Since this is a test, just instantiate it.

##########
File path: geode-core/src/main/java/org/apache/geode/internal/size/ReflectionSingleObjectSizer.java
##########
@@ -179,5 +181,8 @@ private static int sizeType(Class<?> t) {
     }
   }
 
+  public static ReflectionSingleObjectSizer getInstance() {
+    return INSTANCE;
+  }

Review comment:
       Please don't add any new singletons into the code base. We want to remove all of the one that are already there. It's better to have classes that are dependent on this class create an instance. If you need to share an instance across many dependent instances, then have a factory create an instance of ReflectionSingleObjectSizer to pass into all of the dependents.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObjectOpenCustomHashSet.java
##########
@@ -14,23 +14,23 @@
  */
 package org.apache.geode.redis.internal.collections;
 
-import static org.apache.geode.internal.size.ReflectionSingleObjectSizer.roundUpSize;
-
 import java.util.Collection;
 import java.util.Iterator;
 
 import it.unimi.dsi.fastutil.objects.ObjectCollection;
 import it.unimi.dsi.fastutil.objects.ObjectOpenCustomHashSet;
 
 import org.apache.geode.annotations.VisibleForTesting;
+import org.apache.geode.internal.size.ReflectionSingleObjectSizer;
 import org.apache.geode.internal.size.Sizeable;
 
 public class SizeableObjectOpenCustomHashSet<K> extends ObjectOpenCustomHashSet<K>
     implements Sizeable {
   private static final long serialVersionUID = 9174920505089089517L;
-  public static final int MEMBER_OVERHEAD_CONSTANT = 16;
   public static final int BACKING_ARRAY_OVERHEAD_CONSTANT = 92;
   public static final int BACKING_ARRAY_LENGTH_COEFFICIENT = 4;
+  private static final ReflectionSingleObjectSizer elementSizer =
+      ReflectionSingleObjectSizer.getInstance();

Review comment:
       Unless you're creating large numbers of instances of SizeableObjectOpenCustomHashSet, I recommend avoiding singletons even here. Unless there's some reason that ReflectionSingleObjectSizer should NEVER have multiple instances, then just create an instance here.
   
   Also, can you type it as SingleObjectSizer? It would be so much better to depend only on interfaces instead of concrete implementation.

##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursorTest.java
##########
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.redis.internal.collections;
+
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_LENGTH_COEFFICIENT;
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_OVERHEAD_CONSTANT;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Random;
+import java.util.stream.IntStream;
+
+import it.unimi.dsi.fastutil.Hash;
+import it.unimi.dsi.fastutil.bytes.ByteArrays;
+import org.assertj.core.api.SoftAssertions;
+import org.junit.Test;
+
+import org.apache.geode.internal.size.ReflectionObjectSizer;
+
+public class SizeableObject2ObjectOpenCustomHashMapWithCursorTest {
+  private final ReflectionObjectSizer sizer = ReflectionObjectSizer.getInstance();
+
+  private static final Hash.Strategy<Integer> NATURAL_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o.hashCode();
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  private static Hash.Strategy<Integer> COLLIDING_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o % 5;
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  @Test
+  public void scanEntireMap_ReturnsExpectedElements() {
+    SizeableObject2ObjectOpenCustomHashMapWithCursor<Integer, String> map =
+        new SizeableObject2ObjectOpenCustomHashMapWithCursor<>(NATURAL_HASH);
+    IntStream.range(0, 10).forEach(i -> map.put(i, "value-" + i));
+
+    HashMap<Integer, String> scanned = new HashMap<>();
+    int result = map.scan(0, 10000, HashMap::put, scanned);
+    assertThat(result).isEqualTo(0);
+    assertThat(scanned).isEqualTo(map);
+  }
+
+  @Test
+  public void twoScansWithNoModifications_ReturnsExpectedElements() {
+    SizeableObject2ObjectOpenCustomHashMapWithCursor<Integer, String> map =
+        new SizeableObject2ObjectOpenCustomHashMapWithCursor<>(NATURAL_HASH);
+    IntStream.range(0, 10).forEach(i -> map.put(i, "value-" + i));
+
+    HashMap<Integer, String> scanned = new HashMap<>();
+    int cursor = map.scan(0, 3, HashMap::put, scanned);
+    assertThat(scanned).hasSize(3);
+    cursor = map.scan(cursor, 3, HashMap::put, scanned);
+    assertThat(scanned).hasSize(6);
+    cursor = map.scan(cursor, 4, HashMap::put, scanned);
+    assertThat(scanned).hasSize(10);
+    cursor = map.scan(cursor, 4, HashMap::put, scanned);
+    assertThat(scanned).hasSize(10);
+    assertThat(cursor).isEqualTo(0);
+
+    assertThat(scanned).isEqualTo(map);
+  }

Review comment:
       Since this is a unit test that runs very fast, it's much better to copy the test for each assertion and then just assert one thing in each test method.

##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObjectOpenCustomHashSetTest.java
##########
@@ -99,13 +50,15 @@ public void backingArrayOverheadCalculationTest() {
 
       // Calculate the overhead associated only with the backing array
       backingArrayOverhead = sizer.sizeof(set) - set.getMemberOverhead();
+      int expected = BACKING_ARRAY_OVERHEAD_CONSTANT
+          + BACKING_ARRAY_LENGTH_COEFFICIENT * set.getBackingArrayLength();
       softly.assertThat(backingArrayOverhead)

Review comment:
       Not a request for changes... Please steer developers away from using SoftAssertions. This is not a good practice in unit tests.

##########
File path: geode-apis-compatible-with-redis/src/test/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursorTest.java
##########
@@ -0,0 +1,349 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.redis.internal.collections;
+
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_LENGTH_COEFFICIENT;
+import static org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor.BACKING_ARRAY_OVERHEAD_CONSTANT;
+import static org.assertj.core.api.Assertions.assertThat;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Random;
+import java.util.stream.IntStream;
+
+import it.unimi.dsi.fastutil.Hash;
+import it.unimi.dsi.fastutil.bytes.ByteArrays;
+import org.assertj.core.api.SoftAssertions;
+import org.junit.Test;
+
+import org.apache.geode.internal.size.ReflectionObjectSizer;
+
+public class SizeableObject2ObjectOpenCustomHashMapWithCursorTest {
+  private final ReflectionObjectSizer sizer = ReflectionObjectSizer.getInstance();
+
+  private static final Hash.Strategy<Integer> NATURAL_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o.hashCode();
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  private static Hash.Strategy<Integer> COLLIDING_HASH = new Hash.Strategy<Integer>() {
+    @Override
+    public int hashCode(Integer o) {
+      return o % 5;
+    }
+
+    @Override
+    public boolean equals(Integer a, Integer b) {
+      return a.equals(b);
+    }
+  };
+
+  @Test
+  public void scanEntireMap_ReturnsExpectedElements() {
+    SizeableObject2ObjectOpenCustomHashMapWithCursor<Integer, String> map =
+        new SizeableObject2ObjectOpenCustomHashMapWithCursor<>(NATURAL_HASH);
+    IntStream.range(0, 10).forEach(i -> map.put(i, "value-" + i));
+
+    HashMap<Integer, String> scanned = new HashMap<>();

Review comment:
       Let's stop using concrete implementations for type declarations. This should be:
   ```
   Map<Integer, String> scanned = new HashMap<>();
   ```
   Same for any other type declarations.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on pull request #6590:
URL: https://github.com/apache/geode/pull/6590#issuecomment-863434372


   Force pushing after rebasing on latest develop to resolve merge conflicts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r648533960



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -29,52 +33,58 @@
  * The scan method provides the same guarantees as Redis's HSCAN, and in fact
  * uses the same algorithm.
  */
-public class Object2ObjectOpenCustomHashMapWithCursor<K, V>
-    extends Object2ObjectOpenCustomHashMap<K, V> {
+public class SizeableObject2ObjectOpenCustomHashMapWithCursor<K, V>

Review comment:
       since our redis data classes always use the class with K=byte[] and V=byte[] it seems like you could make this class be SizeableByteArray2ByteArrayOpenCustomHashMapWithCursor extends Object2ObjectOpenCustomHashMap<byte[], byte[]>
   You could also do this for the HashSet class.
   Then in the size code instead of using the ReflectionSingleObjectSizer you can just compute the size of a byte[] which is very simple. ReflectionSingleObjectSizer has to do extra work to handle any type of array. It seems like this would give you fast sizing.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -29,52 +33,58 @@
  * The scan method provides the same guarantees as Redis's HSCAN, and in fact
  * uses the same algorithm.
  */
-public class Object2ObjectOpenCustomHashMapWithCursor<K, V>
-    extends Object2ObjectOpenCustomHashMap<K, V> {
+public class SizeableObject2ObjectOpenCustomHashMapWithCursor<K, V>
+    extends Object2ObjectOpenCustomHashMap<K, V> implements Sizeable {
+
+  private static final long serialVersionUID = 9079713776660851891L;
+  public static final int BACKING_ARRAY_OVERHEAD_CONSTANT = 136;
+  public static final int BACKING_ARRAY_LENGTH_COEFFICIENT = 4;
+  private static final ReflectionSingleObjectSizer elementSizer =
+      ReflectionSingleObjectSizer.getInstance();
 
+  private int keysOverhead;
+  private int valuesOverhead;

Review comment:
       I'm okay with having two overhead variables but I don't see any need for two. It looks like you could just have one "sizeInBytes" that tracks both keys and values. If you initialized it to BACKING_ARRAY_OVERHEAD_CONSTANT then the getSizeInBytes could just be "return sizeInBytes;"

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -29,52 +33,58 @@
  * The scan method provides the same guarantees as Redis's HSCAN, and in fact
  * uses the same algorithm.
  */
-public class Object2ObjectOpenCustomHashMapWithCursor<K, V>
-    extends Object2ObjectOpenCustomHashMap<K, V> {
+public class SizeableObject2ObjectOpenCustomHashMapWithCursor<K, V>
+    extends Object2ObjectOpenCustomHashMap<K, V> implements Sizeable {
+
+  private static final long serialVersionUID = 9079713776660851891L;
+  public static final int BACKING_ARRAY_OVERHEAD_CONSTANT = 136;
+  public static final int BACKING_ARRAY_LENGTH_COEFFICIENT = 4;
+  private static final ReflectionSingleObjectSizer elementSizer =
+      ReflectionSingleObjectSizer.getInstance();
 
+  private int keysOverhead;

Review comment:
       Currently these "Overhead" ints only include the size of the byte arrays. But if every time you changed the Overhead for a byte array you also included the BACKING_ARRAY_LENGTH_COEFFICIENT (by incing it or decing it) then you would not need to consult the length the arrays later when you compute the total size. Seems like it would make it a little faster.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r649584597



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -29,52 +33,58 @@
  * The scan method provides the same guarantees as Redis's HSCAN, and in fact
  * uses the same algorithm.
  */
-public class Object2ObjectOpenCustomHashMapWithCursor<K, V>
-    extends Object2ObjectOpenCustomHashMap<K, V> {
+public class SizeableObject2ObjectOpenCustomHashMapWithCursor<K, V>
+    extends Object2ObjectOpenCustomHashMap<K, V> implements Sizeable {
+
+  private static final long serialVersionUID = 9079713776660851891L;
+  public static final int BACKING_ARRAY_OVERHEAD_CONSTANT = 136;
+  public static final int BACKING_ARRAY_LENGTH_COEFFICIENT = 4;
+  private static final ReflectionSingleObjectSizer elementSizer =
+      ReflectionSingleObjectSizer.getInstance();
 
+  private int keysOverhead;
+  private int valuesOverhead;

Review comment:
       Good call, I've combined the two.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] lgtm-com[bot] commented on pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #6590:
URL: https://github.com/apache/geode/pull/6590#issuecomment-857266790


   This pull request **introduces 4 alerts** when merging 77975f3036f34965dd7af896ac21161075d79288 into 3aa6553c9a5ac382b58789856ebf0f46ed8151cf - [view on LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-afd36f55aa957bd68f3a9d1abb22214223867b59)
   
   **new alerts:**
   
   * 4 for Implicit narrowing conversion in compound assignment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans merged pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
DonalEvans merged pull request #6590:
URL: https://github.com/apache/geode/pull/6590


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6590: GEODE-9301: Implement accurate sizeable for RedisHash

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6590:
URL: https://github.com/apache/geode/pull/6590#discussion_r649583861



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -29,52 +33,58 @@
  * The scan method provides the same guarantees as Redis's HSCAN, and in fact
  * uses the same algorithm.
  */
-public class Object2ObjectOpenCustomHashMapWithCursor<K, V>
-    extends Object2ObjectOpenCustomHashMap<K, V> {
+public class SizeableObject2ObjectOpenCustomHashMapWithCursor<K, V>

Review comment:
       We currently have [a PR up](https://github.com/apache/geode/pull/6548) that uses a non-byte[] value in the CustomHashMap implementation, so I don't want to conflict with that by changing the underlying data structure.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org