You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2021/07/15 20:10:20 UTC
[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670756392



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -258,6 +267,7 @@ public boolean remove(Object o) {
     }
 
     x = deleteNode(x);
+    updateSizeInBytes(x.key, false);

Review comment:
       I think these calls would be a bit easier to understand if the looked like this: incSizeInBytes(element) and decSizeInBytes(element). Those two methods could call a common method getElementSize(element) if needed. 

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -46,11 +49,15 @@
  * @author Rodion "rodde" Efremov
  * @version 1.6 (Feb 11, 2016)
  */
-public class OrderStatisticsTree<E extends Comparable<? super E>>
+public class SizableOrderStatisticsTree<E extends Comparable<? super E>>

Review comment:
       What was the reason for renaming this class? I noticed you did not rename the interface it implements, OrderedStatisticsSet, to SizableOrderedStatisticsSet so I was wondering if this needed to be renamed.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -34,45 +35,44 @@
 import java.util.Objects;
 
 import it.unimi.dsi.fastutil.bytes.ByteArrays;
-import it.unimi.dsi.fastutil.objects.Object2ObjectOpenCustomHashMap;
 
 import org.apache.geode.annotations.VisibleForTesting;
 import org.apache.geode.cache.Region;
 import org.apache.geode.internal.InternalDataSerializer;
 import org.apache.geode.internal.serialization.DeserializationContext;
 import org.apache.geode.internal.serialization.KnownVersion;
 import org.apache.geode.internal.serialization.SerializationContext;
+import org.apache.geode.internal.size.Sizeable;
 import org.apache.geode.redis.internal.RedisConstants;
 import org.apache.geode.redis.internal.collections.OrderStatisticsSet;
-import org.apache.geode.redis.internal.collections.OrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizableOrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor;
 import org.apache.geode.redis.internal.delta.AddsDeltaInfo;
 import org.apache.geode.redis.internal.delta.DeltaInfo;
 import org.apache.geode.redis.internal.delta.RemsDeltaInfo;
 import org.apache.geode.redis.internal.executor.sortedset.SortedSetRangeOptions;
 import org.apache.geode.redis.internal.executor.sortedset.ZAddOptions;
 
 public class RedisSortedSet extends AbstractRedisData {
-  private Object2ObjectOpenCustomHashMap<byte[], OrderedSetEntry> members;
+  private SizeableObject2ObjectOpenCustomHashMapWithCursor<byte[], OrderedSetEntry> members;
   private OrderStatisticsSet<AbstractOrderedSetEntry> scoreSet;
+  // This field is used to keep track of the size associated with objects that are referenced by
+  // both backing collections, since they will be counted twice otherwise
+  private int sizeInBytesAdjustment = 0;
 
-  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 184;
-  protected static final int PER_PAIR_OVERHEAD = 48;
-
-  private int sizeInBytes = BASE_REDIS_SORTED_SET_OVERHEAD;
+  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 40;

Review comment:
       It seems like comments on these OVERHEAD fields would be helpful. They should describe what fields were used calculate the overhead. Then in the future if someone changes a field or removes one or adds one they could figure out how to adjust these constants.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -731,6 +741,25 @@ private int count(Node<E> node) {
     return leftTreeSize + 1 + rightTreeSize;
   }
 
+  void updateSizeInBytes(E element, boolean isAdd) {
+    int sizeChange;
+    if (element instanceof Sizeable) {

Review comment:
       Would it be worth having our own SizableSizer that would do this instanceof check and then delegate to a ReflectionSingleObjectSizer if it is not a Sizable? It would clean this call up a bit. I don't know if we have this same pattern in other places.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -171,20 +170,24 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
     OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.put(memberToAdd, newEntry);
+    if (existingEntry == null) {
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
+      scoreSet.remove(existingEntry);
       scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      // When updating an entry, the references to the member name array in the backing collections

Review comment:
       It seems to me like we should in this update case canonicalize the memberToAdd byte array so that we reuse the one already stored in the members map instead of this new one. I think you could do this by doing a OrderedSetEntry existingEntry = members.get(memberToAdd) and if it returns non-null then reading existingEntry.getMember() and using that to create the newEntry. If OrderedSetEntry is always read while synchronized then you could make it mutable and in that case you wouldn't need the update to create a new entry each time. I think you would need to do the scoreSet.remove(existingEntry) then mutate existingEntry and then scoreSet.add(existingEntry). This would prevent some gc churn.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org