You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@geode.apache.org by GitBox <gi...@apache.org> on 2021/07/14 22:20:34 UTC

[GitHub] [geode] DonalEvans opened a new pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

DonalEvans opened a new pull request #6699:
URL: https://github.com/apache/geode/pull/6699


    - WIP
   
   Authored-by: Donal Evans <do...@vmware.com>
   
   Thank you for submitting a contribution to Apache Geode.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   ### Note:
   Please ensure that once the PR is submitted, check Concourse for build issues and
   submit an update to your PR as soon as possible. If you need help, please send an
   email to dev@geode.apache.org.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670865597



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -731,6 +741,25 @@ private int count(Node<E> node) {
     return leftTreeSize + 1 + rightTreeSize;
   }
 
+  void updateSizeInBytes(E element, boolean isAdd) {
+    int sizeChange;
+    if (element instanceof Sizeable) {

Review comment:
       I don't think we do this anywhere else, but I did it here to avoid having to use ReflectionSingleObjectSizer for objects that are already keeping track of their size, since that would reduce overhead (I think). Your suggestion is a good one though, so I'll implement it here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r674205274



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -173,20 +176,23 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
-    OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.get(memberToAdd);
+    if (existingEntry == null) {
+      OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
+      members.put(memberToAdd, newEntry);
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
-      scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      scoreSet.remove(existingEntry);
+      byte[] oldScore = existingEntry.scoreBytes;
+      existingEntry.updateScore(stripTrailingZeroFromDouble(scoreToAdd));
+      members.put(memberToAdd, existingEntry);

Review comment:
       you might want to add a note to that new ticket that this extra members.put call can be removed once this modify no longer changes the size because we are just updating the score stored as a "double".




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r673410063



##########
File path: geode-core/src/main/java/org/apache/geode/internal/size/SizeableObjectSizer.java
##########
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.internal.size;
+
+/**
+ * A sizer that allows for efficient sizing of objects that implement the {@link Sizeable}
+ * interface, and delegates to {@link ReflectionSingleObjectSizer} otherwise.
+ */
+public class SizeableObjectSizer implements SingleObjectSizer {
+
+  private final SingleObjectSizer sizer = new ReflectionSingleObjectSizer();
+
+  @Override
+  public long sizeof(Object object) {

Review comment:
       The `SingleObjectSizer` interface that `SizeableObjectSizer` implements returns a long from its `sizeof()` method, so returning an int here would mean either breaking the inheritance relationship between the two or changing `SingleObjectSizer` to return int instead of long, which will also affect `CachingSingleObjectSizer`, `InstrumentationSingleObjectSizer` and `ReflectionSingleObjectSizer`, with further knock-on effects that are outside the scope of this PR. As a compromise, would it be acceptable to leave the return type as long, but call `Coder.narrowLongToInt()` on the returned value rather than casting it? This will prevent integer overflow in the size, but avoid the need to do refactoring outside the redis module.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r674378002



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -173,20 +176,23 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
-    OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.get(memberToAdd);
+    if (existingEntry == null) {
+      OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
+      members.put(memberToAdd, newEntry);
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
-      scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      scoreSet.remove(existingEntry);
+      byte[] oldScore = existingEntry.scoreBytes;
+      existingEntry.updateScore(stripTrailingZeroFromDouble(scoreToAdd));
+      members.put(memberToAdd, existingEntry);

Review comment:
       Filed GEODE-9446 to track the refactoring work.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670868944



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -34,45 +35,44 @@
 import java.util.Objects;
 
 import it.unimi.dsi.fastutil.bytes.ByteArrays;
-import it.unimi.dsi.fastutil.objects.Object2ObjectOpenCustomHashMap;
 
 import org.apache.geode.annotations.VisibleForTesting;
 import org.apache.geode.cache.Region;
 import org.apache.geode.internal.InternalDataSerializer;
 import org.apache.geode.internal.serialization.DeserializationContext;
 import org.apache.geode.internal.serialization.KnownVersion;
 import org.apache.geode.internal.serialization.SerializationContext;
+import org.apache.geode.internal.size.Sizeable;
 import org.apache.geode.redis.internal.RedisConstants;
 import org.apache.geode.redis.internal.collections.OrderStatisticsSet;
-import org.apache.geode.redis.internal.collections.OrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizableOrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor;
 import org.apache.geode.redis.internal.delta.AddsDeltaInfo;
 import org.apache.geode.redis.internal.delta.DeltaInfo;
 import org.apache.geode.redis.internal.delta.RemsDeltaInfo;
 import org.apache.geode.redis.internal.executor.sortedset.SortedSetRangeOptions;
 import org.apache.geode.redis.internal.executor.sortedset.ZAddOptions;
 
 public class RedisSortedSet extends AbstractRedisData {
-  private Object2ObjectOpenCustomHashMap<byte[], OrderedSetEntry> members;
+  private SizeableObject2ObjectOpenCustomHashMapWithCursor<byte[], OrderedSetEntry> members;
   private OrderStatisticsSet<AbstractOrderedSetEntry> scoreSet;
+  // This field is used to keep track of the size associated with objects that are referenced by
+  // both backing collections, since they will be counted twice otherwise
+  private int sizeInBytesAdjustment = 0;
 
-  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 184;
-  protected static final int PER_PAIR_OVERHEAD = 48;
-
-  private int sizeInBytes = BASE_REDIS_SORTED_SET_OVERHEAD;
+  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 40;

Review comment:
       These constants are determined experimentally using the tests found in `RedisSortedSetTest`. I'll add a comment here pointing that out though, so people know which tests to run if they modify these classes.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r673358353



##########
File path: geode-core/src/main/java/org/apache/geode/internal/size/SizeableObjectSizer.java
##########
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.internal.size;
+
+/**
+ * A sizer that allows for efficient sizing of objects that implement the {@link Sizeable}
+ * interface, and delegates to {@link ReflectionSingleObjectSizer} otherwise.
+ */
+public class SizeableObjectSizer implements SingleObjectSizer {
+
+  private final SingleObjectSizer sizer = new ReflectionSingleObjectSizer();
+
+  @Override
+  public long sizeof(Object object) {

Review comment:
       I noticed you cast to "int" in the calls of this. Would it be better for this sizer to match Sizable and have sizeof return "int" instead of "long"? Instead of simply casting "sizer.sizeof" you could check if it is > Integer.MAX and if it is return Integer.MAX. I really doubt any objects will be larger than Integer.MAX but if so these seems better than a cast.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -173,20 +176,23 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
-    OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.get(memberToAdd);
+    if (existingEntry == null) {
+      OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
+      members.put(memberToAdd, newEntry);
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
-      scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      scoreSet.remove(existingEntry);
+      byte[] oldScore = existingEntry.scoreBytes;
+      existingEntry.updateScore(stripTrailingZeroFromDouble(scoreToAdd));
+      members.put(memberToAdd, existingEntry);

Review comment:
       why is this members.put needed when we modified an existing entry? Is it simply to update the size if the byte array form has a different length? We are using extra memory by storing both the byte array form and the double form. We would use less memory if we just stored it as a "double". In this case storing it in both forms uses more CPU since we need to do the extra put. What operations are being optimized by storing it in both forms?

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -226,6 +226,14 @@ int calculateBackingArraysOverhead() {
         + BACKING_ARRAY_LENGTH_COEFFICIENT * getTotalBackingArrayLength();
   }
 
+  private <E> int getElementSize(E element) {
+    if (element instanceof Sizeable) {

Review comment:
       could this class also use your new SizeableObjectSizer?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r674228640



##########
File path: geode-core/src/main/java/org/apache/geode/internal/size/SizeableObjectSizer.java
##########
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.internal.size;
+
+/**
+ * A sizer that allows for efficient sizing of objects that implement the {@link Sizeable}
+ * interface, and delegates to {@link ReflectionSingleObjectSizer} otherwise.
+ */
+public class SizeableObjectSizer implements SingleObjectSizer {
+
+  private final SingleObjectSizer sizer = new ReflectionSingleObjectSizer();
+
+  @Override
+  public long sizeof(Object object) {

Review comment:
       Yeah, that would work. I'll move the Sizer and not have it implement the interface.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r673417456



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -173,20 +176,23 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
-    OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.get(memberToAdd);
+    if (existingEntry == null) {
+      OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
+      members.put(memberToAdd, newEntry);
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
-      scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      scoreSet.remove(existingEntry);
+      byte[] oldScore = existingEntry.scoreBytes;
+      existingEntry.updateScore(stripTrailingZeroFromDouble(scoreToAdd));
+      members.put(memberToAdd, existingEntry);

Review comment:
       I think that the reason we store both the byte array and double is that some of the SortedSet commands (zadd and zrem) were implemented before we had an actual sorted set implementation, so were just using the member and score byte arrays in a `SizeableObject2ObjectOpenCustomHashMapWithCursor`. With that original implementation, it was faster to leave the scores in their byte array format at all times, but after new commands and the `OrderStatisticsSet` were added to RedisSortedSet, this is no longer the case.
   
   I think that the only time we need a byte array rather than a double is when we're sending the response, so we can eliminate the field from `AbstractOrderedStatisticsEntry` entirely and just change how we return from certain sorted set methods. However, this is a potentially large refactor that's not strictly within the scope of this PR, so I'd prefer to open a new ticket for it rather than do it as part of this one.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r673412283



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizeableObject2ObjectOpenCustomHashMapWithCursor.java
##########
@@ -226,6 +226,14 @@ int calculateBackingArraysOverhead() {
         + BACKING_ARRAY_LENGTH_COEFFICIENT * getTotalBackingArrayLength();
   }
 
+  private <E> int getElementSize(E element) {
+    if (element instanceof Sizeable) {

Review comment:
       Yes, good catch, thanks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670883384



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -171,20 +170,24 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
     OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.put(memberToAdd, newEntry);
+    if (existingEntry == null) {
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
+      scoreSet.remove(existingEntry);
       scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      // When updating an entry, the references to the member name array in the backing collections

Review comment:
       Ah, I had tried to find a solution for that when I was writing this, but I ran out of steam before I could figure one out. Yours works perfectly, thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] lgtm-com[bot] commented on pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on pull request #6699:
URL: https://github.com/apache/geode/pull/6699#issuecomment-881105321


   This pull request **introduces 1 alert** when merging e03991ca5198029a9329af31d764807b8daa886d into 970497912dec5af43605842195bc8a7dd1479851 - [view on LGTM.com](https://lgtm.com/projects/g/apache/geode/rev/pr-724f2644718168ee812716e561da9774538fa21c)
   
   **new alerts:**
   
   * 1 for Implicit narrowing conversion in compound assignment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r674204117



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -173,20 +176,23 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
-    OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.get(memberToAdd);
+    if (existingEntry == null) {
+      OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
+      members.put(memberToAdd, newEntry);
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
-      scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      scoreSet.remove(existingEntry);
+      byte[] oldScore = existingEntry.scoreBytes;
+      existingEntry.updateScore(stripTrailingZeroFromDouble(scoreToAdd));
+      members.put(memberToAdd, existingEntry);

Review comment:
       sounds good




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r674228039



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -173,20 +176,23 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
-    OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.get(memberToAdd);
+    if (existingEntry == null) {
+      OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
+      members.put(memberToAdd, newEntry);
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
-      scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      scoreSet.remove(existingEntry);
+      byte[] oldScore = existingEntry.scoreBytes;
+      existingEntry.updateScore(stripTrailingZeroFromDouble(scoreToAdd));
+      members.put(memberToAdd, existingEntry);

Review comment:
       Will do




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans merged pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans merged pull request #6699:
URL: https://github.com/apache/geode/pull/6699


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670756392



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -258,6 +267,7 @@ public boolean remove(Object o) {
     }
 
     x = deleteNode(x);
+    updateSizeInBytes(x.key, false);

Review comment:
       I think these calls would be a bit easier to understand if the looked like this: incSizeInBytes(element) and decSizeInBytes(element). Those two methods could call a common method getElementSize(element) if needed. 

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -46,11 +49,15 @@
  * @author Rodion "rodde" Efremov
  * @version 1.6 (Feb 11, 2016)
  */
-public class OrderStatisticsTree<E extends Comparable<? super E>>
+public class SizableOrderStatisticsTree<E extends Comparable<? super E>>

Review comment:
       What was the reason for renaming this class? I noticed you did not rename the interface it implements, OrderedStatisticsSet, to SizableOrderedStatisticsSet so I was wondering if this needed to be renamed.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -34,45 +35,44 @@
 import java.util.Objects;
 
 import it.unimi.dsi.fastutil.bytes.ByteArrays;
-import it.unimi.dsi.fastutil.objects.Object2ObjectOpenCustomHashMap;
 
 import org.apache.geode.annotations.VisibleForTesting;
 import org.apache.geode.cache.Region;
 import org.apache.geode.internal.InternalDataSerializer;
 import org.apache.geode.internal.serialization.DeserializationContext;
 import org.apache.geode.internal.serialization.KnownVersion;
 import org.apache.geode.internal.serialization.SerializationContext;
+import org.apache.geode.internal.size.Sizeable;
 import org.apache.geode.redis.internal.RedisConstants;
 import org.apache.geode.redis.internal.collections.OrderStatisticsSet;
-import org.apache.geode.redis.internal.collections.OrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizableOrderStatisticsTree;
+import org.apache.geode.redis.internal.collections.SizeableObject2ObjectOpenCustomHashMapWithCursor;
 import org.apache.geode.redis.internal.delta.AddsDeltaInfo;
 import org.apache.geode.redis.internal.delta.DeltaInfo;
 import org.apache.geode.redis.internal.delta.RemsDeltaInfo;
 import org.apache.geode.redis.internal.executor.sortedset.SortedSetRangeOptions;
 import org.apache.geode.redis.internal.executor.sortedset.ZAddOptions;
 
 public class RedisSortedSet extends AbstractRedisData {
-  private Object2ObjectOpenCustomHashMap<byte[], OrderedSetEntry> members;
+  private SizeableObject2ObjectOpenCustomHashMapWithCursor<byte[], OrderedSetEntry> members;
   private OrderStatisticsSet<AbstractOrderedSetEntry> scoreSet;
+  // This field is used to keep track of the size associated with objects that are referenced by
+  // both backing collections, since they will be counted twice otherwise
+  private int sizeInBytesAdjustment = 0;
 
-  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 184;
-  protected static final int PER_PAIR_OVERHEAD = 48;
-
-  private int sizeInBytes = BASE_REDIS_SORTED_SET_OVERHEAD;
+  protected static final int BASE_REDIS_SORTED_SET_OVERHEAD = 40;

Review comment:
       It seems like comments on these OVERHEAD fields would be helpful. They should describe what fields were used calculate the overhead. Then in the future if someone changes a field or removes one or adds one they could figure out how to adjust these constants.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -731,6 +741,25 @@ private int count(Node<E> node) {
     return leftTreeSize + 1 + rightTreeSize;
   }
 
+  void updateSizeInBytes(E element, boolean isAdd) {
+    int sizeChange;
+    if (element instanceof Sizeable) {

Review comment:
       Would it be worth having our own SizableSizer that would do this instanceof check and then delegate to a ReflectionSingleObjectSizer if it is not a Sizable? It would clean this call up a bit. I don't know if we have this same pattern in other places.

##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/data/RedisSortedSet.java
##########
@@ -171,20 +170,24 @@ public int getDSFID() {
   }
 
   protected synchronized byte[] memberAdd(byte[] memberToAdd, byte[] scoreToAdd) {
-    byte[] oldScore = null;
-
     OrderedSetEntry newEntry = new OrderedSetEntry(memberToAdd, scoreToAdd);
-    OrderedSetEntry orderedSetEntry = members.put(memberToAdd, newEntry);
-    if (orderedSetEntry == null) {
+    OrderedSetEntry existingEntry = members.put(memberToAdd, newEntry);
+    if (existingEntry == null) {
       scoreSet.add(newEntry);
-      sizeInBytes += calculateSizeOfFieldValuePair(memberToAdd, scoreToAdd);
+      // Without this adjustment, we count the entry and member name array twice, since references
+      // to them appear in both backing collections.
+      sizeInBytesAdjustment += newEntry.getSizeInBytes() + calculateByteArraySize(memberToAdd);
+      return null;
     } else {
-      scoreSet.remove(orderedSetEntry);
+      scoreSet.remove(existingEntry);
       scoreSet.add(newEntry);
-      oldScore = orderedSetEntry.getScoreBytes();
-      sizeInBytes += scoreToAdd.length - oldScore.length;
+      // When updating an entry, the references to the member name array in the backing collections

Review comment:
       It seems to me like we should in this update case canonicalize the memberToAdd byte array so that we reuse the one already stored in the members map instead of this new one. I think you could do this by doing a OrderedSetEntry existingEntry = members.get(memberToAdd) and if it returns non-null then reading existingEntry.getMember() and using that to create the newEntry. If OrderedSetEntry is always read while synchronized then you could make it mutable and in that case you wouldn't need the update to create a new entry each time. I think you would need to do the scoreSet.remove(existingEntry) then mutate existingEntry and then scoreSet.add(existingEntry). This would prevent some gc churn.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] dschneider-pivotal commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
dschneider-pivotal commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r674212507



##########
File path: geode-core/src/main/java/org/apache/geode/internal/size/SizeableObjectSizer.java
##########
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
+ * agreements. See the NOTICE file distributed with this work for additional information regarding
+ * copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License. You may obtain a
+ * copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software distributed under the License
+ * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+ * or implied. See the License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.geode.internal.size;
+
+/**
+ * A sizer that allows for efficient sizing of objects that implement the {@link Sizeable}
+ * interface, and delegates to {@link ReflectionSingleObjectSizer} otherwise.
+ */
+public class SizeableObjectSizer implements SingleObjectSizer {
+
+  private final SingleObjectSizer sizer = new ReflectionSingleObjectSizer();
+
+  @Override
+  public long sizeof(Object object) {

Review comment:
       Couldn't you just change the new SizeableObjectSizer to not implement the interface? It could be in the redis module instead of core. It is currently only used by redis. I think the classes that use it don't need it to be a SingleObjectSizer. I like the idea of it using Coder.narrowLongToInt but would prefer it to be done in one place instead of all the callers.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [geode] DonalEvans commented on a change in pull request #6699: GEODE-9304: Make RedisSortedSet measurement of bytes in use accurate

Posted by GitBox <gi...@apache.org>.
DonalEvans commented on a change in pull request #6699:
URL: https://github.com/apache/geode/pull/6699#discussion_r670863366



##########
File path: geode-apis-compatible-with-redis/src/main/java/org/apache/geode/redis/internal/collections/SizableOrderStatisticsTree.java
##########
@@ -46,11 +49,15 @@
  * @author Rodion "rodde" Efremov
  * @version 1.6 (Feb 11, 2016)
  */
-public class OrderStatisticsTree<E extends Comparable<? super E>>
+public class SizableOrderStatisticsTree<E extends Comparable<? super E>>

Review comment:
       It probably doesn't need to be renamed, since we don't have a non-sizeable implementation of OrderedStatisticsSet, so it doesn't make sense to distinguish it like this. I'll put the name back how it was.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@geode.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org