Posted to issues@geode.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/05/11 18:08:00 UTC

[jira] [Commented] (GEODE-9136) Redis: make RedisData implement Sizeable

    [ https://issues.apache.org/jira/browse/GEODE-9136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342783#comment-17342783 ] 

ASF subversion and git services commented on GEODE-9136:
--------------------------------------------------------

Commit 6a0eba25d5ed5cc7146ce6374d39dd12b22745f3 in geode's branch refs/heads/develop from Hale Bales
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=6a0eba2 ]

GEODE-9136: make RedisData implement Sizeable (#6296)

 - use Sizeable in RedisString, RedisSet, and RedisHash
 - add unit tests of bytes in use for all three classes
 - update memory overhead tests to reflect decrease in overhead
 - calculate estimated size in unit tests and use constants in the classes

RedisData should implement Sizeable to increase the accuracy of the estimate of bytes in use. The estimate needs to be close to the true size for rebalancing to work properly, but it does not need to be exact. These changes bring the estimated size for sets and hashes to within 5% of the size measured using reflection.

The accuracy changes as entries are added and as entries get larger, because of the way hashes, sets, and ByteArrayWrappers get resized. As the hash or set fills up, the estimate becomes more accurate. When it is full, the next entry added causes the backing structure to be resized, at which point the data accounts for only a portion of the structure's total size. We could have tracked those resizes to make the estimate even more accurate, but doing so would add computational overhead for every key added, and for every field at that key. We decided that the benefit does not outweigh the cost.
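The incremental-sizing idea described above can be sketched roughly as follows. This is NOT the actual Geode code; the constant values are made-up placeholders (the real ones are derived in RedisSetTest and RedisHashTest), and the class and field names here are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// A minimal sketch of keeping a running byte estimate, assuming a set of
// String members and invented overhead constants.
public class SizeableSetSketch {
  // Assumed overhead of an empty RedisSet plus its empty backing HashSet.
  static final int BASE_OVERHEAD = 100;
  // Assumed fixed per-member overhead (node object, headers, references).
  static final int PER_MEMBER_OVERHEAD = 50;

  private final Set<String> members = new HashSet<>();
  private int sizeInBytes = BASE_OVERHEAD;

  public boolean add(String member) {
    boolean added = members.add(member);
    if (added) {
      // Update the running total instead of re-walking the whole structure.
      sizeInBytes += PER_MEMBER_OVERHEAD + 2 * member.length(); // UTF-16 chars
    }
    return added;
  }

  public boolean remove(String member) {
    boolean removed = members.remove(member);
    if (removed) {
      sizeInBytes -= PER_MEMBER_OVERHEAD + 2 * member.length();
    }
    return removed;
  }

  // Analogous in spirit to Sizeable#getSizeInBytes(): an O(1) estimate,
  // not an exact measurement, which is what makes delta sizing fast.
  public int getSizeInBytes() {
    return sizeInBytes;
  }
}
```

The design trade-off is the one the commit message describes: the running total drifts from the true JVM size around backing-structure resizes, and that drift is accepted in exchange for constant-time updates.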

RedisHash and RedisSet both use constants to account for the overheads of:

 - an empty RedisHash/RedisSet
 - an empty HashMap/HashSet within the RedisHash/RedisSet
 - each member in the hash/set
 - (RedisHash only) the first value in the hash

We decided to use constants in the RedisData classes to reduce the number of calculations done every time a new key is added. The math used to derive those constants lives in the RedisSetTest and RedisHashTest classes. The tests of the constants will fail if the internal implementation of the classes changes: if the overheads decrease, update the constants to reflect that; if the overheads increase, think through the changes before adding to the overhead. Significant changes to RedisSet or RedisHash could break the math entirely, so be sure to check where the problem is, in case the calculations themselves need to change.
There is a magical +5 in RedisSetTest.perMemberOverheadConstant_shouldMatchCalculatedValue. We believe it comes from resizing, and 5 happens to work well enough, even though ideally it would be calculated dynamically. But as noted above, the benefit does not outweigh the cost.
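To illustrate the style of derivation the tests use, here is a hypothetical sketch of building a per-member overhead constant out of component sizes. Every byte value below is an assumption (e.g. a particular 64-bit JVM layout), and the names are invented; the real math lives in RedisSetTest and RedisHashTest.

```java
// Hypothetical derivation of a per-member overhead constant. All sizes are
// assumed, not measured; do not read these as Geode's actual values.
public class PerMemberOverheadSketch {
  static final int OBJECT_HEADER = 16; // assumed object header size
  static final int REFERENCE = 8;      // assumed reference size
  static final int INT_FIELD = 4;

  // Rough model of one hash-table node: header, key/value/next references,
  // and a cached int hash.
  static int hashTableNodeSize() {
    return OBJECT_HEADER + 3 * REFERENCE + INT_FIELD;
  }

  // A fudge term standing in for the "magical +5" attributed to resizing.
  static final int RESIZE_FUDGE = 5;

  static int calculatedPerMemberOverhead() {
    return hashTableNodeSize() + RESIZE_FUDGE;
  }
}
```

A test of such a constant would compare it against the value hard-coded in the production class, so that any change to the internal layout makes the test fail and forces the constant to be revisited.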

> Redis: make RedisData implement Sizeable
> ----------------------------------------
>
>                 Key: GEODE-9136
>                 URL: https://issues.apache.org/jira/browse/GEODE-9136
>             Project: Geode
>          Issue Type: New Feature
>          Components: redis
>            Reporter: Hale Bales
>            Assignee: Hale Bales
>            Priority: Major
>              Labels: pull-request-available
>
> If RedisData.java implemented the org/apache/geode/internal/size/Sizeable.java interface, size calculations for deltas could be much faster.
> **AC**
> - add tests for String, Set, and Hash add, append, and remove operations changing the bytesInUse
> - all existing hydra tests pass



--
This message was sent by Atlassian Jira
(v8.3.4#803005)