Posted to commits@cassandra.apache.org by "Ekaterina Dimitrova (Jira)" <ji...@apache.org> on 2021/01/04 20:06:00 UTC
[jira] [Commented] (CASSANDRA-16318) Memtable heap size is severely
underestimated
[ https://issues.apache.org/jira/browse/CASSANDRA-16318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258469#comment-17258469 ]
Ekaterina Dimitrova commented on CASSANDRA-16318:
-------------------------------------------------
Results from the original test, fixed reads:
offheap_objects:
{code:java}
287.056MiB (16%) on-heap, 53.406MiB (3%) off-heap
Writing Memtable-table_01@1665900061(58.174MiB serialized bytes, 1000000 ops, 16%/3% of on/off-heap limit), flushed range = (null, null]
{code}
Actual size is around 320 MB, per the class histogram:
{code:java}
 num     #instances         #bytes  class name
----------------------------------------------
   1:       2006251       48510800  [Ljava.lang.Object;
   2:       1000012       32000384  org.apache.cassandra.db.rows.EncodingStats
   3:       1000002       32000064  org.apache.cassandra.db.partitions.AbstractBTreePartition$Holder
   4:       1000002       32000064  org.apache.cassandra.db.rows.BTreeRow
   5:       1000001       32000032  org.apache.cassandra.db.partitions.AtomicBTreePartition
   6:       1000066       24001584  java.util.concurrent.ConcurrentSkipListMap$Node
   7:       1000014       24000336  org.apache.cassandra.db.rows.NativeCell
   8:       1000004       24000096  org.apache.cassandra.dht.Murmur3Partitioner$LongToken
   9:       1000002       24000048  org.apache.cassandra.db.LivenessInfo
  10:       1000001       24000024  org.apache.cassandra.db.NativeClustering
  11:       1000001       24000024  org.apache.cassandra.db.NativeDecoratedKey
  12:        500094       12002256  java.util.concurrent.ConcurrentSkipListMap$Index
{code}
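As a rough cross-check of the "around 320 MB" figure, the #bytes column of the histogram above can be summed. The sketch below is not part of Cassandra: the class name HistogramTotal and its helper methods are hypothetical, and the values are copied from the twelve rows listed above, so the total covers only those classes and not the rest of the heap.

```java
public class HistogramTotal {
    // #bytes column of the twelve histogram rows above, in listed order
    static final long[] BYTES = {
        48510800L, 32000384L, 32000064L, 32000064L, 32000032L, 24001584L,
        24000336L, 24000096L, 24000048L, 24000024L, 24000024L, 12002256L
    };

    // Sum of the listed rows, in bytes
    static long totalBytes() {
        long sum = 0;
        for (long b : BYTES) {
            sum += b;
        }
        return sum;
    }

    // Convert bytes to MiB (1 MiB = 1024 * 1024 bytes)
    static double toMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long total = totalBytes();
        System.out.printf("total = %d bytes (%.1f MiB)%n", total, toMiB(total));
    }
}
```

The listed rows add up to 332515712 bytes, roughly 317 MiB, which is above the 287.056MiB on-heap figure the memtable reported at flush time.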
> Memtable heap size is severely underestimated
> ---------------------------------------------
>
> Key: CASSANDRA-16318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16318
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Memtable
> Reporter: Branimir Lambov
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: image-2020-12-09-10-57-21-994.png, image-2020-12-09-11-01-31-273.png
>
>
> We seem to be estimating the size of the on-heap memtable metadata to be around half of what it actually is. For example, during a [read benchmark which writes 1 million single-long rows|https://github.com/blambov/cassandra/blob/memtable-heap/test/microbench/org/apache/cassandra/test/microbench/instance/ReadTestSmallPartitions.java] the memtable reports
> {code}
> 1000000 ops, 58.174MiB serialized bytes, 385.284MiB (19%) on heap, 0.000KiB (0%) off-heap
> {code}
> while a heap dump taken at this point:
> !image-2020-12-09-10-57-21-994.png!
> lists a usage of about 666MB altogether.
> Switching to {{offheap_objects}}, the reported numbers are
> {code}
> 1000000 ops, 58.174MiB serialized bytes, 233.650MiB (12%) on heap, 53.406MiB (3%) off-heap
> {code}
> while actual heap usage:
> !image-2020-12-09-11-01-31-273.png!
> is about 442MB.
> Looking at the code, we are definitely not counting the {{AtomicBTreePartition.Holder}}, {{EncodingStats}}, liveness and deletion info objects associated with each partition, and most probably others.
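To put a number on "around half", the reported on-heap sizes from the issue description can be divided by the heap-dump figures. This is only a sketch: the class name EstimateRatio and its method are hypothetical, and it assumes the heap-dump figures (666MB and 442MB) are decimal megabytes while the memtable log reports MiB. If the heap-dump figures are actually MiB, the fractions come out a few points lower.

```java
public class EstimateRatio {
    static final double MIB = 1024.0 * 1024.0; // bytes per MiB
    static final double MB = 1_000_000.0;      // bytes per MB (assumed unit of the heap dump)

    // Fraction of the measured heap usage that the memtable reported
    static double reportedFraction(double reportedMiB, double measuredMB) {
        return (reportedMiB * MIB) / (measuredMB * MB);
    }

    public static void main(String[] args) {
        // First run in the description: 385.284MiB reported, about 666MB measured
        System.out.println("on-heap run:     " + reportedFraction(385.284, 666.0));
        // offheap_objects run: 233.650MiB reported, about 442MB measured
        System.out.println("offheap_objects: " + reportedFraction(233.650, 442.0));
    }
}
```

Under these assumptions the memtable accounts for roughly 0.6 of the measured heap usage in the first run and roughly 0.55 with offheap_objects.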