Posted to commits@cassandra.apache.org by "Branimir Lambov (Jira)" <ji...@apache.org> on 2020/12/09 09:30:00 UTC

[jira] [Commented] (CASSANDRA-16318) Memtable heap size is severely underestimated

    [ https://issues.apache.org/jira/browse/CASSANDRA-16318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17246392#comment-17246392 ] 

Branimir Lambov commented on CASSANDRA-16318:
---------------------------------------------

An easier way to get the actual usage is {{jmap -histo:live <pid>}}, e.g.
{code:java}
$ jmap -histo:live 10472

 num     #instances         #bytes  class name
----------------------------------------------
   1:       3000190      144009120  java.nio.HeapByteBuffer
   2:       2005596       48476360  [Ljava.lang.Object;
   3:       1000014       40000560  org.apache.cassandra.db.rows.BufferCell
   4:       1000012       32000384  org.apache.cassandra.db.rows.EncodingStats
   5:       1000002       32000064  org.apache.cassandra.db.partitions.AbstractBTreePartition$Holder
   6:       1000002       32000064  org.apache.cassandra.db.rows.BTreeRow
   7:       1000001       32000032  org.apache.cassandra.db.partitions.AtomicBTreePartition
   8:          1597       25789448  [B
   9:       1000015       24008944  [Ljava.nio.ByteBuffer;
  10:       1000066       24001584  java.util.concurrent.ConcurrentSkipListMap$Node
  11:       1000010       24000240  org.apache.cassandra.dht.Murmur3Partitioner$LongToken
  12:       1000009       24000216  org.apache.cassandra.db.BufferDecoratedKey
  13:       1000002       24000048  org.apache.cassandra.db.LivenessInfo
  14:       1000000       24000000  org.apache.cassandra.db.BufferClustering
  15:        500098       12002352  java.util.concurrent.ConcurrentSkipListMap$Index
...{code}
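
To turn a histogram like the one above into totals, the rows can be parsed and summed. The following is only a minimal sketch: the helper names {{parse_histogram}} and {{total_bytes}} are made up for illustration, and it assumes the usual {{num: #instances #bytes class name}} row layout. Note that {{jmap -histo}} reports shallow sizes per class, so the sum covers only the classes whose rows are fed in.

```python
import re

# Matches data rows such as "   1:   3000190   144009120  java.nio.HeapByteBuffer".
# Header, separator and "..." lines do not match and are skipped.
ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histogram(text):
    """Return a list of (class_name, instances, shallow_bytes) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append((match.group(3), int(match.group(1)), int(match.group(2))))
    return rows


def total_bytes(rows):
    """Sum the shallow byte counts of all parsed rows."""
    return sum(size for _, _, size in rows)


# A few rows copied from the histogram above, used here as sample input.
SAMPLE = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:       3000190      144009120  java.nio.HeapByteBuffer
   3:       1000014       40000560  org.apache.cassandra.db.rows.BufferCell
  14:       1000000       24000000  org.apache.cassandra.db.BufferClustering
"""

rows = parse_histogram(SAMPLE)
print("total bytes:", total_bytes(rows))
for name, count, size in rows:
    # Average shallow size per instance of each class.
    print(name, size / count)
```

Dividing a class's byte count by the number of rows written (1 million in this benchmark) gives a rough per-row heap cost for that class.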

> Memtable heap size is severely underestimated
> ---------------------------------------------
>
>                 Key: CASSANDRA-16318
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16318
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Memtable
>            Reporter: Branimir Lambov
>            Priority: Normal
>         Attachments: image-2020-12-09-10-57-21-994.png, image-2020-12-09-11-01-31-273.png
>
>
> We seem to be estimating the size of the on-heap memtable metadata to be around half of what it actually is. For example, during a [read benchmark which writes 1 million single-long rows|https://github.com/blambov/cassandra/blob/memtable-heap/test/microbench/org/apache/cassandra/test/microbench/instance/ReadTestSmallPartitions.java] the memtable reports
> {code}
> 1000000 ops, 58.174MiB serialized bytes, 385.284MiB (19%) on heap, 0.000KiB (0%) off-heap
> {code}
> while a heap dump taken at this point:
>  !image-2020-12-09-10-57-21-994.png! 
> lists a usage of about 666MB altogether.
> Switching to {{offheap_objects}}, the reported numbers are
> {code}
> 1000000 ops, 58.174MiB serialized bytes, 233.650MiB (12%) on heap, 53.406MiB (3%) off-heap
> {code}
> while actual heap usage:
>  !image-2020-12-09-11-01-31-273.png! 
> is about 442MB.
> Looking at the code, we are definitely not counting the {{AtomicBTreePartition.Holder}}, {{EncodingStats}}, liveness and deletion info objects associated with each partition, and most probably others as well.
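
The size of the gap can be checked with a little arithmetic on the figures quoted in the description. This is a rough sketch only: {{reported_fraction}} is a hypothetical helper, and because the description does not say whether the heap dump tool's "MB" means 10^6 or 2^20 bytes, both readings are computed.

```python
MIB = 1024 * 1024  # bytes in one MiB
MB = 1000 * 1000   # bytes in one MB, assuming the decimal meaning


def reported_fraction(reported_mib, actual, actual_unit=MB):
    """Fraction of the measured heap usage covered by the memtable's estimate."""
    return (reported_mib * MIB) / (actual * actual_unit)


# (label, on-heap MiB reported by the memtable, heap usage seen in the heap dump)
cases = [
    ("first run", 385.284, 666),
    ("offheap_objects", 233.650, 442),
]

for label, reported, actual in cases:
    as_decimal = reported_fraction(reported, actual, MB)
    as_binary = reported_fraction(reported, actual, MIB)
    print(f"{label}: {as_decimal:.2f} (dump in MB) or {as_binary:.2f} (dump in MiB)")
```

Under either reading the reported figure covers only part of the measured usage, in line with the "around half" estimate in the description.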


