You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Nicolas Henneaux (Jira)" <ji...@apache.org> on 2022/12/19 19:07:00 UTC

[jira] [Comment Edited] (CASSANDRA-18125) AssertionError on thread MemtableReclaimMemory in MemtablePool$SubPool.released(MemtablePool.java:193)

    [ https://issues.apache.org/jira/browse/CASSANDRA-18125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649441#comment-17649441 ] 

Nicolas Henneaux edited comment on CASSANDRA-18125 at 12/19/22 7:06 PM:
------------------------------------------------------------------------

Thanks for your feedback, of course I can provide more details.

1. There is no secondary index in use.
2. Not sure what is really heavy but I don't believe it is heavy, each node has ~200GB of data with ttl of 10 days for most of the data (~20GB written by day). About overwrite, I guess you mean updating some cells, yes we usually insert then update a row within a short time frame (~10min).
3. I don't think off-heap is used. If disabled by default, that's not something I enabled

A compaction was running it seems. I got a lot of 
- TWCS skipping check for fully expired SSTables
- TWCS expired check sufficiently far in the past, checking for fully expired SSTables

Here is an extract of the logs around ~1min
{code}
Dec 19, 2022 @ 13:07:51.334COMMIT-LOG-ALLOCATOR org.apache.cassandra.db.ColumnFamilyStore logFlush - Enqueuing flush of T4: 35.156KiB (0%) on-heap, 0.000KiB (0%) off-heap
Dec 19, 2022 @ 13:07:51.334COMMIT-LOG-ALLOCATOR org.apache.cassandra.db.ColumnFamilyStore logFlush - Enqueuing flush of T3: 852.600KiB (0%) on-heap, 0.000KiB (0%) off-heap
Dec 19, 2022 @ 13:07:51.334COMMIT-LOG-ALLOCATOR org.apache.cassandra.db.ColumnFamilyStore logFlush - Enqueuing flush of T2: 41.210KiB (0%) on-heap, 0.000KiB (0%) off-heap
Dec 19, 2022 @ 13:07:51.334COMMIT-LOG-ALLOCATOR org.apache.cassandra.db.ColumnFamilyStore logFlush - Enqueuing flush of T1: 364.268KiB (0%) on-heap, 0.000KiB (0%) off-heap
Dec 19, 2022 @ 13:07:15.561NonPeriodicTasks:1 org.apache.cassandra.io.sstable.SSTable delete - Deleting sstable: /data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-26-big
Dec 19, 2022 @ 13:07:15.560CompactionExecutor:6104 org.apache.cassandra.db.compaction.CompactionTask runMayThrow - Compacted (ad77aef0-7f95-11ed-bab7-e7d7c3e60f28) 2 sstables to [/data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-29-big,] to level=0.  5.429KiB to 5.426KiB (~99% of original) in 9ms.  Read Throughput = 593.925KiB/s, Write Throughput = 593.604KiB/s, Row Throughput = ~118/s.  59 total partitions merged to 58.  Partition merge counts were {1:57, 2:1, }
Dec 19, 2022 @ 13:07:15.560NonPeriodicTasks:1 org.apache.cassandra.io.sstable.SSTable delete - Deleting sstable: /data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-27-big
Dec 19, 2022 @ 13:07:15.551MemtableReclaimMemory:2628 org.apache.cassandra.service.CassandraDaemon uncaughtException - Exception in thread Thread[MemtableReclaimMemory:2628,5,main]java.lang.AssertionError: null
	at org.apache.cassandra.utils.memory.MemtablePool$SubPool.released(MemtablePool.java:193)
	at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.releaseAll(MemtableAllocator.java:151)
	at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.setDiscarded(MemtableAllocator.java:142)
	at org.apache.cassandra.utils.memory.MemtableAllocator.setDiscarded(MemtableAllocator.java:93)
	at org.apache.cassandra.utils.memory.SlabAllocator.setDiscarded(SlabAllocator.java:120)
	at org.apache.cassandra.db.Memtable.setDiscarded(Memtable.java:201)
	at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1216)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:829)
Dec 19, 2022 @ 13:07:15.551MemtablePostFlush:2626 org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager archiveAndDiscard - Segment CommitLogSegment(/data/cassandra/commitlog/CommitLog-7-1670343439312.log) is no longer active and will be deleted now
Dec 19, 2022 @ 13:07:15.551MemtablePostFlush:2626 org.apache.cassandra.db.commitlog.CommitLog discardCompletedSegments - Commit log segment CommitLogSegment(/data/cassandra/commitlog/CommitLog-7-1670343439312.log) is unused
Dec 19, 2022 @ 13:07:15.551CompactionExecutor:6104 org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy newestBucket - bucket size 2 >= 2 and not in current bucket, compacting what's here: [BigTableReader(path='/data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-27-big-Data.db'), BigTableReader(path='/data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-26-big-Data.db')]
Dec 19, 2022 @ 13:07:15.551CompactionExecutor:6104 org.apache.cassandra.db.compaction.CompactionTask runMayThrow - Compacting (ad77aef0-7f95-11ed-bab7-e7d7c3e60f28) [/data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-27-big-Data.db:level=0, /data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-26-big-Data.db:level=0, ]
Dec 19, 2022 @ 13:07:15.550MemtableFlushWriter:3385 org.apache.cassandra.db.ColumnFamilyStore flushMemtable - Flushed to [BigTableReader(path='/data/cassandra/data/system_distributed/parent_repair_history-deabd734b99d3b9c92e5fd92eb5abf14/nb-31-big-Data.db')] (1 sstables, 7.193KiB), biggest 7.193KiB, smallest 7.193KiB
Dec 19, 2022 @ 13:07:15.550MemtableFlushWriter:3384 org.apache.cassandra.db.ColumnFamilyStore flushMemtable - Flushed to [BigTableReader(path='/data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-28-big-Data.db')] (1 sstables, 6.510KiB), biggest 6.510KiB, smallest 6.510KiB
Dec 19, 2022 @ 13:07:15.546PerDiskMemtableFlushWriter_0:3385 org.apache.cassandra.db.Memtable writeSortedContents - Completed flushing /data/cassandra/data/system_distributed/parent_repair_history-deabd734b99d3b9c92e5fd92eb5abf14/nb-31-big-Data.db (4.170KiB) for commitlog position CommitLogPosition(segmentId=1670343439574, position=3282)
Dec 19, 2022 @ 13:07:15.545PerDiskMemtableFlushWriter_0:3384 org.apache.cassandra.db.Memtable writeSortedContents - Completed flushing /data/cassandra/data/system_distributed/repair_history-759fffad624b318180eefa9a52d1f627/nb-28-big-Data.db (1.785KiB) for commitlog position CommitLogPosition(segmentId=1670343439574, position=1779)
Dec 19, 2022 @ 13:07:15.545PerDiskMemtableFlushWriter_0:3384 org.apache.cassandra.db.Memtable writeSortedContents - Writing Memtable-repair_history@685478737(-8.285KiB serialized bytes, 246 ops, -0%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(9223372036854775807)]Dec 19, 2022 @ 13:07:15.545PerDiskMemtableFlushWriter_0:3385 org.apache.cassandra.db.Memtable writeSortedContents - Writing Memtable-parent_repair_history@860004687(5.211KiB serialized bytes, 2 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(9223372036854775807)]Dec 19, 2022 @ 13:07:15.543COMMIT-LOG-ALLOCATOR org.apache.cassandra.db.ColumnFamilyStore logFlush - Enqueuing flush of parent_repair_history: 14.763KiB (0%) on-heap, 0.000KiB (0%) off-heap
Dec 19, 2022 @ 13:07:15.542COMMIT-LOG-ALLOCATOR org.apache.cassandra.db.ColumnFamilyStore logFlush - Enqueuing flush of repair_history: -8.000KiB (-0%) on-heap, 0.000KiB (0%) off-heap
Dec 19, 2022 @ 13:06:22.905ReadStage-1 org.apache.cassandra.utils.memory.BufferPool log - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
Dec 19, 2022 @ 13:05:36.979NonPeriodicTasks:1 org.apache.cassandra.io.sstable.SSTable delete - Deleting sstable: /data/cassandra/data/X/Y-48b97b403c1011eab0e9f58b98b870bd/nb-2963-big
Dec 19, 2022 @ 13:05:36.979CompactionExecutor:6105 org.apache.cassandra.db.compaction.CompactionTask runMayThrow - Compacted (72b6a320-7f95-11ed-bab7-e7d7c3e60f28) 1 sstables to [] to level=0.  0.000KiB to 0.000KiB (~0% of original) in 1ms.  Read Throughput = 0.000KiB/s, Write Throughput = 0.000KiB/s, Row Throughput = ~0/s.  0 total partitions merged to 0.  Partition merge counts were {}
{code}



was (Author: nicolas.henneaux):
Thanks for your feedback, of course I can provide more details.

1. There is no secondary index in use.
2. Not sure what is really heavy but I don't believe it is heavy, each node has ~200GB of data with ttl of 10 days for most of the data (~20GB written by day). About overwrite, I guess you mean updating some cells, yes we usually insert then update a row within a short time frame (~10min).
3. I don't think off-heap is used. If disabled by default, that's not something I enabled


> AssertionError on thread MemtableReclaimMemory in MemtablePool$SubPool.released(MemtablePool.java:193)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-18125
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18125
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nicolas Henneaux
>            Priority: Normal
>
> On two nodes (on a 5 nodes cluster) on the cluster I'm running, I got the following exception. It occurred at 3,5 minutes interval.
> {code}
> MemtableReclaimMemory:2625 org.apache.cassandra.service.CassandraDaemon uncaughtException - Exception in thread Thread[MemtableReclaimMemory:2625,5,main]java.lang.AssertionError: null
> 	at org.apache.cassandra.utils.memory.MemtablePool$SubPool.released(MemtablePool.java:193)
> 	at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.releaseAll(MemtableAllocator.java:151)
> 	at org.apache.cassandra.utils.memory.MemtableAllocator$SubAllocator.setDiscarded(MemtableAllocator.java:142)
> 	at org.apache.cassandra.utils.memory.MemtableAllocator.setDiscarded(MemtableAllocator.java:93)
> 	at org.apache.cassandra.utils.memory.SlabAllocator.setDiscarded(SlabAllocator.java:120)
> 	at org.apache.cassandra.db.Memtable.setDiscarded(Memtable.java:201)
> 	at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1216)
> 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> 	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> 	at java.base/java.lang.Thread.run(Thread.java:829)
> {code} 
> {code}
> $ nodetool info
> ID                     : 
> Gossip active          : true
> Native Transport active: true
> Load                   : 204.67 GiB
> Generation No          : 1670343179
> Uptime (seconds)       : 1110514
> Heap Memory (MB)       : 7218.07 / 24576.00
> Off Heap Memory (MB)   : 784.06
> Data Center            : par
> Rack                   : e1
> Exceptions             : 1
> Key Cache              : entries 802712, size 100 MiB, capacity 100 MiB, 774541004 hits, 914207516 requests, 0.847 recent hit rate, 14400 save period in seconds
> Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
> Counter Cache          : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
> Percent Repaired       : 2.3272298419424144E-5%
> Token                  : (invoke with -T/--tokens to see all 8 tokens)
> $ java -version
> openjdk version "11.0.16" 2022-07-19 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.16.0.8-1.el7_9) (build 11.0.16+8-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.16.0.8-1.el7_9) (build 11.0.16+8-LTS, mixed mode, sharing)
> $ nodetool version
> ReleaseVersion: 4.0.6
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org