Posted to commits@cassandra.apache.org by "Flavien Charlon (JIRA)" <ji...@apache.org> on 2014/12/21 00:36:13 UTC
[jira] [Created] (CASSANDRA-8529) Cassandra suddenly stops responding to clients though process is still running
Flavien Charlon created CASSANDRA-8529:
------------------------------------------
Summary: Cassandra suddenly stops responding to clients though process is still running
Key: CASSANDRA-8529
URL: https://issues.apache.org/jira/browse/CASSANDRA-8529
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Flavien Charlon
I am running a moderate write-only load on a 3-node cluster.
After some time, nodes become completely unresponsive to clients, even though the process is still running.
tpstats on the affected nodes shows pending compactions that never get executed.
This is the tpstats output on an affected node, hours after the load stopped:
Pool Name               Active  Pending  Completed  Blocked  All time blocked
CounterMutationStage         0        0          0        0                 0
ReadStage                    0        0     243384        0                 0
RequestResponseStage         0        0    3336833        0                 0
MutationStage               32     1902    4775909        0                 0
ReadRepairStage              0        0      14445        0                 0
GossipStage                  0        0     128499        0                 0
CacheCleanupExecutor         0        0          0        0                 0
AntiEntropyStage             0        0          0        0                 0
MigrationStage               0        0         36        0                 0
ValidationExecutor           0        0          0        0                 0
CommitLogArchiver            0        0          0        0                 0
MiscStage                    0        0          0        0                 0
MemtableFlushWriter          2        7        947        0                 0
MemtableReclaimMemory        0        0        947        0                 0
PendingRangeCalculator       0        0          5        0                 0
MemtablePostFlush            1        8       1241        0                 0
CompactionExecutor           2        8       1035        0                 0
InternalResponseStage        0        0          9        0                 0
HintedHandoff                0        0          6        0                 0

Message type      Dropped
RANGE_SLICE             0
READ_REPAIR             0
PAGED_RANGE             0
BINARY                  0
READ                    0
MUTATION                0
_TRACE                  0
REQUEST_RESPONSE        0
COUNTER_MUTATION        0
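For anyone trying to spot the same condition on their own nodes, here is a minimal sketch of a script that picks the backlogged pools out of tpstats text. It assumes the six-column row layout shown above (pool name followed by five integer counters). The names PoolStats, parse_tpstats and backlogged_pools are hypothetical helpers written for this illustration and are not part of Cassandra or nodetool.

```python
from typing import List, NamedTuple


class PoolStats(NamedTuple):
    name: str
    active: int
    pending: int
    completed: int
    blocked: int
    all_time_blocked: int


def parse_tpstats(text: str) -> List[PoolStats]:
    """Parse thread pool rows; headers and the dropped-message table are skipped."""
    pools = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: a pool row is exactly one name plus five integer counters.
        if len(fields) != 6:
            continue
        name, *counts = fields
        if not all(c.isdigit() for c in counts):
            continue
        pools.append(PoolStats(name, *(int(c) for c in counts)))
    return pools


def backlogged_pools(pools: List[PoolStats]) -> List[PoolStats]:
    """Return the pools that still have pending tasks."""
    return [p for p in pools if p.pending > 0]


# A few rows copied from the report above, used as sample input.
SAMPLE = """\
Pool Name Active Pending Completed Blocked All time blocked
ReadStage 0 0 243384 0 0
MutationStage 32 1902 4775909 0 0
MemtableFlushWriter 2 7 947 0 0
MemtablePostFlush 1 8 1241 0 0
CompactionExecutor 2 8 1035 0 0
Message type Dropped
MUTATION 0
"""

for pool in backlogged_pools(parse_tpstats(SAMPLE)):
    print(f"{pool.name}: active={pool.active} pending={pool.pending}")
```

On the rows from this report it flags MutationStage, MemtableFlushWriter, MemtablePostFlush and CompactionExecutor. Comparing two snapshots taken some time apart would be needed to tell a stall from an ordinary backlog, and the sketch does not attempt that.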
Also, compactionstats shows that compaction is stalled:
compaction type  keyspace  table         completed      total  unit   progress
Compaction       testnet   transactions  117833347  117834891  bytes   100.00%
Compaction       testnet   scripts       206418064  206419414  bytes   100.00%
Active compaction remaining time : 0h00m00s
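The progress column is rounded, so a small sketch that computes the bytes still outstanding from the completed and total columns may make the stall easier to see. It assumes the seven-field row layout shown above; remaining_bytes is a hypothetical helper for illustration, not a nodetool feature.

```python
from typing import Tuple


def remaining_bytes(row: str) -> Tuple[str, int]:
    """Return ("keyspace.table", bytes left) for one compactionstats row."""
    # Assumed layout: type keyspace table completed total unit progress
    _kind, keyspace, table, completed, total, _unit, _progress = row.split()
    return f"{keyspace}.{table}", int(total) - int(completed)


# The two rows from the report above.
ROWS = [
    "Compaction testnet transactions 117833347 117834891 bytes 100.00%",
    "Compaction testnet scripts 206418064 206419414 bytes 100.00%",
]

for row in ROWS:
    name, left = remaining_bytes(row)
    print(f"{name}: {left} bytes remaining")
```

For the two rows above this gives 1544 and 1350 bytes remaining, so both compactions are short of their totals even though the progress column reads 100.00%.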
Again, the node has been in this state for hours.
I have reproduced this on several clusters with various memory configurations.