Posted to commits@cassandra.apache.org by "Marcus Eriksson (Jira)" <ji...@apache.org> on 2021/04/01 12:20:00 UTC
[jira] [Commented] (CASSANDRA-16552) Anticompaction appears to race with Compaction, preventing forward compaction progress after an incremental repair

    [ https://issues.apache.org/jira/browse/CASSANDRA-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313137#comment-17313137 ] 

Marcus Eriksson commented on CASSANDRA-16552:
---------------------------------------------

Looks like this is caused by CASSANDRA-14103, where we started keeping the levels in a TreeSet sorted by first token. It seems we get notified that sstables with {{MOVED_START}} have been removed, but they can't be found because the first token is not the same anymore. Then, when we re-add the correct sstable (which is added as {{NORMAL}} and has the original first token), we see that we already have the sstable in the manifest and skip it. That is incorrect, because it is a new instance of the same sstable, so [this|https://github.com/apache/cassandra/blob/24013e5c5ae538442cab083f8644563ea149ed7b/src/java/org/apache/cassandra/db/lifecycle/View.java#L269] check ({{view.sstablesMap.get(reader) != reader}}) fails and we can't mark the sstables as compacting. I should know this by now...
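
To make the sequence above concrete, here is a minimal, self-contained sketch of the failure mode. It is a simplified model and not the Cassandra code: {{Reader}}, {{BY_FIRST_TOKEN}}, {{demo}}, the generation number and the token values are all hypothetical stand-ins, and the tie-breaking comparator is an assumption. It only illustrates how a set ordered by first token can miss a removal when the token has changed, and how the stale instance then fails an identity check against the view.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Simplified model, NOT Cassandra code: Reader is a hypothetical stand-in
// for one in-memory instance of an sstable reader.
final class MovedStartSketch {

    static final class Reader {
        final int generation;   // identifies the sstable on disk
        final long firstToken;  // differs for an instance whose start was moved

        Reader(int generation, long firstToken) {
            this.generation = generation;
            this.firstToken = firstToken;
        }

        // Two instances of the same sstable are equal, but not identical.
        @Override public boolean equals(Object o) {
            return o instanceof Reader && ((Reader) o).generation == generation;
        }
        @Override public int hashCode() { return Integer.hashCode(generation); }
    }

    // Assumed ordering for a level: first token, then generation as tie-breaker.
    static final Comparator<Reader> BY_FIRST_TOKEN =
        Comparator.<Reader>comparingLong(r -> r.firstToken)
                  .thenComparingInt(r -> r.generation);

    /** Returns {removed, reAdded, canMarkCompacting}. */
    static boolean[] demo() {
        TreeSet<Reader> level = new TreeSet<>(BY_FIRST_TOKEN);
        Reader original = new Reader(4826, 100L);
        level.add(original);

        // The removal notification carries the moved-start instance. The tree
        // is searched by first token, so the stored entry is never reached.
        Reader movedStart = new Reader(4826, 500L);
        boolean removed = level.remove(movedStart);

        // The re-added instance has the original first token, so it compares
        // equal to the stale entry and the set keeps the old instance.
        Reader reopened = new Reader(4826, 100L);
        boolean reAdded = level.add(reopened);

        // The view, in this model, tracks the new instance.
        Map<Reader, Reader> sstablesMap = new HashMap<>();
        sstablesMap.put(reopened, reopened);

        // A compaction candidate taken from the level is the stale instance,
        // so an identity check against the view does not pass.
        Reader candidate = level.first();
        boolean canMarkCompacting = sstablesMap.get(candidate) == candidate;

        return new boolean[] { removed, reAdded, canMarkCompacting };
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("removed=" + r[0] + " reAdded=" + r[1]
                           + " canMarkCompacting=" + r[2]);
    }
}
```

In this model all three flags come out false: the removal misses, the re-add is skipped, and the candidate handed to compaction is not the instance the view holds.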

A workaround is setting {{sstable_preemptive_open_interval_in_mb: -1}} which disables early opening.

> Anticompaction appears to race with Compaction, preventing forward compaction progress after an incremental repair
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16552
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16552
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Joey Lynch
>            Assignee: Marcus Eriksson
>            Priority: High
>             Fix For: 4.0, 4.0-rc
>
>         Attachments: CompactionStuck.png, CurrentView.png, NeighborsFallingBehind.png, anticompaction_before_issue.txt, debug.log.gz, system.log.gz
>
>
> While testing 4.0-rc1 on a 12 i3en.2xlarge x 2 region (AWS us-east-1 and eu-west-1) cluster I attempted to run {{nodetool repair}} while the cluster was taking moderate read/write load. 
> The first time it worked as expected, but when I ran an incremental repair the second time, multiple nodes got stuck trying to compact the unrepaired sstables. They are now spinning with:
> {noformat}
> $ nt compactionstats
> pending tasks: 827
> - acceptance_josephl.acceptance_josephl_cass4: 827
> $ nt tpstats            
> Pool Name                    Active Pending Completed Blocked All time blocked
> RequestResponseStage         0      0       422359133 0       0               
> MutationStage                0      0       164540628 0       0               
> ReadStage                    0      0       198857844 0       0               
> CompactionExecutor           0      0       60782     0       0    
> $ tail system.log
> DEBUG [CompactionExecutor:684] 2021-03-31 15:13:59,902 LeveledManifest.java:292 - L0 is too far behind, performing size-tiering there first
> DEBUG [CompactionExecutor:684] 2021-03-31 15:13:59,908 LeveledManifest.java:292 - L0 is too far behind, performing size-tiering there first
> WARN  [CompactionExecutor:684] 2021-03-31 15:13:59,912 LeveledCompactionStrategy.java:154 - Could not acquire references for compacting SSTables [BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11eba
> fb40b81cbd6fb3d/na-4826-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4872-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acc
> eptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4849-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4874-big-Data.db'), BigTableReader(path='/mnt/dat
> a/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4841-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4897-big-D
> ata.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4924-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e79
> 0917c11ebafb40b81cbd6fb3d/na-4837-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4926-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_j
> osephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4729-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4723-big-Data.db'), BigTableReader(path
> ='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4875-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-
> 4922-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4920-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cas
> s4-6144e790917c11ebafb40b81cbd6fb3d/na-4869-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4823-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/ac
> ceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4846-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4873-big-Data.db'), BigTableR
> eader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4840-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cb
> d6fb3d/na-4833-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4829-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_j
> osephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4726-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4923-big-Data.db'), BigTableReader(path='/mnt/data/cassand
> ra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4925-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4905-big-Data.db'),
> BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4876-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11eb
> afb40b81cbd6fb3d/na-4901-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4732-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/ac
> ceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4909-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4915-big-Data.db'), BigTableReader(path='/mnt/da
> ta/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4921-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4860-big-
> Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4693-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e7
> 90917c11ebafb40b81cbd6fb3d/na-4694-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4692-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_
> josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4691-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4696-big-Data.db'), BigTableReader(pat
> h='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4697-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na
> -4700-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4698-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_ca
> ss4-6144e790917c11ebafb40b81cbd6fb3d/na-4688-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4689-big-Data.db')] which is not a problem per se,unless it happens
> frequently, in which case it must be reported. Will retry later.
> {noformat}
> I've attached some starting breadcrumbs. I believe the issue is a potential race in [marking sstables for compaction|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java#L152-L161], which gets null back from [tryModify|https://github.com/apache/cassandra/blob/d42087a63309178b96909c012dd0073fe0b6ea11/src/java/org/apache/cassandra/db/lifecycle/Tracker.java#L100]; I think that can only happen under a [small number of circumstances|https://github.com/apache/cassandra/blob/d42087a63309178b96909c012dd0073fe0b6ea11/src/java/org/apache/cassandra/db/lifecycle/View.java#L269]. From the initial investigation, it does appear that only the unrepaired products get into this state.
> I have a heap dump containing the View state, but it contains potentially sensitive infrastructure details, so if you're debugging just message me on Slack and I can send it to you directly.
> The following mitigation appears to unstick the nodes via a forced full compaction:
> {noformat}
> nodetool stop COMPACTION; nodetool compact <ks> <table>
> {noformat}
> I'm not confident in this mitigation though.


