Posted to commits@cassandra.apache.org by "Joey Lynch (Jira)" <ji...@apache.org> on 2021/03/31 19:02:00 UTC
[jira] [Commented] (CASSANDRA-16552) Anticompaction appears to race with Compaction, preventing forward compaction progress after an incremental repair

    [ https://issues.apache.org/jira/browse/CASSANDRA-16552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17312626#comment-17312626 ] 

Joey Lynch commented on CASSANDRA-16552:
----------------------------------------

I ran two repairs. The first run:
{noformat}
$ nt repair
[2021-03-31 02:21:51,831] Starting repair command #1 (daaad400-91c7-11eb-941c-93f0b7c99277), repairing keyspace acceptance_josephl with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, optimise streams: false, ignore unreplicated keyspaces: false)
[2021-03-31 02:22:36,780] Repair session f558d090-91c7-11eb-941c-93f0b7c99277 for range [(3074457345991006714,3074457347426834202]] finished (progress: 40%)
[2021-03-31 02:22:36,813] Repair session f55ca120-91c7-11eb-941c-93f0b7c99277 for range [(4611686018800136016,4611686020235963504]] finished (progress: 50%)
[2021-03-31 02:22:36,814] Repair session f55bddd0-91c7-11eb-941c-93f0b7c99277 for range [(6148914691609265317,6148914693045092805]] finished (progress: 60%)
[2021-03-31 02:32:58,204] Repair session f55e00b0-91c7-11eb-941c-93f0b7c99277 for range [(1537228674617704901,3074457345991006714]] finished (progress: 70%)
[2021-03-31 02:33:34,869] Repair session f55af370-91c7-11eb-941c-93f0b7c99277 for range [(3074457347426834202,4611686018800136016]] finished (progress: 80%)
[2021-03-31 02:34:10,553] Repair session f5548ad0-91c7-11eb-941c-93f0b7c99277 for range [(4611686020235963504,6148914691609265317]] finished (progress: 90%)
[2021-03-31 02:34:10,645] Repair completed successfully
[2021-03-31 02:34:10,648] Repair command #1 finished in 12 minutes 18 seconds
{noformat}

Then the second run:
{noformat}
$ nt repair
[2021-03-31 02:46:07,320] Starting repair command #3 (3e376b20-91cb-11eb-941c-93f0b7c99277), repairing keyspace acceptance_josephl with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], hosts: [], previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, optimise streams: false, ignore unreplicated keyspaces: false)
[2021-03-31 02:49:00,849] Repair session a598df60-91cb-11eb-941c-93f0b7c99277 for range [(3074457345991006714,3074457347426834202]] finished (progress: 40%)
[2021-03-31 02:49:00,886] Repair session a59a8d10-91cb-11eb-941c-93f0b7c99277 for range [(4611686018800136016,4611686020235963504]] finished (progress: 50%)
[2021-03-31 02:49:00,895] Repair session a599f0d0-91cb-11eb-941c-93f0b7c99277 for range [(6148914691609265317,6148914693045092805]] finished (progress: 60%)
[2021-03-31 02:50:16,011] Repair session a5995490-91cb-11eb-941c-93f0b7c99277 for range [(3074457347426834202,4611686018800136016]] finished (progress: 70%)
[2021-03-31 02:50:16,160] Repair session a59adb30-91cb-11eb-941c-93f0b7c99277 for range [(1537228674617704901,3074457345991006714]] finished (progress: 80%)
[2021-03-31 02:50:17,025] Repair session a5984320-91cb-11eb-941c-93f0b7c99277 for range [(4611686020235963504,6148914691609265317]] finished (progress: 90%)
[2021-03-31 02:50:17,119] Repair completed successfully
[2021-03-31 02:50:17,119] Repair command #3 finished in 4 minutes 9 seconds
{noformat}

The affected nodes are all neighbors, and all of them log the "Could not acquire references for compacting SSTables" warning.
 !NeighborsFallingBehind.png! 
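To check which nodes are affected, it can help to count how often the warning appears in each node's system.log. The following is a minimal sketch only: {{StuckCompactionCheck}} and its methods are hypothetical names, not part of Cassandra or nodetool, and the sstable file-name pattern is assumed from the "na-NNNN-big-Data.db" names in the log excerpt quoted below.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log-scanning helper; not part of Cassandra or nodetool. */
final class StuckCompactionCheck {
    // Warning text as it appears in the system.log excerpt from this ticket.
    static final String MARKER = "Could not acquire references for compacting SSTables";

    // Assumed file-name shape, taken from names such as na-4826-big-Data.db.
    private static final Pattern GENERATION = Pattern.compile("na-(\\d+)-big-Data\\.db");

    /** Counts the log lines that contain the warning text. */
    static long countWarnings(List<String> lines) {
        return lines.stream().filter(line -> line.contains(MARKER)).count();
    }

    /** Extracts the sstable generation numbers mentioned in one log line, sorted ascending. */
    static Set<Integer> generations(String logLine) {
        Set<Integer> result = new TreeSet<>();
        Matcher m = GENERATION.matcher(logLine);
        while (m.find()) {
            result.add(Integer.parseInt(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java StuckCompactionCheck.java <path-to-system.log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("warnings: " + countWarnings(lines));
    }
}
```

A count that keeps growing across successive checks on the same node would be consistent with the node being stuck, although the warning itself says that occasional occurrences are not a problem.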


> Anticompaction appears to race with Compaction, preventing forward compaction progress after an incremental repair
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16552
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16552
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Joey Lynch
>            Priority: Normal
>             Fix For: 4.0-rc2
>
>         Attachments: CompactionStuck.png, NeighborsFallingBehind.png, anticompaction_before_issue.txt
>
>
> While testing 4.0-rc1 on a 12 i3en.2xlarge x 2 region (AWS us-east-1 and eu-west-1) cluster, I attempted to run {{nodetool repair}} while the cluster was taking moderate read/write load.
> The first run worked as expected, but when I ran an incremental repair a second time, multiple nodes got stuck trying to compact the unrepaired sstables. They are now spinning with:
> {noformat}
> $ nt compactionstats
> pending tasks: 827
> - acceptance_josephl.acceptance_josephl_cass4: 827
> $ nt tpstats            
> Pool Name                    Active Pending Completed Blocked All time blocked
> RequestResponseStage         0      0       422359133 0       0               
> MutationStage                0      0       164540628 0       0               
> ReadStage                    0      0       198857844 0       0               
> CompactionExecutor           0      0       60782     0       0    
> $ tail system.log
> DEBUG [CompactionExecutor:684] 2021-03-31 15:13:59,902 LeveledManifest.java:292 - L0 is too far behind, performing size-tiering there first
> DEBUG [CompactionExecutor:684] 2021-03-31 15:13:59,908 LeveledManifest.java:292 - L0 is too far behind, performing size-tiering there first
> WARN  [CompactionExecutor:684] 2021-03-31 15:13:59,912 LeveledCompactionStrategy.java:154 - Could not acquire references for compacting SSTables [BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11eba
> fb40b81cbd6fb3d/na-4826-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4872-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acc
> eptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4849-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4874-big-Data.db'), BigTableReader(path='/mnt/dat
> a/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4841-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4897-big-D
> ata.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4924-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e79
> 0917c11ebafb40b81cbd6fb3d/na-4837-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4926-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_j
> osephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4729-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4723-big-Data.db'), BigTableReader(path
> ='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4875-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-
> 4922-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4920-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cas
> s4-6144e790917c11ebafb40b81cbd6fb3d/na-4869-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4823-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/ac
> ceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4846-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4873-big-Data.db'), BigTableR
> eader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4840-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cb
> d6fb3d/na-4833-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4829-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_j
> osephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4726-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4923-big-Data.db'), BigTableReader(path='/mnt/data/cassand
> ra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4925-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4905-big-Data.db'),
> BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4876-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11eb
> afb40b81cbd6fb3d/na-4901-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4732-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/ac
> ceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4909-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4915-big-Data.db'), BigTableReader(path='/mnt/da
> ta/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4921-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4860-big-
> Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4693-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e7
> 90917c11ebafb40b81cbd6fb3d/na-4694-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4692-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_
> josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4691-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4696-big-Data.db'), BigTableReader(pat
> h='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4697-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na
> -4700-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4698-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_ca
> ss4-6144e790917c11ebafb40b81cbd6fb3d/na-4688-big-Data.db'), BigTableReader(path='/mnt/data/cassandra/data/acceptance_josephl/acceptance_josephl_cass4-6144e790917c11ebafb40b81cbd6fb3d/na-4689-big-Data.db')] which is not a problem per se,unless it happens
> frequently, in which case it must be reported. Will retry later.
> {noformat}
> I've attached some starting breadcrumbs. I believe the issue is a potential race in [marking sstables for compaction|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/LeveledCompactionStrategy.java#L152-L161], where [tryModify|https://github.com/apache/cassandra/blob/d42087a63309178b96909c012dd0073fe0b6ea11/src/java/org/apache/cassandra/db/lifecycle/Tracker.java#L100] returns null, which I think can only happen under a [small number of circumstances|https://github.com/apache/cassandra/blob/d42087a63309178b96909c012dd0073fe0b6ea11/src/java/org/apache/cassandra/db/lifecycle/View.java#L269]. From the initial investigation, it does appear that only the unrepaired products get into this state.
> I have a heap dump containing the View state, but it contains potentially sensitive infrastructure details, so if you're debugging, just message me on Slack and I can send it to you directly.
> The following mitigation appears to unstick the nodes via a forced full compaction:
> {noformat}
> nodetool stop COMPACTION; nodetool compact <ks> <table>
> {noformat}
> I'm not confident in this mitigation though.
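
As a rough illustration of the suspected failure mode, here is a toy model of an all-or-nothing "mark compacting" step. It is not Cassandra's {{Tracker}} or {{View}} code: the class {{CompactingTrackerSketch}}, its methods, and the string sstable names are all hypothetical. It assumes, based on a reading of the linked sources, that the attempt is refused when any candidate is already marked compacting or is no longer live. Under that assumption, one sstable that stays marked blocks every compaction whose candidate set includes it.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only; hypothetical names, not Cassandra's Tracker/View implementation. */
final class CompactingTrackerSketch {
    private final Set<String> live = new HashSet<>();
    private final Set<String> compacting = new HashSet<>();

    synchronized void addLive(String sstable) {
        live.add(sstable);
    }

    /**
     * Marks all candidates as compacting, or none of them.
     * Returns false (the analogue of tryModify returning null) when any
     * candidate is not live or is already marked compacting.
     */
    synchronized boolean tryMarkCompacting(Set<String> candidates) {
        for (String sstable : candidates) {
            if (!live.contains(sstable) || compacting.contains(sstable)) {
                return false;
            }
        }
        compacting.addAll(candidates);
        return true;
    }

    /** Releases the compacting mark, as a finished or aborted operation would. */
    synchronized void unmarkCompacting(Set<String> sstables) {
        compacting.removeAll(sstables);
    }

    public static void main(String[] args) {
        CompactingTrackerSketch tracker = new CompactingTrackerSketch();
        tracker.addLive("na-4826");
        tracker.addLive("na-4872");

        // Assumption: some other operation (for example an anticompaction) holds na-4826.
        tracker.tryMarkCompacting(Set.of("na-4826"));

        Set<String> candidates = Set.of("na-4826", "na-4872");
        System.out.println(tracker.tryMarkCompacting(candidates)); // false
        System.out.println(tracker.tryMarkCompacting(candidates)); // false: a retry with the same set fails again

        tracker.unmarkCompacting(Set.of("na-4826"));
        System.out.println(tracker.tryMarkCompacting(candidates)); // true once the mark is released
    }
}
```

In this model, a strategy that keeps proposing the same candidate set makes no progress until the mark is released, which resembles the repeated warning above. Whether that is what happens in the real code is exactly what the ticket is asking.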


