Posted to commits@cassandra.apache.org by "Benedict Elliott Smith (Jira)" <ji...@apache.org> on 2020/02/13 13:51:00 UTC
[jira] [Commented] (CASSANDRA-15367) Memtable memory allocations may deadlock

    [ https://issues.apache.org/jira/browse/CASSANDRA-15367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036235#comment-17036235 ] 

Benedict Elliott Smith commented on CASSANDRA-15367:
----------------------------------------------------

I'm looking at this more closely now that I have some time. Could you outline how you think the deadlock occurs between {{setCommitLogUpperBound}} and {{writeBarrier.issue()}}? The deadlock requires a new cohort to exist, and that cohort is not instantiated until {{writeBarrier.issue()}}, so surely the deadlock cannot occur before then?

However, there _is_ a window _after_ {{!writeOp.isBehindBarrier()}}. It cannot be avoided, because there is no timed-wait mechanism for obtaining a monitor, and {{tryMonitorEnter}} is in any case not possible in later versions of Java.

So I propose a variant of my earlier approach (which definitely worked, and which waited for all earlier operations to complete) that essentially inverts the behaviour of your suggestion: if any older operations are still running, refuse to lock until they all complete, and invoke {{Thread.yield()}} once to give them an opportunity with the CPU. Locking is then effectively disabled for all newer operations until the older ones expire, and we try to give the older ones first claim on the CPU, if the scheduler lets us, so that this window is as narrow as possible.
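
To make the proposal concrete, here is a minimal sketch of the decision it describes. It is not the actual patch: {{LockGate}}, {{olderOpStarted}}, {{olderOpFinished}} and {{shouldLock}} are hypothetical names, a plain counter stands in for the real operation-ordering machinery, and I am assuming that "refuse to lock" means the caller falls back to a non-locking path.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch only: these names are illustrative and are not Cassandra's API.
final class LockGate {
    // Assumption: a simple counter stands in for tracking still-running older operations.
    private final AtomicInteger olderOpsRunning = new AtomicInteger();

    void olderOpStarted()  { olderOpsRunning.incrementAndGet(); }
    void olderOpFinished() { olderOpsRunning.decrementAndGet(); }

    // Decide whether a newer operation may take the partition lock.
    boolean shouldLock() {
        if (olderOpsRunning.get() == 0)
            return true;                     // no older operations: locking is allowed
        Thread.yield();                      // yield once so older operations may get the CPU
        return olderOpsRunning.get() == 0;   // still running: refuse to lock for now
    }

    public static void main(String[] args) {
        LockGate gate = new LockGate();
        gate.olderOpStarted();
        System.out.println(gate.shouldLock()); // false: an older operation is still running
        gate.olderOpFinished();
        System.out.println(gate.shouldLock()); // true: all older operations have completed
    }
}
```

Note that {{Thread.yield()}} is only a hint to the scheduler, so the re-check after it is what actually enforces the rule; the yield merely narrows the window.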

> Memtable memory allocations may deadlock
> ----------------------------------------
>
>                 Key: CASSANDRA-15367
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15367
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Commit Log, Local/Memtable
>            Reporter: Benedict Elliott Smith
>            Assignee: Benedict Elliott Smith
>            Priority: Normal
>             Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> * Under heavy contention, we guard modifications to a partition with a mutex, for the lifetime of the memtable.
> * Memtables block for the completion of all {{OpOrder.Group}} started before their flush began
> * Memtables permit operations from this cohort to fall-through to the following Memtable, in order to guarantee a precise commitLogUpperBound
> * Memtable memory limits may be lifted for operations in the first cohort, since they block flush (and hence block future memory allocation)
> With very unfortunate scheduling:
> * A contended partition may rapidly escalate to a mutex
> * The system may reach memory limits that prevent allocations for the new Memtable’s cohort (C2) 
> * An operation from C2 may hold the mutex when this occurs
> * Operations from a prior Memtable’s cohort (C1), for a contended partition, may fall-through to the next Memtable
> * The operations from C1 may execute after the above is encountered by those from C2


