Posted to commits@cassandra.apache.org by "Eric Evans (JIRA)" <ji...@apache.org> on 2011/09/01 04:43:10 UTC

[jira] [Created] (CASSANDRA-3116) Compactions can (seriously )delay schema migrations

Compactions can (seriously )delay schema migrations
---------------------------------------------------

                 Key: CASSANDRA-3116
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3116
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.8.4
            Reporter: Eric Evans


A compaction lock is acquired when dropping keyspaces or column families which will cause the schema migration to block if a compaction is in progress.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140130#comment-13140130 ] 

Sylvain Lebresne commented on CASSANDRA-3116:
---------------------------------------------

+1 on v3. Nit: for the assert in DT.postReplace(), I believe the msg should include the sstable, not 'this'.
                
> Compactions can (seriously) delay schema migrations
> ---------------------------------------------------
>
>                 Key: CASSANDRA-3116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3116
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Eric Evans
>            Assignee: Jonathan Ellis
>              Labels: compaction
>             Fix For: 1.1
>
>         Attachments: 3116-v2.txt, 3116-v3.txt, 3116.txt
>
>
> A compaction lock is acquired when dropping keyspaces or column families which will cause the schema migration to block if a compaction is in progress.


        

[jira] [Commented] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13140328#comment-13140328 ] 

Hudson commented on CASSANDRA-3116:
-----------------------------------

Integrated in Cassandra #1177 (See [https://builds.apache.org/job/Cassandra/1177/])
    replace compactionlock use in schema migration by checking CFS.isInvalidD
patch by jbellis; reviewed by slebresne for CASSANDRA-3116

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1195542
Files : 
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/index/SecondaryIndex.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/index/keys/KeysIndex.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/migration/DropColumnFamily.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/migration/DropKeyspace.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
* /cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java
* /cassandra/trunk/test/unit/org/apache/cassandra/db/compaction/CompactionsTest.java
* /cassandra/trunk/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java

                

        

[jira] [Updated] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3116:
--------------------------------------

    Attachment: 3116-v3.txt

bq. What we could do is check the sstableReader isCompacted flag

I think I like moving the check into removeOldSSTables instead, since it's clearly threadsafe (even though there are no existing thread safety issues, why take chances with future complications).  The new version also allows markCompacted to "work" when called multiple times with asserts turned off; before, the file create IOException would blow up anyway.

bq. attemptUpdate breaks atomicity for unreferenceSSTable

Damn it, you're right.  So many CAS loops feels like we're making this too fragile.  But you're right, that's better than passing View references around.

bq. rename the unregisterMBean method in KeysIndex to invalidate

done.

bq. We may still want to check for isValid before doing a validation compaction 

Done, although I suspect my comment attempting to explain the reason for the check may cause more confusion than it's worth. :)

v3 attached.  CASSANDRA-3409 throws a bit of a wrench into things since I don't see a good way to avoid the lock; still, that's not a frequently used migration.
                

        

[jira] [Commented] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13138556#comment-13138556 ] 

Sylvain Lebresne commented on CASSANDRA-3116:
---------------------------------------------

The patch already needs a rebase, but based on the diff, a few comments:
* In unmarkCompacting, calling markCompacted twice is not so harmless, as it will trigger an assertion. What we could do is check the sstableReader isCompacted flag.
* attemptUpdate breaks atomicity for unreferenceSSTable. We should make sure the compareAndSet is done on the view we used to compute notCompacting, otherwise we could have a bug in View.replace (like in CASSANDRA-3306). It's probably simpler to move the compareAndSet back into both unreferenceSSTable and replace, and call a 'finalizeReplace' for the addNewSSTableSize and following methods.
* We could rename the unregisterMBean method in KeysIndex to invalidate.
* We may still want to check for isValid before doing a validation compaction because it doesn't call markCompacting, so it could still run, after invalidation, on some sstables that have not yet been removed because they are being compacted.
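
To illustrate the atomicity point in the second bullet, here is a minimal sketch of a compare-and-set loop that computes the not-compacting set and swaps the view against the same snapshot. It is not the DataTracker code: ViewCasSketch, its View class, and the use of plain strings for sstables are all hypothetical stand-ins chosen for brevity.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the tracker; names are assumptions, not Cassandra's API.
class ViewCasSketch {
    /** Immutable snapshot of the live sstables and the subset being compacted. */
    static final class View {
        final Set<String> sstables;
        final Set<String> compacting;

        View(Set<String> sstables, Set<String> compacting) {
            this.sstables = Collections.unmodifiableSet(new HashSet<>(sstables));
            this.compacting = Collections.unmodifiableSet(new HashSet<>(compacting));
        }
    }

    private final AtomicReference<View> view;

    ViewCasSketch(Set<String> sstables, Set<String> compacting) {
        view = new AtomicReference<>(new View(sstables, compacting));
    }

    /** Removes every sstable that is not being compacted and returns the removed set. */
    Set<String> unreferenceSSTables() {
        while (true) {
            View current = view.get();

            // Derive notCompacting from the exact snapshot we will CAS against.
            Set<String> notCompacting = new HashSet<>(current.sstables);
            notCompacting.removeAll(current.compacting);

            Set<String> remaining = new HashSet<>(current.sstables);
            remaining.removeAll(notCompacting);

            View updated = new View(remaining, current.compacting);
            if (view.compareAndSet(current, updated))
                return notCompacting;
            // Another thread changed the view in between: recompute from the fresh snapshot.
        }
    }

    View currentView() {
        return view.get();
    }
}
```

Because the set is recomputed on every retry, the value returned always corresponds to the view that was actually replaced.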

                
> Compactions can (seriously) delay schema migrations
> ---------------------------------------------------
>
>                 Key: CASSANDRA-3116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3116
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Eric Evans
>            Assignee: Jonathan Ellis
>              Labels: compaction
>             Fix For: 1.1
>
>         Attachments: 3116-v2.txt, 3116.txt
>
>
> A compaction lock is acquired when dropping keyspaces or column families which will cause the schema migration to block if a compaction is in progress.


        

[jira] [Updated] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3116:
--------------------------------------

    Affects Version/s:     (was: 0.8.4)
                       0.7.0
             Assignee: Jonathan Ellis
              Summary: Compactions can (seriously) delay schema migrations  (was: Compactions can (seriously )delay schema migrations)
    
> Compactions can (seriously) delay schema migrations
> ---------------------------------------------------
>
>                 Key: CASSANDRA-3116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3116
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Eric Evans
>            Assignee: Jonathan Ellis
>             Fix For: 1.1
>
>
> A compaction lock is acquired when dropping keyspaces or column families which will cause the schema migration to block if a compaction is in progress.


        

[jira] [Updated] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3116:
--------------------------------------

    Attachment: 3116-v2.txt

bq. only remove the ones that are not compacting

done.  (renamed removeAllSSTables to unreferenceSSTables, which could still stand improvement...)

bq. removing any flushed memtable ... also that replacements are directly marked as compacted too 

done (both by ultimately funneling through the replace method)

bq. we could make sure no new compaction is automatically triggered on an invalidated CF 

this shouldn't be a problem, if it happens.  I'd rather not go to extra effort to prevent something harmless.

v2 also gets rid of CFS.flushlock (we already flush for the drop snapshot) and removes CFS.isDropped in favor of isValid.
                

        

[jira] [Updated] (CASSANDRA-3116) Compactions can (seriously )delay schema migrations

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3116:
--------------------------------------

    Fix Version/s: 1.1

agreed that this sucks.  would like to switch to some kind of test-and-set safety for compaction vs migration instead.  not immediately obvious to me how to do this.

> Compactions can (seriously )delay schema migrations
> ---------------------------------------------------
>
>                 Key: CASSANDRA-3116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3116
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.4
>            Reporter: Eric Evans
>             Fix For: 1.1
>
>
> A compaction lock is acquired when dropping keyspaces or column families which will cause the schema migration to block if a compaction is in progress.


        

[jira] [Updated] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3116:
--------------------------------------

    Attachment: 3116.txt

Patch to replace locking in migrations + valid checks in CompactionManager with isValid checks in DataTracker.

compactionLock is still used but only for major compaction.  should we get rid of that too and say "if you want to be absolutely sure you're compacting everything, disable minor compactions before invoking major?"
                
> Compactions can (seriously) delay schema migrations
> ---------------------------------------------------
>
>                 Key: CASSANDRA-3116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3116
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Eric Evans
>            Assignee: Jonathan Ellis
>              Labels: compaction
>             Fix For: 1.1
>
>         Attachments: 3116.txt
>
>
> A compaction lock is acquired when dropping keyspaces or column families which will cause the schema migration to block if a compaction is in progress.


        

[jira] [Commented] (CASSANDRA-3116) Compactions can (seriously )delay schema migrations

Posted by "Tom Wilkie (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095416#comment-13095416 ] 

Tom Wilkie commented on CASSANDRA-3116:
---------------------------------------

It's worse than this too: a migration tries to get the write lock, which gets blocked on a big compaction (holding the read lock).  This migration waiting on the write lock then blocks all other compactions waiting on the read lock.  So you only get one compaction going on and thousands backing up.

A really hacky temporary fix would be to use a tryLock(timeout) and short sleep in a loop in the migration.  This would at least not starve the merges, but would starve the migrations quite badly.
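
A minimal sketch of that tryLock(timeout) loop, assuming a java.util.concurrent ReentrantReadWriteLock guards compactions (read lock) and migrations (write lock). MigrationLockSketch, the 10 ms timeout, the sleep, and the maxAttempts cap are all hypothetical choices for illustration, not anything from the Cassandra code.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; class and method names are not from Cassandra.
class MigrationLockSketch {
    private final ReentrantReadWriteLock compactionLock = new ReentrantReadWriteLock();

    /** Compactions share the read lock, so several can run at the same time. */
    void runCompaction(Runnable compaction) {
        compactionLock.readLock().lock();
        try {
            compaction.run();
        } finally {
            compactionLock.readLock().unlock();
        }
    }

    /**
     * Polls for the write lock with a short timeout instead of waiting indefinitely,
     * so a pending migration holds up new compactions only for a bounded interval.
     * Returns false if the lock was never obtained within maxAttempts.
     */
    boolean runMigration(Runnable migration, int maxAttempts) {
        try {
            for (int attempt = 0; attempt < maxAttempts; attempt++) {
                if (compactionLock.writeLock().tryLock(10, TimeUnit.MILLISECONDS)) {
                    try {
                        migration.run();
                        return true;
                    } finally {
                        compactionLock.writeLock().unlock();
                    }
                }
                // Back off so queued compactions get a chance to take the read lock.
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }
}
```

As noted above, the trade-off is that the migration itself can starve: while compactions keep overlapping, every attempt may time out.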


        

[jira] [Commented] (CASSANDRA-3116) Compactions can (seriously )delay schema migrations

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095861#comment-13095861 ] 

Sylvain Lebresne commented on CASSANDRA-3116:
---------------------------------------------

I think the _raison d'être_ of the lock is that we need to mark all sstables compacted for them to be removed when dropping, but that cannot be done correctly if some sstables are being compacted. But couldn't we just "delay" the compacted marking? For instance, we could have an 'isDropped' switch in DataTracker such that when that switch is on, the replace() method just removes the 'replacements' sstables. So the drop keyspace/cf would set the isDropped flag first, then grab any sstable files that are not being compacted and mark those right away. It may be a bit tricky to do that atomically but _a priori_ it sounds doable. We'll probably want to add a call to some checkForDropped() method in case a compaction fails, to be sure we don't leave sstables behind in that case.

Another option may be to just stop the running compactions (CASSANDRA-1740) so that we can mark everything compacted at once. It may be harder to make that thread safe though, not sure, and CASSANDRA-1740 is not in yet.
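
A rough sketch of the isDropped idea from the first paragraph, under heavy simplifying assumptions: DropAwareTrackerSketch is a hypothetical class, sstables are plain strings, "marking compacted" is modelled as moving a name into a list, and a single monitor (synchronized) stands in for whatever atomicity the real DataTracker would need.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the proposal; not the DataTracker implementation.
class DropAwareTrackerSketch {
    private final Set<String> live = new HashSet<>();
    private final Set<String> compacting = new HashSet<>();
    private final List<String> markedCompacted = new ArrayList<>();
    private boolean isDropped = false;

    synchronized void add(String sstable) {
        live.add(sstable);
    }

    synchronized boolean markCompacting(String sstable) {
        return live.contains(sstable) && compacting.add(sstable);
    }

    /** Drop: set the flag first, then mark everything that is not being compacted. */
    synchronized void drop() {
        isDropped = true;
        for (String s : new ArrayList<>(live)) {
            if (!compacting.contains(s)) {
                live.remove(s);
                markedCompacted.add(s);
            }
        }
    }

    /** Called when a compaction finishes. After a drop, replacements are discarded too. */
    synchronized void replace(Set<String> old, Set<String> replacements) {
        for (String s : old) {
            compacting.remove(s);
            if (live.remove(s))
                markedCompacted.add(s);
        }
        if (isDropped)
            markedCompacted.addAll(replacements);
        else
            live.addAll(replacements);
    }

    /** Called when a compaction fails; plays the role of checkForDropped(). */
    synchronized void unmarkCompacting(Set<String> sstables) {
        compacting.removeAll(sstables);
        if (isDropped) {
            for (String s : sstables)
                if (live.remove(s))
                    markedCompacted.add(s);
        }
    }

    synchronized Set<String> liveSSTables() {
        return new HashSet<>(live);
    }

    synchronized List<String> markedCompacted() {
        return new ArrayList<>(markedCompacted);
    }
}
```

In this model the drop never waits for a running compaction: sstables that are busy stay live until replace() or unmarkCompacting() is called, and are cleaned up at that point.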


       

[jira] [Commented] (CASSANDRA-3116) Compactions can (seriously) delay schema migrations

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130688#comment-13130688 ] 

Sylvain Lebresne commented on CASSANDRA-3116:
---------------------------------------------

Not sure this works correctly. I believe we first have a problem with DT.removeAllSSTables(), because this is called during the drop and really does remove *all* sstables, including the ones that are being compacted (and thus it will unreference those while they are being compacted, which is bad). So we should first change that to only remove the ones that are not compacting. Then we must make sure that anything that was not removed by that gets removed later. That involves removing any flushed memtable (though that doesn't really matter since a dropped CF is flushed before being invalidated), and we must make sure that compacted sstables do get marked compacted but also that replacements are directly marked as compacted too (which mainly involves calling removeOldSSTablesSize() on them). And I suppose we could make sure no new compaction is automatically triggered on an invalidated CF so we don't have a race or something.

bq. compactionLock is still used but only for major compaction. should we get rid of that too and say "if you want to be absolutely sure you're compacting everything, disable minor compactions before invoking major?"

I think there is really not much cost to keeping the lock in there if the write lock is only acquired by events triggered by a user, and I would prefer having major compaction do what it claims by default rather than having a "complicated" procedure. That being said, I would be for replacing the global compactionLock by one lock per CF (which should be easy).
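
For what it's worth, a sketch of what one lock per CF could look like, assuming locks are keyed by column family name and created lazily. PerCfLockSketch and its method names are hypothetical; only ConcurrentHashMap and ReentrantReadWriteLock are real JDK classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; not the CompactionManager implementation.
class PerCfLockSketch {
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    /** Returns the lock for a column family, creating it on first use. */
    ReentrantReadWriteLock lockFor(String cfName) {
        return locks.computeIfAbsent(cfName, name -> new ReentrantReadWriteLock());
    }

    /** Minor compactions on the same CF can overlap, so they share the read lock. */
    void runMinorCompaction(String cfName, Runnable compaction) {
        ReentrantReadWriteLock lock = lockFor(cfName);
        lock.readLock().lock();
        try {
            compaction.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** A major compaction takes the write lock, but only for its own CF. */
    void runMajorCompaction(String cfName, Runnable compaction) {
        ReentrantReadWriteLock lock = lockFor(cfName);
        lock.writeLock().lock();
        try {
            compaction.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With this layout a major compaction on one CF would no longer wait for, or hold up, minor compactions on other CFs.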

                