You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Benedict (JIRA)" <ji...@apache.org> on 2015/01/01 16:22:13 UTC

[jira] [Comment Edited] (CASSANDRA-8399) Reference Counter exception when dropping user type

    [ https://issues.apache.org/jira/browse/CASSANDRA-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262576#comment-14262576 ] 

Benedict edited comment on CASSANDRA-8399 at 1/1/15 3:22 PM:
-------------------------------------------------------------

But surely we have a bug if we're trying to compact an obsoleted sstable? If markCompacting fails to acquire a reference we really shouldn't be compacting it, as we're either duplicating data or reintroducing dropped data. The comment on markCompacting even states this is a valid option, but decides against it based on slightly erroneous logic:

{noformat}
     * Note that we could acquire references on the marked sstables and release them in
     * unmarkCompacting, but since we will never call markObsolete on a sstable marked
     * as compacting (unless there is a serious bug), we can skip this.
{noformat}

This justification for not doing this doesn't seem sufficient, because markObsolete is unrelated to releaseReference. If it's also not true that it can be done safely without an infinite loop, it's doubly wrong :) For 2.0 I could see us not wanting to risk new behaviour, but for 2.1 I think we should try to get on the case and deal with it since it is essentially a bug. 

One possible way of dealing with this without acquiring a reference, at least to begin with, is to releaseReference() in unmarkCompacting(), NOT in markCompactedSSTablesReplaced, IFF the sstable is marked obsolete. i.e. we simply markObsolete each of the sstables, then let unmarkCompacting cleanup. 

Long term, I think we could do with redesigning and formalising this whole process a lot more clearly. Objects that encapsulate only correct behaviour are much better than ensuring a plethora of different but related methods are correctly interleaved, as it's hard to retain at any one time what the behaviour of every other possible method call is.


was (Author: benedict):
But surely we have a bug if we're trying to compact an obsoleted sstable? If markCompacting fails to acquire a reference we really shouldn't be compacting it, as we're either duplicating data or reintroducing dropped data. The comment on markCompacting even states this is a valid option, but decides against it based on slightly erroneous logic:

{noformat}
     * Note that we could acquire references on the marked sstables and release them in
     * unmarkCompacting, but since we will never call markObsolete on a sstable marked
     * as compacting (unless there is a serious bug), we can skip this.
{noformat}

This justification for not doing this doesn't seem sufficient, because markObsolete is unrelated to releaseReference. If it's also not true that it can be done safely without an infinite loop, it's doubly wrong :) For 2.0 I could see us not wanting to risk new behaviour, but for 2.1 I think we should try to get on the case and deal with it since it is essentially a bug. 

One possible way of dealing with this without acquiring a reference, at least to begin with (since it is more correct to do so, and I think we should for 2.1), is to releaseReference() in unmarkCompacting(), NOT in markCompactedSSTablesReplaced, IFF the sstable is marked obsolete. i.e. we simply markObsolete each of the sstables, then let unmarkCompacting cleanup. 

Long term, I think we could do with redesigning and formalising this whole process a lot more clearly. Objects that encapsulate only correct behaviour are much better than ensuring a plethora of different but related methods are correctly interleaved, as it's hard to retain at any one time what the behaviour of every other possible method call is.

> Reference Counter exception when dropping user type
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8399
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8399
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Philip Thompson
>            Assignee: Joshua McKenzie
>             Fix For: 2.1.3
>
>         Attachments: 8399_fix_empty_results.txt, 8399_v2.txt, node2.log, ubuntu-8399.log
>
>
> When running the dtest {{user_types_test.py:TestUserTypes.test_type_keyspace_permission_isolation}} with the current 2.1-HEAD code, very frequently, but not always, when dropping a type, the following exception is seen:{code}
> ERROR [MigrationStage:1] 2014-12-01 13:54:54,824 CassandraDaemon.java:170 - Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.AssertionError: Reference counter -1 for /var/folders/v3/z4wf_34n1q506_xjdy49gb780000gn/T/dtest-eW2RXj/test/node2/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-sche
> ma_keyspaces-ka-14-Data.db
>         at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1662) ~[main/:na]
>         at org.apache.cassandra.io.sstable.SSTableScanner.close(SSTableScanner.java:164) ~[main/:na]
>         at org.apache.cassandra.utils.MergeIterator.close(MergeIterator.java:62) ~[main/:na]
>         at org.apache.cassandra.db.ColumnFamilyStore$8.close(ColumnFamilyStore.java:1943) ~[main/:na]
>         at org.apache.cassandra.db.ColumnFamilyStore.filter(ColumnFamilyStore.java:2116) ~[main/:na]
>         at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:2029) ~[main/:na]
>         at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1963) ~[main/:na]
>         at org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:744) ~[main/:na]
>         at org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:731) ~[main/:na]
>         at org.apache.cassandra.config.Schema.updateVersion(Schema.java:374) ~[main/:na]
>         at org.apache.cassandra.config.Schema.updateVersionAndAnnounce(Schema.java:399) ~[main/:na]
>         at org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:167) ~[main/:na]
>         at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:49) ~[main/:na]
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_67]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_67]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_67]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_67]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_67]{code}
> Log of the node with the error is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)