You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jordan West (JIRA)" <ji...@apache.org> on 2018/01/31 23:40:00 UTC

[jira] [Comment Edited] (CASSANDRA-14055) Index redistribution breaks SASI index

    [ https://issues.apache.org/jira/browse/CASSANDRA-14055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347781#comment-16347781 ] 

Jordan West edited comment on CASSANDRA-14055 at 1/31/18 11:39 PM:
-------------------------------------------------------------------

Hi [~lboutros],

One of the original authors of SASI here. I've been taking a look at this issue and your patch. Using the provided test against the {{cassandra-3.11}} branch (fc3357a00e2b6e56d399f07c5b81a82780c1e143), I see three different failure cases – two related directly to this issue and one tangentially related. More details on those below. With respect to this issue in particular, the three scenarios cause the test to fail because {{IndexSummaryManager}} ends up creating a new {{View}} where {{oldSSTables}} and {{newIndexes}} have overlapping values. This occurs because the {{IndexSummaryManager}} may "update" (re-open) an {{SSTableReader}} for an index already in the view. I believe this is unique to {{IndexSummaryManager}} and I am able to make your tests pass* without your patch by ensuring that there is no overlap between {{oldSStables}} and {{newIndexes}} (favoring {{newIndexes}}). Your patch looks to do this as well, though the approach is a bit different.

One thing I am curious about in your patch is the {{keepFile}} changes to {{SSTableIndex#release}}. Generally, this concerns me because it seems to be working around improper reference counting rather than correcting the reference counting itself. Also, while using the provided test, I am unable to hit a case where the condition {{obsolete.get() || sstableRef.globalCount() == 0}} is true. I see the file missing in the {{View}} but not on disk itself. Could you elaborate a bit more on the need for this change and your use of the {{keepFile}} flag?

The three failure scenarios I see using the provided test are:
h5. 8 keys returned - sequential case

In this scenario, at the time when the query that fails runs, the {{View}} is missing the most recently flushed sstable. As mentioned previously, this is because the intersection of {{oldSSTables}} and {{newIndexes}} is non-empty. This can be fixed* by ensuring nothing in {{newIndexes}} is in {{oldSSTables}}. I call this the sequential case because the compaction that occurs during the test completes before the index summary redistribution begins to create a new {{View}}. This is also addressed by your patch.
h5. 8 keys returned - race case

This scenario is similar to the previous one but has the additional issue of triggering improper {{SSTableIndex}} reference counting. From the perspective of the provided test, the failure scenario is the same and the fix* is as well. The issue occurs because of a race between compaction and index redistribution's creation of new {{View}} instances. This causes redistribution to create two {{View}} instances, the first of which is thrown away due to a failed compare and swap. The problem is the side-effects (calling {{SSTableIndex#release}}) have occurred already inside the creation of the garbage {{View}}, causing the reference count for the index to drop below 0. I see this issue as a separate one from this ticket and have filed [CASSANDRA-14207|https://issues.apache.org/jira/browse/CASSANDRA-14207]. It is not fixed by the previously mentioned change and while I haven't checked in detail, I don't think the provided patch addresses this either.
h5. 0 keys returned

This scenario is similar to the first but there are three threads involved in the race: the compaction, the flushing of the last memtable, and the index redistribution. In this case, the end result is an empty {{View}}, which leads to no keys being returned since the system thinks there are no indexes to search. This is fixed* by what I mentioned previously and occurs because index redistribution re-opens both sstables in the original {{View}} instead of just one. It is also addressed by your patch. 

 

I am curious if you see any other failure scenarios besides these three and, in particular, if you can elaborate on and provide examples of the issues you see regarding the files being missing on disk and the need for the {{keepFile}} change.

\* While this fix makes the provided test pass I am still verifying its correct from the reference counting perspective.


was (Author: jrwest):
Hi [~lboutros],

One of the original authors of SASI here. I've been taking a look at this issue and your patch. Using the provided test against the {{cassandra-3.11}} branch (fc3357a00e2b6e56d399f07c5b81a82780c1e143), I see three different failure cases – two related directly to this issue and one tangentially related. More details on those below. With respect to this issue in particular, the three scenarios cause the test to fail because {{IndexSummaryManager}} ends up creating a new {{View}} where {{oldSSTables}} and {{newIndexes}} have overlapping values. This occurs because the {{IndexSummaryManager}} may "update" (re-open) an {{SSTableReader}} for an index already in the view. I believe this is unique to {{IndexSummaryManager}} and I am able to make your tests pass* without your patch by ensuring that there is no overlap between {{oldSStables}} and {{newIndexes}} (favoring {{newIndexes}}). Your patch looks to do this as well, though the approach is a bit different.

One thing I am curious about in your patch is the {{keepFile}} changes to {{SSTableIndex#release}}. Generally, this concerns me because it seems to be working around improper reference counting rather than correcting the reference counting itself. Also, while using the provided test, I am unable to hit a case where the condition {{obsolete.get() || sstableRef.globalCount() == 0}} is true. I see the file missing in the {{View}} but not on disk itself. Could you elaborate a bit more on the need for this change and your use of the {{keepFile}} flag?

The three failure scenarios I see using the provided test are:
h5. 8 keys returned - sequential case

In this scenario, at the time when the query that fails runs, the {{View}} is missing the most recently flushed sstable. As mentioned previously, this is because the intersection of {{oldSSTables}} and {{newIndexes}} is non-empty. This can be fixed* by ensuring nothing in {{newIndexes}} is in {{oldSSTables}}. I call this the sequential case because the compaction that occurs during the test completes before the index summary redistribution begins to create a new {{View}}. This is also addressed by your patch.
h5. 8 keys returned - race case

This scenario is similar to the previous one but has the additional issue of triggering improper {{SSTableIndex}} reference counting. From the perspective of the provided test, the failure scenario is the same and the fix* is as well. The issue occurs because of a race between compaction and index redistribution's creation of new {{View}} instances. This causes redistribution to create two {{View}} instances, the first of which is thrown away due to a failed compare and swap. The problem is the side-effects (calling {{SSTableIndex#release}}) have occurred already inside the creation of the garbage {{View}}, causing the reference count for the index to drop below 0. I see this issue as a separate one from this ticket and will file a separate JIRA. It is not fixed by the previously mentioned change and while I haven't checked in detail, I don't think the provided patch addresses this either.
h5. 0 keys returned

This scenario is similar to the first but there are three threads involved in the race: the compaction, the flushing of the last memtable, and the index redistribution. In this case, the end result is an empty {{View}}, which leads to no keys being returned since the system thinks there are no indexes to search. This is fixed* by what I mentioned previously and occurs because index redistribution re-opens both sstables in the original {{View}} instead of just one. It is also addressed by your patch. 

 

I am curious if you see any other failure scenarios besides these three and, in particular, if you can elaborate on and provide examples of the issues you see regarding the files being missing on disk and the need for the {{keepFile}} change.
 
\* While this fix makes the provided test pass I am still verifying its correct from the reference counting perspective.

> Index redistribution breaks SASI index
> --------------------------------------
>
>                 Key: CASSANDRA-14055
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14055
>             Project: Cassandra
>          Issue Type: Bug
>          Components: sasi
>            Reporter: Ludovic Boutros
>            Assignee: Ludovic Boutros
>            Priority: Major
>              Labels: patch
>             Fix For: 3.11.2
>
>         Attachments: CASSANDRA-14055.patch, CASSANDRA-14055.patch, CASSANDRA-14055.patch
>
>
> During index redistribution process, a new view is created.
> During this creation, old indexes should be released.
> But, new indexes are "attached" to the same SSTable as the old indexes.
> This leads to the deletion of the last SASI index file and breaks the index.
> The issue is in this function : [https://github.com/apache/cassandra/blob/9ee44db49b13d4b4c91c9d6332ce06a6e2abf944/src/java/org/apache/cassandra/index/sasi/conf/view/View.java#L62]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org