You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Sylvain Lebresne (Created) (JIRA)" <ji...@apache.org> on 2012/02/07 17:57:02 UTC

[jira] [Created] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Sub-columns removal is broken in 1.1
------------------------------------

                 Key: CASSANDRA-3872
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
             Project: Cassandra
          Issue Type: Bug
            Reporter: Sylvain Lebresne
            Assignee: Sylvain Lebresne


CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).

This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203476#comment-13203476 ] 

Sylvain Lebresne commented on CASSANDRA-3872:
---------------------------------------------

Apologies, I didn't realized that there was actually kind of 2 problems and I mixed both, so let's be more precise.

The regression that CASSANDRA-3716 introduced and that is making the tests fail is that a *live* super column with only non-gcable tombstones is now considered irrelevant, while it is relevant in say 1.0.

That led me to realize that containers are using MIN_VALUE for gLDT when not deleted. Now, leaving aside the tests failure, I do think we should change that. Post-CASSANDRA-3716, we've made sure that for normal columns, testing {{getLocalDeletionTime() < gcbefore}} allows us solely to decide whether the column is gcable (and thus by extension tombstoned) or not. I think it's a good thing, the local deletion time is here for the tombstone gc and so that makes sense. But currently, this is not true for SuperColumns nor ColumnFamily (but it's worst for SC since they actually are IColumn), where {{getLocalDeletionTime() < gcbefore}} means 'either gcable or not deleted at all'. Leaving things that way would be very error-prone imho and defeat the purpose of CASSANDRA-3716 since we're reintroducing subtle differences that are hard to work with.

Now, as it happens, I think we have an old bug in QF.isRelevant (i.e. including in 0.8 and probably before that). Namely that a *live* SC with only non-gcable tombstones is considered relevant, but a *gcable* SC with only non-gcable is considered irrelevant but I don't think it should. To fix that, we can indeed do the mostRecentChangeAt change (and that would fix the unit tests in particular but as said above, I think we should still do the MIN_VALUE->MAX_VALUE change for other reasons). However, that would imply that *gcable* SC with only *gcable* tombstones would be considered relevant (if the max tombstone timestamp is greater than the SC timestamp), so maybe we want to switch to a column.mostRecentNonGcableChangeAt() instead.
I'll open a separate ticket for that change, as this affect previous version too. I'm leaving this one open to focus on the MIN_VALUE->MAX_VALUE change.
                
> Sub-columns removal is broken in 1.1
> ------------------------------------
>
>                 Key: CASSANDRA-3872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1
>
>         Attachments: 3872.patch
>
>
> CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).
> This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Posted by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3872:
----------------------------------------

    Attachment: 3872.patch

Attaching patch for this. The idea is to make containers use MAX_VALUE instead of MIN_VALUE  for localDeleteionTime. There is however two things to consider:
# for containers, we do serialize the localDeletionTime, so we have to be careful for upgrade (mixed version cluster). What the patch does is to recognize when someone provided MIN_VALUE and transforming to MAX_VALUE. This only work when old node send use MIN_VALUE, however *I think* it is actually OK to send MAX_VALUE to old node (pre-1.1 code uses markedForDeleteAt to decide if a container is deleted, not localDeletionTime, so we shouldn't break anything).
# CFS.removeDeleted has to be able to distinguish between an empty container that is marked for deletion (but not gcable) and one that is not marked for deletion. In the former, we should keep the container, not in the latter. This means that this patch actually reintroduce the use of isMarkedForDelete() for containers in CFS.removeDeleted (for containers, isMarkedForDelete is not timing dependent).

Note that there would likely be other ways to fix this issue (reverting CASSANDRA-3716 would be one).

In any case, the patch fixes the two unit tests.
                
> Sub-columns removal is broken in 1.1
> ------------------------------------
>
>                 Key: CASSANDRA-3872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1
>
>         Attachments: 3872.patch
>
>
> CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).
> This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206806#comment-13206806 ] 

Sylvain Lebresne commented on CASSANDRA-3872:
---------------------------------------------

I do not pretend this reduce line of codes, but I do think that it makes it easier to not make subtle mistakes.

Currently, there is a mismatch between how Column (the class) and the two IColumnContainer classes (CF and SC) handles getLocalDeletionTime() for non-deleted. The former uses MAX_VALUE, the latter uses MIN_VALUE. The lack of consistency alone is annoying but as long as SC lives it is made much worst by the fact that SC is both a IColumn and a IColumnContainer.

The attached patch tries to make things more consistent. The localDeletionTime is here for the purpose of tombstone garbage collection, so it seems to me that it is cleaner to use it for that purpose and that purpose only. In other words, with this patch, {{(getLocalDeletionTime() < gcbefore)}} tells you without ambiguity if you're dealing with a gcable tombstone or not.

Now there is the fact that live but empty containers are not returned to the user. I believe that was one of the reason of using MIN_VALUE for live containers. But imho this is a hack and it's much more clear in removeDeleted to read:
{noformat}
if (cf.getColumnCount() == 0 && (!cf.isMarkedForDelete() || cf.getLocalDeletionTime() < gcBefore))
{noformat}
which directly translate into: if the cf is empty and it's either a gcable tombstone or a live cf, we can skip it, rather that having to check the code of ColumnFamily to understand why that does skip live empty CF *and* to have to remember each time you use CF.localDeletionTime that it may be MIN_VALUE for non-deleted CF and assert if it matters or not.
                
> Sub-columns removal is broken in 1.1
> ------------------------------------
>
>                 Key: CASSANDRA-3872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3872.patch
>
>
> CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).
> This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202983#comment-13202983 ] 

Jonathan Ellis commented on CASSANDRA-3872:
-------------------------------------------

Could we do a simpler fix by turning column.mostRecentLiveChangeAt into column.mostRecentChangeAt? I don't think restricting to live columns is actually useful here (although it is in SQF, so we'd need a new method instead of replacing mRLCA).
                
> Sub-columns removal is broken in 1.1
> ------------------------------------
>
>                 Key: CASSANDRA-3872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1
>
>         Attachments: 3872.patch
>
>
> CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).
> This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207371#comment-13207371 ] 

Jonathan Ellis commented on CASSANDRA-3872:
-------------------------------------------

+1

Can you add a reference to this ticket to the "new default is MAX_VALUE" comment on commit?
                
> Sub-columns removal is broken in 1.1
> ------------------------------------
>
>                 Key: CASSANDRA-3872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3872.patch
>
>
> CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).
> This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3872) Sub-columns removal is broken in 1.1

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205881#comment-13205881 ] 

Jonathan Ellis commented on CASSANDRA-3872:
-------------------------------------------

Isn't this patch kind of a wash?  The MIN_VALUE checks go away but in return we need to add extra isMarkedForDelete checks.

                
> Sub-columns removal is broken in 1.1
> ------------------------------------
>
>                 Key: CASSANDRA-3872
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3872
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.0
>
>         Attachments: 3872.patch
>
>
> CASSANDRA-3716 actually broke sub-columns deletion. The reason is that in QueryFilter.isRelevant, we've switched in checking getLocalDeletionTime() only (without looking for isMarkedForDelete). But for columns containers (in this case SuperColumn), the default local deletion time when not deleted is Integer.MIN_VALUE. In other words, a SC with only non-gcable tombstones will be considered as not relevant (while it should).
> This is caught by two unit tests (RemoveSuperColumnTest and RemoveSubColumnTest) that are failing currently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira