You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2016/12/06 15:28:58 UTC

[jira] [Commented] (CASSANDRA-12997) dtest failure in org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName

    [ https://issues.apache.org/jira/browse/CASSANDRA-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15725815#comment-15725815 ] 

Sylvain Lebresne commented on CASSANDRA-12997:
----------------------------------------------

That test seem to have failed only once from CI history and I haven't been able to reproduce in 100+ local runs, so I can theorize about what happened, but it's just that, theory.

The test insert a value for a column, drop the column, add it back (same type), and query it. It expects the value to be {{null}} since it has remove through the drop, and that's what it gets, except for that one run when it got the initial value back on the read.

My best theory is that it's a timestamp issue. Somehow, the initial insertion ended up with a higher timestamp than the timestamp used for the {{DROP}}, so that on a read the old value is considered post-{{DROP}}. Which, and I'm still in the realm of theories, could happen if on CI nptd kicked in just at the wrong moment. It's a stretch, but as that has only even happened once that I can tell, this could be a case of bad luck.

There is another reason why this could happen: inserts use {{ClientState.getTimestamp()}} to generate their timestamp, which ensure that inserts coordinated by the same node always get strictly increasing timestamps, and do so by adding "fake" microseconds when it has to generate multiple timestamps in the same milliseconds. {{ALTER TABLE DROP}} simply relies on {{System.currentTimeMillis()}} however, so if multiple inserts, followed by a {{DROP}} happen in the same milliseconds, all but the first insert will actually have a higher timestamp than the drop one. This is also a bit of a stretch because the test does only a single insert, so while a prior test insert could interfer, that would be pretty timing. But it seems to be what this is ultimately so ....

Anyway, the previous point is the crux of the patch I'm attaching below: it uses the {{ClientState.getTimestamp()}} mechanism for {{ALTER TABLE DROP}} too. This ensures that the drop will always get a higher timestamp than any previous insert (coordinated by the same node), thus avoiding both theoretical problem above. It feels kind of the right thing to do anyway. The one annoyance is that we have to pass the {{QueryState}} to {{announceMigration()}} which is overriden by all schema altering statement, which is why the patch is more than 2 lines (but it's 2 lines of non automatic change).

| [12997|https://github.com/pcmanus/cassandra/commits/12997] | [utests|http://cassci.datastax.com/job/pcmanus-12997-testall] | [dtests|http://cassci.datastax.com/job/pcmanus-12997-dtest] |

I'll note lastly that the patch is against trunk, as was the test failure, but assuming my theories above are indeed the reason for that one failure, this could happen on pretty much any version. That said, it's really just to make the test happy as no-one in real life will even run into this, so I don't see the point of taking any kind of risk.


> dtest failure in org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12997
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12997
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sean McCarthy
>            Assignee: Sylvain Lebresne
>              Labels: test-failure, testall
>
> example failure:
> http://cassci.datastax.com/job/trunk_testall/1298/testReport/org.apache.cassandra.cql3.validation.operations/AlterTest/testDropListAndAddListWithSameName
> {code}
> Error Message
> Invalid value for row 0 column 2 (mycollection of type list<text>), expected <null> but got <[first element]>
> {code}{code}Stacktrace
> junit.framework.AssertionFailedError: Invalid value for row 0 column 2 (mycollection of type list<text>), expected <null> but got <[first element]>
> 	at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:908)
> 	at org.apache.cassandra.cql3.validation.operations.AlterTest.testDropListAndAddListWithSameName(AlterTest.java:87)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)