You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Sam Tunnicliffe (JIRA)" <ji...@apache.org> on 2016/08/23 10:59:20 UTC
[jira] [Commented] (CASSANDRA-12523) dtest failure in upgrade_internal_auth_test.TestAuthUpgrade.upgrade_to_30_test

    [ https://issues.apache.org/jira/browse/CASSANDRA-12523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432616#comment-15432616 ] 

Sam Tunnicliffe commented on CASSANDRA-12523:
---------------------------------------------

The test attempts to ensure that client connections are both exclusive to a single node and short-lived specifically to avoid non-fatal authentication errors during the period that the cluster is in a mixed state. 

The logs show however, that a connection attempt was made to node2 shortly after it was upgraded but before the upgrade on node3 was complete, leading to the (non-fatal) error message in the node2 log. 

I'm probably overlooking something obvious, but I don't see where that connection attempt is coming from unless the calls to {{session.shutdown()}} are leaving some sessions in a state where they retry their connections.

One additional thing, the sessions opened directly in {{do_upgrade_with_internal_auth}} for executing DDL/DCL are not exclusive, so they will be opening connections to all 3 nodes, which may be worth changing (although they are also shutdown as early as possible, so if that works as expected then this shouldn't be significant). 


> dtest failure in upgrade_internal_auth_test.TestAuthUpgrade.upgrade_to_30_test
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12523
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12523
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sam Tunnicliffe
>              Labels: dtest
>         Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/view/cassandra-3.9/job/cassandra-3.9_dtest/46/testReport/junit/upgrade_internal_auth_test/TestAuthUpgrade/upgrade_to_30_test/
> Failed on CassCI build cassandra-3.9_dtest #46
> Relevant error in logs is
> {code}
> Unexpected error in node2 log, error: 
> ERROR [SharedPool-Worker-1] 2016-08-22 17:35:31,108 Message.java:617 - Unexpected exception during request; channel = [id: 0x603c4ef1, /127.0.0.1:46357 => /127.0.0.2:9042]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:489) ~[main/:na]
> 	at org.apache.cassandra.auth.CassandraRoleManager.canLogin(CassandraRoleManager.java:298) ~[main/:na]
> 	at org.apache.cassandra.service.ClientState.login(ClientState.java:270) ~[main/:na]
> 	at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:79) ~[main/:na]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:513) [main/:na]
> 	at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:407) [main/:na]
> 	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
> 	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) [main/:na]
> 	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Caused by: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
> 	at org.apache.cassandra.service.ReadCallback.awaitResults(ReadCallback.java:132) ~[main/:na]
> 	at org.apache.cassandra.service.ReadCallback.get(ReadCallback.java:137) ~[main/:na]
> 	at org.apache.cassandra.service.AbstractReadExecutor.get(AbstractReadExecutor.java:145) ~[main/:na]
> 	at org.apache.cassandra.service.StorageProxy$SinglePartitionReadLifecycle.awaitResultsAndRetryOnDigestMismatch(StorageProxy.java:1715) ~[main/:na]
> 	at org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1664) ~[main/:na]
> 	at org.apache.cassandra.service.StorageProxy.readRegular(StorageProxy.java:1605) ~[main/:na]
> 	at org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1524) ~[main/:na]
> 	at org.apache.cassandra.db.SinglePartitionReadCommand$Group.execute(SinglePartitionReadCommand.java:954) ~[main/:na]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:263) ~[main/:na]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224) ~[main/:na]
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRoleFromTable(CassandraRoleManager.java:497) ~[main/:na]
> 	at org.apache.cassandra.auth.CassandraRoleManager.getRole(CassandraRoleManager.java:484) ~[main/:na]
> 	... 13 common frames omitted
> {code}
> Note: this is unrelated to CASSANDRA-12518



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)