Posted to commits@cassandra.apache.org by "peng xiao (JIRA)" <ji...@apache.org> on 2016/06/29 07:10:45 UTC

[jira] [Comment Edited] (CASSANDRA-12103) Cassandra is hang and cqlsh was not able to login with OperationTimeout error

    [ https://issues.apache.org/jira/browse/CASSANDRA-12103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15354702#comment-15354702 ] 

peng xiao edited comment on CASSANDRA-12103 at 6/29/16 7:10 AM:
----------------------------------------------------------------

Thanks, Sam, for your reply.
1) No, we are using NetworkTopologyStrategy now.
cassandra51@cqlsh:system> select * from system.schema_keyspaces where keyspace_name='system_auth';

 keyspace_name | durable_writes | strategy_class                                       | strategy_options
---------------+----------------+------------------------------------------------------+---------------------------------
   system_auth |           True | org.apache.cassandra.locator.NetworkTopologyStrategy | {"DC2":"9","DC1":"3","DC3":"1"}

2) No nodes were down at the time, but all 3 nodes in DC1 were unresponsive.
We could not log in even with cqlsh.

3) The clients do not use the cassandra user. Authentication has been live for nearly one month, and we hit this problem for the first time only yesterday.

We are also using LOCAL_QUORUM.
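
For reference, a minimal sketch of how many replicas a LOCAL_QUORUM request would need in each data center under the strategy_options shown above. It assumes the usual quorum rule of floor(RF / 2) + 1 per data center. The helper name local_quorum is hypothetical and is not part of Cassandra or any driver, and the sketch says nothing about which consistency level the auth lookup itself uses.

```python
import json

def local_quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum, assuming floor(RF / 2) + 1."""
    return replication_factor // 2 + 1

# strategy_options copied from the system.schema_keyspaces output above
strategy_options = json.loads('{"DC2":"9","DC1":"3","DC3":"1"}')

# Replicas required per data center for a LOCAL_QUORUM request
required = {dc: local_quorum(int(rf)) for dc, rf in strategy_options.items()}

for dc in sorted(required):
    print(f"{dc}: RF={strategy_options[dc]}, LOCAL_QUORUM needs {required[dc]} replica(s)")
```

Under that assumption, a LOCAL_QUORUM read coordinated in DC1 needs 2 of its 3 replicas to answer.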


was (Author: wavelet):
Thanks, Sam, for your reply.
1) No, we are using NetworkTopologyStrategy now.
cassandra51@cqlsh:system> select * from system.schema_keyspaces where keyspace_name='system_auth';

 keyspace_name | durable_writes | strategy_class                                       | strategy_options
---------------+----------------+------------------------------------------------------+---------------------------------
   system_auth |           True | org.apache.cassandra.locator.NetworkTopologyStrategy | {"DC2":"9","DC1":"3","DC3":"1"}

2) No nodes were down at the time, but all 3 nodes in DC1 were unresponsive.
We could not log in even with cqlsh.

3) The clients do not use the cassandra user. Authentication has been live for nearly one month, and we hit this problem for the first time only yesterday.

> Cassandra is hang and cqlsh was not able to login with OperationTimeout error
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-12103
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12103
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core, Local Write-Read Paths
>         Environment: centos 6.5 cassandra 2.1.9
>            Reporter: peng xiao
>            Priority: Critical
>         Attachments: system.log.2016-06-28_1257.gz
>
>
> Hi,
> We have two DCs (DC1 and DC2), with 3 nodes in DC1 and 9 nodes in DC2.
> We experienced a timeout error today: all applications connected to DC1 hung with no response, and even cqlsh was not able to log in to any node in DC1.
> I restarted the 3 nodes in DC1, but the problem was not resolved.
> We then switched to DC2, and the applications went back to normal.
> Could you please take a look?
> Thanks
> There are many errors like the one below:
> ERROR [SharedPool-Worker-43] 2016-06-28 11:58:49,705 Message.java:538 - Unexpected exception during request; channel = [id: 0x87e315d6, /172.16.10.198:13604 => /172.16.11.13:9042]
> java.lang.RuntimeException: org.apache.cassandra.exceptions.ReadTimeoutException: Operation timed out - received only 0 responses.
>         at org.apache.cassandra.auth.Auth.selectUser(Auth.java:276) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.auth.Auth.isExistingUser(Auth.java:86) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.service.ClientState.login(ClientState.java:206) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:82) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439) [apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [apache-cassandra-2.1.9.jar:2.1.9]
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0]
>         at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.9.jar:2.1.9]
>         at java.lang.Thread.run(Thread.java:744) [na:1.8.0]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)