Posted to commits@cassandra.apache.org by "Anuj (JIRA)" <ji...@apache.org> on 2015/01/09 12:22:35 UTC

[jira] [Commented] (CASSANDRA-8479) Timeout Exception on Node Failure in Remote Data Center

    [ https://issues.apache.org/jira/browse/CASSANDRA-8479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270893#comment-14270893 ] 

Anuj commented on CASSANDRA-8479:
---------------------------------

I have attached TRACE-level logs. Multiple ReadTimeoutException entries appear in System.log.3. Once we killed Cassandra on one of the nodes in DC2, around 7 read requests failed on DC1 over a period of about 17 seconds, after which everything returned to normal. We need to understand why these reads failed even though our application uses LOCAL_QUORUM. In another Cassandra log file, System.log.2, we also saw java.nio.file.NoSuchFileException.
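
For reference, the expectation behind the LOCAL_QUORUM question can be written out as simple arithmetic. The sketch below is illustrative only and is not Cassandra's code: the class and method names are hypothetical, and it assumes that RF=3 applies to each data center. Under that assumption a quorum is 2 of the 3 local replicas, and replicas in the remote DC are not counted at all.

```java
// Illustrative sketch only (hypothetical names); not taken from Cassandra's source.
final class LocalQuorumSketch {

    // Number of replicas that must respond for a quorum: floor(rf / 2) + 1.
    static int quorum(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    // LOCAL_QUORUM counts only replicas in the coordinator's own data center,
    // so the number of remote replicas that are down is not an input here.
    static boolean localQuorumReachable(int localRf, int localReplicasDown) {
        if (localReplicasDown < 0 || localReplicasDown > localRf) {
            throw new IllegalArgumentException("invalid number of down replicas");
        }
        return localRf - localReplicasDown >= quorum(localRf);
    }

    public static void main(String[] args) {
        int localRf = 3; // assumption: RF=3 in each data center
        System.out.println("replicas required: " + quorum(localRf));
        System.out.println("all DC1 replicas up, quorum reachable: "
                + localQuorumReachable(localRf, 0));
    }
}
```

By this arithmetic, a node failure in DC2 alone should leave DC1 reads at LOCAL_QUORUM unaffected, which is why the timeouts are unexpected.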

During these 17 seconds, our application logs showed Hector's HTimedOutException.
Stack trace from the application logs:
com.ericsson.rm.service.voucher.InternalServerException: Internal server error, me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
	at com.ericsson.rm.voucher.traffic.reservation.cassandra.CassandraReservation.getReservationSlice(CassandraReservation.java:552) ~[na:na]
	at com.ericsson.rm.voucher.traffic.reservation.cassandra.CassandraReservation.lookup(CassandraReservation.java:499) ~[na:na]
	at com.ericsson.rm.voucher.traffic.VoucherTraffic.getReservedOrPendingVoucher(VoucherTraffic.java:764) ~[na:na]
	at com.ericsson.rm.voucher.traffic.VoucherTraffic.commit(VoucherTraffic.java:686) ~[na:na]
	... 6 common frames omitted
Caused by: com.ericsson.rm.service.cassandra.xa.ConnectionException: me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
	at com.ericsson.rm.cassandra.xa.keyspace.row.KeyedRowQuery.execute(KeyedRowQuery.java:93) ~[na:na]
	at com.ericsson.rm.voucher.traffic.reservation.cassandra.CassandraReservation.getReservationSlice(CassandraReservation.java:548) ~[na:na]
	... 9 common frames omitted
Caused by: me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
	at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:42) ~[na:na]
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:286) ~[na:na]
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:269) ~[na:na]
	at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:104) ~[na:na]
	at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:258) ~[na:na]
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:132) ~[na:na]
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:290) ~[na:na]
	at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53) ~[na:na]
	at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49) ~[na:na]
	at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) ~[na:na]
	at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:101) ~[na:na]
	at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48) ~[na:na]
	at com.ericsson.rm.cassandra.xa.keyspace.row.KeyedRowQuery.execute(KeyedRowQuery.java:77) ~[na:na]
	... 10 common frames omitted
Caused by: org.apache.cassandra.thrift.TimedOutException: null
	at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.read(Cassandra.java:11504) ~[na:na]
	at org.apache.cassandra.thrift.Cassandra$get_slice_result$get_slice_resultStandardScheme.read(Cassandra.java:11453) ~[na:na]
	at org.apache.cassandra.thrift.Cassandra$get_slice_result.read(Cassandra.java:11379) ~[na:na]
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) ~[na:na]
	at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:653) ~[na:na]
	at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:637) ~[na:na]
	at me.prettyprint.cassandra.service.KeyspaceServiceImpl$7.execute(KeyspaceServiceImpl.java:274) ~[na:na]
	... 21 common frames omitted
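
The trace above wraps the Thrift TimedOutException in several layers of application and Hector exceptions. A minimal sketch of how application code could walk such a chain to reach the innermost cause when logging or classifying failures is shown below. It uses only java.lang.Throwable; the class and method names are hypothetical and the exceptions in main are stand-ins, not the real Hector or Thrift types.

```java
// Illustrative sketch only (hypothetical names); uses plain JDK exception types.
final class RootCauseSketch {

    // Follow getCause() until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        // The second check guards against an exception that reports itself as its cause.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for the nested exceptions seen in the trace.
        Throwable innermost = new RuntimeException("stand-in for TimedOutException");
        Throwable middle = new RuntimeException("stand-in for HTimedOutException", innermost);
        Throwable outer = new IllegalStateException("stand-in for InternalServerException", middle);

        System.out.println(rootCause(outer).getMessage());
    }
}
```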

Please have a look at https://issues.apache.org/jira/browse/CASSANDRA-8352 for more details about the issue.




> Timeout Exception on Node Failure in Remote Data Center
> -------------------------------------------------------
>
>                 Key: CASSANDRA-8479
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8479
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API, Core, Tools
>         Environment: Unix, Cassandra 2.0.11
>            Reporter: Amit Singh Chowdhery
>            Assignee: Ryan McGuire
>            Priority: Minor
>         Attachments: TRACE_LOGS.zip
>
>
> Issue Faced:
> We have a geo-redundant setup with 2 data centers of 3 nodes each. When we bring down a single Cassandra node in DC2 with kill -9 <Cassandra-pid>, reads fail on DC1 with TimedOutException for a brief period (~15-20 sec).
> Reference:
> A ticket has already been opened and resolved for this; the link is below:
> https://issues.apache.org/jira/browse/CASSANDRA-8352
> Activity Done as per Resolution Provided:
> Upgraded to Cassandra 2.0.11.
> We have two 3-node clusters in two different DCs, and if one or more nodes go down in one data center, ~5-10% traffic failure is observed on the other.
> CL: LOCAL_QUORUM
> RF=3


