Posted to user@cassandra.apache.org by Ruchir Jha <ru...@gmail.com> on 2014/07/11 14:50:37 UTC

UnavailableException

We have a 12-node cluster and are consistently seeing this exception thrown during peak write traffic. We use a replication factor of 3 and a write consistency level of QUORUM. Note that there is no unusual or full GC activity during this time. Any help is appreciated.
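
For context: Cassandra raises UnavailableException when the coordinator believes fewer replicas are alive than the consistency level requires, so the write is rejected before it is attempted. The following is a minimal sketch of that arithmetic for the numbers in this post. The class and method names (QuorumMath, quorumFor, quorumAvailable) are hypothetical and only illustrate the rule; they are not part of Cassandra or Astyanax.

```java
/** Minimal sketch: how many replicas a QUORUM request needs for a given replication factor. */
final class QuorumMath {

    /** A quorum is a strict majority of the replicas: floor(RF / 2) + 1. */
    static int quorumFor(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        return replicationFactor / 2 + 1;
    }

    /** True if enough replicas are considered alive for a QUORUM request to be attempted. */
    static boolean quorumAvailable(int replicationFactor, int liveReplicas) {
        return liveReplicas >= quorumFor(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // replication factor stated in the post
        System.out.println("quorum for RF=" + rf + ": " + quorumFor(rf));
        for (int live = 0; live <= rf; live++) {
            System.out.println(live + " live replica(s) -> QUORUM possible: "
                    + quorumAvailable(rf, live));
        }
    }
}
```

With RF=3 the quorum is 2, so the exception implies that the coordinator considered at least two of the three replicas for some token range to be down at that moment.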


Re: UnavailableException

Posted by Ruchir Jha <ru...@gmail.com>.

Yes, the line is "Datacenter: datacenter1", which matches my CREATE KEYSPACE
command. As for the NodeDiscoveryType, we will follow that advice, but I
don't believe it is the root of my issue: the nodes start up at least 6 hours
before the UnavailableException occurs, and we would only add nodes after
hours.


On Mon, Jul 14, 2014 at 2:34 PM, Chris Lohfink <cl...@blackbirdit.com>
wrote:

> If you list all 12 nodes in the seeds list, you can try using
> NodeDiscoveryType.NONE instead of RING_DESCRIBE.
>
> Some people recommend it that way anyway, so that if you add nodes to the
> cluster your app won't start using them until bootstrapping is complete and
> everything has settled down.
>
> Chris

Re: UnavailableException

Posted by Chris Lohfink <cl...@blackbirdit.com>.

If you list all 12 nodes in the seeds list, you can try using NodeDiscoveryType.NONE instead of RING_DESCRIBE.

Some people recommend it that way anyway, so that if you add nodes to the cluster your app won't start using them until bootstrapping is complete and everything has settled down.
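
As a rough illustration of "listing all 12 nodes", the sketch below builds a seeds string from the addresses in the nodetool status output posted in this thread. It assumes the client accepts a comma-separated list of host:port pairs, which you should verify for your Astyanax version. Port 9160 is taken from the stack trace in this thread. The SeedList class and toSeeds method are hypothetical helpers, not Astyanax API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Minimal sketch: build a static seeds string for a client that does no ring discovery. */
final class SeedList {

    /** Joins hosts into "host:port,host:port,..." (format is an assumption; verify for your client). */
    static String toSeeds(List<String> hosts, int port) {
        return hosts.stream()
                .map(host -> host + ":" + port)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Addresses copied from the nodetool status output in this thread.
        List<String> hosts = Arrays.asList(
                "10.10.20.15", "10.10.20.19", "10.10.20.35", "10.10.20.31",
                "10.10.20.52", "10.10.20.27", "10.10.20.22", "10.10.20.39",
                "10.10.20.45", "10.10.20.47", "10.10.20.62", "10.10.20.51");
        // 9160 is the Thrift port that appears in the stack trace.
        System.out.println(toSeeds(hosts, 9160));
    }
}
```

The trade-off of a static list is that the application has to be reconfigured whenever nodes are added or removed.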

Chris

On Jul 14, 2014, at 12:04 PM, Ruchir Jha <ru...@gmail.com> wrote:

> Mark,
> 
> Here you go:
> 
> NodeTool status:
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns   Host ID                               Rack
> UN  10.10.20.15  1.62 TB    256     8.1%   01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
> UN  10.10.20.19  1.66 TB    256     8.3%   30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
> UN  10.10.20.35  1.62 TB    256     9.0%   17cb8772-2444-46ff-8525-33746514727d  rack1
> UN  10.10.20.31  1.64 TB    256     8.3%   1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
> UN  10.10.20.52  1.59 TB    256     9.1%   6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
> UN  10.10.20.27  1.66 TB    256     7.7%   76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
> UN  10.10.20.22  1.66 TB    256     8.9%   46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
> UN  10.10.20.39  1.68 TB    256     8.0%   b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
> UN  10.10.20.45  1.49 TB    256     7.7%   8d6bce33-8179-4660-8443-2cf822074ca4  rack1
> UN  10.10.20.47  1.64 TB    256     7.9%   bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
> UN  10.10.20.62  1.59 TB    256     8.2%   84b47313-da75-4519-94f3-3951d554a3e5  rack1
> UN  10.10.20.51  1.66 TB    256     8.9%   0343cd58-3686-465f-8280-56fb72d161e2  rack1
> 
> 
> Astyanax Connection Settings:
> 
> seeds           :12
> maxConns           :16
> maxConnsPerHost    :16
> connectTimeout     :2000
> socketTimeout      :60000
> maxTimeoutCount    :16
> maxBlockedThreadsPerHost:16
> maxOperationsPerConnection:16
> DiscoveryType: RING_DESCRIBE
> ConnectionPoolType: TOKEN_AWARE
> DefaultReadConsistencyLevel: CL_QUORUM
> DefaultWriteConsistencyLevel: CL_QUORUM

Re: UnavailableException

Posted by Chris Lohfink <cl...@blackbirdit.com>.

Is there a line like the following when running nodetool info/status?

Datacenter: datacenter1
=====================

You need to make sure the datacenter name matches the name specified in your keyspace's replication settings.
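
The check can also be scripted. With NetworkTopologyStrategy, replicas are placed per datacenter name, so a name in the keyspace definition that no node reports gets no replicas. The sketch below pulls the "Datacenter:" lines out of captured nodetool status text and compares them, case sensitively, with the keys of the replication map. The class and method names are hypothetical, and the input is assumed to be plain text you have captured yourself.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal sketch: compare keyspace replication keys with datacenter names from nodetool status. */
final class DatacenterNameCheck {

    // Matches lines such as "Datacenter: datacenter1" anywhere in the captured output.
    private static final Pattern DC_LINE =
            Pattern.compile("(?m)^Datacenter:\\s*(\\S+)\\s*$");

    /** Returns the datacenter names found in the captured nodetool status text. */
    static Set<String> datacentersIn(String nodetoolStatusOutput) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = DC_LINE.matcher(nodetoolStatusOutput);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Returns replication keys (other than 'class') that match no reported datacenter. */
    static Set<String> unmatched(Map<String, String> replication, Set<String> reported) {
        Set<String> missing = new TreeSet<>();
        for (String key : replication.keySet()) {
            // String.equals/contains are case sensitive, which is the point of the check.
            if (!key.equals("class") && !reported.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String status = "Datacenter: datacenter1\n=====================\nStatus=Up/Down\n";
        Map<String, String> replication = new HashMap<>();
        replication.put("class", "NetworkTopologyStrategy");
        replication.put("datacenter1", "3"); // as in the CREATE KEYSPACE statement in this thread
        System.out.println("unmatched DC names: "
                + unmatched(replication, datacentersIn(status)));
    }
}
```

An empty result means every datacenter named in the keyspace is one the cluster actually reports.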

Chris

On Jul 14, 2014, at 12:04 PM, Ruchir Jha <ru...@gmail.com> wrote:

> Mark,
> 
> Here you go:
> 
> NodeTool status:
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns   Host ID                               Rack
> UN  10.10.20.15  1.62 TB    256     8.1%   01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
> UN  10.10.20.19  1.66 TB    256     8.3%   30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
> UN  10.10.20.35  1.62 TB    256     9.0%   17cb8772-2444-46ff-8525-33746514727d  rack1
> UN  10.10.20.31  1.64 TB    256     8.3%   1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
> UN  10.10.20.52  1.59 TB    256     9.1%   6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
> UN  10.10.20.27  1.66 TB    256     7.7%   76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
> UN  10.10.20.22  1.66 TB    256     8.9%   46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
> UN  10.10.20.39  1.68 TB    256     8.0%   b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
> UN  10.10.20.45  1.49 TB    256     7.7%   8d6bce33-8179-4660-8443-2cf822074ca4  rack1
> UN  10.10.20.47  1.64 TB    256     7.9%   bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
> UN  10.10.20.62  1.59 TB    256     8.2%   84b47313-da75-4519-94f3-3951d554a3e5  rack1
> UN  10.10.20.51  1.66 TB    256     8.9%   0343cd58-3686-465f-8280-56fb72d161e2  rack1
> 
> 
> Astyanax Connection Settings:
> 
> seeds           :12
> maxConns           :16
> maxConnsPerHost    :16
> connectTimeout     :2000
> socketTimeout      :60000
> maxTimeoutCount    :16
> maxBlockedThreadsPerHost:16
> maxOperationsPerConnection:16
> DiscoveryType: RING_DESCRIBE
> ConnectionPoolType: TOKEN_AWARE
> DefaultReadConsistencyLevel: CL_QUORUM
> DefaultWriteConsistencyLevel: CL_QUORUM

Re: UnavailableException

Posted by Ruchir Jha <ru...@gmail.com>.

Mark,

Here you go:

*NodeTool status:*

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns   Host ID                               Rack
UN  10.10.20.15  1.62 TB    256     8.1%   01a01f07-4df2-4c87-98e9-8dd38b3e4aee  rack1
UN  10.10.20.19  1.66 TB    256     8.3%   30ddf003-4d59-4a3e-85fa-e94e4adba1cb  rack1
UN  10.10.20.35  1.62 TB    256     9.0%   17cb8772-2444-46ff-8525-33746514727d  rack1
UN  10.10.20.31  1.64 TB    256     8.3%   1435acf9-c64d-4bcd-b6a4-abcec209815e  rack1
UN  10.10.20.52  1.59 TB    256     9.1%   6b5aca07-1b14-4bc2-a7ba-96f026fa0e4e  rack1
UN  10.10.20.27  1.66 TB    256     7.7%   76023cdd-c42d-4068-8b53-ae94584b8b04  rack1
UN  10.10.20.22  1.66 TB    256     8.9%   46af9664-8975-4c91-847f-3f7b8f8d5ce2  rack1
UN  10.10.20.39  1.68 TB    256     8.0%   b7d44c26-4d75-4d36-a779-b7e7bdaecbc9  rack1
UN  10.10.20.45  1.49 TB    256     7.7%   8d6bce33-8179-4660-8443-2cf822074ca4  rack1
UN  10.10.20.47  1.64 TB    256     7.9%   bcd51a92-3150-41ae-9c51-104ea154f6fa  rack1
UN  10.10.20.62  1.59 TB    256     8.2%   84b47313-da75-4519-94f3-3951d554a3e5  rack1
UN  10.10.20.51  1.66 TB    256     8.9%   0343cd58-3686-465f-8280-56fb72d161e2  rack1


*Astyanax Connection Settings:*

seeds           :12
maxConns           :16
maxConnsPerHost    :16
connectTimeout     :2000
socketTimeout      :60000
maxTimeoutCount    :16
maxBlockedThreadsPerHost:16
maxOperationsPerConnection:16
DiscoveryType: RING_DESCRIBE
ConnectionPoolType: TOKEN_AWARE
DefaultReadConsistencyLevel: CL_QUORUM
DefaultWriteConsistencyLevel: CL_QUORUM
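
Note that nodetool status shows the cluster as seen by the node it was run on, at the moment it was run. All 12 nodes are UN here, but that does not rule out nodes being marked down briefly during peak traffic. One option is to capture the output periodically during the peak and summarize it, as in the sketch below. The class and method names are hypothetical, and the parser assumes the two-letter status/state code is the first column and the address the second, as in the output above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal sketch: summarize captured nodetool status text by node state. */
final class NodeStatusSummary {

    /** Maps node address to its two-letter status/state code, e.g. "UN" or "DN". */
    static Map<String, String> parse(String nodetoolStatusOutput) {
        Map<String, String> nodes = new LinkedHashMap<>();
        for (String line : nodetoolStatusOutput.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            // First letter: Up/Down. Second letter: Normal/Leaving/Joining/Moving.
            if (fields.length >= 2 && fields[0].matches("[UD][NLJM]")) {
                nodes.put(fields[1], fields[0]);
            }
        }
        return nodes;
    }

    /** Counts nodes that are in any state other than Up/Normal. */
    static long countNotUpNormal(Map<String, String> nodes) {
        return nodes.values().stream().filter(code -> !code.equals("UN")).count();
    }

    public static void main(String[] args) {
        String sample =
                "--  Address      Load       Tokens  Owns\n"
              + "UN  10.10.20.15  1.62 TB    256     8.1%\n"
              + "UN  10.10.20.19  1.66 TB    256     8.3%\n";
        Map<String, String> nodes = parse(sample);
        System.out.println(nodes.size() + " node(s), "
                + countNotUpNormal(nodes) + " not Up/Normal");
    }
}
```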



On Fri, Jul 11, 2014 at 5:04 PM, Mark Reddy <ma...@boxever.com> wrote:

> Can you post the output of nodetool status and your Astyanax connection
> settings?

Re: UnavailableException

Posted by Mark Reddy <ma...@boxever.com>.

Can you post the output of nodetool status and your Astyanax connection
settings?


On Fri, Jul 11, 2014 at 9:06 PM, Ruchir Jha <ru...@gmail.com> wrote:

> This is how we create our keyspace. We just ran this command once through
> a cqlsh session on one of the nodes, so don't quite understand what you
> mean by "check that your DC names match up"
>
> CREATE KEYSPACE prod WITH replication = {
>   'class': 'NetworkTopologyStrategy',
>   'datacenter1': '3'
> };

Re: UnavailableException

Posted by Ruchir Jha <ru...@gmail.com>.

This is how we create our keyspace. We ran this command once through a cqlsh
session on one of the nodes, so I don't quite understand what you mean by
"check that your DC names match up".

CREATE KEYSPACE prod WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'datacenter1': '3'
};



On Fri, Jul 11, 2014 at 3:48 PM, Chris Lohfink <cl...@blackbirdit.com>
wrote:

> What replication strategy are you using? If you are using
> NetworkTopologyStrategy, double-check that your DC names match up (they
> are case sensitive).
>
> Chris

Re: UnavailableException

Posted by Chris Lohfink <cl...@blackbirdit.com>.

What replication strategy are you using? If you are using NetworkTopologyStrategy, double-check that your DC names match up (they are case sensitive).

Chris

On Jul 11, 2014, at 9:38 AM, Ruchir Jha <ru...@gmail.com> wrote:

> Here's the complete stack trace:
> 
> com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException()
>         at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
>         at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
>         at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
>         at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
>         at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
>         at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
>         at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
>         at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
>         at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
> Caused by: UnavailableException()
>         at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>         at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
>         at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
>         at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
>         at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
>         at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
>         ... 12 more

Re: UnavailableException

Posted by Ruchir Jha <ru...@gmail.com>.

Here's the complete stack trace:

com.netflix.astyanax.connectionpool.exceptions.TokenRangeOfflineException: TokenRangeOfflineException: [host=ny4lpcas5.fusionts.corp(10.10.20.47):9160, latency=22784(42874), attempts=3]UnavailableException()
        at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:165)
        at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:65)
        at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
        at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
        at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
        at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:485)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:79)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:123)
Caused by: UnavailableException()
        at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20841)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
        at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:129)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:126)
        at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:60)
        ... 12 more



On Fri, Jul 11, 2014 at 9:11 AM, Prem Yadav <ip...@gmail.com> wrote:

> Please post the full exception.

Re: UnavailableException

Posted by Prem Yadav <ip...@gmail.com>.

Please post the full exception.


On Fri, Jul 11, 2014 at 1:50 PM, Ruchir Jha <ru...@gmail.com> wrote:

> We have a 12-node cluster and are consistently seeing this exception
> thrown during peak write traffic. We use a replication factor of 3 and a
> write consistency level of QUORUM. Note that there is no unusual or full
> GC activity during this time. Any help is appreciated.