Posted to user@cassandra.apache.org by Yuhan Zhang <yz...@onescreen.com> on 2011/10/05 22:14:48 UTC

TimedOutException and UnavailableException from multiGetSliceQuery

Hi all,

I have been experiencing UnavailableException and TimedOutException on a
3-node Cassandra cluster during a multiGetSliceQuery with 1000 columns.
Since the query involves many keys, I divided them into groups of 5000 rows
and process each group individually in a for loop, but that does not seem
to help. Once the TimedOutException appears, further requests to Cassandra
cause UnavailableException. However, the servers recover after a while
without intervention.

Which settings should I pay attention to in order to fix the problem? It
has become very frequent recently.


Thank you.

Yuhan

The exception looks like:

1/10/05 13:05:31 ERROR connection.HConnectionManager: Could not fullfill request on this host CassandraClient<ec2-75-101-238-70.compute-1.amazonaws.com:9160-33>
11/10/05 13:05:31 ERROR connection.HConnectionManager: Exception:
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:32)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:161)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$3.execute(KeyspaceServiceImpl.java:143)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:155)
...
Caused by: TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12104)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:732)


11/10/05 20:06:05 ERROR connection.HConnectionManager: Could not fullfill request on this host CassandraClient<ec2-184-73-116-237.compute-1.amazonaws.com:9160-15>
11/10/05 20:06:05 ERROR connection.HConnectionManager: Exception:
me.prettyprint.hector.api.exceptions.HUnavailableException: UnavailableException()
    at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:50)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:397)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:383)
    at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)

Caused by: UnavailableException()
    at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:9620)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:636)
    at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:608)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl$10.execute(KeyspaceServiceImpl.java:388)
    ... 35 more

Re: TimedOutException and UnavailableException from multiGetSliceQuery

Posted by Yuhan Zhang <yz...@onescreen.com>.
Hi Aaron,

Thanks for the suggestion. It works again after I cut back the number of rows.

On Wed, Oct 5, 2011 at 1:43 PM, aaron morton <aa...@thelastpickle.com> wrote:

> 5000 rows in a multiget is way, way, way (did I say way?) too many.
>
> Whenever you get a TimedOutException, check the thread pool stats
> (nodetool tpstats) on the nodes; you will normally see a high pending
> count. Every row get turns into a message in a thread pool, so if you ask
> for 5k rows you flood the pool with 5k messages, which will often leave
> the node(s) temporarily overloaded.
>
> More is not always more. I would guess 100 as a starting point, and I
> would be doubtful that you would see much benefit beyond 1000. pycassa
> defaults to 1024:
> https://github.com/pycassa/pycassa/blob/master/pycassa/columnfamily.py#L63
>
>
>  Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com

Re: TimedOutException and UnavailableException from multiGetSliceQuery

Posted by aaron morton <aa...@thelastpickle.com>.
5000 rows in a multiget is way, way, way (did I say way?) too many.

Whenever you get a TimedOutException, check the thread pool stats (nodetool tpstats) on the nodes; you will normally see a high pending count. Every row get turns into a message in a thread pool, so if you ask for 5k rows you flood the pool with 5k messages, which will often leave the node(s) temporarily overloaded.

More is not always more. I would guess 100 as a starting point, and I would be doubtful that you would see much benefit beyond 1000. pycassa defaults to 1024: https://github.com/pycassa/pycassa/blob/master/pycassa/columnfamily.py#L63
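
A minimal sketch of that batching idea, in Python since pycassa is the reference point here. The names `chunked` and `multiget_in_batches`, the `fetch` callable, and the default batch size of 100 are assumptions for illustration only; `fetch` stands in for whatever multiget call your client exposes (a Hector multiget slice query, pycassa's `ColumnFamily.multiget`, and so on), and nothing below talks to a real cluster.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def chunked(keys: Iterable[K], size: int) -> Iterator[List[K]]:
    """Yield successive lists of at most `size` keys."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(keys)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def multiget_in_batches(
    keys: Iterable[K],
    fetch: Callable[[List[K]], Dict[K, V]],
    batch_size: int = 100,  # assumed starting point; tune against your cluster
) -> Dict[K, V]:
    """Call `fetch` once per batch of keys and merge the per-batch results."""
    rows: Dict[K, V] = {}
    for batch in chunked(keys, batch_size):
        rows.update(fetch(batch))
    return rows


if __name__ == "__main__":
    # Hypothetical stand-in for a real client call: it only records how many
    # keys each request carried and returns a dummy row per key.
    request_sizes = []

    def fake_fetch(batch):
        request_sizes.append(len(batch))
        return {key: {"col": key} for key in batch}

    result = multiget_in_batches(range(5000), fake_fetch, batch_size=100)
    # 5000 keys become 50 requests of 100 keys each.
    print(len(result), len(request_sizes), max(request_sizes))  # prints: 5000 50 100
```

With a batch size of 100, no single request puts more than 100 row reads on the nodes at once, instead of 5000. Whether 100 or something closer to 1000 works best depends on row size and cluster load, so treat the number as something to measure rather than a fixed rule.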

 Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
