Posted to user@cassandra.apache.org by srmore <co...@gmail.com> on 2013/11/10 01:02:17 UTC

A lot of MUTATION and REQUEST_RESPONSE messages dropped

I recently upgraded to 1.2.9 and I am seeing a lot of REQUEST_RESPONSE and
MUTATION messages being dropped.

This happens when I have multiple nodes in the cluster (about 3 nodes) and
I send traffic to only one node. I don't think the traffic is that high; it
is around 400 msg/sec with 100 threads. When I take down the other two
nodes I don't see any errors (at least on the client side). I am using
Pelops.

On the client I get UnavailableException, but the nodes are up. Initially I
thought I was hitting CASSANDRA-6297 (gossip thread blocking), so I changed
memtable_flush_writers to 3. Still no luck.
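
For reference, the change was roughly this in cassandra.yaml (a sketch of
the relevant line only, not my full config):

    # cassandra.yaml (sketch)
    # memtable_flush_writers controls how many threads flush memtables to disk in parallel
    memtable_flush_writers: 3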

UnavailableException:
org.scale7.cassandra.pelops.exceptions.UnavailableException: null
        at org.scale7.cassandra.pelops.exceptions.IExceptionTranslator$ExceptionTranslator.translate(IExceptionTranslator.java:61) ~[na:na]
        at

In the debug log on the Cassandra node, this is the exception I see:

DEBUG [Thrift:78] 2013-11-09 16:47:28,212 CustomTThreadPoolServer.java
Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
        at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)

Could this be because of high load? With Cassandra 1.0.11 I did not see
this issue.

Thanks,
Sandeep

Re: A lot of MUTATION and REQUEST_RESPONSE messages dropped

Posted by srmore <co...@gmail.com>.
The problem was the cross_node_timeout value: I had it set to true while my
NTP clocks were not synchronized, and as a result some of the requests were
dropped.
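
For anyone else hitting this, the setting in question is sketched below
(the comments are my summary of its behaviour, not the stock cassandra.yaml
text):

    # cassandra.yaml (sketch)
    # When true, a replica uses the timestamp the coordinator put on the message
    # to work out how long the request has already been in flight, and drops it
    # if it has exceeded the timeout. With unsynchronized clocks, requests can
    # look older than they really are and get dropped.
    # The shipped default is false, which only counts time from local receipt.
    cross_node_timeout: false

Either leave it at false or make sure ntpd is running and the clocks are
synchronized on every node before enabling it.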

Thanks,
Sandeep

