Posted to user@cassandra.apache.org by cem <ca...@gmail.com> on 2013/06/18 19:12:01 UTC

Dropped mutation messages

Hi All,

I have a cluster of 5 nodes with C* 1.2.4.

Each node has 4 disks 1 TB each.

I see a lot of dropped messages once each node stores about 400 GB per disk (1.6 TB
per node).

The recommendation before 1.2 was 500 GB max per node. DataStax says that
with 1.2 we can store terabytes of data per node:
http://www.datastax.com/docs/1.2/cluster_architecture/cluster_planning

Do I need to enable anything to take advantage of the 1.2 improvements? Do you
have any other advice?

What should be the path to investigate this?

Thanks in advance!

Best Regards,
Cem.

Re: Dropped mutation messages

Posted by aaron morton <aa...@thelastpickle.com>.
> What should be the path to investigate this?
Dropped messages are a symptom of other problems. 

Look for the GCInspector logging lots of ParNew collections, for the IO system being overloaded, or for large (thousands of rows/columns) read or write batches from the client.
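If you want to put numbers on that, a quick scan of system.log works. A minimal sketch — the log line shapes below are assumptions and vary between versions, so check the regexes against what your nodes actually print:

```python
import re
from collections import Counter

# Assumed (version-dependent) log line shapes -- adjust to taste:
#   INFO GCInspector: GC for ParNew: 450 ms for 2 collections, ...
#   WARN MessagingService: MUTATION messages dropped in last 5000ms: 120
PARNEW_RE = re.compile(r"GC for ParNew: (\d+) ms")
DROPPED_RE = re.compile(r"(\w+) messages dropped in last \d+ms: (\d+)")

def summarize(log_lines, parnew_threshold_ms=200):
    """Count long ParNew pauses and total dropped messages per verb."""
    long_pauses = 0
    dropped = Counter()
    for line in log_lines:
        m = PARNEW_RE.search(line)
        if m and int(m.group(1)) >= parnew_threshold_ms:
            long_pauses += 1
        m = DROPPED_RE.search(line)
        if m:
            dropped[m.group(1)] += int(m.group(2))
    return long_pauses, dropped
```

Feed it `open("/var/log/cassandra/system.log")`; if the long-pause count climbs together with the dropped MUTATION count, GC pressure is the first thing to chase.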
Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/06/2013, at 12:40 AM, Shahab Yunus <sh...@gmail.com> wrote:

> What do you mean by "The queries need to be lightened"?


Re: Dropped mutation messages

Posted by Shahab Yunus <sh...@gmail.com>.
Hello Arthur,

What do you mean by "The queries need to be lightened"?

Thanks,
Shahb


On Tue, Jun 18, 2013 at 8:47 PM, Arthur Zubarev <Ar...@aol.com> wrote:

> The queries need to be lightened

Re: Dropped mutation messages

Posted by Arthur Zubarev <Ar...@Aol.com>.
Cem hi,

as per http://wiki.apache.org/cassandra/FAQ#dropped_messages

Internode messages which are received by a node but do not get processed within rpc_timeout are dropped rather than processed, as the coordinator node will no longer be waiting for a response. If the coordinator node does not receive Consistency Level responses before the rpc_timeout, it will return a TimedOutException to the client; if it does receive them, it will return success to the client.

For MUTATION messages this means that the mutation was not applied to all replicas it was sent to. The inconsistency will be repaired by Read Repair or Anti Entropy Repair.

For READ messages this means a read request may not have completed.

Load shedding is part of the Cassandra architecture, if this is a persistent issue it is generally a sign of an overloaded node or cluster.

By the way, I am on C* 1.2.4 too, in dev mode. After my node filled up with 400 GB I started getting RPC timeouts on large data retrievals, so in short, you may need to revise how you query.

The queries need to be lightened.
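By lightened I mean splitting one huge request into many small ones, so that each individual request finishes well inside rpc_timeout. A toy sketch of the idea — `fetch` here is a hypothetical stand-in for whatever client call you actually make:

```python
def fetch_in_batches(keys, fetch, batch_size=100):
    """Issue many small requests instead of one huge one, so each
    individual request stays comfortably inside rpc_timeout.

    fetch: a callable taking a list of keys and returning a dict of
    results (a stand-in for your client library's multiget/slice call).
    """
    results = {}
    for i in range(0, len(keys), batch_size):
        results.update(fetch(keys[i:i + batch_size]))
    return results
```

The right batch_size depends on row width and cluster load; start small and grow it while watching for timeouts.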

/Arthur


From: cem 
Sent: Tuesday, June 18, 2013 1:12 PM
To: user@cassandra.apache.org 
Subject: Dropped mutation messages
