Posted to user@cassandra.apache.org by "Tiwari, Dushyant" <Du...@morganstanley.com> on 2012/03/05 11:32:20 UTC

Mutation Dropped Messages

Hi All,

While benchmarking Cassandra I found "Mutation Dropped" messages in the logs. I know this is a good old question, but it would be really great if someone could provide a checklist for recovering when this happens. I am looking for answers to the following questions:


1. Which parameters should I tune in the config files? I am especially interested in heavy write workloads.

2. What is the difference between a TimedOutException and silently dropped mutation messages while operating at a CL of QUORUM?


Regards,
Dushyant

--------------------------------------------------------------------------
NOTICE: Morgan Stanley is not acting as a municipal advisor and the opinions or views contained herein are not intended to be, and do not constitute, advice within the meaning of Section 975 of the Dodd-Frank Wall Street Reform and Consumer Protection Act. If you have received this communication in error, please destroy all electronic and paper copies and notify the sender immediately. Mistransmission is not intended to waive confidentiality or privilege. Morgan Stanley reserves the right, to the extent permitted under applicable law, to monitor electronic communications. This message is subject to terms available at the following link: http://www.morganstanley.com/disclaimers. If you cannot access these links, please notify us by reply message and we will send the contents to you. By messaging with Morgan Stanley you consent to the foregoing.

Re: Mutation Dropped Messages

Posted by aaron morton <aa...@thelastpickle.com>.
> 1. One node is running at 8G, the rest at 10G - same config
Make them all the same.

> 2. Nodetool -
Even though the token ranges are not balanced, the load looks a little odd. Have you moved tokens? Did you run a cleanup?
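
The imbalance can be checked directly from the tokens. The following is a minimal sketch, assuming the cluster uses RandomPartitioner (token space 0 to 2**127) and that each node owns the range from the previous token up to its own. The function names are hypothetical helpers, not part of nodetool.

```python
RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)

# Tokens copied from the nodetool ring output posted in this thread.
TOKENS = [
    34957844353235424160784456632419943350,
    77493140218352732093706282561390969782,
    99065646426277998282363457251162269147,
    120028436083470040026628108490361996214,
    162563731948587347959549934419333022646,
]


def ownership_percent(tokens):
    """Percent of the ring owned by each token: the range (previous, token]."""
    ordered = sorted(tokens)
    shares = []
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # i == 0 wraps around to the last token
        # A single-node ring has width 0 after the modulo; it owns everything.
        width = (token - previous) % RING_SIZE or RING_SIZE
        shares.append(round(100 * width / RING_SIZE, 2))
    return shares


def balanced_tokens(node_count):
    """Evenly spaced tokens, one per node."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


print(ownership_percent(TOKENS))              # current ring
print(ownership_percent(balanced_tokens(5)))  # evenly spaced ring
```

For the tokens in this thread the first line prints [25.0, 25.0, 12.68, 12.32, 25.0], which matches the Owns column; an evenly spaced five-node ring would give 20.0 for every node.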

You'll need to look at the node that is dropping messages (I'm not sure which one that is).

What is happening in the log? Is it having GC problems?
What is happening with the IO and CPU load on the machine?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



RE: Mutation Dropped Messages

Posted by "Tiwari, Dushyant" <Du...@morganstanley.com>.
1. One node is running at 8G, the rest at 10G - same config

2. Nodetool -

Status State   Load            Owns    Token
                                       162563731948587347959549934419333022646
Up     Normal  107.79 MB       25.00%  34957844353235424160784456632419943350
Up     Normal  116.44 MB       25.00%  77493140218352732093706282561390969782
Up     Normal  27.01 MB        12.68%  99065646426277998282363457251162269147
Up     Normal  35.9 MB         12.32%  120028436083470040026628108490361996214
Up     Normal  512.55 KB       25.00%  162563731948587347959549934419333022646



RF: 2 and CL: QUORUM. Writes arrive at a rate of 1750 rows/s; every row has 5 columns, and 2 of them are indexed.
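
One thing worth noting about this combination: QUORUM is computed as floor(RF / 2) + 1, so at RF 2 it requires both replicas to acknowledge every write. A minimal sketch of the arithmetic follows; the function names are hypothetical.

```python
def quorum(replication_factor):
    """Replicas that must acknowledge at CL QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def replicas_allowed_to_miss(replication_factor):
    """How many replicas can fail to respond while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(rf, quorum(rf), replicas_allowed_to_miss(rf))
```

At RF 2 no replica may miss a write, so a single overloaded node that drops mutations is likely to surface as timeouts on the client. At RF 3 the quorum is still 2, which leaves room for one slow replica.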



Thanks,
Dushyant




Re: Mutation Dropped Messages

Posted by aaron morton <aa...@thelastpickle.com>.
> I increased the size of the cluster and also the concurrent_writes parameter. Still, there is one node which keeps dropping mutation messages.
Ensure all the nodes have the same spec, and the nodes have the same config. In a virtual environment consider moving the node.

> Is this due to some improper load balancing?
What does nodetool ring say, and what sort of queries (and at what RF and CL) are you sending?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



RE: Mutation Dropped Messages

Posted by "Tiwari, Dushyant" <Du...@morganstanley.com>.
Hey Aaron,

I increased the size of the cluster and also the concurrent_writes parameter. Still, there is one node which keeps dropping mutation messages. The other nodes are not dropping mutation messages. I am using the Hector API and have done nothing for load balancing so far; I just provided the host:port of the nodes in the Cassandrahostconfig. Is this due to some improper load balancing? Also, the physical host where this node runs is more heavily loaded than the other nodes' hosts. What can I do to improve this?
PS: The node is a seed of the cluster.

Thanks,
Dushyant


RE: Mutation Dropped Messages

Posted by "Tiwari, Dushyant" <Du...@morganstanley.com>.
Thanks a lot for the concurrent_writes hint; it really improves the throughput. Do you mean that dropped messages without a TimedOutException mean the data was written somewhere in the cluster, and that by taking corrective measures the desired CL can be achieved?




Re: Mutation Dropped Messages

Posted by aaron morton <aa...@thelastpickle.com>.
> 1. Which parameters to tune in the config files? - Especially looking for heavy writes
The node is overloaded. It may be because there are not enough nodes, or because the node is under temporary stress such as GC or repair.
If you have spare IO / CPU capacity you could increase concurrent_writes to increase throughput on the write stage. You then need to ensure the commit log and, to a lesser degree, the data volumes can keep up.
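
As a starting point for that setting, the comments in cassandra.yaml suggest roughly 8 writer threads per core. The sketch below only computes that rule of thumb; the function name is hypothetical, and the right value still depends on whether the disks and CPU have headroom, as described above.

```python
import os


def suggested_concurrent_writes(cores=None):
    """Starting point only: 8 writer threads per core (rule of thumb)."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    return 8 * cores


# Prints a line that could be compared against the value in cassandra.yaml.
print(f"concurrent_writes: {suggested_concurrent_writes()}")
```

Treat the result as a value to benchmark from, not a target; raising it on a node that is already IO-bound will not help.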

> 2. What is the difference between TimedOutException and silently dropping mutation messages while operating on a CL of QUORUM.
A TimedOutException means that fewer than CL nodes responded to the coordinator before rpc_timeout. A message is dropped when it is removed from the queue in a thread pool after rpc_timeout has elapsed. It is a feature of the architecture, and correct behaviour under stress.
Inconsistencies created by dropped messages are repaired via reads at high CL, HH (Hinted Handoff, in 1.0+), Read Repair or Anti Entropy.
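
The dropping behaviour can be illustrated with a toy model. This is a sketch under simplifying assumptions (one worker thread, simulated time, fixed service time), not Cassandra's actual code; the function and parameter names are hypothetical.

```python
from collections import deque


def run_write_stage(arrivals_ms, service_ms, rpc_timeout_ms):
    """Toy single-threaded write stage.

    arrivals_ms: sorted arrival times of mutations, in milliseconds.
    service_ms: time the worker needs to apply one mutation.
    Returns (applied, dropped).
    """
    queue = deque(arrivals_ms)
    clock = 0
    applied = dropped = 0
    while queue:
        enqueued_at = queue.popleft()
        clock = max(clock, enqueued_at)  # worker idles until the message arrives
        if clock - enqueued_at > rpc_timeout_ms:
            # The coordinator has already given up on this message,
            # so doing the work would only add to the overload.
            dropped += 1
            continue
        clock += service_ms
        applied += 1
    return applied, dropped


# A burst of 10 mutations at t=0, 30 ms each, with a 100 ms timeout.
print(run_write_stage([0] * 10, 30, 100))
# The same 10 mutations spread 50 ms apart.
print(run_write_stage(list(range(0, 500, 50)), 30, 100))
```

The burst prints (4, 6): the first four mutations are applied, and the rest have waited longer than the timeout by the time the worker reaches them. The spread-out load prints (10, 0). Dropping stale work this way lets an overloaded node catch up instead of falling further behind.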

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
