Posted to user@cassandra.apache.org by Peng Xiao <25...@qq.com> on 2017/08/05 02:18:36 UTC

Re: MUTATION messages were dropped in last 5000 ms for cross node timeout

Hi,


Does message drop mean data loss?

Thanks


------------------ Original ------------------
From: Akhil Mehra <ak...@gmail.com>
Date: Fri, Aug 4, 2017 16:00
To: user <us...@cassandra.apache.org>
Subject: Re: MUTATION messages were dropped in last 5000 ms for cross node timeout



Glad I could be of help :)

Hopefully the partition size resize goes smoothly.


Regards,
Akhil

On 4/08/2017, at 5:41 AM, ZAIDI, ASAD A <az...@att.com> wrote:

Hi Akhil,
 
Thank you for your reply.
 
Over the last week I kept testing different timeout values and eventually settled on setting the *_request_timeout_in_ms parameters to 1.5 minutes for coordinator wait time. That is the value at which I do not see any dropped mutations.
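
For reference, a quick sketch of what 1.5 minutes means in the millisecond units that the *_request_timeout_in_ms settings take. The helper name is made up for illustration, and the 2000 ms figure is the write_request_timeout_in_ms default cited elsewhere in this thread; other timeouts may have different defaults.

```python
# Hypothetical helper, not part of Cassandra: convert minutes to the
# millisecond values that *_request_timeout_in_ms settings expect.
def minutes_to_ms(minutes: float) -> int:
    return int(minutes * 60 * 1000)

DEFAULT_WRITE_TIMEOUT_MS = 2000  # write timeout default cited in this thread

timeout_ms = minutes_to_ms(1.5)
print(timeout_ms)                             # 90000
print(timeout_ms / DEFAULT_WRITE_TIMEOUT_MS)  # 45.0
```

So the setting described above is roughly 45 times the default write timeout, which is why it reads as a stopgap rather than a permanent value.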
 
I also asked the developers to tweak the data model, since we saw a bunch of tables with really large partitions, some with partition sizes around ~6.6 GB. We’re now working to reduce the partition size of those tables. I am hoping the corrected data model will help reduce coordinator wait time again (and get us back to the default number!).
 
Thanks again/Asad
 
From: Akhil Mehra [mailto:akhilmehra@gmail.com] 
Sent: Friday, July 21, 2017 4:24 PM
To: user@cassandra.apache.org
Subject: Re: MUTATION messages were dropped in last 5000 ms for cross node timeout


 
Hi Asad,
 

The 5000 ms is not configurable (https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/net/MessagingService.java#L423). It is just the interval at which the number of dropped messages is reported; in other words, dropped messages are reported every 5000 ms.

 

If you are looking to tweak the number of milliseconds after which a message is considered dropped, you need to use write_request_timeout_in_ms (http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html), which can be used to increase the mutation timeout. By default it is set to 2000 ms.
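
To make the distinction concrete, here is a minimal sketch of how a node might decide whether a mutation has timed out. It is a simplified model, not Cassandra's actual code: the Message class, the should_drop function, and the field names are invented for illustration, and it assumes that with cross_node_timeout enabled the clock starts at the sender's timestamp rather than at local receipt.

```python
from dataclasses import dataclass

@dataclass
class Message:
    verb: str
    sent_at_ms: int      # timestamp stamped by the sending node (assumed field)
    received_at_ms: int  # local time when this node received it (assumed field)

def should_drop(msg: Message, now_ms: int,
                timeout_ms: int = 2000,
                cross_node_timeout: bool = False) -> bool:
    """Return True if the message has waited longer than timeout_ms.

    Assumption: with cross_node_timeout enabled the wait is measured from
    the sender's timestamp, otherwise from local receipt.
    """
    start_ms = msg.sent_at_ms if cross_node_timeout else msg.received_at_ms
    return now_ms - start_ms > timeout_ms

# A mutation sent at t=0 that arrived at t=1500 and is processed at t=2500.
msg = Message(verb="MUTATION", sent_at_ms=0, received_at_ms=1500)
print(should_drop(msg, now_ms=2500, cross_node_timeout=True))   # True
print(should_drop(msg, now_ms=2500, cross_node_timeout=False))  # False
```

In this model, raising timeout_ms makes fewer messages qualify as dropped, while the 5000 ms in the log line only controls how often the running count is printed.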

 

I hope that helps.

 

Regards,

Akhil

 

 

On 22/07/2017, at 2:46 AM, ZAIDI, ASAD A <az...@att.com> wrote:

 
Hi Akhil,

 

Thank you for your reply. Previously, I did ‘tune’ various timeouts (basically increased them a bit), but none of the parameters listed in the link matches the 5000 ms in “were dropped in last 5000 ms”.

I was wondering where that 5000 ms number comes from when, as I mentioned before, none of my timeout parameter settings matches it.

 

Load is intermittently high, but the CPU queue length never goes beyond medium depth. I wonder if there is some internal limit that I’m still not aware of.

 

Thanks/Asad

 

 

From: Akhil Mehra [mailto:akhilmehra@gmail.com] 
Sent: Thursday, July 20, 2017 3:47 PM
To: user@cassandra.apache.org
Subject: Re: MUTATION messages were dropped in last 5000 ms for cross node timeout



 

Hi Asad,


 


http://cassandra.apache.org/doc/latest/faq/index.html#why-message-dropped

 


As mentioned in the link above this is a load shedding mechanism used by Cassandra.


 


Is your cluster under heavy load?


 


Regards,


Akhil


 

 

On 21/07/2017, at 3:27 AM, ZAIDI, ASAD A <az...@att.com> wrote:


 

Hello Folks –


 


I’m using apache-cassandra 2.2.8.


 


I see many messages like the one below in my system.log file. In the cassandra.yaml file, cross_node_timeout: true is set, and an NTP server is running to correct clock drift on the 16-node cluster. I do not see pending or blocked HintedHandoff tasks in the tpstats output, though a bunch of dropped MUTATIONs are observed.


 


<start timeout message >


INFO  [ScheduledTasks:1] 2017-07-20 08:02:52,511 MessagingService.java:946 - MUTATION messages were dropped in last 5000 ms: 822 for internal timeout and 2152 for cross node timeout


<end timeout message >
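
A log line like the one above can be broken down with a small script to see the drop rate per reporting window. This is only a sketch: the regular expression is written against the single sample line in this message, and the function and field names are made up for illustration.

```python
import re

# Pattern derived from the sample log line in this thread (an assumption:
# other Cassandra versions may word the message differently).
DROPPED_RE = re.compile(
    r"(?P<verb>\w+) messages were dropped in last (?P<window_ms>\d+) ms: "
    r"(?P<internal>\d+) for internal timeout and "
    r"(?P<cross_node>\d+) for cross node timeout"
)

def parse_dropped(line: str):
    """Extract drop counts from one system.log line, or None if no match."""
    m = DROPPED_RE.search(line)
    if m is None:
        return None
    window_ms = int(m.group("window_ms"))
    internal = int(m.group("internal"))
    cross_node = int(m.group("cross_node"))
    total = internal + cross_node
    return {
        "verb": m.group("verb"),
        "window_ms": window_ms,
        "internal": internal,
        "cross_node": cross_node,
        "total": total,
        "drops_per_second": total / (window_ms / 1000),
    }

sample = ("INFO  [ScheduledTasks:1] 2017-07-20 08:02:52,511 "
          "MessagingService.java:946 - MUTATION messages were dropped in "
          "last 5000 ms: 822 for internal timeout and 2152 for cross node timeout")
stats = parse_dropped(sample)
print(stats["verb"], stats["total"], stats["drops_per_second"])
```

For the sample line this gives 2974 dropped mutations in the 5000 ms window, a little under 600 per second.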


 


I’d appreciate it if you could let me know what I need to check in order to address these cross node timeouts.


 


Thank you,


Asad

Re: MUTATION messages were dropped in last 5000 ms for cross node timeout

Posted by Jeff Jirsa <jj...@gmail.com>.
No

-- 
Jeff Jirsa


> On Aug 4, 2017, at 7:18 PM, Peng Xiao <25...@qq.com> wrote:
> 
> hi,
> 
> Does message drop mean data loss?
> Thanks
> 
> 