Posted to user@cassandra.apache.org by Dane Miller <da...@optimalsocial.com> on 2013/04/12 22:12:43 UTC

unexplained hinted handoff

I'm seeing hinted handoff kick in on all our nodes during periods of
high activity, but all the nodes seem to be up (according to the logs
and nodetool status).  The pattern in the logs is something like this:

18:10:45 194 READ messages dropped in last 5000ms
18:11:10 Started hinted handoff for host:
7668c813-41a9-4d42-b362-5420528fefa0 with IP: /10....
18:11:11 Finished hinted handoff of 13 rows to endpoint /10....

This happens on all the nodes every 10 min, and with a different
endpoint each time.  tpstats shows thousands of dropped reads, but no
other types of messages are dropped.
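
To put numbers on a pattern like this, the log can be tallied with a short script. The following is only a sketch: the regular expressions match the abbreviated lines quoted above, and real system.log lines carry extra prefixes (log level, thread, full timestamp), so the patterns are assumptions that may need adjusting. The `summarize` function name is made up for illustration.

```python
import re
from collections import Counter

# Patterns for the abbreviated log lines quoted in this message (assumed format).
DROPPED = re.compile(r"(\d+) (\w+) messages dropped in last (\d+)ms")
FINISHED = re.compile(r"Finished hinted handoff of (\d+) rows to endpoint (\S+)")


def summarize(lines):
    """Return (dropped message counts by type, hinted rows by endpoint)."""
    dropped = Counter()
    hinted = Counter()
    for line in lines:
        m = DROPPED.search(line)
        if m:
            dropped[m.group(2)] += int(m.group(1))
            continue
        m = FINISHED.search(line)
        if m:
            hinted[m.group(2)] += int(m.group(1))
    return dropped, hinted


# The sample lines are copied from the message, including the elided IP.
sample = [
    "18:10:45 194 READ messages dropped in last 5000ms",
    "18:11:10 Started hinted handoff for host:",
    "7668c813-41a9-4d42-b362-5420528fefa0 with IP: /10....",
    "18:11:11 Finished hinted handoff of 13 rows to endpoint /10....",
]
dropped, hinted = summarize(sample)
print(dict(dropped))
print(dict(hinted))
```

Running it over a whole log file (for example `summarize(open(path))`) would show whether any message type other than READ is being dropped and which endpoints receive the most hinted rows.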

Do slow reads trigger hint storage?  If hints are being stored,
doesn't that imply DOWN nodes, and why don't I see that in the logs?

Dane

Re: unexplained hinted handoff

Posted by Dane Miller <da...@optimalsocial.com>.
On Sun, Apr 14, 2013 at 11:28 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>  If hints are being stored, doesn't that imply DOWN nodes, and why don't I
>> see that in the logs?
>
> Hints are stored for two reasons. First if the node is down when the write
> request starts, second if the node does not reply to the coordinator before
> rpc_timeout. If you are not seeing dropped write messages it may indicate
> network issues between the nodes.

Very helpful! I increased the *_request_timeout settings in
cassandra.yaml and no longer see HH in the logs.  I'm still not sure
why I saw logs about dropped reads but not dropped mutations.
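
For anyone wanting to check which of these timeouts are set on a node, here is a small sketch that pulls them out of the text of cassandra.yaml. It assumes the settings follow the `<name>_request_timeout_in_ms` naming (for example `read_request_timeout_in_ms` and `write_request_timeout_in_ms`); the `request_timeouts` helper and the values in the sample are hypothetical, not the ones used here.

```python
import re

# Matches uncommented "<name>request_timeout_in_ms: <integer>" lines.
TIMEOUT_KEY = re.compile(
    r"^[ \t]*(\w*request_timeout_in_ms)[ \t]*:[ \t]*(\d+)", re.MULTILINE
)


def request_timeouts(yaml_text):
    """Return a dict of request timeout settings found in the given text."""
    return {key: int(value) for key, value in TIMEOUT_KEY.findall(yaml_text)}


# Hypothetical fragment; the numbers are illustrative only.
sample = """\
read_request_timeout_in_ms: 20000
write_request_timeout_in_ms: 20000
# request_timeout_in_ms: 10000
"""
timeouts = request_timeouts(sample)
print(timeouts)
```

Commented-out lines are skipped, so the result reflects only the values explicitly set in the file, not built-in defaults.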

Anyhow, big timeouts help in this case.  Thanks :)

Dane

Re: unexplained hinted handoff

Posted by aaron morton <aa...@thelastpickle.com>.
>> Do slow reads trigger hint storage?
No.
But dropped read messages are often an indicator that the node is overwhelmed.

>>  If hints are being stored, doesn't that imply DOWN nodes, and why don't I see that in the logs?
Hints are stored for two reasons: first, if the node is down when the write request starts; second, if the node does not reply to the coordinator before rpc_timeout. If you are not seeing dropped write messages, it may indicate network issues between the nodes.
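
The two conditions can be written down as a toy model. This is a sketch of the rule as described here, not Cassandra's actual code: the `Replica` class, its fields, and `should_store_hint` are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    is_up: bool                    # liveness as seen by the coordinator at write start
    ack_after_ms: Optional[float]  # time until the ack arrives; None if it never does


def should_store_hint(replica: Replica, timeout_ms: float) -> bool:
    """Toy version of the rule: hint if down, or if no reply within the timeout."""
    # Reason 1: the replica is already considered down when the write starts.
    if not replica.is_up:
        return True
    # Reason 2: the replica is up but does not reply before the timeout.
    return replica.ack_after_ms is None or replica.ack_after_ms > timeout_ms


down = Replica(is_up=False, ack_after_ms=1.0)
healthy = Replica(is_up=True, ack_after_ms=50.0)
slow = Replica(is_up=True, ack_after_ms=12000.0)

print(should_store_hint(down, 10000))     # hint: node is down
print(should_store_hint(healthy, 10000))  # no hint: ack arrives in time
print(should_store_hint(slow, 10000))     # hint: up, but ack is too late
print(should_store_hint(slow, 20000))     # no hint once the timeout is longer
```

The third case is the one that matters in this thread: a replica that is up, and therefore never logged as DOWN, can still cause a hint to be stored simply by answering too slowly.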

>> I'm seeing hinted handoff kick in on all our nodes during periods of
>> high activity,
Are you seeing log messages about hints being sent to nodes?

Cheers

  
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com



Re: unexplained hinted handoff

Posted by Dane Miller <da...@optimalsocial.com>.
On Fri, Apr 12, 2013 at 1:12 PM, Dane Miller <da...@optimalsocial.com> wrote:
> I'm seeing hinted handoff kick in on all our nodes during periods of
> high activity, but all the nodes seem to be up (according to the logs
> and nodetool status).  The pattern in the logs is something like this:
>
> 18:10:45 194 READ messages dropped in last 5000ms
> 18:11:10 Started hinted handoff for host:
> 7668c813-41a9-4d42-b362-5420528fefa0 with IP: /10....
> 18:11:11 Finished hinted handoff of 13 rows to endpoint /10....
>
> This happens on all the nodes every 10 min, and with a different
> endpoint each time.  tpstats shows thousands of dropped reads, but no
> other types of messages are dropped.
>
> Do slow reads trigger hint storage?  If hints are being stored,
> doesn't that imply DOWN nodes, and why don't I see that in the logs?

Sorry, meant to add: Cassandra 1.2.3, Ubuntu 12.04 x64