Posted to user@cassandra.apache.org by Bo Finnerup Madsen <bo...@gmail.com> on 2016/04/20 15:38:48 UTC

When are hints written?

Hi,

We have a small 5-node cluster of m4.xlarge instances that receives writes
from ~20 clients. The clients write as fast as they can, and the whole
process is limited by the write performance of the Cassandra cluster.
After tweaking our schema to avoid large partitions, the load is going OK
and we don't see any warnings or errors in the Cassandra logs. But we do
see quite a lot of hinted handoff activity. During the load, the Cassandra
nodes are quite loaded, with Linux reporting a load average as high as 20.

I have read the available documentation on how hints work, and to my
understanding hints should only be written if a node is down. But as far as
I can see, none of the nodes are marked as down during the load. So I
suspect I am missing something :)
We have configured the servers with write_request_timeout_in_ms: 120000 and
the clients with a timeout of 130000, but hints still get stored.

In our case, I would like the cluster to wait for the write to be
persisted on the relevant nodes before returning an OK to the client. But I
don't know which knobs to turn to accomplish this, or if it is even
possible :)

We are running Cassandra 3.0.3, with an 8 GB heap and a replication factor of 3.

Thank you in advance!

Yours sincerely,
  Bo Madsen

Re: When are hints written?

Posted by Bo Finnerup Madsen <bo...@gmail.com>.

Hi Jens,

I suspected that write_request_timeout_in_ms was involved, so I have
already raised that to 120000 on all nodes in the cluster, and to 130000 on
the client. This is much longer than any of our writes take.

I looked at the hints files that were being written, and they were all 42
bytes long... I cannot think of anything in our workload that is a) that
small and b) that consistent in size, so perhaps the hints are being
written for some internal communication messages?
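
One way to check how uniform the hint files really are is to tally them by size. The following is only a minimal sketch: it assumes the hint files sit in a single directory and carry a ".hints" suffix, and the helper name hint_file_sizes is made up for illustration. The actual location is whatever hints_directory in cassandra.yaml points to.

```python
import os
from collections import Counter


def hint_file_sizes(hints_dir, suffix=".hints"):
    """Count hint files in hints_dir, grouped by file size in bytes.

    Both the directory layout and the ".hints" suffix are assumptions;
    adjust them to match the installation being inspected.
    """
    sizes = Counter()
    with os.scandir(hints_dir) as entries:
        for entry in entries:
            # Skip subdirectories and companion files such as checksums.
            if entry.is_file() and entry.name.endswith(suffix):
                sizes[entry.stat().st_size] += 1
    return sizes


def summarize(sizes):
    """Render the tally as text lines, most common size first."""
    return [f"{count} file(s) of {size} bytes"
            for size, count in sizes.most_common()]
```

Calling summarize(hint_file_sizes(path)) on each node would show at a glance whether every file really is the same size, or whether a few larger files are hiding among them.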

Noting your comment about 3.0.3 being "bleeding edge", I removed our use of
materialized views, which allowed me to downgrade our cluster to 2.1.13.
Using this version I see virtually no hint messages (<100 over 3 days of
heavy writing). So this removes my urgent need for understanding, but it
would still be nice to know what happened :)

On Thu, Apr 21, 2016 at 3:58 PM Jens Rantil <je...@tink.se> wrote:

> Hi again Bo,
>
> I assume this is the piece of documentation you are referring to?
> http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_about_hh_c.html?scroll=concept_ds_ifg_jqx_zj__performance
>
> > If a replica node is overloaded or unavailable, and the failure detector
> > has not yet marked it down, then expect most or all writes to that node to
> > fail after the timeout triggered by write_request_timeout_in_ms,
> > <http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__write_request_timeout_in_ms>
> > which defaults to 10 seconds. During that time, Cassandra writes the hint
> > when the timeout is reached.
>
> I'm not an expert on this, but from what I've seen, hints are stored as
> soon as there is _any_ issue writing a mutation (insert/update/delete) to
> a node. Here, "issue" essentially means that the node hasn't acknowledged
> the write to the coordinator within write_request_timeout_in_ms. This
> includes TCP/socket timeouts, connection issues, or the node being down.
> Hints are stored for a maximum time span that defaults to 3 hours.
>
> Cheers,
> Jens

Re: When are hints written?

Posted by Jens Rantil <je...@tink.se>.

Hi again Bo,

I assume this is the piece of documentation you are referring to?
http://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_about_hh_c.html?scroll=concept_ds_ifg_jqx_zj__performance

> If a replica node is overloaded or unavailable, and the failure detector
> has not yet marked it down, then expect most or all writes to that node to
> fail after the timeout triggered by write_request_timeout_in_ms,
> <http://docs.datastax.com/en/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html#reference_ds_qfg_n1r_1k__write_request_timeout_in_ms>
> which defaults to 10 seconds. During that time, Cassandra writes the hint
> when the timeout is reached.

I'm not an expert on this, but from what I've seen, hints are stored as
soon as there is _any_ issue writing a mutation (insert/update/delete) to
a node. Here, "issue" essentially means that the node hasn't acknowledged
the write to the coordinator within write_request_timeout_in_ms. This
includes TCP/socket timeouts, connection issues, or the node being down.
Hints are stored for a maximum time span that defaults to 3 hours.
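
To make that concrete, here is a toy model of the behaviour described above. It is a sketch for illustration only, not Cassandra's actual code: the function coordinate_write, the replica names and the latencies are all made up. It shows how a write can succeed from the client's point of view while a hint is still stored for a replica that is up but too slow to acknowledge in time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class WriteOutcome:
    acks: int           # replicas that acknowledged within the timeout
    hinted: List[str]   # replicas for which a hint would be stored
    client_ok: bool     # whether the client sees the write as successful


def coordinate_write(ack_latency_ms: Dict[str, Optional[int]],
                     required_acks: int,
                     timeout_ms: int) -> WriteOutcome:
    """Toy coordinator: ack_latency_ms maps a replica name to the time it
    takes to acknowledge, or None if no acknowledgement ever arrives."""
    acked = [name for name, ms in ack_latency_ms.items()
             if ms is not None and ms <= timeout_ms]
    # Every replica that did not acknowledge in time gets a hint,
    # whether or not it has been marked down.
    hinted = sorted(name for name in ack_latency_ms if name not in acked)
    return WriteOutcome(acks=len(acked),
                        hinted=hinted,
                        client_ok=len(acked) >= required_acks)


# Hypothetical write with RF=3 where two acknowledgements are required.
# Replica n3 is alive but overloaded and answers after the timeout.
outcome = coordinate_write({"n1": 40, "n2": 95, "n3": 150_000},
                           required_acks=2,
                           timeout_ms=120_000)
print(outcome)
```

In this model the client gets a successful response because two replicas answered in time, yet a hint is recorded for n3 even though no node was ever marked down.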

Cheers,
Jens

On Thu, Apr 21, 2016 at 8:06 AM Bo Finnerup Madsen <bo...@gmail.com>
wrote:

> Hi Jens,
>
> Thank you for the tip!
> ALL would definitely cure our hints issue, but as you note, it is not
> optimal as we are unable to take down nodes without clients failing.
>
> I am most probably overlooking something in the documentation, but I
> cannot see any description of when hints are written other than when a node
> is marked as being down. And since none of our nodes have been marked as
> being down (at least according to the logs), I suspect that there is some
> timeout that governs when hints are written?
>
> Regarding your other post: Yes, 3.0.3 is pretty new. But we are new to
> this cassandra game, and our schema-fu is not strong enough for us to
> create a schema without using materialized views :)

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.

Re: When are hints written?

Posted by Bo Finnerup Madsen <bo...@gmail.com>.

Hi Jens,

Thank you for the tip!
ALL would definitely cure our hints issue, but as you note, it is not
optimal as we are unable to take down nodes without clients failing.

I am most probably overlooking something in the documentation, but I cannot
see any description of when hints are written other than when a node is
marked as being down. And since none of our nodes have been marked as being
down (at least according to the logs), I suspect that there is some timeout
that governs when hints are written?

Regarding your other post: Yes, 3.0.3 is pretty new. But we are new to this
cassandra game, and our schema-fu is not strong enough for us to create a
schema without using materialized views :)

On Wed, Apr 20, 2016 at 5:09 PM Jens Rantil <je...@tink.se> wrote:

> Hi Bo,
>
> > In our case, I would like the cluster to wait for the write to be
> > persisted on the relevant nodes before returning an OK to the client. But
> > I don't know which knobs to turn to accomplish this, or if it is even
> > possible :)
>
> This is what the write consistency level is for. Have a look at
> https://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html.
> Note, however, that if you use ALL, your clients will fail (throw an
> exception, depending on language) as soon as a single replica can't be
> written to. This means you can't do online maintenance of a Cassandra node
> (such as upgrading it) without experiencing write failures.
>
> Cheers,
> Jens

Re: When are hints written?

Posted by Jens Rantil <je...@tink.se>.

Hi Bo,

> In our case, I would like the cluster to wait for the write to be
> persisted on the relevant nodes before returning an OK to the client. But I
> don't know which knobs to turn to accomplish this, or if it is even
> possible :)

This is what the write consistency level is for. Have a look at
https://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html.
Note, however, that if you use ALL, your clients will fail (throw an
exception, depending on language) as soon as a single replica can't be
written to. This means you can't do online maintenance of a Cassandra node
(such as upgrading it) without experiencing write failures.
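
The trade-off can be put into numbers. This is a small sketch, assuming a single datacenter and covering only a few consistency levels; the helper names required_acks and tolerated_down are made up for illustration. It shows how many replica acknowledgements each level needs and how many replicas can be unavailable before writes start failing.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Replica acknowledgements a write needs at the given consistency level.

    Only a subset of levels is modelled, and a single datacenter is assumed.
    """
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "TWO":
        return 2
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"consistency level not modelled here: {level}")


def tolerated_down(level: str, replication_factor: int) -> int:
    """Replicas that may be unavailable while writes still succeed."""
    return replication_factor - required_acks(level, replication_factor)


# Replication factor 3, as in the cluster discussed in this thread.
for level in ("ONE", "QUORUM", "ALL"):
    print(level, required_acks(level, 3), tolerated_down(level, 3))
```

With a replication factor of 3, QUORUM waits for two replicas and still tolerates one unavailable replica, whereas ALL waits for all three and tolerates none, which is why node maintenance causes write failures at ALL.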

Cheers,
Jens


Re: When are hints written?

Posted by Jens Rantil <je...@tink.se>.

Also, 3.0.3 is a very new Cassandra version. IMHO, you might want to
consider not running a bleeding-edge release in production. See
https://www.eventbrite.com/engineering/what-version-of-cassandra-should-i-run/
and http://stackoverflow.com/q/25155916.

Cheers,
Jens


Re: When are hints written?

Posted by Vinu Thomas <th...@hotmail.com>.

How do you unsubscribe?
