Posted to user@cassandra.apache.org by Jens Rantil <je...@tink.se> on 2015/01/05 11:52:14 UTC

Implications of ramping up max_hint_window_in_ms

Hi,


Since repair is a slow and daunting process*, I am considering increasing max_hint_window_in_ms from its default value of one (1) hour to something like 24-48 hours. This will give me and my team more time to fix the underlying problem of a node. I understand that
 - repair is the only way to avoid hardware failure/bit rot scenarios. I will still be running repair on a weekly basis.
 - disk usage obviously will increase before data has been handed off. Disk usage shouldn’t be an issue in this case.
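
For reference, the setting is expressed in milliseconds, so a 24-48 hour window has to be converted before it goes into cassandra.yaml. A minimal sketch of that arithmetic follows; the helper name hours_to_ms is made up for illustration and is not part of Cassandra.

```python
# Hypothetical helper (not a Cassandra API): convert a hint window
# given in hours into the millisecond value the setting expects.
def hours_to_ms(hours: float) -> int:
    return int(hours * 60 * 60 * 1000)


if __name__ == "__main__":
    # Print candidate values in the form they would take in cassandra.yaml.
    for hours in (24, 48):
        print(f"max_hint_window_in_ms: {hours_to_ms(hours)}  # {hours} hours")
```

The printed lines are only candidate values for the setting; whether a window that large is wise is the question of this thread.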


Are there any other implications of making this change that I haven’t thought of?


* I know incremental repair is coming up, but I don’t consider it stable enough.


Thanks,
Jens

———
Jens Rantil
Backend engineer
Tink AB

Email: jens.rantil@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se


Re: Implications of ramping up max_hint_window_in_ms

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Jan 6, 2015 at 11:14 AM, Ryan Svihla <rs...@foundev.pro> wrote:

> woops wrong thread..ignore that :) Robert is correct in this regard by and
> large even though I disagree with the tradeoff, as my experience has shown
> me, for a lot of use cases it's not a happy tradeoff, YMMV and there are
> some that do exist (low write throughput).
>

FWIW, I agree that there is meaningful risk, mostly related, as you say,
to hints being stored in SSTables (until 3.0) and all that implies.
That's why my original statement included a disclaimer that it used to be
very bad and is now just less-bad. :D

=Rob

Re: Implications of ramping up max_hint_window_in_ms

Posted by Ryan Svihla <rs...@foundev.pro>.
Whoops, wrong thread, ignore that :) Robert is by and large correct here,
even though I disagree with the tradeoff. In my experience it's not a happy
tradeoff for a lot of use cases. YMMV, and there are some cases where it
does work (low write throughput).

On Tue, Jan 6, 2015 at 12:58 PM, Ryan Svihla <rs...@foundev.pro> wrote:

> as long as they know how to handle node recovery and don't inflict return
> data back from the dead that was deleted.
>
> On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla <rs...@foundev.pro> wrote:
>>
>>> In general today, large amounts of hints still pretty much makes a node
>>> angry (just no longer nearly as nasty as it was before), unless you have a
>>> really low throughput, you're probably not going to gain much in practice
>>> by raising the hints window today.
>>>
>>
>> It gains people with not-insane write workload who don't mind eventual
>> consistency more time to respond to outages?
>>
>> =Rob
>>
>>
>
>
> --
>
> Thanks,
> Ryan Svihla
>
>


-- 

Thanks,
Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

Posted by Ryan Svihla <rs...@foundev.pro>.
As long as they know how to handle node recovery and don't bring deleted
data back from the dead.

On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla <rs...@foundev.pro> wrote:
>
>> In general today, large amounts of hints still pretty much makes a node
>> angry (just no longer nearly as nasty as it was before), unless you have a
>> really low throughput, you're probably not going to gain much in practice
>> by raising the hints window today.
>>
>
> It gains people with not-insane write workload who don't mind eventual
> consistency more time to respond to outages?
>
> =Rob
>
>


-- 

Thanks,
Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla <rs...@foundev.pro> wrote:

> In general today, large amounts of hints still pretty much makes a node
> angry (just no longer nearly as nasty as it was before), unless you have a
> really low throughput, you're probably not going to gain much in practice
> by raising the hints window today.
>

It gains people with not-insane write workload who don't mind eventual
consistency more time to respond to outages?

=Rob

Re: Implications of ramping up max_hint_window_in_ms

Posted by Ryan Svihla <rs...@foundev.pro>.
In general, large amounts of hints still pretty much make a node angry today
(just not nearly as badly as before). Unless you have really low write
throughput, you're probably not going to gain much in practice by raising
the hints window.

Later on, when we get file-system-based hints in 3.0, I think your approach
will work better. Today I'm concerned that in practice larger hint windows
won't buy you a lot. See the following for details:
http://www.datastax.com/dev/blog/whats-coming-to-cassandra-in-3-0-improved-hint-storage-and-delivery

On Tue, Jan 6, 2015 at 1:47 AM, Jens Rantil <je...@tink.se> wrote:

> Thanks for input, Rob. Just making sure, is "older version" the same as
> "less than version 2"?
>
>
>
> On Mon, Jan 5, 2015 at 8:13 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Mon, Jan 5, 2015 at 2:52 AM, Jens Rantil <je...@tink.se> wrote:
>>
>>
>>> Since repair is a slow and daunting process*, I am considering
>>> increasing max_hint_window_in_ms from its default value of one (1) hour to
>>> something like 24-48 hours.
>>> ...
>>> Are there any other implications of making this change that I haven’t
>>> thought of?
>>>
>>
>> Not really, though 24-48 hours of hints could be an awful lot of hints. I
>> personally run with at least a 6 hour max_h_w_i_m.
>>
>> In older versions of Cassandra, 24-48 hours of hints could hose your node
>> via ineffective constant compaction.
>>
>> =Rob
>>
>
>


-- 

Thanks,
Ryan Svihla

Re: Implications of ramping up max_hint_window_in_ms

Posted by Jens Rantil <je...@tink.se>.
Thanks for the input, Rob. Just making sure, is "older version" the same as "less than version 2"?

On Mon, Jan 5, 2015 at 8:13 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Mon, Jan 5, 2015 at 2:52 AM, Jens Rantil <je...@tink.se> wrote:
>> Since repair is a slow and daunting process*, I am considering increasing
>> max_hint_window_in_ms from its default value of one (1) hour to something
>> like 24-48 hours.
>> ...
>> Are there any other implications of making this change that I haven’t
>> thought of?
>>
> Not really, though 24-48 hours of hints could be an awful lot of hints. I
> personally run with at least a 6 hour max_h_w_i_m.
> In older versions of Cassandra, 24-48 hours of hints could hose your node
> via ineffective constant compaction.
> =Rob

Re: Implications of ramping up max_hint_window_in_ms

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Jan 5, 2015 at 2:52 AM, Jens Rantil <je...@tink.se> wrote:


> Since repair is a slow and daunting process*, I am considering increasing
> max_hint_window_in_ms from its default value of one (1) hour to something
> like 24-48 hours.
> ...
> Are there any other implications of making this change that I haven’t
> thought of?
>

Not really, though 24-48 hours of hints could be an awful lot of hints. I
personally run with at least a 6 hour max_h_w_i_m.

In older versions of Cassandra, 24-48 hours of hints could hose your node
via ineffective constant compaction.
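
To get a feel for what "an awful lot of hints" might mean, here is a rough back-of-the-envelope sketch. It assumes every write destined for the down node is stored as a hint for the whole window and ignores per-hint overhead, compression, and TTLs; the function name and the example numbers (500 writes/s, 1 KiB per mutation) are invented for illustration, not measurements.

```python
# Rough estimate only: bytes of hints that could pile up for ONE down node.
# Assumes every write aimed at that node is hinted for the full window and
# ignores storage overhead, compression, and hint expiry.
def estimated_hint_bytes(writes_per_sec: float,
                         avg_mutation_bytes: int,
                         window_hours: float) -> int:
    window_seconds = window_hours * 3600
    return int(writes_per_sec * avg_mutation_bytes * window_seconds)


if __name__ == "__main__":
    # Made-up workload: 500 writes/s to the down node, 1 KiB per mutation.
    for hours in (6, 24, 48):
        gib = estimated_hint_bytes(500, 1024, hours) / 2**30
        print(f"{hours:>2} h window: roughly {gib:.1f} GiB of hints")
```

Plugging in your own write rate and mutation size gives a first idea of whether a longer window is plausible for your workload before you try it on a real cluster.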

=Rob