Posted to user@cassandra.apache.org by Sergio <la...@gmail.com> on 2020/01/21 21:12:37 UTC

Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Hi guys!

I just wanted to confirm with you before doing such an operation. I expect
it to increase disk space used, but nothing more than that. I need to run just:

UPDATE COLUMN FAMILY cf with GC_GRACE = 691200; // 8 days

Is it correct?

Thanks,

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Reid Pinchback <rp...@tripadvisor.com>.
I have plans to do so in the near-ish future.  People keep adding things to my to-do list, and I don’t have something on my to-do list yet saying “stop people from adding things to my to-do list”.  😉

Assuming I get to that point, if I answer something and think something I wrote is relevant, I’ll point to it for those who want more details.  In email discussion threads, it is sometimes more helpful to keep things a bit abbreviated.  Not everybody needs the details; many people have more context than I do and can fill in the backstory on their own.

R

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Sergio <la...@gmail.com>.
Thanks for the explanation. This deserves a blog post.

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Reid Pinchback <rp...@tripadvisor.com>.
The reaper logs will say if nodes are being skipped.  The web UI isn’t that good at making it apparent.  You can sometimes tell it is likely happening when you see time gaps between parts of the repair.  That is the case when nodes are skipped because of a timeout, but not only then.  The gaps are mostly controlled by the combined effect of segmentCountPerNode, repairIntensity, and hangingRepairTimeoutMins.  The last of those three is the most obvious influence on timeouts, but the other two have some impact on the work attempted and the size of the time gaps.  However, the C* version also has some bearing, as it influences how hard it is to process the data needed for repairs.

The more subtle aspect of node skipping isn’t the hanging repairs.  When repair of a token range is first attempted, Reaper uses JMX to ask C* if a repair is already underway.  The way it asks is very simplistic, so a positive answer doesn’t mean a repair is underway for that particular token range.  It just means something that looks like a repair is going on.  Basically it just asks “hey, is there a thread with the right magic naming pattern?”  The problem, I think, is that repair activity triggered by reads and writes that hit inconsistent data shows up as these kinds of threads too.  If you have a C*-unfriendly usage pattern (where you write and then very soon read back), then logically you’d expect this to happen quite a lot.

I’m not an expert on the internals since I’m not one of the C* contributors, but having stared at that part of the source quite a bit this year, that’s my take on what can happen.  And if I’m correct, that’s not a thing you can tune for. It is a consequence of C*-unfriendly usage patterns.

Bottom line though is that tuning repairs is only something you do if you find that repairs are taking longer than makes sense to you.  It’s totally separate from the notion that you should be able to run reaper-controlled repairs at least 2x per gc grace seconds.  That’s just a case of making some observations on the arithmetic of time intervals.
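
If you want to check the three Reaper knobs mentioned above on your own install, here is a minimal sketch; the config path below is a placeholder (adjust it for your deployment) and it requires PyYAML:

# Python sketch: print the Reaper settings that control segment sizing and
# hanging-repair timeouts, as discussed above.  The config path is a placeholder.
import yaml

with open("/etc/cassandra-reaper/cassandra-reaper.yaml") as f:
    cfg = yaml.safe_load(f)

for key in ("segmentCountPerNode", "repairIntensity", "hangingRepairTimeoutMins"):
    print(key, "=", cfg.get(key))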


From: Sergio <la...@gmail.com>
Date: Wednesday, January 22, 2020 at 4:08 PM
To: Reid Pinchback <rp...@tripadvisor.com>
Cc: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Message from External Sender
Thank you very much for your extended response.
Should I look in the log some particular message to detect such behavior?
How do you tune it ?

Thanks,

Sergio

On Wed, Jan 22, 2020, 12:59 PM Reid Pinchback <rp...@tripadvisor.com>> wrote:
Kinda. It isn’t that you have to repair twice per se, just that the possibility of running repairs at least twice before GC grace seconds elapse means that clearly there is no chance of a tombstone not being subject to repair at least once before you hit your GC grace seconds.

Imagine a tombstone being created on the very first node that Reaper looked at in a repair cycle, but one second after Reaper completed repair of that particular token range.  Repairs will be complete, but that particular tombstone just missed being part of the effort.

Now your next repair run happens.  What if Reaper doesn’t look at that same node first?  It is easy to have happen, as there is a bunch of logic related to detection of existing repairs or things taking too long.  So the box that was “the first node” in that first repair run, through bad luck gets kicked down to later in the second run.  I’ve seen nodes get skipped multiple times (you can tune to reduce that, but bottom line… it happens).  So, bad luck you’ve got.  Eventually the node does get repaired, and the aging tombstone finally gets removed.  All fine and dandy…

Provided that the second repair run got to that point BEFORE you hit your GC grace seconds.

That’s why you need enough time to run it twice.  Because you need enough time to catch the oldest possible tombstone, even if it is dealt with at the very end of a repair run.  Yes, it sounds like a bit of a degenerate case, but if you are writing a lot of data, the probability of not having the degenerate cases become real cases becomes vanishingly small.

R


From: Sergio <la...@gmail.com>>
Date: Wednesday, January 22, 2020 at 1:41 PM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>, Reid Pinchback <rp...@tripadvisor.com>>
Subject: Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Message from External Sender
I was wondering if I should always complete 2 repairs cycles with reaper even if one repair cycle finishes in 7 hours.

Currently, I have around 200GB in column family data size to be repaired and I was scheduling once repair a week and I was not having too much stress on my 8 nodes cluster with i3xlarge nodes.

Thanks,

Sergio

Il giorno mer 22 gen 2020 alle ore 08:28 Sergio <la...@gmail.com>> ha scritto:
Thank you very much! Yes I am using reaper!

Best,

Sergio

On Wed, Jan 22, 2020, 8:00 AM Reid Pinchback <rp...@tripadvisor.com>> wrote:
Sergio, if you’re looking for a new frequency for your repairs because of the change, if you are using reaper, then I’d go for repair_freq <= gc_grace / 2.

Just serendipity with a conversation I was having at work this morning.  When you actually watch the reaper logs then you can see situations where unlucky timing with skipped nodes can make the time to remove a tombstone be up to 2 x repair_run_time.

If you aren’t using reaper, your mileage will vary, particularly if your repairs are consistent in the ordering across nodes.  Reaper can be moderately non-deterministic hence the need to be sure you can complete at least two repair runs.

R

From: Sergio <la...@gmail.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Tuesday, January 21, 2020 at 7:13 PM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Message from External Sender
Thank you very much for your response.
The considerations mentioned are the ones that I was expecting.
I believe that I am good to go.
I just wanted to make sure that there was no need to run any other extra command beside that one.

Best,

Sergio

On Tue, Jan 21, 2020, 3:55 PM Jeff Jirsa <jj...@gmail.com>> wrote:
Note that if you're actually running repairs within 5 days, and you adjust this to 8, you may stream a bunch of tombstones across in that 5-8 day window, which can increase disk usage / compaction (because as you pass 5 days, one replica may gc away the tombstones, the others may not because the tombstones shadow data, so you'll re-stream the tombstone to the other replicas)

On Tue, Jan 21, 2020 at 3:28 PM Elliott Sims <el...@backblaze.com>> wrote:
In addition to extra space, queries can potentially be more expensive because more dead rows and tombstones will need to be scanned.  How much of a difference this makes will depend drastically on the schema and access pattern, but I wouldn't expect going from 5 days to 8 to be very noticeable.

On Tue, Jan 21, 2020 at 2:14 PM Sergio <la...@gmail.com>> wrote:
https://stackoverflow.com/a/22030790<https://urldefense.proofpoint.com/v2/url?u=https-3A__stackoverflow.com_a_22030790&d=DwMFaQ&c=9Hv6XPedRSA-5PSECC38X80c1h60_XWA4z1k_R1pROA&r=OIgB3poYhzp3_A7WgD7iBCnsJaYmspOa2okNpf6uqWc&m=qt1NAYTks84VVQ4WGXWkK6pw85m3FcuUjPRJPdIHMdw&s=aEgz5F5HRxPT3w4hpfNXQRhcchwRjrpf7KB3QyywO_Q&e=>

For CQLSH

alter table <table_name> with GC_GRACE_SECONDS = <seconds>;



Il giorno mar 21 gen 2020 alle ore 13:12 Sergio <la...@gmail.com>> ha scritto:
Hi guys!

I just wanted to confirm with you before doing such an operation. I expect to increase the space but nothing more than this. I  need to perform just :

UPDATE COLUMN FAMILY cf with GC_GRACE = 691,200; //8 days
Is it correct?

Thanks,

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Sergio <la...@gmail.com>.
Thank you very much for your extended response.
Should I look for a particular message in the log to detect such behavior?
How do you tune it?

Thanks,

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Reid Pinchback <rp...@tripadvisor.com>.
Kinda. It isn’t that you have to repair twice per se; it’s that being able to run repairs at least twice before GC grace seconds elapse means every tombstone is guaranteed to be covered by at least one repair before you hit your GC grace seconds.

Imagine a tombstone being created on the very first node that Reaper looked at in a repair cycle, but one second after Reaper completed repair of that particular token range.  Repairs will be complete, but that particular tombstone just missed being part of the effort.

Now your next repair run happens.  What if Reaper doesn’t look at that same node first?  It is easy to have happen, as there is a bunch of logic related to detection of existing repairs or things taking too long.  So the box that was “the first node” in that first repair run, through bad luck gets kicked down to later in the second run.  I’ve seen nodes get skipped multiple times (you can tune to reduce that, but bottom line… it happens).  So, bad luck you’ve got.  Eventually the node does get repaired, and the aging tombstone finally gets removed.  All fine and dandy…

Provided that the second repair run got to that point BEFORE you hit your GC grace seconds.

That’s why you need enough time to run it twice.  Because you need enough time to catch the oldest possible tombstone, even if it is dealt with at the very end of a repair run.  Yes, it sounds like a bit of a degenerate case, but if you are writing a lot of data, the probability of the degenerate case never becoming a real case is vanishingly small.

R
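
As a rough sketch of the worst case described above (the repair schedule and run duration below are made-up placeholder numbers, not values from this thread):

# Python sketch: worst-case wait for a tombstone under a periodic Reaper schedule.
# A tombstone written just after its token range finished in run N is only covered
# again when run N+1 reaches that range, possibly at the very end of that run.
gc_grace_seconds = 8 * 24 * 60 * 60      # 691200, the 8-day value discussed here
repair_interval = 4 * 24 * 60 * 60       # placeholder: start a repair run every 4 days
repair_run_duration = 3 * 24 * 60 * 60   # placeholder: a slow run that takes 3 days

# Longest a tombstone can wait before a completed repair has covered it.
worst_case_wait = repair_interval + repair_run_duration

# If this is False, a tombstone could reach gc_grace_seconds before being repaired
# everywhere, which is the risk being discussed in this thread.
print(worst_case_wait <= gc_grace_seconds)   # True for these placeholder numbers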


Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Sergio <la...@gmail.com>.
I was wondering if I should always complete 2 repair cycles with reaper
even if one repair cycle finishes in 7 hours.

Currently, I have around 200GB of column family data to be repaired. I was
scheduling one repair a week, and I was not seeing too much stress on my
8-node cluster of i3.xlarge nodes.

Thanks,

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Sergio <la...@gmail.com>.
Thank you very much! Yes I am using reaper!

Best,

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Reid Pinchback <rp...@tripadvisor.com>.
Sergio, if you’re looking for a new frequency for your repairs because of the change, and you are using reaper, then I’d go for repair_freq <= gc_grace / 2.

Just serendipity with a conversation I was having at work this morning.  When you actually watch the reaper logs then you can see situations where unlucky timing with skipped nodes can make the time to remove a tombstone be up to 2 x repair_run_time.

If you aren’t using reaper, your mileage will vary, particularly if your repairs are consistent in the ordering across nodes.  Reaper can be moderately non-deterministic hence the need to be sure you can complete at least two repair runs.

R
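
To put numbers on that rule of thumb, a minimal sketch assuming the 8-day gc_grace_seconds discussed in this thread:

# Python sketch: repair_freq <= gc_grace / 2 with an 8-day gc_grace_seconds.
gc_grace_seconds = 8 * 24 * 60 * 60          # 691200 seconds
max_repair_interval = gc_grace_seconds // 2  # 345600 seconds
print(max_repair_interval / 86400)           # 4.0 -> run a full repair at least every 4 days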


Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Sergio <la...@gmail.com>.
Thank you very much for your response.
The considerations mentioned are the ones that I was expecting.
I believe that I am good to go.
I just wanted to make sure that there was no need to run any extra
command besides that one.

Best,

Sergio

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Jeff Jirsa <jj...@gmail.com>.
Note that if you're actually running repairs within 5 days, and you adjust
this to 8, you may stream a bunch of tombstones across in that 5-8 day
window, which can increase disk usage / compaction (because as you pass 5
days, one replica may gc away the tombstones, the others may not because
the tombstones shadow data, so you'll re-stream the tombstone to the other
replicas)

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Elliott Sims <el...@backblaze.com>.
In addition to extra space, queries can potentially be more expensive
because more dead rows and tombstones will need to be scanned.  How much of
a difference this makes will depend drastically on the schema and access
pattern, but I wouldn't expect going from 5 days to 8 to be very noticeable.

Re: Is there any concern about increasing gc_grace_seconds from 5 days to 8 days?

Posted by Sergio <la...@gmail.com>.
https://stackoverflow.com/a/22030790


For cqlsh:

ALTER TABLE <table_name> WITH gc_grace_seconds = <seconds>;
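
For completeness, a minimal sketch of applying and verifying the change with the DataStax Python driver; the contact point, keyspace, and table names below are placeholders, not values from this thread:

# Python sketch: set gc_grace_seconds to 8 days and read it back from the schema.
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])   # placeholder contact point
session = cluster.connect()

eight_days = 8 * 24 * 60 * 60      # 691200 seconds

# Apply the new gc_grace_seconds to a placeholder table.
session.execute(
    "ALTER TABLE my_keyspace.my_table WITH gc_grace_seconds = %d" % eight_days
)

# Verify the setting against the schema tables (Cassandra 3.0+).
row = session.execute(
    "SELECT gc_grace_seconds FROM system_schema.tables "
    "WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_table'"
).one()
print(row.gc_grace_seconds)        # expect 691200

cluster.shutdown()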


