Posted to user@cassandra.apache.org by Martin Xue <ma...@gmail.com> on 2019/08/04 00:03:26 UTC

What really happened during repair?

Hi Cassandra community,

I am using Cassandra 3.0.14: one cluster, with nodes a, b, c in DC1 and
nodes d, e, f in DC2.

keyspace_m is 1 TB.

When I run a full repair with -pr on keyspace_m on node a, what I noticed is:
1. The repair process runs on node a
2. Anticompaction after repair runs on other nodes, at least nodes b, d, e, f

I want to know:
1. Why are there anticompactions running after repair?
2. Why does it need to run on other nodes? (I only ran primary range repair
on node a.)
3. What is the purpose of anticompaction after repair?
4. Can I disable anticompaction? If so, will it cause any damage? (It takes
more than 2 days to run on the 1 TB keyspace_m, fills up the disk quickly,
and is too time- and resource-consuming.)

Any suggestions would be appreciated.

Thanks
Regards
Martin

Re: What really happened during repair?

Posted by Martin Xue <ma...@gmail.com>.
Hi Alex,

Thanks for pointing that out. Yes, I am running version 3.0.14. I ran
"nodetool repair -pr --full 1_TB_keyspace", and was just trying to understand
more about what happens behind the scenes.

I am still puzzled by the questions above.

Any suggestions would be appreciated.

Thanks
Regards
Martin



Re: What really happened during repair?

Posted by Alexander Dejanovski <al...@thelastpickle.com>.
Hi Jeff,

Anticompaction only runs before repair in the upcoming 4.0.
In all other versions of Cassandra, it runs at the end of repair sessions.
My understanding from other messages Martin sent to the ML was that he was
already running full repair, not incremental, which before 4.0 also performs
anticompaction (unless you use subrange repair).

Cheers,
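Subrange repair, mentioned above as the way to avoid anticompaction before
4.0, repairs an explicit token range via -st/-et instead of -pr. As a rough
illustration (not a production tool), the sketch below splits a primary range
into equal subranges and prints one repair command per subrange; the token
values are made up, and real range boundaries should come from nodetool ring:

```python
# Illustrative sketch: split a node's primary token range into subranges
# and emit the corresponding subrange repair commands. The token values
# and keyspace name are hypothetical; real boundaries come from
# `nodetool ring` (Murmur3 tokens span -2**63 .. 2**63 - 1).

def subrange_commands(start_token, end_token, keyspace, steps):
    """Split (start_token, end_token] into `steps` equal subranges and
    build one `nodetool repair -st/-et` command per subrange."""
    width = (end_token - start_token) // steps
    cmds = []
    for i in range(steps):
        st = start_token + i * width
        # Last subrange absorbs any rounding remainder.
        et = end_token if i == steps - 1 else st + width
        cmds.append(f"nodetool repair -full -st {st} -et {et} {keyspace}")
    return cmds

for cmd in subrange_commands(-9223372036854775808, -4611686018427387904,
                             "keyspace_m", 4):
    print(cmd)
```

Running many small subranges keeps each Merkle tree and streaming session
small; this is essentially the scheduling approach used by repair tools such
as Reaper.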




Re: What really happened during repair?

Posted by Martin Xue <ma...@gmail.com>.
Hi Jeff,

Thanks for your reply.

I am using version 3.0.14. I did run nodetool repair -pr --full
1_TB_keyspace.

I noticed that the 'anti-compaction after repair' ran after I ran the above
command.

I am still puzzled by the questions above.

Thanks
Regards
Martin


Re: What really happened during repair?

Posted by Jeff Jirsa <jj...@gmail.com>.
> On Aug 3, 2019, at 5:03 PM, Martin Xue <ma...@gmail.com> wrote:
> 
> Hi Cassandra community,
> 
> I am using Cassandra 3.0.14: one cluster, with nodes a, b, c in DC1 and nodes d, e, f in DC2.
> 
> keyspace_m is 1 TB.
> 
> When I run a full repair with -pr on keyspace_m on node a, what I noticed is:
> 1. The repair process runs on node a
> 2. Anticompaction after repair runs on other nodes, at least nodes b, d, e, f
> 
> I want to know:
> 1. Why are there anticompactions running after repair?


They should run before repair: they split the data you're going to repair from the data you're not going to repair.

If they're running after, either there's another repair command running on adjacent nodes, or you're repairing multiple keyspaces and lost track.

> 2. Why does it need to run on other nodes? (I only ran primary range repair on node a.)

Every host involved in the repair will anticompact to split the data in the range you're repairing from other data. That means RF hosts will run anticompaction for each range you repair.
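As a conceptual sketch of the split Jeff describes (not Cassandra's actual
implementation), anticompaction can be pictured as rewriting an SSTable into
two outputs, partitioned on the just-repaired token range; the tokens and
rows below are invented for illustration:

```python
# Conceptual sketch (not Cassandra's real code): anticompaction rewrites
# an SSTable into two sets, one holding data inside the just-repaired
# token range and one holding everything else.

def anticompact(rows, repaired_range):
    """Split rows (token -> value) into (repaired, unrepaired) by
    whether each token falls inside the half-open repaired range."""
    start, end = repaired_range
    repaired, unrepaired = {}, {}
    for token, value in rows.items():
        target = repaired if start < token <= end else unrepaired
        target[token] = value
    return repaired, unrepaired

sstable = {-100: "a", 5: "b", 42: "c", 900: "d"}
rep, unrep = anticompact(sstable, (0, 500))
print(rep)    # rows inside the repaired range: tokens 5 and 42
print(unrep)  # rows left marked unrepaired: tokens -100 and 900
```

This rewrite is why every replica of the repaired range does extra disk I/O,
even though the repair command was issued on only one node.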
> 3. What is the purpose of anticompaction after repair?

Answered above, but a reminder: it runs before repair.

> 4. Can I disable anticompaction? If so, will it cause any damage? (It takes more than 2 days to run on the 1 TB keyspace_m, fills up the disk quickly, and is too time- and resource-consuming.)


You can run full repair instead of incremental by passing -full.

But the cost of anticompaction should go down after the first successful incremental repair.
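A hedged sketch of why that cost drops: each SSTable carries a repaired
marker, and incremental repair only considers the ones not yet marked (the
field names here are illustrative, not Cassandra's actual SSTable metadata):

```python
# Illustrative sketch of incremental vs full repair scope. In Cassandra,
# SSTables record repair state in their metadata; here a simple boolean
# stands in for that marker.
from dataclasses import dataclass

@dataclass
class SSTable:
    name: str
    repaired: bool  # set after a successful incremental repair

def tables_to_repair(sstables, incremental):
    """Full repair touches every SSTable; incremental repair skips the
    ones already marked repaired, so repeated runs get cheaper."""
    if incremental:
        return [t for t in sstables if not t.repaired]
    return list(sstables)

tables = [SSTable("big-old", True),
          SSTable("new-1", False),
          SSTable("new-2", False)]
print([t.name for t in tables_to_repair(tables, incremental=True)])
print([t.name for t in tables_to_repair(tables, incremental=False)])
```

After the first successful incremental repair marks the bulk of the data,
later runs only validate and anticompact the newly written SSTables.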

> 
> Any suggestions would be appreciated.
> 
> Thanks
> Regards
> Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org