Posted to user@cassandra.apache.org by Nuno Cervaens - Hoist Group - Portugal <nu...@hoistgroup.com> on 2018/04/20 15:42:56 UTC
cassandra repair takes ages
Hello,
I have a 3-node cluster with RF 2, using STCS. I use SSDs for commitlogs and HDDs for data. The Apache Cassandra version is 3.11.2.
I basically have a huge keyspace ('newts', from OpenNMS) and a big keyspace ('opspanel'). Here's a summary of the 'du' output for one node (which is more or less the same on each node):
51G ./data/opspanel
776G ./data/newts/samples-00ae9420ea0711e5a39bbd7839a19930
776G ./data/newts
My issue is that running a 'nodetool repair -pr' takes a day and a half per node, and as I want to store daily snapshots (for the past 7 days), I don't see how I can do this, as repairs take too long.
For example, I see huge compactions and validations that take many hours (compactionstats taken at different times):
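As a back-of-the-envelope check, sequential primary-range repairs have to complete a full cycle within gc_grace_seconds, or deleted data can resurrect. A minimal sketch of that arithmetic, assuming Cassandra's default gc_grace_seconds of 10 days and the 1.5-day-per-node duration observed above:

```python
# Sequential 'repair -pr' must cycle through all nodes within gc_grace_seconds.
nodes = 3
days_per_node = 1.5    # observed duration of 'nodetool repair -pr' on this cluster
gc_grace_days = 10     # Cassandra default gc_grace_seconds (864000 s), in days

cycle_days = nodes * days_per_node
print(f"full repair cycle: {cycle_days} days")            # 4.5 days
print(f"fits in gc_grace? {cycle_days < gc_grace_days}")  # True, but little headroom
```

So a full cycle still fits inside the default grace window, but any further growth or a failed run quickly eats the margin.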
id compaction type keyspace table completed total unit progress
7125eb20-446b-11e8-a57d-f36e88375e31 Compaction newts samples 294177987449 835153786347 bytes 35,22%
id compaction type keyspace table completed total unit progress
6aa5ce51-4425-11e8-a7c1-572dede7e4d6 Anticompaction after repair newts samples 581839334815 599408876344 bytes 97,07%
id compaction type keyspace table completed total unit progress
69976700-43e2-11e8-a7c1-572dede7e4d6 Validation newts samples 63249761990 826302170493 bytes 7,65%
69973ff0-43e2-11e8-a7c1-572dede7e4d6 Validation newts samples 102513762816 826302170600 bytes 12,41%
Is there something I can do to improve the situation?
Also, is an incremental repair (apparently nodetool's default) safe? I see in the DataStax documentation that incremental repair should not be used, only full repair. Can you please clarify?
Thanks for the feedback.
Nuno
Re: cassandra repair takes ages
Posted by Nuno Cervaens - Hoist Group - Portugal <nu...@hoistgroup.com>.
Hi Carlos,
OK, thanks for the feedback and the URL; it's pretty clear now.
cheers,
nuno
On Sun, 2018-04-22 at 16:13 +0100, Carlos Rolo wrote:
> [...]
Re: cassandra repair takes ages
Posted by Carlos Rolo <ro...@pythian.com>.
Hello,
I just stated that if you use QUORUM you are in fact using ALL; since you're running ONE, this is a non-issue.
Regarding incremental repairs you can read here:
http://thelastpickle.com/blog/2017/12/14/should-you-use-incremental-repair.html
You can't run repair -pr simultaneously. You can try a tool like Reaper to better manage and schedule repairs, but I doubt it will speed things up a lot.
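One way to keep the runs strictly sequential without a scheduler is a trivial wrapper script. A sketch, with placeholder hostnames, and with echo left in so it runs as a dry run (the -full flag reflects the full-repair advice from the linked post; remove echo to actually execute):

```shell
#!/bin/sh
# Run 'nodetool repair -pr' on one node at a time, never in parallel.
# cassandra-1..3 are placeholder hostnames; 'echo' makes this a dry run.
for host in cassandra-1 cassandra-2 cassandra-3; do
    echo ssh "$host" nodetool repair -full -pr
done
```

Reaper does essentially this with retries, segment tracking, and scheduling on top, which is why it manages repairs better without making any single repair faster.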
Regards,
Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
Pythian - Love your data
rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com
On Sun, Apr 22, 2018 at 11:39 AM, Nuno Cervaens - Hoist Group - Portugal <nuno.cervaens@hoistgroup.com> wrote:
> [...]
Re: cassandra repair takes ages
Posted by Nuno Cervaens - Hoist Group - Portugal <nu...@hoistgroup.com>.
Hi Carlos,
Thanks for the reply.
Isn't the consistency level defined per session? All my sessions, whether for reads or writes, default to ONE.
Moving to SSD is for sure an obvious improvement, but not possible at the moment.
My goal is really to spend as little time as possible running a repair across all the nodes.
Are there any more downsides to running nodetool repair -pr simultaneously on each node, besides the CPU and memory overload?
Also, can someone clarify the safety of an incremental repair?
thanks,
nuno
________________________________
From: Carlos Rolo <ro...@pythian.com>
Sent: Friday, April 20, 2018 4:55:21 PM
To: user@cassandra.apache.org
Subject: Re: cassandra repair takes ages
[...]
Re: cassandra repair takes ages
Posted by Carlos Rolo <ro...@pythian.com>.
Changing the data drives to SSD would help speed up the repairs.
Also, don't run 3 nodes with RF 2. That makes QUORUM = ALL.
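The arithmetic behind that remark can be spelled out. A minimal sketch, using the standard formula quorum = floor(RF/2) + 1:

```python
# Quorum size for a given replication factor: floor(RF/2) + 1.
def quorum(rf: int) -> int:
    return rf // 2 + 1

# With RF=2, quorum equals RF, so QUORUM behaves like ALL:
print(quorum(2))  # 2 -> every replica must answer; no node can be down
print(quorum(3))  # 2 -> one replica may be down and quorum still succeeds
```

That is why RF=3 is the usual minimum when QUORUM availability matters.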
Regards,
Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
Pythian - Love your data
rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com
On Fri, Apr 20, 2018 at 4:42 PM, Nuno Cervaens - Hoist Group - Portugal <nuno.cervaens@hoistgroup.com> wrote:
> [...]