
Posted to user@cassandra.apache.org by Roger Warner <rw...@pandora.com> on 2017/07/25 22:49:28 UTC

Data Loss irreparably so

This is a quick informational question. I know that Cassandra can detect failures of nodes and repair them given replication and multiple DCs.

My question is: can Cassandra tell if data was lost after a failure, once the node(s) are “fixed” and have resumed operation?

If so, where would it log or flag it? Or are we just supposed to figure it out?

R

Re: Data Loss irreparably so

Posted by Peng Xiao <25...@qq.com>.

Because of tombstones, we have set GC_GRACE_SECONDS to 6 hours. And for a huge table of 4 TB, repair is hard for us.




------------------ Original Message ------------------
From: "kurt";<ku...@instaclustr.com>;
Sent: Thursday, August 3, 2017, 12:08 PM
To: "User"<us...@cassandra.apache.org>;
Subject: Re: Data Loss irreparably so



You should run repairs every GC_GRACE_SECONDS. If a node is overloaded/goes down, you should run repairs. LOCAL_QUORUM will somewhat maintain consistency within a DC, but certainly doesn't mean you can get away without running repairs. You need to run repairs even if you are using QUORUM or ONE.

Re: Data Loss irreparably so

Posted by kurt greaves <ku...@instaclustr.com>.

You should run repairs every GC_GRACE_SECONDS. If a node is overloaded/goes
down, you should run repairs. LOCAL_QUORUM will somewhat maintain
consistency within a DC, but certainly doesn't mean you can get away
without running repairs. You need to run repairs even if you are using
QUORUM or ONE.
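
As a rough illustration of the rule above, here is a minimal Python sketch that checks whether a repair schedule fits inside gc_grace_seconds. The function name, its parameters, and the weekly schedule are hypothetical and not from this thread. 864000 seconds (10 days) is Cassandra's default gc_grace_seconds, and 6 hours is the value reported elsewhere in this thread. The worst-case gap formula is a simplifying assumption, not an official rule.

```python
from datetime import timedelta

DEFAULT_GC_GRACE_SECONDS = 864000  # Cassandra's default: 10 days


def repair_fits_gc_grace(repair_interval: timedelta,
                         repair_duration: timedelta,
                         gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """Return True if data is repaired at least once per gc_grace window.

    Assumption: the worst-case gap between two completed repairs of the
    same data is the time between repair starts plus the time one repair
    takes to finish.
    """
    worst_case_gap = repair_interval + repair_duration
    return worst_case_gap.total_seconds() <= gc_grace_seconds


if __name__ == "__main__":
    # Hypothetical schedule: weekly repair that takes one day to complete.
    weekly, one_day = timedelta(days=7), timedelta(days=1)
    print(repair_fits_gc_grace(weekly, one_day))            # default 10 days
    print(repair_fits_gc_grace(weekly, one_day, 6 * 3600))  # 6-hour setting
```

With a very short gc_grace_seconds, a check like this makes it obvious how tight the repair window becomes for large tables.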

Re: Data Loss irreparably so

Posted by Peng Xiao <25...@qq.com>.

Hi,
We are also experiencing the same issue. We have 3 DCs (DC1 RF=3, DC2 RF=3, DC3 RF=1). If we use local_quorum, we should not lose any data, right?
If we use local_one, could we lose data? Would we then need to run repair regularly?
Could anyone advise?

Thanks
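
For the three-DC layout in the question above, the quorum arithmetic can be sketched as follows. This is only an illustration: the function names are hypothetical, and the sketch covers the standard quorum formula (floor(RF / 2) + 1) and the usual "reads plus writes must exceed RF" overlap rule, nothing more. It does not replace running repairs.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def local_quorum_sizes(rf_per_dc: dict) -> dict:
    """Replicas that must respond for LOCAL_QUORUM in each data center."""
    return {dc: quorum(rf) for dc, rf in rf_per_dc.items()}


def read_sees_latest_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True if any read replica set must overlap any write replica set."""
    return write_replicas + read_replicas > rf


if __name__ == "__main__":
    # Replication factors as described in the post above.
    layout = {"DC1": 3, "DC2": 3, "DC3": 1}
    print(local_quorum_sizes(layout))
    # LOCAL_QUORUM writes and reads inside an RF=3 data center overlap.
    print(read_sees_latest_write(3, quorum(3), quorum(3)))
    # LOCAL_ONE writes and reads inside the same data center need not overlap.
    print(read_sees_latest_write(3, 1, 1))
```

Note that the overlap only applies within the local data center, and that with RF=1 the quorum is the single replica, so LOCAL_QUORUM in DC3 cannot be satisfied while that one replica is down.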

------------------ Original Message ------------------
From: "Jon Haddad";<jo...@gmail.com>;
Sent: Friday, July 28, 2017, 1:37 AM
To: "user"<us...@cassandra.apache.org>;
Subject: Re: Data Loss irreparably so

We (The Last Pickle) maintain an open source tool called Reaper that helps manage repairs across your clusters.  It’s a lot easier to set up and manage than trying to manage repairs through cron.

http://thelastpickle.com/reaper.html


Re: Data Loss irreparably so

Posted by Jon Haddad <jo...@gmail.com>.

We (The Last Pickle) maintain an open source tool called Reaper that helps manage repairs across your clusters.  It’s a lot easier to set up and manage than trying to manage repairs through cron.

http://thelastpickle.com/reaper.html

> On Jul 27, 2017, at 12:38 AM, Daniel Hölbling-Inzko <da...@bitmovin.com> wrote:
> 
> In that vein, Cassandra supports auto compaction and incremental repair.
> Does this mean I have to set up cron jobs on each node to run nodetool repair, or is this taken care of by Cassandra anyway?
> How often should I run nodetool repair?
> 
> Greetings Daniel

Re: Data Loss irreparably so

Posted by Daniel Hölbling-Inzko <da...@bitmovin.com>.

In that vein, Cassandra supports auto compaction and incremental repair.
Does this mean I have to set up cron jobs on each node to run nodetool
repair, or is this taken care of by Cassandra anyway?
How often should I run nodetool repair?

Greetings Daniel
Jeff Jirsa <jj...@apache.org> wrote on Thu, Jul 27, 2017 at 07:48:

> Sorta concerned by the way you're asking this - Cassandra doesn't "fix"
> failed nodes. It can route requests around a down node, but the "fixing" is
> entirely manual.
>
> If you have a node go down temporarily, and it comes back up (with its
> disk intact), you can see it "repair" data either through active
> (anti-entropy) repair via nodetool repair, or by watching 'nodetool
> netstats' and seeing the read repair counters increase over time (which will
> happen naturally as data is requested and mismatches are detected in the
> data, based on your consistency level).

Re: Data Loss irreparably so

Posted by Jeff Jirsa <jj...@apache.org>.

On 2017-07-25 15:49 (-0700), Roger Warner <rw...@pandora.com> wrote: 
> This is a quick informational question. I know that Cassandra can detect failures of nodes and repair them given replication and multiple DCs.
>
> My question is: can Cassandra tell if data was lost after a failure, once the node(s) are “fixed” and have resumed operation?
> 

Sorta concerned by the way you're asking this - Cassandra doesn't "fix" failed nodes. It can route requests around a down node, but the "fixing" is entirely manual. 

If you have a node go down temporarily, and it comes back up (with its disk intact), you can see it "repair" data either through active (anti-entropy) repair via nodetool repair, or by watching 'nodetool netstats' and seeing the read repair counters increase over time (which will happen naturally as data is requested and mismatches are detected in the data, based on your consistency level).
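
To make the "watch the read repair counters" suggestion concrete, here is a small Python sketch that pulls the read repair counters out of captured 'nodetool netstats' text so two snapshots can be compared. The function names are hypothetical, the sample text and its numbers are made up for illustration, and the sketch assumes the output has a "Read Repair Statistics:" header followed by "Name: count" lines; the exact layout may differ between Cassandra versions.

```python
import re

# Illustrative sample only; the numbers are invented and the real layout
# of 'nodetool netstats' may vary by Cassandra version.
SAMPLE_NETSTATS = """Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 12
Mismatch (Blocking): 3
Mismatch (Background): 1
Pool Name                    Active   Pending      Completed   Dropped
"""


def parse_read_repair_stats(netstats_output: str) -> dict:
    """Collect the 'Name: count' lines that follow the read repair header."""
    stats = {}
    in_section = False
    for line in netstats_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Read Repair Statistics"):
            in_section = True
            continue
        if in_section:
            match = re.match(r"^([A-Za-z ()]+):\s*(\d+)$", stripped)
            if not match:
                break  # first non-counter line ends the section
            stats[match.group(1)] = int(match.group(2))
    return stats


def counters_increased(before: dict, after: dict) -> dict:
    """Return only the counters whose value grew between two snapshots."""
    return {name: after[name] - before.get(name, 0)
            for name in after if after[name] > before.get(name, 0)}


if __name__ == "__main__":
    print(parse_read_repair_stats(SAMPLE_NETSTATS))
```

Comparing a snapshot taken before a node comes back with one taken some time afterwards gives a rough sense of how many mismatches reads have detected since then.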

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org

Re: Data Loss irreparably so

Posted by kurt greaves <ku...@instaclustr.com>.

Cassandra doesn't do any automatic repairing. It can tell if your data is
inconsistent; however, it's really up to you to manage consistency through
repairs and choice of consistency level for queries. If you lose a node,
you have to manually repair the cluster after replacing the node, but
really you should be doing this every GC_GRACE_SECONDS regardless.