Posted to user@cassandra.apache.org by Andras Szerdahelyi <an...@ignitionone.com> on 2012/12/05 13:32:44 UTC

entire range of node out of sync -- out of the blue

hi list,

AntiEntropyService started syncing ranges of entire nodes ( ?! ) across my data centers and i'd like to understand why.

I see log lines like this on all my nodes in my two ( east/west ) data centres...

INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666) [repair #7c7665c0-3eab-11e2-0000-dae6667065ff] new session: will sync /X.X.1.113, /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )

( this is around 80-100 GB of data for a single node. )

- i did not observe any network failures or nodes falling off the ring
- good distribution of data ( load is equal on all nodes )
- hinted handoff is on
- read repair chance is 0.1 on the CF
- 2 replicas in each data centre ( which is also the number of nodes in each ) with NetworkTopologyStrategy
- repair -pr is scheduled to run off-peak hours, daily ( see the cron sketch right after this list )
- leveled compaction with sstable max size 256mb ( i have found this to trigger compaction in acceptable intervals while still keeping the sstable count down )
- i am on 1.1.6
- java heap 10G
- max memtables 2G
- 1G row cache
- 256M key cache
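
( for reference, the schedule mentioned above is just a cron entry along these lines -- the keyspace name and log path here are made up, adjust to your setup: )

# hypothetical crontab entry: primary-range repair at 03:00 local time, daily
0 3 * * * nodetool -h localhost repair -pr MyKeyspace >> /var/log/cassandra/repair-pr.log 2>&1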

my nodes'  ranges are:

DC west
0
85070591730234615865843651857942052864

DC east
100
85070591730234615865843651857942052964
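
( for context: assuming RandomPartitioner, these are just the ring split in half for DC west, with DC east using the same tokens offset by +100 -- quick check with python 2: )

python -c 'print 2**127 / 2'          # 85070591730234615865843651857942052864 -> 2nd node, DC west
python -c 'print 2**127 / 2 + 100'    # 85070591730234615865843651857942052964 -> 2nd node, DC east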

symptoms are:
- logs show sstables being streamed over to other nodes
- 140k files in data dir of CF on all nodes
- cfstats reports 20k sstables, up from 6 on all nodes
- compaction continuously running with no results whatsoever ( number of sstables growing )

i tried the following:
- offline scrub ( has gone OOM, i noticed the script in the debian package specifies 256MB heap? )
- online scrub ( no effect )
- repair ( no effect )
- cleanup ( no effect )

my questions are:
- how do i stop repair before i run out of storage? ( can't let this finish )
- how do i clean up my sstables ( grew from 6k to 20k since this started, while i shut writes off completely )

thanks,
Andras

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84







Re: entire range of node out of sync -- out of the blue

Posted by "B. Todd Burruss" <bt...@gmail.com>.
i am on DSE, and i am referring to the json manifest ... but my memory
isn't very good so i could have the name wrong.  we are hitting this bug:
https://issues.apache.org/jira/browse/CASSANDRA-3306





Re: entire range of node out of sync -- out of the blue

Posted by Andras Szerdahelyi <an...@ignitionone.com>.
Solr? Are you on DSE or am i missing something ( huge ) about Cassandra? ( wouldn't be the first time :-)

Or do you mean the json manifest? It's there and it looks ok; in fact it's been corrupted twice due to storage problems and i hit https://issues.apache.org/jira/browse/CASSANDRA-5041
TBH i think this was a repair without -pr

thanks,
Andras

Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84






Re: entire range of node out of sync -- out of the blue

Posted by "B. Todd Burruss" <bt...@gmail.com>.
in your data directory, for each keyspace there is a solr.json.  cassandra
stores the SSTABLEs it knows about in that file when using leveled
compaction.  take a look at it and see if it looks accurate.  if not, this
is a bug with cassandra that we are checking into as well
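
( a quick way to eyeball it -- iirc stock cassandra 1.1 names the manifest after the column family rather than solr.json, and /var/lib/cassandra is only the debian default, so treat the paths below as examples: )

find /var/lib/cassandra/data -name '*.json'     # locate the leveled compaction manifest(s)
python -m json.tool <path-to-manifest>.json     # pretty-print one and sanity check the per-level sstable lists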



Re: entire range of node out of sync -- out of the blue

Posted by aaron morton <aa...@thelastpickle.com>.
The log message matches what I would expect to see for nodetool repair -pr.

Not using -pr means repairing all the ranges the node is a replica for. If you have RF == number of nodes, then it will repair all the data.
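
( concretely, something like the following -- the keyspace name is just an example: )

nodetool -h localhost repair -pr MyKeyspace    # primary range only; has to run on every node to cover the whole ring
nodetool -h localhost repair MyKeyspace        # every range this node replicates; with RF == node count that is all the data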

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com



Re: entire range of node out of sync -- out of the blue

Posted by Andras Szerdahelyi <an...@ignitionone.com>.
Thanks!

i'm also thinking a repair run without -pr could have caused this, maybe?


Andras Szerdahelyi
Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
M: +32 493 05 50 88 | Skype: sandrew84




Re: entire range of node out of sync -- out of the blue

Posted by aaron morton <aa...@thelastpickle.com>.
> - how do i stop repair before i run out of storage? ( can't let this finish )

To stop the validation part of the repair…

nodetool -h localhost stop VALIDATION 


The only way I know to stop streaming is to restart the node, there may be a better way though.
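
( if you do have to kill the streams, with the debian package that would just be the init script, e.g.: )

sudo service cassandra restart    # drops any in-flight streams; the service / init script name may differ on your install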


> INFO [AntiEntropySessions:3] 2012-12-05 02:15:02,301 AntiEntropyService.java (line 666) [repair #7c7665c0-3eab-11e2-0000-dae6667065ff] new session: will sync /X.X.1.113, /X.X.0.71 on range (85070591730234615865843651857942052964,0] for ( .. )
Am assuming this was run on the first node in DC west with -pr as you said.
The log message is saying this is going to repair the primary range for the node. The repair is then actually performed one CF at a time.

You should also see log messages ending with "range(s) out of sync" which will say how out of sync the data is. 
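
( e.g., assuming the default debian log location: )

grep "out of sync" /var/log/cassandra/system.log    # shows how many ranges each repair session found differing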
 
> - how do i clean up my sstables ( grew from 6k to 20k since this started, while i shut writes off completely )
Sounds like repair is streaming a lot of differences. 
If you have the space I would give levelled compaction time to take care of it.
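
( you can watch it working through the backlog with: )

nodetool -h localhost compactionstats    # pending tasks / bytes should trend down once the streaming stops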

Hope that helps.

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com
