Posted to user@cassandra.apache.org by Igor <ig...@4friends.od.ua> on 2012/04/21 20:43:06 UTC

repair strange behavior

Hi

I can't understand the repair behavior in my case. I have a 12-node ring 
(all on 1.0.7):

10.254.237.2    LA    ADS-LA-1     Up    Normal    50.92 GB    0.00%    0
10.254.238.2    TX    TX-24-RACK   Up    Normal    33.29 GB    0.00%    1
10.254.236.2    VA    ADS-VA-1     Up    Normal    50.07 GB    0.00%    2
10.254.93.2     IL    R1           Up    Normal    49.29 GB    0.00%    3
10.253.4.2      AZ    R1           Up    Normal    37.83 GB    0.00%    5
10.254.180.2    GB    GB-1         Up    Normal    42.86 GB    50.00%   85070591730234615865843651857942052863
10.254.191.2    LA    ADS-LA-1     Up    Normal    47.64 GB    0.00%    85070591730234615865843651857942052864
10.254.221.2    TX    TX-24-RACK   Up    Normal    43.42 GB    0.00%    85070591730234615865843651857942052865
10.254.217.2    VA    ADS-VA-1     Up    Normal    38.44 GB    0.00%    85070591730234615865843651857942052866
10.254.94.2     IL    R1           Up    Normal    49.31 GB    0.00%    85070591730234615865843651857942052867
10.253.5.2      AZ    R1           Up    Normal    49.01 GB    0.00%    85070591730234615865843651857942052869
10.254.179.2    GB    GB-1         Up    Normal    27.08 GB    50.00%   170141183460469231731687303715884105727
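
(Side note: as far as I understand, the 'Owns' column above is just each node's raw share of the token space relative to its predecessor, with replication ignored, which is why only the two GB nodes show 50%. A toy check in Python over the tokens above, just arithmetic, not nodetool output:

    # Toy arithmetic only -- not nodetool output. Tokens copied from the ring above.
    RING = 2**127  # RandomPartitioner token space

    tokens = [0, 1, 2, 3, 5,
              85070591730234615865843651857942052863,
              85070591730234615865843651857942052864,
              85070591730234615865843651857942052865,
              85070591730234615865843651857942052866,
              85070591730234615865843651857942052867,
              85070591730234615865843651857942052869,
              170141183460469231731687303715884105727]

    # Each node "owns" the span from its predecessor's token (wrapping around
    # the ring) up to its own token.
    for prev, cur in zip([tokens[-1] - RING] + tokens, tokens):
        print("%-42d owns %.2f%%" % (cur, 100.0 * (cur - prev) / RING))

which reproduces the 0.00% / 50.00% split shown above.)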

I have a single keyspace 'meter' and two column families (one, 'ids', is 
small; the second is bigger). The strange thing happened today when I 
tried to run
"nodetool -h 10.254.180.2 repair -pr meter ids"
twice, one run right after the other. The first repair finished successfully

  INFO 16:33:02,492 [repair #db582370-8bba-11e1-0000-5b777f708bff] ids is fully synced
  INFO 16:33:02,526 [repair #db582370-8bba-11e1-0000-5b777f708bff] session completed successfully

after moving nearly 50 GB of data. I started a second session one hour later:

INFO 17:44:37,842 [repair #aa415d00-8bd9-11e1-0000-5b777f708bff] new session: will sync localhost/10.254.180.2, /10.254.221.2, /10.254.191.2, /10.254.217.2, /10.253.5.2, /10.254.94.2 on range (5,85070591730234615865843651857942052863] for meter.[ids]
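
(That range looks right for -pr: as far as I understand, the primary range is just (predecessor's token, own token], and the predecessor of 10.254.180.2 in the ring above is 10.253.4.2 at token 5. A toy check in Python, again only arithmetic and not anything nodetool prints:

    # Toy check only -- nothing nodetool prints. Tokens copied from the ring above.
    sorted_tokens = sorted([0, 1, 2, 3, 5,
                            85070591730234615865843651857942052863,
                            85070591730234615865843651857942052864,
                            85070591730234615865843651857942052865,
                            85070591730234615865843651857942052866,
                            85070591730234615865843651857942052867,
                            85070591730234615865843651857942052869,
                            170141183460469231731687303715884105727])

    own = 85070591730234615865843651857942052863            # token of 10.254.180.2
    predecessor = sorted_tokens[sorted_tokens.index(own) - 1]  # token 5 (10.253.4.2)

    # With -pr, repair should cover only the primary range (predecessor, own].
    print("expected -pr range: (%d, %d]" % (predecessor, own))
    # -> (5, 85070591730234615865843651857942052863], matching the log line above

so the session itself is scoped as expected.)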

What is strange: when the streams for the second repair start, they have 
the same or an even bigger total volume, while I expected the second run 
to move less data (or even no data at all).

Is it OK? Or should I fix something?

Thanks!


Re: repair strange behavior

Posted by Igor <ig...@4friends.od.ua>.
Hi, Aaron

Just the sum of the total volume of all streams between nodes.

But it seems I understand what happened: after repair my column family 
goes through several minor compactions, and during these compactions it 
creates new tombstones (my CF contains data with TTL, so each minor 
compaction can discover and mark newly expired data). Since these 
tombstones are arranged and created differently on each node (the 
sstables have different sizes and so on, so size-tiered compaction works 
slightly differently), each subsequent repair discovers new ranges to sync.

When I ran a *major* compaction and then ran repair, it went in minutes 
(instead of hours), as far as I understand because after a major 
compaction the tombstones on all nodes are almost the same.

Does that sound reasonable?

I'll try to find the best strategy to minimize repair streams, as I'm 
wary of major compactions for the other, possibly larger, CFs.
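
To make the idea concrete, here is a toy sketch (plain Python, nothing to do with Cassandra's actual validation compaction or Merkle tree code, and all the names and byte layouts in it are made up): two replicas hold the same logical rows, but one has already turned an expired TTL cell into a tombstone during a local compaction, so the bytes hashed per range differ, the digests mismatch, and repair streams the whole range again:

    import hashlib

    # Two replicas holding the same logical row set. Node A has already turned
    # the expired 'k2' cell into a tombstone during a minor compaction; node B
    # still carries the original expired cell.
    node_a = {'k1': b'value1', 'k2': b'TOMBSTONE@t5'}
    node_b = {'k1': b'value1', 'k2': b'value2@t1,ttl=3600'}

    def range_digest(rows):
        # Hash everything in the range, the way a validation pass hashes row data.
        h = hashlib.md5()
        for key in sorted(rows):
            h.update(key.encode())
            h.update(rows[key])
        return h.hexdigest()

    # Digests differ, so the whole range is flagged as out of sync and gets
    # streamed, even though the replicas agree on the logical data.
    print(range_digest(node_a) == range_digest(node_b))  # False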

On 04/23/2012 12:34 PM, aaron morton wrote:
>> What is strange: when the streams for the second repair start, they have 
>> the same or an even bigger total volume,
> What measure are you using?
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22/04/2012, at 10:16 PM, Igor wrote:
>
>> but after repair all nodes should be in sync regardless of whether 
>> new files were compacted or not.
>> Do you suggest major compaction after repair? I'd like to avoid it.


Re: repair strange behavior

Posted by aaron morton <aa...@thelastpickle.com>.
> What is strange: when the streams for the second repair start, they have the same or an even bigger total volume,
What measure are you using?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 22/04/2012, at 10:16 PM, Igor wrote:

> but after repair all nodes should be in sync regardless of whether new files were compacted or not.
> Do you suggest major compaction after repair? I'd like to avoid it.


Re: repair strange behavior

Posted by Igor <ig...@4friends.od.ua>.
but after repair all nodes should be in sync regardless of whether new 
files were compacted or not.
Do you suggest major compaction after repair? I'd like to avoid it.

On 04/22/2012 11:52 AM, Philippe wrote:
>
> Repairs generate new files that then need to be compacted.
> Maybe that's where the temporary extra volume comes from?


Re: repair strange behavior

Posted by Philippe <wa...@gmail.com>.
Repairs generate new files that then need to be compacted.
Maybe that's where the temporary extra volume comes from?