Posted to user@cassandra.apache.org by Scott Dworkis <sv...@mylife.com> on 2010/08/17 04:26:43 UTC

curious space usage after recovering a failed node

I followed the alternative approach for handling a failed node described here:

http://wiki.apache.org/cassandra/Operations

i.e. bringing up a replacement node with the same IP, bootstrapping it
into the same token used by the failed node (using the InitialToken config
parameter), then doing a repair.  At the end of this process I had a data
directory almost 3x the size of the directory on the failed node at the
time of failure... I expected around 2x to account for extra copies moving
around, but 3x seems a bit high for the headroom I should expect to need
for recovery.
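
For reference, the sequence was roughly the following (the element names are
from a 0.6-era storage-conf.xml and the 8080 JMX port is how my setup is
configured; the token and host below are placeholders, not the real values):

# on the replacement node (same IP as the failed one), before first start,
# set in storage-conf.xml:
#   <AutoBootstrap>true</AutoBootstrap>
#   <InitialToken>token_of_the_failed_node</InitialToken>
# then start cassandra, wait for the bootstrap to finish, and run:
nodetool -host ip_of_the_replacement_node -port 8080 repair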

The data I inserted here was 100 copies of an almost 10 MB file, random
partitioner, no overwrites, replication factor of 2, so I'd expect to be
using around 2 GB across the cluster.
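
Back of the envelope, with sizes rounded:

# ~10 MB per copy x 100 copies = ~1 GB of raw data; x replication factor 2
echo "$(( 10 * 100 * 2 )) MB expected across the cluster"    # ~2000 MB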

Here is what nodetool ring and du on each node looked like after the initial data load:

Address       Status     Load          Range                                      Ring
                                        170141183460469231731687303715884105728 
10.3.0.84     Up         448.8 MB      42535295865117307932921825928971026432     |<--|
10.3.0.85     Up         374 MB        85070591730234615865843651857942052864     |   |
10.3.0.114    Up         495 bytes     127605887595351923798765477786913079296    |   |
10.3.0.115    Up         496 bytes     170141183460469231731687303715884105728    |-->|

655M    /data/cassandra/
655M    /data/cassandra
655M    /data/cassandra
1001M     /data/cassandra


So far so good... now after the bootstrap:

Address       Status     Load          Range                                      Ring
                                        170141183460469231731687303715884105728 
10.3.0.84     Up         467.5 MB      42535295865117307932921825928971026432     |<--|
10.3.0.85     Up         205.7 MB      85070591730234615865843651857942052864     |   |
10.3.0.114    Up         448.8 MB      127605887595351923798765477786913079296    |   |
10.3.0.115    Up         514.25 MB     170141183460469231731687303715884105728    |-->|

674M    /data/cassandra
206M    /data/cassandra/
655M    /data/cassandra
767M      /data/cassandra


Also reasonable. Now, after the repair:

Address       Status     Load          Range                                      Ring
                                        170141183460469231731687303715884105728 
10.3.0.84     Up         467.5 MB      42535295865117307932921825928971026432     |<--|
10.3.0.85     Up         916.3 MB      85070591730234615865843651857942052864     |   |
10.3.0.114    Up         654.5 MB      127605887595351923798765477786913079296    |   |
10.3.0.115    Up         514.25 MB     170141183460469231731687303715884105728    |-->|

674M    /data/cassandra
1.4G    /data/cassandra/
655M    /data/cassandra
767M      /data/cassandra


So would I need 3x headroom if I were to try this on a huge production
data set?  After 3 or 4 rounds of nodetool cleanup (commands below the du
listing), the ring looks ok, but the data directories have bloated:

Address       Status     Load          Range                                      Ring
                                        170141183460469231731687303715884105728 
10.3.0.84     Up         467.5 MB      42535295865117307932921825928971026432     |<--|
10.3.0.85     Up         420.75 MB     85070591730234615865843651857942052864     |   |
10.3.0.114    Up         448.8 MB      127605887595351923798765477786913079296    |   |
10.3.0.115    Up         514.25 MB     170141183460469231731687303715884105728    |-->|

1.2G    /data/cassandra
842M    /data/cassandra/
1.1G    /data/cassandra
1.3G      /data/cassandra
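
(The cleanup passes mentioned above were just the following, repeated a few
times; host list taken from the ring output, 0.6-style nodetool flags and
the 8080 JMX port from my setup:)

for n in 10.3.0.84 10.3.0.85 10.3.0.114 10.3.0.115; do
    nodetool -host $n -port 8080 cleanup
done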


So the question is: should I plan on needing 3x headroom for node recoveries?

-scott

Re: curious space usage after recovering a failed node

Posted by Scott Dworkis <sv...@mylife.com>.
To update: I seem to be having luck with some combination of nodetool
cleanup followed by triggering a garbage collection over JMX (both run on
each node).

(using jmxterm):

echo -e 'open localhost:8080\nrun -b java.lang:type=Memory gc' | java -jar jmxterm-1.0-alpha-4-uber.jar
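
Or, rolled into one pass over the whole ring (this sketch assumes remote JMX
on 8080 is reachable and that the jmxterm jar sits in the working directory;
otherwise run the jmxterm line locally on each node as above):

for n in 10.3.0.84 10.3.0.85 10.3.0.114 10.3.0.115; do
    nodetool -host $n -port 8080 cleanup
    echo -e "open $n:8080\nrun -b java.lang:type=Memory gc" | \
        java -jar jmxterm-1.0-alpha-4-uber.jar
done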

-scott
