Posted to user@cassandra.apache.org by Dave Galbraith <da...@gmail.com> on 2015/04/23 08:57:27 UTC

How much data is bootstrapping supposed to send?

I had a one-node Cassandra 2.1.3 cluster, where the output of nodetool
status looked like this:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns    Host ID                               Rack
UN  172.31.20.10  12.94 MB   256     ?       f803cae9-3f12-40c9-b681-caf4829b6bc6  rack1


Then I added another host to the cluster, and according to the logs it did
some bootstrapping:

INFO  [main] 2015-04-23 06:25:41,955 StorageService.java:1008 - JOINING: schema complete, ready to bootstrap
INFO  [main] 2015-04-23 06:25:41,955 StorageService.java:1008 - JOINING: calculation complete, ready to bootstrap
INFO  [main] 2015-04-23 06:25:41,956 StorageService.java:1008 - JOINING: getting bootstrap token
INFO  [main] 2015-04-23 06:26:11,999 StorageService.java:1008 - JOINING: Starting to bootstrap...
INFO  [main] 2015-04-23 06:26:12,159 StreamResultFuture.java:86 - [Stream #a2d70110-e981-11e4-90fe-03a9e0dac111] Executing streaming plan for Bootstrap
INFO  [main] 2015-04-23 06:26:13,225 StorageService.java:1037 - Bootstrap completed! for the tokens [-6649489682159922872,


But when I ran nodetool status after the new node had joined the cluster,
it looked like this:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                               Rack
UN  172.31.21.108  3.78 MB    256     ?       0fc3f5ac-c414-4340-b072-7d9959a28209  rack1
UN  172.31.20.10   15.45 MB   256     ?       f803cae9-3f12-40c9-b681-caf4829b6bc6  rack1


So I was expecting the load to drop to about 6.5 MB on my original node
while the new node would pick up about 6.5 MB, so they'd be balanced, but
instead the disk usage on my original node somehow increased by 2.5 MB
while the new node only picked up 3.78 MB. Why didn't I get a balanced
load? Why did the load on my original node go up when I added another node?
I didn't write any points during the bootstrap. All my keyspaces that have
a lot of data have replication factor 1, so I think and hope it wasn't just
replicating data on the new node. Thanks!

Re: How much data is bootstrapping supposed to send?

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Apr 22, 2015 at 11:57 PM, Dave Galbraith <david92galbraith@gmail.com> wrote:

> So I was expecting the load to drop to about 6.5 MB on my original node
> while the new node would pick up about 6.5 MB, so they'd be balanced, but
> instead the disk usage on my original node somehow increased by 2.5 MB
> while the new node only picked up 3.78 MB. Why didn't I get a balanced
> load? Why did the load on my original node go up when I added another node?
> I didn't write any points during the bootstrap. All my keyspaces that have
> a lot of data have replication factor 1, so I think and hope it wasn't just
> replicating data on the new node. Thanks!
>

To remove data from the source node which no longer belongs there, run
"nodetool cleanup" on it.

=Rob
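
For readers following the numbers in this thread, the arithmetic behind the "about 6.5 MB each" expectation can be sketched as below. This is only an illustration, not part of either message: the helper name expected_load_per_node is hypothetical, and it assumes perfectly even token ownership, replication factor 1, and that the 12.94 MB reported before the join is the whole data set. Per the reply above, the original node keeps the data it streamed away until "nodetool cleanup" is run on it, so its reported load would not be expected to drop on its own. The sketch does not explain why that load rose by about 2.5 MB.

```python
def expected_load_per_node(total_mb, nodes, replication_factor=1):
    """Idealized per-node load, assuming perfectly even token ownership.

    Hypothetical helper for illustration only; real clusters with
    randomly assigned vnode tokens are rarely split this evenly.
    """
    return total_mb * replication_factor / nodes


# Load reported by `nodetool status` before the second node joined.
load_before_mb = 12.94

# Loads reported after the second node joined (taken from the question).
observed_mb = {
    "172.31.20.10": 15.45,   # original node, cleanup not yet run
    "172.31.21.108": 3.78,   # newly bootstrapped node
}

expected_mb = expected_load_per_node(load_before_mb, nodes=2)

for address, load in observed_mb.items():
    diff = load - expected_mb
    print(f"{address}: observed {load:.2f} MB, "
          f"idealized {expected_mb:.2f} MB, difference {diff:+.2f} MB")
```

Comparing the loads again after running "nodetool cleanup" on the original node would show how much of the gap was data that no longer belonged there.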