Posted to user@cassandra.apache.org by Samuel CARRIERE <sa...@urssaf.fr> on 2012/06/05 21:38:43 UTC

RE: Nodes not picking up data on repair, disk loaded unevenly

Hi,

To verify that the repair was successful, you can look for this kind of
message in the log:
 INFO [AntiEntropyStage:1] 2012-05-19 00:57:52,351 AntiEntropyService.java 
(line 762) [repair #e46a0a90-a13c-11e1-0000-596f3d333ab7] UsersCF is fully 
synced (3 remaining column family to sync for this session)
...
 INFO [AntiEntropyStage:1] 2012-05-19 00:59:25,348 AntiEntropyService.java 
(line 762) [repair #e46a0a90-a13c-11e1-0000-596f3d333ab7] MyOtherCF is 
fully synced (2 remaining column family to sync for this session)
...
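
If you don't want to scan the whole log by eye, a grep on the repair
session id works too (a rough sketch: the log path below is the default
packaged location, adjust it to your install, and reuse the session id
printed when your repair starts):

$ grep 'repair #e46a0a90-a13c-11e1-0000-596f3d333ab7' /var/log/cassandra/system.log \
    | grep 'fully synced'
# one "is fully synced" line per column family means the session finished cleanly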

To verify that one node "really" has the data it is supposed to have,
you could isolate it from the rest of the cluster and query the data
(via Thrift) at consistency level ONE.
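
For example, with cassandra-cli (just a sketch, not tested here: the
keyspace, column family and row key are placeholders for yours, and if
your cli version supports it you can set the consistency level
explicitly; otherwise reads default to ONE anyway, as far as I remember):

$ cassandra-cli -h <isolated_node_ip> -p 9160
[default@unknown] use MyKeyspace;
[default@MyKeyspace] consistencylevel as ONE;
[default@MyKeyspace] get UsersCF['some_row_key'];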

Regards,
Samuel




Luke Hospadaruk <Lu...@ithaka.org> 
05/06/2012 20:53
Please reply to
user@cassandra.apache.org


To
"user@cassandra.apache.org" <us...@cassandra.apache.org>
cc

Subject
Nodes not picking up data on repair, disk loaded unevenly

I have a 4-node cluster with one keyspace (aside from the system keyspace)
with the replication factor set to 4.  The disk usage between the nodes is
wildly different and I'm wondering why.  It's becoming a problem because
one node is now so low on space that it sometimes fails to compact.

I've been doing a lot of experimenting with the schema, adding/dropping
things, changing settings around (not ideal I realize, but we're still in
development).

In an ideal world, I'd launch another cluster (this is all hosted in
amazon), copy all the data to that, and just get rid of my current
cluster, but the current cluster is in use by some other parties so
rebuilding everything is impractical (although possible if it's the only
reliable solution).

$ nodetool -h localhost ring
Address      DC         Rack   Status  State   Load       Owns    Token
1.xx.xx.xx   Cassandra  rack1  Up      Normal  837.8 GB   25.00%  0
2.xx.xx.xx   Cassandra  rack1  Up      Normal  1.17 TB    25.00%  42535295865117307932921825928971026432
3.xx.xx.xx   Cassandra  rack1  Up      Normal  977.23 GB  25.00%  85070591730234615865843651857942052864
4.xx.xx.xx   Cassandra  rack1  Up      Normal  291.2 GB   25.00%  127605887595351923798765477786913079296

-Problems I'm having:
Nodes are running out of space and are apparently unable to perform
compactions because of it.  These machines have 1.7T total space each.

The logs for node #2 have a lot of warnings about insufficient space for
compaction.  Node #4 was so far out of space (Cassandra was failing to
start because of it) that I removed all the SSTables for one of the less
essential column families just to bring it back online.
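
For reference, this is roughly how I've been spotting those warnings and
watching the remaining space (assuming the default log and data paths;
adjust for your install):

$ grep -i 'insufficient space' /var/log/cassandra/system.log | tail -5
$ df -h /var/lib/cassandra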


I have (since I started noticing these issues) enabled compression for all
my column families.  On node #1 I was able to successfully run a scrub and
major compaction, so I suspect that the disk usage for node #1 is about
where all the other nodes should be.  At ~840GB I'm probably running close
to the max load I should have on a node, so I may need to launch more
nodes into the cluster, but I'd like to get things straightened out before
I introduce more potential issues (token moving, etc).
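
(What I ran on node #1 was essentially the following; 'MyKeyspace' is a
stand-in for my real keyspace name:)

$ nodetool -h localhost scrub MyKeyspace
$ nodetool -h localhost compact MyKeyspace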

Node #4 seems not to be picking up all the data it should have (since the
replication factor equals the number of nodes, the load should be roughly
the same on every node?).  I've run repairs on that node to seemingly no
avail (after repair finishes, it still has about the same disk usage,
which is much too low).


-What I think the solution should be:
One node at a time (a rough command sketch follows the list):
1) nodetool drain the node
2) shut down cassandra on the node
3) wipe out all the data in my keyspace on the node
4) bring cassandra back up
5) nodetool repair
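
Roughly, in commands (the data directory is assumed to be the default
/var/lib/cassandra/data, which depends on data_file_directories in
cassandra.yaml; 'MyKeyspace' is a placeholder, and the service commands
depend on how Cassandra is installed on your boxes):

$ nodetool -h localhost drain
$ sudo service cassandra stop
$ rm -rf /var/lib/cassandra/data/MyKeyspace/*
$ sudo service cassandra start
$ nodetool -h localhost repair MyKeyspace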

-My concern:
This is basically what I did with node #4 (although I didn't drain, and I
didn't wipe the entire keyspace), and it doesn't seem to have regained all
the data it's supposed to have after the repair.  The column family should
have at least 200-300GB of data, but the SSTables in the data directory
only total about 11GB.  Am I missing something?

Is there a way to verify that a node _really_ has all the data it's
supposed to have?

I don't want to do this process to each node and discover at the end of it
that I've lost a ton of data.

Is there something I should be looking for in the logs to verify that the
repair was successful?  If I do a 'nodetool netstats' during the repair I
don't see any streams going in or out of node #4.
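
For reference, this is what I run against node #4 while the repair is
going (host placeholders as in the ring output above):

$ nodetool -h 4.xx.xx.xx netstats
$ nodetool -h 4.xx.xx.xx compactionstats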

Thanks,
Luke