Posted to user@cassandra.apache.org by myreasoner <my...@gmail.com> on 2011/08/08 20:04:04 UTC

how to verify the row key is evenly distributed

Hi all,

I have a CF that uses incrementing integers as row keys.  In a 5-node cluster
with RandomPartitioner, I've noticed the rows are not spread evenly across
nodes: two of the nodes are 5 times more heavily loaded than the rest.

In nodetool, I can do
  getendpoints <keyspace> <cf> <key> - Print the end points that owns the key

But is there an API I can call programmatically to determine the endpoints
for a given set of row keys?


--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/how-to-verify-the-row-key-is-evenly-distributed-tp6665277p6665277.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at Nabble.com.

Re: how to verify the row key is evenly distributed

Posted by aaron morton <aa...@thelastpickle.com>.
If your data is not evenly distributed, check the tokens in the ring with "nodetool ring"; they should be evenly spaced. For background, have a look at http://wiki.apache.org/cassandra/Operations#Load_balancing
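
To make the token check concrete, here is a rough Python sketch, not anything shipped with Cassandra. It computes evenly spaced tokens for an N-node RandomPartitioner ring and counts how many keys from a sample fall to each node. It assumes the RandomPartitioner token is the absolute value of the key's MD5 digest read as a signed big-endian integer, and that the integer keys are stored as their decimal string bytes; adjust the encoding if your client serializes keys differently. Only the primary owner is considered (replication is ignored), and all function names are made up for illustration.

```python
import bisect
import hashlib
from collections import Counter

RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed 0 .. 2**127)


def ideal_tokens(node_count):
    """Evenly spaced tokens for a ring of node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def key_token(key_bytes):
    """Token for a row key: abs() of the MD5 digest as a signed big-endian int."""
    digest = hashlib.md5(key_bytes).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def primary_owner(token, sorted_tokens):
    """Node token owning `token`: the first node token >= token, wrapping around."""
    idx = bisect.bisect_left(sorted_tokens, token)
    return sorted_tokens[idx % len(sorted_tokens)]


def keys_per_node(keys, node_tokens):
    """Count keys per node token (primary owner only, replication ignored)."""
    ring = sorted(node_tokens)
    counts = Counter({t: 0 for t in ring})
    for key in keys:
        counts[primary_owner(key_token(key), ring)] += 1
    return counts


if __name__ == "__main__":
    # Assumption: incrementing integer keys stored as decimal strings.
    sample_keys = (str(i).encode("ascii") for i in range(100000))
    for token, count in sorted(keys_per_node(sample_keys, ideal_tokens(5)).items()):
        print(token, count)
```

Replacing ideal_tokens(5) with the tokens actually reported by "nodetool ring" shows how many sampled keys each node would own as primary. If the counts come out roughly equal but the load does not, the cause is more likely one of the other items in this message than the key distribution itself.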

If they are evenly distributed, there are a couple of other things to look at:
- if you've moved the tokens, remember to run nodetool cleanup so the nodes drop data they no longer own
- the nodes may just be compacting at different rates
- you may have some very large rows

nodetool getendpoints calls the getNaturalEndpoints() operation on the StorageService MBean. You can call the same operation over JMX, for example from JConsole.
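
Short of writing a JMX client, the same lookup can be scripted around the nodetool command from the original question. The Python sketch below is only an illustration: it assumes nodetool is on the PATH, that it accepts "-h <host>", and that getendpoints prints one endpoint address per line (possibly with a leading "/"). The function names are hypothetical.

```python
import subprocess
from collections import Counter


def getendpoints_command(host, keyspace, column_family, key):
    """Build the nodetool command line for one key (assumes nodetool is on PATH)."""
    return ["nodetool", "-h", host, "getendpoints", keyspace, column_family, str(key)]


def parse_endpoints(output):
    """Parse getendpoints output, assumed to be one address per line."""
    return [line.strip().lstrip("/") for line in output.splitlines() if line.strip()]


def endpoints_for_keys(host, keyspace, column_family, keys):
    """Count how often each endpoint is returned across a sample of keys."""
    counts = Counter()
    for key in keys:
        result = subprocess.run(
            getendpoints_command(host, keyspace, column_family, key),
            capture_output=True,
            text=True,
            check=True,  # raise if nodetool exits with an error
        )
        counts.update(parse_endpoints(result.stdout))
    return counts
```

For example, endpoints_for_keys("localhost", "MyKeyspace", "MyCF", range(0, 100000, 1000)) would sample every 1000th key; the host, keyspace and CF names are placeholders. Each call starts a new JVM, so this only suits a modest sample. For large key sets a JMX client or the local token calculation is a better fit.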

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
