Posted to user@cassandra.apache.org by DuyHai Doan <do...@gmail.com> on 2014/04/24 16:14:32 UTC

Load balancing issue with virtual nodes

Hello all

 I'm facing a rather weird issue with virtual nodes.

 My customer has a cluster with only 2 nodes. I've enabled virtual nodes so
that future addition of new nodes will be easy.

 Now, after some benchmark tests with massive data inserts, I can see with
"htop" that one node has CPU usage up to 60% and the other only around 10%.

 The massive insertion workload consists of random data with a random string
(20 chars) as the partition key, so in theory the Murmur3 partitioner should
dispatch them evenly between both nodes...
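
For illustration, a minimal Java sketch of this kind of key generation and
spread check (my own example, assuming Guava on the classpath; Guava's
murmur3_128 may not be bit-identical to Cassandra's Murmur3Partitioner, and
vnode ranges are not a simple sign split, so this only shows that random keys
hash uniformly):

import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;
import java.util.Random;

public class KeySpreadCheck {
    private static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    public static void main(String[] args) {
        Random random = new Random();
        long lowerHalf = 0, upperHalf = 0;
        for (int i = 0; i < 1_000_000; i++) {
            // Build a random 20-char partition key, as in the benchmark
            StringBuilder key = new StringBuilder(20);
            for (int j = 0; j < 20; j++) {
                key.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
            long hash = Hashing.murmur3_128()
                    .hashString(key, StandardCharsets.UTF_8).asLong();
            if (hash < 0) lowerHalf++; else upperHalf++;
        }
        // Both halves of the hash space should receive roughly 50% of the keys
        System.out.printf("lower half: %d, upper half: %d%n", lowerHalf, upperHalf);
    }
}

On a million keys the two halves typically differ by only a fraction of a
percent, which is why an even dispatch is the reasonable expectation here.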

 Is there any existing JIRA related to load balancing issues with vnodes?

 Regards

 Duy Hai DOAN

Re: Load balancing issue with virtual nodes

Posted by DuyHai Doan <do...@gmail.com>.
Thank you Ben for the links




On Tue, Apr 29, 2014 at 3:40 AM, Ben Bromhead <be...@instaclustr.com> wrote:

> Some imbalance is expected and considered normal:
>
> See http://wiki.apache.org/cassandra/VirtualNodes/Balance
>
> As well as
>
> https://issues.apache.org/jira/browse/CASSANDRA-7032
>
> Ben Bromhead
> Instaclustr | www.instaclustr.com | @instaclustr (http://twitter.com/instaclustr) |
> +61 415 936 359
>
> On 29 Apr 2014, at 7:30 am, DuyHai Doan <do...@gmail.com> wrote:
>
> Hello all
>
>  Some update about the issue.
>
>  After completely wiping all sstable/commitlog/saved_caches folders and
> restarting the cluster from scratch, we still see weird figures. After the
> restart, nodetool status does not show an exact 50% data balance for each
> node:
>
>
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load      Tokens  Owns (effective)  Host ID                               Rack
> UN  host1    48.57 KB  256     51.6%             d00de0d1-836f-4658-af64-3a12c00f47d6  rack1
> UN  host2    48.57 KB  256     48.4%             e9d2505b-7ba7-414c-8b17-af3bbe79ed9c  rack1
>
>
> As you can see, the % is very close to 50% but not exactly 50%
>
>  What can explain that? Could it be a network connection issue during the
> initial token shuffle phase?
>
> P.S: both host1 and host2 are supposed to have exactly the same hardware
>
> Regards
>
>  Duy Hai DOAN
>
>
> On Thu, Apr 24, 2014 at 11:20 PM, Batranut Bogdan <ba...@yahoo.com> wrote:
>
>> I don't know about Hector, but the DataStax Java driver needs just one IP
>> from the cluster and it will discover the rest of the nodes. Then by
>> default it will do a round robin when sending requests. So if Hector does
>> the same, the pattern will appear again.
>> Did you look at the size of the dirs?
>> That documentation is for C* 0.8. It's old. But depending on your boxes
>> you might reach a CPU bottleneck. You might want to google for the write
>> path in Cassandra. According to that, there is not much to do when writes
>> come in...
>>   On Friday, April 25, 2014 12:00 AM, DuyHai Doan <do...@gmail.com>
>> wrote:
>>  I did some experiments.
>>
>>  Let's say we have node1 and node2
>>
>> First, I configured Hector with node1 & node2 as hosts and I saw that
>> only node1 had a high CPU load.
>>
>> To eliminate the "client connection" issue, I re-tested with only node2
>> provided as the host for Hector. Same pattern. CPU load is above 50% on
>> node1 and below 10% on node2.
>>
>> It means that node2 is acting as coordinator and forwarding many
>> write/read requests to node1.
>>
>>  Why did I look at CPU load and not iostat et al.?
>>
>>  Because I have a very write-intensive workload with a read-only-once
>> pattern. I've read here (
>> http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
>> that heavy writes in C* are more CPU-bound, but the info may be outdated
>> and no longer true.
>>
>>  Regards
>>
>>  Duy Hai DOAN
>>
>>
>> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mi...@pbandjelly.org> wrote:
>>
>> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>>
>>   Client used = Hector 1.1-4
>>   Default Load Balancing connection policy
>>   Both nodes' addresses are provided to Hector, so according to its
>> connection policy the client should alternate between both
>> nodes
>>
>>
>> OK, so is only one connection being established to one node for one bulk
>> write operation? Or are multiple connections being made to both nodes and
>> writes performed on both?
>>
>> --
>> Michael
>>
>>
>>
>>
>>
>
>

Re: Load balancing issue with virtual nodes

Posted by Ben Bromhead <be...@instaclustr.com>.
Some imbalance is expected and considered normal:

See http://wiki.apache.org/cassandra/VirtualNodes/Balance

As well as

https://issues.apache.org/jira/browse/CASSANDRA-7032
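
To make the "expected imbalance" concrete, here is a rough simulation
sketch (my own, not taken from the linked pages): place 256 random tokens
per node on the Murmur3 ring, as Cassandra does by default, and measure
ownership. The split lands near 50/50 but almost never exactly on it.

import java.math.BigInteger;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class VnodeBalanceSim {
    private static final BigInteger RING = BigInteger.ONE.shiftLeft(64);

    public static void main(String[] args) {
        Random random = new Random();
        // 256 random tokens per node, like the default num_tokens setting
        TreeMap<Long, Integer> ring = new TreeMap<Long, Integer>();
        for (int node = 0; node < 2; node++) {
            for (int i = 0; i < 256; i++) {
                ring.put(random.nextLong(), node);
            }
        }
        // A token owns the range from its predecessor (exclusive) up to
        // itself (inclusive); sum those range widths per node.
        BigInteger[] owned = { BigInteger.ZERO, BigInteger.ZERO };
        long previous = ring.lastKey(); // the ring wraps around
        for (Map.Entry<Long, Integer> entry : ring.entrySet()) {
            BigInteger width = BigInteger.valueOf(entry.getKey())
                    .subtract(BigInteger.valueOf(previous));
            if (width.signum() <= 0) width = width.add(RING);
            owned[entry.getValue()] = owned[entry.getValue()].add(width);
            previous = entry.getKey();
        }
        for (int node = 0; node < 2; node++) {
            double pct = 100.0 * owned[node].doubleValue() / RING.doubleValue();
            System.out.printf("node%d owns %.1f%%%n", node + 1, pct);
        }
    }
}

Typical runs print something like a 51/49 split, in line with the
51.6%/48.4% ownership quoted below.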

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359

On 29 Apr 2014, at 7:30 am, DuyHai Doan <do...@gmail.com> wrote:

> Hello all
> 
>  Some update about the issue.
> 
>  After completely wiping all sstable/commitlog/saved_caches folders and restarting the cluster from scratch, we still see weird figures. After the restart, nodetool status does not show an exact 50% data balance for each node:
> 
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load      Tokens  Owns (effective)  Host ID                               Rack
> UN  host1    48.57 KB  256     51.6%             d00de0d1-836f-4658-af64-3a12c00f47d6  rack1
> UN  host2    48.57 KB  256     48.4%             e9d2505b-7ba7-414c-8b17-af3bbe79ed9c  rack1
> 
> 
> As you can see, the % is very close to 50% but not exactly 50%
> 
>  What can explain that? Could it be a network connection issue during the initial token shuffle phase?
> 
> P.S: both host1 and host2 are supposed to have exactly the same hardware
> 
> Regards
> 
>  Duy Hai DOAN
> 
> 
> On Thu, Apr 24, 2014 at 11:20 PM, Batranut Bogdan <ba...@yahoo.com> wrote:
> I don't know about Hector, but the DataStax Java driver needs just one IP from the cluster and it will discover the rest of the nodes. Then by default it will do a round robin when sending requests. So if Hector does the same, the pattern will appear again.
> Did you look at the size of the dirs?
> That documentation is for C* 0.8. It's old. But depending on your boxes you might reach a CPU bottleneck. You might want to google for the write path in Cassandra. According to that, there is not much to do when writes come in...
> On Friday, April 25, 2014 12:00 AM, DuyHai Doan <do...@gmail.com> wrote:
> I did some experiments.
> 
>  Let's say we have node1 and node2
> 
> First, I configured Hector with node1 & node2 as hosts and I saw that only node1 had a high CPU load.
>
> To eliminate the "client connection" issue, I re-tested with only node2 provided as the host for Hector. Same pattern. CPU load is above 50% on node1 and below 10% on node2.
>
> It means that node2 is acting as coordinator and forwarding many write/read requests to node1.
>
>  Why did I look at CPU load and not iostat et al.?
>
>  Because I have a very write-intensive workload with a read-only-once pattern. I've read here (http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning) that heavy writes in C* are more CPU-bound, but the info may be outdated and no longer true.
> 
>  Regards
> 
>  Duy Hai DOAN
> 
> 
> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mi...@pbandjelly.org> wrote:
> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>   Client used = Hector 1.1-4
>   Default Load Balancing connection policy
>   Both nodes' addresses are provided to Hector, so according to its
> connection policy the client should alternate between both nodes
> 
> OK, so is only one connection being established to one node for one bulk write operation? Or are multiple connections being made to both nodes and writes performed on both?
> 
> -- 
> Michael
> 
> 
> 
> 


Re: Load balancing issue with virtual nodes

Posted by DuyHai Doan <do...@gmail.com>.
Hello all

 Some update about the issue.

 After completely wiping all sstable/commitlog/saved_caches folders and
restarting the cluster from scratch, we still see weird figures. After the
restart, nodetool status does not show an exact 50% data balance for each
node:


Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load      Tokens  Owns (effective)  Host ID                               Rack
UN  host1    48.57 KB  256     51.6%             d00de0d1-836f-4658-af64-3a12c00f47d6  rack1
UN  host2    48.57 KB  256     48.4%             e9d2505b-7ba7-414c-8b17-af3bbe79ed9c  rack1


As you can see, the % is very close to 50% but not exactly 50%

 What can explain that? Could it be a network connection issue during the
initial token shuffle phase?

P.S: both host1 and host2 are supposed to have exactly the same hardware

Regards

 Duy Hai DOAN


On Thu, Apr 24, 2014 at 11:20 PM, Batranut Bogdan <ba...@yahoo.com> wrote:

> I don't know about Hector, but the DataStax Java driver needs just one IP
> from the cluster and it will discover the rest of the nodes. Then by
> default it will do a round robin when sending requests. So if Hector does
> the same, the pattern will appear again.
> Did you look at the size of the dirs?
> That documentation is for C* 0.8. It's old. But depending on your boxes
> you might reach a CPU bottleneck. You might want to google for the write
> path in Cassandra. According to that, there is not much to do when writes
> come in...
>   On Friday, April 25, 2014 12:00 AM, DuyHai Doan <do...@gmail.com>
> wrote:
>  I did some experiments.
>
>  Let's say we have node1 and node2
>
> First, I configured Hector with node1 & node2 as hosts and I saw that only
> node1 had a high CPU load.
>
> To eliminate the "client connection" issue, I re-tested with only node2
> provided as the host for Hector. Same pattern. CPU load is above 50% on
> node1 and below 10% on node2.
>
> It means that node2 is acting as coordinator and forwarding many
> write/read requests to node1.
>
>  Why did I look at CPU load and not iostat et al.?
>
>  Because I have a very write-intensive workload with a read-only-once
> pattern. I've read here (
> http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
> that heavy writes in C* are more CPU-bound, but the info may be outdated
> and no longer true.
>
>  Regards
>
>  Duy Hai DOAN
>
>
> On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mi...@pbandjelly.org> wrote:
>
> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>
>   Client used = Hector 1.1-4
>   Default Load Balancing connection policy
>   Both nodes' addresses are provided to Hector, so according to its
> connection policy the client should alternate between both
> nodes
>
>
> OK, so is only one connection being established to one node for one bulk
> write operation? Or are multiple connections being made to both nodes and
> writes performed on both?
>
> --
> Michael
>
>
>
>
>

Re: Load balancing issue with virtual nodes

Posted by Batranut Bogdan <ba...@yahoo.com>.
I don't know about Hector, but the DataStax Java driver needs just one IP from the cluster and it will discover the rest of the nodes. Then by default it will do a round robin when sending requests. So if Hector does the same, the pattern will appear again.
Did you look at the size of the dirs?
That documentation is for C* 0.8. It's old. But depending on your boxes you might reach a CPU bottleneck. You might want to google for the write path in Cassandra. According to that, there is not much to do when writes come in...
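
For reference, a minimal sketch of that contact-point behaviour with the
DataStax Java driver (driver 2.0-era API; the host and keyspace names are
placeholders, and the round-robin policy is set explicitly here rather
than relying on the default):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.RoundRobinPolicy;

public class DriverContactPoint {
    public static void main(String[] args) {
        // One contact point is enough: the driver discovers the other
        // nodes from the cluster's system tables after connecting.
        Cluster cluster = Cluster.builder()
                .addContactPoint("node1")
                .withLoadBalancingPolicy(new RoundRobinPolicy())
                .build();
        Session session = cluster.connect("my_keyspace"); // hypothetical keyspace
        // ... requests are now spread round-robin over all discovered nodes
        session.close();
        cluster.close();
    }
}

With a single contact point, request distribution then depends on the
policy, not on which host was listed, which matches the behaviour
described above.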
On Friday, April 25, 2014 12:00 AM, DuyHai Doan <do...@gmail.com> wrote:
 
I did some experiments.

 Let's say we have node1 and node2

First, I configured Hector with node1 & node2 as hosts and I saw that only node1 had a high CPU load.

To eliminate the "client connection" issue, I re-tested with only node2 provided as the host for Hector. Same pattern. CPU load is above 50% on node1 and below 10% on node2.

It means that node2 is acting as coordinator and forwarding many write/read requests to node1.

 Why did I look at CPU load and not iostat et al.?

 Because I have a very write-intensive workload with a read-only-once pattern. I've read here (http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning) that heavy writes in C* are more CPU-bound, but the info may be outdated and no longer true.

 Regards

 Duy Hai DOAN




On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mi...@pbandjelly.org> wrote:

> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>>  Client used = Hector 1.1-4
>>  Default Load Balancing connection policy
>>  Both nodes' addresses are provided to Hector, so according to its
>> connection policy the client should alternate between both nodes
>
> OK, so is only one connection being established to one node for one bulk write operation? Or are multiple connections being made to both nodes and writes performed on both?
>
> --
> Michael

Re: Load balancing issue with virtual nodes

Posted by DuyHai Doan <do...@gmail.com>.
I did some experiments.

 Let's say we have node1 and node2

First, I configured Hector with node1 & node2 as hosts and I saw that only
node1 had a high CPU load.

To eliminate the "client connection" issue, I re-tested with only node2
provided as the host for Hector. Same pattern. CPU load is above 50% on
node1 and below 10% on node2.
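
For the record, the two Hector setups look roughly like this (a sketch in
the style of the Hector 1.1-4 API; the cluster name is arbitrary and 9160
is the default Thrift port):

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class HectorSetup {
    public static void main(String[] args) {
        // First experiment: both nodes given as contact hosts
        CassandraHostConfigurator bothNodes =
                new CassandraHostConfigurator("node1:9160,node2:9160");
        Cluster cluster = HFactory.getOrCreateCluster("bench-cluster", bothNodes);

        // Second experiment: only node2 given as contact host
        // CassandraHostConfigurator onlyNode2 =
        //         new CassandraHostConfigurator("node2:9160");
    }
}

Hector round-robins over its host pool, so with both hosts listed the
requests should alternate between them; the skew observed above therefore
points at the server side rather than the client configuration.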

It means that node2 is acting as coordinator and forwarding many write/read
requests to node1.

 Why did I look at CPU load and not iostat et al.?

 Because I have a very write-intensive workload with a read-only-once
pattern. I've read here (
http://www.datastax.com/docs/0.8/cluster_architecture/cluster_planning)
that heavy writes in C* are more CPU-bound, but the info may be outdated
and no longer true.

 Regards

 Duy Hai DOAN


On Thu, Apr 24, 2014 at 10:00 PM, Michael Shuler <mi...@pbandjelly.org> wrote:

> On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>
>>   Client used = Hector 1.1-4
>>   Default Load Balancing connection policy
>>   Both nodes' addresses are provided to Hector, so according to its
>> connection policy the client should alternate between both
>> nodes
>>
>
> OK, so is only one connection being established to one node for one bulk
> write operation? Or are multiple connections being made to both nodes and
> writes performed on both?
>
> --
> Michael
>

Re: Load balancing issue with virtual nodes

Posted by Batranut Bogdan <ba...@yahoo.com>.
Htop is not the only tool for this. Cassandra will hit I/O bottlenecks before CPU (on faster CPUs). A simple solution is to check the size of the data dir on the boxes. If you have approximately the same size, then Cassandra is writing to the whole cluster. Check how the data dir size changes when importing, and use iostat to see how loaded the HDDs are. In my experience I rarely look at CPU load. Since I have HDDs, I/O is the thing that I keep an eye on. Also try sequential strings when inserting, but this should behave approximately the same as for random ones.
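
A quick way to compare data dir sizes across boxes, if you prefer code over
du (a minimal sketch; the path is the common default and may differ from
your cassandra.yaml):

import java.io.File;

public class DataDirSize {
    // Recursively sum file sizes under the given directory
    static long sizeOf(File f) {
        if (f.isFile()) return f.length();
        long total = 0;
        File[] children = f.listFiles();
        if (children != null) for (File c : children) total += sizeOf(c);
        return total;
    }

    public static void main(String[] args) {
        File dataDir = new File("/var/lib/cassandra/data");
        System.out.printf("%s: %.1f MB%n", dataDir, sizeOf(dataDir) / (1024.0 * 1024));
    }
}

Running du -sh on each box gives the same answer with less ceremony.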

Re: Load balancing issue with virtual nodes

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 04/24/2014 10:29 AM, DuyHai Doan wrote:
>   Client used = Hector 1.1-4
>   Default Load Balancing connection policy
>   Both nodes' addresses are provided to Hector, so according to its
> connection policy the client should alternate between both nodes

OK, so is only one connection being established to one node for one bulk 
write operation? Or are multiple connections being made to both nodes 
and writes performed on both?

-- 
Michael

Re: Load balancing issue with virtual nodes

Posted by DuyHai Doan <do...@gmail.com>.
Hello Michael

 RF = 1

 Client used = Hector 1.1-4
 Default Load Balancing connection policy
 Both nodes' addresses are provided to Hector, so according to its connection
policy the client should alternate between both nodes

 Regards

 Duy Hai DOAN


On Thu, Apr 24, 2014 at 4:37 PM, Michael Shuler <mi...@pbandjelly.org> wrote:

> On 04/24/2014 09:14 AM, DuyHai Doan wrote:
>
>>   My customer has a cluster with only 2 nodes. I've enabled virtual nodes
>> so that future addition of new nodes will be easy.
>>
>
> with RF=?
>
>
>>   Now, after some benchmark tests with massive data inserts, I can see
>> with "htop" that one node has CPU usage up to 60% and the other
>> only around 10%.
>>
>
> This would make sense if the keyspace is RF=1 and the client is connecting
> to one node. This also makes sense if the KS is RF=2 and the client is
> connecting to one node.
>
>
>>   The massive insertion workload consists of random data with a random
>> string (20 chars) as the partition key, so in theory the Murmur3
>> partitioner should dispatch them evenly between both nodes...
>>
>>   Is there any existing JIRA related to load balancing issues with vnodes?
>>
>
> vnode != node
>
> Clients connect to nodes - real servers with a network connection. Vnodes
> are internal to how C* distributes data in the cluster and have nothing to
> do with network connections of the nodes. Load balancing client connections
> is at the node/network level and how the client/driver is configured to
> distribute network connections - has nothing to do with vnodes.
>
> --
> Michael
>
>

Re: Load balancing issue with virtual nodes

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 04/24/2014 09:14 AM, DuyHai Doan wrote:
>   My customer has a cluster with only 2 nodes. I've enabled virtual nodes so
> that future addition of new nodes will be easy.

with RF=?

>   Now, after some benchmark tests with massive data inserts, I can see
> with "htop" that one node has CPU usage up to 60% and the other
> only around 10%.

This would make sense if the keyspace is RF=1 and the client is 
connecting to one node. This also makes sense if the KS is RF=2 and the 
client is connecting to one node.
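
As a rough way to check the keyspace's RF programmatically, here is a
sketch against the C* 1.2/2.0 schema tables via the DataStax Java driver
(the keyspace name 'ks' and contact point 'host1' are placeholders):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CheckReplication {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("host1").build();
        Session session = cluster.connect();
        // C* 1.2/2.0 keeps keyspace replication settings in this system table
        Row row = session.execute(
                "SELECT strategy_class, strategy_options " +
                "FROM system.schema_keyspaces WHERE keyspace_name = 'ks'").one();
        System.out.println(row.getString("strategy_class") + " "
                + row.getString("strategy_options"));
        session.close();
        cluster.close();
    }
}

With RF=1 each row lives on exactly one node, so an uneven client
connection pattern shows up directly as uneven CPU.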

>   The massive insertion workload consists of random data with a random
> string (20 chars) as the partition key, so in theory the Murmur3
> partitioner should dispatch them evenly between both nodes...
>
>   Is there any existing JIRA related to load balancing issues with vnodes?

vnode != node

Clients connect to nodes - real servers with a network connection. 
Vnodes are internal to how C* distributes data in the cluster and have 
nothing to do with network connections of the nodes. Load balancing 
client connections is at the node/network level and how the 
client/driver is configured to distribute network connections - has 
nothing to do with vnodes.

-- 
Michael