You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2009/05/21 21:46:45 UTC

[jira] Created: (CASSANDRA-192) Load balancing

Load balancing
--------------

                 Key: CASSANDRA-192
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
             Project: Cassandra
          Issue Type: New Feature
            Reporter: Jonathan Ellis
             Fix For: 0.4


We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.

Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778094#action_12778094 ] 

Hudson commented on CASSANDRA-192:
----------------------------------

Integrated in Cassandra #259 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/259/])
    nodeprobe loadbalance
patch by jbellis; reviewed by goffinet for 


> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723718#action_12723718 ] 

Stu Hood commented on CASSANDRA-192:
------------------------------------

Re: the protocol described in https://issues.apache.org/jira/browse/CASSANDRA-192?focusedCommentId=12713079&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12713079

It will make things much cleaner for CASSANDRA-193 if when a node chooses an address, it always chooses an address that was calculated using recursive applications of CASSANDRA-242. That way, rather than recalculating the entire MerkleTree for the node, we only have to add or remove a subtree.


> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.5
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Fix Version/s:     (was: 0.4)
                   0.5

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.5
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777145#action_12777145 ] 

Jonathan Ellis edited comment on CASSANDRA-192 at 11/12/09 7:43 PM:
--------------------------------------------------------------------

adds op-initiated load-balancing ("nodeprobe loadbalance").

fully automatic is a little more dangerous and I don't want to go there for 0.5.

requires CASSANDRA-541.

      was (Author: jbellis):
    adds op-initiated load-balancing.

fully automatic is a little more dangerous and I don't want to go there for 0.5.
  
> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Description: 
We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.

Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

  was:
We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.

Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ('First, we suggest the direct application of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.')


> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714582#action_12714582 ] 

Jun Rao commented on CASSANDRA-192:
-----------------------------------

The basic idea looks right. A few other details.

1. How to handle replicas? At the minimum, it seems that when the data transfer from A to B is over, B needs to send delete requests to not only A, but also nodes that replicates part of A's data.

2. How to coordinate with hinted handoff data? What happens when the ring tokens have changed but there are still some old hinted data delivered to A? Do we want A to forward those data to B?
 

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Attachment:     (was: 192.patch)

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713213#action_12713213 ] 

Jonathan Ellis commented on CASSANDRA-192:
------------------------------------------

Citation search results (http://scholar.google.com/scholar?hl=en&lr=&cites=9360931679730374378).

Short version: Mercury section 4.4 and 5.5 are pertinent.  The rest is not.

"Mercury: Supporting scalable multi-attribute range queries:" Leave/join based load-balancing, basically Case 2 of Ruhl's algorithm.  They conclude that alpha=2 represents perhaps the best tradeoff of convergence (time to stop balancing) vs actual load ratio achieved, where "a node is said to be lightly loaded if the ratio of its local load to average load is less than 1/alpha and heavily loaded if the ratio is greater than alpha."  Most of the paper is spent describing how by random sampling each node can build a histogram of load distribution in the cluster.  

"Online balancing of range-partitioned data with applications to peer-to-peer systems:" weird mix of single-point-of-failure and p2p system design.  LB algorithm is designed to be run for each update/delete.

"One torus to rule them all: multi-dimensional queries in P2P systems:" classic overlay network design for large volume of node churn.  Proposes SCRAP and MURK (better acronyms than most).  SCRAP allows MD queries by mapping to a single dimension e.g. by z-ordering.  MURK uses kd-trees.  Only concerned with routing LB (a non-issue for us).

"A case study in building layered DHT applications:" builds prefix hash trees on top of OpenDHT for geographic range queries.  They started with the goal of using an unmodified DHT out of the box, but had to add atomic operations to it.  The layering approach resulted in query latency of up to 2s.  Not exactly a vindication of their approach.  Doesn't deal with LB.

"Load balancing and locality in range-queriable data structures:" pointer-based rather than token-based routing.  Bucketized keys for range queries.  Per-update/delete balancing.

"Heterogeneity and load balance in distributed hash tables:" leave/join virtual node-based LB in an overlay-linked DHT, with the twist that virtual node tokens are not random but picked to mitigate the extra cost the virtual nodes add to the overlay links.  Assumes load is uniformly distributed over key space.

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Description: 
We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.

Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ('First, we suggest the direct application of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.')

  was:
We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.

Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41

Of these, the useful ones are
 http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
 http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)

The third, 
http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")


> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ('First, we suggest the direct application of the lsquolsquopower of two choicesrsquorsquo paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.')

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Attachment: 192.patch

adds op-initiated load-balancing.

fully automatic is a little more dangerous and I don't want to go there for 0.5.

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (CASSANDRA-192) Load balancing

Posted by "Stu Hood (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723718#action_12723718 ] 

Stu Hood edited comment on CASSANDRA-192 at 11/6/09 9:42 PM:
-------------------------------------------------------------

It will make things much cleaner for CASSANDRA-193 if when a node chooses an address, it always chooses an address that was calculated using recursive applications of CASSANDRA-242. That way, rather than recalculating the entire MerkleTree for the node, we only have to add or remove a subtree.

EDIT: Ignore this comment: using midpoint to select tokens for Nodes isn't really helpful at all.

      was (Author: stuhood):
    Re: the protocol described in https://issues.apache.org/jira/browse/CASSANDRA-192?focusedCommentId=12713079&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12713079

It will make things much cleaner for CASSANDRA-193 if when a node chooses an address, it always chooses an address that was calculated using recursive applications of CASSANDRA-242. That way, rather than recalculating the entire MerkleTree for the node, we only have to add or remove a subtree.

  
> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (CASSANDRA-192) Load balancing

Posted by "Sandeep Tata (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Tata reassigned CASSANDRA-192:
--------------------------------------

    Assignee:     (was: Sandeep Tata)

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>             Fix For: 0.5
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778013#action_12778013 ] 

Chris Goffinet commented on CASSANDRA-192:
------------------------------------------

LGTM. Passes tests and nosetests.


> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713827#action_12713827 ] 

Jonathan Ellis commented on CASSANDRA-192:
------------------------------------------

Hopefully nobody writes about that because it is the easy part. :)

Here is a scheme I think would work:

if node B is assuming part of the range node A is currently responsible, then we do it as follows:

Node A notes the range it is transferring and begins anticompaction of its SSTables.  It continues to accept reads and writes for that range but also forwards writes to B so that anticompaction doesn't need to worry about these extra writes.  When anticompaction is done, it sends the appropriate sections over to B.  When complete, B begins to gossip its adjusted token and A creates a task that will remove the sections B now has some time in the future (after we can assume the entire cluster has gotten the new token info -- minutes or hours, not days).

To avoid garbage on either side in the event of a crash before the process is complete, we can add a check during compaction that throws away data that is not part of the current node's responsibility (or replica ranges).

This glosses over a bunch of details (do we anticompact memtables too, or just flush first and let the sstable code go over it?  what if B is already part of the ring that gets replicas for the range in question?) but I think the basic idea is sound.

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713079#action_12713079 ] 

Jonathan Ellis edited comment on CASSANDRA-192 at 5/26/09 12:56 PM:
--------------------------------------------------------------------

The Karger/Ruhl paper (really  Ruhl -- based on his thesis) gives two load balancing algorithms.  One is based again on each machine having several virtual nodes, but the load balance is done by only activating one node per machine.  Each machine picks its node based on how evenly it partitions the address space.  This would be easy to implement in Cassandra for our random hash-based partitioner (since only one node is active at a time, changing nodes maps essentially to a token change in Cassandra with no further changes necessary) but does not help order-preserving partitioning where we cannot tell how evenly the address space (same as the key space) is partitioned.

The second Ruhl algorithm assumes neither the ability to measure address space nor virtual nodes.  The summary is brief and I will excerpt it here:

---
The protocol is the following (where ε is any constant, 0 < ε < 1). Recall that each node stores the items whose addresses fall between the node's address and its predecessor's address, and that j denotes the load on node j.

Item balancing: Each node i occasionally contacts another node j at random. If i ≤ εj or j ≤ εi then the nodes perform a load balancing operation (assume that i > j), distinguishing two cases:

Case 1: i = j + 1: In this case, i is the successor of j and the two nodes handle address intervals next to each other. Node j increases its address so that the (i − j)/2 items with lowest addresses get reassigned from node i to node j. Both nodes end up with load (i + j)/2.

Case 2: i != j + 1: If j+1 > i, then we set i := j + 1 and go to case 1. Otherwise, node j moves between nodes i − 1 and i to capture half of node i's items. This means that node j's items are now handled by its former successor, node j + 1.
---

This seems like the most promising option so far.  I will do a citation search on this paper to see if anything else interesting turns up.

      was (Author: jbellis):
    The Karger/Ruhl paper (really  Ruhl -- based on his thesis) gives two load balancing algorithms.  One is based again on each machine having several virtual nodes, but the load balance is done by only activating one node per machine.  Each machine picks its node based on how evenly it partitions the address space.  This would be easy to implement in Cassandra for our random hash-based partitioner (since only one node is active at a time, changing nodes maps essentially to a token change in Cassandra with no further changes necessary) but does not help order-preserving partitioning where we cannot tell how evenly the address space (same as the key space) is partitioned.

The second Ruhl algorithm assumes neither the ability to measure address space nor virtual nodes.  The summary is brief and I will excerpt it here:

---
The protocol is the following (where ε is any constant, 0 < ε < 1). Recall that each node stores the items whose addresses fall between the node's address and its predecessor's address, and that j denotes the load on node j.

Item balancing: Each node i occasionally contacts another node j at random. If i ≤ εj or j ≤ εi then the nodes perform a load balancing operation (assume that i > j), distinguishing two cases:

Case 1: i = j + 1: In this case, i is the successor of j and the two nodes handle address intervals next to each other. Node j increases its address so that the (i − j)/2 items with lowest addresses get reassigned from node i to node j. Both nodes end up with load (i + j)/2.

Case 2: i = j + 1: If j+1 > i, then we set i := j + 1 and go to case 1. Otherwise, node j moves between nodes i − 1 and i to capture half of node i's items. This means that node j's items are now handled by its former successor, node j + 1.
---

This seems like the most promising option so far.  I will do a citation search on this paper to see if anything else interesting turns up.
  
> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-192:
----------------------------------------

    Assignee: Jonathan Ellis

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Attachment: 192.patch

rebased

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713079#action_12713079 ] 

Jonathan Ellis commented on CASSANDRA-192:
------------------------------------------

The Karger/Ruhl paper (really  Ruhl -- based on his thesis) gives two load balancing algorithms.  One is based again on each machine having several virtual nodes, but the load balance is done by only activating one node per machine.  Each machine picks its node based on how evenly it partitions the address space.  This would be easy to implement in Cassandra for our random hash-based partitioner (since only one node is active at a time, changing nodes maps essentially to a token change in Cassandra with no further changes necessary) but does not help order-preserving partitioning where we cannot tell how evenly the address space (same as the key space) is partitioned.

The second Ruhl algorithm assumes neither the ability to measure address space nor virtual nodes.  The summary is brief and I will excerpt it here:

---
The protocol is the following (where ε is any constant, 0 < ε < 1). Recall that each node stores the items whose addresses fall between the node's address and its predecessor's address, and that j denotes the load on node j.

Item balancing: Each node i occasionally contacts another node j at random. If i ≤ εj or j ≤ εi then the nodes perform a load balancing operation (assume that i > j), distinguishing two cases:

Case 1: i = j + 1: In this case, i is the successor of j and the two nodes handle address intervals next to each other. Node j increases its address so that the (i − j)/2 items with lowest addresses get reassigned from node i to node j. Both nodes end up with load (i + j)/2.

Case 2: i = j + 1: If j+1 > i, then we set i := j + 1 and go to case 1. Otherwise, node j moves between nodes i − 1 and i to capture half of node i's items. This means that node j's items are now handled by its former successor, node j + 1.
---

This seems like the most promising option so far.  I will do a citation search on this paper to see if anything else interesting turns up.

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Attachment: 192.patch

rebased

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch, 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713818#action_12713818 ] 

Jun Rao commented on CASSANDRA-192:
-----------------------------------

Those references mainly deal with what to move during LB. Have you thought about how to do the move, especially online (i.e, the moving happens in parallel with ongoing read/write requests)? 

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-192:
-------------------------------------

    Attachment:     (was: 192.patch)

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: 192.patch
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-192) Load balancing

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713060#action_12713060 ] 

Jonathan Ellis commented on CASSANDRA-192:
------------------------------------------

The Rao paper (or RLS+03, as it is cited in the other) describes three schemes to balance the load among machines by giving each several "virtual nodes" (as in Dynamo) and moving or splitting nodes from heavily loaded machines to lightly loaded.  A scheme like this is relatively simple once you already have a virtual node system in place but we do not and we would prefer not to add that layer of complexity if we can avoid it.


> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jonathan Ellis
>             Fix For: 0.4
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (CASSANDRA-192) Load balancing

Posted by "Sandeep Tata (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandeep Tata reassigned CASSANDRA-192:
--------------------------------------

    Assignee: Sandeep Tata

> Load balancing
> --------------
>
>                 Key: CASSANDRA-192
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-192
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sandeep Tata
>             Fix For: 0.5
>
>
> We need to be able to spread load evenly across a cluster to mitigate keys not being uniformly distributed as well as heterogeneous nodes in a cluster.  The former is particularly likely to be a problem when using the OrderPreservingPartitioner, since the keys are not randomized by a hash function.
> Avinash suggested three papers on load balancing in this thread: http://groups.google.com/group/cassandra-dev/msg/b3d67acf35801c41
> Of these, the useful ones are
>  http://www.iptps.org/papers-2004/karger-load-balance.pdf (Simple Efficient Load Balancing Algorithms for Peer-to-Peer Systems by David R. Karger and Matthias Ruhl)
>  http://iptps03.cs.berkeley.edu/final-papers/load_balancing.ps (Load Balancing in Structured P2P Systems by Ananth Rao et al)
> The third, 
> http://iptps03.cs.berkeley.edu/final-papers/simple_load_balancing.ps (Simple Load Balancing for Distributed Hash Tables by John Byers et al) is not applicable to Cassandra's design.  ("First, we suggest the direct application of the 'power of two choices' paradigm, whereby an item is stored at the less loaded of two (or more) random alternatives. We then consider how associating a small constant number of hash values with a key can naturally be extended to support other load balancing strategies.")

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.