Posted to commits@cassandra.apache.org by "Sam Overton (Created) (JIRA)" <ji...@apache.org> on 2012/04/05 15:59:24 UTC

[jira] [Created] (CASSANDRA-4123) vnodes aware Replication Strategy

vnodes aware Replication Strategy
----------------------------------

Key: CASSANDRA-4123
URL: https://issues.apache.org/jira/browse/CASSANDRA-4123
Project: Cassandra
Issue Type: Sub-task
Reporter: Sam Overton
Assignee: Sam Overton

The simplest implementation for this would be for NTS (NetworkTopologyStrategy) to regard each host as a distinct rack. This would prevent replicas from being placed on the same host. The rest of the logic for replica selection would be identical to NTS (but this would remove a level of topology hierarchy). This would be achievable just by writing a snitch that places each host in its own rack.
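
As a rough illustration of the host-per-rack idea, here is a minimal sketch. The TopologySnitch interface, the HostAsRackSnitch class and the host() helper are hypothetical names invented for this example; they are not Cassandra's actual snitch API, and a real snitch would have to implement whatever interface the Cassandra version in use defines.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical stand-in for a snitch; NOT Cassandra's real snitch interface. */
interface TopologySnitch {
    String getDatacenter(InetAddress endpoint);
    String getRack(InetAddress endpoint);
}

/** Wraps another snitch and reports every host as its own rack. */
final class HostAsRackSnitch implements TopologySnitch {
    private final TopologySnitch delegate;

    HostAsRackSnitch(TopologySnitch delegate) {
        this.delegate = delegate;
    }

    @Override
    public String getDatacenter(InetAddress endpoint) {
        // Datacenter membership is passed through unchanged.
        return delegate.getDatacenter(endpoint);
    }

    @Override
    public String getRack(InetAddress endpoint) {
        // The rack name is the host address, so no two hosts share a rack.
        // The real rack level is no longer visible to the strategy.
        return endpoint.getHostAddress();
    }

    /** Illustration-only helper: builds the address 10.0.0.<lastOctet>. */
    static InetAddress host(int lastOctet) {
        try {
            return InetAddress.getByAddress(new byte[] {10, 0, 0, (byte) lastOctet});
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The trade-off mentioned above is visible in getRack(): because the delegate's rack is discarded, a rack-aware strategy can no longer spread replicas across physical racks.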

A better solution would be to add an extra level of hierarchy to NTS so that it still supported DC & rack, and IP would be the new level at the bottom of the hierarchy. The logic would remain largely the same.

I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a "replica set" for each host and then treating it in a similar way to a DC (i.e. RF replicas are chosen from that set, instead of from the whole cluster).
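
One possible reading of "RF replicas are chosen from that set" is sketched below. Everything here is an assumption made for illustration: the ReplicaSetPlacement class, the use of plain strings as host names, the idea that the replica set of the first host on the ring governs placement, and the assumption that a host's replica set contains the host itself. How replica sets would actually be defined is exactly the open question of this ticket and is simply taken as input.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: restrict replica choice to the primary host's replica set. */
final class ReplicaSetPlacement {
    /**
     * @param ringOrder    hosts in the order their tokens are met when walking the
     *                     ring from the key's token (must be non-empty; a host with
     *                     several vnodes may appear several times)
     * @param replicaSetOf assumed mapping from a host to its replica set
     * @param rf           replication factor
     */
    static List<String> selectReplicas(List<String> ringOrder,
                                       Map<String, Set<String>> replicaSetOf,
                                       int rf) {
        String primary = ringOrder.get(0);
        Set<String> allowed = replicaSetOf.get(primary);
        if (allowed == null) {
            throw new IllegalArgumentException("no replica set defined for " + primary);
        }
        // LinkedHashSet keeps ring order and ignores repeated hosts.
        Set<String> replicas = new LinkedHashSet<>();
        for (String host : ringOrder) {
            if (replicas.size() == rf) {
                break;
            }
            // Hosts outside the primary's replica set are skipped entirely.
            if (allowed.contains(host)) {
                replicas.add(host);
            }
        }
        if (replicas.size() < rf) {
            throw new IllegalStateException("replica set of " + primary
                    + " cannot satisfy RF=" + rf);
        }
        return new ArrayList<>(replicas);
    }
}
```

Under these assumptions the data owned by a host is only ever shared with the members of its replica set, however many other hosts the cluster contains.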

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4123) vnodes aware Replication Strategy

Posted by "Sam Overton (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402269#comment-13402269 ] 

Sam Overton commented on CASSANDRA-4123:
----------------------------------------

I should point out that the existing NTS (and also the modified implementation in CASSANDRA-3881) already prevents replicas from being placed on the same host. The first couple of paragraphs in the ticket description could probably be removed.

The main issue left in this ticket is the notion of Distribution Factor and replica sets, which aims to reduce the chance that hosts failing simultaneously hold replicas of the same data.


                
> vnodes aware Replication Strategy 
> ----------------------------------
>
>                 Key: CASSANDRA-4123
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4123
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Sam Overton
>            Assignee: Sam Overton
>
> The simplest implementation for this would be if NTS regarded a single host as a distinct rack. This would prevent replicas being placed on the same host. The rest of the logic for replica selection would be identical to NTS (but this would be removing a level of topology hierarchy). This would be achievable just by writing a snitch to place hosts in their own rack.
> A better solution would be to add an extra level of hierarchy to NTS so that it still supported DC & rack, and IP would be the new level at the bottom of the hierarchy. The logic would remain largely the same.
> I would very much like to build in Peter Schuller's notion of Distribution Factor (as described in http://www.mail-archive.com/dev@cassandra.apache.org/msg03844.html). This requires a method of defining a "replica set" for each host and then treating it in a similar way to a DC (ie. RF replicas are chosen from that set, instead of from the whole cluster). 

[jira] [Commented] (CASSANDRA-4123) vnodes aware Replication Strategy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409665#comment-13409665 ] 

Jonathan Ellis commented on CASSANDRA-4123:
-------------------------------------------

Got it, thanks.
                

[jira] [Commented] (CASSANDRA-4123) vnodes aware Replication Strategy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401659#comment-13401659 ] 

Jonathan Ellis commented on CASSANDRA-4123:
-------------------------------------------

bq. A better solution would be to add an extra level of hierarchy to NTS so that it still supported DC & rack, and IP would be the new level at the bottom of the hierarchy

Agreed--with the caveat that if we don't have enough racks to satisfy the replica count, we shrug and throw multiple replicas on a rack.  But if we don't have enough hosts, I think that should be fatal (as it is now, in the vnode-less world).
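
The policy described in this comment can be sketched roughly as follows. This is a simplified two-pass illustration, not the NTS code: the RackAwarePlacement class, the string host names and the rackOf map are all hypothetical, and a single datacenter is assumed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: prefer distinct racks, fall back to sharing racks, fail without enough hosts. */
final class RackAwarePlacement {
    /**
     * @param ringOrder hosts in the order their (v)node tokens are met when walking
     *                  the ring; a host with several vnodes appears several times
     * @param rackOf    rack name for every host in ringOrder
     * @param rf        replication factor
     */
    static List<String> selectReplicas(List<String> ringOrder,
                                       Map<String, String> rackOf,
                                       int rf) {
        Set<String> distinctHosts = new HashSet<>(ringOrder);
        if (distinctHosts.size() < rf) {
            // Too few hosts is treated as fatal.
            throw new IllegalStateException("need " + rf + " hosts, have " + distinctHosts.size());
        }
        Set<String> replicas = new LinkedHashSet<>();
        Set<String> seenRacks = new HashSet<>();
        // Pass 1: take at most one host per rack.
        for (String host : ringOrder) {
            if (replicas.size() == rf) {
                break;
            }
            if (seenRacks.add(rackOf.get(host))) {
                replicas.add(host);
            }
        }
        // Pass 2: too few racks, so allow several replicas on one rack,
        // but still never the same host twice (the set ignores repeats).
        for (String host : ringOrder) {
            if (replicas.size() == rf) {
                break;
            }
            replicas.add(host);
        }
        return new ArrayList<>(replicas);
    }
}
```

With hosts a and b on one rack and c and d on another, a ring order of a, a, b, c, d and RF=3 yields a and c in the first pass and b in the second.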

[jira] [Comment Edited] (CASSANDRA-4123) vnodes aware Replication Strategy

Posted by "Sam Overton (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402269#comment-13402269 ] 

Sam Overton edited comment on CASSANDRA-4123 at 6/27/12 2:52 PM:
-----------------------------------------------------------------

I should point out that the existing NTS (and also the modified implementation in CASSANDRA-3881) already prevents replicas from being placed on the same host. SimpleStrategy is also fixed in the CASSANDRA-4121 patch. The first couple of paragraphs in the ticket description could probably be removed.

The main issue left in this ticket is the notion of Distribution Factor and replica sets, which aims to reduce the chance that hosts failing simultaneously hold replicas of the same data.


                

[jira] [Commented] (CASSANDRA-4123) vnodes aware Replication Strategy

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409646#comment-13409646 ] 

Jonathan Ellis commented on CASSANDRA-4123:
-------------------------------------------

bq. existing NTS (and also the modified implementation in CASSANDRA-3881) already prevent replicas from being placed on the same host

How's that?  Certainly this block from NTS looks like it will blithely assign a row to multiple vnodes on the same host:

{code}
            // can we skip checking the rack?
            if (seenRacks.get(dc).size() == racks.get(dc).keySet().size())
            {
                dcReplicas.get(dc).add(ep);
                replicas.add(ep);
            }
{code}
                

[jira] [Commented] (CASSANDRA-4123) vnodes aware Replication Strategy

Posted by "Sam Overton (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409655#comment-13409655 ] 

Sam Overton commented on CASSANDRA-4123:
----------------------------------------

{noformat}
        Set<InetAddress> replicas = new HashSet<InetAddress>();
        // replicas we have found in each DC
        Map<String, Set<InetAddress>> dcReplicas = new HashMap<String, Set<InetAddress>>(datacenters.size());
{noformat}

They're both sets, and we're using the .size() of the set to establish whether we have found sufficient replicas, so whilst we do add the same endpoint twice, it is effectively a no-op.
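
A small self-contained illustration of that point, using plain strings in place of InetAddress; the VnodeDedupDemo class and its method are hypothetical and only mimic the "walk the ring until the set is big enough" pattern:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical demo: set semantics de-duplicate hosts that own several vnodes. */
final class VnodeDedupDemo {
    /** Walks the ring until rf distinct endpoints have been collected. */
    static Set<String> firstDistinctEndpoints(List<String> ringOrder, int rf) {
        Set<String> replicas = new LinkedHashSet<>();
        for (String endpoint : ringOrder) {
            // The termination check uses size(), which only counts distinct hosts.
            if (replicas.size() >= rf) {
                break;
            }
            // Adding an endpoint that is already present leaves the set unchanged.
            replicas.add(endpoint);
        }
        return replicas;
    }
}
```

For a ring order of A, A, B, A, C and RF=3 the walk continues past the repeated A entries and returns the three distinct hosts A, B and C.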
                