Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2010/02/24 18:45:28 UTC

[jira] Created: (CASSANDRA-833) fix consistencylevel during bootstrap

fix consistencylevel during bootstrap
-------------------------------------

                 Key: CASSANDRA-833
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-833
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.5
            Reporter: Jonathan Ellis
            Assignee: Jonathan Ellis
             Fix For: 0.6


As originally designed, bootstrapping nodes should *always* get *all* writes under any consistencylevel, so that when bootstrap finishes the operator can run cleanup on the old nodes without fear of losing data.

But if a bootstrap operation fails or is aborted, that design means all writes will fail until the ex-bootstrapping node is decommissioned. So starting in CASSANDRA-722, we just ignore dead nodes in consistencylevel calculations.

But this breaks the original design. CASSANDRA-822 adds a partial fix (just adding bootstrap targets into the RF targets and hinting normally), but this is still broken under certain conditions. The real fix is to consider consistencylevel for two sets of nodes:

  1. the RF targets as currently existing (no pending ranges)
  2. the RF targets as they will exist after all movement ops are done

If we satisfy CL for both sets then we will always be in good shape.
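
The two-set check described above can be sketched roughly as follows. This is only an illustration, not Cassandra code: `required_acks`, `write_satisfies_both`, the level names, and the node names are all hypothetical, and the sketch assumes a write is judged purely by which nodes acknowledged it.

```python
from typing import Set


def required_acks(level: str, replica_count: int) -> int:
    """Acks needed for a (hypothetical) consistency level name."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replica_count // 2 + 1
    if level == "ALL":
        return replica_count
    raise ValueError(f"unknown consistency level: {level}")


def write_satisfies_both(level: str,
                         current_targets: Set[str],
                         future_targets: Set[str],
                         acked: Set[str]) -> bool:
    """True only if the acks meet the level for BOTH replica sets.

    current_targets: RF targets as they exist now (no pending ranges).
    future_targets:  RF targets after all movement ops are done.
    """
    need_current = required_acks(level, len(current_targets))
    need_future = required_acks(level, len(future_targets))
    return (len(acked & current_targets) >= need_current
            and len(acked & future_targets) >= need_future)


# Assumed scenario: node D is bootstrapping and will take over from A.
current = {"A", "B", "C"}
future = {"B", "C", "D"}
```

With these sets, a QUORUM write acked by A and B satisfies the current set but not the future one, so it would be rejected; a write acked by B and C satisfies both.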

I'm not sure if we can easily calculate set 2 from the current TokenMetadata, though.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (CASSANDRA-833) fix consistencylevel during bootstrap

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-833:
----------------------------------------

    Assignee: Jaakko Laine  (was: Jonathan Ellis)




[jira] Commented: (CASSANDRA-833) fix consistencylevel during bootstrap

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840438#action_12840438 ] 

Jonathan Ellis commented on CASSANDRA-833:
------------------------------------------

To clarify: the #722 fix breaks the design because a bootstrapping node that goes "down" temporarily but then completes bootstrap will not actually have all the writes that happened during bootstrap.




[jira] Commented: (CASSANDRA-833) fix consistencylevel during bootstrap

Posted by "Jaakko Laine (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12841870#action_12841870 ] 

Jaakko Laine commented on CASSANDRA-833:
----------------------------------------

This issue is not only related to bootstrapping, since nodes leaving the ring also cause pending ranges. If a node does not complete its leaving operation properly, obsolete pending ranges will be left in the metadata.

(2) above is actually almost exactly how pending ranges are calculated: all current move operations are treated as finished, and pending ranges are calculated according to the new natural endpoints for the ranges in question.
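
As a rough illustration of that calculation, the sketch below applies all in-flight moves to a copy of the ring and takes the difference between the future and current natural endpoints. It is a simplification under stated assumptions: replica placement is "first RF distinct nodes clockwise", it works per token rather than per range, and the names `natural_endpoints` and `pending_endpoints` are made up for this example and are not the TokenMetadata API.

```python
from bisect import bisect_left
from typing import Dict, List, Set


def natural_endpoints(ring: Dict[int, str], token: int, rf: int) -> List[str]:
    """First rf distinct nodes clockwise from token (assumed placement rule).

    ring maps each node's token to its name and must not be empty.
    """
    tokens = sorted(ring)
    start = bisect_left(tokens, token) % len(tokens)
    result: List[str] = []
    for i in range(len(tokens)):
        node = ring[tokens[(start + i) % len(tokens)]]
        if node not in result:
            result.append(node)
        if len(result) == rf:
            break
    return result


def pending_endpoints(ring: Dict[int, str],
                      joining: Dict[int, str],
                      leaving: Set[str],
                      token: int,
                      rf: int) -> Set[str]:
    """Nodes that will replicate token once all moves finish but do not yet."""
    # Pretend every movement op has completed: drop leavers, add joiners.
    future = {t: n for t, n in ring.items() if n not in leaving}
    future.update(joining)
    current = set(natural_endpoints(ring, token, rf))
    return set(natural_endpoints(future, token, rf)) - current


ring = {0: "A", 100: "B", 200: "C", 300: "D"}
```

For example, with RF 2 and a node E bootstrapping at token 50, the pending endpoint for token 10 is E; if B is leaving instead, it is D.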

IMHO this is not directly related to bootstrapping, but to the fact that node movement increases the quorum, and because node movement is uncertain, there is a bigger chance that something goes wrong and a quorum of nodes cannot be reached.
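
A small arithmetic sketch of the quorum point, assuming pending endpoints are simply counted as extra write targets (that counting rule is an assumption made for illustration; the thread does not spell out how the write path computes it, and `quorum` is a hypothetical helper):

```python
def quorum(target_count: int) -> int:
    """Majority of the given number of write targets."""
    return target_count // 2 + 1


rf = 3
# Required acks with 0, 1 and 2 pending endpoints added to the RF targets.
needed = [quorum(rf + pending) for pending in (0, 1, 2)]
```

Under that assumption, one pending endpoint raises the required acks from 2 to 3, and the extra target is exactly the kind of node (bootstrapping or leaving) that is most likely to be unreachable.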





[jira] Updated: (CASSANDRA-833) fix consistencylevel during bootstrap

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-833:
-------------------------------------

    Fix Version/s:     (was: 0.6)
                   0.7

