Posted to commits@cassandra.apache.org by "Jun Rao (JIRA)" <ji...@apache.org> on 2009/05/14 20:42:45 UTC

[jira] Created: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Blocking insert may have fewer responses than replication factor
----------------------------------------------------------------

                 Key: CASSANDRA-180
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.3
            Reporter: Jun Rao
            Assignee: Jun Rao
            Priority: Minor


Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-180:
-------------------------------------

    Attachment: 180-v3.patch

v3 checks for the right quorum count.

> Blocking insert may have fewer responses than replication factor
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>            Priority: Minor
>         Attachments: 180-v2.patch, 180-v3.patch, issue180.patchv1
>
>
> Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.



[jira] Commented: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Posted by "Sandeep Tata (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709884#action_12709884 ] 

Sandeep Tata commented on CASSANDRA-180:
----------------------------------------

+1 for v3

We can now get rid of "// TODO: throw a thrift exception if we do not have N nodes"

> Blocking insert may have fewer responses than replication factor
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>            Priority: Minor
>         Attachments: 180-v2.patch, 180-v3.patch, issue180.patchv1
>
>
> Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.



[jira] Updated: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated CASSANDRA-180:
------------------------------

    Attachment: issue180.patchv1

Attached a fix. The expected number of responses is now set based on the number of live nodes identified.

> Blocking insert may have fewer responses than replication factor
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>            Priority: Minor
>         Attachments: issue180.patchv1
>
>
> Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.



[jira] Commented: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710109#action_12710109 ] 

Hudson commented on CASSANDRA-180:
----------------------------------

Integrated in Cassandra #78 (See [http://hudson.zones.apache.org/hudson/job/Cassandra/78/])
    check for enough endpoints before starting a quorum wait.
patch by jbellis; reviewed by Jun Rao and Sandeep Tata for 


> Blocking insert may have fewer responses than replication factor
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>            Priority: Minor
>         Attachments: 180-v2.patch, 180-v3.patch, issue180.patchv1
>
>
> Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.



[jira] Commented: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709976#action_12709976 ] 

Jun Rao commented on CASSANDRA-180:
-----------------------------------

v3 looks good to me too.


> Blocking insert may have fewer responses than replication factor
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>            Priority: Minor
>         Attachments: 180-v2.patch, 180-v3.patch, issue180.patchv1
>
>
> Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.



[jira] Updated: (CASSANDRA-180) Blocking insert may have fewer responses than replication factor

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-180:
-------------------------------------

    Attachment: 180-v2.patch

Reducing the responseCount is going to break things, since that's used for determining when a successful quorum has been reached.

Say you have a 5-node cluster and a replication factor of 3.  But there is a network split and the node a client is talking to can only see itself.  With your patch it would start up a QRH with an RC of 1, get the ack, and report that the write was successful.  But we've just silently violated our promise of quorum consistency (at least 2 nodes).
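
As a rough illustration of the arithmetic in this scenario, the sketch below assumes that quorum means a simple majority of the replication factor (RF/2 + 1, which gives 2 for RF = 3 and so matches the "at least 2 nodes" figure). The class and method names are hypothetical and are not taken from the Cassandra code base.

```java
// Hypothetical helper, for illustration only; not part of any attached patch.
final class QuorumMath {
    // Assumption: quorum is a simple majority of the replicas.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True only when the number of acks received reaches the quorum for
    // the configured replication factor, regardless of how many nodes
    // the coordinator can currently see.
    static boolean meetsQuorum(int acks, int replicationFactor) {
        return acks >= quorum(replicationFactor);
    }
}
```

With a replication factor of 3, a single ack does not satisfy `meetsQuorum`, which is why lowering the expected response count to the number of visible nodes would report success too early.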

The existing code is optimal for when a write succeeds -- as soon as a quorum is reached it returns, w/o waiting for any more responses that may or may not come.  The only problem is that it will wait for timeout when it is impossible for a write to reach quorum b/c there are not enough nodes.  I've attached a patch that addresses that problem.  What do you think?
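
The idea described here (refuse to start a wait that cannot succeed, otherwise return as soon as quorum is reached) might be sketched roughly as follows. This is a minimal sketch and not the contents of the attached patch: the class name, the use of `CountDownLatch`, the `IllegalStateException`, and the majority formula are all assumptions made for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and exception type are illustrative assumptions.
final class QuorumWaitSketch {
    private final CountDownLatch pending;

    // Fail fast: if fewer endpoints are live than quorum requires,
    // do not start a wait that could only end in a timeout.
    static QuorumWaitSketch start(int liveEndpoints, int replicationFactor) {
        int needed = replicationFactor / 2 + 1; // assumed majority quorum
        if (liveEndpoints < needed) {
            throw new IllegalStateException(
                "only " + liveEndpoints + " live endpoint(s), quorum needs " + needed);
        }
        return new QuorumWaitSketch(needed);
    }

    private QuorumWaitSketch(int needed) {
        this.pending = new CountDownLatch(needed);
    }

    // Called once per successful ack. There is no nack counterpart,
    // since nodes only acknowledge success.
    void ack() {
        pending.countDown();
    }

    // Returns true as soon as quorum acks have arrived, without waiting
    // for further responses; returns false on timeout or interruption.
    boolean await(long timeoutMillis) {
        try {
            return pending.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In the network-split example, `start(1, 3)` would be rejected immediately instead of blocking until the timeout, while `start(3, 3)` returns from `await` after the second ack.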

(Note that we don't need to try to solve the problem of "what if at the beginning of a write there are enough nodes to reach quorum, but partway through we get a nack from a node making it impossible" b/c nodes only ack success, they don't nack failure.  And making them do so adds more complication than it is worth for such an uncommon case.)

> Blocking insert may have fewer responses than replication factor
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-180
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-180
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.3
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>            Priority: Minor
>         Attachments: 180-v2.patch, issue180.patchv1
>
>
> Currently, block_insert always assumes the number of responses equals the replication factor. However, for a small cluster (e.g., 1 node) and/or when failures occur, the number of responses could be fewer than the replication factor.
