Posted to commits@cassandra.apache.org by "Juho Mäkinen (JIRA)" <ji...@apache.org> on 2010/10/18 11:35:24 UTC

[jira] Created: (CASSANDRA-1627) Better QUORUM calculation algorithm

Better QUORUM calculation algorithm 
------------------------------------

                 Key: CASSANDRA-1627
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1627
             Project: Cassandra
          Issue Type: Bug
          Components: API, Core
            Reporter: Juho Mäkinen


The current QUORUM calculation algorithm is problematic on some setups, especially when using a replication factor of 3 (RF=3).

The current algorithm is "N / 2 + 1" with the result rounded to the nearest integer, so with RF=3 the QUORUM is also 3. Discussion with ntelford and ron_r produced a better suggestion: use FLOOR(N / 2 + 1). This gives a QUORUM of 2 with RF=3 and also lowers the QUORUM value for odd RF values above 4, which speeds up cluster operations while still maintaining the QUORUM requirement.

Here's a table showing the current method and the new suggestion:
||RF||1||2||3||4||5||6||7||8||9||10||
|round(N / 2 + 1)|2|2|3|3|4|4|5|5|6|6|
|FLOOR(N/2 + 1)|1|2|2|3|3|4|4|5|5|6|


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (CASSANDRA-1627) Better QUORUM calculation algorithm

Posted by "Juho Mäkinen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juho Mäkinen resolved CASSANDRA-1627.
-------------------------------------

    Resolution: Invalid

12:53 < Garo_> inside QuorumResponseHandler: public int determineBlockFor(ConsistencyLevel consistencyLevel, String table)
12:53 < Garo_> notice the return type is int
12:53 < Garo_> it has a switch which returns on QUORUM: return (DatabaseDescriptor.getReplicationFactor(table) / 2) + 1;
12:54 < Garo_> so basically cassandra already uses floor(N/2 + 1)
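
The point made in the log above can be checked with a small sketch. This is only an illustration: the class and method names below are hypothetical and are not part of Cassandra. It compares the integer-division expression quoted from determineBlockFor with the round-to-nearest behaviour the issue description assumed.

```java
// Hypothetical sketch; names are illustrative, not Cassandra's own API.
class QuorumSketch {
    // Same shape as the expression quoted from determineBlockFor: both
    // operands are ints, so '/' truncates and the result is floor(N / 2 + 1).
    static int quorumIntDivision(int replicationFactor) {
        return (replicationFactor / 2) + 1;
    }

    // What the issue description assumed: floating-point division followed
    // by rounding to the nearest integer (Math.round rounds halves up).
    static int quorumRounded(int replicationFactor) {
        return (int) Math.round(replicationFactor / 2.0 + 1);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 10; rf++) {
            System.out.printf("RF=%d int-division=%d rounded=%d%n",
                    rf, quorumIntDivision(rf), quorumRounded(rf));
        }
    }
}
```

With int operands the two methods differ only for odd RF, which is why the rounding problem described in the issue does not occur in the quoted code.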




[jira] Commented: (CASSANDRA-1627) Better QUORUM calculation algorithm

Posted by "Nicholas Telford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922033#action_12922033 ] 

Nicholas Telford commented on CASSANDRA-1627:
---------------------------------------------

To clarify, this would fix two major availability issues:
- Clusters with RF=1 not being able to use QUORUM (at present, the number of QUORUM nodes exceeds RF, so would always fail)
- Clusters with an odd RF not being able to run QUORUM operations if (N - 1) / 2 nodes are down.

The second point is especially important as clusters with RF=3 currently can't run QUORUM operations when a single node is down (e.g. during a rolling restart).
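
To make the availability arithmetic concrete: a QUORUM operation can proceed as long as the number of live replicas is at least the quorum size, so the number of replicas that may be down is RF minus the quorum. The sketch below assumes the integer-division formula (RF / 2) + 1 quoted from determineBlockFor elsewhere in this thread; the class and method names are hypothetical.

```java
// Hypothetical sketch; names are illustrative, not Cassandra's own API.
class QuorumAvailabilitySketch {
    // Assumed quorum formula: integer division, i.e. floor(RF / 2 + 1).
    static int quorum(int replicationFactor) {
        return (replicationFactor / 2) + 1;
    }

    // Largest number of replicas that can be down while QUORUM still succeeds.
    static int maxReplicasDown(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    // True if enough replicas remain alive to satisfy QUORUM.
    static boolean quorumAvailable(int replicationFactor, int replicasDown) {
        return replicationFactor - replicasDown >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        for (int rf : new int[] {1, 3, 5}) {
            System.out.printf("RF=%d quorum=%d maxDown=%d%n",
                    rf, quorum(rf), maxReplicasDown(rf));
        }
    }
}
```

Under this assumption RF=1 can use QUORUM with no replicas down, and RF=3 tolerates a single replica being down, for example during a rolling restart.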



[jira] Updated: (CASSANDRA-1627) Better QUORUM calculation algorithm

Posted by "Juho Mäkinen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juho Mäkinen updated CASSANDRA-1627:
------------------------------------

    Description: 
The current QUORUM calculation algorithm is problematic on some setups, especially when using a replication factor of 3 (RF=3).

The current algorithm is "N / 2 + 1" with the result rounded to the nearest integer, so with RF=3 the QUORUM is also 3. Discussion with ntelford and ron_r produced a better suggestion: use FLOOR(N / 2 + 1). This gives a QUORUM of 2 with RF=3 and also lowers the QUORUM value for odd RF values above 4, which speeds up cluster operations while still maintaining the QUORUM requirement.

Here's a table showing the current method and the new suggestion:
||RF||1||2||3||4||5||6||7||8||9||10||
|round(N / 2 + 1)|2|2|3|3|4|4|5|5|6|6|
|FLOOR(N/2 + 1)|1|2|2|3|3|4|4|5|5|6|

EDIT: as pcmanus pointed out, the N/2+1 calculation indeed returns 2 when N=2, so the rounding error doesn't occur here. I'll need to dig into the problem further, because this suggestion originated when my cluster (I'm using RF=3) returned UnavailableException for QUORUM operations while one node was down.


       Priority: Minor  (was: Major)
