Posted to commits@cassandra.apache.org by "Juho Mäkinen (JIRA)" <ji...@apache.org> on 2010/10/18 11:35:24 UTC
[jira] Created: (CASSANDRA-1627) Better QUORUM calculation algorithm
Better QUORUM calculation algorithm
------------------------------------
Key: CASSANDRA-1627
URL: https://issues.apache.org/jira/browse/CASSANDRA-1627
Project: Cassandra
Issue Type: Bug
Components: API, Core
Reporter: Juho Mäkinen
The current QUORUM calculation algorithm is problematic on some setups, especially with ReplicationFactor 3 (RF=3).
The current algorithm is "N / 2 + 1" with the result rounded to the nearest integer, so with RF=3 the QUORUM is also 3. A discussion with ntelford and ron_r produced a better suggestion: use FLOOR(N / 2 + 1). This gives a QUORUM of 2 with RF=3 and also lowers the QUORUM value for odd RF values above 4, which makes cluster operations faster while still satisfying the QUORUM requirement.
Here's a table showing current method and the new suggestion:
||RF||1||2||3||4||5||6||7||8||9||10||
|round(N / 2 + 1)|2|2|3|3|4|4|5|5|6|6|
|FLOOR(N/2 + 1)|1|2|2|3|3|4|4|5|5|6|
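The two rows of the table above can be reproduced with a short Java sketch. It assumes, as the description does, that the "current" method evaluates N / 2 + 1 in floating point and rounds to the nearest integer. The class and method names (QuorumTable, roundedQuorum, flooredQuorum) are illustrative only and are not taken from the Cassandra code base.

```java
// Sketch only: reproduces the two table rows for RF = 1..10.
final class QuorumTable {
    // Quorum as the issue describes the current method:
    // N / 2 + 1 evaluated in floating point, then rounded to the nearest integer.
    static int roundedQuorum(int rf) {
        return (int) Math.round(rf / 2.0 + 1);
    }

    // Quorum as proposed in the issue: FLOOR(N / 2 + 1).
    static int flooredQuorum(int rf) {
        return (int) Math.floor(rf / 2.0 + 1);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 10; rf++) {
            System.out.printf("RF=%d round=%d floor=%d%n",
                    rf, roundedQuorum(rf), flooredQuorum(rf));
        }
    }
}
```

The two methods differ only for odd RF values, which is where the rounded variant asks for one extra replica.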
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Resolved: (CASSANDRA-1627) Better QUORUM calculation algorithm
Posted by "Juho Mäkinen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Juho Mäkinen resolved CASSANDRA-1627.
-------------------------------------
Resolution: Invalid
12:53 < Garo_> inside QuorumResponseHandler: public int determineBlockFor(ConsistencyLevel consistencyLevel, String table)
12:53 < Garo_> notice the return type is int
12:53 < Garo_> it has a switch which returns on QUORUM: return (DatabaseDescriptor.getReplicationFactor(table) / 2) + 1;
12:54 < Garo_> so basically cassandra already uses floor(N/2 + 1)
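The point made in the IRC excerpt above rests on Java integer division: dividing one int by another discards the fractional part, so no rounding up can occur. A minimal sketch follows. The method quorumFor is a hypothetical stand-in for the QUORUM branch quoted above, and it takes the replication factor directly instead of calling DatabaseDescriptor.

```java
// Sketch only: shows that int arithmetic already yields floor(N / 2 + 1).
final class IntegerQuorum {
    // Hypothetical stand-in for the QUORUM branch quoted in the IRC log.
    static int quorumFor(int replicationFactor) {
        // int / int truncates toward zero, so for a positive replication
        // factor this is floor(RF / 2) + 1, which equals floor(RF / 2 + 1).
        return (replicationFactor / 2) + 1;
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 10; rf++) {
            int floored = (int) Math.floor(rf / 2.0 + 1);
            System.out.println("RF=" + rf
                    + " int-quorum=" + quorumFor(rf)
                    + " floor=" + floored);
        }
    }
}
```

For every RF from 1 to 10 the two printed values agree, which is consistent with the FLOOR row of the table in the issue description.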
> EDIT: As pcmanus pointed out, the N/2+1 calculation does return 2 when N=2, so the rounding error does not occur here. I need to dig into the problem further, because this suggestion originated when my cluster (RF=3) returned UnavailableException for QUORUM operations while one node was down.
[jira] Commented: (CASSANDRA-1627) Better QUORUM calculation algorithm
Posted by "Nicholas Telford (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922033#action_12922033 ]
Nicholas Telford commented on CASSANDRA-1627:
---------------------------------------------
To clarify, this would fix two major availability issues:
- Clusters with RF=1 not being able to use QUORUM (at present, the number of nodes required for QUORUM exceeds RF, so such operations would always fail)
- Clusters with an odd RF not being able to run QUORUM operations if N / 2 - 1 nodes are down.
The second point is especially important as clusters with RF=3 currently can't run QUORUM operations when a single node is down (e.g. during a rolling restart).
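Both points can be checked with a small Java sketch that computes how many replicas may be down while a QUORUM operation can still succeed. It assumes the rounded formula from the issue description is what the cluster actually uses. The names QuorumTolerance, roundedQuorum, flooredQuorum and toleratedDown are illustrative and do not come from the Cassandra code base.

```java
// Sketch only: replicas that may be down while QUORUM is still reachable.
final class QuorumTolerance {
    // Assumed current behaviour per the issue description: round(N / 2 + 1).
    static int roundedQuorum(int rf) {
        return (int) Math.round(rf / 2.0 + 1);
    }

    // Proposed behaviour: FLOOR(N / 2 + 1), written with int division.
    static int flooredQuorum(int rf) {
        return rf / 2 + 1;
    }

    // Number of replicas that can be down; a negative value means the
    // quorum exceeds RF and can never be reached.
    static int toleratedDown(int rf, int quorum) {
        return rf - quorum;
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 5; rf++) {
            System.out.printf("RF=%d rounded: %d down, floored: %d down%n",
                    rf,
                    toleratedDown(rf, roundedQuorum(rf)),
                    toleratedDown(rf, flooredQuorum(rf)));
        }
    }
}
```

Under the rounded formula RF=1 gives a negative tolerance and RF=3 tolerates no failed replica, while the floored formula tolerates zero and one respectively.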
[jira] Updated: (CASSANDRA-1627) Better QUORUM calculation algorithm
Posted by "Juho Mäkinen (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Juho Mäkinen updated CASSANDRA-1627:
------------------------------------
Description:
The current QUORUM calculation algorithm is problematic on some setups, especially with ReplicationFactor 3 (RF=3).
The current algorithm is "N / 2 + 1" with the result rounded to the nearest integer, so with RF=3 the QUORUM is also 3. A discussion with ntelford and ron_r produced a better suggestion: use FLOOR(N / 2 + 1). This gives a QUORUM of 2 with RF=3 and also lowers the QUORUM value for odd RF values above 4, which makes cluster operations faster while still satisfying the QUORUM requirement.
Here's a table showing current method and the new suggestion:
||RF||1||2||3||4||5||6||7||8||9||10||
|round(N / 2 + 1)|2|2|3|3|4|4|5|5|6|6|
|FLOOR(N/2 + 1)|1|2|2|3|3|4|4|5|5|6|
EDIT: As pcmanus pointed out, the N/2+1 calculation does return 2 when N=2, so the rounding error does not occur here. I need to dig into the problem further, because this suggestion originated when my cluster (RF=3) returned UnavailableException for QUORUM operations while one node was down.
was:
The current QUORUM calculation algorithm is problematic on some setups, especially with ReplicationFactor 3 (RF=3).
The current algorithm is "N / 2 + 1" with the result rounded to the nearest integer, so with RF=3 the QUORUM is also 3. A discussion with ntelford and ron_r produced a better suggestion: use FLOOR(N / 2 + 1). This gives a QUORUM of 2 with RF=3 and also lowers the QUORUM value for odd RF values above 4, which makes cluster operations faster while still satisfying the QUORUM requirement.
Here's a table showing current method and the new suggestion:
||RF||1||2||3||4||5||6||7||8||9||10||
|round(N / 2 + 1)|2|2|3|3|4|4|5|5|6|6|
|FLOOR(N/2 + 1)|1|2|2|3|3|4|4|5|5|6|
Priority: Minor (was: Major)