You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by "maxim-chn (via GitHub)" <gi...@apache.org> on 2023/06/13 19:51:27 UTC
[GitHub] [cassandra] maxim-chn opened a new pull request, #2415: Replica nodes for batch operation are picked by response time
maxim-chn opened a new pull request, #2415:
URL: https://github.com/apache/cassandra/pull/2415
[CASSANDRA-18120](https://issues.apache.org/jira/browse/CASSANDRA-18120). Random replica node pick for batch operations was changed to a pick by the nodes' response time.
Motivation:
- Avoid any potential followup change by the end-users.
- Minimize scope of the change.
- Do not introduce expensive logic.
The idea is to use readily-available interpretation of the potential replica node's communication time as a proxy indicator.
It indicates if the node will take longer to complete batch operation than other nodes.
Value [PHI](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.80.7427&rep=rep1&type=pdf) is used as an indicator. It is calculated during gossip. The longer it takes node to complete gossip operation, the higher PHI value gets.
The assumption here is that longer gossip completion surely indicates a longer batch operation completion. Additional PHI description is available at [CASSANDRA-2597](https://issues.apache.org/jira/browse/CASSANDRA-2597)'s comment section.
patch by Maxim Chanturiay; reviewed by <Reviewers> for CASSANDRA-18120
Co-authored-by: Dan Sarisky <email1>
The [Cassandra Jira](https://issues.apache.org/jira/projects/CASSANDRA/issues/)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org
[GitHub] [cassandra] smiklosovic commented on a diff in pull request #2415: Replica nodes for batch operation are picked by response time
Posted by "smiklosovic (via GitHub)" <gi...@apache.org>.
smiklosovic commented on code in PR #2415:
URL: https://github.com/apache/cassandra/pull/2415#discussion_r1247045856
##########
src/java/org/apache/cassandra/gms/IFailureDetector.java:
##########
@@ -37,6 +40,28 @@ public interface IFailureDetector
*/
public boolean isAlive(InetAddressAndPort ep);
+ /**
+ * Sorts the endpoints based on their response time.
+ * Nodes at smaller indexes have faster response time than nodes at higher indexes.
+ * @param List<InetAddressAndPort> endpoints.
+ * @return void.
+ */
+ public default void sortEndpointsByResponseTime(List<InetAddressAndPort> endpoints)
Review Comment:
@maxim-chn does this mean that if I call this method, it will throw an exception because `isEpAResponseTimeSlowerThanEpB` is not implemented? So by default it will throw exception?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org
[GitHub] [cassandra] smiklosovic commented on a diff in pull request #2415: Replica nodes for batch operation are picked by response time
Posted by "smiklosovic (via GitHub)" <gi...@apache.org>.
smiklosovic commented on code in PR #2415:
URL: https://github.com/apache/cassandra/pull/2415#discussion_r1247045856
##########
src/java/org/apache/cassandra/gms/IFailureDetector.java:
##########
@@ -37,6 +40,28 @@ public interface IFailureDetector
*/
public boolean isAlive(InetAddressAndPort ep);
+ /**
+ * Sorts the endpoints based on their response time.
+ * Nodes at smaller indexes have faster response time than nodes at higher indexes.
+ * @param List<InetAddressAndPort> endpoints.
+ * @return void.
+ */
+ public default void sortEndpointsByResponseTime(List<InetAddressAndPort> endpoints)
Review Comment:
@maxim-chn do this mean that if I call this method, it will throw an exception because `isEpAResponseTimeSlowerThanEpB` is not implemented? So by default it will throw exception?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org
[GitHub] [cassandra] maxim-chn commented on a diff in pull request #2415: Replica nodes for batch operation are picked by response time
Posted by "maxim-chn (via GitHub)" <gi...@apache.org>.
maxim-chn commented on code in PR #2415:
URL: https://github.com/apache/cassandra/pull/2415#discussion_r1249247460
##########
src/java/org/apache/cassandra/gms/IFailureDetector.java:
##########
@@ -37,6 +40,28 @@ public interface IFailureDetector
*/
public boolean isAlive(InetAddressAndPort ep);
+ /**
+ * Sorts the endpoints based on their response time.
+ * Nodes at smaller indexes have faster response time than nodes at higher indexes.
+ * @param List<InetAddressAndPort> endpoints.
+ * @return void.
+ */
+ public default void sortEndpointsByResponseTime(List<InetAddressAndPort> endpoints)
Review Comment:
@smiklosovic , yes indeed.
The motivation is to encourage any future implementation of `IFailureDetector` (besides `FailureDetector`) to take into consideration how 2 endpoints are compared in terms of response time.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org
Re: [PR] Replica nodes for batch operation are picked by response time [cassandra]
Posted by "shunsaker (via GitHub)" <gi...@apache.org>.
shunsaker commented on code in PR #2415:
URL: https://github.com/apache/cassandra/pull/2415#discussion_r1484288959
##########
src/java/org/apache/cassandra/locator/ReplicaPlans.java:
##########
@@ -302,16 +302,30 @@ public static Collection<InetAddressAndPort> filterBatchlogEndpoints(String loca
}
else
{
- racks = Lists.newArrayList(validated.keySet());
- shuffle.accept((List<?>) racks);
+ ArrayList<String> sortedRacks = Lists.newArrayList(validated.keySet());
+ sortedRacks.sort((String rackA, String rackB) -> {
+ List<InetAddressAndPort> rackMembersA = validated.get(rackA);
+ List<InetAddressAndPort> rackMembersB = validated.get(rackB);
+
+ if (rackMembersA == null || rackMembersA.isEmpty())
+ return 1;
+ if (rackMembersB == null || rackMembersB.isEmpty())
+ return -1;
+
+ endpointsSorter.accept(rackMembersA);
+ endpointsSorter.accept(rackMembersB);
+
+ return enpointsCompare.test(rackMembersA.get(0), rackMembersB.get(0)) ? 1 : -1;
+ });
+ racks = sortedRacks;
}
// grab a random member of up to two racks
List<InetAddressAndPort> result = new ArrayList<>(2);
for (String rack : Iterables.limit(racks, 2))
{
List<InetAddressAndPort> rackMembers = validated.get(rack);
- result.add(rackMembers.get(indexPicker.apply(rackMembers.size())));
+ result.add(rackMembers.get(0)); // Pick the node with fastest response time in the rack
Review Comment:
If there are exactly 2 racks, then I don't think the nodes have been sorted
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org