You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/06/29 16:20:43 UTC

[GitHub] [kafka] jeffkbkim opened a new pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

jeffkbkim opened a new pull request #10940:
URL: https://github.com/apache/kafka/pull/10940


   in getListOffsetsCalls, we rebuild the cluster snapshot for every topic partition. instead, we should reuse a snapshot. this will reduce the time complexity from O(n^2) to O(n).
   
   for manual testing (used AK 2.8), i've passed in a map of 6K topic partitions to listOffsets
   
   without snapshot reuse:
   duration of building futures from metadata response: **15582** milliseconds
   total duration of listOffsets: **15743** milliseconds
   
   with reuse:
   duration of building futures from metadata response: **24** milliseconds
   total duration of listOffsets: **235** milliseconds
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ijuma commented on a change in pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
ijuma commented on a change in pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#discussion_r660800092



##########
File path: clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java
##########
@@ -4200,6 +4200,7 @@ public ListOffsetsResult listOffsets(Map<TopicPartition, OffsetSpec> topicPartit
                                            Map<TopicPartition, KafkaFutureImpl<ListOffsetsResultInfo>> futures) {
 
         MetadataResponse mr = context.response().orElseThrow(() -> new IllegalStateException("No Metadata response"));
+        Cluster clusterSnapshot = mr.cluster();

Review comment:
       yeah, I was thinking the same.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
showuon commented on pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#issuecomment-872027365


   > this will reduce the time complexity from O(n^2) to O(n).
   
   I think we reduce the time complexity from O(n) to O(1). FYI~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] jeffkbkim commented on pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
jeffkbkim commented on pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#issuecomment-872460573


   i have renamed cluster() to buildCluster(). i will submit a separate PR for the benchmark


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon commented on a change in pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
showuon commented on a change in pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#discussion_r662073522



##########
File path: clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java
##########
@@ -4200,6 +4200,7 @@ public ListOffsetsResult listOffsets(Map<TopicPartition, OffsetSpec> topicPartit
                                            Map<TopicPartition, KafkaFutureImpl<ListOffsetsResultInfo>> futures) {
 
         MetadataResponse mr = context.response().orElseThrow(() -> new IllegalStateException("No Metadata response"));
+        Cluster clusterSnapshot = mr.cluster();

Review comment:
       Agree to rename to `buildCluster()` !




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] hachikuji merged pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
hachikuji merged pull request #10940:
URL: https://github.com/apache/kafka/pull/10940


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] showuon edited a comment on pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
showuon edited a comment on pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#issuecomment-872027365


   > this will reduce the time complexity from O(n^2) to O(n).
   
   I think we reduce the time complexity from O(n) to O(1), suppose n is the partition size. FYI~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] jeffkbkim commented on pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
jeffkbkim commented on pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#issuecomment-870768034


   @ijuma @hachikuji thanks for the review. i'll add a benchmark, not too sure if there's any other way to test the performance.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] hachikuji commented on a change in pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
hachikuji commented on a change in pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#discussion_r660787940



##########
File path: clients/src/main/java/org/apache/kafka/clients/admin/KafkaAdminClient.java
##########
@@ -4200,6 +4200,7 @@ public ListOffsetsResult listOffsets(Map<TopicPartition, OffsetSpec> topicPartit
                                            Map<TopicPartition, KafkaFutureImpl<ListOffsetsResultInfo>> futures) {
 
         MetadataResponse mr = context.response().orElseThrow(() -> new IllegalStateException("No Metadata response"));
+        Cluster clusterSnapshot = mr.cluster();

Review comment:
       Having names that sound like simple accessors probably makes this kind of performance issue more likely. Maybe we could rename `cluster()` to `buildCluster()` so that it is clear there is some work.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] jeffkbkim commented on pull request #10940: KAFKA-13007: KafkaAdminClient getListOffsetsCalls reuse cluster snapshot

Posted by GitBox <gi...@apache.org>.
jeffkbkim commented on pull request #10940:
URL: https://github.com/apache/kafka/pull/10940#issuecomment-872467503


   @showuon thanks!
   
   on the time complexity, i think you may have missed the for-loop wrapped around. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org