Posted to dev@kafka.apache.org by "Jun Rao (JIRA)" <ji...@apache.org> on 2012/05/07 19:28:50 UTC
[jira] [Created] (KAFKA-339) using MultiFetch in the follower
Jun Rao created KAFKA-339:
-----------------------------
Summary: using MultiFetch in the follower
Key: KAFKA-339
URL: https://issues.apache.org/jira/browse/KAFKA-339
Project: Kafka
Issue Type: Sub-task
Components: core
Affects Versions: 0.8
Reporter: Jun Rao
A broker could be following multiple topic/partitions from the same broker. Instead of using 1 fetcher thread per topic/partition, it would be more efficient to use 1 fetcher thread that issues multi-fetch requests.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-339) using MultiFetch in the follower
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-339:
--------------------------
Attachment: kafka-339_v1.patch
> using MultiFetch in the follower
> --------------------------------
>
> Key: KAFKA-339
> URL: https://issues.apache.org/jira/browse/KAFKA-339
> Project: Kafka
> Issue Type: Sub-task
> Components: core
> Affects Versions: 0.8
> Reporter: Jun Rao
> Assignee: Jun Rao
> Fix For: 0.8
>
> Attachments: kafka-339_v1.patch
>
> Original Estimate: 252h
> Remaining Estimate: 252h
>
> A broker could be following multiple topic/partitions from the same broker. Instead of using 1 fetcher thread per topic/partition, it would be more efficient to use 1 fetcher thread that issues multi-fetch requests.
[jira] [Commented] (KAFKA-339) using MultiFetch in the follower
Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397783#comment-13397783 ]
Joel Koshy commented on KAFKA-339:
----------------------------------
+1 on v2
Minor comments:
- AbstractFetcherManager: may be useful to log the fetcher-id for the topic/partition when adding the fetcher
- ReplicaFetcherManager: miss-match -> mismatch
- ReplicaManager: makeFollower: I still don't think we need to look up zk for the leader as the value passed in should be current.
[jira] [Assigned] (KAFKA-339) using MultiFetch in the follower
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao reassigned KAFKA-339:
-----------------------------
Assignee: Jun Rao
[jira] [Updated] (KAFKA-339) using MultiFetch in the follower
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-339:
--------------------------
Status: Patch Available (was: Open)
Uploaded patch v1.
Created an AbstractFetcher and AbstractFetcherManager, which contain the common code path for fetchers used in followers and real consumer clients. Added ReplicaFetcher and ReplicaFetcherManager (use multi-fetch) to replace ReplicaFetcherThread.
Will reimplement the fetcher in consumer client based on AbstractFetcher and AbstractFetcherManager in a separate jira, kafka-362.
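To illustrate the idea behind the multi-fetch approach, here is a minimal sketch that groups the followed partitions by leader broker and builds one request per broker. All names in it (TopicPartition, PartitionFetch, MultiFetchRequest, buildRequests) are hypothetical and are not taken from the patch; the default offset of 0 for an unknown partition is likewise only an assumption.

```scala
// Hypothetical types for illustration only; not the classes from the patch.
final case class TopicPartition(topic: String, partition: Int)
final case class PartitionFetch(tp: TopicPartition, offset: Long)
final case class MultiFetchRequest(leaderId: Int, fetches: Seq[PartitionFetch])

object MultiFetchSketch {
  // Build one request per leader broker, covering every partition followed from it.
  // Assumption: a partition with no recorded offset starts fetching at 0.
  def buildRequests(
      leaders: Map[TopicPartition, Int],
      offsets: Map[TopicPartition, Long]): Map[Int, MultiFetchRequest] =
    leaders.groupBy { case (_, leaderId) => leaderId }.map { case (leaderId, tps) =>
      val fetches = tps.keys.toSeq
        .sortBy(tp => (tp.topic, tp.partition)) // stable order within a request
        .map(tp => PartitionFetch(tp, offsets.getOrElse(tp, 0L)))
      leaderId -> MultiFetchRequest(leaderId, fetches)
    }
}
```

With three followed partitions spread over two leaders, this yields two requests instead of three per-partition fetches.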
[jira] [Updated] (KAFKA-339) using MultiFetch in the follower
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-339:
--------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Thanks for the review. Committed patch v2 with a minor improvement by testing multiple topics in ReplicaFetcherTest.
As for unnecessary ZK reads in ReplicaManager, this should be fixed as part of kafka-343 when moving the leader election logic to the controller.
[jira] [Commented] (KAFKA-339) using MultiFetch in the follower
Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397826#comment-13397826 ]
Neha Narkhede commented on KAFKA-339:
-------------------------------------
+1 on v2, assuming Joel's comments are addressed.
[jira] [Updated] (KAFKA-339) using MultiFetch in the follower
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Rao updated KAFKA-339:
--------------------------
Attachment: kafka-339_v2.patch
Thanks for the review. Attaching patch v2.
AbstractFetcher:
- currentOffset can actually be None: a fetcher can be removed after the multi-fetch request is made.
AbstractFetcherManager:
- addFetcher actually always adds a fetcher, but does not always create a new fetcher thread. I see the naming is a bit confusing. Renamed AbstractFetcher to AbstractFetcherThread.
- The fetcher manager maintains 1 or more fetcher threads per source broker. By default, there is 1 fetcher thread per broker. However, for a higher degree of parallelism, more fetcher threads can be configured. A fetcher corresponds to fetching from 1 partition of a topic. Multiple fetchers can be added to a fetcher thread.
ReplicaManager:
- We need to get the host/port from ZK for a given broker id. Such information should be cached. Will create a separate jira to address this issue.
The rest of the comments have been addressed.
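The relationship between fetchers and fetcher threads described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names (FetcherThreadSketch, FetcherManagerSketch, BrokerAndFetcherId) are hypothetical, the stand-in "thread" does no fetching, and choosing the fetcher id by hashing the topic/partition is an assumption rather than something stated in this thread.

```scala
import scala.collection.mutable

// Hypothetical names throughout; this is not the code from the patch.
final case class TopicPartition(topic: String, partition: Int)
final case class BrokerAndFetcherId(brokerId: Int, fetcherId: Int)

// Stand-in for a fetcher thread: it only tracks the partitions it fetches and their offsets.
final class FetcherThreadSketch(val key: BrokerAndFetcherId) {
  private val fetchMap = mutable.Map.empty[TopicPartition, Long]

  def addPartition(tp: TopicPartition, offset: Long): Unit =
    fetchMap.synchronized { fetchMap.put(tp, offset); () }

  def removePartition(tp: TopicPartition): Unit =
    fetchMap.synchronized { fetchMap.remove(tp); () }

  // None if the partition was removed, e.g. after a request was already sent.
  def currentOffset(tp: TopicPartition): Option[Long] =
    fetchMap.synchronized { fetchMap.get(tp) }

  def partitionCount: Int = fetchMap.synchronized { fetchMap.size }
}

final class FetcherManagerSketch(numFetchersPerBroker: Int = 1) {
  private val mapLock = new Object
  private val threads = mutable.Map.empty[BrokerAndFetcherId, FetcherThreadSketch]

  // Assumption: spread partitions over the configured threads by hash.
  private def fetcherId(tp: TopicPartition): Int =
    Math.floorMod(tp.hashCode, numFetchersPerBroker)

  // Always adds a fetcher; creates a new thread only if none exists for the key.
  def addFetcher(tp: TopicPartition, offset: Long, brokerId: Int): BrokerAndFetcherId =
    mapLock.synchronized {
      val key = BrokerAndFetcherId(brokerId, fetcherId(tp))
      threads.getOrElseUpdate(key, new FetcherThreadSketch(key)).addPartition(tp, offset)
      key
    }

  def thread(key: BrokerAndFetcherId): Option[FetcherThreadSketch] =
    mapLock.synchronized { threads.get(key) }

  def threadCount: Int = mapLock.synchronized { threads.size }
}
```

With the default of one fetcher thread per broker, two partitions led by the same broker share a thread, while a partition led by a different broker gets its own.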
[jira] [Commented] (KAFKA-339) using MultiFetch in the follower
Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397097#comment-13397097 ]
Joel Koshy commented on KAFKA-339:
----------------------------------
Thanks for the patch. Some comments:
AbstractFetcher:
- currentOffset should never be empty, so we can get rid of the if.
- hasPartition, partitionCount need to synchronize fetchMap
- "shutting down" should probably be removed from the string on line 106
- newOffset can be computed from the messageSet - so the processPartitionData implementation does not need to return the log end offset (and likewise when we use this in the high-level consumer). It's probably safer to prevent processPartitionData from overriding the new offset, and I don't see any benefit in allowing it to do so.
AbstractFetcherManager:
- addFetcher:
- can rename to maybeAddFetcher
- also, maybe we should move the info log on line 38 to the None case and print the fetcher id; and add another log for the other case saying there's already a fetcher.
- What is the purpose of having a fetcher ID vs simply topic-partition?
- Should synchronize fetcherRunnableMap in shutdown with mapLock
ReplicaManager:
- Maybe some corner case that I'm missing, but makeFollower already passes in the new leaderBrokerId so why do we need to re-read from ZooKeeper (line 173)?
ReplicaFetchTest:
- Ideally producer.close() should be before the waitUntilTrue
- The condition function uses & instead of &&.
- Also, instead of the hard-coded 60L I think it would be clearer and sufficient to do something like:
val expectedOffset = brokers.head.getLogManager...logEndOffset
assertEquals(brokers.size, brokers.count( broker => broker.getLogManager...logEndOffset == expectedOffset ))
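For the two test comments above, here is a small self-contained sketch. BrokerStub is a hypothetical stand-in that exposes only a log end offset (it does not reproduce the elided accessor chain in the snippet), and evaluate exists purely to show how & and && differ on Booleans in Scala.

```scala
// Hypothetical stand-in for a broker; only exposes the log end offset of one partition.
final case class BrokerStub(id: Int, logEndOffset: Long)

object ReplicaCheckSketch {
  // True when every broker reports the same log end offset as the first one,
  // which avoids hard-coding an expected offset in the test.
  def allCaughtUp(brokers: Seq[BrokerStub]): Boolean =
    brokers.nonEmpty && {
      val expectedOffset = brokers.head.logEndOffset
      brokers.count(_.logEndOffset == expectedOffset) == brokers.size
    }

  // Returns the result and how often the right operand was evaluated.
  // && short-circuits when the left side is false; & always evaluates both sides.
  def evaluate(shortCircuit: Boolean): (Boolean, Int) = {
    var calls = 0
    def right(): Boolean = { calls += 1; true }
    val left = false
    val result = if (shortCircuit) left && right() else left & right()
    (result, calls)
  }
}
```

Both operators give the same Boolean result here; the difference is that & still evaluates the right-hand side, which matters when that side is expensive or has side effects.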