Posted to dev@kafka.apache.org by "Jun Rao (JIRA)" <ji...@apache.org> on 2012/05/07 19:28:50 UTC

[jira] [Created] (KAFKA-339) using MultiFetch in the follower

Jun Rao created KAFKA-339:
-----------------------------

             Summary: using MultiFetch in the follower
                 Key: KAFKA-339
                 URL: https://issues.apache.org/jira/browse/KAFKA-339
             Project: Kafka
          Issue Type: Sub-task
          Components: core
    Affects Versions: 0.8
            Reporter: Jun Rao


A broker could be following multiple topic/partitions from the same leader broker. Instead of using 1 fetcher thread per topic/partition, it would be more efficient to use 1 fetcher thread that issues multi-fetch requests.
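
The idea can be sketched roughly as follows. This is only an illustration: the types TopicPartition, PartitionFetch and MultiFetchRequest, the buildRequests helper, and the default start offset of 0 are all hypothetical and are not taken from the patch. The sketch groups the followed partitions by leader broker so that each leader receives one multi-fetch request.

```scala
// Hypothetical types for illustration only; these are not Kafka's classes.
final case class TopicPartition(topic: String, partition: Int)
final case class PartitionFetch(tp: TopicPartition, offset: Long)
final case class MultiFetchRequest(leaderId: Int, fetches: List[PartitionFetch])

object MultiFetchSketch {
  // Build one multi-fetch request per leader broker, instead of one
  // request (and one thread) per topic/partition.
  def buildRequests(
      leaders: Map[TopicPartition, Int],  // partition -> leader broker id
      offsets: Map[TopicPartition, Long]  // partition -> next offset to fetch
  ): List[MultiFetchRequest] =
    leaders.toList
      .groupBy { case (_, leaderId) => leaderId }
      .toList
      .sortBy { case (leaderId, _) => leaderId }
      .map { case (leaderId, entries) =>
        val fetches = entries
          // Assumption: a partition with no recorded offset starts at 0.
          .map { case (tp, _) => PartitionFetch(tp, offsets.getOrElse(tp, 0L)) }
          .sortBy(f => (f.tp.topic, f.tp.partition))
        MultiFetchRequest(leaderId, fetches)
      }
}
```

With three followed partitions spread over two leaders, this yields two requests rather than three, and the saving grows with the number of partitions per leader.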

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (KAFKA-339) using MultiFetch in the follower

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-339:
--------------------------

    Attachment: kafka-339_v1.patch
    
> using MultiFetch in the follower
> --------------------------------
>
>                 Key: KAFKA-339
>                 URL: https://issues.apache.org/jira/browse/KAFKA-339
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: core
>    Affects Versions: 0.8
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.8
>
>         Attachments: kafka-339_v1.patch
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>
> A broker could be following multiple topic/partitions from the same leader broker. Instead of using 1 fetcher thread per topic/partition, it would be more efficient to use 1 fetcher thread that issues multi-fetch requests.


[jira] [Commented] (KAFKA-339) using MultiFetch in the follower

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397783#comment-13397783 ] 

Joel Koshy commented on KAFKA-339:
----------------------------------

+1 on v2

Minor comments:
- AbstractFetcherManager: may be useful to log the fetcher-id for the topic/partition when adding the fetcher
- ReplicaFetcherManager: miss-match -> mismatch
- ReplicaManager: makeFollower: I still don't think we need to look up zk for the leader as the value passed in should be current.

                

        

[jira] [Assigned] (KAFKA-339) using MultiFetch in the follower

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao reassigned KAFKA-339:
-----------------------------

    Assignee: Jun Rao
    

        

[jira] [Updated] (KAFKA-339) using MultiFetch in the follower

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-339:
--------------------------

    Status: Patch Available  (was: Open)

Uploaded patch v1. 

Created an AbstractFetcher and AbstractFetcherManager, which contain the common code path for fetchers used in followers and real consumer clients. Added ReplicaFetcher and ReplicaFetcherManager (use multi-fetch) to replace ReplicaFetcherThread.

Will reimplement the fetcher in the consumer client on top of AbstractFetcher and AbstractFetcherManager in a separate jira, kafka-362.
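
A minimal sketch of how such a shared code path might be structured, assuming a template-method design. The class names FetcherSketch, ReplicaFetcherSketch and ConsumerFetcherSketch, and the string-based data model, are hypothetical; the patch's real signatures are not shown in this thread. Only the processPartitionData hook name comes from the review discussion.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical names throughout; not the classes from the patch.
abstract class FetcherSketch {
  // Subclass hook: what to do with the data fetched for one topic/partition.
  protected def processPartitionData(topic: String, partition: Int, data: String): Unit

  // Shared code path: walk a (simulated) multi-fetch response and
  // dispatch each entry to the hook.
  def handleResponse(response: Seq[(String, Int, String)]): Unit =
    response.foreach { case (topic, partition, data) =>
      processPartitionData(topic, partition, data)
    }
}

// Follower side: append the fetched data to a local replica log.
class ReplicaFetcherSketch extends FetcherSketch {
  val log = ListBuffer.empty[String]
  protected def processPartitionData(topic: String, partition: Int, data: String): Unit =
    log += s"$topic-$partition:$data"
}

// Consumer side: hand the fetched data to the application through a queue.
class ConsumerFetcherSketch extends FetcherSketch {
  val queue = ListBuffer.empty[String]
  protected def processPartitionData(topic: String, partition: Int, data: String): Unit =
    queue += data
}
```

The point of the split is that request handling lives in one place, while followers and consumers differ only in what they do with each partition's data.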
                

        

[jira] [Updated] (KAFKA-339) using MultiFetch in the follower

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-339:
--------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Thanks for the review. Committed patch v2 with one minor improvement: ReplicaFetcherTest now tests multiple topics.

As for the unnecessary ZK reads in ReplicaManager, these should be fixed as part of kafka-343, when the leader election logic moves to the controller.
                

        

[jira] [Commented] (KAFKA-339) using MultiFetch in the follower

Posted by "Neha Narkhede (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397826#comment-13397826 ] 

Neha Narkhede commented on KAFKA-339:
-------------------------------------

+1 on v2, assuming Joel's comments are addressed.
                

        

[jira] [Updated] (KAFKA-339) using MultiFetch in the follower

Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-339:
--------------------------

    Attachment: kafka-339_v2.patch

Thanks for the review. Attaching patch v2.

AbstractFetcher:
- currentOffset actually can be none. A fetcher can be removed after the multi-fetch request is made.
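
To illustrate the race described above, here is a small sketch in which a partition is removed while a request is in flight. The PartitionOffsets class and its string keys are hypothetical and only model the situation; they are not the patch's code.

```scala
import scala.collection.mutable

// Hypothetical model: partition key -> next offset to fetch.
class PartitionOffsets {
  private val fetchMap = mutable.Map.empty[String, Long]

  def add(key: String, offset: Long): Unit =
    fetchMap.synchronized { fetchMap(key) = offset }

  def remove(key: String): Unit =
    fetchMap.synchronized { fetchMap.remove(key); () }

  def offsetOf(key: String): Option[Long] =
    fetchMap.synchronized { fetchMap.get(key) }

  // Called for each partition found in a multi-fetch response.
  // Returns true if the new offset was recorded.
  def onData(key: String, newOffset: Long): Boolean = fetchMap.synchronized {
    fetchMap.get(key) match {
      case Some(_) =>
        fetchMap(key) = newOffset
        true
      case None =>
        // The partition was removed after the request was sent:
        // ignore the data instead of failing.
        false
    }
  }
}
```

Because the response can arrive after a removal, the lookup has to be treated as an Option rather than assumed to be present.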

AbstractFetcherManager:
- addFetcher actually always adds a fetcher, but does not always create a new fetcher thread. I see that the naming is a bit confusing, so I renamed AbstractFetcher to AbstractFetcherThread.
- FetcherManager maintains 1 or more fetcherThreads per source broker. By default, there is 1 fetcherThread per broker. However, for a higher degree of parallelism, more fetcherThreads can be configured. A fetcher corresponds to the fetching from 1 partition of a topic. Multiple fetchers can be added to a fetcherThread.

ReplicaManager:
- We need to get the host/port from ZK for a given broker id. Such information should be cached. Will create a separate jira to address this issue.

The rest of the comments have been addressed.
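
The relationship between fetchers and fetcher threads described under AbstractFetcherManager could look roughly like the sketch below. Everything here is an assumption made for illustration: the FetcherManagerSketch class, the numFetchersPerBroker parameter, and the hash-based choice of fetcher id are not taken from the patch.

```scala
import scala.collection.mutable

// Hypothetical sketch; the names and the hashing scheme are assumptions.
class FetcherManagerSketch(numFetchersPerBroker: Int = 1) {
  require(numFetchersPerBroker > 0)

  // (source broker id, fetcher id) -> partitions handled by that fetcher thread
  private val threads = mutable.Map.empty[(Int, Int), mutable.Set[(String, Int)]]
  private val mapLock = new Object

  // Spread partitions over the configured number of threads per broker.
  private def fetcherId(topic: String, partition: Int): Int =
    math.abs((31 * topic.hashCode + partition) % numFetchersPerBroker)

  // Always registers the partition, but creates a new thread entry only
  // when none exists yet for this (broker, fetcher id) pair.
  // Returns true if a new thread entry was created.
  def addFetcher(topic: String, partition: Int, sourceBrokerId: Int): Boolean =
    mapLock.synchronized {
      val key = (sourceBrokerId, fetcherId(topic, partition))
      val isNew = !threads.contains(key)
      threads.getOrElseUpdate(key, mutable.Set.empty) += ((topic, partition))
      isNew
    }

  def threadCount: Int = mapLock.synchronized { threads.size }
  def partitionCount: Int = mapLock.synchronized { threads.values.map(_.size).sum }
}
```

With the default of one thread per broker, every partition followed from the same broker lands on the same thread, which is what allows a single multi-fetch request to cover them all.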
                

        

[jira] [Commented] (KAFKA-339) using MultiFetch in the follower

Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/KAFKA-339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397097#comment-13397097 ] 

Joel Koshy commented on KAFKA-339:
----------------------------------

Thanks for the patch. Some comments:

AbstractFetcher:
- currentOffset should never be empty, so we can get rid of the if.
- hasPartition, partitionCount need to synchronize fetchMap
- "shutting down" should probably be removed from the string on line 106
- newOffset can be computed from the messageSet - so the processPartitionData implementation does not need to return the log end offset (and likewise when we use this in the high-level consumer). It's probably safer to prevent processPartitionData from overriding the new offset, and I don't see any benefit in allowing it to do so.
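
The last point might be sketched as follows. This assumes byte-based offsets, so that the next fetch offset is the fetch offset plus the number of valid bytes received; that assumption, the MessageSetSketch model and the fetchOnce helper are all hypothetical and not taken from the patch.

```scala
// Hypothetical message-set model: each message only records its size in bytes.
final case class MessageSetSketch(messageSizes: List[Int]) {
  def validBytes: Long = messageSizes.map(_.toLong).sum
}

object OffsetAdvanceSketch {
  // Assumption: offsets are byte positions in the log, so the next fetch
  // offset is the current fetch offset plus the bytes received.
  def newOffset(fetchOffset: Long, messages: MessageSetSketch): Long =
    fetchOffset + messages.validBytes

  // The callback only consumes the data and returns Unit, so it has no way
  // to override the offset that the fetcher computes.
  def fetchOnce(fetchOffset: Long, messages: MessageSetSketch)(
      processPartitionData: MessageSetSketch => Unit): Long = {
    processPartitionData(messages)
    newOffset(fetchOffset, messages)
  }
}
```

Keeping the offset computation in the fetcher means a faulty callback cannot move the fetch position.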

AbstractFetcherManager:
- addFetcher:
  - can rename to maybeAddFetcher
  - also, maybe we should move the info log on line 38 to the None case and print the fetcher id; and add another log for the other case saying there's already a fetcher.
- What is the purpose of having a fetcher ID vs simply topic-partition?
- Should synchronize fetcherRunnableMap in shutdown with mapLock

ReplicaManager:
- Maybe some corner case that I'm missing, but makeFollower already passes in the new leaderBrokerId so why do we need to re-read from ZooKeeper (line 173)?

ReplicaFetchTest:
- Ideally producer.close() should be before the waitUntilTrue
- The condition function uses & instead of &&.
- Also, instead of the hard-coded 60L I think it would be clearer and sufficient to do something like:
  val expectedOffset = brokers.head.getLogManager...logEndOffset
  assertEquals(brokers.size, brokers.count( broker => broker.getLogManager...logEndOffset == expectedOffset ))
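
A self-contained variant of the two test suggestions above, for illustration. BrokerSketch is a hypothetical stand-in that exposes a log end offset directly, since the accessor chain is elided in the comment. The evaluations helper shows why && is preferable: & on Booleans evaluates both operands, while && skips the right side when the left is false.

```scala
// Hypothetical stand-in for a broker exposing one partition's log end offset.
final case class BrokerSketch(id: Int, logEndOffset: Long)

object ReplicaCheckSketch {
  // True when every broker's log end offset matches the first broker's,
  // instead of comparing against a hard-coded value.
  def allCaughtUp(brokers: Seq[BrokerSketch]): Boolean = {
    val expectedOffset = brokers.head.logEndOffset
    brokers.count(_.logEndOffset == expectedOffset) == brokers.size
  }

  // Counts how often the right-hand operand runs for & and for &&
  // when the left-hand operand is false.
  def evaluations(): (Int, Int) = {
    var withSingleAmp = 0
    var withDoubleAmp = 0
    def left = false
    val a = left & { withSingleAmp += 1; true }   // right side is evaluated
    val b = left && { withDoubleAmp += 1; true }  // right side is skipped
    (withSingleAmp, withDoubleAmp)
  }
}
```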

                