Posted to jira@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/21 08:04:00 UTC
[jira] [Commented] (KAFKA-6916) AdminClient does not refresh metadata on broker failure

    [ https://issues.apache.org/jira/browse/KAFKA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482281#comment-16482281 ] 

ASF GitHub Bot commented on KAFKA-6916:
---------------------------------------

rajinisivaram opened a new pull request #5050: KAFKA-6916: Refresh metadata in admin client if broker connection fails
URL: https://github.com/apache/kafka/pull/5050
 
 
   Refresh metadata when a broker connection fails, so that new calls are sent only to live nodes and controller requests go to the new controller if the controller changes because of a broker failure. Also reassign calls that could not be sent.
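
The behaviour described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the code from the pull request: `ClusterView`, `Call`, `AdminSketch` and all of their methods are hypothetical names invented for illustration, and no Kafka classes are used. It shows the two ideas in the description: a connection failure requests a metadata refresh, and calls that could not be sent are reassigned once fresh metadata arrives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MetadataRefreshSketch {

    /** Hypothetical stand-in for cluster metadata: live broker ids and the controller id. */
    public static final class ClusterView {
        public final Set<Integer> liveBrokers;
        public final int controllerId;

        public ClusterView(Set<Integer> liveBrokers, int controllerId) {
            this.liveBrokers = liveBrokers;
            this.controllerId = controllerId;
        }
    }

    /** Hypothetical pending call; targetNode is -1 while the call is unassigned. */
    public static final class Call {
        public final String name;
        public final boolean needsController;
        public int targetNode = -1;

        public Call(String name, boolean needsController) {
            this.name = name;
            this.needsController = needsController;
        }
    }

    /** Hypothetical client core: tracks cached metadata and calls not yet sent. */
    public static final class AdminSketch {
        private ClusterView cached;
        private boolean refreshRequested = false;
        private final List<Call> pending = new ArrayList<>();

        public AdminSketch(ClusterView initial) {
            this.cached = initial;
        }

        public Call submit(String name, boolean needsController) {
            Call call = new Call(name, needsController);
            call.targetNode = chooseNode(call);
            pending.add(call);
            return call;
        }

        /** On a failed connection: ask for fresh metadata and unassign affected calls. */
        public void onConnectionFailure(int nodeId) {
            refreshRequested = true;
            for (Call call : pending) {
                if (call.targetNode == nodeId) {
                    call.targetNode = -1;
                }
            }
        }

        /** On fresh metadata: replace the cache and reassign every unassigned call. */
        public void onMetadataResponse(ClusterView fresh) {
            cached = fresh;
            refreshRequested = false;
            for (Call call : pending) {
                if (call.targetNode == -1) {
                    call.targetNode = chooseNode(call);
                }
            }
        }

        public boolean isRefreshRequested() {
            return refreshRequested;
        }

        // Controller calls go to the cached controller; other calls go to the
        // lowest live broker id (an arbitrary choice made for this sketch).
        private int chooseNode(Call call) {
            return call.needsController ? cached.controllerId : Collections.min(cached.liveBrokers);
        }
    }

    public static void main(String[] args) {
        ClusterView before = new ClusterView(new HashSet<>(Arrays.asList(0, 1, 2)), 0);
        AdminSketch admin = new AdminSketch(before);

        Call call = admin.submit("createTopics", true);
        System.out.println(call.name + " initially targets node " + call.targetNode);

        // Broker 0 (the controller) goes down and the connection attempt fails.
        admin.onConnectionFailure(0);
        System.out.println("refresh requested: " + admin.isRefreshRequested());

        // Fresh metadata reports a new controller; the unsent call is reassigned.
        admin.onMetadataResponse(new ClusterView(new HashSet<>(Arrays.asList(1, 2)), 1));
        System.out.println(call.name + " now targets node " + call.targetNode);
    }
}
```

Without the refresh in `onConnectionFailure`, the call would keep its original target and every retry would go to the broker that is no longer reachable.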
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   



> AdminClient does not refresh metadata on broker failure
> -------------------------------------------------------
>
>                 Key: KAFKA-6916
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6916
>             Project: Kafka
>          Issue Type: Task
>          Components: admin
>    Affects Versions: 1.1.0, 1.0.1
>            Reporter: Rajini Sivaram
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 2.0.0
>
>
> There are intermittent test failures in DynamicBrokerReconfigurationTest when brokers are restarted. The test uses ephemeral ports, so the ports after a server restart differ from the ports before it. The tests rely on metadata refresh in producers, consumers and admin clients to obtain the new server ports when connections fail. This works for producers and consumers, but causes intermittent failures with the admin client because no refresh is triggered.
> There are a couple of issues in AdminClient:
>  # Unlike producers and consumers, AdminClient does not request a metadata update when a connection to a broker fails. This is particularly bad if the controller goes down. The controller is used for various requests such as createTopics and describeTopics. If the controller goes down and adminClient.describeTopics() is invoked, AdminClient sends the request to the old controller. If the connection fails, it keeps retrying with the same address, and a metadata refresh is never triggered. By default, the request times out after 2 minutes, while metadata is not refreshed for 5 minutes. We should refresh metadata whenever a connection to a broker fails.
>  # Admin client requests are always retried on the same node. In the example above, if the controller goes down and a new controller is elected, the retried request should be sent to the new controller. Otherwise we just block the call for 2 minutes with many retries that can never succeed.
>  


