Posted to jira@kafka.apache.org by "tuyang (JIRA)" <ji...@apache.org> on 2017/07/31 02:34:01 UTC

[jira] [Comment Edited] (KAFKA-5678) When the broker graceful shutdown occurs, the producer side sends timeout.

    [ https://issues.apache.org/jira/browse/KAFKA-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106741#comment-16106741 ] 

tuyang edited comment on KAFKA-5678 at 7/31/17 2:33 AM:
--------------------------------------------------------

Thanks [~becket_qin]
After checking the earlier pull request, I find that its solution has something in common with this issue, but it is not the same.
The earlier pull request focuses on message loss during a quick leader change, whereas this problem is about the latency of a produceRequest or fetchRequest caused by the leader change.
Maybe we should use code like the following
{code:java}
forceCompleteDelayedProduce
forceCompleteDelayedFetch
{code}

to replace the following code in the prior pull request.
{code:java}
tryCompleteDelayedProduce
tryCompleteDelayedFetch
{code}

The prior pull request may still lead to request.timeout.ms latency, because tryComplete may not actually complete the delayed operation, so the request will not complete until request.timeout.ms has elapsed. Maybe we should force the completion rather than just try it.
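To illustrate the difference, here is a minimal, self-contained sketch. It is not Kafka's actual code: the class, fields, and error string below are made up for illustration. A try-style completion can leave a delayed produce pending while the acks=-1 condition is not met, whereas a force-style completion answers the waiting producer right away.
{code:java}
// Hypothetical model of a delayed produce; names are illustrative, not Kafka's real classes.
public class DelayedProduceSketch {

    static class DelayedProduce {
        private final long requiredOffset;    // offset the follower must reach for acks=-1
        private final long followerEndOffset; // how far the follower has actually caught up
        private boolean completed = false;
        private String result = null;

        DelayedProduce(long requiredOffset, long followerEndOffset) {
            this.requiredOffset = requiredOffset;
            this.followerEndOffset = followerEndOffset;
        }

        // try-style: completes only if the acks=-1 condition is already satisfied
        boolean tryComplete() {
            if (followerEndOffset >= requiredOffset) {
                completed = true;
                result = "OK";
            }
            return completed;
        }

        // force-style: completes unconditionally, e.g. with a retriable error,
        // so the producer does not have to wait for request.timeout.ms
        boolean forceComplete() {
            if (!completed) {
                completed = true;
                result = "NOT_LEADER_FOR_PARTITION";
            }
            return completed;
        }

        String result() { return result; }
    }

    public static void main(String[] args) {
        // The follower is behind (end offset 10 < required offset 15) when the
        // shutting-down broker loses leadership of the partition.
        DelayedProduce pending = new DelayedProduce(15, 10);

        System.out.println("tryComplete   -> " + pending.tryComplete() + ", result=" + pending.result());
        System.out.println("forceComplete -> " + pending.forceComplete() + ", result=" + pending.result());
    }
}
{code}
With the force-style completion the producer gets a retriable error immediately and can resend to the new leader, instead of waiting for its request timeout to elapse.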




> When the broker graceful shutdown occurs, the producer side sends timeout.
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-5678
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5678
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.9.0.0, 0.10.0.0, 0.11.0.0
>            Reporter: tuyang
>
> The test environment is as follows.
> 1. Kafka version: 0.9.0.1
> 2. A cluster with 3 brokers, with broker ids A, B, and C.
> 3. A topic with 6 partitions and 2 replicas, with 2 leader partitions on each broker.
> We can reproduce the problem as follows.
> 1. We send messages as quickly as possible with acks=-1.
> 2. Partition p0's leader is on broker A and we gracefully shut down broker A, but we send a message to p0 before the new leader is elected. The message is appended to the leader replica successfully, but if the follower replica has not caught up yet, the shutting-down broker creates a DelayedProduce for this request and waits for it to complete, up to request.timeout.ms.
> 3. Because of the ControlledShutdown request from broker A, the p0 partition leader is re-elected and the replica on broker A becomes a follower before the shutdown completes, so the DelayedProduce is never triggered to complete until it expires.
> 4. If broker A's shutdown takes too long, the producer only gets a response after request.timeout.ms, which increases producer send latency while we restart the brokers one by one.
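
A minimal producer-side sketch of the setup described in the reproduction steps above; the bootstrap addresses, topic name, and timeout value are assumptions, not taken from the report:
{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ShutdownLatencyRepro {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "brokerA:9092,brokerB:9092,brokerC:9092"); // assumed addresses
        props.put("acks", "-1");                  // wait for all in-sync replicas, as in step 1
        props.put("request.timeout.ms", "30000"); // the timeout the delayed produce can run into
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            long start = System.currentTimeMillis();
            // Send while one broker is being gracefully shut down; with the behaviour
            // described above, some sends only complete after request.timeout.ms
            // instead of failing fast with a retriable error.
            producer.send(new ProducerRecord<>("test-topic", "key", "value"),
                (metadata, exception) -> {
                    long elapsed = System.currentTimeMillis() - start;
                    System.out.println("completed after " + elapsed + " ms"
                        + (exception != null ? " with " + exception : ""));
                });
            producer.flush();
        }
    }
}
{code}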



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)