Posted to dev@kafka.apache.org by Cheng Tan <ct...@confluent.io> on 2020/04/28 01:48:32 UTC

[DISCUSS] KIP-601: Configurable socket connection timeout

Hi developers,


I’m proposing KIP-601 to support a configurable socket connection timeout. https://cwiki.apache.org/confluence/display/KAFKA/KIP-601%3A+Configurable+socket+connection+timeout

Currently, the initial socket connection timeout depends on the operating-system setting tcp_syn_retries. The actual timeout value is 2 ^ (tcp_syn_retries + 1) - 1 seconds. For the reasons below, we want to control the client-side socket timeout directly through client configuration. 
	• The default value of tcp_syn_retries is 6, which means the default timeout is 127 seconds, too long in some scenarios. For example, when the user specifies a list of N bootstrap servers and no connection has yet been established between the client and the servers, the least loaded node provider will poll all the server nodes specified by the user. If M servers in the bootstrap-server list are offline, the client may take (127 * M) seconds to connect to the cluster. In the worst case, when M = N - 1, the wait time can be several minutes.
	• Though we could set tcp_syn_retries lower, that would change system-level network behavior, which might cause other issues.
	• Applications depending on KafkaAdminClient may want to reliably detect and control the initial socket connection timeout so they can throw corresponding exceptions in their own layer.
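The arithmetic above can be sketched as follows (a back-of-the-envelope model with hypothetical helper names, not Kafka code):

```python
def syn_timeout_seconds(tcp_syn_retries: int) -> int:
    """Effective Linux connect timeout: SYN retransmits back off
    exponentially (1s, 2s, 4s, ...), totalling 2^(retries+1) - 1 s."""
    return 2 ** (tcp_syn_retries + 1) - 1

def worst_case_wait(m_offline: int, tcp_syn_retries: int = 6) -> int:
    """Time spent sequentially timing out on M offline bootstrap servers."""
    return m_offline * syn_timeout_seconds(tcp_syn_retries)

print(syn_timeout_seconds(6))  # 127 seconds with the default of 6 retries
print(worst_case_wait(4))      # 508 seconds, i.e. several minutes
```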

Please let me know if you have any thoughts or suggestions. Thanks.


Best, - Cheng Tan


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Hi Colin. Thanks for the discussion and feedback. I rewrote the KIP-601 proposal following your suggestions, and the new proposal is now ready.

Best, - Cheng Tan

> On Apr 28, 2020, at 2:55 PM, Colin McCabe <cm...@apache.org> wrote:
> 
> 
> Thanks again for the KIP.  This seems like it has been a gap in Kafka's
> error handling for a while, and it's good to see a proposal to fix it.


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Colin McCabe <cm...@apache.org>.
Hi Cheng,

Thanks for the KIP.

> Currently, the initial socket connection timeout depends on the system 
> setting tcp_syn_retries. The actual timeout value is 2 ^ (tcp_syn_retries + 1) - 1 seconds

This section is confusing since it refers to Linux configuration settings without identifying them as such.  I wondered for a second whether Kafka had a tcp_syn_retries setting (it doesn't).  You should make it clear when you are referring to an operating-system-specific detail.

> We propose a new common client config: connections.timeout.ms

The name "connections.timeout.ms" seems unclear.  If I had to guess about what "connections.timeout.ms" was, I would probably guess that it was some kind of idle timeout on the connection as a whole.  I wouldn't think that it only applied to connection setup.  How about something like "connection.setup.timeout.ms"?

> Currently, when none of the nodes provided in the --bootstrap-server option
> are connected, the LeastLoadedNodeProvider will provide an unconnected
> node for the client. The Cluster class shuffles the nodes to balance the
> initial pressure, and the LeastLoadedNodeProvider will always provide
> the same node, which is the last node after shuffling. Consequently, 
> though we may provide several bootstrap servers, the client will not be able 
> to connect to the cluster if any of the servers in the --bootstrap-server list 
> is offline.

Hmm... I can't follow what this paragraph is trying to say.  What's the connection between LeastLoadedNodeProvider and TCP connection setup timeouts?

> ClusterConnectionStates will keep the index of the most recently 
> provided node. Meanwhile, a public API like the one below will be added 
> to simulate the round-robin node picking.

Again, this seems unrelated to the problem described in the motivation section.

I would really suggest creating a separate KIP for this, since it seems unrelated to the problem of TCP timeouts.  When KIPs try to do too many things at once, they often lose focus and get delayed.

> To Discuss
> 1. We can think about if we want to have different 
> connect.timeout.ms for different clients.
> 2. What would be the default value for connect.timeout.ms (is 10s
> too short?)
> 3. Should the Selector / IdleExpiryManager throw an exception 
> when the initial socket channel connection is timeout? Personally 
> I don't think we need as we will continue trying the rest of the 
> candidate nodes.

It's good to get feedback, but in general everything in a KIP is "To discuss," so we should not have a separate section titled "To discuss."  :) 

We also want the KIP to reflect what we actually decided on and implemented, so it shouldn't include open questions (unless they are questions about future work that is not covered in the scope of the KIP).

To come back to the timeout question, I think you should motivate your decision (and it does have to be decided here, not left as an open question), by explaining its relationship to other timeouts, and to the underlying TCP three-way handshake which it is putting an upper bound on.  10 seconds doesn't seem unreasonable in that context.

I believe that the default value of 127 seconds in Linux is set as high as it is to compensate for the fact that many applications do not retry connections if the initial establishment fails.  That isn't the case for Kafka, which motivates a lower default for us than for them.

Question 3 seems to be about whether a connection setup failure should result in the node getting moved to an error state (and hence not getting picked a second time by LeastLoadedNode, for example).  It seems pretty clear that a connection setup error is an error, right?  (Also, phrasing this as "should we throw an exception" is unnecessarily implementation specific.)

For rejected alternatives, we should explain why the KIP introduces a new timeout configuration rather than terminating an unsuccessful TCP connection setup attempt after request.timeout.ms has elapsed.

Also, the "Compatibility, Deprecation, and Migration Plan" should say that there is no impact (the section is just blank now)

Thanks again for the KIP.  This seems like it has been a gap in Kafka's error handling for a while, and it's good to see a proposal to fix it.

best,
Colin


On Mon, Apr 27, 2020, at 18:48, Cheng Tan wrote:
> Hi developers,
> 
> 
> I’m proposing KIP-601 to support configurable socket connection 
> timeout. 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-601%3A+Configurable+socket+connection+timeout
> 
> Currently, the initial socket connection timeout depends on the system 
> setting tcp_syn_retries. The actual timeout value is 2 ^ 
> (tcp_syn_retries + 1) - 1 seconds. For the reasons below, we want to 
> control the client-side socket timeout directly through client 
> configuration. 
> 	• The default value of tcp_syn_retries is 6, which means the default 
> timeout is 127 seconds, too long in some scenarios. For 
> example, when the user specifies a list of N bootstrap-servers and no 
> connection has been built between the client and the servers, the least 
> loaded node provider will poll all the server nodes specified by the 
> user. If M servers in the bootstrap-servers list are offline, the 
> client may take (127 * M) seconds to connect to the cluster. In the 
> worst case when M = N - 1, the wait time can be several minutes.
> 	• Though we could set tcp_syn_retries lower, that would change 
> system-level network behavior, which might cause 
> other issues.
> 	• Applications depending on KafkaAdminClient may want to reliably detect 
> and control the initial socket connection timeout so they can throw 
> corresponding exceptions in their own layer.
> 
> Please let me know if you have any thoughts or suggestions. Thanks.
> 
> 
> Best, - Cheng Tan
> 
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Dear Rajini,

Thanks for all the feedback; it was very helpful for brainstorming. I’ve incorporated our discussion into the KIP and started a voting thread. 

Best, - Cheng Tan

> On May 15, 2020, at 2:13 PM, Rajini Sivaram <ra...@gmail.com> wrote:
> 
> Hi Cheng,
> 
> I am fine with the rest of the KIP apart from the 10s default. If no one
> else has any concerns about this new default, let's go with it. Please go
> ahead and start vote.
> 
> Regards,
> 
> Rajini
> 
> 
> On Fri, May 15, 2020 at 8:21 PM Cheng Tan <ct...@confluent.io> wrote:
> 
>> Dear Rajini,
>> 
>> 
>> Thanks for the reply.
>> 
>>> We have a lot of these and I want to
>>> understand the benefits of the proposed timeout in this case alone. We
>>> currently have a request timeout of 30s. Would you consider adding a 10s
>>> connection timeout?
>> 
>> A shorter timeout (10s) at the transportation level will help clients
>> detect dead nodes faster. “request.timeout.ms” is too general and applies
>> to all the requests whose complexity at the application level varies.  It’s
>> risky to set “request.timeout.ms” to a lower value for detecting dead
>> nodes quicker because of the involvement of the application layer.
>> 
>> Once “socket.connection.setup.timeout.ms” is hit, NetworkClient will fail
>> the request in the same way as it handles “request.timeout.ms”.
>> That is to say, the response will be constructed upon a RetriableException.
>> Producer, Consumer, and KafkaAdminClient can then perform their retry logic
>> as if a request timeout had happened.
>> 
>>> We have KIP-612 that is proposing to throttle connection set up on the
>> one
>>> hand and this KIP that is dramatically reducing default connection
>> timeout
>>> on the other. Not sure if that is a good idea.
>> 
>> The default of the broker connection creation rate limit is Int.MaxValue.
>> The KIP also proposes per-IP throttle configuration. Thus, I don’t expect
>> the combination of the broker connection throttle and a shorter client
>> transportation timeout will have a side effect.
>> 
>> Do the reasons above make sense to you?
>> 
>> Best, - Cheng
>> 
>> 
>> 
>> 
>>> On May 15, 2020, at 4:49 AM, Rajini Sivaram <ra...@gmail.com>
>> wrote:
>>> 
>>> Hi Cheng,
>>> 
>>> Let me rephrase my question. Let's say we didn't have the case of
>>> leastLoadedNode. We are only talking about connections to a specific node
>>> (i.e. leader or controller). We have a lot of these and I want to
>>> understand the benefits of the proposed timeout in this case alone. We
>>> currently have a request timeout of 30s. Would you consider adding a 10s
>>> connection timeout? And if you did, what would you expect the 10s timeout
>>> to do?
>>> 
>>> a) We could fail a request if connection didn't complete within 10s. If
>> we
>>> always expect connections to succeed within 10s, this would be considered
>>> reasonable behaviour. But this would be changing the current default,
>> which
>>> allows you up to 30 seconds to connect and process a request.
>>> b) We retry the connection. What would be the point? We were waiting in a
>>> queue for connecting, but we decide to stop and join the back of the
>> queue.
>>> 
>>> We have KIP-612 that is proposing to throttle connection set up on the
>> one
>>> hand and this KIP that is dramatically reducing default connection
>> timeout
>>> on the other. Not sure if that is a good idea.
>>> 
>>> 
>>> On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:
>> 
>> 


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Colin McCabe <cm...@apache.org>.
On Tue, May 19, 2020, at 03:27, Rajini Sivaram wrote:
> Hi Colin,
> 
> I do agree about the `leastLoadedNode` case. My question was about the
> other cases where we are connecting to a specific node: fetch requests to
> leaders, produce requests to leaders, requests to group coordinators,
> requests to controller etc. It will be good to either quantify that these
> connections are less common and hence less critical in terms of performance
> in typical deployments or describe the impact on these connections from the
> proposed change in default behaviour. It is perfectly fine if connections
> to specific nodes don't benefit from the new timeout, I was looking for
> analysis which says they aren't made any worse either, especially in the
> context of other connection rate limiting/quota work we are proposing like
> KIP-612.
> 

Hi Rajini,

This is a fair point.  In the VOTE thread, I proposed using an exponential connection retry backoff to mitigate this problem.  So the first few retries would happen quickly, but later retries would take increasingly longer, keeping the number of reconnect attempts down.

(This is assuming we're trying to connect to a single fixed node, like the controller node)
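The exponential backoff Colin describes can be sketched as below (a toy model; the parameter names echo the real `reconnect.backoff.ms` / `reconnect.backoff.max.ms` client configs, but the helper and values are illustrative, and real clients also add jitter):

```python
def reconnect_backoffs(base_ms: int = 50, max_ms: int = 1000, attempts: int = 6):
    """Yield the delay before each reconnect attempt: the delay doubles
    per failure and is capped at max_ms, so a dead fixed node (e.g. the
    controller) is not hammered with connection attempts."""
    delay = base_ms
    for _ in range(attempts):
        yield delay
        delay = min(delay * 2, max_ms)

print(list(reconnect_backoffs()))  # [50, 100, 200, 400, 800, 1000]
```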

best,
Colin


>
> Regards,
> 
> Rajini
> 
> 
> On Mon, May 18, 2020 at 8:48 PM Colin McCabe <cm...@apache.org> wrote:
> 
> > Hi Rajini,
> >
> > I think the idea behind the 10 second default is that if you have three
> > Kafka nodes A, B, C (or whatever), and you can't talk to A within 10
> > seconds, you'll try again with B or C, and still have plenty of time left
> > over.  Whereas currently, if your connection hangs while trying to connect
> > to A, you're out of luck-- you'll just hang until the whole request timeout
> > is gone.  So while you could have tried a different node and succeeded, you
> > never got a chance to.
> >
> > So in the common case where you have other nodes that you can connect to,
> > we won't end up trying to reconnect to the same node over and over.  I'll
> > add some more comments in the vote thread.
> >
> > best,
> > Colin
> >
> >
> > On Fri, May 15, 2020, at 14:13, Rajini Sivaram wrote:
> > > Hi Cheng,
> > >
> > > I am fine with the rest of the KIP apart from the 10s default. If no one
> > > else has any concerns about this new default, let's go with it. Please go
> > > ahead and start vote.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Fri, May 15, 2020 at 8:21 PM Cheng Tan <ct...@confluent.io> wrote:
> > >
> > > > Dear Rajini,
> > > >
> > > >
> > > > Thanks for the reply.
> > > >
> > > > > We have a lot of these and I want to
> > > > > understand the benefits of the proposed timeout in this case alone.
> > We
> > > > > currently have a request timeout of 30s. Would you consider adding a
> > 10s
> > > > > connection timeout?
> > > >
> > > > A shorter timeout (10s) at the transportation level will help clients
> > > > detect dead nodes faster. “request.timeout.ms” is too general and
> > applies
> > > > to all the requests whose complexity at the application level varies.
> > It’s
> > > > risky to set “request.timeout.ms” to a lower value for detecting dead
> > > > nodes quicker because of the involvement of the application layer.
> > > >
> > > > Once “socket.connection.setup.timeout.ms” is hit, NetworkClient
> > > > will fail the request in the same way as it handles
> > > > “request.timeout.ms”. That is to say, the response will be
> > > > constructed upon a RetriableException. Producer, Consumer, and
> > > > KafkaAdminClient can then perform their retry logic as if a
> > > > request timeout had happened.
> > > >
> > > > > We have KIP-612 that is proposing to throttle connection set up on
> > the
> > > > one
> > > > > hand and this KIP that is dramatically reducing default connection
> > > > timeout
> > > > > on the other. Not sure if that is a good idea.
> > > >
> > > > The default of the broker connection creation rate limit is
> > Int.MaxValue.
> > > > The KIP also proposes per-IP throttle configuration. Thus, I don’t
> > expect
> > > > the combination of the broker connection throttle and a shorter client
> > > > transportation timeout will have a side effect.
> > > >
> > > > Do the reasons above make sense to you?
> > > >
> > > > Best, - Cheng
> > > >
> > > >
> > > >
> > > >
> > > > > On May 15, 2020, at 4:49 AM, Rajini Sivaram <rajinisivaram@gmail.com
> > >
> > > > wrote:
> > > > >
> > > > > Hi Cheng,
> > > > >
> > > > > Let me rephrase my question. Let's say we didn't have the case of
> > > > > leastLoadedNode. We are only talking about connections to a specific
> > node
> > > > > (i.e. leader or controller). We have a lot of these and I want to
> > > > > understand the benefits of the proposed timeout in this case alone.
> > We
> > > > > currently have a request timeout of 30s. Would you consider adding a
> > 10s
> > > > > connection timeout? And if you did, what would you expect the 10s
> > timeout
> > > > > to do?
> > > > >
> > > > > a) We could fail a request if connection didn't complete within 10s.
> > If
> > > > we
> > > > > always expect connections to succeed within 10s, this would be
> > considered
> > > > > reasonable behaviour. But this would be changing the current default,
> > > > which
> > > > > allows you up to 30 seconds to connect and process a request.
> > > > > b) We retry the connection. What would be the point? We were waiting
> > in a
> > > > > queue for connecting, but we decide to stop and join the back of the
> > > > queue.
> > > > >
> > > > > We have KIP-612 that is proposing to throttle connection set up on
> > the
> > > > one
> > > > > hand and this KIP that is dramatically reducing default connection
> > > > timeout
> > > > > on the other. Not sure if that is a good idea.
> > > > >
> > > > >
> > > > > On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Colin,

I do agree about the `leastLoadedNode` case. My question was about the
other cases where we are connecting to a specific node: fetch requests to
leaders, produce requests to leaders, requests to group coordinators,
requests to controller etc. It will be good to either quantify that these
connections are less common and hence less critical in terms of performance
in typical deployments or describe the impact on these connections from the
proposed change in default behaviour. It is perfectly fine if connections
to specific nodes don't benefit from the new timeout, I was looking for
analysis which says they aren't made any worse either, especially in the
context of other connection rate limiting/quota work we are proposing like
KIP-612.

Regards,

Rajini


On Mon, May 18, 2020 at 8:48 PM Colin McCabe <cm...@apache.org> wrote:

> Hi Rajini,
>
> I think the idea behind the 10 second default is that if you have three
> Kafka nodes A, B, C (or whatever), and you can't talk to A within 10
> seconds, you'll try again with B or C, and still have plenty of time left
> over.  Whereas currently, if your connection hangs while trying to connect
> to A, you're out of luck-- you'll just hang until the whole request timeout
> is gone.  So while you could have tried a different node and succeeded, you
> never got a chance to.
>
> So in the common case where you have other nodes that you can connect to,
> we won't end up trying to reconnect to the same node over and over.  I'll
> add some more comments in the vote thread.
>
> best,
> Colin
>
>
> On Fri, May 15, 2020, at 14:13, Rajini Sivaram wrote:
> > Hi Cheng,
> >
> > I am fine with the rest of the KIP apart from the 10s default. If no one
> > else has any concerns about this new default, let's go with it. Please go
> > ahead and start vote.
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Fri, May 15, 2020 at 8:21 PM Cheng Tan <ct...@confluent.io> wrote:
> >
> > > Dear Rajini,
> > >
> > >
> > > Thanks for the reply.
> > >
> > > > We have a lot of these and I want to
> > > > understand the benefits of the proposed timeout in this case alone.
> We
> > > > currently have a request timeout of 30s. Would you consider adding a
> 10s
> > > > connection timeout?
> > >
> > > A shorter timeout (10s) at the transportation level will help clients
> > > detect dead nodes faster. “request.timeout.ms” is too general and
> applies
> > > to all the requests whose complexity at the application level varies.
> It’s
> > > risky to set “request.timeout.ms” to a lower value for detecting dead
> > > nodes quicker because of the involvement of the application layer.
> > >
> > > Once “socket.connection.setup.timeout.ms” is hit, NetworkClient
> > > will fail the request in the same way as it handles
> > > “request.timeout.ms”. That is to say, the response will be
> > > constructed upon a RetriableException. Producer, Consumer, and
> > > KafkaAdminClient can then perform their retry logic as if a
> > > request timeout had happened.
> > >
> > > > We have KIP-612 that is proposing to throttle connection set up on
> the
> > > one
> > > > hand and this KIP that is dramatically reducing default connection
> > > timeout
> > > > on the other. Not sure if that is a good idea.
> > >
> > > The default of the broker connection creation rate limit is
> Int.MaxValue.
> > > The KIP also proposes per-IP throttle configuration. Thus, I don’t
> expect
> > > the combination of the broker connection throttle and a shorter client
> > > transportation timeout will have a side effect.
> > >
> > > Do the reasons above make sense to you?
> > >
> > > Best, - Cheng
> > >
> > >
> > >
> > >
> > > > On May 15, 2020, at 4:49 AM, Rajini Sivaram <rajinisivaram@gmail.com
> >
> > > wrote:
> > > >
> > > > Hi Cheng,
> > > >
> > > > Let me rephrase my question. Let's say we didn't have the case of
> > > > leastLoadedNode. We are only talking about connections to a specific
> node
> > > > (i.e. leader or controller). We have a lot of these and I want to
> > > > understand the benefits of the proposed timeout in this case alone.
> We
> > > > currently have a request timeout of 30s. Would you consider adding a
> 10s
> > > > connection timeout? And if you did, what would you expect the 10s
> timeout
> > > > to do?
> > > >
> > > > a) We could fail a request if connection didn't complete within 10s.
> If
> > > we
> > > > always expect connections to succeed within 10s, this would be
> considered
> > > > reasonable behaviour. But this would be changing the current default,
> > > which
> > > > allows you up to 30 seconds to connect and process a request.
> > > > b) We retry the connection. What would be the point? We were waiting
> in a
> > > > queue for connecting, but we decide to stop and join the back of the
> > > queue.
> > > >
> > > > We have KIP-612 that is proposing to throttle connection set up on
> the
> > > one
> > > > hand and this KIP that is dramatically reducing default connection
> > > timeout
> > > > on the other. Not sure if that is a good idea.
> > > >
> > > >
> > > > On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:
> > >
> > >
> >
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Colin McCabe <cm...@apache.org>.
Hi Rajini,

I think the idea behind the 10 second default is that if you have three Kafka nodes A, B, C (or whatever), and you can't talk to A within 10 seconds, you'll try again with B or C, and still have plenty of time left over.  Whereas currently, if your connection hangs while trying to connect to A, you're out of luck-- you'll just hang until the whole request timeout is gone.  So while you could have tried a different node and succeeded, you never got a chance to.

So in the common case where you have other nodes that you can connect to, we won't end up trying to reconnect to the same node over and over.  I'll add some more comments in the vote thread.
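Colin's failover reasoning can be sketched with a bounded per-node connect timeout (a minimal illustration using the standard library; `connect_any` and the 10 s default are assumptions, not Kafka client code):

```python
import socket

def connect_any(bootstrap, timeout_s=10.0, connect=socket.create_connection):
    """Try each bootstrap server in turn with a bounded connect timeout,
    so a hung connection to one node costs at most timeout_s before the
    client moves on to the next node instead of blocking indefinitely."""
    errors = {}
    for host, port in bootstrap:
        try:
            return connect((host, port), timeout=timeout_s)
        except OSError as exc:  # socket.timeout is a subclass of OSError
            errors[(host, port)] = exc
    raise ConnectionError(f"all bootstrap servers failed: {errors}")
```

The `connect` parameter is injectable only to make the sketch easy to test; in real use it defaults to `socket.create_connection`.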

best,
Colin


On Fri, May 15, 2020, at 14:13, Rajini Sivaram wrote:
> Hi Cheng,
> 
> I am fine with the rest of the KIP apart from the 10s default. If no one
> else has any concerns about this new default, let's go with it. Please go
> ahead and start vote.
> 
> Regards,
> 
> Rajini
> 
> 
> On Fri, May 15, 2020 at 8:21 PM Cheng Tan <ct...@confluent.io> wrote:
> 
> > Dear Rajini,
> >
> >
> > Thanks for the reply.
> >
> > > We have a lot of these and I want to
> > > understand the benefits of the proposed timeout in this case alone. We
> > > currently have a request timeout of 30s. Would you consider adding a 10s
> > > connection timeout?
> >
> > A shorter timeout (10s) at the transportation level will help clients
> > detect dead nodes faster. “request.timeout.ms” is too general and applies
> > to all the requests whose complexity at the application level varies.  It’s
> > risky to set “request.timeout.ms” to a lower value for detecting dead
> > nodes quicker because of the involvement of the application layer.
> >
> > Once “socket.connection.setup.timeout.ms” is hit, NetworkClient will fail
> > the request in the same way as it handles “request.timeout.ms”.
> > That is to say, the response will be constructed upon a RetriableException.
> > Producer, Consumer, and KafkaAdminClient can then perform their retry logic
> > as if a request timeout had happened.
> >
> > > We have KIP-612 that is proposing to throttle connection set up on the
> > one
> > > hand and this KIP that is dramatically reducing default connection
> > timeout
> > > on the other. Not sure if that is a good idea.
> >
> > The default of the broker connection creation rate limit is Int.MaxValue.
> > The KIP also proposes per-IP throttle configuration. Thus, I don’t expect
> > the combination of the broker connection throttle and a shorter client
> > transportation timeout will have a side effect.
> >
> > Do the reasons above make sense to you?
> >
> > Best, - Cheng
> >
> >
> >
> >
> > > On May 15, 2020, at 4:49 AM, Rajini Sivaram <ra...@gmail.com>
> > wrote:
> > >
> > > Hi Cheng,
> > >
> > > Let me rephrase my question. Let's say we didn't have the case of
> > > leastLoadedNode. We are only talking about connections to a specific node
> > > (i.e. leader or controller). We have a lot of these and I want to
> > > understand the benefits of the proposed timeout in this case alone. We
> > > currently have a request timeout of 30s. Would you consider adding a 10s
> > > connection timeout? And if you did, what would you expect the 10s timeout
> > > to do?
> > >
> > > a) We could fail a request if connection didn't complete within 10s. If
> > we
> > > always expect connections to succeed within 10s, this would be considered
> > > reasonable behaviour. But this would be changing the current default,
> > which
> > > allows you up to 30 seconds to connect and process a request.
> > > b) We retry the connection. What would be the point? We were waiting in a
> > > queue for connecting, but we decide to stop and join the back of the
> > queue.
> > >
> > > We have KIP-612 that is proposing to throttle connection set up on the
> > one
> > > hand and this KIP that is dramatically reducing default connection
> > timeout
> > > on the other. Not sure if that is a good idea.
> > >
> > >
> > > On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:
> >
> >
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Cheng,

I am fine with the rest of the KIP apart from the 10s default. If no one
else has any concerns about this new default, let's go with it. Please go
ahead and start vote.

Regards,

Rajini


On Fri, May 15, 2020 at 8:21 PM Cheng Tan <ct...@confluent.io> wrote:

> Dear Rajini,
>
>
> Thanks for the reply.
>
> > We have a lot of these and I want to
> > understand the benefits of the proposed timeout in this case alone. We
> > currently have a request timeout of 30s. Would you consider adding a 10s
> > connection timeout?
>
> A shorter timeout (10s) at the transportation level will help clients
> detect dead nodes faster. “request.timeout.ms” is too general and applies
> to all the requests whose complexity at the application level varies.  It’s
> risky to set “request.timeout.ms” to a lower value for detecting dead
> nodes quicker because of the involvement of the application layer.
>
> Once “socket.connection.setup.timeout.ms” is hit, NetworkClient will fail
> the request in the same way as it handles “request.timeout.ms”.
> That is to say, the response will be constructed upon a RetriableException.
> Producer, Consumer, and KafkaAdminClient can then perform their retry logic
> as if a request timeout had happened.
>
> > We have KIP-612 that is proposing to throttle connection set up on the
> one
> > hand and this KIP that is dramatically reducing default connection
> timeout
> > on the other. Not sure if that is a good idea.
>
> The default of the broker connection creation rate limit is Int.MaxValue.
> The KIP also proposes per-IP throttle configuration. Thus, I don’t expect
> the combination of the broker connection throttle and a shorter client
> transportation timeout will have a side effect.
>
> Do the reasons above make sense to you?
>
> Best, - Cheng
>
>
>
>
> > On May 15, 2020, at 4:49 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
> >
> > Hi Cheng,
> >
> > Let me rephrase my question. Let's say we didn't have the case of
> > leastLoadedNode. We are only talking about connections to a specific node
> > (i.e. leader or controller). We have a lot of these and I want to
> > understand the benefits of the proposed timeout in this case alone. We
> > currently have a request timeout of 30s. Would you consider adding a 10s
> > connection timeout? And if you did, what would you expect the 10s timeout
> > to do?
> >
> > a) We could fail a request if connection didn't complete within 10s. If
> we
> > always expect connections to succeed within 10s, this would be considered
> > reasonable behaviour. But this would be changing the current default,
> which
> > allows you up to 30 seconds to connect and process a request.
> > b) We retry the connection. What would be the point? We were waiting in a
> > queue for connecting, but we decide to stop and join the back of the
> queue.
> >
> > We have KIP-612 that is proposing to throttle connection set up on the
> one
> > hand and this KIP that is dramatically reducing default connection
> timeout
> > on the other. Not sure if that is a good idea.
> >
> >
> > On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:
>
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Dear Rajini,


Thanks for the reply. 

> We have a lot of these and I want to
> understand the benefits of the proposed timeout in this case alone. We
> currently have a request timeout of 30s. Would you consider adding a 10s
> connection timeout? 

A shorter timeout (10s) at the transportation level will help clients detect dead nodes faster. “request.timeout.ms” is too general and applies to all the requests whose complexity at the application level varies.  It’s risky to set “request.timeout.ms” to a lower value for detecting dead nodes quicker because of the involvement of the application layer.

Once “socket.connection.setup.timeout.ms” is hit, NetworkClient will fail the request in the same way as it handles “request.timeout.ms”. That is to say, the response will be constructed upon a RetriableException. Producer, Consumer, and KafkaAdminClient can then perform their retry logic as if a request timeout had happened.
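The failure handling described above can be sketched like this (a simplified model; `RetriableError`, `send_with_retry`, and the node-picking are illustrative, not the actual NetworkClient code):

```python
class RetriableError(Exception):
    """Stand-in for Kafka's retriable exceptions; in this model both a
    request timeout and a connection-setup timeout surface this way."""

def send_with_retry(send, nodes, max_attempts=3):
    """Fail the request with a retriable error on timeout and let the
    ordinary retry logic pick another node, mirroring how an expired
    request.timeout.ms is handled."""
    last_error = None
    for attempt in range(max_attempts):
        node = nodes[attempt % len(nodes)]
        try:
            return send(node)
        except RetriableError as exc:
            last_error = exc
    raise last_error
```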

> We have KIP-612 that is proposing to throttle connection set up on the one
> hand and this KIP that is dramatically reducing default connection timeout
> on the other. Not sure if that is a good idea.

The default of the broker connection creation rate limit is Int.MaxValue. The KIP also proposes per-IP throttle configuration. Thus, I don’t expect the combination of the broker connection throttle and a shorter client transportation timeout will have a side effect.

Do the reasons above make sense to you? 

Best, - Cheng




> On May 15, 2020, at 4:49 AM, Rajini Sivaram <ra...@gmail.com> wrote:
> 
> Hi Cheng,
> 
> Let me rephrase my question. Let's say we didn't have the case of
> leastLoadedNode. We are only talking about connections to a specific node
> (i.e. leader or controller). We have a lot of these and I want to
> understand the benefits of the proposed timeout in this case alone. We
> currently have a request timeout of 30s. Would you consider adding a 10s
> connection timeout? And if you did, what would you expect the 10s timeout
> to do?
> 
> a) We could fail a request if connection didn't complete within 10s. If we
> always expect connections to succeed within 10s, this would be considered
> reasonable behaviour. But this would be changing the current default, which
> allows you up to 30 seconds to connect and process a request.
> b) We retry the connection. What would be the point? We were waiting in a
> queue for connecting, but we decide to stop and join the back of the queue.
> 
> We have KIP-612 that is proposing to throttle connection set up on the one
> hand and this KIP that is dramatically reducing default connection timeout
> on the other. Not sure if that is a good idea.
> 
> 
> On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Cheng,

Let me rephrase my question. Let's say we didn't have the case of
leastLoadedNode. We are only talking about connections to a specific node
(i.e. leader or controller). We have a lot of these and I want to
understand the benefits of the proposed timeout in this case alone. We
currently have a request timeout of 30s. Would you consider adding a 10s
connection timeout? And if you did, what would you expect the 10s timeout
to do?

a) We could fail a request if connection didn't complete within 10s. If we
always expect connections to succeed within 10s, this would be considered
reasonable behaviour. But this would be changing the current default, which
allows you up to 30 seconds to connect and process a request.
b) We retry the connection. What would be the point? We were waiting in a
queue for connecting, but we decide to stop and join the back of the queue.

We have KIP-612 that is proposing to throttle connection set up on the one
hand and this KIP that is dramatically reducing default connection timeout
on the other. Not sure if that is a good idea.


On Fri, May 15, 2020 at 1:26 AM Cheng Tan <ct...@confluent.io> wrote:

> Hi Rajini,
>
> Thanks for the reply.
>
> > Not sure 10s is a good default because it unnecessarily times out
> > connections, only to attempt re-connecting to the same broker (except in
> > the leastLoadedNode case where it would be useful to have a lower
> timeout).
>
>
> The underlying logic for a connection to turn from “connecting” to
> “connected” is finishing the TCP three-way handshake (done by
> socketChannel.connect()). So we can say the 10 seconds timeout is for the
> transportation layer three-way handshake.
>
> The request timeout includes more besides the handshakes, such as
> authentication, server processing the request, and server sending back the
> response.
>
> Thus I think setting the default of socket.connection.setup.timeout.ms a
> smaller value than request.timeout.ms is reasonable. Do you have any
> specific reason in mind that 10s is too short?
>
> Best, -Cheng Tan
>
>
> > On May 14, 2020, at 6:44 AM, Rajini Sivaram <ra...@gmail.com> wrote:
> >
> > Hi Cheng,
> >
> > 1) Thanks for the update,  the KIP now says `
> > socket.connections.setup.timeout.ms*`*, which sounds good.
> >
> > 2) Not sure 10s is a good default because it unnecessarily times out
> > connections, only to attempt re-connecting to the same broker (except in
> > the leastLoadedNode case where it would be useful to have a lower
> timeout).
> > Couldn't we just use 30s (request timeout) as the default value,
> retaining
> > the current behaviour? Or alternatively, only apply the timeout where it
> > makes sense to have a lower timeout (i.e. a timeout for leastLoadedNode)?
> >
> > Regards,
> >
> > Rajini
> >
> >> On Thu, May 14, 2020 at 5:00 AM Cheng Tan <ct...@confluent.io> wrote:
> >>
> >> Hi Rajini,
> >>
> >> Thanks for the comments.
> >>
> >>> I think
> >>> they started off as connection timeouts but now include authentication
> >> time
> >>> as well. Have we considered using similar configs for this case?
> >>
> >>
> >> The new config I proposed is focusing on the connections to unreachable
> >> servers. The timeout count won’t involve the authentication. I think
> >> “socket.” prefix makes sense. I’ll change it. Colin mentioned that adding
> >> the key word “setup” can help people understand that this config only
> >> applies to the connection setup. What about “socket.connection.setup.ms
> ”?
> >>
> >>> The KIP proposes 10s as the default. What does this mean for typical
> >>> connections like a produce request going to the leader?
> >>
> >> The new timeout config applies to each connection. It’s at the
> >> NetworkClient level and won’t consider the underlying connection logic.
> >> Specifically, by default, every connection will have 10 seconds to
> become
> >> “connected” from “connecting”, which implies the corresponding socket
> >> channel is now connected (SocketChannel.finishConnect() returns true), no
> >> matter what the request logic and abstraction is.
> >>
> >> Please let me know if these make sense to you or if you have more
> >> thoughts. Thank you.
> >>
> >> Best, -Cheng
> >>
> >>>> On May 13, 2020, at 7:11 AM, Rajini Sivaram <ra...@gmail.com> wrote:
> >>>
> >>> Hi Cheng,
> >>>
> >>> Thanks for the KIP, sounds like a good improvement. A couple of
> comments:
> >>>
> >>> 1) We currently have client connection timeouts on the broker with
> >> configs
> >>> named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I
> >> think
> >>> they started off as connection timeouts but now include authentication
> >> time
> >>> as well. Have we considered using similar configs for this case? We may
> >>> want to prefix the new config with `socket.` anyway - something along
> the
> >>> lines of `socket.connection.timeout.ms` if it is just the connection
> >> time.
> >>>
> >>> 2) The KIP proposes 10s as the default. What does this mean for typical
> >>> connections like a produce request going to the leader? Instead of one
> >>> connection attempt to the leader, we want three separate connection
> >>> attempts within the request timeout to the leader?
> >>>
> >>> Regards,
> >>>
> >>> Rajini
> >>>
> >>>
> >>>> On Thu, May 7, 2020 at 11:51 PM Jose Garcia Sancio <
> >> jsancio@confluent.io>
> >>>> wrote:
> >>>> Cheng,
> >>>> Thanks for the KIP and the detailed proposal section. LGTM!
> >>>>>> On Thu, May 7, 2020 at 3:38 PM Cheng Tan <ct...@confluent.io> wrote:
> >>>>>> I think more about the potential wider use cases, I modified the
> >>>> proposal to target all the connection. Thanks.
> >>>>> - Best, - Cheng Tan
> >>>>>> On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
> >>>>>> Hi Colin,
> >>>>>> Sorry for the confusion. I’m proposing to implement timeout in the
> >>>> NetworkClient.leastLoadedNode() when iterating all the cached node.
> The
> >>>> alternative I can think is to implement the timeout in
> >> NetworkClient.poll()
> >>>>>> I’d prefer to implement in the leastLoadedNode(). Here’re the
> reasons:
> >>>>>> Usually when clients send a request, they will ask the network
> >>>> client to send the request to a specific node. In this case, the
> >>>> connection.setup.timeout won’t matter too much because the client
> >> doesn’t
> >>>> want to try other nodes for that specific request. The request level
> >>>> timeout would be enough. The metadata fetcher fetches the nodes status
> >>>> periodically so the clients can reassign the request to another node
> >> after
> >>>> timeout.
> >>>>>> Consumer, producer, and AdminClient are all using leastLoadedNode()
> >>>> for metadata fetch, where the connection setup timeout can play an
> >>>> important role. Unlike other requests, which can refer to the metadata for
> node
> >>>> condition, the metadata requests can only blindly choose a node for
> >> retry
> >>>> in the worst scenario. We want to make sure the client can get the
> >> metadata
> >>>> smoothly and as soon as possible. As a result, we need this
> >>>> connection.setup.timeout.
> >>>>>> Implementing the timeout in poll() or anywhere else might need an
> >>>> extra iteration of all nodes, which might downgrade the network client
> >>>> performance.
> >>>>>> I also updated the KIP content and KIP status. Please let me know if
> >>>> the above ideas make sense. Thanks.
> >>>>>> Best, - Cheng Tan
> >>>>>>> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org
> >> <mailto:
> >>>> cmccabe@apache.org>> wrote:
> >>>>>>> Hi Cheng,
> >>>>>>> On the KIP page, it lists this KIP as "draft."  It seems like
> "under
> >>>> discussion" is appropriate here, right?
> >>>>>>>> Currently, the initial socket connection timeout is depending on
> >>>> Linux kernel setting
> >>>>>>>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1)
> - 1
> >>>> seconds. For the
> >>>>>>>> reasons below, we want to control the client-side socket timeout
> >>>> directly using
> >>>>>>>> configuration files
> >>>>>>> Linux is just one example of an OS that Kafka could run on, right?
> >>>> You could also be running on MacOS, for example.
> >>>>>>>> I'm proposing to do a lazy socket connection time out. That is, we
> >>>> only check if
> >>>>>>>> we need to timeout a socket when we consider the corresponding
> node
> >>>> as a
> >>>>>>>> candidate in the node provider.
> >>>>>>> The NodeProvider is an AdminClient abstraction, right?  Why
> wouldn't
> >>>> we implement a connection setup timeout for all clients, not just
> >>>> AdminClient?
> >>>>>>> best,
> >>>>>>> Colin
> >>>>>>> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
> >>>>>>>> Hmm.  A big part of the reason behind the KIP is that the default
> >>>>>>>> connection timeout behavior of the OS doesn't work for Kafka,
> right?
> >>>>>>>> For example, on Linux, if we wait 127 seconds for a connection
> >>>> attempt
> >>>>>>>> to time out, we won't get a chance to make another attempt in most
> >>>>>>>> cases.  So I think it makes sense to set a shorter default.
> >>>>>>>> best,
> >>>>>>>> Colin
> >>>>>>>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
> >>>>>>>>> Thanks for the KIP Cheng,
> >>>>>>>>>> The default value will be 10 seconds.
> >>>>>>>>> I think we should make the default the current behavior. Meaning
> >> the
> >>>>>>>>> default should leverage the default connect timeout from the
> >>>> operating
> >>>>>>>>> system.
> >>>>>>>>>> Proposed Changes
> >>>>>>>>> I don't fully understand this section. It seems like it is mainly
> >>>>>>>>> focused on the problem with the current implementation. Can you
> >>>>>>>>> explain how the proposed changes solve the problem?
> >>>>>>>>> Thanks.
> >>>>>>>>> --
> >>>>>>>>> -Jose
> >>>> --
> >>>> -Jose
> >>
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Hi Rajini,

Thanks for the reply. 

> Not sure 10s is a good default because it unnecessarily times out
> connections, only to attempt re-connecting to the same broker (except in
> the leastLoadedNode case where it would be useful to have a lower timeout).


The underlying logic for a connection to turn from “connecting” to “connected” is finishing the TCP three-way handshake (initiated by SocketChannel.connect()). So we can say the 10-second timeout covers only the transport-layer three-way handshake.

The request timeout covers much more than the handshake: authentication, the server processing the request, and the server sending back the response. 

Thus I think setting the default of socket.connection.setup.timeout.ms to a smaller value than request.timeout.ms is reasonable. Do you have any specific reason in mind why 10s is too short? 
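To make the separation concrete, here is a sketch of the bookkeeping this implies. The method name is hypothetical (the actual NetworkClient change may differ); the point is that the client only compares time spent in the “connecting” state against the new config, independently of request.timeout.ms:

```java
public class ConnectionSetupDeadline {
    /**
     * Returns true if a channel that entered the "connecting" state at
     * connectStartMs should be considered timed out at nowMs.
     * setupTimeoutMs corresponds to the proposed
     * socket.connection.setup.timeout.ms config.
     */
    public static boolean isSetupTimedOut(long connectStartMs, long nowMs,
                                          long setupTimeoutMs) {
        return nowMs - connectStartMs >= setupTimeoutMs;
    }

    public static void main(String[] args) {
        long start = 0L;
        // With the proposed 10s default, the handshake attempt is abandoned
        // at 10s even though request.timeout.ms (30s) has not elapsed.
        System.out.println(isSetupTimedOut(start, 9_000L, 10_000L));  // false
        System.out.println(isSetupTimedOut(start, 10_000L, 10_000L)); // true
    }
}
```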

Best, -Cheng Tan 


> On May 14, 2020, at 6:44 AM, Rajini Sivaram <ra...@gmail.com> wrote:
> 
> Hi Cheng,
> 
> 1) Thanks for the update,  the KIP now says `
> socket.connections.setup.timeout.ms*`*, which sounds good.
> 
> 2) Not sure 10s is a good default because it unnecessarily times out
> connections, only to attempt re-connecting to the same broker (except in
> the leastLoadedNode case where it would be useful to have a lower timeout).
> Couldn't we just use 30s (request timeout) as the default value, retaining
> the current behaviour? Or alternatively, only apply the timeout where it
> makes sense to have a lower timeout (i.e. a timeout for leastLoadedNode)?
> 
> Regards,
> 
> Rajini
> 
>> On Thu, May 14, 2020 at 5:00 AM Cheng Tan <ct...@confluent.io> wrote:
>> 
>> Hi Rajini,
>> 
>> Thanks for the comments.
>> 
>>> I think
>>> they started off as connection timeouts but now include authentication
>> time
>>> as well. Have we considered using similar configs for this case?
>> 
>> 
>> The new config I proposed is focusing on the connections to unreachable
>> servers. The timeout count won’t involve the authentication. I think
>> “socket.” prefix makes sense. I’ll change it. Colin mentioned that adding
>> the key word “setup” can help people understand that this config only
>> applies to the connection setup. What about “socket.connection.setup.ms”?
>> 
>>> The KIP proposes 10s as the default. What does this mean for typical
>>> connections like a produce request going to the leader?
>> 
>> The new timeout config applies to each connection. It’s at the
>> NetworkClient level and won’t consider the underlying connection logic.
>> Specifically, by default, every connection will have 10 seconds to become
>> “connected” from “connecting”, which implies the corresponding socket
>> channel is now connected (SocketChannel.finishConnect() returns true), no
>> matter what the request logic and abstraction is.
>> 
>> Please let me know if these make sense to you or if you have more
>> thoughts. Thank you.
>> 
>> Best, -Cheng
>> 
>>>> On May 13, 2020, at 7:11 AM, Rajini Sivaram <ra...@gmail.com> wrote:
>>> 
>>> Hi Cheng,
>>> 
>>> Thanks for the KIP, sounds like a good improvement. A couple of comments:
>>> 
>>> 1) We currently have client connection timeouts on the broker with
>> configs
>>> named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I
>> think
>>> they started off as connection timeouts but now include authentication
>> time
>>> as well. Have we considered using similar configs for this case? We may
>>> want to prefix the new config with `socket.` anyway - something along the
>>> lines of `socket.connection.timeout.ms` if it is just the connection
>> time.
>>> 
>>> 2) The KIP proposes 10s as the default. What does this mean for typical
>>> connections like a produce request going to the leader? Instead of one
>>> connection attempt to the leader, we want three separate connection
>>> attempts within the request timeout to the leader?
>>> 
>>> Regards,
>>> 
>>> Rajini
>>> 
>>> 
>>>> On Thu, May 7, 2020 at 11:51 PM Jose Garcia Sancio <
>> jsancio@confluent.io>
>>>> wrote:
>>>> Cheng,
>>>> Thanks for the KIP and the detailed proposal section. LGTM!
>>>>>> On Thu, May 7, 2020 at 3:38 PM Cheng Tan <ct...@confluent.io> wrote:
>>>>>> I think more about the potential wider use cases, I modified the
>>>> proposal to target all the connection. Thanks.
>>>>> - Best, - Cheng Tan
>>>>>> On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
>>>>>> Hi Colin,
>>>>>> Sorry for the confusion. I’m proposing to implement timeout in the
>>>> NetworkClient.leastLoadedNode() when iterating all the cached node. The
>>>> alternative I can think is to implement the timeout in
>> NetworkClient.poll()
>>>>>> I’d prefer to implement in the leastLoadedNode(). Here’re the reasons:
>>>>>> Usually when clients send a request, they will ask the network
>>>> client to send the request to a specific node. In this case, the
>>>> connection.setup.timeout won’t matter too much because the client
>> doesn’t
>>>> want to try other nodes for that specific request. The request level
>>>> timeout would be enough. The metadata fetcher fetches the nodes status
>>>> periodically so the clients can reassign the request to another node
>> after
>>>> timeout.
>>>>>> Consumer, producer, and AdminClient are all using leastLoadedNode()
>>>> for metadata fetch, where the connection setup timeout can play an
>>>> important role. Unlike other requests, which can refer to the metadata for node
>>>> condition, the metadata requests can only blindly choose a node for
>> retry
>>>> in the worst scenario. We want to make sure the client can get the
>> metadata
>>>> smoothly and as soon as possible. As a result, we need this
>>>> connection.setup.timeout.
>>>>>> Implementing the timeout in poll() or anywhere else might need an
>>>> extra iteration of all nodes, which might downgrade the network client
>>>> performance.
>>>>>> I also updated the KIP content and KIP status. Please let me know if
>>>> the above ideas make sense. Thanks.
>>>>>> Best, - Cheng Tan
>>>>>>> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org
>> <mailto:
>>>> cmccabe@apache.org>> wrote:
>>>>>>> Hi Cheng,
>>>>>>> On the KIP page, it lists this KIP as "draft."  It seems like "under
>>>> discussion" is appropriate here, right?
>>>>>>>> Currently, the initial socket connection timeout is depending on
>>>> Linux kernel setting
>>>>>>>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1
>>>> seconds. For the
>>>>>>>> reasons below, we want to control the client-side socket timeout
>>>> directly using
>>>>>>>> configuration files
>>>>>>> Linux is just one example of an OS that Kafka could run on, right?
>>>> You could also be running on MacOS, for example.
>>>>>>>> I'm proposing to do a lazy socket connection time out. That is, we
>>>> only check if
>>>>>>>> we need to timeout a socket when we consider the corresponding node
>>>> as a
>>>>>>>> candidate in the node provider.
>>>>>>> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't
>>>> we implement a connection setup timeout for all clients, not just
>>>> AdminClient?
>>>>>>> best,
>>>>>>> Colin
>>>>>>> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
>>>>>>>> Hmm.  A big part of the reason behind the KIP is that the default
>>>>>>>> connection timeout behavior of the OS doesn't work for Kafka, right?
>>>>>>>> For example, on Linux, if we wait 127 seconds for a connection
>>>> attempt
>>>>>>>> to time out, we won't get a chance to make another attempt in most
>>>>>>>> cases.  So I think it makes sense to set a shorter default.
>>>>>>>> best,
>>>>>>>> Colin
>>>>>>>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
>>>>>>>>> Thanks for the KIP Cheng,
>>>>>>>>>> The default value will be 10 seconds.
>>>>>>>>> I think we should make the default the current behavior. Meaning
>> the
>>>>>>>>> default should leverage the default connect timeout from the
>>>> operating
>>>>>>>>> system.
>>>>>>>>>> Proposed Changes
>>>>>>>>> I don't fully understand this section. It seems like it is mainly
>>>>>>>>> focused on the problem with the current implementation. Can you
>>>>>>>>> explain how the proposed changes solve the problem?
>>>>>>>>> Thanks.
>>>>>>>>> --
>>>>>>>>> -Jose
>>>> --
>>>> -Jose
>> 

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Cheng,

1) Thanks for the update,  the KIP now says `
socket.connections.setup.timeout.ms*`*, which sounds good.

2) Not sure 10s is a good default because it unnecessarily times out
connections, only to attempt re-connecting to the same broker (except in
the leastLoadedNode case where it would be useful to have a lower timeout).
Couldn't we just use 30s (request timeout) as the default value, retaining
the current behaviour? Or alternatively, only apply the timeout where it
makes sense to have a lower timeout (i.e. a timeout for leastLoadedNode)?

Regards,

Rajini

On Thu, May 14, 2020 at 5:00 AM Cheng Tan <ct...@confluent.io> wrote:

> Hi Rajini,
>
> Thanks for the comments.
>
> > I think
> > they started off as connection timeouts but now include authentication
> time
> > as well. Have we considered using similar configs for this case?
>
>
> The new config I proposed is focusing on the connections to unreachable
> servers. The timeout count won’t involve the authentication. I think
> “socket.” prefix makes sense. I’ll change it. Colin mentioned that adding
> the key word “setup” can help people understand that this config only
> applies to the connection setup. What about “socket.connection.setup.ms”?
>
> > The KIP proposes 10s as the default. What does this mean for typical
> > connections like a produce request going to the leader?
>
> The new timeout config applies to each connection. It’s at the
> NetworkClient level and won’t consider the underlying connection logic.
> Specifically, by default, every connection will have 10 seconds to become
> “connected” from “connecting”, which implies the corresponding socket
> channel is now connected (SocketChannel.finishConnect() returns true), no
> matter what the request logic and abstraction is.
>
> Please let me know if these make sense to you or if you have more
> thoughts. Thank you.
>
> Best, -Cheng
>
> > On May 13, 2020, at 7:11 AM, Rajini Sivaram <ra...@gmail.com> wrote:
> >
> > Hi Cheng,
> >
> > Thanks for the KIP, sounds like a good improvement. A couple of comments:
> >
> > 1) We currently have client connection timeouts on the broker with
> configs
> > named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I
> think
> > they started off as connection timeouts but now include authentication
> time
> > as well. Have we considered using similar configs for this case? We may
> > want to prefix the new config with `socket.` anyway - something along the
> > lines of `socket.connection.timeout.ms` if it is just the connection
> time.
> >
> > 2) The KIP proposes 10s as the default. What does this mean for typical
> > connections like a produce request going to the leader? Instead of one
> > connection attempt to the leader, we want three separate connection
> > attempts within the request timeout to the leader?
> >
> > Regards,
> >
> > Rajini
> >
> >
> >> On Thu, May 7, 2020 at 11:51 PM Jose Garcia Sancio <
> jsancio@confluent.io>
> >> wrote:
> >> Cheng,
> >> Thanks for the KIP and the detailed proposal section. LGTM!
> >>>> On Thu, May 7, 2020 at 3:38 PM Cheng Tan <ct...@confluent.io> wrote:
> >>>> I think more about the potential wider use cases, I modified the
> >> proposal to target all the connection. Thanks.
> >>> - Best, - Cheng Tan
> >>>> On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
> >>>> Hi Colin,
> >>>> Sorry for the confusion. I’m proposing to implement timeout in the
> >> NetworkClient.leastLoadedNode() when iterating all the cached node. The
> >> alternative I can think is to implement the timeout in
> NetworkClient.poll()
> >>>> I’d prefer to implement in the leastLoadedNode(). Here’re the reasons:
> >>>> Usually when clients send a request, they will ask the network
> >> client to send the request to a specific node. In this case, the
> >> connection.setup.timeout won’t matter too much because the client
> doesn’t
> >> want to try other nodes for that specific request. The request level
> >> timeout would be enough. The metadata fetcher fetches the nodes status
> >> periodically so the clients can reassign the request to another node
> after
> >> timeout.
> >>>> Consumer, producer, and AdminClient are all using leastLoadedNode()
> >> for metadata fetch, where the connection setup timeout can play an
> >> important role. Unlike other requests, which can refer to the metadata for node
> >> condition, the metadata requests can only blindly choose a node for
> retry
> >> in the worst scenario. We want to make sure the client can get the
> metadata
> >> smoothly and as soon as possible. As a result, we need this
> >> connection.setup.timeout.
> >>>> Implementing the timeout in poll() or anywhere else might need an
> >> extra iteration of all nodes, which might downgrade the network client
> >> performance.
> >>>> I also updated the KIP content and KIP status. Please let me know if
> >> the above ideas make sense. Thanks.
> >>>> Best, - Cheng Tan
> >>>>> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org
> <mailto:
> >> cmccabe@apache.org>> wrote:
> >>>>> Hi Cheng,
> >>>>> On the KIP page, it lists this KIP as "draft."  It seems like "under
> >> discussion" is appropriate here, right?
> >>>>>> Currently, the initial socket connection timeout is depending on
> >> Linux kernel setting
> >>>>>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1
> >> seconds. For the
> >>>>>> reasons below, we want to control the client-side socket timeout
> >> directly using
> >>>>>> configuration files
> >>>>> Linux is just one example of an OS that Kafka could run on, right?
> >> You could also be running on MacOS, for example.
> >>>>>> I'm proposing to do a lazy socket connection time out. That is, we
> >> only check if
> >>>>>> we need to timeout a socket when we consider the corresponding node
> >> as a
> >>>>>> candidate in the node provider.
> >>>>> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't
> >> we implement a connection setup timeout for all clients, not just
> >> AdminClient?
> >>>>> best,
> >>>>> Colin
> >>>>> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
> >>>>>> Hmm.  A big part of the reason behind the KIP is that the default
> >>>>>> connection timeout behavior of the OS doesn't work for Kafka, right?
> >>>>>> For example, on Linux, if we wait 127 seconds for a connection
> >> attempt
> >>>>>> to time out, we won't get a chance to make another attempt in most
> >>>>>> cases.  So I think it makes sense to set a shorter default.
> >>>>>> best,
> >>>>>> Colin
> >>>>>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
> >>>>>>> Thanks for the KIP Cheng,
> >>>>>>>> The default value will be 10 seconds.
> >>>>>>> I think we should make the default the current behavior. Meaning
> the
> >>>>>>> default should leverage the default connect timeout from the
> >> operating
> >>>>>>> system.
> >>>>>>>> Proposed Changes
> >>>>>>> I don't fully understand this section. It seems like it is mainly
> >>>>>>> focused on the problem with the current implementation. Can you
> >>>>>>> explain how the proposed changes solve the problem?
> >>>>>>> Thanks.
> >>>>>>> --
> >>>>>>> -Jose
> >> --
> >> -Jose
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Hi Rajini,

Thanks for the comments. 

> I think
> they started off as connection timeouts but now include authentication time
> as well. Have we considered using similar configs for this case?


The new config I proposed focuses on connections to unreachable servers. The timeout won’t cover authentication. I think the “socket.” prefix makes sense; I’ll change it. Colin mentioned that adding the keyword “setup” can help people understand that this config only applies to connection setup. What about “socket.connection.setup.ms”?

> The KIP proposes 10s as the default. What does this mean for typical
> connections like a produce request going to the leader?

The new timeout config applies to each connection. It’s at the NetworkClient level and won’t consider the underlying request logic. Specifically, by default, every connection will have 10 seconds to move from “connecting” to “connected”, meaning the corresponding socket channel has finished connecting (SocketChannel.finishConnect() returns true), no matter what the request logic and abstraction are.
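The “connecting” → “connected” transition being described is the standard NIO non-blocking connect. A minimal self-contained sketch (connecting to a local listener so it completes deterministically; this is illustrative, not actual Kafka NetworkClient code):

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class NonBlockingConnectDemo {
    /**
     * Tries to finish a non-blocking connect within setupTimeoutMs.
     * Returns true if the channel reached the "connected" state
     * (SocketChannel.finishConnect() returned true) before the deadline.
     */
    public static boolean connectWithin(SocketAddress addr, long setupTimeoutMs)
            throws Exception {
        SocketChannel channel = SocketChannel.open();
        try {
            channel.configureBlocking(false);
            // "connecting" starts here; may complete immediately on loopback.
            boolean connected = channel.connect(addr);
            long deadline = System.currentTimeMillis() + setupTimeoutMs;
            while (!connected && System.currentTimeMillis() < deadline) {
                // Completes the TCP three-way handshake when it is done.
                if (channel.finishConnect()) {
                    connected = true;
                } else {
                    Thread.sleep(10);
                }
            }
            return connected;
        } finally {
            channel.close();
        }
    }

    public static void main(String[] args) throws Exception {
        // Local listener so the connect succeeds quickly and deterministically.
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            System.out.println(connectWithin(server.getLocalAddress(), 10_000L));
        }
    }
}
```

For an unreachable host, connect() returns false and finishConnect() keeps returning false until the OS-level retries give up, which is exactly the multi-minute wait the proposed config short-circuits.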

Please let me know if these make sense to you or if you have more thoughts. Thank you.

Best, -Cheng

> On May 13, 2020, at 7:11 AM, Rajini Sivaram <ra...@gmail.com> wrote:
> 
> Hi Cheng,
> 
> Thanks for the KIP, sounds like a good improvement. A couple of comments:
> 
> 1) We currently have client connection timeouts on the broker with configs
> named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I think
> they started off as connection timeouts but now include authentication time
> as well. Have we considered using similar configs for this case? We may
> want to prefix the new config with `socket.` anyway - something along the
> lines of `socket.connection.timeout.ms` if it is just the connection time.
> 
> 2) The KIP proposes 10s as the default. What does this mean for typical
> connections like a produce request going to the leader? Instead of one
> connection attempt to the leader, we want three separate connection
> attempts within the request timeout to the leader?
> 
> Regards,
> 
> Rajini
> 
> 
>> On Thu, May 7, 2020 at 11:51 PM Jose Garcia Sancio <js...@confluent.io>
>> wrote:
>> Cheng,
>> Thanks for the KIP and the detailed proposal section. LGTM!
>>>> On Thu, May 7, 2020 at 3:38 PM Cheng Tan <ct...@confluent.io> wrote:
>>>> I think more about the potential wider use cases, I modified the
>> proposal to target all the connection. Thanks.
>>> - Best, - Cheng Tan
>>>> On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
>>>> Hi Colin,
>>>> Sorry for the confusion. I’m proposing to implement timeout in the
>> NetworkClient.leastLoadedNode() when iterating all the cached node. The
>> alternative I can think is to implement the timeout in NetworkClient.poll()
>>>> I’d prefer to implement in the leastLoadedNode(). Here’re the reasons:
>>>> Usually when clients send a request, they will ask the network
>> client to send the request to a specific node. In this case, the
>> connection.setup.timeout won’t matter too much because the client doesn’t
>> want to try other nodes for that specific request. The request level
>> timeout would be enough. The metadata fetcher fetches the nodes status
>> periodically so the clients can reassign the request to another node after
>> timeout.
>>>> Consumer, producer, and AdminClient are all using leastLoadedNode()
>> for metadata fetch, where the connection setup timeout can play an
>> important role. Unlike other requests, which can refer to the metadata for node
>> condition, the metadata requests can only blindly choose a node for retry
>> in the worst scenario. We want to make sure the client can get the metadata
>> smoothly and as soon as possible. As a result, we need this
>> connection.setup.timeout.
>>>> Implementing the timeout in poll() or anywhere else might need an
>> extra iteration of all nodes, which might downgrade the network client
>> performance.
>>>> I also updated the KIP content and KIP status. Please let me know if
>> the above ideas make sense. Thanks.
>>>> Best, - Cheng Tan
>>>>> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org <mailto:
>> cmccabe@apache.org>> wrote:
>>>>> Hi Cheng,
>>>>> On the KIP page, it lists this KIP as "draft."  It seems like "under
>> discussion" is appropriate here, right?
>>>>>> Currently, the initial socket connection timeout is depending on
>> Linux kernel setting
>>>>>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1
>> seconds. For the
>>>>>> reasons below, we want to control the client-side socket timeout
>> directly using
>>>>>> configuration files
>>>>> Linux is just one example of an OS that Kafka could run on, right?
>> You could also be running on MacOS, for example.
>>>>>> I'm proposing to do a lazy socket connection time out. That is, we
>> only check if
>>>>>> we need to timeout a socket when we consider the corresponding node
>> as a
>>>>>> candidate in the node provider.
>>>>> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't
>> we implement a connection setup timeout for all clients, not just
>> AdminClient?
>>>>> best,
>>>>> Colin
>>>>> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
>>>>>> Hmm.  A big part of the reason behind the KIP is that the default
>>>>>> connection timeout behavior of the OS doesn't work for Kafka, right?
>>>>>> For example, on Linux, if we wait 127 seconds for a connection
>> attempt
>>>>>> to time out, we won't get a chance to make another attempt in most
>>>>>> cases.  So I think it makes sense to set a shorter default.
>>>>>> best,
>>>>>> Colin
>>>>>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
>>>>>>> Thanks for the KIP Cheng,
>>>>>>>> The default value will be 10 seconds.
>>>>>>> I think we should make the default the current behavior. Meaning the
>>>>>>> default should leverage the default connect timeout from the
>> operating
>>>>>>> system.
>>>>>>>> Proposed Changes
>>>>>>> I don't fully understand this section. It seems like it is mainly
>>>>>>> focused on the problem with the current implementation. Can you
>>>>>>> explain how the proposed changes solve the problem?
>>>>>>> Thanks.
>>>>>>> --
>>>>>>> -Jose
>> --
>> -Jose

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Cheng,

Thanks for the KIP, sounds like a good improvement. A couple of comments:

1) We currently have client connection timeouts on the broker with configs
named `xxx.socket.timeout.ms` (e.g. controller.socket.timeout.ms). I think
they started off as connection timeouts but now include authentication time
as well. Have we considered using similar configs for this case? We may
want to prefix the new config with `socket.` anyway - something along the
lines of `socket.connection.timeout.ms` if it is just the connection time.

2) The KIP proposes 10s as the default. What does this mean for typical
connections like a produce request going to the leader? Instead of one
connection attempt to the leader, would we now make three separate
connection attempts within the request timeout?
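To put rough numbers on that question (the 30s figure is the client default for request.timeout.ms; the sketch is mine, and it ignores reconnect backoff between attempts):

```java
public class AttemptBudget {
    // Rough upper bound on how many connection attempts fit inside the
    // request timeout, ignoring reconnect backoff between attempts.
    static int maxAttempts(long requestTimeoutMs, long connectionSetupTimeoutMs) {
        return (int) (requestTimeoutMs / connectionSetupTimeoutMs);
    }

    public static void main(String[] args) {
        // request.timeout.ms = 30s, proposed connection setup timeout = 10s
        System.out.println(maxAttempts(30_000, 10_000)); // prints 3
    }
}
```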

Regards,

Rajini


On Thu, May 7, 2020 at 11:51 PM Jose Garcia Sancio <js...@confluent.io>
wrote:

> Cheng,
>
> Thanks for the KIP and the detailed proposal section. LGTM!
>
> On Thu, May 7, 2020 at 3:38 PM Cheng Tan <ct...@confluent.io> wrote:
> >
> > I think more about the potential wider use cases, I modified the
> proposal to target all the connection. Thanks.
> >
> > - Best, - Cheng Tan
> >
> > > On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
> > >
> > > Hi Colin,
> > >
> > > Sorry for the confusion. I’m proposing to implement timeout in the
> NetworkClient.leastLoadedNode() when iterating all the cached node. The
> alternative I can think is to implement the timeout in NetworkClient.poll()
> > >
> > > I’d prefer to implement in the leastLoadedNode(). Here’re the reasons:
> > > Usually when clients send a request, they will asking the network
> client to send the request to a specific node. In this case, the
> connection.setup.timeout won’t matter too much because the client doesn’t
> want to try other nodes for that specific request. The request level
> timeout would be enough. The metadata fetcher fetches the nodes status
> periodically so the clients can reassign the request to another node after
> timeout.
> > > Consumer, producer, and AdminClient are all using leastLoadedNode()
> for metadata fetch, where the connection setup timeout can play an
> important role. Unlike other requests can refer to the metadata for node
> condition, the metadata requests can only blindly choose a node for retry
> in the worst scenario. We want to make sure the client can get the metadata
> smoothly and as soon as possible. As a result, we need this
> connection.setup.timeout.
> > > Implementing the timeout in poll() or anywhere else might need an
> extra iteration of all nodes, which might downgrade the network client
> performance.
> > > I also updated the KIP content and KIP status. Please let me know if
> the above ideas make sense. Thanks.
> > >
> > > Best, - Cheng Tan
> > >
> > >
> > >
> > >> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org <mailto:
> cmccabe@apache.org>> wrote:
> > >>
> > >> Hi Cheng,
> > >>
> > >> On the KIP page, it lists this KIP as "draft."  It seems like "under
> discussion" is appropriate here, right?
> > >>
> > >>> Currently, the initial socket connection timeout is depending on
> Linux kernel setting
> > >>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1
> seconds. For the
> > >>> reasons below, we want to control the client-side socket timeout
> directly using
> > >>> configuration files
> > >>
> > >> Linux is just one example of an OS that Kafka could run on, right?
> You could also be running on MacOS, for example.
> > >>
> > >>> I'm proposing to do a lazy socket connection time out. That is, we
> only check if
> > >>> we need to timeout a socket when we consider the corresponding node
> as a
> > >>> candidate in the node provider.
> > >>
> > >> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't
> we implement a connection setup timeout for all clients, not just
> AdminClient?
> > >>
> > >> best,
> > >> Colin
> > >>
> > >> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
> > >>> Hmm.  A big part of the reason behind the KIP is that the default
> > >>> connection timeout behavior of the OS doesn't work for Kafka, right?
> > >>> For example, on Linux, if we wait 127 seconds for a connection
> attempt
> > >>> to time out, we won't get a chance to make another attempt in most
> > >>> cases.  So I think it makes sense to set a shorter default.
> > >>>
> > >>> best,
> > >>> Colin
> > >>>
> > >>>
> > >>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
> > >>>> Thanks for the KIP Cheng,
> > >>>>
> > >>>>> The default value will be 10 seconds.
> > >>>>
> > >>>> I think we should make the default the current behavior. Meaning the
> > >>>> default should leverage the default connect timeout from the
> operating
> > >>>> system.
> > >>>>
> > >>>>> Proposed Changes
> > >>>>
> > >>>> I don't fully understand this section. It seems like it is mainly
> > >>>> focused on the problem with the current implementation. Can you
> > >>>> explain how the proposed changes solve the problem?
> > >>>>
> > >>>> Thanks.
> > >>>>
> > >>>>
> > >>>> --
> > >>>> -Jose
> > >>>>
> > >>>
> > >
> >
>
>
> --
> -Jose
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Jose Garcia Sancio <js...@confluent.io>.
Cheng,

Thanks for the KIP and the detailed proposal section. LGTM!

On Thu, May 7, 2020 at 3:38 PM Cheng Tan <ct...@confluent.io> wrote:
>
> I think more about the potential wider use cases, I modified the proposal to target all the connection. Thanks.
>
> - Best, - Cheng Tan
>
> > On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
> >
> > Hi Colin,
> >
> > Sorry for the confusion. I’m proposing to implement timeout in the NetworkClient.leastLoadedNode() when iterating all the cached node. The alternative I can think is to implement the timeout in NetworkClient.poll()
> >
> > I’d prefer to implement in the leastLoadedNode(). Here’re the reasons:
> > Usually when clients send a request, they will asking the network client to send the request to a specific node. In this case, the connection.setup.timeout won’t matter too much because the client doesn’t want to try other nodes for that specific request. The request level timeout would be enough. The metadata fetcher fetches the nodes status periodically so the clients can reassign the request to another node after timeout.
> > Consumer, producer, and AdminClient are all using leastLoadedNode() for metadata fetch, where the connection setup timeout can play an important role. Unlike other requests can refer to the metadata for node condition, the metadata requests can only blindly choose a node for retry in the worst scenario. We want to make sure the client can get the metadata smoothly and as soon as possible. As a result, we need this connection.setup.timeout.
> > Implementing the timeout in poll() or anywhere else might need an extra iteration of all nodes, which might downgrade the network client performance.
> > I also updated the KIP content and KIP status. Please let me know if the above ideas make sense. Thanks.
> >
> > Best, - Cheng Tan
> >
> >
> >
> >> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org <ma...@apache.org>> wrote:
> >>
> >> Hi Cheng,
> >>
> >> On the KIP page, it lists this KIP as "draft."  It seems like "under discussion" is appropriate here, right?
> >>
> >>> Currently, the initial socket connection timeout is depending on Linux kernel setting
> >>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1 seconds. For the
> >>> reasons below, we want to control the client-side socket timeout directly using
> >>> configuration files
> >>
> >> Linux is just one example of an OS that Kafka could run on, right?  You could also be running on MacOS, for example.
> >>
> >>> I'm proposing to do a lazy socket connection time out. That is, we only check if
> >>> we need to timeout a socket when we consider the corresponding node as a
> >>> candidate in the node provider.
> >>
> >> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't we implement a connection setup timeout for all clients, not just AdminClient?
> >>
> >> best,
> >> Colin
> >>
> >> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
> >>> Hmm.  A big part of the reason behind the KIP is that the default
> >>> connection timeout behavior of the OS doesn't work for Kafka, right?
> >>> For example, on Linux, if we wait 127 seconds for a connection attempt
> >>> to time out, we won't get a chance to make another attempt in most
> >>> cases.  So I think it makes sense to set a shorter default.
> >>>
> >>> best,
> >>> Colin
> >>>
> >>>
> >>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
> >>>> Thanks for the KIP Cheng,
> >>>>
> >>>>> The default value will be 10 seconds.
> >>>>
> >>>> I think we should make the default the current behavior. Meaning the
> >>>> default should leverage the default connect timeout from the operating
> >>>> system.
> >>>>
> >>>>> Proposed Changes
> >>>>
> >>>> I don't fully understand this section. It seems like it is mainly
> >>>> focused on the problem with the current implementation. Can you
> >>>> explain how the proposed changes solve the problem?
> >>>>
> >>>> Thanks.
> >>>>
> >>>>
> >>>> --
> >>>> -Jose
> >>>>
> >>>
> >
>


-- 
-Jose

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Thinking more about potential wider use cases, I modified the proposal to target all connections. Thanks.

- Best, - Cheng Tan

> On May 7, 2020, at 1:41 AM, Cheng Tan <ct...@confluent.io> wrote:
> 
> Hi Colin,
> 
> Sorry for the confusion. I’m proposing to implement timeout in the NetworkClient.leastLoadedNode() when iterating all the cached node. The alternative I can think is to implement the timeout in NetworkClient.poll() 
> 
> I’d prefer to implement in the leastLoadedNode(). Here’re the reasons:
> Usually when clients send a request, they will asking the network client to send the request to a specific node. In this case, the connection.setup.timeout won’t matter too much because the client doesn’t want to try other nodes for that specific request. The request level timeout would be enough. The metadata fetcher fetches the nodes status periodically so the clients can reassign the request to another node after timeout.
> Consumer, producer, and AdminClient are all using leastLoadedNode() for metadata fetch, where the connection setup timeout can play an important role. Unlike other requests can refer to the metadata for node condition, the metadata requests can only blindly choose a node for retry in the worst scenario. We want to make sure the client can get the metadata smoothly and as soon as possible. As a result, we need this connection.setup.timeout.
> Implementing the timeout in poll() or anywhere else might need an extra iteration of all nodes, which might downgrade the network client performance.
> I also updated the KIP content and KIP status. Please let me know if the above ideas make sense. Thanks.
> 
> Best, - Cheng Tan
> 
> 
> 
>> On May 4, 2020, at 5:26 PM, Colin McCabe <cmccabe@apache.org <ma...@apache.org>> wrote:
>> 
>> Hi Cheng,
>> 
>> On the KIP page, it lists this KIP as "draft."  It seems like "under discussion" is appropriate here, right?
>> 
>>> Currently, the initial socket connection timeout is depending on Linux kernel setting
>>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1 seconds. For the
>>> reasons below, we want to control the client-side socket timeout directly using 
>>> configuration files
>> 
>> Linux is just one example of an OS that Kafka could run on, right?  You could also be running on MacOS, for example.
>> 
>>> I'm proposing to do a lazy socket connection time out. That is, we only check if
>>> we need to timeout a socket when we consider the corresponding node as a 
>>> candidate in the node provider.
>> 
>> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't we implement a connection setup timeout for all clients, not just AdminClient?
>> 
>> best,
>> Colin
>> 
>> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
>>> Hmm.  A big part of the reason behind the KIP is that the default 
>>> connection timeout behavior of the OS doesn't work for Kafka, right?  
>>> For example, on Linux, if we wait 127 seconds for a connection attempt 
>>> to time out, we won't get a chance to make another attempt in most 
>>> cases.  So I think it makes sense to set a shorter default.
>>> 
>>> best,
>>> Colin
>>> 
>>> 
>>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
>>>> Thanks for the KIP Cheng,
>>>> 
>>>>> The default value will be 10 seconds.
>>>> 
>>>> I think we should make the default the current behavior. Meaning the
>>>> default should leverage the default connect timeout from the operating
>>>> system.
>>>> 
>>>>> Proposed Changes
>>>> 
>>>> I don't fully understand this section. It seems like it is mainly
>>>> focused on the problem with the current implementation. Can you
>>>> explain how the proposed changes solve the problem?
>>>> 
>>>> Thanks.
>>>> 
>>>> 
>>>> -- 
>>>> -Jose
>>>> 
>>> 
> 


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Cheng Tan <ct...@confluent.io>.
Hi Colin,

Sorry for the confusion. I’m proposing to implement the timeout in NetworkClient.leastLoadedNode() when iterating over the cached nodes. The alternative I can think of is to implement the timeout in NetworkClient.poll().

I’d prefer to implement it in leastLoadedNode(). Here are the reasons:
1. Usually when clients send a request, they ask the network client to send it to a specific node. In that case connection.setup.timeout won’t matter much, because the client doesn’t want to try other nodes for that specific request; the request-level timeout is enough. The metadata fetcher polls the nodes’ status periodically, so the client can reassign the request to another node after a timeout.
2. Consumer, producer, and AdminClient all use leastLoadedNode() for metadata fetches, where the connection setup timeout can play an important role. Unlike other requests, which can consult the metadata for node condition, a metadata request can only blindly choose a node to retry in the worst scenario. We want the client to get the metadata smoothly and as soon as possible; as a result, we need this connection.setup.timeout.
3. Implementing the timeout in poll() or anywhere else might need an extra iteration over all nodes, which could degrade network client performance.
I also updated the KIP content and KIP status. Please let me know if the above ideas make sense. Thanks.
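To make the lazy check concrete, here is a rough sketch of what I have in mind (class and field names are illustrative only, not the actual patch):

```java
import java.util.List;
import java.util.Optional;

// Sketch of the lazy connection-setup timeout: the check runs only when a
// node is considered as a candidate in leastLoadedNode(), so no extra pass
// over all cached nodes is needed.
public class ConnectionSetupCheck {
    static class CandidateNode {
        final String id;
        final boolean connecting;   // connection attempt currently in flight
        final long connectStartMs;  // when the attempt started
        CandidateNode(String id, boolean connecting, long connectStartMs) {
            this.id = id;
            this.connecting = connecting;
            this.connectStartMs = connectStartMs;
        }
    }

    // Returns the first candidate whose in-flight connection attempt has not
    // exceeded the setup timeout; stalled attempts would be closed and skipped.
    static Optional<CandidateNode> firstUsableNode(List<CandidateNode> nodes,
                                                   long nowMs,
                                                   long setupTimeoutMs) {
        for (CandidateNode n : nodes) {
            if (n.connecting && nowMs - n.connectStartMs > setupTimeoutMs)
                continue; // lazily timed out: close the socket, try the next node
            return Optional.of(n);
        }
        return Optional.empty();
    }
}
```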

Best, - Cheng Tan



> On May 4, 2020, at 5:26 PM, Colin McCabe <cm...@apache.org> wrote:
> 
> Hi Cheng,
> 
> On the KIP page, it lists this KIP as "draft."  It seems like "under discussion" is appropriate here, right?
> 
>> Currently, the initial socket connection timeout is depending on Linux kernel setting
>> tcp_syn_retries. The timeout value is 2 ^ (tcp_sync_retries + 1) - 1 seconds. For the
>> reasons below, we want to control the client-side socket timeout directly using 
>> configuration files
> 
> Linux is just one example of an OS that Kafka could run on, right?  You could also be running on MacOS, for example.
> 
>> I'm proposing to do a lazy socket connection time out. That is, we only check if
>> we need to timeout a socket when we consider the corresponding node as a 
>> candidate in the node provider.
> 
> The NodeProvider is an AdminClient abstraction, right?  Why wouldn't we implement a connection setup timeout for all clients, not just AdminClient?
> 
> best,
> Colin
> 
> On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
>> Hmm.  A big part of the reason behind the KIP is that the default 
>> connection timeout behavior of the OS doesn't work for Kafka, right?  
>> For example, on Linux, if we wait 127 seconds for a connection attempt 
>> to time out, we won't get a chance to make another attempt in most 
>> cases.  So I think it makes sense to set a shorter default.
>> 
>> best,
>> Colin
>> 
>> 
>> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
>>> Thanks for the KIP Cheng,
>>> 
>>>> The default value will be 10 seconds.
>>> 
>>> I think we should make the default the current behavior. Meaning the
>>> default should leverage the default connect timeout from the operating
>>> system.
>>> 
>>>> Proposed Changes
>>> 
>>> I don't fully understand this section. It seems like it is mainly
>>> focused on the problem with the current implementation. Can you
>>> explain how the proposed changes solve the problem?
>>> 
>>> Thanks.
>>> 
>>> 
>>> -- 
>>> -Jose
>>> 
>> 


Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Colin McCabe <cm...@apache.org>.
Hi Cheng,

On the KIP page, it lists this KIP as "draft."  It seems like "under discussion" is appropriate here, right?

> Currently, the initial socket connection timeout is depending on Linux kernel setting
> tcp_syn_retries. The timeout value is 2 ^ (tcp_syn_retries + 1) - 1 seconds. For the
> reasons below, we want to control the client-side socket timeout directly using 
> configuration files

Linux is just one example of an OS that Kafka could run on, right?  You could also be running on MacOS, for example.

> I'm proposing to do a lazy socket connection time out. That is, we only check if
> we need to timeout a socket when we consider the corresponding node as a 
> candidate in the node provider.

The NodeProvider is an AdminClient abstraction, right?  Why wouldn't we implement a connection setup timeout for all clients, not just AdminClient?

best,
Colin

On Mon, May 4, 2020, at 13:18, Colin McCabe wrote:
> Hmm.  A big part of the reason behind the KIP is that the default 
> connection timeout behavior of the OS doesn't work for Kafka, right?  
> For example, on Linux, if we wait 127 seconds for a connection attempt 
> to time out, we won't get a chance to make another attempt in most 
> cases.  So I think it makes sense to set a shorter default.
> 
> best,
> Colin
> 
> 
> On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
> > Thanks for the KIP Cheng,
> > 
> > > The default value will be 10 seconds.
> > 
> > I think we should make the default the current behavior. Meaning the
> > default should leverage the default connect timeout from the operating
> > system.
> > 
> > > Proposed Changes
> > 
> > I don't fully understand this section. It seems like it is mainly
> > focused on the problem with the current implementation. Can you
> > explain how the proposed changes solve the problem?
> > 
> > Thanks.
> > 
> > 
> > -- 
> > -Jose
> >
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Colin McCabe <cm...@apache.org>.
Hmm.  A big part of the reason behind the KIP is that the default connection timeout behavior of the OS doesn't work for Kafka, right?  For example, on Linux, if we wait 127 seconds for a connection attempt to time out, we won't get a chance to make another attempt in most cases.  So I think it makes sense to set a shorter default.
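For reference, the arithmetic behind that 127 seconds: the kernel doubles the SYN retransmission interval starting at 1s, so the worst-case wait is 2^(tcp_syn_retries + 1) - 1 seconds (this little sketch is mine, not from the KIP):

```java
public class SynRetryTimeout {
    // Worst-case connect() wait for a given tcp_syn_retries value:
    // retransmission intervals are 1s, 2s, 4s, ..., so the total is
    // 2^(retries + 1) - 1 seconds.
    static int worstCaseTimeoutSeconds(int tcpSynRetries) {
        return (1 << (tcpSynRetries + 1)) - 1;
    }

    public static void main(String[] args) {
        // Linux's default tcp_syn_retries of 6 gives the 127 seconds above.
        System.out.println(worstCaseTimeoutSeconds(6)); // prints 127
    }
}
```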

best,
Colin


On Mon, May 4, 2020, at 09:44, Jose Garcia Sancio wrote:
> Thanks for the KIP Cheng,
> 
> > The default value will be 10 seconds.
> 
> I think we should make the default the current behavior. Meaning the
> default should leverage the default connect timeout from the operating
> system.
> 
> > Proposed Changes
> 
> I don't fully understand this section. It seems like it is mainly
> focused on the problem with the current implementation. Can you
> explain how the proposed changes solve the problem?
> 
> Thanks.
> 
> 
> -- 
> -Jose
>

Re: [DISCUSS] KIP-601: Configurable socket connection timeout

Posted by Jose Garcia Sancio <js...@confluent.io>.
Thanks for the KIP Cheng,

> The default value will be 10 seconds.

I think we should make the default the current behavior, meaning the
default should leverage the default connect timeout from the operating
system.

> Proposed Changes

I don't fully understand this section. It seems like it is mainly
focused on the problem with the current implementation. Can you
explain how the proposed changes solve the problem?

Thanks.


-- 
-Jose