Posted to dev@kafka.apache.org by safique ahemad <sa...@gmail.com> on 2016/06/02 20:16:42 UTC

Kafka take too long to update the client with metadata when a broker is gone

Hi All,

We are running a Kafka broker cluster in our data center.
Recently we noticed that when a Kafka broker goes down, the client tries
to refresh its metadata but keeps receiving stale metadata for up to about
30 seconds.

Only after about 30-35 seconds does the client obtain updated metadata. That
is a long time, because the client keeps getting send failures throughout.

Please reply if any configuration change, or anything else, might help here.


-- 

Regards,
Safique Ahemad

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by safique ahemad <sa...@gmail.com>.
I dug into this further.

The blockingSendAndReceive API seems to hang for a long time while
sending to, and receiving a response from, a broker that is not affected.

I measured the send and receive time: it takes about 30 seconds.



On Tue, Jun 14, 2016 at 11:03 AM, safique ahemad <sa...@gmail.com>
wrote:

> Guys, any response would be appreciated.
>
>
> ---------- Forwarded message ----------
> From: safique ahemad <sa...@gmail.com>
> Date: Thu, Jun 9, 2016 at 11:18 AM
> Subject: Re: Kafka take too long to update the client with metadata when a
> broker is gone
> To: users@kafka.apache.org
>
> Hello guys,
>
> The Kafka logs, captured with TRACE enabled, are available at the link below.
>
>
> https://drive.google.com/file/d/0B-nANlrsm5ogQkh1NUR2UHYtbkU/view?usp=sharing
>
> I truncated the log because it was very large, but it still covers the
> whole period of the problem.
>
> Scenario:
> 1) Three Kafka brokers were running: kafka1 (perhaps the controller),
> kafka2 and kafka3. A Go Sarama producer was producing.
>
> 2) kafka3 is killed. The log timestamp is 00:47:26.
>
> 3) At 00:47:34, new leaders are chosen for the affected partitions.
>
> 4) Before 00:48:58, whenever the client sends a metadata fetch request,
> kafka1 returns stale metadata in the response, even though internally it
> uses the correct metadata.
> At 00:48:58, Kafka receives some trigger, after which it starts returning
> correct metadata to the client.
>
>
> Please go through the log and let me know if I am missing anything.
>
>
>



-- 

Regards,
Safique Ahemad
GlobalLogic | Leaders in software R&D services
P :+91 120 4342000-2990 | M:+91 9953533367
www.globallogic.com

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by safique ahemad <sa...@gmail.com>.
Hello guys,

The Kafka logs, captured with TRACE enabled, are available at the link below.

https://drive.google.com/file/d/0B-nANlrsm5ogQkh1NUR2UHYtbkU/view?usp=sharing

I truncated the log because it was very large, but it still covers the
whole period of the problem.

Scenario:
1) Three Kafka brokers were running: kafka1 (perhaps the controller),
kafka2 and kafka3. A Go Sarama producer was producing.

2) kafka3 is killed. The log timestamp is 00:47:26.

3) At 00:47:34, new leaders are chosen for the affected partitions.

4) Before 00:48:58, whenever the client sends a metadata fetch request,
kafka1 returns stale metadata in the response, even though internally it
uses the correct metadata.
At 00:48:58, Kafka receives some trigger, after which it starts returning
correct metadata to the client.


Please go through the log and let me know if I am missing anything.



On Fri, Jun 3, 2016 at 6:36 AM, Christian <en...@gmail.com> wrote:

> Hi Gerard,
>
> When trying to reproduce this, did you use the go sarama client Safique
> mentioned?
>
>



-- 

Regards,
Safique Ahemad

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by Christian <en...@gmail.com>.
Hi Gerard,

When trying to reproduce this, did you use the go sarama client Safique
mentioned?


On Fri, Jun 3, 2016 at 5:10 AM, Gerard Klijs <ge...@dizzit.com>
wrote:

> I assume you use a replication factor of 3 for the topics? When I ran some
> tests with producers/consumers in a dockerized setup, there were only a few
> failures before the producer switched to the correct new broker again. I
> don't know the exact time, but it seemed like a few seconds at most. This
> was with 0.9.0.0.
>

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by Gerard Klijs <ge...@dizzit.com>.
I assume you use a replication factor of 3 for the topics? When I ran some
tests with producers/consumers in a dockerized setup, there were only a few
failures before the producer switched to the correct new broker again. I
don't know the exact time, but it seemed like a few seconds at most. This
was with 0.9.0.0.

On Fri, Jun 3, 2016 at 9:00 AM safique ahemad <sa...@gmail.com> wrote:

> Hi Steve,
>
> There is no way to access the setup from the public side, so I won't be
> able to share it. Sorry about that.
> But the steps are quite simple. The only difference is that we deployed
> the Kafka cluster using a Mesos URL.
>
> 1) Launch a 3-broker Kafka cluster and create a topic with multiple
> partitions, at least 3, so that each broker hosts at least one partition.
> 2) Launch a consumer/producer client.
> 3) Kill a broker.
> 4) Observe the behavior of the producer client.
>
>
>

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by safique ahemad <sa...@gmail.com>.
Hi Steve,

There is no way to access the setup from the public side, so I won't be
able to share it. Sorry about that.
But the steps are quite simple. The only difference is that we deployed
the Kafka cluster using a Mesos URL.

1) Launch a 3-broker Kafka cluster and create a topic with multiple
partitions, at least 3, so that each broker hosts at least one partition.
2) Launch a consumer/producer client.
3) Kill a broker.
4) Observe the behavior of the producer client.



On Thu, Jun 2, 2016 at 8:15 PM, Steve Tian <st...@gmail.com> wrote:

> I see. I'm not sure if this is a known issue. Would you mind sharing the
> brokers/topics setup and the steps to reproduce this issue?
>
> Cheers, Steve
>



-- 

Regards,
Safique Ahemad

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by Steve Tian <st...@gmail.com>.
I see. I'm not sure if this is a known issue. Would you mind sharing the
brokers/topics setup and the steps to reproduce this issue?

Cheers, Steve

On Fri, Jun 3, 2016, 9:45 AM safique ahemad <sa...@gmail.com> wrote:

> You got it right.
>
> But DialTimeout is not the concern here. The client tries to fetch metadata
> from the Kafka brokers, but Kafka returns stale metadata for about 30-40
> seconds. The client fetches 3-4 times during that window before it gets
> updated metadata.
> This is a completely different problem from
> https://github.com/Shopify/sarama/issues/661
>
>
>

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by safique ahemad <sa...@gmail.com>.
You got it right.

But DialTimeout is not the concern here. The client tries to fetch metadata
from the Kafka brokers, but Kafka returns stale metadata for about 30-40
seconds. The client fetches 3-4 times during that window before it gets
updated metadata.
This is a completely different problem from
https://github.com/Shopify/sarama/issues/661



On Thu, Jun 2, 2016 at 6:05 PM, Steve Tian <st...@gmail.com> wrote:

> So you are coming from https://github.com/Shopify/sarama/issues/661 ,
> right? I'm not sure whether anything on the broker side can help, but it
> looks like you already found that DialTimeout on the client side can help?
>
> Cheers, Steve
>



-- 

Regards,
Safique Ahemad

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by Steve Tian <st...@gmail.com>.
So you are coming from https://github.com/Shopify/sarama/issues/661 ,
right? I'm not sure whether anything on the broker side can help, but it
looks like you already found that DialTimeout on the client side can help?

Cheers, Steve

On Fri, Jun 3, 2016, 8:33 AM safique ahemad <sa...@gmail.com> wrote:

> Kafka version: 0.9.0.0
> Go Sarama client version: 1.8
>

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by safique ahemad <sa...@gmail.com>.
Kafka version: 0.9.0.0
Go Sarama client version: 1.8

On Thu, Jun 2, 2016 at 5:14 PM, Steve Tian <st...@gmail.com> wrote:

> Client version?
>



-- 

Regards,
Safique Ahemad

Re: Kafka take too long to update the client with metadata when a broker is gone

Posted by Steve Tian <st...@gmail.com>.
Client version?

On Fri, Jun 3, 2016, 4:44 AM safique ahemad <sa...@gmail.com> wrote:

> Hi All,
>
> We are running a Kafka broker cluster in our data center.
> Recently we noticed that when a Kafka broker goes down, the client tries
> to refresh its metadata but keeps receiving stale metadata for up to about
> 30 seconds.
>
> Only after about 30-35 seconds does the client obtain updated metadata. That
> is a long time, because the client keeps getting send failures throughout.
>
> Please reply if any configuration change, or anything else, might help here.
>
>
> --
>
> Regards,
> Safique Ahemad
>