Posted to users@kafka.apache.org by Prabhjot Bharaj <pr...@gmail.com> on 2015/09/18 14:48:59 UTC
Useful metric to check slow ISR catchup
Hi,
I've noticed that one follower replica node in my Kafka cluster catches
up to the data from the leader pretty slowly.
My topic has just 1 partition with 3 replicas. The other follower replica
gets the full data from the leader almost instantly.
It takes around 22 minutes to catch up 500MB of data.
I have set up Ganglia monitoring on my cluster and I'm interested in knowing
which metrics exposed through JMX would be useful in checking the reason for
such slowness.
Thanks,
Prabhjot
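To put the figure above in perspective, here is a rough back-of-the-envelope
conversion of "500MB in around 22 minutes" into a transfer rate. It assumes
1 MB = 10^6 bytes and a steady rate over the whole period; the function name
is only illustrative and not part of any Kafka tooling.

```python
def catchup_rate(total_mb, minutes):
    """Return (MB/s, Mbit/s) for copying total_mb megabytes in the given minutes."""
    seconds = minutes * 60
    mb_per_s = total_mb / seconds
    # 8 bits per byte, so Mbit/s is simply 8x the MB/s figure.
    return mb_per_s, mb_per_s * 8


mb_s, mbit_s = catchup_rate(500, 22)
print(f"{mb_s:.2f} MB/s, about {mbit_s:.1f} Mbit/s")
# prints "0.38 MB/s, about 3.0 Mbit/s"
```

A rate of roughly 3 Mbit/s is far below what a typical datacenter link or disk
can sustain, which fits the observation that the other follower catches up
almost instantly.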
Fwd: Useful metric to check slow ISR catchup
Posted by Prabhjot Bharaj <pr...@gmail.com>.
Hi Dev Folks,
Requesting your expertise on this question of mine.
Thanks,
Prabhjot
---------- Forwarded message ----------
From: Prabhjot Bharaj <pr...@gmail.com>
Date: Mon, Sep 21, 2015 at 2:59 PM
Subject: Re: Useful metric to check slow ISR catchup
To: users@kafka.apache.org
Hi,
Attaching a screenshot of bytes/sec from Ganglia.
As you can see, the graph in red belongs to the third replica, for
which the bytes/sec is around 10 times lower than that of its 2 peers (in green
and blue).
Earlier, I thought that it could be related to that 1 system only, but
when I created a new topic with 1 partition and 3 replicas, I saw a similar
graph on the other set of machines.
I'm not sure what parameter could be causing this. Any pointers are
appreciated.
Thanks,
Prabhjot
On Mon, Sep 21, 2015 at 1:20 PM, Prabhjot Bharaj <pr...@gmail.com>
wrote:
> Hello Folks,
>
> Request your expertise on this
>
> Thanks,
> Prabhjot
>
> On Fri, Sep 18, 2015 at 6:18 PM, Prabhjot Bharaj <pr...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I've noticed that one follower replica node in my Kafka cluster catches
>> up to the data from the leader pretty slowly.
>> My topic has just 1 partition with 3 replicas. The other follower replica
>> gets the full data from the leader almost instantly.
>>
>> It takes around 22 minutes to catch up 500MB of data.
>>
>> I have set up Ganglia monitoring on my cluster and I'm interested in
>> knowing which metrics exposed through JMX would be useful in checking the
>> reason for such slowness.
>>
>> Thanks,
>> Prabhjot
>>
>
>
>
> --
> ---------------------------------------------------------
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
>
--
---------------------------------------------------------
"There are only 10 types of people in the world: Those who understand
binary, and those who don't"
Re: Useful metric to check slow ISR catchup
Posted by Prabhjot Bharaj <pr...@gmail.com>.
Hi,
Attaching a screenshot of bytes/sec from Ganglia.
As you can see, the graph in red belongs to the third replica, for
which the bytes/sec is around 10 times lower than that of its 2 peers (in green
and blue).
Earlier, I thought that it could be related to that 1 system only, but
when I created a new topic with 1 partition and 3 replicas, I saw a similar
graph on the other set of machines.
I'm not sure what parameter could be causing this. Any pointers are
appreciated.
Thanks,
Prabhjot
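For the JMX question raised in this thread, the sketch below lists a few
replication-related MBeans and builds a command line for Kafka's bundled
kafka.tools.JmxTool to poll one of them. It is only a starting point under
stated assumptions: the MBean names follow the naming used in the Kafka 0.8.2
monitoring documentation (older brokers name them differently), JMX is assumed
to be enabled on port 9999, and the host name and script path are hypothetical.
Nothing here identifies the cause of the slowness; it only shows where to look.

```python
# MBean names as documented for Kafka 0.8.2 and later (an assumption about the
# broker version in this thread; verify against your own broker with jconsole).
REPLICATION_MBEANS = {
    # Max lag in messages between follower and leader (read on the follower).
    "max_follower_lag": "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica",
    # Partitions whose ISR is smaller than the replica set (read on the leader).
    "under_replicated": "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
    # Rates at which replicas leave and rejoin the ISR.
    "isr_shrinks": "kafka.server:type=ReplicaManager,name=IsrShrinksPerSec",
    "isr_expands": "kafka.server:type=ReplicaManager,name=IsrExpandsPerSec",
    # Time the leader spends serving fetch requests from followers.
    "follower_fetch_time": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower",
}


def jmxtool_command(host, mbean, jmx_port=9999, interval_ms=5000,
                    script="bin/kafka-run-class.sh"):
    """Build (but do not run) a JmxTool command that polls one MBean."""
    url = f"service:jmx:rmi:///jndi/rmi://{host}:{jmx_port}/jmxrmi"
    return [
        script, "kafka.tools.JmxTool",
        "--object-name", mbean,
        "--jmx-url", url,
        "--reporting-interval", str(interval_ms),
    ]


# "broker3.example.com" is a placeholder for the slow follower.
cmd = jmxtool_command("broker3.example.com", REPLICATION_MBEANS["max_follower_lag"])
print(" ".join(cmd))
```

Comparing max_follower_lag on the slow follower with the same metric on the
fast one, alongside follower_fetch_time on the leader, is one way to narrow
down whether the delay sits on the follower side or the leader side.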
Re: Useful metric to check slow ISR catchup
Posted by Prabhjot Bharaj <pr...@gmail.com>.
Hello Folks,
Request your expertise on this
Thanks,
Prabhjot