Posted to users@kafka.apache.org by Sam Lendle <sl...@pandora.com> on 2018/07/11 18:42:09 UTC
Kafka Streams processor node metrics process rate with multiple
stream threads
Hello!
Using kafka-streams 1.1.0, I noticed that when I sum the process rate metric for a given processor node, the rate is many times higher than the number of incoming messages. Digging further, it looks like the rate metric associated with each thread in a given application instance is always the same, and if I average by instance and then sum the rates, I recover the incoming message rate. So it looks like the rate metric for each stream thread is actually reporting the rate for all threads on the instance.
Is this a known issue, or am I misusing the metric? I’m not sure if this affects other metrics, but it does look like the average latency metric is identical for all threads on the same instance, so I suspect it does.
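To make the arithmetic concrete, here is a minimal sketch in Java with made-up numbers: two instances, four stream threads each, and every thread reporting its instance-wide rate of 100 msg/s. The class name, method names, and all values are hypothetical, and no Kafka API is used; it only contrasts the naive sum with the average-per-instance-then-sum workaround described above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: instance name -> process-rate values reported by its threads.
final class ProcessRateAggregation {

    // Naive aggregation: add up every per-thread reading across all instances.
    static double naiveSum(Map<String, List<Double>> ratesByInstance) {
        double total = 0.0;
        for (List<Double> threadRates : ratesByInstance.values()) {
            for (double rate : threadRates) {
                total += rate;
            }
        }
        return total;
    }

    // Workaround: average the readings within each instance, then sum across instances.
    static double averagePerInstanceThenSum(Map<String, List<Double>> ratesByInstance) {
        double total = 0.0;
        for (List<Double> threadRates : ratesByInstance.values()) {
            total += threadRates.stream()
                    .mapToDouble(Double::doubleValue)
                    .average()
                    .orElse(0.0);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed scenario: each thread reports the rate of its whole instance (100 msg/s).
        Map<String, List<Double>> ratesByInstance = new HashMap<>();
        ratesByInstance.put("instance-a", Arrays.asList(100.0, 100.0, 100.0, 100.0));
        ratesByInstance.put("instance-b", Arrays.asList(100.0, 100.0, 100.0, 100.0));

        System.out.println("naive sum: " + naiveSum(ratesByInstance));
        System.out.println("average per instance, then sum: "
                + averagePerInstanceThenSum(ratesByInstance));
    }
}
```

With these numbers the naive sum is 800.0, four times the 200.0 obtained by averaging per instance first, which is the kind of inflation described above.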
Thanks,
Sam
Re: Kafka Streams processor node metrics process rate with multiple
stream threads
Posted by Guozhang Wang <wa...@gmail.com>.
Thanks Sam! Please feel free to assign the ticket to yourself, and I will
review your PR if you create one:
https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes#ContributingCodeChanges-PullRequest
On Tue, Jul 17, 2018 at 6:29 PM, Sam Lendle <sl...@pandora.com> wrote:
> https://issues.apache.org/jira/browse/KAFKA-7176
>
> If I have a chance I will give trunk a try.
--
-- Guozhang
Re: Kafka Streams processor node metrics process rate with multiple
stream threads
Posted by Sam Lendle <sl...@pandora.com>.
https://issues.apache.org/jira/browse/KAFKA-7176
If I have a chance I will give trunk a try.
On 7/16/18, 2:14 PM, "Guozhang Wang" <wa...@gmail.com> wrote:
> Hmm, this seems new to me. I checked the source code and it looks right to me.
> Could you try out the latest trunk (built from source) and see if it
> is the same issue for you?
>
> > In addition to that, though, I also see state store metrics for tasks
> > that have been migrated to another instance, and their values continue to
> > be updated, even after seeing messages in the logs indicating that local
> > state for those tasks has been cleaned. Is this also fixed, or a separate
> > issue?
>
> This may be an issue that is not yet resolved; I'd need to double-check. In
> the meantime, could you create a JIRA for it?
>
> Guozhang
Re: Kafka Streams processor node metrics process rate with multiple
stream threads
Posted by Guozhang Wang <wa...@gmail.com>.
Hmm, this seems new to me. I checked the source code and it looks right to me.
Could you try out the latest trunk (built from source) and see if it
is the same issue for you?
> In addition to that, though, I also see state store metrics for tasks
> that have been migrated to another instance, and their values continue to
> be updated, even after seeing messages in the logs indicating that local
> state for those tasks has been cleaned. Is this also fixed, or a separate
> issue?
This may be an issue that is not yet resolved; I'd need to double-check. In
the meantime, could you create a JIRA for it?
Guozhang
On Thu, Jul 12, 2018 at 4:04 PM, Sam Lendle <sl...@pandora.com> wrote:
> Ah great, thanks Guozhang.
>
> I also noticed a similar issue with state store metrics, where rate
> metrics for each thread/task appear to be the total rate across all
> threads/tasks on that instance.
>
> In addition to that, though, I also see state store metrics for tasks that
> have been migrated to another instance, and their values continue to be
> updated, even after seeing messages in the logs indicating that local state
> for those tasks has been cleaned. Is this also fixed, or a separate issue?
>
> Best,
> Sam
>
--
-- Guozhang
Re: Kafka Streams processor node metrics process rate with multiple
stream threads
Posted by Sam Lendle <sl...@pandora.com>.
Ah great, thanks Guozhang.
I also noticed a similar issue with state store metrics, where rate metrics for each thread/task appear to be the total rate across all threads/tasks on that instance.
In addition to that, though, I also see state store metrics for tasks that have been migrated to another instance, and their values continue to be updated, even after seeing messages in the logs indicating that local state for those tasks has been cleaned. Is this also fixed, or a separate issue?
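One way to check for this is to compare the task IDs that still appear in the registered state store metrics against the tasks currently assigned to the instance. Below is a minimal sketch in Java, assuming both sets have already been collected by some other means (for example from the `task-id` metric tag and from the assignment messages in the logs). The class name, method name, and sample task IDs are hypothetical, and no Kafka API is used.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: finds tasks that still expose metrics but are no longer assigned locally.
final class StaleTaskMetrics {

    // Returns the task IDs present in the metrics but absent from the current assignment, sorted.
    static Set<String> staleTaskIds(Set<String> taskIdsWithMetrics, Set<String> assignedTaskIds) {
        Set<String> stale = new TreeSet<>(taskIdsWithMetrics);
        stale.removeAll(assignedTaskIds);
        return stale;
    }

    public static void main(String[] args) {
        // Assumed sample data: task 0_1 has migrated away, but its metrics are still registered.
        Set<String> withMetrics = new HashSet<>(Arrays.asList("0_0", "0_1", "0_2"));
        Set<String> assigned = new HashSet<>(Arrays.asList("0_0", "0_2"));

        System.out.println("stale task metrics: " + staleTaskIds(withMetrics, assigned));
    }
}
```

Any task ID in the result would be a candidate for the leftover metrics described above.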
Best,
Sam
On 7/11/18, 10:51 PM, "Guozhang Wang" <wa...@gmail.com> wrote:
> Hello Sam,
>
> It is a known issue that should have been fixed in 2.0; the corresponding fix
> has also been cherry-picked to the 1.1.1 bug-fix release:
>
> https://github.com/apache/kafka/pull/5277
>
> Guozhang
Re: Kafka Streams processor node metrics process rate with multiple
stream threads
Posted by Guozhang Wang <wa...@gmail.com>.
Hello Sam,
It is a known issue that should have been fixed in 2.0; the corresponding fix
has also been cherry-picked to the 1.1.1 bug-fix release:
https://github.com/apache/kafka/pull/5277
Guozhang
On Wed, Jul 11, 2018 at 11:42 AM, Sam Lendle <sl...@pandora.com> wrote:
> Hello!
>
> Using kafka-streams 1.1.0, I noticed when I sum the process rate metric
> for a given processor node, the rate is many times higher than the number
> of incoming messages. Digging further, it looks like the rate metric
> associated with each thread in a given application instance is always the
> same, and if I average by instance and then sum the rates, I recover the
> incoming message rate. So it looks like the rate metric for each stream
> thread is actually reporting the rate for all threads on the instance.
>
> Is this a known issue, or am I misusing the metric? I’m not sure if this
> affects other metrics, but it does look like the average latency metric is
> identical for all threads on the same instance, so I suspect it does.
>
> Thanks,
> Sam
>
--
-- Guozhang