Posted to users@kafka.apache.org by Xiyuan Hu <xi...@gmail.com> on 2019/10/02 18:54:21 UTC

Kafka Streams changelog topic has 5 times higher out-traffic than in-traffic

Hi All,

I'm smoke testing my Kafka Streams app (v2.1.0) and noticed the
following behaviors:
1) Outbound throughput of the changelog topic can reach 70mb/s while
inbound traffic is around 10mb/s.
2) When traffic is bumpy, either due to producer/consumer throttling or
for other reasons I'm still debugging, changelog topic outbound
throughput can reach 160mb/s while inbound throughput is less than
1mb/s. According to the monitoring tool, no messages are actually being
written to this topic at that time.

Could anyone help me understand how the changelog topic works? Given
that the app still has consumer lag and producer timeouts (records
expired 5 min after batch creation), is this just a cluster
bandwidth/capacity issue?

Thanks, appreciate all the help!!

Re: Kafka Streams changelog topic has 5 times higher out-traffic than in-traffic

Posted by Boyang Chen <re...@gmail.com>.
Hey Xiyuan,

To better understand the situation, we first need to identify which consumer
of the changelog topic is responsible for the increased volume. You could
expose some broker metrics to see the actual client, but normally there are
two types of changelog consumers: 1. the restore consumer that keeps a
standby task up to date, and 2. the restore consumer that a stream task uses
to rebuild its materialized state from the changelog topic.
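
As a rough illustration of how reads from a changelog topic can exceed
writes to it, here is a minimal sketch under a deliberately simplified
model: every standby replica re-reads whatever is written, and a restoring
task reads an existing backlog over some time window. The function name,
its parameters, and all numbers are hypothetical and are not measurements
from this thread.

```python
def changelog_out_rate(in_rate_mb_s, standby_replicas,
                       restore_backlog_mb=0.0, restore_window_s=1.0):
    """Estimate read traffic on a changelog topic (simplified model).

    in_rate_mb_s:       rate at which the app writes to the changelog
    standby_replicas:   number of standby copies tailing the changelog
    restore_backlog_mb: existing changelog data a restoring task must read
    restore_window_s:   time over which that backlog is read
    """
    if restore_window_s <= 0:
        raise ValueError("restore_window_s must be positive")
    # Standby tasks read each newly written record once per replica.
    standby_reads = in_rate_mb_s * standby_replicas
    # A restoring task reads old data, independent of the current write rate.
    restore_reads = restore_backlog_mb / restore_window_s
    return standby_reads + restore_reads


# Steady state: one standby, nothing being restored.
steady = changelog_out_rate(10.0, 1)

# Unstable period: few new writes, but a task restores a large backlog.
unstable = changelog_out_rate(1.0, 1,
                              restore_backlog_mb=9600.0,
                              restore_window_s=60.0)

print(steady, unstable)
```

In this model the restore term does not depend on the write rate at all,
which is one way out-traffic could stay high while in-traffic is close to
zero. Whether that is what happens in a given cluster has to be confirmed
from the broker and client metrics.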

Boyang

On Thu, Oct 3, 2019 at 9:04 PM Xiyuan Hu <xi...@gmail.com> wrote:

> Hi Peter,
> Thanks for the reply! I noticed that after deployment the changelog
> topic has high bytes in/sec and messages/sec but low bytes out/sec.
> Once the app becomes unstable or traffic gets bumpy, this flips: the
> changelog topic has low bytes in/sec and messages/sec but high bytes
> out/sec. Is this normal? Why does it switch during unstable periods?
>
> Thanks!
>
> On Thu, Oct 3, 2019 at 5:03 AM Peter Levart <pe...@gmail.com>
> wrote:
> >
> > Hi Hu,
> >
> > On 10/2/19 8:54 PM, Xiyuan Hu wrote:
> > > Hi All,
> > >
> > > I'm smoke testing my Kafka Streams app (v2.1.0) and noticed the
> > > following behaviors:
> > > 1) Outbound throughput of the changelog topic can reach 70mb/s while
> > > inbound traffic is around 10mb/s.
> > > 2) When traffic is bumpy, either due to producer/consumer throttling or
> > > for other reasons I'm still debugging, changelog topic outbound
> > > throughput can reach 160mb/s while inbound throughput is less than
> > > 1mb/s. According to the monitoring tool, no messages are actually being
> > > written to this topic at that time.
> > >
> > > Could anyone help me understand how the changelog topic works? Given
> > > that the app still has consumer lag and producer timeouts (records
> > > expired 5 min after batch creation), is this just a cluster
> > > bandwidth/capacity issue?
> > >
> > > Thanks, appreciate all the help!!
> >
> > If you're using a KeyValueStore (the simplest of the stores), each
> > update of a (K, V) entry that your stream's processor makes to the
> > store results in one message written to the changelog topic of that
> > store. Multiple consecutive updates to the same key in quick succession
> > may be collapsed into a single message holding the latest version of
> > the entry. So it is possible that keys in your input stream do not
> > repeat very often within a short time, in which case the rate of
> > messages/s is roughly the same in your input and changelog topics. The
> > difference in bytes/s would then come from the input topic message
> > size vs. the changelog topic message size. A changelog message is
> > basically the whole (K, V) entry in the KVStore.
> >
> > Regards, Peter
> >
>

Re: Kafka Streams changelog topic has 5 times higher out-traffic than in-traffic

Posted by Xiyuan Hu <xi...@gmail.com>.
Hi Peter,
Thanks for the reply! I noticed that after deployment the changelog
topic has high bytes in/sec and messages/sec but low bytes out/sec.
Once the app becomes unstable or traffic gets bumpy, this flips: the
changelog topic has low bytes in/sec and messages/sec but high bytes
out/sec. Is this normal? Why does it switch during unstable periods?

Thanks!

On Thu, Oct 3, 2019 at 5:03 AM Peter Levart <pe...@gmail.com> wrote:
>
> Hi Hu,
>
> On 10/2/19 8:54 PM, Xiyuan Hu wrote:
> > Hi All,
> >
> > I'm smoke testing my Kafka Streams app (v2.1.0) and noticed the
> > following behaviors:
> > 1) Outbound throughput of the changelog topic can reach 70mb/s while
> > inbound traffic is around 10mb/s.
> > 2) When traffic is bumpy, either due to producer/consumer throttling or
> > for other reasons I'm still debugging, changelog topic outbound
> > throughput can reach 160mb/s while inbound throughput is less than
> > 1mb/s. According to the monitoring tool, no messages are actually being
> > written to this topic at that time.
> >
> > Could anyone help me understand how the changelog topic works? Given
> > that the app still has consumer lag and producer timeouts (records
> > expired 5 min after batch creation), is this just a cluster
> > bandwidth/capacity issue?
> >
> > Thanks, appreciate all the help!!
>
> If you're using a KeyValueStore (the simplest of the stores), each
> update of a (K, V) entry that your stream's processor makes to the
> store results in one message written to the changelog topic of that
> store. Multiple consecutive updates to the same key in quick succession
> may be collapsed into a single message holding the latest version of
> the entry. So it is possible that keys in your input stream do not
> repeat very often within a short time, in which case the rate of
> messages/s is roughly the same in your input and changelog topics. The
> difference in bytes/s would then come from the input topic message
> size vs. the changelog topic message size. A changelog message is
> basically the whole (K, V) entry in the KVStore.
>
> Regards, Peter
>

Re: Kafka Streams changelog topic has 5 times higher out-traffic than in-traffic

Posted by Peter Levart <pe...@gmail.com>.
Hi Hu,

On 10/2/19 8:54 PM, Xiyuan Hu wrote:
> Hi All,
>
> I'm smoke testing my Kafka Streams app (v2.1.0) and noticed the
> following behaviors:
> 1) Outbound throughput of the changelog topic can reach 70mb/s while
> inbound traffic is around 10mb/s.
> 2) When traffic is bumpy, either due to producer/consumer throttling or
> for other reasons I'm still debugging, changelog topic outbound
> throughput can reach 160mb/s while inbound throughput is less than
> 1mb/s. According to the monitoring tool, no messages are actually being
> written to this topic at that time.
>
> Could anyone help me understand how the changelog topic works? Given
> that the app still has consumer lag and producer timeouts (records
> expired 5 min after batch creation), is this just a cluster
> bandwidth/capacity issue?
>
> Thanks, appreciate all the help!!

If you're using a KeyValueStore (the simplest of the stores), each
update of a (K, V) entry that your stream's processor makes to the
store results in one message written to the changelog topic of that
store. Multiple consecutive updates to the same key in quick succession
may be collapsed into a single message holding the latest version of
the entry. So it is possible that keys in your input stream do not
repeat very often within a short time, in which case the rate of
messages/s is roughly the same in your input and changelog topics. The
difference in bytes/s would then come from the input topic message
size vs. the changelog topic message size. A changelog message is
basically the whole (K, V) entry in the KVStore.
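
The collapsing of repeated updates can be sketched with a small
simulation. This is only a toy model, not Kafka Streams code: the function
name and the flush_every parameter (a stand-in for whatever triggers a
flush of buffered updates) are assumptions made for illustration.

```python
from collections import OrderedDict


def changelog_writes(updates, flush_every):
    """Return the (key, value) messages written to a simulated changelog.

    updates:     iterable of (key, value) pairs in arrival order
    flush_every: number of updates buffered between flushes
    """
    cache = OrderedDict()
    emitted = []
    for count, (key, value) in enumerate(updates, start=1):
        # A later update to the same key overwrites the buffered one.
        cache[key] = value
        if count % flush_every == 0:
            # On flush, only the latest value per key is written out.
            emitted.extend(cache.items())
            cache.clear()
    # Flush whatever is still buffered at the end.
    emitted.extend(cache.items())
    return emitted


# Six updates to one frequently repeating key.
hot = [("user-1", n) for n in range(6)]
# Six updates to six different keys.
cold = [("user-%d" % n, n) for n in range(6)]

hot_out = changelog_writes(hot, flush_every=3)
cold_out = changelog_writes(cold, flush_every=3)

print(len(hot_out), len(cold_out))
```

With a repeating key, six updates collapse into two changelog messages;
with distinct keys, all six are written, so the changelog message rate
matches the input rate.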

Regards, Peter