You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@kafka.apache.org by Nicolas Fouché <nf...@onfocus.io> on 2017/03/09 15:48:16 UTC

Streams - Got bit by log levels of the record cache

Hi,

I just wanted to share how we misinterpreted logs from Streams at the INFO
level. Version 0.10.1.1, I think it's the same for 0.10.2.0.

So, we configured `commit.interval.ms` and `cache.max.bytes.buffering`, and
we expected to always reach the commit interval before the maximum bytes.
It was confirmed by looking at the logs the commit interval actually
triggered commits. No logs about the maximum bytes.

But then I noticed in our monitoring graphs and by consuming output topics
that the rate of messages was way too high (thanks to pv[1]).

I checked the code of Streams to write a kinda "sequence diagram" to see
what was logged and at what level.

For `commit.interval.ms`:
- StreamThread#maybeCommit does an INFO recalling the configured duration
- then StreamThread#commitOne does one INFO per task.
- then NamedCache#flush announces the flushes at DEBUG level.

For `cache.max.bytes.buffering`:
- ThreadCache#maybeEvict does a TRACE (?!?) to announce that the cache is
too big
- then NamedCache#flush announces the flushes at DEBUG level.

So I think it makes the logs not exhaustive enough concerning the writes to
States + Changelog topics +  Output topics (Processor forwards). Thus logs
are easily misinterpreted.

Should I file a JIRA ?

[1] https://linux.die.net/man/1/pv

- Nicolas

Re: Streams - Got bit by log levels of the record cache

Posted by Nicolas Fouché <nf...@onfocus.io>.
Thanks. I added a link to this thread in KAFKA-4829.

2017-03-10 9:49 GMT+01:00 Michael Noll <mi...@confluent.io>:

> I think a related JIRA ticket is
> https://issues.apache.org/jira/browse/KAFKA-4829 (see Guozhang's comment
> about the ticket's scope).
>
> -Michael
>
>
> On Thu, Mar 9, 2017 at 6:22 PM, Damian Guy <da...@gmail.com> wrote:
>
> > Hi Nicolas,
> >
> > Please do file a JIRA.
> >
> > Many thanks,
> > Damian
> >
> > On Thu, 9 Mar 2017 at 15:54 Nicolas Fouché <nf...@onfocus.io> wrote:
> >
> > > Hi,
> > >
> > > I just wanted to share how we misinterpreted logs from Streams at the
> > INFO
> > > level. Version 0.10.1.1, I think it's the same for 0.10.2.0.
> > >
> > > So, we configured `commit.interval.ms` and
> `cache.max.bytes.buffering`,
> > > and
> > > we expected to always reach the commit interval before the maximum
> bytes.
> > > It was confirmed by looking at the logs the commit interval actually
> > > triggered commits. No logs about the maximum bytes.
> > >
> > > But then I noticed in our monitoring graphs and by consuming output
> > topics
> > > that the rate of messages was way too high (thanks to pv[1]).
> > >
> > > I checked the code of Streams to write a kinda "sequence diagram" to
> see
> > > what was logged and at what level.
> > >
> > > For `commit.interval.ms`:
> > > - StreamThread#maybeCommit does an INFO recalling the configured
> duration
> > > - then StreamThread#commitOne does one INFO per task.
> > > - then NamedCache#flush announces the flushes at DEBUG level.
> > >
> > > For `cache.max.bytes.buffering`:
> > > - ThreadCache#maybeEvict does a TRACE (?!?) to announce that the cache
> is
> > > too big
> > > - then NamedCache#flush announces the flushes at DEBUG level.
> > >
> > > So I think it makes the logs not exhaustive enough concerning the
> writes
> > to
> > > States + Changelog topics +  Output topics (Processor forwards). Thus
> > logs
> > > are easily misinterpreted.
> > >
> > > Should I file a JIRA ?
> > >
> > > [1] https://linux.die.net/man/1/pv
> > >
> > > - Nicolas
> > >
> >
>

Re: Streams - Got bit by log levels of the record cache

Posted by Michael Noll <mi...@confluent.io>.
I think a related JIRA ticket is
https://issues.apache.org/jira/browse/KAFKA-4829 (see Guozhang's comment
about the ticket's scope).

-Michael


On Thu, Mar 9, 2017 at 6:22 PM, Damian Guy <da...@gmail.com> wrote:

> Hi Nicolas,
>
> Please do file a JIRA.
>
> Many thanks,
> Damian
>
> On Thu, 9 Mar 2017 at 15:54 Nicolas Fouché <nf...@onfocus.io> wrote:
>
> > Hi,
> >
> > I just wanted to share how we misinterpreted logs from Streams at the
> INFO
> > level. Version 0.10.1.1, I think it's the same for 0.10.2.0.
> >
> > So, we configured `commit.interval.ms` and `cache.max.bytes.buffering`,
> > and
> > we expected to always reach the commit interval before the maximum bytes.
> > It was confirmed by looking at the logs the commit interval actually
> > triggered commits. No logs about the maximum bytes.
> >
> > But then I noticed in our monitoring graphs and by consuming output
> topics
> > that the rate of messages was way too high (thanks to pv[1]).
> >
> > I checked the code of Streams to write a kinda "sequence diagram" to see
> > what was logged and at what level.
> >
> > For `commit.interval.ms`:
> > - StreamThread#maybeCommit does an INFO recalling the configured duration
> > - then StreamThread#commitOne does one INFO per task.
> > - then NamedCache#flush announces the flushes at DEBUG level.
> >
> > For `cache.max.bytes.buffering`:
> > - ThreadCache#maybeEvict does a TRACE (?!?) to announce that the cache is
> > too big
> > - then NamedCache#flush announces the flushes at DEBUG level.
> >
> > So I think it makes the logs not exhaustive enough concerning the writes
> to
> > States + Changelog topics +  Output topics (Processor forwards). Thus
> logs
> > are easily misinterpreted.
> >
> > Should I file a JIRA ?
> >
> > [1] https://linux.die.net/man/1/pv
> >
> > - Nicolas
> >
>

Re: Streams - Got bit by log levels of the record cache

Posted by Damian Guy <da...@gmail.com>.
Hi Nicolas,

Please do file a JIRA.

Many thanks,
Damian

On Thu, 9 Mar 2017 at 15:54 Nicolas Fouché <nf...@onfocus.io> wrote:

> Hi,
>
> I just wanted to share how we misinterpreted logs from Streams at the INFO
> level. Version 0.10.1.1, I think it's the same for 0.10.2.0.
>
> So, we configured `commit.interval.ms` and `cache.max.bytes.buffering`,
> and
> we expected to always reach the commit interval before the maximum bytes.
> It was confirmed by looking at the logs the commit interval actually
> triggered commits. No logs about the maximum bytes.
>
> But then I noticed in our monitoring graphs and by consuming output topics
> that the rate of messages was way too high (thanks to pv[1]).
>
> I checked the code of Streams to write a kinda "sequence diagram" to see
> what was logged and at what level.
>
> For `commit.interval.ms`:
> - StreamThread#maybeCommit does an INFO recalling the configured duration
> - then StreamThread#commitOne does one INFO per task.
> - then NamedCache#flush announces the flushes at DEBUG level.
>
> For `cache.max.bytes.buffering`:
> - ThreadCache#maybeEvict does a TRACE (?!?) to announce that the cache is
> too big
> - then NamedCache#flush announces the flushes at DEBUG level.
>
> So I think it makes the logs not exhaustive enough concerning the writes to
> States + Changelog topics +  Output topics (Processor forwards). Thus logs
> are easily misinterpreted.
>
> Should I file a JIRA ?
>
> [1] https://linux.die.net/man/1/pv
>
> - Nicolas
>