Posted to jira@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2017/12/18 19:08:01 UTC

[jira] [Commented] (KAFKA-6376) Improve Streams metrics for skipped records

    [ https://issues.apache.org/jira/browse/KAFKA-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16295486#comment-16295486 ] 

Guozhang Wang commented on KAFKA-6376:
--------------------------------------

Regarding 3), what I meant is that in the user's own `process()` call, users can, for example, try-catch some expected exceptions, stop processing the record, and not forward it downstream. Arguably this is outside the Streams library itself, and hence we would not need to record it ourselves. But given that the Streams processor context does expose the metrics registry, maybe we can think of a way to let users record such drops in the library's own metrics as well. I can also see that this may be far-fetched, though, and would not want to do it as part of this JIRA.
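
To make the scenario concrete, here is a minimal sketch in plain Java of a user-side `process()` that catches an expected exception, counts the record as skipped, and does not forward it. Both `SkippedRecordsCounter` and `ParsingProcessor` are hypothetical stand-ins invented for illustration; they are not Kafka Streams classes, and a real processor would presumably obtain a sensor through the metrics registry exposed by the processor context instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a "skipped-records" sensor; not a Kafka Streams API. */
final class SkippedRecordsCounter {
    private final LongAdder skipped = new LongAdder();

    void record() {
        skipped.increment();
    }

    long count() {
        return skipped.sum();
    }
}

/** Hypothetical processor, loosely modelled on a user's own process() method. */
final class ParsingProcessor {
    private final SkippedRecordsCounter skippedRecords;
    // Stands in for forwarding to downstream processors.
    private final List<Integer> forwarded = new ArrayList<>();

    ParsingProcessor(SkippedRecordsCounter skippedRecords) {
        this.skippedRecords = skippedRecords;
    }

    void process(String key, String value) {
        final int parsed;
        try {
            parsed = Integer.parseInt(value);
        } catch (NumberFormatException e) {
            // Expected failure: count the drop and stop processing this record.
            skippedRecords.record();
            return;
        }
        // Only records that parsed successfully are forwarded.
        forwarded.add(parsed);
    }

    List<Integer> forwarded() {
        return forwarded;
    }
}
```

In this sketch the drop is counted at the point where the user code decides to skip the record, which is the only place that knows the record was dropped on purpose.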

Regarding `null` keys being dropped, yes, that is definitely another scenario we should consider in this ticket.

> Improve Streams metrics for skipped records
> -------------------------------------------
>
>                 Key: KAFKA-6376
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6376
>             Project: Kafka
>          Issue Type: Bug
>          Components: metrics, streams
>    Affects Versions: 1.0.0
>            Reporter: Matthias J. Sax
>              Labels: needs-kip
>
> Copied from the KIP-210 discussion thread:
> {quote}
> Note that currently we have two metrics for `skipped-records` on different
> levels:
> 1) on the highest level, the thread level, we have a `skipped-records`
> metric that records all records skipped due to deserialization errors.
> 2) on the lower processor-node level, we have a
> `skippedDueToDeserializationError` metric that records the records skipped
> on that specific source node due to deserialization errors.
> So you can see that 1) does not cover any other scenarios and can just be
> thought of as an aggregate of 2) across all the tasks' source nodes.
> However, there are other places that can cause a record to be dropped, for
> example:
> 1) https://issues.apache.org/jira/browse/KAFKA-5784: records could be
> dropped because the window has elapsed.
> 2) KIP-210: records could be dropped on the producer side.
> 3) records could be dropped during user-customized processing on errors.
> {quote}
> [~guozhang] Not sure what you mean by "3) records could be dropped during user-customized processing on errors."
> Btw: we also drop records with a {{null}} key and/or value for certain DSL operations. This should be included as well.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)