Posted to jira@kafka.apache.org by "Yiming Zang (JIRA)" <ji...@apache.org> on 2019/03/07 00:11:00 UTC

[jira] [Commented] (KAFKA-6020) Broker side filtering

    [ https://issues.apache.org/jira/browse/KAFKA-6020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786242#comment-16786242 ] 

Yiming Zang commented on KAFKA-6020:
------------------------------------

Any updates on this?

We have similar needs on our side and strongly support the idea of broker-side filtering.

Our use case comes from N-DC replication. Imagine you have 5 data centers and need to replicate data everywhere: you typically have to run N*(N-1) mirror-maker jobs, which is 20 here, to replicate the messages from each local data center to all remote data centers. Each mirror maker has to read all 5 copies of the events, do some processing, and then replicate only one fifth of them. This is a huge waste of network bandwidth and CPU resources. If we had a way to pre-filter the events on the broker side, mirror maker would no longer need to read all 5 copies, which would be a huge saving, especially as we add more data centers in the future.
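
To make the arithmetic above concrete, here is a rough sketch of how the job count grows with the number of data centers. The function names are made up for illustration, and the "one fifth" figure is taken directly from the description above (each job forwards 1/N of what it reads); real topologies may differ.

```python
from fractions import Fraction


def mirror_maker_jobs(num_datacenters: int) -> int:
    """Directed (source, destination) pairs in a full mesh: N * (N - 1)."""
    return num_datacenters * (num_datacenters - 1)


def discarded_fraction(num_datacenters: int) -> Fraction:
    """Share of the events a job reads but does not replicate.

    Assumption (from the description above): each job forwards only
    1/N of the events it reads, so it throws away 1 - 1/N of them.
    """
    return 1 - Fraction(1, num_datacenters)


if __name__ == "__main__":
    for n in (5, 6, 10):
        print(n, mirror_maker_jobs(n), discarded_fraction(n))
```

With 5 data centers this gives the 20 jobs mentioned above, and the job count grows quadratically as data centers are added.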

> Broker side filtering
> ---------------------
>
>                 Key: KAFKA-6020
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6020
>             Project: Kafka
>          Issue Type: New Feature
>          Components: consumer
>            Reporter: Pavel Micka
>            Priority: Major
>              Labels: needs-kip
>
> Currently, it is not possible to filter messages on the broker side. Broker-side filtering would be convenient for filters with very low selectivity (one message in a few thousand). In my case, it means transferring several GB of data to the consumer, throwing it away, taking one message, and doing it again...
> While I understand that filtering by message body is not feasible (for performance reasons), I propose filtering just by message key prefix. This can be achieved without any deserialization, as the prefix to be matched can be passed as an array (hence the broker would just do an array prefix comparison).
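
The key-prefix comparison proposed in the issue could look roughly like the sketch below. This is not Kafka code and no such broker API exists in the issue as written; `key_has_prefix` and `filter_records` are hypothetical names, and records are modeled as plain (key, value) tuples of raw bytes purely for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical record shape: raw key bytes (possibly absent) and raw value bytes.
Record = Tuple[Optional[bytes], bytes]


def key_has_prefix(key: Optional[bytes], prefix: bytes) -> bool:
    """Byte-wise prefix comparison on the serialized key, no deserialization."""
    if key is None or len(key) < len(prefix):
        return False
    # Compare only the first len(prefix) bytes of the key.
    return all(key[i] == prefix[i] for i in range(len(prefix)))


def filter_records(records: Iterable[Record], prefix: bytes) -> Iterator[Record]:
    """Yield only the records whose key starts with the given prefix."""
    for key, value in records:
        if key_has_prefix(key, prefix):
            yield key, value


if __name__ == "__main__":
    batch = [(b"dc1:order", b"a"), (b"dc2:order", b"b"), (None, b"c")]
    print(list(filter_records(batch, b"dc1:")))
```

The value bytes are never inspected, which matches the point in the issue that only the key prefix, not the message body, would be examined.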



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)