Posted to user@flink.apache.org by Christophe Jolif <cj...@gmail.com> on 2018/01/05 16:51:22 UTC

"keyed" aggregation

Hi all,

I'm sourcing from a Kafka topic, using the key of the Kafka message to key
the stream, then doing some aggregation on the keyed stream.

Now I want to sink back to a different Kafka topic, reusing the same key.
The problem is that my aggregation "lost" the key. Obviously I can make
sure my aggregation function keeps the key, but that feels a bit odd, as
the key does not really relate to the aggregation itself.

Is there a best practice in this area? How should the key be carried when
moving from a Kafka source to a sink while doing some aggregation along the way?

Thanks,
-- 
Christophe

Re: "keyed" aggregation

Posted by Till Rohrmann <tr...@apache.org>.
Hi Christophe,

If you don't have a way to recompute the key from the aggregation result,
then you have to write an aggregation function which explicitly keeps it
(e.g. a tuple value where the first entry is the key and the second is the
aggregate value).
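As a sketch of that idea (plain Java, assuming a simple sum aggregation with
a String key; the class and method names are illustrative, but they mirror
the shape of Flink's AggregateFunction contract, createAccumulator / add /
getResult, so the accumulator carries the key through to the result):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical accumulator that keeps the key alongside the running sum,
// so the key survives the aggregation and can be reused by the Kafka sink.
public class KeyedSum {

    // Accumulator is a (key, running sum) pair; the key is filled in
    // by the first element that is added.
    public static Map.Entry<String, Long> createAccumulator() {
        return new SimpleEntry<>(null, 0L);
    }

    // Each input element is itself a (key, value) pair; we copy the key
    // into the accumulator and add the value to the running sum.
    public static Map.Entry<String, Long> add(Map.Entry<String, Long> element,
                                              Map.Entry<String, Long> acc) {
        return new SimpleEntry<>(element.getKey(),
                                 acc.getValue() + element.getValue());
    }

    // The result still carries the key, so a downstream producer can use
    // it as the Kafka record key when writing to the output topic.
    public static Map.Entry<String, Long> getResult(Map.Entry<String, Long> acc) {
        return acc;
    }

    public static void main(String[] args) {
        Map.Entry<String, Long> acc = createAccumulator();
        acc = add(new SimpleEntry<>("sensor-1", 3L), acc);
        acc = add(new SimpleEntry<>("sensor-1", 4L), acc);
        Map.Entry<String, Long> result = getResult(acc);
        System.out.println(result.getKey() + "=" + result.getValue());
    }
}
```

The same shape works for richer aggregates: whatever the accumulator
computes, including the key as one field of it means no extra join or
lookup is needed before sinking.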

Cheers,
Till
