Posted to user@flink.apache.org by Basanth Gowda <ba...@gmail.com> on 2017/08/14 12:03:05 UTC

Aggregation on multiple Key combinations and multiple Windows

Hello,
I posted this yesterday, but I am not sure whether it went through.

I am fairly new to Flink. I have a use case that needs aggregation on
different combinations of keys, with windowing over different intervals. I
searched the documentation and the list but couldn't find anything that helps.


I came across this model in a presentation for Apex. It sums up what we
are trying to achieve. What is the best way to do this in Flink?



{"keys":[{"name":"campaignId","type":"integer"},
 {"name":"adId","type":"integer"},
 {"name":"creativeId","type":"integer"},
 {"name":"publisherId","type":"integer"},
 {"name":"adOrderId","type":"integer"}],
 "timeBuckets":["1h","1d"],
 "values":
 [{"name":"impressions","type":"integer","aggregators":["SUM"]},
 {"name":"clicks","type":"integer","aggregators":["SUM"]},
 {"name":"revenue","type":"integer"}],
 "dimensions":
 [{"combination":["campaignId","adId"]},
 {"combination":["creativeId","campaignId"]},
 {"combination":["campaignId"]},
 {"combination":["publisherId","adOrderId","campaignId"],
"additionalValues":["revenue:SUM"]}]
}


I have been able to do this with the block below, repeated for every
key + window combination. So in the above case there would be 8 blocks like
the one below (4 combinations and 2 window periods for each combination).

modelDataStream.keyBy("campaignId","adId")
        .timeWindow(Time.minutes(1))
        .trigger(CountTrigger.of(2))
        .reduce(..)
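
To make the "8 blocks" arithmetic concrete, here is a minimal sketch in plain
Java (no Flink dependency) that enumerates every key combination + time bucket
pair from the model above. The names AggregationPlan, Block and plan are
hypothetical and only for illustration; this is not a Flink API, just one
possible way to list the pairs instead of writing each block by hand.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: lists every (key combination, window size) pair.
public class AggregationPlan {

    /** One keyed, windowed aggregation: the fields to key by and the window length. */
    public static final class Block {
        public final List<String> keys;
        public final Duration window;

        Block(List<String> keys, Duration window) {
            this.keys = keys;
            this.window = window;
        }

        @Override
        public String toString() {
            return "keyBy" + keys + " window=" + window;
        }
    }

    /** Cross product of key combinations and time buckets. */
    public static List<Block> plan(List<List<String>> combinations,
                                   List<Duration> buckets) {
        List<Block> blocks = new ArrayList<>();
        for (List<String> keys : combinations) {
            for (Duration bucket : buckets) {
                blocks.add(new Block(keys, bucket));
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // The four "dimensions" combinations from the model above.
        List<List<String>> combinations = Arrays.asList(
                Arrays.asList("campaignId", "adId"),
                Arrays.asList("creativeId", "campaignId"),
                Arrays.asList("campaignId"),
                Arrays.asList("publisherId", "adOrderId", "campaignId"));

        // The two "timeBuckets": 1h and 1d.
        List<Duration> buckets = Arrays.asList(Duration.ofHours(1), Duration.ofDays(1));

        List<Block> blocks = plan(combinations, buckets);
        System.out.println(blocks.size() + " blocks");
        for (Block b : blocks) {
            System.out.println(b);
        }
    }
}
```

Each Block would then correspond to one keyBy(...).timeWindow(...) block like
the one above, so the list could presumably drive a loop rather than eight
copies of the same code.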