Posted to user@flink.apache.org by Felipe Gutierrez <fe...@gmail.com> on 2019/07/01 06:45:39 UTC

Re: Calculate a 4 hour time window for sum,count with sum and count output every 5 mins

No, there is no specific reason.
I am using it because I am computing the HyperLogLog over a window.
--
Felipe Gutierrez
skype: felipe.o.gutierrez
https://felipeogutierrez.blogspot.com


On Mon, Jul 1, 2019 at 12:34 AM Vijay Balakrishnan <bv...@gmail.com>
wrote:

> Hi Felipe,
> Thanks for the example. I will try a variation of that for mine. Is there
> a specific reason to use the HyperLogLogState?
>
> Vijay
>
> On Tue, Jun 18, 2019 at 3:00 AM Felipe Gutierrez <
> felipe.o.gutierrez@gmail.com> wrote:
>
>> Hi Vijay,
>>
>> I managed it by calling
>> "ctx.timerService().registerProcessingTimeTimer(timeoutTime);" in the
>> processElement method and clearing the state in the onTimer method. This is
>> my program [1].
>>
>> [1]
>> https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/WordHLLKeyedProcessWindowTwitter.java
>>
>> Kind Regards,
>> Felipe
>> --
>> Felipe Gutierrez
>> skype: felipe.o.gutierrez
>> https://felipeogutierrez.blogspot.com
>>
>>
>> On Mon, Jun 17, 2019 at 8:57 PM Rafi Aroch <ra...@gmail.com> wrote:
>>
>>> Hi Vijay,
>>>
>>> When using windows, you may use the 'trigger' to set a Custom Trigger
>>> which would trigger your *ProcessWindowFunction* accordingly.
>>>
>>> In your case, you would probably use:
>>>
>>>> *.trigger(ContinuousProcessingTimeTrigger.of(Time.minutes(5)))*
>>>>
>>>
>>> Thanks,
>>> Rafi
>>>
>>>
>>> On Mon, Jun 17, 2019 at 9:01 PM Vijay Balakrishnan <bv...@gmail.com>
>>> wrote:
>>>
>>>> I am also implementing the ProcessWindowFunction and accessing the
>>>> windowState to get data, but how do I push data out every 5 mins during a
>>>> 4 hr time window? I am adding a globalState to handle the 4 hr window.
>>>> Or should I still use the context.windowState even for the 4 hr window?
>>>>
>>>> public class MGroupingAggregateClass extends ProcessWindowFunction<....> {
>>>>
>>>>     private MapState<String, Object> timedGroupKeyState;
>>>>     private MapState<String, Object> globalGroupKeyState;
>>>>     private final MapStateDescriptor<String, Object> timedMapKeyStateDescriptor =
>>>>             new MapStateDescriptor<>("timedGroupKeyState", String.class, Object.class);
>>>>     private final MapStateDescriptor<String, Object> globalMapKeyStateDescriptor =
>>>>             new MapStateDescriptor<>("globalGroupKeyState", String.class, Object.class);
>>>>
>>>>     public void open(Configuration ..) {
>>>>         timedGroupKeyState = getRuntimeContext().getMapState(timedMapKeyStateDescriptor);
>>>>         globalGroupKeyState = getRuntimeContext().getMapState(globalMapKeyStateDescriptor);
>>>>     }
>>>>
>>>>     public void process(MonitoringTuple currKey, Context context,
>>>>             Iterable<Map<String, Object>> elements,
>>>>             Collector<Map<String, Object>> out) throws Exception {
>>>>         logger.info("Entered MGroupingAggregateWindowProcessing - process interval:{}, currKey:{}",
>>>>                 interval, currKey);
>>>>         // per-window state vs. state shared across all windows of this key
>>>>         timedGroupKeyState = context.windowState().getMapState(timedMapKeyStateDescriptor);
>>>>         globalGroupKeyState = context.globalState().getMapState(globalMapKeyStateDescriptor);
>>>>         ...
>>>>         // get data from state
>>>>         Object timedGroupStateObj = timedGroupKeyState.get(groupKey);
>>>>
>>>>         // how do I push the data out every 5 mins to the sink during the 4 hr window?
>>>>     }
>>>> }
>>>>
>>>> On Mon, Jun 17, 2019 at 10:06 AM Vijay Balakrishnan <bv...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I need to calculate count and sum over a 4 hour time window, with the
>>>>> current calculated results being output every 5 mins.
>>>>> How do I do that?
>>>>> Currently, I calculate results for 5 sec and 5 min time windows fine
>>>>> on the KeyedStream.
>>>>>
>>>>> // e.g. timeWindow = Time.seconds(timeIntervalL);
>>>>> Time timeWindow = getTimeWindowFromInterval(interval);
>>>>> KeyedStream<Map<String, Object>, ...> monitoringTupleKeyedStream =
>>>>>         kinesisStream.keyBy(...);
>>>>> final WindowedStream<Map<String, Object>, ...., TimeWindow> windowStream =
>>>>>         monitoringTupleKeyedStream.timeWindow(timeWindow);
>>>>> DataStream<....> enrichedMGStream = windowStream.aggregate(
>>>>>         new MGroupingWindowAggregateClass(...),
>>>>>         new MGroupingAggregateClass(....))
>>>>>         .map(new Monitoring...(...));
>>>>> enrichedMGStream.addSink(..);
>>>>>
>>>>> TIA,
>>>>> Vijay
>>>>>
>>>>
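
To make the requirement from the original question concrete (one 4 hr
window, results emitted every 5 mins, totals accumulating within the
window), here is a minimal plain-Java sketch. It does not use Flink:
PeriodicWindowModel and run are hypothetical names, and the model assumes
events arrive sorted by time and that a firing happens on every interval
boundary. It only illustrates what a window with a continuous 5 min trigger
would be expected to report at each firing, since such a trigger fires
without purging the window contents.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java model (no Flink) of one window that reports running totals periodically. */
final class PeriodicWindowModel {

    /**
     * Returns one {fireTime, count, sum} row per firing.
     * Assumption: eventTimes is sorted ascending; events at or after windowMs
     * belong to a later window and are ignored here.
     */
    static List<long[]> run(long windowMs, long fireEveryMs,
                            long[] eventTimes, long[] eventValues) {
        List<long[]> out = new ArrayList<>();
        long count = 0;
        long sum = 0;
        int next = 0;
        for (long fireAt = fireEveryMs; fireAt <= windowMs; fireAt += fireEveryMs) {
            // Fold in every event that arrived before this firing.
            while (next < eventTimes.length && eventTimes[next] < fireAt) {
                count++;
                sum += eventValues[next];
                next++;
            }
            // Totals are NOT reset between firings, so they accumulate.
            out.add(new long[] {fireAt, count, sum});
        }
        return out;
    }
}
```

With a 240 min window and a 5 min interval this produces 48 rows, and each
row carries the totals of everything seen so far in the window rather than
only the last 5 mins.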

Re: Calculate a 4 hour time window for sum,count with sum and count output every 5 mins

Posted by Vijay Balakrishnan <bv...@gmail.com>.
Hi Rafi,
I tried your approach with:

> windowStream.trigger(ContinuousEventTimeTrigger.of(Time.minutes(5)));

I can use .trigger with a ProcessWindowFunction, but it doesn't accumulate
data across windows, i.e. I want to collect data for a 5h window, with the
accumulated results sent to the output every 5 mins.

@Felipe - I am using a ProcessWindowFunction and cannot find a way to use
process() & onTimer with it.
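
For comparison, the timer-based approach Felipe describes (update state in
processElement, emit and eventually clear it in onTimer) can be sketched
without Flink as follows. All names here (TimerAggregatorModel, advanceTo,
hasState) are hypothetical, the timer is driven by hand instead of by a
timer service, and the sketch is only a model of the idea, not the code
from the linked program.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of per-key state plus a repeating timer (no Flink). */
final class TimerAggregatorModel {
    private final long emitEveryMs;   // e.g. 5 minutes
    private final long resetAfterMs;  // e.g. 4 hours
    // Per-key state: {count, sum, firstSeenAt, nextTimerAt}
    private final Map<String, long[]> state = new LinkedHashMap<>();
    final List<String> emitted = new ArrayList<>();

    TimerAggregatorModel(long emitEveryMs, long resetAfterMs) {
        this.emitEveryMs = emitEveryMs;
        this.resetAfterMs = resetAfterMs;
    }

    /** Analogue of processElement: update totals; arm the first timer for a new key. */
    void processElement(String key, long value, long now) {
        long[] s = state.computeIfAbsent(
                key, k -> new long[] {0, 0, now, now + emitEveryMs});
        s[0] += 1;
        s[1] += value;
    }

    /** Analogue of onTimer: fire every timer that is due at or before 'now'. */
    void advanceTo(long now) {
        Iterator<Map.Entry<String, long[]>> it = state.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, long[]> e = it.next();
            long[] s = e.getValue();
            while (s[3] <= now) {
                emitted.add(e.getKey() + "@" + s[3] + " count=" + s[0] + " sum=" + s[1]);
                if (s[3] - s[2] >= resetAfterMs) {
                    it.remove();      // clear the state once the long span is over
                    break;
                }
                s[3] += emitEveryMs;  // re-arm the timer for the next emission
            }
        }
    }

    boolean hasState(String key) {
        return state.containsKey(key);
    }
}
```

In an actual Flink job the same shape would presumably live in a keyed
process function, with the map replaced by keyed state and advanceTo
replaced by timers registered through ctx.timerService(); that mapping is
an assumption and is not shown in this thread.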
