You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Ashish Attarde <as...@gmail.com> on 2018/04/06 18:09:01 UTC

Lot of data generated in out file

Hi Flink Team,

I am seeing one of the out file for on my task manager is dumping lot of
data.
Not sure, why this is happening. All the data that is getting dumped in out
file is ideally what *parsedInput *stream should be getting.



Here is the flink program that is executing:

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
env.setParallelism(1);

DataStream<String> rawInput = env.addSource(new FlinkKafkaConsumer010<>(
                                        "event-ft",
                                        new SimpleStringSchema(),
                                        kafkaProps).setStartFromLatest());

DataStream<String> input2 = rawInput
                            .map(new KafkaMsgReads());

DataStream<EventRec> parsedInput = input2
                                    .flatMap(new Splitter())
                                    .assignTimestampsAndWatermarks(new
BoundedOutOfOrdernessTimestampExtractor<FTRecord>(Time.seconds(2)) {
                                    @Override
                                    public long
extractTimestamp(EventRec record) {
                                        return
record.getmTimeStamp()/TT_SCALE_FACTOR;
                                    }
                                }).rebalance().map(new RawInputCounter());

parsedInput
        .keyBy("mflowHashLSB","mflowHashMSB")
        .window(SlidingEventTimeWindows.of(Time.milliseconds(1000),Time.milliseconds(950)))
        .allowedLateness(Time.seconds(1))
        .apply(new CRWindow());

parsedInput.writeUsingOutputFormat(new DiscardingOutputFormat<>());

env.execute();


Here is the definition of *CRWindow* class:


public static class CRWindow  implements WindowFunction<FTRecord,
FTFlow, Tuple, TimeWindow> {

    @Override
    public void apply(Tuple key, TimeWindow window, Iterable<FTRecord>
ftRecords, Collector<FTFlow> collector) {
        return;
    }

}


Also, is there any elaborate documentation of windowing mechanism
available? I am intereseted in using windowing with ability to push
the events from one window to future window. Similar funcationality
exist in storm for pushing an event to subsequent window.


Thanks

-Ashish

Re: Lot of data generated in out file

Posted by Ashish Attarde <as...@gmail.com>.
Thanks Gordon for your reply.

The out file mistry got resolved. Someone accidently, modified the POJO
code on server that I work on to, and had put in println.

Thank you for the information. I am experimenting with windowing to
understand better and fit in my use case.

Thanks
-Ashish

On Sun, Apr 15, 2018 at 10:09 PM, Tzu-Li (Gordon) Tai <tz...@apache.org>
wrote:

> Hi Ashish,
>
> I don't really see why there are outputs in the out file for the program
> you
> provided. Perhaps others could chime in here ..
>
> As for your second question regarding window outputs:
> Yes, subsequent window operators should definitely be doable in Flink.
> This is just a matter of multiple transformations in your pipeline.
> The only restriction right now, is that after a window operation, the
> stream
> is no longer a KeyedStream, so you would need to "re-key" the stream before
> applying the second windowed transformation.
>
> Cheers,
> Gordon
>
>
>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.
> n4.nabble.com/
>



-- 

Thanks
-Ashish Attarde

Re: Lot of data generated in out file

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi Ashish,

I don't really see why there are outputs in the out file for the program you
provided. Perhaps others could chime in here ..

As for your second question regarding window outputs:
Yes, subsequent window operators should definitely be doable in Flink.
This is just a matter of multiple transformations in your pipeline.
The only restriction right now, is that after a window operation, the stream
is no longer a KeyedStream, so you would need to "re-key" the stream before
applying the second windowed transformation.

Cheers,
Gordon



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/