You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Elias Levy <fe...@gmail.com> on 2016/04/07 17:22:10 UTC

WindowedStream operation questions

An observation and a couple of question from a novice.

The observation: The Flink web site makes available ScalaDocs for
org.apache.flink.api.scala but not for org.apache.flink.streaming.api.scala.

Now for the questions:

Why can't you use map to transform a data stream, say convert all the
elements to integers (e.g. .map { x => 1 }), then create a tumbling
processing time window (e.g.
.windowAll(TumblingProcessingTimeWindows.of(Time.seconds(2))))?

Then the inverse: Why do AllWindiowedStream and WindowStream not have a map
method?

Re: WindowedStream operation questions

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
I saw the issue you opened. :-) I'll try and figure out how to get all the
Scaladocs on there.

Regarding the other questions. A WindowedStream is basically not a Stream
in itself but a stepping stone towards specifying a windowed operation that
results in a new Stream. So the pattern always has to be like this:

in
  .keyBy(...)
  .window(...)
  .reduce() // or fold() or apply()

Only with the final specification of the operation do you get a new
DataStream of elements. You can do a map() before the window operation but
not in between specifying a window and an operation that works on windows.

This differs from the Apache Beam (formerly Google Dataflow) model where
the window is a property of stream elements. There you can do:

in
  .map()
  .window()
  .map()
  .reduce()

And the reduce operation will work on the windows that where assigned in
the window operation.

I hope this helps somewhat but please let me know if I should go into
details.

Cheers,
Aljoscha

On Thu, 7 Apr 2016 at 17:22 Elias Levy <fe...@gmail.com> wrote:

> An observation and a couple of question from a novice.
>
> The observation: The Flink web site makes available ScalaDocs for
> org.apache.flink.api.scala but not for org.apache.flink.streaming.api.scala.
>
> Now for the questions:
>
> Why can't you use map to transform a data stream, say convert all the
> elements to integers (e.g. .map { x => 1 }), then create a tumbling
> processing time window (e.g.
> .windowAll(TumblingProcessingTimeWindows.of(Time.seconds(2))))?
>
> Then the inverse: Why do AllWindiowedStream and WindowStream not have a
> map method?
>