Posted to user@flink.apache.org by Jon Yeargers <jo...@cedexis.com> on 2016/08/03 00:30:31 UTC

What is output from DataSet.print()?

Topology snip:

datastream = some_stream.keyBy(keySelector).timeWindow(Time.seconds(60)).reduce(new some_KeyReduce());


If I have a KeySelector that's pretty 'loose' (i.e. lots of matches), the
'some_KeyReduce' function gets hit frequently and some set of values is
printed out via 'datastream.print()'.

If I have a more stringent KeySelector, the 'some_KeyReduce' function never
gets called, but 'datastream.print()' still outputs numerous values.

So how are the KeySelector and the output of datastream.print()
related? Or are they?

Re: What is output from DataSet.print()?

Posted by Stephan Ewen <se...@apache.org>.
Hi!

The print() output is usually partitioned in the same way as the output of
the preceding operation.
Because your preceding operation is the keyBy/window operator, the output
should be partitioned by the key that the key selector extracts.

The reduce() function is only called if a window holds at least two
elements for a given key. If the window holds only one element for that key,
that single element is the result of the window and gets printed unchanged.
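
To make that concrete, here is a minimal plain-Java sketch of the per-key reduce behaviour described above. It does not use the Flink API: the class WindowReduceSketch, the method reduceWindow and the counter reduceCalls are hypothetical names invented for this illustration, and the sample strings are made up. It only imitates what happens to the elements of one window.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Illustration only: a simulation of keyed window reduction, not Flink code.
public class WindowReduceSketch {

    // Number of reducer invocations during the most recent reduceWindow call.
    static int reduceCalls = 0;

    // Groups the elements of one window by key and combines the elements of each key.
    // The reducer runs only from the second element of a key onwards; a key with a
    // single element passes that element through untouched.
    static <T, K> Map<K, T> reduceWindow(List<T> windowElements,
                                         Function<T, K> keySelector,
                                         BinaryOperator<T> reducer) {
        reduceCalls = 0;
        Map<K, T> results = new LinkedHashMap<>();
        for (T element : windowElements) {
            K key = keySelector.apply(element);
            if (!results.containsKey(key)) {
                // First element seen for this key: no reduce call is needed.
                results.put(key, element);
            } else {
                reduceCalls++;
                results.put(key, reducer.apply(results.get(key), element));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> window =
                Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
        BinaryOperator<String> join = (a, b) -> a + "+" + b;

        // "Loose" key (first letter): several elements share a key, so the reducer runs.
        Map<Character, String> loose = reduceWindow(window, s -> s.charAt(0), join);
        System.out.println(loose.values() + " reduce calls: " + reduceCalls);

        // "Stringent" key (the whole string): every key is unique, so the reducer
        // never runs, yet each element is still emitted as a result.
        Map<String, String> strict = reduceWindow(window, s -> s, join);
        System.out.println(strict.values() + " reduce calls: " + reduceCalls);
    }
}
```

With the loose key the five elements collapse into three results after two reduce calls; with the stringent key all five elements come out individually and the reducer is never invoked, which matches the behaviour described in the question.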

Greetings,
Stephan

