Posted to commits@flink.apache.org by tw...@apache.org on 2017/08/09 11:57:41 UTC

[1/7] flink git commit: [FLINK-7370] [docs] Relocate files according to new structure

Repository: flink
Updated Branches:
  refs/heads/master ff70cc3af -> 31b86f605


http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/windows.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/windows.md b/docs/dev/stream/windows.md
deleted file mode 100644
index ab53a3a..0000000
--- a/docs/dev/stream/windows.md
+++ /dev/null
@@ -1,1039 +0,0 @@
----
-title: "Windows"
-nav-parent_id: operators
-nav-id: windows
-nav-pos: 10
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-Windows are at the heart of processing infinite streams. Windows split the stream into "buckets" of finite size,
-over which we can apply computations. This document focuses on how windowing is performed in Flink and how
-programmers can make the most of the functionality it offers.
-
-The general structure of a windowed Flink program is presented below. The first snippet refers to *keyed* streams,
-while the second one refers to *non-keyed* streams. As one can see, the only difference is the `keyBy(...)` call for the keyed streams
-and the `window(...)` call, which becomes `windowAll(...)` for non-keyed streams. This structure will also serve as a roadmap
-for the rest of the page.
-
-**Keyed Windows**
-
-    stream
-           .keyBy(...)          <-  keyed versus non-keyed windows
-           .window(...)         <-  required: "assigner"
-          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
-          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
-          [.allowedLateness()]  <-  optional, else zero
-           .reduce/fold/apply() <-  required: "function"
-
-**Non-Keyed Windows**
-
-    stream
-           .windowAll(...)      <-  required: "assigner"
-          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
-          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
-          [.allowedLateness()]  <-  optional, else zero
-           .reduce/fold/apply() <-  required: "function"
-
-In the above, the commands in square brackets ([...]) are optional. This reveals that Flink allows you to customize your
-windowing logic in many different ways so that it best fits your needs.
-
-* This will be replaced by the TOC
-{:toc}
-
-## Window Lifecycle
-
-In a nutshell, a window is **created** as soon as the first element that should belong to this window arrives, and the
-window is **completely removed** when the time (event or processing time) passes its end timestamp plus the user-specified
-`allowed lateness` (see [Allowed Lateness](#allowed-lateness)). Flink guarantees removal only for time-based
-windows and not for other types, *e.g.* global windows (see [Window Assigners](#window-assigners)). For example, with an
-event-time-based windowing strategy that creates non-overlapping (or tumbling) windows every 5 minutes and has an allowed
-lateness of 1 min, Flink will create a new window for the interval between `12:00` and `12:05` when the first element with
-a timestamp that falls into this interval arrives, and it will remove it when the watermark passes the `12:06`
-timestamp.
-
-In addition, each window will have a `Trigger` (see [Triggers](#triggers)) and a function (`WindowFunction`, `ReduceFunction` or
-`FoldFunction`) (see [Window Functions](#window-functions)) attached to it. The function will contain the computation to
-be applied to the contents of the window, while the `Trigger` specifies the conditions under which the window is
-considered ready for the function to be applied. A triggering policy might be something like "when the number of elements
-in the window is more than 4", or "when the watermark passes the end of the window". A trigger can also decide to
-purge a window's contents any time between its creation and removal. Purging in this case only refers to the elements
-in the window, and *not* the window metadata. This means that new data can still be added to that window.
-
-Apart from the above, you can specify an `Evictor` (see [Evictors](#evictors)) which will be able to remove
-elements from the window after the trigger fires and before and/or after the function is applied.
-
-In the following we go into more detail for each of the components above. We start with the required parts in the above
-snippet (see [Keyed vs Non-Keyed Windows](#keyed-vs-non-keyed-windows), [Window Assigners](#window-assigners), and
-[Window Functions](#window-functions)) before moving to the optional ones.
-
-## Keyed vs Non-Keyed Windows
-
-The first thing to specify is whether your stream should be keyed or not. This has to be done before defining the window.
-Using `keyBy(...)` will split your infinite stream into logical keyed streams. If `keyBy(...)` is not called, your
-stream is not keyed.
-
-In the case of keyed streams, any attribute of your incoming events can be used as a key
-(more details [here]({{ site.baseurl }}/dev/api_concepts.html#specifying-keys)). Having a keyed stream will
-allow your windowed computation to be performed in parallel by multiple tasks, as each logical keyed stream can be processed
-independently from the rest. All elements referring to the same key will be sent to the same parallel task.
-
-In the case of non-keyed streams, your original stream will not be split into multiple logical streams and all the windowing logic
-will be performed by a single task, *i.e.* with a parallelism of 1.
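-
-As a small illustration (the `Event` type and its `userId` field are hypothetical), a concrete key selector for the `<key selector>` placeholder used throughout this page could look like this:
-
-{% highlight java %}
-DataStream<Event> events = ...;
-
-events
-    .keyBy(new KeySelector<Event, String>() {
-      @Override
-      public String getKey(Event event) {
-        // all events with the same userId are processed by the same parallel task
-        return event.userId;
-      }
-    })
-    .window(TumblingEventTimeWindows.of(Time.minutes(5)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}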
-
-## Window Assigners
-
-After specifying whether your stream is keyed or not, the next step is to define a *window assigner*.
-The window assigner defines how elements are assigned to windows. This is done by specifying the `WindowAssigner`
-of your choice in the `window(...)` (for *keyed* streams) or the `windowAll()` (for *non-keyed* streams) call.
-
-A `WindowAssigner` is responsible for assigning each incoming element to one or more windows. Flink comes
-with pre-defined window assigners for the most common use cases, namely *tumbling windows*,
-*sliding windows*, *session windows* and *global windows*. You can also implement a custom window assigner by
-extending the `WindowAssigner` class. All built-in window assigners (except the global
-windows) assign elements to windows based on time, which can either be processing time or event
-time. Please take a look at our section on [event time]({{ site.baseurl }}/dev/event_time.html) to learn
-about the difference between processing time and event time and how timestamps and watermarks are generated.
-
-In the following, we show how Flink's pre-defined window assigners work and how they are used
-in a DataStream program. The following figures visualize the workings of each assigner. The purple circles
-represent elements of the stream, which are partitioned by some key (in this case *user 1*, *user 2* and *user 3*).
-The x-axis shows the progress of time.
-
-### Tumbling Windows
-
-A *tumbling windows* assigner assigns each element to a window of a specified *window size*.
-Tumbling windows have a fixed size and do not overlap. For example, if you specify a tumbling
-window with a size of 5 minutes, the current window will be evaluated and a new window will be
-started every five minutes as illustrated by the following figure.
-
-<img src="{{ site.baseurl }}/fig/tumbling-windows.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use tumbling windows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-// tumbling event-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// tumbling processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// daily tumbling event-time windows offset by -8 hours.
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-// tumbling event-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// tumbling processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// daily tumbling event-time windows offset by -8 hours.
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
-`Time.minutes(x)`, and so on.
-
-As shown in the last example, tumbling window assigners also take an optional `offset`
-parameter that can be used to change the alignment of windows. For example, without offsets
-hourly tumbling windows are aligned with epoch, that is you will get windows such as
-`1:00:00.000 - 1:59:59.999`, `2:00:00.000 - 2:59:59.999` and so on. If you want to change
-that you can give an offset. With an offset of 15 minutes you would, for example, get
-`1:15:00.000 - 2:14:59.999`, `2:15:00.000 - 3:14:59.999` etc.
-An important use case for offsets is to adjust windows to timezones other than UTC-0.
-For example, in China you would have to specify an offset of `Time.hours(-8)`.
-
-### Sliding Windows
-
-The *sliding windows* assigner assigns elements to windows of fixed length. Similar to a tumbling
-windows assigner, the size of the windows is configured by the *window size* parameter.
-An additional *window slide* parameter controls how frequently a sliding window is started. Hence,
-sliding windows can be overlapping if the slide is smaller than the window size. In this case elements
-are assigned to multiple windows.
-
-For example, you could have windows of size 10 minutes that slide by 5 minutes. This way, every
-5 minutes you get a window that contains the events that arrived during the last 10 minutes, as depicted by the
-following figure.
-
-<img src="{{ site.baseurl }}/fig/sliding-windows.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use sliding windows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-// sliding event-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// sliding processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// sliding processing-time windows offset by -8 hours
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-// sliding event-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// sliding processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// sliding processing-time windows offset by -8 hours
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
-`Time.minutes(x)`, and so on.
-
-As shown in the last example, sliding window assigners also take an optional `offset` parameter
-that can be used to change the alignment of windows. For example, without offsets hourly windows
-sliding by 30 minutes are aligned with epoch, that is you will get windows such as
-`1:00:00.000 - 1:59:59.999`, `1:30:00.000 - 2:29:59.999` and so on. If you want to change that
-you can give an offset. With an offset of 15 minutes you would, for example, get
-`1:15:00.000 - 2:14:59.999`, `1:45:00.000 - 2:44:59.999` etc.
-An important use case for offsets is to adjust windows to timezones other than UTC-0.
-For example, in China you would have to specify an offset of `Time.hours(-8)`.
-
-### Session Windows
-
-The *session windows* assigner groups elements by sessions of activity. Session windows do not overlap and
-do not have a fixed start and end time, in contrast to *tumbling windows* and *sliding windows*. Instead, a
-session window closes when it does not receive elements for a certain period of time, *i.e.*, when a gap of
-inactivity occurs. A session window assigner is configured with the *session gap*, which
-defines the required period of inactivity. When this period expires, the current session closes
-and subsequent elements are assigned to a new session window.
-
-<img src="{{ site.baseurl }}/fig/session-windows.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use session windows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-// event-time session windows
-input
-    .keyBy(<key selector>)
-    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>);
-
-// processing-time session windows
-input
-    .keyBy(<key selector>)
-    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-// event-time session windows
-input
-    .keyBy(<key selector>)
-    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>)
-
-// processing-time session windows
-input
-    .keyBy(<key selector>)
-    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
-`Time.minutes(x)`, and so on.
-
-<span class="label label-danger">Attention</span> Since session windows do not have a fixed start and end,
-they are  evaluated differently than tumbling and sliding windows. Internally, a session window operator
-creates a new window for each arriving record and merges windows together if their are closer to each other
-than the defined gap.
-In order to be mergeable, a session window operator requires a merging [Trigger](#triggers) and a merging
-[Window Function](#window-functions), such as `ReduceFunction` or `WindowFunction`
-(`FoldFunction` cannot merge.)
-
-### Global Windows
-
-A *global windows* assigner assigns all elements with the same key to the same single *global window*.
-This windowing scheme is only useful if you also specify a custom [trigger](#triggers). Otherwise,
-no computation will be performed, as the global window does not have a natural end at
-which we could process the aggregated elements.
-
-<img src="{{ site.baseurl }}/fig/non-windowed.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use a global window.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(GlobalWindows.create())
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(GlobalWindows.create())
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-## Window Functions
-
-After defining the window assigner, we need to specify the computation that we want
-to perform on each of these windows. This is the responsibility of the *window function*, which is used to process the
-elements of each (possibly keyed) window once the system determines that a window is ready for processing
-(see [triggers](#triggers) for how Flink determines when a window is ready).
-
-The window function can be one of `ReduceFunction`, `FoldFunction` or `WindowFunction`. The first
-two can be executed more efficiently (see the [State Size](#useful-state-size-considerations) section) because Flink can incrementally aggregate
-the elements for each window as they arrive. A `WindowFunction` gets an `Iterable` for all the elements contained in a
-window and additional meta information about the window to which the elements belong.
-
-A windowed transformation with a `WindowFunction` cannot be executed as efficiently as the other
-cases because Flink has to buffer *all* elements for a window internally before invoking the function.
-This can be mitigated by combining a `WindowFunction` with a `ReduceFunction` or `FoldFunction` to
-get both incremental aggregation of window elements and the additional window metadata that the
-`WindowFunction` receives. We will look at examples for each of these variants.
-
-### ReduceFunction
-
-A `ReduceFunction` specifies how two elements from the input are combined to produce
-an output element of the same type. Flink uses a `ReduceFunction` to incrementally aggregate
-the elements of a window.
-
-A `ReduceFunction` can be defined and used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
-      public Tuple2<String, Long> reduce(Tuple2<String, Long> v1, Tuple2<String, Long> v2) {
-        return new Tuple2<>(v1.f0, v1.f1 + v2.f1);
-      }
-    });
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .reduce { (v1, v2) => (v1._1, v1._2 + v2._2) }
-{% endhighlight %}
-</div>
-</div>
-
-The above example sums up the second fields of the tuples for all elements in a window.
-
-### FoldFunction
-
-A `FoldFunction` specifies how an input element of the window is combined with an element of
-the output type. The `FoldFunction` is incrementally called for each element that is added
-to the window and the current output value. The first element is combined with a pre-defined initial value of the output type.
-
-A `FoldFunction` can be defined and used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .fold("", new FoldFunction<Tuple2<String, Long>, String>> {
-       public String fold(String acc, Tuple2<String, Long> value) {
-         return acc + value.f1;
-       }
-    });
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .fold("") { (acc, v) => acc + v._2 }
-{% endhighlight %}
-</div>
-</div>
-
-The above example appends all input `Long` values to an initially empty `String`.
-
-<span class="label label-danger">Attention</span> `fold()` cannot be used with session windows or other mergeable windows.
-
-### WindowFunction - The Generic Case
-
-A `WindowFunction` gets an `Iterable` containing all the elements of the window and provides
-the most flexibility of all window functions. This comes
-at the cost of performance and resource consumption, because elements cannot be incrementally
-aggregated but instead need to be buffered internally until the window is considered ready for processing.
-
-The signature of a `WindowFunction` looks as follows:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-public interface WindowFunction<IN, OUT, KEY, W extends Window> extends Function, Serializable {
-
-  /**
-   * Evaluates the window and outputs none or several elements.
-   *
-   * @param key The key for which this window is evaluated.
-   * @param window The window that is being evaluated.
-   * @param input The elements in the window being evaluated.
-   * @param out A collector for emitting elements.
-   *
-   * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-   */
-  void apply(KEY key, W window, Iterable<IN> input, Collector<OUT> out) throws Exception;
-}
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-trait WindowFunction[IN, OUT, KEY, W <: Window] extends Function with Serializable {
-
-  /**
-    * Evaluates the window and outputs none or several elements.
-    *
-    * @param key    The key for which this window is evaluated.
-    * @param window The window that is being evaluated.
-    * @param input  The elements in the window being evaluated.
-    * @param out    A collector for emitting elements.
-    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-    */
-  def apply(key: KEY, window: W, input: Iterable[IN], out: Collector[OUT])
-}
-{% endhighlight %}
-</div>
-</div>
-
-A `WindowFunction` can be defined and used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .apply(new MyWindowFunction());
-
-/* ... */
-
-public class MyWindowFunction implements WindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
-
-  public void apply(String key, TimeWindow window, Iterable<Tuple2<String, Long>> input, Collector<String> out) {
-    long count = 0;
-    for (Tuple2<String, Long> in : input) {
-      count++;
-    }
-    out.collect("Window: " + window + " count: " + count);
-  }
-}
-
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .apply(new MyWindowFunction())
-
-/* ... */
-
-class MyWindowFunction extends WindowFunction[(String, Long), String, String, TimeWindow] {
-
-  def apply(key: String, window: TimeWindow, input: Iterable[(String, Long)], out: Collector[String]): Unit = {
-    var count = 0L
-    for (in <- input) {
-      count = count + 1
-    }
-    out.collect(s"Window $window count: $count")
-  }
-}
-{% endhighlight %}
-</div>
-</div>
-
-The example shows a `WindowFunction` to count the elements in a window. In addition, the window function adds information about the window to the output.
-
-<span class="label label-danger">Attention</span> Note that using `WindowFunction` for simple aggregates such as count is quite inefficient. The next section shows how a `ReduceFunction` can be combined with a `WindowFunction` to get both incremental aggregation and the added information of a `WindowFunction`.
-
-### ProcessWindowFunction
-
-In places where a `WindowFunction` can be used you can also use a `ProcessWindowFunction`. This
-is very similar to a `WindowFunction`, except that the interface allows querying more information
-about the context in which the window evaluation happens.
-
-This is the `ProcessWindowFunction` interface:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-public abstract class ProcessWindowFunction<IN, OUT, KEY, W extends Window> implements Function {
-
-    /**
-     * Evaluates the window and outputs none or several elements.
-     *
-     * @param key The key for which this window is evaluated.
-     * @param context The context in which the window is being evaluated.
-     * @param elements The elements in the window being evaluated.
-     * @param out A collector for emitting elements.
-     *
-     * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-     */
-    public abstract void process(
-            KEY key,
-            Context context,
-            Iterable<IN> elements,
-            Collector<OUT> out) throws Exception;
-
-    /**
-     * The context holding window metadata
-     */
-    public abstract class Context {
-        /**
-         * @return The window that is being evaluated.
-         */
-        public abstract W window();
-    }
-}
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-abstract class ProcessWindowFunction[IN, OUT, KEY, W <: Window] extends Function {
-
-  /**
-    * Evaluates the window and outputs none or several elements.
-    *
-    * @param key      The key for which this window is evaluated.
-    * @param context  The context in which the window is being evaluated.
-    * @param elements The elements in the window being evaluated.
-    * @param out      A collector for emitting elements.
-    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-    */
-  @throws[Exception]
-  def process(
-      key: KEY,
-      context: Context,
-      elements: Iterable[IN],
-      out: Collector[OUT])
-
-  /**
-    * The context holding window metadata
-    */
-  abstract class Context {
-    /**
-      * @return The window that is being evaluated.
-      */
-    def window: W
-  }
-}
-{% endhighlight %}
-</div>
-</div>
-
-It can be used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .process(new MyProcessWindowFunction());
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .process(new MyProcessWindowFunction())
-{% endhighlight %}
-</div>
-</div>
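-
-For completeness, here is a minimal sketch of what the `MyProcessWindowFunction` used above could look like. The class is hypothetical and simply mirrors the counting `WindowFunction` example, with the window obtained through the context:
-
-{% highlight java %}
-public class MyProcessWindowFunction
-    extends ProcessWindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
-
-  @Override
-  public void process(String key, Context context, Iterable<Tuple2<String, Long>> elements, Collector<String> out) {
-    long count = 0;
-    // count the buffered elements of the window
-    for (Tuple2<String, Long> element : elements) {
-      count++;
-    }
-    out.collect("Window: " + context.window() + " count: " + count);
-  }
-}
-{% endhighlight %}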
-
-### WindowFunction with Incremental Aggregation
-
-A `WindowFunction` can be combined with either a `ReduceFunction` or a `FoldFunction` to
-incrementally aggregate elements as they arrive in the window.
-When the window is closed, the `WindowFunction` will be provided with the aggregated result.
-This allows it to incrementally compute windows while still having access to the
-additional window meta information of the `WindowFunction`.
-
-<span class="label label-info">Note</span> You can also `ProcessWindowFunction` instead of
-`WindowFunction` for incremental window aggregation.
-
-#### Incremental Window Aggregation with FoldFunction
-
-The following example shows how an incremental `FoldFunction` can be combined with
-a `WindowFunction` to extract the number of events in the window and to return
-the key and the end time of the window as well.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<SensorReading> input = ...;
-
-input
-  .keyBy(<key selector>)
-  .window(<window assigner>)
-  .fold(new Tuple3<String, Long, Integer>("", 0L, 0), new MyFoldFunction(), new MyWindowFunction());
-
-// Function definitions
-
-private static class MyFoldFunction
-    implements FoldFunction<SensorReading, Tuple3<String, Long, Integer>> {
-
-  public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
-      Integer cur = acc.getField(2);
-      acc.setField(2, cur + 1);
-      return acc;
-  }
-}
-
-private static class MyWindowFunction
-    implements WindowFunction<Tuple3<String, Long, Integer>, Tuple3<String, Long, Integer>, String, TimeWindow> {
-
-  public void apply(String key,
-                    TimeWindow window,
-                    Iterable<Tuple3<String, Long, Integer>> counts,
-                    Collector<Tuple3<String, Long, Integer>> out) {
-    Integer count = counts.iterator().next().getField(2);
-    out.collect(new Tuple3<String, Long, Integer>(key, window.getEnd(),count));
-  }
-}
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-
-val input: DataStream[SensorReading] = ...
-
-input
- .keyBy(<key selector>)
- .window(<window assigner>)
- .fold(
-    ("", 0L, 0),
-    (acc: (String, Long, Int), r: SensorReading) => { ("", 0L, acc._3 + 1) },
-    ( key: String,
-      window: TimeWindow,
-      counts: Iterable[(String, Long, Int)],
-      out: Collector[(String, Long, Int)] ) =>
-      {
-        val count = counts.iterator.next()
-        out.collect((key, window.getEnd, count._3))
-      }
-  )
-
-{% endhighlight %}
-</div>
-</div>
-
-#### Incremental Window Aggregation with ReduceFunction
-
-The following example shows how an incremental `ReduceFunction` can be combined with
-a `WindowFunction` to return the smallest event in a window along
-with the start time of the window.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<SensorReading> input = ...;
-
-input
-  .keyBy(<key selector>)
-  .window(<window assigner>)
-  .reduce(new MyReduceFunction(), new MyWindowFunction());
-
-// Function definitions
-
-private static class MyReduceFunction implements ReduceFunction<SensorReading> {
-
-  public SensorReading reduce(SensorReading r1, SensorReading r2) {
-      return r1.value() > r2.value() ? r2 : r1;
-  }
-}
-
-private static class MyWindowFunction
-    implements WindowFunction<SensorReading, Tuple2<Long, SensorReading>, String, TimeWindow> {
-
-  public void apply(String key,
-                    TimeWindow window,
-                    Iterable<SensorReading> minReadings,
-                    Collector<Tuple2<Long, SensorReading>> out) {
-      SensorReading min = minReadings.iterator().next();
-      out.collect(new Tuple2<Long, SensorReading>(window.getStart(), min));
-  }
-}
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-
-val input: DataStream[SensorReading] = ...
-
-input
-  .keyBy(<key selector>)
-  .window(<window assigner>)
-  .reduce(
-    (r1: SensorReading, r2: SensorReading) => { if (r1.value > r2.value) r2 else r1 },
-    ( key: String,
-      window: TimeWindow,
-      minReadings: Iterable[SensorReading],
-      out: Collector[(Long, SensorReading)] ) =>
-      {
-        val min = minReadings.iterator.next()
-        out.collect((window.getStart, min))
-      }
-  )
-
-{% endhighlight %}
-</div>
-</div>
-
-## Triggers
-
-A `Trigger` determines when a window (as formed by the *window assigner*) is ready to be
-processed by the *window function*. Each `WindowAssigner` comes with a default `Trigger`.
-If the default trigger does not fit your needs, you can specify a custom trigger using `trigger(...)`.
-
-The trigger interface has five methods that allow a `Trigger` to react to different events:
-
-* The `onElement()` method is called for each element that is added to a window.
-* The `onEventTime()` method is called when a registered event-time timer fires.
-* The `onProcessingTime()` method is called when a registered processing-time timer fires.
-* The `onMerge()` method is relevant for stateful triggers and merges the states of two triggers when their corresponding windows merge, *e.g.* when using session windows.
-* Finally the `clear()` method performs any action needed upon removal of the corresponding window.
-
-Two things to notice about the above methods are:
-
-1) The first three decide how to act on their invocation event by returning a `TriggerResult`. The action can be one of the following:
-
-* `CONTINUE`: do nothing,
-* `FIRE`: trigger the computation,
-* `PURGE`: clear the elements in the window, and
-* `FIRE_AND_PURGE`: trigger the computation and clear the elements in the window afterwards.
-
-2) Any of these methods can be used to register processing- or event-time timers for future actions.
-
-### Fire and Purge
-
-Once a trigger determines that a window is ready for processing, it fires, *i.e.*, it returns `FIRE` or `FIRE_AND_PURGE`. This is the signal for the window operator
-to emit the result of the current window. Given a window with a `WindowFunction`,
-all elements are passed to the `WindowFunction` (possibly after passing them to an evictor).
-Windows with a `ReduceFunction` or a `FoldFunction` simply emit their eagerly aggregated result.
-
-When a trigger fires, it can either `FIRE` or `FIRE_AND_PURGE`. While `FIRE` keeps the contents of the window, `FIRE_AND_PURGE` removes its content.
-By default, the pre-implemented triggers simply `FIRE` without purging the window state.
-
-<span class="label label-danger">Attention</span> Purging will simply remove the contents of the window and will leave any potential meta-information about the window and any trigger state intact.
-
-### Default Triggers of WindowAssigners
-
-The default `Trigger` of a `WindowAssigner` is appropriate for many use cases. For example, all the event-time window assigners have an `EventTimeTrigger` as
-default trigger. This trigger simply fires once the watermark passes the end of a window.
-
-<span class="label label-danger">Attention</span> The default trigger of the `GlobalWindow` is the `NeverTrigger` which does never fire. Consequently, you always have to define a custom trigger when using a `GlobalWindow`.
-
-<span class="label label-danger">Attention</span> By specifying a trigger using `trigger()` you
-are overwriting the default trigger of a `WindowAssigner`. For example, if you specify a
-`CountTrigger` for `TumblingEventTimeWindows` you will no longer get window firings based on the
-progress of time but only by count. Right now, you have to write your own custom trigger if
-you want to react based on both time and count.
-
-### Built-in and Custom Triggers
-
-Flink comes with a few built-in triggers.
-
-* The (already mentioned) `EventTimeTrigger` fires based on the progress of event-time as measured by watermarks.
-* The `ProcessingTimeTrigger` fires based on processing time.
-* The `CountTrigger` fires once the number of elements in a window exceeds the given limit.
-* The `PurgingTrigger` takes as argument another trigger and transforms it into a purging one.
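-
-For example, a `CountTrigger` can be wrapped in a `PurgingTrigger` as sketched below; the global window assigner and the threshold of 100 are arbitrary choices for illustration:
-
-{% highlight java %}
-DataStream<T> input = ...;
-
-// fire every 100 elements and discard the window contents after each firing
-input
-    .keyBy(<key selector>)
-    .window(GlobalWindows.create())
-    .trigger(PurgingTrigger.of(CountTrigger.of(100)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}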
-
-If you need to implement a custom trigger, you should check out the abstract
-{% gh_link /flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java "Trigger" %} class.
-Please note that the API is still evolving and might change in future versions of Flink.
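-
-As a rough sketch of a custom trigger, the hypothetical class below fires either when the watermark passes the end of the window or once a per-window element count is reached, whichever comes first. It only approximates what `EventTimeTrigger` and `CountTrigger` do individually and assumes the types from the `org.apache.flink.streaming.api.windowing.triggers` and state API packages:
-
-{% highlight java %}
-public class CountOrEventTimeTrigger extends Trigger<Object, TimeWindow> {
-
-  private final long maxCount;
-
-  // per-window element counter, kept as partitioned trigger state
-  private final ReducingStateDescriptor<Long> countDesc =
-      new ReducingStateDescriptor<>("count", new ReduceFunction<Long>() {
-        public Long reduce(Long a, Long b) { return a + b; }
-      }, LongSerializer.INSTANCE);
-
-  public CountOrEventTimeTrigger(long maxCount) {
-    this.maxCount = maxCount;
-  }
-
-  @Override
-  public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
-    // fire when the watermark reaches the end of the window ...
-    ctx.registerEventTimeTimer(window.maxTimestamp());
-    // ... or earlier, as soon as maxCount elements have arrived
-    ReducingState<Long> count = ctx.getPartitionedState(countDesc);
-    count.add(1L);
-    if (count.get() >= maxCount) {
-      count.clear();
-      return TriggerResult.FIRE;
-    }
-    return TriggerResult.CONTINUE;
-  }
-
-  @Override
-  public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
-    return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
-  }
-
-  @Override
-  public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
-    return TriggerResult.CONTINUE;
-  }
-
-  @Override
-  public void clear(TimeWindow window, TriggerContext ctx) throws Exception {
-    ctx.deleteEventTimeTimer(window.maxTimestamp());
-    ctx.getPartitionedState(countDesc).clear();
-  }
-}
-{% endhighlight %}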
-
-## Evictors
-
-Flink's windowing model allows specifying an optional `Evictor` in addition to the `WindowAssigner` and the `Trigger`.
-This can be done using the `evictor(...)` method (shown in the beginning of this document). The evictor has the ability
-to remove elements from a window *after* the trigger fires and *before and/or after* the window function is applied.
-To do so, the `Evictor` interface has two methods:
-
-    /**
-     * Optionally evicts elements. Called before windowing function.
-     *
-     * @param elements The elements currently in the pane.
-     * @param size The current number of elements in the pane.
-     * @param window The {@link Window}
-     * @param evictorContext The context for the Evictor
-     */
-    void evictBefore(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
-
-    /**
-     * Optionally evicts elements. Called after windowing function.
-     *
-     * @param elements The elements currently in the pane.
-     * @param size The current number of elements in the pane.
-     * @param window The {@link Window}
-     * @param evictorContext The context for the Evictor
-     */
-    void evictAfter(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
-
-The `evictBefore()` contains the eviction logic to be applied before the window function, while the `evictAfter()`
-contains the one to be applied after the window function. Elements evicted before the application of the window
-function will not be processed by it.
-
-Flink comes with three pre-implemented evictors. These are:
-
-* `CountEvictor`: keeps up to a user-specified number of elements from the window and discards the remaining ones from
-the beginning of the window buffer.
-* `DeltaEvictor`: takes a `DeltaFunction` and a `threshold`, computes the delta between the last element in the
-window buffer and each of the remaining ones, and removes the ones with a delta greater than or equal to the threshold.
-* `TimeEvictor`: takes as argument an `interval` in milliseconds and for a given window, it finds the maximum
-timestamp `max_ts` among its elements and removes all the elements with timestamps smaller than `max_ts - interval`.
-
-<span class="label label-info">Default</span> By default, all the pre-implemented evictors apply their logic before the
-window function.
-
-<span class="label label-danger">Attention</span> Specifying an evictor prevents any pre-aggregation, as all the
-elements of a window have to be passed to the evictor before applying the computation.
-
-<span class="label label-danger">Attention</span> Flink provides no guarantees about the order of the elements within
-a window. This implies that although an evictor may remove elements from the beginning of the window, these are not
-necessarily the ones that arrive first or last.
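-
-As a usage sketch (the assigner and window function are left as placeholders and the count of 10 is arbitrary), an evictor is plugged into the pipeline like this:
-
-{% highlight java %}
-DataStream<T> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    // keep only the 10 most recent elements of the window at each firing
-    .evictor(CountEvictor.of(10))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}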
-
-
-## Allowed Lateness
-
-When working with *event-time* windowing, it can happen that elements arrive late, *i.e.* the watermark that Flink uses to
-keep track of the progress of event-time is already past the end timestamp of a window to which an element belongs. See
-[event time]({{ site.baseurl }}/dev/event_time.html) and especially [late elements]({{ site.baseurl }}/dev/event_time.html#late-elements) for a more thorough
-discussion of how Flink deals with event time.
-
-By default, late elements are dropped when the watermark is past the end of the window. However,
-Flink allows you to specify a maximum *allowed lateness* for window operators. The allowed lateness
-specifies by how much time elements can be late before they are dropped, and its default value is 0.
-Elements that arrive after the watermark has passed the end of the window but before it passes the end of
-the window plus the allowed lateness, are still added to the window. Depending on the trigger used,
-a late but not dropped element may cause the window to fire again. This is the case for the `EventTimeTrigger`.
-
-In order to make this work, Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state, as
-also described in the [Window Lifecycle](#window-lifecycle) section.
-
-<span class="label label-info">Default</span> By default, the allowed lateness is set to
-`0`. That is, elements that arrive behind the watermark will be dropped.
-
-You can specify an allowed lateness like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-<span class="label label-info">Note</span> When using the `GlobalWindows` window assigner no
-data is ever considered late because the end timestamp of the global window is `Long.MAX_VALUE`.
-
-### Getting late data as a side output
-
-Using Flink's [side output]({{ site.baseurl }}/dev/stream/side_output.html) feature you can get a stream of the data
-that was discarded as late.
-
-You first need to specify that you want to get late data using `sideOutputLateData(OutputTag)` on
-the windowed stream. Then, you can get the side-output stream on the result of the windowed
-operation:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-final OutputTag<T> lateOutputTag = new OutputTag<T>("late-data"){};
-
-DataStream<T> input = ...;
-
-DataStream<T> result = input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .sideOutputLateData(lateOutputTag)
-    .<windowed transformation>(<window function>);
-
-DataStream<T> lateStream = result.getSideOutput(lateOutputTag);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val lateOutputTag = OutputTag[T]("late-data")
-
-val input: DataStream[T] = ...
-
-val result = input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .sideOutputLateData(lateOutputTag)
-    .<windowed transformation>(<window function>)
-
-val lateStream = result.getSideOutput(lateOutputTag)
-{% endhighlight %}
-</div>
-</div>
-
-### Late elements considerations
-
-When specifying an allowed lateness greater than 0, the window along with its content is kept after the watermark passes
-the end of the window. In these cases, when a late but not dropped element arrives, it could trigger another firing for the
-window. These firings are called `late firings`, as they are triggered by late events, in contrast to the `main firing`,
-which is the first firing of the window. In the case of session windows, late firings can further lead to the merging of windows,
-as they may "bridge" the gap between two pre-existing, unmerged windows.
-
-<span class="label label-info">Attention</span> You should be aware that the elements emitted by a late firing should be treated as updated results of a previous computation, i.e., your data stream will contain multiple results for the same computation. Depending on your application, you need to take these duplicated results into account or deduplicate them.
-
-## Useful State Size Considerations
-
-Windows can be defined over long periods of time (such as days, weeks, or months) and therefore accumulate very large state. There are a couple of rules to keep in mind when estimating the storage requirements of your windowing computation:
-
-1. Flink creates one copy of each element per window to which it belongs. Given this, tumbling windows keep one copy of each element (an element belongs to exactly one window unless it is dropped as late). In contrast, sliding windows create several copies of each element, as explained in the [Window Assigners](#window-assigners) section. Hence, a sliding window of size 1 day and slide 1 second might not be a good idea.
-
-2. `FoldFunction` and `ReduceFunction` can significantly reduce the storage requirements, as they eagerly aggregate elements and store only one value per window. In contrast, just using a `WindowFunction` requires accumulating all elements.
-
-3. Using an `Evictor` prevents any pre-aggregation, as all the elements of a window have to be passed through the evictor before applying the computation (see [Evictors](#evictors)).

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/ops/state/checkpoints.md
----------------------------------------------------------------------
diff --git a/docs/ops/state/checkpoints.md b/docs/ops/state/checkpoints.md
index 4f2a9da..96c7a20 100644
--- a/docs/ops/state/checkpoints.md
+++ b/docs/ops/state/checkpoints.md
@@ -32,7 +32,7 @@ Checkpoints make state in Flink fault tolerant by allowing state and the
 corresponding stream positions to be recovered, thereby giving the application
 the same semantics as a failure-free execution.
 
-See [Checkpointing](../../dev/stream/state/checkpointing.html) for how to enable and
+See [Checkpointing]({{ site.baseurl }}/dev/stream/state/checkpointing.html) for how to enable and
 configure checkpoints for your program.
 
 ## Externalized Checkpoints

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/redirects/windows.md
----------------------------------------------------------------------
diff --git a/docs/redirects/windows.md b/docs/redirects/windows.md
deleted file mode 100644
index bc65659..0000000
--- a/docs/redirects/windows.md
+++ /dev/null
@@ -1,24 +0,0 @@
----
-title: "Windows"
-layout: redirect
-redirect: /dev/stream/windows.html
-permalink: /apis/streaming/windows.html
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/redirects/windows_2.md
----------------------------------------------------------------------
diff --git a/docs/redirects/windows_2.md b/docs/redirects/windows_2.md
deleted file mode 100644
index c7039e4..0000000
--- a/docs/redirects/windows_2.md
+++ /dev/null
@@ -1,24 +0,0 @@
----
-title: "Windows"
-layout: redirect
-redirect: /dev/stream/windows.html
-permalink: /dev/windows.html
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->


[2/7] flink git commit: [FLINK-7370] [docs] Relocate files according to new structure

Posted by tw...@apache.org.
http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/operators/windows.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/operators/windows.md b/docs/dev/stream/operators/windows.md
new file mode 100644
index 0000000..c2d557f
--- /dev/null
+++ b/docs/dev/stream/operators/windows.md
@@ -0,0 +1,1039 @@
+---
+title: "Windows"
+nav-parent_id: streaming_operators
+nav-id: windows
+nav-pos: 10
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Windows are at the heart of processing infinite streams. Windows split the stream into "buckets" of finite size,
+over which we can apply computations. This document focuses on how windowing is performed in Flink and how
+programmers can make the most of the functionality it offers.
+
+The general structure of a windowed Flink program is presented below. The first snippet refers to *keyed* streams,
+while the second one refers to *non-keyed* streams. As one can see, the only difference is the `keyBy(...)` call for the keyed streams
+and the `window(...)` call, which becomes `windowAll(...)` for non-keyed streams. This structure will also serve as a roadmap
+for the rest of the page.
+
+**Keyed Windows**
+
+    stream
+           .keyBy(...)          <-  keyed versus non-keyed windows
+           .window(...)         <-  required: "assigner"
+          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
+          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
+          [.allowedLateness()]  <-  optional, else zero
+           .reduce/fold/apply() <-  required: "function"
+
+**Non-Keyed Windows**
+
+    stream
+           .windowAll(...)      <-  required: "assigner"
+          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
+          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
+          [.allowedLateness()]  <-  optional, else zero
+           .reduce/fold/apply() <-  required: "function"
+
+In the above, the commands in square brackets ([...]) are optional. This reveals that Flink allows you to customize your
+windowing logic in many different ways so that it best fits your needs.
+
+* This will be replaced by the TOC
+{:toc}
+
+## Window Lifecycle
+
+In a nutshell, a window is **created** as soon as the first element that should belong to this window arrives, and the
+window is **completely removed** when the time (event or processing time) passes its end timestamp plus the user-specified
+`allowed lateness` (see [Allowed Lateness](#allowed-lateness)). Flink guarantees removal only for time-based
+windows and not for other types, *e.g.* global windows (see [Window Assigners](#window-assigners)). For example, with an
+event-time-based windowing strategy that creates non-overlapping (or tumbling) windows every 5 minutes and has an allowed
+lateness of 1 min, Flink will create a new window for the interval between `12:00` and `12:05` when the first element with
+a timestamp that falls into this interval arrives, and it will remove it when the watermark passes the `12:06`
+timestamp.
+
+In addition, each window will have a `Trigger` (see [Triggers](#triggers)) and a function (`WindowFunction`, `ReduceFunction` or
+`FoldFunction`) (see [Window Functions](#window-functions)) attached to it. The function will contain the computation to
+be applied to the contents of the window, while the `Trigger` specifies the conditions under which the window is
+considered ready for the function to be applied. A triggering policy might be something like "when the number of elements
+in the window is more than 4", or "when the watermark passes the end of the window". A trigger can also decide to
+purge a window's contents any time between its creation and removal. Purging in this case only refers to the elements
+in the window, and *not* the window metadata. This means that new data can still be added to that window.
+
+Apart from the above, you can specify an `Evictor` (see [Evictors](#evictors)) which will be able to remove
+elements from the window after the trigger fires and before and/or after the function is applied.
+
+In the following we go into more detail for each of the components above. We start with the required parts in the above
+snippet (see [Keyed vs Non-Keyed Windows](#keyed-vs-non-keyed-windows), [Window Assigners](#window-assigners), and
+[Window Functions](#window-functions)) before moving to the optional ones.
+
+## Keyed vs Non-Keyed Windows
+
+The first thing to specify is whether your stream should be keyed or not. This has to be done before defining the window.
+Using `keyBy(...)` will split your infinite stream into logical keyed streams. If `keyBy(...)` is not called, your
+stream is not keyed.
+
+In the case of keyed streams, any attribute of your incoming events can be used as a key
+(more details [here]({{ site.baseurl }}/dev/api_concepts.html#specifying-keys)). Having a keyed stream will
+allow your windowed computation to be performed in parallel by multiple tasks, as each logical keyed stream can be processed
+independently from the rest. All elements referring to the same key will be sent to the same parallel task.
+
+In the case of non-keyed streams, your original stream will not be split into multiple logical streams and all the windowing logic
+will be performed by a single task, *i.e.* with a parallelism of 1.
+
+## Window Assigners
+
+After specifying whether your stream is keyed or not, the next step is to define a *window assigner*.
+The window assigner defines how elements are assigned to windows. This is done by specifying the `WindowAssigner`
+of your choice in the `window(...)` (for *keyed* streams) or the `windowAll()` (for *non-keyed* streams) call.
+
+A `WindowAssigner` is responsible for assigning each incoming element to one or more windows. Flink comes
+with pre-defined window assigners for the most common use cases, namely *tumbling windows*,
+*sliding windows*, *session windows* and *global windows*. You can also implement a custom window assigner by
+extending the `WindowAssigner` class. All built-in window assigners (except the global
+windows) assign elements to windows based on time, which can either be processing time or event
+time. Please take a look at our section on [event time]({{ site.baseurl }}/dev/event_time.html) to learn
+about the difference between processing time and event time and how timestamps and watermarks are generated.
+
+In the following, we show how Flink's pre-defined window assigners work and how they are used
+in a DataStream program. The following figures visualize the workings of each assigner. The purple circles
+represent elements of the stream, which are partitioned by some key (in this case *user 1*, *user 2* and *user 3*).
+The x-axis shows the progress of time.
+
+### Tumbling Windows
+
+A *tumbling windows* assigner assigns each element to a window of a specified *window size*.
+Tumbling windows have a fixed size and do not overlap. For example, if you specify a tumbling
+window with a size of 5 minutes, the current window will be evaluated and a new window will be
+started every five minutes as illustrated by the following figure.
+
+<img src="{{ site.baseurl }}/fig/tumbling-windows.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use tumbling windows.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+// tumbling event-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// tumbling processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// daily tumbling event-time windows offset by -8 hours.
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+// tumbling event-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// tumbling processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// daily tumbling event-time windows offset by -8 hours.
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
+`Time.minutes(x)`, and so on.
+
+As shown in the last example, tumbling window assigners also take an optional `offset`
+parameter that can be used to change the alignment of windows. For example, without offsets
+hourly tumbling windows are aligned with the epoch, that is, you will get windows such as
+`1:00:00.000 - 1:59:59.999`, `2:00:00.000 - 2:59:59.999`, and so on. If you want to change
+that you can give an offset. With an offset of 15 minutes you would, for example, get
+`1:15:00.000 - 2:14:59.999`, `2:15:00.000 - 3:14:59.999` etc.
+An important use case for offsets is to adjust windows to timezones other than UTC-0.
+For example, in China you would have to specify an offset of `Time.hours(-8)`.
+
+### Sliding Windows
+
+The *sliding windows* assigner assigns elements to windows of fixed length. Similar to a tumbling
+windows assigner, the size of the windows is configured by the *window size* parameter.
+An additional *window slide* parameter controls how frequently a sliding window is started. Hence,
+sliding windows can be overlapping if the slide is smaller than the window size. In this case elements
+are assigned to multiple windows.
+
+For example, you could have windows of size 10 minutes that slide by 5 minutes. With this you get a new
+window every 5 minutes that contains the events that arrived during the last 10 minutes, as depicted by the
+following figure.
+
+<img src="{{ site.baseurl }}/fig/sliding-windows.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use sliding windows.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+// sliding event-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// sliding processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// sliding processing-time windows offset by -8 hours
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+// sliding event-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// sliding processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// sliding processing-time windows offset by -8 hours
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
+`Time.minutes(x)`, and so on.
+
+As shown in the last example, sliding window assigners also take an optional `offset` parameter
+that can be used to change the alignment of windows. For example, without offsets hourly windows
+sliding by 30 minutes are aligned with the epoch, that is, you will get windows such as
+`1:00:00.000 - 1:59:59.999`, `1:30:00.000 - 2:29:59.999`, and so on. If you want to change that
+you can give an offset. With an offset of 15 minutes you would, for example, get
+`1:15:00.000 - 2:14:59.999`, `1:45:00.000 - 2:44:59.999` etc.
+An important use case for offsets is to adjust windows to timezones other than UTC-0.
+For example, in China you would have to specify an offset of `Time.hours(-8)`.
+
+### Session Windows
+
+The *session windows* assigner groups elements by sessions of activity. Session windows do not overlap and
+do not have a fixed start and end time, in contrast to *tumbling windows* and *sliding windows*. Instead a
+session window closes when it does not receive elements for a certain period of time, *i.e.*, when a gap of
+inactivity occurred. A session window assigner is configured with a *session gap* which
+defines how long the required period of inactivity is. When this period expires, the current session closes
+and subsequent elements are assigned to a new session window.
+
+<img src="{{ site.baseurl }}/fig/session-windows.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use session windows.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+// event-time session windows
+input
+    .keyBy(<key selector>)
+    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>);
+
+// processing-time session windows
+input
+    .keyBy(<key selector>)
+    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+// event-time session windows
+input
+    .keyBy(<key selector>)
+    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>)
+
+// processing-time session windows
+input
+    .keyBy(<key selector>)
+    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
+`Time.minutes(x)`, and so on.
+
+<span class="label label-danger">Attention</span> Since session windows do not have a fixed start and end,
+they are evaluated differently than tumbling and sliding windows. Internally, a session window operator
+creates a new window for each arriving record and merges windows together if they are closer to each other
+than the defined gap.
+In order to be mergeable, a session window operator requires a merging [Trigger](#triggers) and a merging
+[Window Function](#window-functions), such as `ReduceFunction` or `WindowFunction`
+(a `FoldFunction` cannot merge).
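+
+As a minimal sketch (placeholders as above), a mergeable session-window pipeline that combines the session assigner
+from the snippets above with an incrementally aggregating `ReduceFunction` could look like this:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
+    .reduce(<reduce function>);
+{% endhighlight %}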
+
+### Global Windows
+
+A *global windows* assigner assigns all elements with the same key to the same single *global window*.
+This windowing scheme is only useful if you also specify a custom [trigger](#triggers). Otherwise,
+no computation will be performed, as the global window does not have a natural end at
+which we could process the aggregated elements.
+
+<img src="{{ site.baseurl }}/fig/non-windowed.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use a global window.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(GlobalWindows.create())
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(GlobalWindows.create())
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
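+
+Since the default trigger of the global window never fires (see [Triggers](#triggers)), the snippets above only become
+useful once a trigger is attached. A minimal sketch using the built-in `CountTrigger` could look like this:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(GlobalWindows.create())
+    .trigger(CountTrigger.of(1000))  // evaluate the window for every 1000 elements per key
+    .<windowed transformation>(<window function>);
+{% endhighlight %}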
+
+## Window Functions
+
+After defining the window assigner, we need to specify the computation that we want
+to perform on each of these windows. This is the responsibility of the *window function*, which is used to process the
+elements of each (possibly keyed) window once the system determines that a window is ready for processing
+(see [triggers](#triggers) for how Flink determines when a window is ready).
+
+The window function can be one of `ReduceFunction`, `FoldFunction` or `WindowFunction`. The first
+two can be executed more efficiently (see the [state size](#useful-state-size-considerations) section) because Flink can incrementally aggregate
+the elements for each window as they arrive. A `WindowFunction` gets an `Iterable` for all the elements contained in a
+window and additional meta information about the window to which the elements belong.
+
+A windowed transformation with a `WindowFunction` cannot be executed as efficiently as the other
+cases because Flink has to buffer *all* elements for a window internally before invoking the function.
+This can be mitigated by combining a `WindowFunction` with a `ReduceFunction` or `FoldFunction` to
+get both incremental aggregation of window elements and the additional window metadata that the
+`WindowFunction` receives. We will look at examples for each of these variants.
+
+### ReduceFunction
+
+A `ReduceFunction` specifies how two elements from the input are combined to produce
+an output element of the same type. Flink uses a `ReduceFunction` to incrementally aggregate
+the elements of a window.
+
+A `ReduceFunction` can be defined and used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
+      public Tuple2<String, Long> reduce(Tuple2<String, Long> v1, Tuple2<String, Long> v2) {
+        return new Tuple2<>(v1.f0, v1.f1 + v2.f1);
+      }
+    });
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .reduce { (v1, v2) => (v1._1, v1._2 + v2._2) }
+{% endhighlight %}
+</div>
+</div>
+
+The above example sums up the second fields of the tuples for all elements in a window.
+
+### FoldFunction
+
+A `FoldFunction` specifies how an input element of the window is combined with an element of
+the output type. The `FoldFunction` is incrementally called for each element that is added
+to the window and the current output value. The first element is combined with a pre-defined initial value of the output type.
+
+A `FoldFunction` can be defined and used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .fold("", new FoldFunction<Tuple2<String, Long>, String>> {
+       public String fold(String acc, Tuple2<String, Long> value) {
+         return acc + value.f1;
+       }
+    });
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .fold("") { (acc, v) => acc + v._2 }
+{% endhighlight %}
+</div>
+</div>
+
+The above example appends all input `Long` values to an initially empty `String`.
+
+<span class="label label-danger">Attention</span> `fold()` cannot be used with session windows or other mergeable windows.
+
+### WindowFunction - The Generic Case
+
+A `WindowFunction` gets an `Iterable` containing all the elements of the window and provides
+the most flexibility of all window functions. This comes
+at the cost of performance and resource consumption, because elements cannot be incrementally
+aggregated but instead need to be buffered internally until the window is considered ready for processing.
+
+The signature of a `WindowFunction` looks as follows:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+public interface WindowFunction<IN, OUT, KEY, W extends Window> extends Function, Serializable {
+
+  /**
+   * Evaluates the window and outputs none or several elements.
+   *
+   * @param key The key for which this window is evaluated.
+   * @param window The window that is being evaluated.
+   * @param input The elements in the window being evaluated.
+   * @param out A collector for emitting elements.
+   *
+   * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+   */
+  void apply(KEY key, W window, Iterable<IN> input, Collector<OUT> out) throws Exception;
+}
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+trait WindowFunction[IN, OUT, KEY, W <: Window] extends Function with Serializable {
+
+  /**
+    * Evaluates the window and outputs none or several elements.
+    *
+    * @param key    The key for which this window is evaluated.
+    * @param window The window that is being evaluated.
+    * @param input  The elements in the window being evaluated.
+    * @param out    A collector for emitting elements.
+    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+    */
+  def apply(key: KEY, window: W, input: Iterable[IN], out: Collector[OUT])
+}
+{% endhighlight %}
+</div>
+</div>
+
+A `WindowFunction` can be defined and used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .apply(new MyWindowFunction());
+
+/* ... */
+
+public class MyWindowFunction implements WindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
+
+  public void apply(String key, TimeWindow window, Iterable<Tuple2<String, Long>> input, Collector<String> out) {
+    long count = 0;
+    for (Tuple2<String, Long> in : input) {
+      count++;
+    }
+    out.collect("Window: " + window + " count: " + count);
+  }
+}
+
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .apply(new MyWindowFunction())
+
+/* ... */
+
+class MyWindowFunction extends WindowFunction[(String, Long), String, String, TimeWindow] {
+
+  def apply(key: String, window: TimeWindow, input: Iterable[(String, Long)], out: Collector[String]): Unit = {
+    var count = 0L
+    for (in <- input) {
+      count = count + 1
+    }
+    out.collect(s"Window $window count: $count")
+  }
+}
+{% endhighlight %}
+</div>
+</div>
+
+The example shows a `WindowFunction` to count the elements in a window. In addition, the window function adds information about the window to the output.
+
+<span class="label label-danger">Attention</span> Note that using `WindowFunction` for simple aggregates such as count is quite inefficient. The next section shows how a `ReduceFunction` can be combined with a `WindowFunction` to get both incremental aggregation and the added information of a `WindowFunction`.
+
+### ProcessWindowFunction
+
+In places where a `WindowFunction` can be used you can also use a `ProcessWindowFunction`. This
+is very similar to a `WindowFunction`, except that the interface allows querying more information
+about the context in which the window evaluation happens.
+
+This is the `ProcessWindowFunction` interface:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+public abstract class ProcessWindowFunction<IN, OUT, KEY, W extends Window> implements Function {
+
+    /**
+     * Evaluates the window and outputs none or several elements.
+     *
+     * @param key The key for which this window is evaluated.
+     * @param context The context in which the window is being evaluated.
+     * @param elements The elements in the window being evaluated.
+     * @param out A collector for emitting elements.
+     *
+     * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+     */
+    public abstract void process(
+            KEY key,
+            Context context,
+            Iterable<IN> elements,
+            Collector<OUT> out) throws Exception;
+
+    /**
+     * The context holding window metadata
+     */
+    public abstract class Context {
+        /**
+         * @return The window that is being evaluated.
+         */
+        public abstract W window();
+    }
+}
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+abstract class ProcessWindowFunction[IN, OUT, KEY, W <: Window] extends Function {
+
+  /**
+    * Evaluates the window and outputs none or several elements.
+    *
+    * @param key      The key for which this window is evaluated.
+    * @param context  The context in which the window is being evaluated.
+    * @param elements The elements in the window being evaluated.
+    * @param out      A collector for emitting elements.
+    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+    */
+  @throws[Exception]
+  def process(
+      key: KEY,
+      context: Context,
+      elements: Iterable[IN],
+      out: Collector[OUT])
+
+  /**
+    * The context holding window metadata
+    */
+  abstract class Context {
+    /**
+      * @return The window that is being evaluated.
+      */
+    def window: W
+  }
+}
+{% endhighlight %}
+</div>
+</div>
+
+It can be used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .process(new MyProcessWindowFunction());
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .process(new MyProcessWindowFunction())
+{% endhighlight %}
+</div>
+</div>
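+
+The `MyProcessWindowFunction` referenced above is not spelled out here; as a minimal sketch based on the interface
+shown earlier (mirroring the counting `WindowFunction` example above), it could look like this:
+
+{% highlight java %}
+public class MyProcessWindowFunction
+    extends ProcessWindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
+
+  @Override
+  public void process(String key, Context context, Iterable<Tuple2<String, Long>> elements, Collector<String> out) {
+    long count = 0;
+    for (Tuple2<String, Long> in : elements) {
+      count++;
+    }
+    // context.window() exposes the window metadata that a WindowFunction receives as a parameter
+    out.collect("Window: " + context.window() + " count: " + count);
+  }
+}
+{% endhighlight %}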
+
+### WindowFunction with Incremental Aggregation
+
+A `WindowFunction` can be combined with either a `ReduceFunction` or a `FoldFunction` to
+incrementally aggregate elements as they arrive in the window.
+When the window is closed, the `WindowFunction` will be provided with the aggregated result.
+This allows it to incrementally compute windows while having access to the
+additional window meta information of the `WindowFunction`.
+
+<span class="label label-info">Note</span> You can also use a `ProcessWindowFunction` instead of
+a `WindowFunction` for incremental window aggregation.
+
+#### Incremental Window Aggregation with FoldFunction
+
+The following example shows how an incremental `FoldFunction` can be combined with
+a `WindowFunction` to extract the number of events in the window and also return
+the key and end time of the window.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<SensorReading> input = ...;
+
+input
+  .keyBy(<key selector>)
+  .window(<window assigner>)
+  .fold(new Tuple3<String, Long, Integer>("", 0L, 0), new MyFoldFunction(), new MyWindowFunction());
+
+// Function definitions
+
+private static class MyFoldFunction
+    implements FoldFunction<SensorReading, Tuple3<String, Long, Integer> > {
+
+  public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
+      Integer cur = acc.getField(2);
+      acc.setField(2, cur + 1);
+      return acc;
+  }
+}
+
+private static class MyWindowFunction
+    implements WindowFunction<Tuple3<String, Long, Integer>, Tuple3<String, Long, Integer>, String, TimeWindow> {
+
+  public void apply(String key,
+                    TimeWindow window,
+                    Iterable<Tuple3<String, Long, Integer>> counts,
+                    Collector<Tuple3<String, Long, Integer>> out) {
+    Integer count = counts.iterator().next().getField(2);
+    out.collect(new Tuple3<String, Long, Integer>(key, window.getEnd(), count));
+  }
+}
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+
+val input: DataStream[SensorReading] = ...
+
+input
+ .keyBy(<key selector>)
+ .window(<window assigner>)
+ .fold (
+    ("", 0L, 0),
+    (acc: (String, Long, Int), r: SensorReading) => { ("", 0L, acc._3 + 1) },
+    ( key: String,
+      window: TimeWindow,
+      counts: Iterable[(String, Long, Int)],
+      out: Collector[(String, Long, Int)] ) =>
+      {
+        val count = counts.iterator.next()
+        out.collect((key, window.getEnd, count._3))
+      }
+  )
+
+{% endhighlight %}
+</div>
+</div>
+
+#### Incremental Window Aggregation with ReduceFunction
+
+The following example shows how an incremental `ReduceFunction` can be combined with
+a `WindowFunction` to return the smallest event in a window along
+with the start time of the window.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<SensorReading> input = ...;
+
+input
+  .keyBy(<key selector>)
+  .window(<window assigner>)
+  .reduce(new MyReduceFunction(), new MyWindowFunction());
+
+// Function definitions
+
+private static class MyReduceFunction implements ReduceFunction<SensorReading> {
+
+  public SensorReading reduce(SensorReading r1, SensorReading r2) {
+      return r1.value() > r2.value() ? r2 : r1;
+  }
+}
+
+private static class MyWindowFunction
+    implements WindowFunction<SensorReading, Tuple2<Long, SensorReading>, String, TimeWindow> {
+
+  public void apply(String key,
+                    TimeWindow window,
+                    Iterable<SensorReading> minReadings,
+                    Collector<Tuple2<Long, SensorReading>> out) {
+      SensorReading min = minReadings.iterator().next();
+      out.collect(new Tuple2<Long, SensorReading>(window.getStart(), min));
+  }
+}
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+
+val input: DataStream[SensorReading] = ...
+
+input
+  .keyBy(<key selector>)
+  .window(<window assigner>)
+  .reduce(
+    (r1: SensorReading, r2: SensorReading) => { if (r1.value > r2.value) r2 else r1 },
+    ( key: String,
+      window: TimeWindow,
+      minReadings: Iterable[SensorReading],
+      out: Collector[(Long, SensorReading)] ) =>
+      {
+        val min = minReadings.iterator.next()
+        out.collect((window.getStart, min))
+      }
+  )
+
+{% endhighlight %}
+</div>
+</div>
+
+## Triggers
+
+A `Trigger` determines when a window (as formed by the *window assigner*) is ready to be
+processed by the *window function*. Each `WindowAssigner` comes with a default `Trigger`.
+If the default trigger does not fit your needs, you can specify a custom trigger using `trigger(...)`.
+
+The trigger interface has five methods that allow a `Trigger` to react to different events:
+
+* The `onElement()` method is called for each element that is added to a window.
+* The `onEventTime()` method is called when a registered event-time timer fires.
+* The `onProcessingTime()` method is called when a registered processing-time timer fires.
+* The `onMerge()` method is relevant for stateful triggers and merges the states of two triggers when their corresponding windows merge, *e.g.* when using session windows.
+* Finally, the `clear()` method performs any action needed upon removal of the corresponding window.
+
+Two things to notice about the above methods are:
+
+1) The first three decide how to act on their invocation event by returning a `TriggerResult`. The action can be one of the following:
+
+* `CONTINUE`: do nothing,
+* `FIRE`: trigger the computation,
+* `PURGE`: clear the elements in the window, and
+* `FIRE_AND_PURGE`: trigger the computation and clear the elements in the window afterwards.
+
+2) Any of these methods can be used to register processing- or event-time timers for future actions.
+
+### Fire and Purge
+
+Once a trigger determines that a window is ready for processing, it fires, *i.e.*, it returns `FIRE` or `FIRE_AND_PURGE`. This is the signal for the window operator
+to emit the result of the current window. Given a window with a `WindowFunction`,
+all elements are passed to the `WindowFunction` (possibly after passing them to an evictor).
+Windows with a `ReduceFunction` or `FoldFunction` simply emit their eagerly aggregated result.
+
+When a trigger fires, it can either `FIRE` or `FIRE_AND_PURGE`. While `FIRE` keeps the contents of the window, `FIRE_AND_PURGE` removes its content.
+By default, the pre-implemented triggers simply `FIRE` without purging the window state.
+
+<span class="label label-danger">Attention</span> Purging will simply remove the contents of the window and will leave any potential meta-information about the window and any trigger state intact.
+
+### Default Triggers of WindowAssigners
+
+The default `Trigger` of a `WindowAssigner` is appropriate for many use cases. For example, all the event-time window assigners have an `EventTimeTrigger` as
+default trigger. This trigger simply fires once the watermark passes the end of a window.
+
+<span class="label label-danger">Attention</span> The default trigger of the `GlobalWindow` is the `NeverTrigger` which never fires. Consequently, you always have to define a custom trigger when using a `GlobalWindow`.
+
+<span class="label label-danger">Attention</span> By specifying a trigger using `trigger()` you
+are overwriting the default trigger of a `WindowAssigner`. For example, if you specify a
+`CountTrigger` for `TumblingEventTimeWindows` you will no longer get window firings based on the
+progress of time but only by count. Right now, you have to write your own custom trigger if
+you want to react based on both time and count.
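+
+As a minimal sketch of such an override (placeholders as in the examples above), attaching a `CountTrigger` looks
+like this:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
+    .trigger(CountTrigger.of(100))  // fires on element count only; the event-time default no longer applies
+    .<windowed transformation>(<window function>);
+{% endhighlight %}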
+
+### Built-in and Custom Triggers
+
+Flink comes with a few built-in triggers.
+
+* The (already mentioned) `EventTimeTrigger` fires based on the progress of event-time as measured by watermarks.
+* The `ProcessingTimeTrigger` fires based on processing time.
+* The `CountTrigger` fires once the number of elements in a window exceeds the given limit.
+* The `PurgingTrigger` takes as argument another trigger and transforms it into a purging one.
+
+If you need to implement a custom trigger, you should check out the abstract
+{% gh_link /flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java "Trigger" %} class.
+Please note that the API is still evolving and might change in future versions of Flink.
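+
+As a rough sketch of the skeleton a custom trigger takes (the class below is illustrative, not a built-in one), a
+trigger that fires for every element could be written as:
+
+{% highlight java %}
+public class EveryElementTrigger<W extends Window> extends Trigger<Object, W> {
+
+  @Override
+  public TriggerResult onElement(Object element, long timestamp, W window, TriggerContext ctx) {
+    return TriggerResult.FIRE;  // evaluate the window on every incoming element
+  }
+
+  @Override
+  public TriggerResult onEventTime(long time, W window, TriggerContext ctx) {
+    return TriggerResult.CONTINUE;  // this trigger registers no event-time timers
+  }
+
+  @Override
+  public TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) {
+    return TriggerResult.CONTINUE;  // this trigger registers no processing-time timers
+  }
+
+  @Override
+  public void clear(W window, TriggerContext ctx) {
+    // no trigger state to clean up
+  }
+}
+{% endhighlight %}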
+
+## Evictors
+
+Flink's windowing model allows specifying an optional `Evictor` in addition to the `WindowAssigner` and the `Trigger`.
+This can be done using the `evictor(...)` method (shown in the beginning of this document). The evictor has the ability
+to remove elements from a window *after* the trigger fires and *before and/or after* the window function is applied.
+To do so, the `Evictor` interface has two methods:
+
+    /**
+     * Optionally evicts elements. Called before windowing function.
+     *
+     * @param elements The elements currently in the pane.
+     * @param size The current number of elements in the pane.
+     * @param window The {@link Window}
+     * @param evictorContext The context for the Evictor
+     */
+    void evictBefore(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
+
+    /**
+     * Optionally evicts elements. Called after windowing function.
+     *
+     * @param elements The elements currently in the pane.
+     * @param size The current number of elements in the pane.
+     * @param window The {@link Window}
+     * @param evictorContext The context for the Evictor
+     */
+    void evictAfter(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
+
+The `evictBefore()` contains the eviction logic to be applied before the window function, while the `evictAfter()`
+contains the one to be applied after the window function. Elements evicted before the application of the window
+function will not be processed by it.
+
+Flink comes with three pre-implemented evictors. These are:
+
+* `CountEvictor`: keeps up to a user-specified number of elements from the window and discards the remaining ones from
+the beginning of the window buffer.
+* `DeltaEvictor`: takes a `DeltaFunction` and a `threshold`, computes the delta between the last element in the
+window buffer and each of the remaining ones, and removes the ones with a delta greater than or equal to the threshold.
+* `TimeEvictor`: takes as argument an `interval` in milliseconds and for a given window, it finds the maximum
+timestamp `max_ts` among its elements and removes all the elements with timestamps smaller than `max_ts - interval`.
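+
+As a minimal usage sketch (with the usual placeholders), a `CountEvictor` that keeps at most 10 elements per window
+can be plugged in like this:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .evictor(CountEvictor.of(10))  // keep at most 10 elements, evicting from the beginning of the buffer
+    .<windowed transformation>(<window function>);
+{% endhighlight %}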
+
+<span class="label label-info">Default</span> By default, all the pre-implemented evictors apply their logic before the
+window function.
+
+<span class="label label-danger">Attention</span> Specifying an evictor prevents any pre-aggregation, as all the
+elements of a window have to be passed to the evictor before applying the computation.
+
+<span class="label label-danger">Attention</span> Flink provides no guarantees about the order of the elements within
+a window. This implies that although an evictor may remove elements from the beginning of the window, these are not
+necessarily the ones that arrive first or last.
+
+
+## Allowed Lateness
+
+When working with *event-time* windowing, it can happen that elements arrive late, *i.e.* the watermark that Flink uses to
+keep track of the progress of event-time is already past the end timestamp of a window to which an element belongs. See
+[event time]({{ site.baseurl }}/dev/event_time.html) and especially [late elements]({{ site.baseurl }}/dev/event_time.html#late-elements) for a more thorough
+discussion of how Flink deals with event time.
+
+By default, late elements are dropped when the watermark is past the end of the window. However,
+Flink allows you to specify a maximum *allowed lateness* for window operators. Allowed lateness
+specifies by how much time elements can be late before they are dropped; its default value is 0.
+Elements that arrive after the watermark has passed the end of the window, but before it passes the end of
+the window plus the allowed lateness, are still added to the window. Depending on the trigger used,
+a late but not dropped element may cause the window to fire again. This is the case for the `EventTimeTrigger`.
+
+In order to make this work, Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state, as
+also described in the [Window Lifecycle](#window-lifecycle) section.
+
+<span class="label label-info">Default</span> By default, the allowed lateness is set to
+`0`. That is, elements that arrive after the watermark has passed the end of the window will be dropped.
+
+You can specify an allowed lateness like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+<span class="label label-info">Note</span> When using the `GlobalWindows` window assigner no
+data is ever considered late because the end timestamp of the global window is `Long.MAX_VALUE`.
+
+### Getting late data as a side output
+
+Using Flink's [side output]({{ site.baseurl }}/dev/stream/side_output.html) feature you can get a stream of the data
+that was discarded as late.
+
+You first need to specify that you want to get late data using `sideOutputLateData(OutputTag)` on
+the windowed stream. Then, you can get the side-output stream on the result of the windowed
+operation:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+final OutputTag<T> lateOutputTag = new OutputTag<T>("late-data"){};
+
+DataStream<T> input = ...;
+
+DataStream<T> result = input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .sideOutputLateData(lateOutputTag)
+    .<windowed transformation>(<window function>);
+
+DataStream<T> lateStream = result.getSideOutput(lateOutputTag);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val lateOutputTag = OutputTag[T]("late-data")
+
+val input: DataStream[T] = ...
+
+val result = input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .sideOutputLateData(lateOutputTag)
+    .<windowed transformation>(<window function>)
+
+val lateStream = result.getSideOutput(lateOutputTag)
+{% endhighlight %}
+</div>
+</div>
+
+### Late elements considerations
+
+When specifying an allowed lateness greater than 0, the window along with its content is kept after the watermark passes
+the end of the window. In these cases, when a late but not dropped element arrives, it could trigger another firing for the
+window. These firings are called `late firings`, as they are triggered by late events, in contrast to the `main firing`,
+which is the first firing of the window. In the case of session windows, late firings can further lead to the merging of windows,
+as they may "bridge" the gap between two pre-existing, unmerged windows.
+
+<span class="label label-danger">Attention</span> You should be aware that the elements emitted by a late firing should be treated as updated results of a previous computation, i.e., your data stream will contain multiple results for the same computation. Depending on your application, you need to take these duplicated results into account or deduplicate them.
+
+## Useful state size considerations
+
+Windows can be defined over long periods of time (such as days, weeks, or months) and therefore accumulate very large state. There are a couple of rules to keep in mind when estimating the storage requirements of your windowing computation:
+
+1. Flink creates one copy of each element per window to which it belongs. Given this, tumbling windows keep one copy of each element (an element belongs to exactly one window unless it is dropped late). In contrast, sliding windows create several copies of each element, as explained in the [Window Assigners](#window-assigners) section. Hence, a sliding window of size 1 day and slide 1 second might not be a good idea, as each element is copied into 24 * 60 * 60 = 86400 windows.
+
+2. `FoldFunction` and `ReduceFunction` can significantly reduce the storage requirements, as they eagerly aggregate elements and store only one value per window. In contrast, just using a `WindowFunction` requires accumulating all elements.
+
+3. Using an `Evictor` prevents any pre-aggregation, as all the elements of a window have to be passed through the evictor before applying the computation (see [Evictors](#evictors)).

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/process_function.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/process_function.md b/docs/dev/stream/process_function.md
deleted file mode 100644
index 60531aa..0000000
--- a/docs/dev/stream/process_function.md
+++ /dev/null
@@ -1,238 +0,0 @@
----
-title: "Process Function (Low-level Operations)"
-nav-title: "Process Function"
-nav-parent_id: operators
-nav-pos: 35
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-* This will be replaced by the TOC
-{:toc}
-
-## The ProcessFunction
-
-The `ProcessFunction` is a low-level stream processing operation, giving access to the basic building blocks of
-all (acyclic) streaming applications:
-
-  - events (stream elements)
-  - state (fault-tolerant, consistent, only on keyed stream)
-  - timers (event time and processing time, only on keyed stream)
-
-The `ProcessFunction` can be thought of as a `FlatMapFunction` with access to keyed state and timers. It handles events
-by being invoked for each event received in the input stream(s).
-
-For fault-tolerant state, the `ProcessFunction` gives access to Flink's [keyed state](state/state.html), accessible via the
-`RuntimeContext`, similar to the way other stateful functions can access keyed state.
-
-The timers allow applications to react to changes in processing time and in [event time](../event_time.html).
-Every call to the function `processElement(...)` gets a `Context` object which gives access to the element's
-event time timestamp, and to the *TimerService*. The `TimerService` can be used to register callbacks for future
-event-/processing-time instants. When a timer's particular time is reached, the `onTimer(...)` method is
-called. During that call, all states are again scoped to the key with which the timer was created, allowing
-timers to manipulate keyed state.
-
-<span class="label label-info">Note</span> If you want to access keyed state and timers you have
-to apply the `ProcessFunction` on a keyed stream:
-
-{% highlight java %}
-stream.keyBy(...).process(new MyProcessFunction())
-{% endhighlight %}
-
-
-## Low-level Joins
-
-To realize low-level operations on two inputs, applications can use `CoProcessFunction`. This
-function is bound to two different inputs and gets individual calls to `processElement1(...)` and
-`processElement2(...)` for records from the two different inputs.
-
-Implementing a low level join typically follows this pattern:
-
-  - Create a state object for one input (or both)
-  - Update the state upon receiving elements from its input
-  - Upon receiving elements from the other input, probe the state and produce the joined result
-
-For example, you might be joining customer data to financial trades,
-while keeping state for the customer data. If you care about having
-complete and deterministic joins in the face of out-of-order events,
-you can use a timer to evaluate and emit the join for a trade when the
-watermark for the customer data stream has passed the time of that
-trade.
-
-## Example
-
-The following example maintains counts per key, and emits a key/count pair whenever a minute passes (in event time) without an update for that key:
-
-  - The count, key, and last-modification-timestamp are stored in a `ValueState`, which is implicitly scoped by key.
-  - For each record, the `ProcessFunction` increments the counter and sets the last-modification timestamp
-  - The function also schedules a callback one minute into the future (in event time)
-  - Upon each callback, it checks the callback's event time timestamp against the last-modification time of the stored count
-    and emits the key/count if they match (i.e., no further update occurred during that minute)
-
-<span class="label label-info">Note</span> This simple example could have been implemented with
-session windows. We use `ProcessFunction` here to illustrate the basic pattern it provides.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-{% highlight java %}
-import org.apache.flink.api.common.state.ValueState;
-import org.apache.flink.api.common.state.ValueStateDescriptor;
-import org.apache.flink.api.java.tuple.Tuple2;
-import org.apache.flink.configuration.Configuration;
-import org.apache.flink.streaming.api.functions.ProcessFunction;
-import org.apache.flink.streaming.api.functions.ProcessFunction.Context;
-import org.apache.flink.streaming.api.functions.ProcessFunction.OnTimerContext;
-import org.apache.flink.util.Collector;
-
-
-// the source data stream
-DataStream<Tuple2<String, String>> stream = ...;
-
-// apply the process function onto a keyed stream
-DataStream<Tuple2<String, Long>> result = stream
-    .keyBy(0)
-    .process(new CountWithTimeoutFunction());
-
-/**
- * The data type stored in the state
- */
-public class CountWithTimestamp {
-
-    public String key;
-    public long count;
-    public long lastModified;
-}
-
-/**
- * The implementation of the ProcessFunction that maintains the count and timeouts
- */
-public class CountWithTimeoutFunction extends ProcessFunction<Tuple2<String, String>, Tuple2<String, Long>> {
-
-    /** The state that is maintained by this process function */
-    private ValueState<CountWithTimestamp> state;
-
-    @Override
-    public void open(Configuration parameters) throws Exception {
-        state = getRuntimeContext().getState(new ValueStateDescriptor<>("myState", CountWithTimestamp.class));
-    }
-
-    @Override
-    public void processElement(Tuple2<String, String> value, Context ctx, Collector<Tuple2<String, Long>> out)
-            throws Exception {
-
-        // retrieve the current count
-        CountWithTimestamp current = state.value();
-        if (current == null) {
-            current = new CountWithTimestamp();
-            current.key = value.f0;
-        }
-
-        // update the state's count
-        current.count++;
-
-        // set the state's timestamp to the record's assigned event time timestamp
-        current.lastModified = ctx.timestamp();
-
-        // write the state back
-        state.update(current);
-
-        // schedule the next timer 60 seconds from the current event time
-        ctx.timerService().registerEventTimeTimer(current.lastModified + 60000);
-    }
-
-    @Override
-    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Tuple2<String, Long>> out)
-            throws Exception {
-
-        // get the state for the key that scheduled the timer
-        CountWithTimestamp result = state.value();
-
-        // check if this is an outdated timer or the latest timer
-        if (timestamp == result.lastModified + 60000) {
-            // emit the state on timeout
-            out.collect(new Tuple2<String, Long>(result.key, result.count));
-        }
-    }
-}
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-import org.apache.flink.api.common.state.ValueState
-import org.apache.flink.api.common.state.ValueStateDescriptor
-import org.apache.flink.streaming.api.functions.ProcessFunction
-import org.apache.flink.streaming.api.functions.ProcessFunction.Context
-import org.apache.flink.streaming.api.functions.ProcessFunction.OnTimerContext
-import org.apache.flink.util.Collector
-
-// the source data stream
-val stream: DataStream[Tuple2[String, String]] = ...
-
-// apply the process function onto a keyed stream
-val result: DataStream[Tuple2[String, Long]] = stream
-  .keyBy(0)
-  .process(new CountWithTimeoutFunction())
-
-/**
-  * The data type stored in the state
-  */
-case class CountWithTimestamp(key: String, count: Long, lastModified: Long)
-
-/**
-  * The implementation of the ProcessFunction that maintains the count and timeouts
-  */
-class CountWithTimeoutFunction extends ProcessFunction[(String, String), (String, Long)] {
-
-  /** The state that is maintained by this process function */
-  lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext
-    .getState(new ValueStateDescriptor[CountWithTimestamp]("myState", classOf[CountWithTimestamp]))
-
-
-  override def processElement(value: (String, String), ctx: Context, out: Collector[(String, Long)]): Unit = {
-    // initialize or retrieve/update the state
-
-    val current: CountWithTimestamp = state.value match {
-      case null =>
-        CountWithTimestamp(value._1, 1, ctx.timestamp)
-      case CountWithTimestamp(key, count, lastModified) =>
-        CountWithTimestamp(key, count + 1, ctx.timestamp)
-    }
-
-    // write the state back
-    state.update(current)
-
-    // schedule the next timer 60 seconds from the current event time
-    ctx.timerService.registerEventTimeTimer(current.lastModified + 60000)
-  }
-
-  override def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[(String, Long)]): Unit = {
-    state.value match {
-      case CountWithTimestamp(key, count, lastModified) if (timestamp == lastModified + 60000) =>
-        out.collect((key, count))
-      case _ =>
-    }
-  }
-}
-{% endhighlight %}
-</div>
-</div>
-
-{% top %}

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/side_output.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/side_output.md b/docs/dev/stream/side_output.md
index 305deb0..da76af4 100644
--- a/docs/dev/stream/side_output.md
+++ b/docs/dev/stream/side_output.md
@@ -56,7 +56,7 @@ Notice how the `OutputTag` is typed according to the type of elements that the s
 contains.
 
 Emitting data to a side output is only possible from within a
-[ProcessFunction]({{ site.baseurl }}/dev/stream/process_function.html). You can use the `Context` parameter
+[ProcessFunction]({{ site.baseurl }}/dev/stream/operators/process_function.html). You can use the `Context` parameter
 to emit data to a side output identified by an `OutputTag`:
 
 <div class="codetabs" markdown="1">

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/state/checkpointing.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/state/checkpointing.md b/docs/dev/stream/state/checkpointing.md
index 34c16b0..2a2edc9 100644
--- a/docs/dev/stream/state/checkpointing.md
+++ b/docs/dev/stream/state/checkpointing.md
@@ -32,7 +32,7 @@ any type of more elaborate operation.
 In order to make state fault tolerant, Flink needs to **checkpoint** the state. Checkpoints allow Flink to recover state and positions
 in the streams to give the application the same semantics as a failure-free execution.
 
-The [documentation on streaming fault tolerance](../../../internals/stream_checkpointing.html) describes in detail the technique behind Flink's streaming fault tolerance mechanism.
+The [documentation on streaming fault tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html) describes in detail the technique behind Flink's streaming fault tolerance mechanism.
 
 
 ## Prerequisites
@@ -72,7 +72,7 @@ Other parameters for checkpointing include:
 
     This option cannot be used when a minimum time between checkpoints is defined.
 
-  - *externalized checkpoints*: You can configure periodic checkpoints to be persisted externally. Externalized checkpoints write their meta data out to persistent storage and are *not* automatically cleaned up when the job fails. This way, you will have a checkpoint around to resume from if your job fails. There are more details in the [deployment notes on externalized checkpoints](../../ops/state/checkpoints.html#externalized-checkpoints).
+  - *externalized checkpoints*: You can configure periodic checkpoints to be persisted externally. Externalized checkpoints write their meta data out to persistent storage and are *not* automatically cleaned up when the job fails. This way, you will have a checkpoint around to resume from if your job fails. There are more details in the [deployment notes on externalized checkpoints]({{ site.baseurl }}/ops/state/checkpoints.html#externalized-checkpoints).
 
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">


[3/7] flink git commit: [FLINK-7370] [docs] Relocate files according to new structure

Posted by tw...@apache.org.
http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/operators/index.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/operators/index.md b/docs/dev/stream/operators/index.md
new file mode 100644
index 0000000..0ed0b2a
--- /dev/null
+++ b/docs/dev/stream/operators/index.md
@@ -0,0 +1,1169 @@
+---
+title: "Operators"
+nav-id: streaming_operators
+nav-show_overview: true
+nav-parent_id: streaming
+nav-pos: 9
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Operators transform one or more DataStreams into a new DataStream. Programs can combine
+multiple transformations into sophisticated dataflow topologies.
+
+This section gives a description of the basic transformations, the effective physical
+partitioning after applying them, as well as insights into Flink's operator chaining.
+
+* toc
+{:toc}
+
+# DataStream Transformations
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 25%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
+    {% highlight java %}
+DataStream<Integer> dataStream = //...
+dataStream.map(new MapFunction<Integer, Integer>() {
+    @Override
+    public Integer map(Integer value) throws Exception {
+        return 2 * value;
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+
+        <tr>
+          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:</p>
+    {% highlight java %}
+dataStream.flatMap(new FlatMapFunction<String, String>() {
+    @Override
+    public void flatMap(String value, Collector<String> out)
+        throws Exception {
+        for(String word: value.split(" ")){
+            out.collect(word);
+        }
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
+            A filter that filters out zero values:
+            </p>
+    {% highlight java %}
+dataStream.filter(new FilterFunction<Integer>() {
+    @Override
+    public boolean filter(Integer value) throws Exception {
+        return value != 0;
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
+          <td>
+            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
+            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
+            This transformation returns a KeyedStream.</p>
+    {% highlight java %}
+dataStream.keyBy("someKey") // Key by field "someKey"
+dataStream.keyBy(0) // Key by the first element of a Tuple
+    {% endhighlight %}
+            <p>
+            <span class="label label-danger">Attention</span>
+            A type <strong>cannot be a key</strong> if:
+    	    <ol>
+    	    <li> it is a POJO type but does not override the <em>hashCode()</em> method and
+    	    relies on the <em>Object.hashCode()</em> implementation.</li>
+    	    <li> it is an array of any type.</li>
+    	    </ol>
+    	    </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
+            emits the new value.
+                    <br/>
+            	<br/>
+            A reduce function that creates a stream of partial sums:</p>
+            {% highlight java %}
+keyedStream.reduce(new ReduceFunction<Integer>() {
+    @Override
+    public Integer reduce(Integer value1, Integer value2)
+    throws Exception {
+        return value1 + value2;
+    }
+});
+            {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+          <p>A "rolling" fold on a keyed data stream with an initial value.
+          Combines the current element with the last folded value and
+          emits the new value.
+          </p>
+          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
+          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
+          {% highlight java %}
+DataStream<String> result =
+  keyedStream.fold("start", new FoldFunction<Integer, String>() {
+    @Override
+    public String fold(String current, Integer value) {
+        return current + "-" + value;
+    }
+  });
+          {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>Rolling aggregations on a keyed data stream. The difference between min
+	    and minBy is that min returns the minimum value, whereas minBy returns
+	    the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight java %}
+keyedStream.sum(0);
+keyedStream.sum("key");
+keyedStream.min(0);
+keyedStream.min("key");
+keyedStream.max(0);
+keyedStream.max("key");
+keyedStream.minBy(0);
+keyedStream.minBy("key");
+keyedStream.maxBy(0);
+keyedStream.maxBy("key");
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
+          <td>
+            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
+            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+            See <a href="windows.html">windows</a> for a complete description of windows.
+    {% highlight java %}
+dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
+    {% endhighlight %}
+        </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
+          <td>
+              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
+              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+              See <a href="windows.html">windows</a> for a complete description of windows.</p>
+              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
+               gathered in one task for the windowAll operator.</p>
+  {% highlight java %}
+dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
+  {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
+            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
+    {% highlight java %}
+windowedStream.apply(new WindowFunction<Tuple2<String,Integer>, Integer, Tuple, Window>() {
+    public void apply(Tuple tuple,
+            Window window,
+            Iterable<Tuple2<String, Integer>> values,
+            Collector<Integer> out) throws Exception {
+        int sum = 0;
+        for (Tuple2<String, Integer> t : values) {
+            sum += t.f1;
+        }
+        out.collect(sum);
+    }
+});
+
+// applying an AllWindowFunction on non-keyed window stream
+allWindowedStream.apply(new AllWindowFunction<Tuple2<String,Integer>, Integer, Window>() {
+    public void apply(Window window,
+            Iterable<Tuple2<String, Integer>> values,
+            Collector<Integer> out) throws Exception {
+        int sum = 0;
+        for (Tuple2<String, Integer> t : values) {
+            sum += t.f1;
+        }
+        out.collect(sum);
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
+    {% highlight java %}
+windowedStream.reduce (new ReduceFunction<Tuple2<String,Integer>>() {
+    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
+        return new Tuple2<String,Integer>(value1.f0, value1.f1 + value2.f1);
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional fold function to the window and returns the folded value.
+               The example function, when applied on the sequence (1,2,3,4,5),
+               folds the sequence into the string "start-1-2-3-4-5":</p>
+    {% highlight java %}
+windowedStream.fold("start", new FoldFunction<Integer, String>() {
+    public String fold(String current, Integer value) {
+        return current + "-" + value;
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Aggregates the contents of a window. The difference between min
+            and minBy is that min returns the minimum value, whereas minBy returns
+            the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight java %}
+windowedStream.sum(0);
+windowedStream.sum("key");
+windowedStream.min(0);
+windowedStream.min("key");
+windowedStream.max(0);
+windowedStream.max("key");
+windowedStream.minBy(0);
+windowedStream.minBy("key");
+windowedStream.maxBy(0);
+windowedStream.maxBy("key");
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
+          <td>
+            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
+            with itself you will get each element twice in the resulting stream.</p>
+    {% highlight java %}
+dataStream.union(otherStream1, otherStream2, ...);
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Join two data streams on a given key and a common window.</p>
+    {% highlight java %}
+dataStream.join(otherStream)
+    .where(<key selector>).equalTo(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply (new JoinFunction () {...});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Cogroups two data streams on a given key and a common window.</p>
+    {% highlight java %}
+dataStream.coGroup(otherStream)
+    .where(0).equalTo(1)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply (new CoGroupFunction () {...});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
+          <td>
+            <p>"Connects" two data streams retaining their types. Connect allowing for shared state between
+            the two streams.</p>
+    {% highlight java %}
+DataStream<Integer> someStream = //...
+DataStream<String> otherStream = //...
+
+ConnectedStreams<Integer, String> connectedStreams = someStream.connect(otherStream);
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
+          <td>
+            <p>Similar to map and flatMap on a connected data stream.</p>
+    {% highlight java %}
+connectedStreams.map(new CoMapFunction<Integer, String, Boolean>() {
+    @Override
+    public Boolean map1(Integer value) {
+        return true;
+    }
+
+    @Override
+    public Boolean map2(String value) {
+        return false;
+    }
+});
+connectedStreams.flatMap(new CoFlatMapFunction<Integer, String, String>() {
+
+   @Override
+   public void flatMap1(Integer value, Collector<String> out) {
+       out.collect(value.toString());
+   }
+
+   @Override
+   public void flatMap2(String value, Collector<String> out) {
+       for (String word: value.split(" ")) {
+         out.collect(word);
+       }
+   }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
+          <td>
+            <p>
+                Split the stream into two or more streams according to some criterion.
+                {% highlight java %}
+SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() {
+    @Override
+    public Iterable<String> select(Integer value) {
+        List<String> output = new ArrayList<String>();
+        if (value % 2 == 0) {
+            output.add("even");
+        }
+        else {
+            output.add("odd");
+        }
+        return output;
+    }
+});
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Select one or more streams from a split stream.
+                {% highlight java %}
+SplitStream<Integer> split;
+DataStream<Integer> even = split.select("even");
+DataStream<Integer> odd = split.select("odd");
+DataStream<Integer> all = split.select("even","odd");
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Creates a "feedback" loop in the flow, by redirecting the output of one operator
+                to some previous operator. This is especially useful for defining algorithms that
+                continuously update a model. The following code starts with a stream and applies
+		the iteration body continuously. Elements that are greater than 0 are sent back
+		to the feedback channel, and the rest of the elements are forwarded downstream.
+		See <a href="#iterations">iterations</a> for a complete description.
+                {% highlight java %}
+IterativeStream<Long> iteration = initialStream.iterate();
+DataStream<Long> iterationBody = iteration.map (/*do something*/);
+DataStream<Long> feedback = iterationBody.filter(new FilterFunction<Long>(){
+    @Override
+    public boolean filter(Long value) throws Exception {
+        return value > 0;
+    }
+});
+iteration.closeWith(feedback);
+DataStream<Long> output = iterationBody.filter(new FilterFunction<Long>(){
+    @Override
+    public boolean filter(Long value) throws Exception {
+        return value <= 0;
+    }
+});
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Extracts timestamps from records in order to work with windows
+                that use event time semantics. See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
+                {% highlight java %}
+stream.assignTimestamps(new TimestampExtractor() {...});
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+  </tbody>
+</table>
+
+</div>
+
+<div data-lang="scala" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 25%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
+    {% highlight scala %}
+dataStream.map { x => x * 2 }
+    {% endhighlight %}
+          </td>
+        </tr>
+
+        <tr>
+          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:</p>
+    {% highlight scala %}
+dataStream.flatMap { str => str.split(" ") }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
+            A filter that filters out zero values:
+            </p>
+    {% highlight scala %}
+dataStream.filter { _ != 0 }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
+          <td>
+            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
+            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
+            This transformation returns a KeyedStream.</p>
+    {% highlight scala %}
+dataStream.keyBy("someKey") // Key by field "someKey"
+dataStream.keyBy(0) // Key by the first element of a Tuple
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
+            emits the new value.
+                    <br/>
+            	<br/>
+            A reduce function that creates a stream of partial sums:</p>
+            {% highlight scala %}
+keyedStream.reduce { _ + _ }
+            {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+          <p>A "rolling" fold on a keyed data stream with an initial value.
+          Combines the current element with the last folded value and
+          emits the new value.
+          <br/>
+          <br/>
+          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
+          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
+          {% highlight scala %}
+val result: DataStream[String] =
+    keyedStream.fold("start")((str, i) => { str + "-" + i })
+          {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>Rolling aggregations on a keyed data stream. The difference between min
+            and minBy is that min returns the minimum value, whereas minBy returns
+            the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight scala %}
+keyedStream.sum(0)
+keyedStream.sum("key")
+keyedStream.min(0)
+keyedStream.min("key")
+keyedStream.max(0)
+keyedStream.max("key")
+keyedStream.minBy(0)
+keyedStream.minBy("key")
+keyedStream.maxBy(0)
+keyedStream.maxBy("key")
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
+          <td>
+            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
+            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+            See <a href="windows.html">windows</a> for a description of windows.
+    {% highlight scala %}
+dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
+    {% endhighlight %}
+        </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
+          <td>
+              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
+              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+              See <a href="windows.html">windows</a> for a complete description of windows.</p>
+              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
+               gathered in one task for the windowAll operator.</p>
+  {% highlight scala %}
+dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
+  {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
+            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
+    {% highlight scala %}
+windowedStream.apply { WindowFunction }
+
+// applying an AllWindowFunction on non-keyed window stream
+allWindowedStream.apply { AllWindowFunction }
+
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
+    {% highlight scala %}
+windowedStream.reduce { _ + _ }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional fold function to the window and returns the folded value.
+               The example function, when applied on the sequence (1,2,3,4,5),
+               folds the sequence into the string "start-1-2-3-4-5":</p>
+          {% highlight scala %}
+val result: DataStream[String] =
+    windowedStream.fold("start")((str, i) => { str + "-" + i })
+          {% endhighlight %}
+          </td>
+	</tr>
+        <tr>
+          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Aggregates the contents of a window. The difference between min
+	    and minBy is that min returns the minimum value, whereas minBy returns
+	    the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight scala %}
+windowedStream.sum(0)
+windowedStream.sum("key")
+windowedStream.min(0)
+windowedStream.min("key")
+windowedStream.max(0)
+windowedStream.max("key")
+windowedStream.minBy(0)
+windowedStream.minBy("key")
+windowedStream.maxBy(0)
+windowedStream.maxBy("key")
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
+          <td>
+            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
+            with itself you will get each element twice in the resulting stream.</p>
+    {% highlight scala %}
+dataStream.union(otherStream1, otherStream2, ...)
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Join two data streams on a given key and a common window.</p>
+    {% highlight scala %}
+dataStream.join(otherStream)
+    .where(<key selector>).equalTo(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply { ... }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Cogroups two data streams on a given key and a common window.</p>
+    {% highlight scala %}
+dataStream.coGroup(otherStream)
+    .where(0).equalTo(1)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply { ... }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
+          <td>
+            <p>"Connects" two data streams retaining their types, allowing for shared state between
+            the two streams.</p>
+    {% highlight scala %}
+val someStream: DataStream[Int] = ...
+val otherStream: DataStream[String] = ...
+
+val connectedStreams = someStream.connect(otherStream)
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
+          <td>
+            <p>Similar to map and flatMap on a connected data stream.</p>
+    {% highlight scala %}
+connectedStreams.map(
+    (_ : Int) => true,
+    (_ : String) => false
+)
+connectedStreams.flatMap(
+    (value: Int) => Seq(value.toString),
+    (value: String) => value.split(" ").toSeq
+)
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
+          <td>
+            <p>
+                Split the stream into two or more streams according to some criterion.
+                {% highlight scala %}
+val split = someDataStream.split(
+  (num: Int) =>
+    (num % 2) match {
+      case 0 => List("even")
+      case 1 => List("odd")
+    }
+)
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Select one or more streams from a split stream.
+                {% highlight scala %}
+
+val even = split select "even"
+val odd = split select "odd"
+val all = split.select("even","odd")
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream  &rarr; DataStream</td>
+          <td>
+            <p>
+                Creates a "feedback" loop in the flow, by redirecting the output of one operator
+                to some previous operator. This is especially useful for defining algorithms that
+                continuously update a model. The following code starts with a stream and applies
+		the iteration body continuously. Elements that are greater than 0 are sent back
+		to the feedback channel, and the rest of the elements are forwarded downstream.
+		See <a href="#iterations">iterations</a> for a complete description.
+                {% highlight scala %}
+initialStream.iterate {
+  iteration => {
+    val iterationBody = iteration.map {/*do something*/}
+    (iterationBody.filter(_ > 0), iterationBody.filter(_ <= 0))
+  }
+}
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Extracts timestamps from records in order to work with windows
+                that use event time semantics.
+                See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
+                {% highlight scala %}
+stream.assignTimestamps { timestampExtractor }
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+  </tbody>
+</table>
+
+Extraction from tuples, case classes and collections via anonymous pattern matching, like the following:
+{% highlight scala %}
+val data: DataStream[(Int, String, Double)] = // [...]
+data.map {
+  case (id, name, temperature) => // [...]
+}
+{% endhighlight %}
+is not supported by the API out-of-the-box. To use this feature, you should use a <a href="{{ site.baseurl }}/dev/scala_api_extensions.html">Scala API extension</a>.
+
+
+</div>
+</div>
+
+The following transformations are available on data streams of Tuples:
+
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td><strong>Project</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>Selects a subset of fields from the tuples:
+{% highlight java %}
+DataStream<Tuple3<Integer, Double, String>> in = // [...]
+DataStream<Tuple2<String, Integer>> out = in.project(2,0);
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+</div>
+
+
+# Physical partitioning
+
+Flink also gives low-level control (if desired) over the exact stream partitioning after a transformation,
+via the following functions.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Uses a user-defined Partitioner to select the target task for each element.
+            {% highlight java %}
+dataStream.partitionCustom(partitioner, "someKey");
+dataStream.partitionCustom(partitioner, 0);
+            {% endhighlight %}
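+        </p>
+        <p>
+            A minimal sketch of a custom <code>Partitioner</code> that the calls above
+            could use (the Integer key type and the modulo scheme are illustrative
+            assumptions, not mandated by the API):
+            {% highlight java %}
+Partitioner<Integer> partitioner = new Partitioner<Integer>() {
+    @Override
+    public int partition(Integer key, int numPartitions) {
+        // route each (non-negative) key to a partition by its value
+        return key % numPartitions;
+    }
+};
+            {% endhighlight %}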
+        </p>
+      </td>
+    </tr>
+   <tr>
+     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
+     <td>
+       <p>
+            Partitions elements randomly according to a uniform distribution.
+            {% highlight java %}
+dataStream.shuffle();
+            {% endhighlight %}
+       </p>
+     </td>
+   </tr>
+   <tr>
+      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements round-robin, creating equal load per partition. Useful for performance
+            optimization in the presence of data skew.
+            {% highlight java %}
+dataStream.rebalance();
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+    <tr>
+      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements, round-robin, to a subset of downstream operations. This is
+            useful if you want to have pipelines where you, for example, fan out from
+            each parallel instance of a source to a subset of several mappers to distribute load
+            but don't want the full rebalance that rebalance() would incur. This requires only
+            local data transfers instead of transferring data over the network, depending on
+            other configuration values such as the number of slots of TaskManagers.
+        </p>
+        <p>
+            The subset of downstream operations to which the upstream operation sends
+            elements depends on the degree of parallelism of both the upstream and downstream operation.
+            For example, if the upstream operation has parallelism 2 and the downstream operation
+            has parallelism 6, then one upstream operation would distribute elements to three
+            downstream operations while the other upstream operation would distribute to the other
+            three downstream operations. If, on the other hand, the downstream operation has parallelism
+            2 while the upstream operation has parallelism 6 then three upstream operations would
+            distribute to one downstream operation while the other three upstream operations would
+            distribute to the other downstream operation.
+        </p>
+        <p>
+            In cases where the different parallelisms are not multiples of each other one or several
+            downstream operations will have a differing number of inputs from upstream operations.
+        </p>
+        <p>
+            Please see this figure for a visualization of the connection pattern in the above
+            example:
+        </p>
+
+        <div style="text-align: center">
+            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Connection pattern of the rescale transformation" />
+            </div>
+
+
+        <p>
+            {% highlight java %}
+dataStream.rescale();
+            {% endhighlight %}
+        </p>
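+        <p>
+            As a sketch, the 2-to-6 example above could be set up like this (the source
+            and mapper classes are illustrative assumptions):
+            {% highlight java %}
+DataStream<String> source = env.addSource(new MySource()).setParallelism(2);
+
+// each of the 2 source instances ships its elements to 3 of the 6 mappers
+source.rescale().map(new MyMapper()).setParallelism(6);
+            {% endhighlight %}
+        </p>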
+      </td>
+    </tr>
+   <tr>
+      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Broadcasts elements to every partition.
+            {% highlight java %}
+dataStream.broadcast();
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+
+<div data-lang="scala" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Uses a user-defined Partitioner to select the target task for each element.
+            {% highlight scala %}
+dataStream.partitionCustom(partitioner, "someKey")
+dataStream.partitionCustom(partitioner, 0)
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
+     <td>
+       <p>
+            Partitions elements randomly according to a uniform distribution.
+            {% highlight scala %}
+dataStream.shuffle()
+            {% endhighlight %}
+       </p>
+     </td>
+   </tr>
+   <tr>
+      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements round-robin, creating equal load per partition. Useful for performance
+            optimization in the presence of data skew.
+            {% highlight scala %}
+dataStream.rebalance()
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+    <tr>
+      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements, round-robin, to a subset of downstream operations. This is
+            useful if you want to have pipelines where you, for example, fan out from
+            each parallel instance of a source to a subset of several mappers to distribute load
+            but don't want the full rebalance that rebalance() would incur. This requires only
+            local data transfers instead of transferring data over the network, depending on
+            other configuration values such as the number of slots of TaskManagers.
+        </p>
+        <p>
+            The subset of downstream operations to which the upstream operation sends
+            elements depends on the degree of parallelism of both the upstream and downstream operation.
+            For example, if the upstream operation has parallelism 2 and the downstream operation
+            has parallelism 6, then one upstream operation would distribute elements to three
+            downstream operations while the other upstream operation would distribute to the other
+            three downstream operations. If, on the other hand, the downstream operation has parallelism
+            2 while the upstream operation has parallelism 6 then three upstream operations would
+            distribute to one downstream operation while the other three upstream operations would
+            distribute to the other downstream operation.
+        </p>
+        <p>
+            In cases where the different parallelisms are not multiples of each other one or several
+            downstream operations will have a differing number of inputs from upstream operations.
+
+        </p>
+        <p>
+            Please see this figure for a visualization of the connection pattern in the above
+            example:
+        </p>
+
+        <div style="text-align: center">
+            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Connection pattern of the rescale transformation" />
+            </div>
+
+
+        <p>
+            {% highlight scala %}
+dataStream.rescale()
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Broadcasts elements to every partition.
+            {% highlight scala %}
+dataStream.broadcast()
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+</div>
+
+# Task chaining and resource groups
+
+Chaining two subsequent transformations means co-locating them within the same thread for better
+performance. Flink by default chains operators if this is possible (e.g., two subsequent map
+transformations). The API gives fine-grained control over chaining if desired:
+
+Use `StreamExecutionEnvironment.disableOperatorChaining()` if you want to disable chaining in
+the whole job. For more fine-grained control, the following functions are available. Note that
+these functions can only be used right after a DataStream transformation as they refer to the
+previous transformation. For example, you can use `someStream.map(...).startNewChain()`, but
+you cannot use `someStream.startNewChain()`.
+
+A resource group is a slot in Flink, see
+[slots]({{site.baseurl}}/ops/config.html#configuring-taskmanager-processing-slots). You can
+manually isolate operators in separate slots if desired.
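+
+As a sketch, the hooks described above can be combined in a single pipeline (the
+operators and the group name are illustrative):
+
+{% highlight java %}
+someStream
+    .filter(...)
+    // this map is not chained to the filter
+    .map(...).disableChaining()
+    // a fresh chain starts at this map
+    .map(...).startNewChain()
+    // isolate this part of the pipeline in its own slot sharing group
+    .filter(...).slotSharingGroup("isolated");
+{% endhighlight %}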
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td>Start new chain</td>
+      <td>
+        <p>Begin a new chain, starting with this operator. The two
+	mappers will be chained, and filter will not be chained to
+	the first mapper.
+{% highlight java %}
+someStream.filter(...).map(...).startNewChain().map(...);
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td>Disable chaining</td>
+      <td>
+        <p>Do not chain the map operator
+{% highlight java %}
+someStream.map(...).disableChaining();
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+    <tr>
+      <td>Set slot sharing group</td>
+      <td>
+        <p>Set the slot sharing group of an operation. Flink will put operations with the same
+        slot sharing group into the same slot while keeping operations that don't have the
+        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
+        group is inherited from input operations if all input operations are in the same slot
+        sharing group.
+        The name of the default slot sharing group is "default"; operations can explicitly
+        be put into this group by calling slotSharingGroup("default").
+{% highlight java %}
+someStream.filter(...).slotSharingGroup("name");
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+
+<div data-lang="scala" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td>Start new chain</td>
+      <td>
+        <p>Begin a new chain, starting with this operator. The two
+	mappers will be chained, and filter will not be chained to
+	the first mapper.
+{% highlight scala %}
+someStream.filter(...).map(...).startNewChain().map(...)
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td>Disable chaining</td>
+      <td>
+        <p>Do not chain the map operator
+{% highlight scala %}
+someStream.map(...).disableChaining()
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  <tr>
+      <td>Set slot sharing group</td>
+      <td>
+        <p>Set the slot sharing group of an operation. Flink will put operations with the same
+        slot sharing group into the same slot while keeping operations that don't have the
+        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
+        group is inherited from input operations if all input operations are in the same slot
+        sharing group.
+        The name of the default slot sharing group is "default"; operations can explicitly
+        be put into this group by calling slotSharingGroup("default").
+{% highlight scala %}
+someStream.filter(...).slotSharingGroup("name")
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+</div>
+
+
+{% top %}
+

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/operators/process_function.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/operators/process_function.md b/docs/dev/stream/operators/process_function.md
new file mode 100644
index 0000000..9f32359
--- /dev/null
+++ b/docs/dev/stream/operators/process_function.md
@@ -0,0 +1,238 @@
+---
+title: "Process Function (Low-level Operations)"
+nav-title: "Process Function"
+nav-parent_id: streaming_operators
+nav-pos: 35
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+* This will be replaced by the TOC
+{:toc}
+
+## The ProcessFunction
+
+The `ProcessFunction` is a low-level stream processing operation, giving access to the basic building blocks of
+all (acyclic) streaming applications:
+
+  - events (stream elements)
+  - state (fault-tolerant, consistent, only on keyed stream)
+  - timers (event time and processing time, only on keyed stream)
+
+The `ProcessFunction` can be thought of as a `FlatMapFunction` with access to keyed state and timers. It handles events
+by being invoked for each event received in the input stream(s).
+
+For fault-tolerant state, the `ProcessFunction` gives access to Flink's [keyed state]({{ site.baseurl }}/dev/stream/state/state.html), accessible via the
+`RuntimeContext`, similar to the way other stateful functions can access keyed state.
+
+The timers allow applications to react to changes in processing time and in [event time]({{ site.baseurl }}/dev/event_time.html).
+Every call to the function `processElement(...)` gets a `Context` object which gives access to the element's
+event time timestamp, and to the *TimerService*. The `TimerService` can be used to register callbacks for future
+event-/processing-time instants. When a timer's particular time is reached, the `onTimer(...)` method is
+called. During that call, all states are again scoped to the key with which the timer was created, allowing
+timers to manipulate keyed state.
+
+<span class="label label-info">Note</span> If you want to access keyed state and timers you have
+to apply the `ProcessFunction` on a keyed stream:
+
+{% highlight java %}
+stream.keyBy(...).process(new MyProcessFunction())
+{% endhighlight %}
+
+
+## Low-level Joins
+
+To realize low-level operations on two inputs, applications can use `CoProcessFunction`. This
+function is bound to two different inputs and gets individual calls to `processElement1(...)` and
+`processElement2(...)` for records from the two different inputs.
+
+Implementing a low-level join typically follows this pattern:
+
+  - Create a state object for one input (or both)
+  - Update the state upon receiving elements from its input
+  - Upon receiving elements from the other input, probe the state and produce the joined result
+
+For example, you might be joining customer data to financial trades,
+while keeping state for the customer data. If you care about having
+complete and deterministic joins in the face of out-of-order events,
+you can use a timer to evaluate and emit the join for a trade when the
+watermark for the customer data stream has passed the time of that
+trade.
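+
+A minimal sketch of this pattern as a `CoProcessFunction` (the `Customer` and `Trade`
+types, the keying, and the emitted pair are illustrative assumptions, not part of the
+API) could look as follows. For a deterministic event-time join you would additionally
+register timers as described above.
+
+{% highlight java %}
+import org.apache.flink.api.common.state.ValueState;
+import org.apache.flink.api.common.state.ValueStateDescriptor;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
+import org.apache.flink.util.Collector;
+
+// keeps the latest customer record per key and joins each trade against it
+public class CustomerTradeJoin
+        extends CoProcessFunction<Customer, Trade, Tuple2<Customer, Trade>> {
+
+    /** The latest customer record seen for the current key. */
+    private ValueState<Customer> customerState;
+
+    @Override
+    public void open(Configuration parameters) {
+        customerState = getRuntimeContext().getState(
+                new ValueStateDescriptor<>("customer", Customer.class));
+    }
+
+    @Override
+    public void processElement1(Customer customer, Context ctx,
+            Collector<Tuple2<Customer, Trade>> out) throws Exception {
+        // update the state upon receiving an element from the customer input
+        customerState.update(customer);
+    }
+
+    @Override
+    public void processElement2(Trade trade, Context ctx,
+            Collector<Tuple2<Customer, Trade>> out) throws Exception {
+        // probe the state and produce the joined result
+        Customer customer = customerState.value();
+        if (customer != null) {
+            out.collect(new Tuple2<>(customer, trade));
+        }
+    }
+}
+{% endhighlight %}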
+
+## Example
+
+The following example maintains counts per key, and emits a key/count pair whenever a minute passes (in event time) without an update for that key:
+
+  - The count, key, and last-modification-timestamp are stored in a `ValueState`, which is implicitly scoped by key.
+  - For each record, the `ProcessFunction` increments the counter and sets the last-modification timestamp
+  - The function also schedules a callback one minute into the future (in event time)
+  - Upon each callback, it checks the callback's event time timestamp against the last-modification time of the stored count
+    and emits the key/count if they match (i.e., no further update occurred during that minute)
+
+<span class="label label-info">Note</span> This simple example could have been implemented with
+session windows. We use `ProcessFunction` here to illustrate the basic pattern it provides.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+{% highlight java %}
+import org.apache.flink.api.common.state.ValueState;
+import org.apache.flink.api.common.state.ValueStateDescriptor;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.ProcessFunction;
+import org.apache.flink.streaming.api.functions.ProcessFunction.Context;
+import org.apache.flink.streaming.api.functions.ProcessFunction.OnTimerContext;
+import org.apache.flink.util.Collector;
+
+
+// the source data stream
+DataStream<Tuple2<String, String>> stream = ...;
+
+// apply the process function onto a keyed stream
+DataStream<Tuple2<String, Long>> result = stream
+    .keyBy(0)
+    .process(new CountWithTimeoutFunction());
+
+/**
+ * The data type stored in the state
+ */
+public class CountWithTimestamp {
+
+    public String key;
+    public long count;
+    public long lastModified;
+}
+
+/**
+ * The implementation of the ProcessFunction that maintains the count and timeouts
+ */
+public class CountWithTimeoutFunction extends ProcessFunction<Tuple2<String, String>, Tuple2<String, Long>> {
+
+    /** The state that is maintained by this process function */
+    private ValueState<CountWithTimestamp> state;
+
+    @Override
+    public void open(Configuration parameters) throws Exception {
+        state = getRuntimeContext().getState(new ValueStateDescriptor<>("myState", CountWithTimestamp.class));
+    }
+
+    @Override
+    public void processElement(Tuple2<String, String> value, Context ctx, Collector<Tuple2<String, Long>> out)
+            throws Exception {
+
+        // retrieve the current count
+        CountWithTimestamp current = state.value();
+        if (current == null) {
+            current = new CountWithTimestamp();
+            current.key = value.f0;
+        }
+
+        // update the state's count
+        current.count++;
+
+        // set the state's timestamp to the record's assigned event time timestamp
+        current.lastModified = ctx.timestamp();
+
+        // write the state back
+        state.update(current);
+
+        // schedule the next timer 60 seconds from the current event time
+        ctx.timerService().registerEventTimeTimer(current.lastModified + 60000);
+    }
+
+    @Override
+    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Tuple2<String, Long>> out)
+            throws Exception {
+
+        // get the state for the key that scheduled the timer
+        CountWithTimestamp result = state.value();
+
+        // check if this is an outdated timer or the latest timer
+        if (timestamp == result.lastModified + 60000) {
+            // emit the state on timeout
+            out.collect(new Tuple2<String, Long>(result.key, result.count));
+        }
+    }
+}
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+import org.apache.flink.api.common.state.ValueState
+import org.apache.flink.api.common.state.ValueStateDescriptor
+import org.apache.flink.streaming.api.functions.ProcessFunction
+import org.apache.flink.streaming.api.functions.ProcessFunction.Context
+import org.apache.flink.streaming.api.functions.ProcessFunction.OnTimerContext
+import org.apache.flink.util.Collector
+
+// the source data stream
+val stream: DataStream[Tuple2[String, String]] = ...
+
+// apply the process function onto a keyed stream
+val result: DataStream[Tuple2[String, Long]] = stream
+  .keyBy(0)
+  .process(new CountWithTimeoutFunction())
+
+/**
+  * The data type stored in the state
+  */
+case class CountWithTimestamp(key: String, count: Long, lastModified: Long)
+
+/**
+  * The implementation of the ProcessFunction that maintains the count and timeouts
+  */
+class CountWithTimeoutFunction extends ProcessFunction[(String, String), (String, Long)] {
+
+  /** The state that is maintained by this process function */
+  lazy val state: ValueState[CountWithTimestamp] = getRuntimeContext
+    .getState(new ValueStateDescriptor[CountWithTimestamp]("myState", classOf[CountWithTimestamp]))
+
+
+  override def processElement(value: (String, String), ctx: Context, out: Collector[(String, Long)]): Unit = {
+    // initialize or retrieve/update the state
+
+    val current: CountWithTimestamp = state.value match {
+      case null =>
+        CountWithTimestamp(value._1, 1, ctx.timestamp)
+      case CountWithTimestamp(key, count, lastModified) =>
+        CountWithTimestamp(key, count + 1, ctx.timestamp)
+    }
+
+    // write the state back
+    state.update(current)
+
+    // schedule the next timer 60 seconds from the current event time
+    ctx.timerService.registerEventTimeTimer(current.lastModified + 60000)
+  }
+
+  override def onTimer(timestamp: Long, ctx: OnTimerContext, out: Collector[(String, Long)]): Unit = {
+    state.value match {
+      case CountWithTimestamp(key, count, lastModified) if (timestamp == lastModified + 60000) =>
+        out.collect((key, count))
+      case _ =>
+    }
+  }
+}
+{% endhighlight %}
+</div>
+</div>
+
+{% top %}


[5/7] flink git commit: [FLINK-7370][docs] rework the operator documentation structure

Posted by tw...@apache.org.
http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/redirects/windows.md
----------------------------------------------------------------------
diff --git a/docs/redirects/windows.md b/docs/redirects/windows.md
index 55f57ac..bc65659 100644
--- a/docs/redirects/windows.md
+++ b/docs/redirects/windows.md
@@ -1,7 +1,7 @@
 ---
 title: "Windows"
 layout: redirect
-redirect: /dev/windows.html
+redirect: /dev/stream/windows.html
 permalink: /apis/streaming/windows.html
 ---
 <!--

http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/redirects/windows_2.md
----------------------------------------------------------------------
diff --git a/docs/redirects/windows_2.md b/docs/redirects/windows_2.md
new file mode 100644
index 0000000..c7039e4
--- /dev/null
+++ b/docs/redirects/windows_2.md
@@ -0,0 +1,24 @@
+---
+title: "Windows"
+layout: redirect
+redirect: /dev/stream/windows.html
+permalink: /dev/windows.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->


[6/7] flink git commit: [FLINK-7370][docs] rework the operator documentation structure

Posted by tw...@apache.org.
http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/dev/stream/process_function.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/process_function.md b/docs/dev/stream/process_function.md
index 696a8b8..60531aa 100644
--- a/docs/dev/stream/process_function.md
+++ b/docs/dev/stream/process_function.md
@@ -1,7 +1,7 @@
 ---
 title: "Process Function (Low-level Operations)"
 nav-title: "Process Function"
-nav-parent_id: streaming
+nav-parent_id: operators
 nav-pos: 35
 ---
 <!--

http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/dev/stream/windows.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/windows.md b/docs/dev/stream/windows.md
new file mode 100644
index 0000000..ab53a3a
--- /dev/null
+++ b/docs/dev/stream/windows.md
@@ -0,0 +1,1039 @@
+---
+title: "Windows"
+nav-parent_id: operators
+nav-id: windows
+nav-pos: 10
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Windows are at the heart of processing infinite streams. Windows split the stream into "buckets" of finite size,
+over which we can apply computations. This document focuses on how windowing is performed in Flink and how the
+programmer can benefit to the maximum from its offered functionality.
+
+The general structure of a windowed Flink program is presented below. The first snippet refers to *keyed* streams,
+while the second to *non-keyed* ones. As one can see, the only difference is the `keyBy(...)` call for the keyed streams
+and the `window(...)` which becomes `windowAll(...)` for non-keyed streams. This is also going to serve as a roadmap
+for the rest of the page.
+
+**Keyed Windows**
+
+    stream
+           .keyBy(...)          <-  keyed versus non-keyed windows
+           .window(...)         <-  required: "assigner"
+          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
+          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
+          [.allowedLateness()]  <-  optional, else zero
+           .reduce/fold/apply() <-  required: "function"
+
+**Non-Keyed Windows**
+
+    stream
+           .windowAll(...)      <-  required: "assigner"
+          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
+          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
+          [.allowedLateness()]  <-  optional, else zero
+           .reduce/fold/apply() <-  required: "function"
+
+In the above, the commands in square brackets ([...]) are optional. This reveals that Flink allows you to customize your
+windowing logic in many different ways so that it best fits your needs.
+
+* This will be replaced by the TOC
+{:toc}
+
+## Window Lifecycle
+
+In a nutshell, a window is **created** as soon as the first element that should belong to this window arrives, and the
+window is **completely removed** when the time (event or processing time) passes its end timestamp plus the user-specified
+`allowed lateness` (see [Allowed Lateness](#allowed-lateness)). Flink guarantees removal only for time-based
+windows and not for other types, *e.g.* global windows (see [Window Assigners](#window-assigners)). For example, with an
+event-time-based windowing strategy that creates non-overlapping (or tumbling) windows every 5 minutes and has an allowed
+lateness of 1 min, Flink will create a new window for the interval between `12:00` and `12:05` when the first element with
+a timestamp that falls into this interval arrives, and it will remove it when the watermark passes the `12:06`
+timestamp.
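+
+As a sketch, such a windowing strategy could be declared as follows (using the same
+placeholder notation as the snippets later on this page):
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.minutes(5)))
+    .allowedLateness(Time.minutes(1))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}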
+
+In addition, each window will have a `Trigger` (see [Triggers](#triggers)) and a function (`WindowFunction`, `ReduceFunction` or
+`FoldFunction`) (see [Window Functions](#window-functions)) attached to it. The function will contain the computation to
+be applied to the contents of the window, while the `Trigger` specifies the conditions under which the window is
+considered ready for the function to be applied. A triggering policy might be something like "when the number of elements
+in the window is more than 4", or "when the watermark passes the end of the window". A trigger can also decide to
+purge a window's contents any time between its creation and removal. Purging in this case only refers to the elements
+in the window, and *not* the window metadata. This means that new data can still be added to that window.
+
+Apart from the above, you can specify an `Evictor` (see [Evictors](#evictors)) which will be able to remove
+elements from the window after the trigger fires and before and/or after the function is applied.
+
+In the following we go into more detail for each of the components above. We start with the required parts in the above
+snippet (see [Keyed vs Non-Keyed Windows](#keyed-vs-non-keyed-windows), [Window Assigners](#window-assigners), and
+[Window Functions](#window-functions)) before moving to the optional ones.
+
+## Keyed vs Non-Keyed Windows
+
+The first thing to specify is whether your stream should be keyed or not. This has to be done before defining the window.
+Using `keyBy(...)` will split your infinite stream into logical keyed streams. If `keyBy(...)` is not called, your
+stream is not keyed.
+
+In the case of keyed streams, any attribute of your incoming events can be used as a key
+(more details [here]({{ site.baseurl }}/dev/api_concepts.html#specifying-keys)). Having a keyed stream will
+allow your windowed computation to be performed in parallel by multiple tasks, as each logical keyed stream can be processed
+independently from the rest. All elements referring to the same key will be sent to the same parallel task.
+
+In the case of non-keyed streams, your original stream will not be split into multiple logical streams and all the windowing logic
+will be performed by a single task, *i.e.* with parallelism of 1.
+
+## Window Assigners
+
+After specifying whether your stream is keyed or not, the next step is to define a *window assigner*.
+The window assigner defines how elements are assigned to windows. This is done by specifying the `WindowAssigner`
+of your choice in the `window(...)` (for *keyed* streams) or the `windowAll()` (for *non-keyed* streams) call.
+
+A `WindowAssigner` is responsible for assigning each incoming element to one or more windows. Flink comes
+with pre-defined window assigners for the most common use cases, namely *tumbling windows*,
+*sliding windows*, *session windows* and *global windows*. You can also implement a custom window assigner by
+extending the `WindowAssigner` class. All built-in window assigners (except the global
+windows) assign elements to windows based on time, which can either be processing time or event
+time. Please take a look at our section on [event time]({{ site.baseurl }}/dev/event_time.html) to learn
+about the difference between processing time and event time and how timestamps and watermarks are generated.
+
+In the following, we show how Flink's pre-defined window assigners work and how they are used
+in a DataStream program. The following figures visualize the workings of each assigner. The purple circles
+represent elements of the stream, which are partitioned by some key (in this case *user 1*, *user 2* and *user 3*).
+The x-axis shows the progress of time.
+
+### Tumbling Windows
+
+A *tumbling windows* assigner assigns each element to a window of a specified *window size*.
+Tumbling windows have a fixed size and do not overlap. For example, if you specify a tumbling
+window with a size of 5 minutes, the current window will be evaluated and a new window will be
+started every five minutes as illustrated by the following figure.
+
+<img src="{{ site.baseurl }}/fig/tumbling-windows.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use tumbling windows.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+// tumbling event-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// tumbling processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// daily tumbling event-time windows offset by -8 hours.
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+// tumbling event-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// tumbling processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// daily tumbling event-time windows offset by -8 hours.
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
+`Time.minutes(x)`, and so on.
+
+As shown in the last example, tumbling window assigners also take an optional `offset`
+parameter that can be used to change the alignment of windows. For example, without offsets
+hourly tumbling windows are aligned with epoch, that is, you will get windows such as
+`1:00:00.000 - 1:59:59.999`, `2:00:00.000 - 2:59:59.999` and so on. If you want to change
+that you can give an offset. With an offset of 15 minutes you would, for example, get
+`1:15:00.000 - 2:14:59.999`, `2:15:00.000 - 3:14:59.999` etc.
+An important use case for offsets is to adjust windows to timezones other than UTC-0.
+For example, in China you would have to specify an offset of `Time.hours(-8)`.
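+
+To make the alignment concrete: a tumbling window assigner rounds each timestamp down to a
+multiple of the window size, shifted by the offset. The following sketch mirrors this
+arithmetic and is shown only for illustration (all values in milliseconds):
+
+{% highlight java %}
+// start of the tumbling window that contains `timestamp`,
+// for a given window size and offset
+long windowStart(long timestamp, long offset, long windowSize) {
+  return timestamp - (timestamp - offset + windowSize) % windowSize;
+}
+{% endhighlight %}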
+
+### Sliding Windows
+
+The *sliding windows* assigner assigns elements to windows of fixed length. Similar to a tumbling
+windows assigner, the size of the windows is configured by the *window size* parameter.
+An additional *window slide* parameter controls how frequently a sliding window is started. Hence,
+sliding windows can overlap if the slide is smaller than the window size. In this case, elements
+are assigned to multiple windows.
+
+For example, you could have windows of size 10 minutes that slide by 5 minutes. With this, every
+5 minutes you get a window that contains the events that arrived during the last 10 minutes, as
+depicted by the following figure.
+
+<img src="{{ site.baseurl }}/fig/sliding-windows.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use sliding windows.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+// sliding event-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// sliding processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>);
+
+// sliding processing-time windows offset by -8 hours
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+// sliding event-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// sliding processing-time windows
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
+    .<windowed transformation>(<window function>)
+
+// sliding processing-time windows offset by -8 hours
+input
+    .keyBy(<key selector>)
+    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
+`Time.minutes(x)`, and so on.
+
+As shown in the last example, sliding window assigners also take an optional `offset` parameter
+that can be used to change the alignment of windows. For example, without offsets hourly windows
+sliding by 30 minutes are aligned with epoch, that is, you will get windows such as
+`1:00:00.000 - 1:59:59.999`, `1:30:00.000 - 2:29:59.999` and so on. If you want to change that
+you can give an offset. With an offset of 15 minutes you would, for example, get
+`1:15:00.000 - 2:14:59.999`, `1:45:00.000 - 2:44:59.999` etc.
+An important use case for offsets is to adjust windows to timezones other than UTC-0.
+For example, in China you would have to specify an offset of `Time.hours(-8)`.
+
+### Session Windows
+
+The *session windows* assigner groups elements by sessions of activity. Session windows do not overlap and
+do not have a fixed start and end time, in contrast to *tumbling windows* and *sliding windows*. Instead, a
+session window closes when it does not receive elements for a certain period of time, *i.e.*, when a gap of
+inactivity occurs. A session window assigner is configured with the *session gap*, which
+defines the required period of inactivity. When this period expires, the current session closes
+and subsequent elements are assigned to a new session window.
+
+<img src="{{ site.baseurl }}/fig/session-windows.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use session windows.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+// event-time session windows
+input
+    .keyBy(<key selector>)
+    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>);
+
+// processing-time session windows
+input
+    .keyBy(<key selector>)
+    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+// event-time session windows
+input
+    .keyBy(<key selector>)
+    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>)
+
+// processing-time session windows
+input
+    .keyBy(<key selector>)
+    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
+`Time.minutes(x)`, and so on.
+
+<span class="label label-danger">Attention</span> Since session windows do not have a fixed start and end,
+they are evaluated differently than tumbling and sliding windows. Internally, a session window operator
+creates a new window for each arriving record and merges windows together if they are closer to each other
+than the defined gap.
+In order to be mergeable, a session window operator requires a merging [Trigger](#triggers) and a merging
+[Window Function](#window-functions), such as `ReduceFunction` or `WindowFunction`
+(a `FoldFunction` cannot merge).
+
+### Global Windows
+
+A *global windows* assigner assigns all elements with the same key to the same single *global window*.
+This windowing scheme is only useful if you also specify a custom [trigger](#triggers). Otherwise,
+no computation will be performed, as the global window does not have a natural end at
+which we could process the aggregated elements.
+
+<img src="{{ site.baseurl }}/fig/non-windowed.svg" class="center" style="width: 100%;" />
+
+The following code snippets show how to use a global window.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(GlobalWindows.create())
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(GlobalWindows.create())
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+## Window Functions
+
+After defining the window assigner, we need to specify the computation that we want
+to perform on each of these windows. This is the responsibility of the *window function*, which is used to process the
+elements of each (possibly keyed) window once the system determines that a window is ready for processing
+(see [triggers](#triggers) for how Flink determines when a window is ready).
+
+The window function can be one of `ReduceFunction`, `FoldFunction` or `WindowFunction`. The first
+two can be executed more efficiently (see the [State Size](#useful-state-size-considerations) section) because Flink can incrementally aggregate
+the elements for each window as they arrive. A `WindowFunction` gets an `Iterable` for all the elements contained in a
+window and additional meta information about the window to which the elements belong.
+
+A windowed transformation with a `WindowFunction` cannot be executed as efficiently as the other
+cases because Flink has to buffer *all* elements for a window internally before invoking the function.
+This can be mitigated by combining a `WindowFunction` with a `ReduceFunction` or `FoldFunction` to
+get both incremental aggregation of window elements and the additional window metadata that the
+`WindowFunction` receives. We will look at examples for each of these variants.
+
+### ReduceFunction
+
+A `ReduceFunction` specifies how two elements from the input are combined to produce
+an output element of the same type. Flink uses a `ReduceFunction` to incrementally aggregate
+the elements of a window.
+
+A `ReduceFunction` can be defined and used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
+      public Tuple2<String, Long> reduce(Tuple2<String, Long> v1, Tuple2<String, Long> v2) {
+        return new Tuple2<>(v1.f0, v1.f1 + v2.f1);
+      }
+    });
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .reduce { (v1, v2) => (v1._1, v1._2 + v2._2) }
+{% endhighlight %}
+</div>
+</div>
+
+The above example sums up the second fields of the tuples for all elements in a window.
+
+### FoldFunction
+
+A `FoldFunction` specifies how an input element of the window is combined with an element of
+the output type. The `FoldFunction` is incrementally called with each element that is added
+to the window and the current output value. The first element is combined with a pre-defined initial value of the output type.
+
+A `FoldFunction` can be defined and used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .fold("", new FoldFunction<Tuple2<String, Long>, String>() {
+       public String fold(String acc, Tuple2<String, Long> value) {
+         return acc + value.f1;
+       }
+    });
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .fold("") { (acc, v) => acc + v._2 }
+{% endhighlight %}
+</div>
+</div>
+
+The above example appends all input `Long` values to an initially empty `String`.
+
+<span class="label label-danger">Attention</span> `fold()` cannot be used with session windows or other mergeable windows.
+
+### WindowFunction - The Generic Case
+
+A `WindowFunction` gets an `Iterable` containing all the elements of the window and provides
+the most flexibility of all window functions. This comes
+at the cost of performance and resource consumption, because elements cannot be incrementally
+aggregated but instead need to be buffered internally until the window is considered ready for processing.
+
+The signature of a `WindowFunction` looks as follows:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+public interface WindowFunction<IN, OUT, KEY, W extends Window> extends Function, Serializable {
+
+  /**
+   * Evaluates the window and outputs none or several elements.
+   *
+   * @param key The key for which this window is evaluated.
+   * @param window The window that is being evaluated.
+   * @param input The elements in the window being evaluated.
+   * @param out A collector for emitting elements.
+   *
+   * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+   */
+  void apply(KEY key, W window, Iterable<IN> input, Collector<OUT> out) throws Exception;
+}
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+trait WindowFunction[IN, OUT, KEY, W <: Window] extends Function with Serializable {
+
+  /**
+    * Evaluates the window and outputs none or several elements.
+    *
+    * @param key    The key for which this window is evaluated.
+    * @param window The window that is being evaluated.
+    * @param input  The elements in the window being evaluated.
+    * @param out    A collector for emitting elements.
+    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+    */
+  def apply(key: KEY, window: W, input: Iterable[IN], out: Collector[OUT])
+}
+{% endhighlight %}
+</div>
+</div>
+
+A `WindowFunction` can be defined and used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .apply(new MyWindowFunction());
+
+/* ... */
+
+public class MyWindowFunction implements WindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
+
+  public void apply(String key, TimeWindow window, Iterable<Tuple2<String, Long>> input, Collector<String> out) {
+    long count = 0;
+    for (Tuple2<String, Long> in: input) {
+      count++;
+    }
+    out.collect("Window: " + window + " count: " + count);
+  }
+}
+
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .apply(new MyWindowFunction())
+
+/* ... */
+
+class MyWindowFunction extends WindowFunction[(String, Long), String, String, TimeWindow] {
+
+  def apply(key: String, window: TimeWindow, input: Iterable[(String, Long)], out: Collector[String]): Unit = {
+    var count = 0L
+    for (in <- input) {
+      count = count + 1
+    }
+    out.collect(s"Window $window count: $count")
+  }
+}
+{% endhighlight %}
+</div>
+</div>
+
+The example shows a `WindowFunction` that counts the elements in a window. In addition, the window function adds information about the window to the output.
+
+<span class="label label-danger">Attention</span> Note that using `WindowFunction` for simple aggregates such as count is quite inefficient. The next section shows how a `ReduceFunction` can be combined with a `WindowFunction` to get both incremental aggregation and the added information of a `WindowFunction`.
+
+### ProcessWindowFunction
+
+In places where a `WindowFunction` can be used you can also use a `ProcessWindowFunction`. This
+is very similar to a `WindowFunction`, except that the interface allows querying more information
+about the context in which the window evaluation happens.
+
+This is the `ProcessWindowFunction` interface:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+public abstract class ProcessWindowFunction<IN, OUT, KEY, W extends Window> implements Function {
+
+    /**
+     * Evaluates the window and outputs none or several elements.
+     *
+     * @param key The key for which this window is evaluated.
+     * @param context The context in which the window is being evaluated.
+     * @param elements The elements in the window being evaluated.
+     * @param out A collector for emitting elements.
+     *
+     * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+     */
+    public abstract void process(
+            KEY key,
+            Context context,
+            Iterable<IN> elements,
+            Collector<OUT> out) throws Exception;
+
+    /**
+     * The context holding window metadata
+     */
+    public abstract class Context {
+        /**
+         * @return The window that is being evaluated.
+         */
+        public abstract W window();
+    }
+}
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+abstract class ProcessWindowFunction[IN, OUT, KEY, W <: Window] extends Function {
+
+  /**
+    * Evaluates the window and outputs none or several elements.
+    *
+    * @param key      The key for which this window is evaluated.
+    * @param context  The context in which the window is being evaluated.
+    * @param elements The elements in the window being evaluated.
+    * @param out      A collector for emitting elements.
+    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
+    */
+  @throws[Exception]
+  def process(
+      key: KEY,
+      context: Context,
+      elements: Iterable[IN],
+      out: Collector[OUT])
+
+  /**
+    * The context holding window metadata
+    */
+  abstract class Context {
+    /**
+      * @return The window that is being evaluated.
+      */
+    def window: W
+  }
+}
+{% endhighlight %}
+</div>
+</div>
+
+It can be used like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<Tuple2<String, Long>> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .process(new MyProcessWindowFunction());
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[(String, Long)] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .process(new MyProcessWindowFunction())
+{% endhighlight %}
+</div>
+</div>
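+
+The usage above references a `MyProcessWindowFunction` that is not defined here. As a sketch
+of what such a function could look like (a hypothetical class mirroring the counting
+`WindowFunction` example above, using the `Context` to access the window metadata):
+
+{% highlight java %}
+public class MyProcessWindowFunction
+    extends ProcessWindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
+
+  @Override
+  public void process(String key,
+                      Context context,
+                      Iterable<Tuple2<String, Long>> elements,
+                      Collector<String> out) {
+    long count = 0;
+    // count the elements buffered for this window
+    for (Tuple2<String, Long> element : elements) {
+      count++;
+    }
+    // the Context gives access to window metadata, here the window itself
+    out.collect("Window: " + context.window() + " count: " + count);
+  }
+}
+{% endhighlight %}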
+
+### WindowFunction with Incremental Aggregation
+
+A `WindowFunction` can be combined with either a `ReduceFunction` or a `FoldFunction` to
+incrementally aggregate elements as they arrive in the window.
+When the window is closed, the `WindowFunction` will be provided with the aggregated result.
+This allows windows to be computed incrementally while having access to the
+additional window meta information of the `WindowFunction`.
+
+<span class="label label-info">Note</span> You can also use a `ProcessWindowFunction` instead of
+a `WindowFunction` for incremental window aggregation.
+
+#### Incremental Window Aggregation with FoldFunction
+
+The following example shows how an incremental `FoldFunction` can be combined with
+a `WindowFunction` to extract the number of events in the window and also return
+the key and end time of the window.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<SensorReading> input = ...;
+
+input
+  .keyBy(<key selector>)
+  .window(<window assigner>)
+  .fold(new Tuple3<String, Long, Integer>("", 0L, 0), new MyFoldFunction(), new MyWindowFunction());
+
+// Function definitions
+
+private static class MyFoldFunction
+    implements FoldFunction<SensorReading, Tuple3<String, Long, Integer> > {
+
+  public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
+      Integer cur = acc.getField(2);
+      acc.setField(2, cur + 1);
+      return acc;
+  }
+}
+
+private static class MyWindowFunction
+    implements WindowFunction<Tuple3<String, Long, Integer>, Tuple3<String, Long, Integer>, String, TimeWindow> {
+
+  public void apply(String key,
+                    TimeWindow window,
+                    Iterable<Tuple3<String, Long, Integer>> counts,
+                    Collector<Tuple3<String, Long, Integer>> out) {
+    Integer count = counts.iterator().next().getField(2);
+    out.collect(new Tuple3<String, Long, Integer>(key, window.getEnd(), count));
+  }
+}
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+
+val input: DataStream[SensorReading] = ...
+
+input
+ .keyBy(<key selector>)
+ .window(<window assigner>)
+ .fold (
+    ("", 0L, 0),
+    (acc: (String, Long, Int), r: SensorReading) => { ("", 0L, acc._3 + 1) },
+    ( key: String,
+      window: TimeWindow,
+      counts: Iterable[(String, Long, Int)],
+      out: Collector[(String, Long, Int)] ) =>
+      {
+        val count = counts.iterator.next()
+        out.collect((key, window.getEnd, count._3))
+      }
+  )
+
+{% endhighlight %}
+</div>
+</div>
+
+#### Incremental Window Aggregation with ReduceFunction
+
+The following example shows how an incremental `ReduceFunction` can be combined with
+a `WindowFunction` to return the smallest event in a window along
+with the start time of the window.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<SensorReading> input = ...;
+
+input
+  .keyBy(<key selector>)
+  .window(<window assigner>)
+  .reduce(new MyReduceFunction(), new MyWindowFunction());
+
+// Function definitions
+
+private static class MyReduceFunction implements ReduceFunction<SensorReading> {
+
+  public SensorReading reduce(SensorReading r1, SensorReading r2) {
+      return r1.value() > r2.value() ? r2 : r1;
+  }
+}
+
+private static class MyWindowFunction
+    implements WindowFunction<SensorReading, Tuple2<Long, SensorReading>, String, TimeWindow> {
+
+  public void apply(String key,
+                    TimeWindow window,
+                    Iterable<SensorReading> minReadings,
+                    Collector<Tuple2<Long, SensorReading>> out) {
+      SensorReading min = minReadings.iterator().next();
+      out.collect(new Tuple2<Long, SensorReading>(window.getStart(), min));
+  }
+}
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+
+val input: DataStream[SensorReading] = ...
+
+input
+  .keyBy(<key selector>)
+  .window(<window assigner>)
+  .reduce(
+    (r1: SensorReading, r2: SensorReading) => { if (r1.value > r2.value) r2 else r1 },
+    ( key: String,
+      window: TimeWindow,
+      minReadings: Iterable[SensorReading],
+      out: Collector[(Long, SensorReading)] ) =>
+      {
+        val min = minReadings.iterator.next()
+        out.collect((window.getStart, min))
+      }
+  )
+
+{% endhighlight %}
+</div>
+</div>
+
+## Triggers
+
+A `Trigger` determines when a window (as formed by the *window assigner*) is ready to be
+processed by the *window function*. Each `WindowAssigner` comes with a default `Trigger`.
+If the default trigger does not fit your needs, you can specify a custom trigger using `trigger(...)`.
+
+The trigger interface has five methods that allow a `Trigger` to react to different events:
+
+* The `onElement()` method is called for each element that is added to a window.
+* The `onEventTime()` method is called when a registered event-time timer fires.
+* The `onProcessingTime()` method is called when a registered processing-time timer fires.
+* The `onMerge()` method is relevant for stateful triggers and merges the states of two triggers when their corresponding windows merge, *e.g.* when using session windows.
+* Finally the `clear()` method performs any action needed upon removal of the corresponding window.
+
+Two things to notice about the above methods are:
+
+1) The first three decide how to act on their invocation event by returning a `TriggerResult`. The action can be one of the following:
+
+* `CONTINUE`: do nothing,
+* `FIRE`: trigger the computation,
+* `PURGE`: clear the elements in the window, and
+* `FIRE_AND_PURGE`: trigger the computation and clear the elements in the window afterwards.
+
+2) Any of these methods can be used to register processing- or event-time timers for future actions.
+
+### Fire and Purge
+
+Once a trigger determines that a window is ready for processing, it fires, *i.e.*, it returns `FIRE` or `FIRE_AND_PURGE`. This is the signal for the window operator
+to emit the result of the current window. Given a window with a `WindowFunction`,
+all elements are passed to the `WindowFunction` (possibly after passing them to an evictor).
+Windows with a `ReduceFunction` or `FoldFunction` simply emit their eagerly aggregated result.
+
+When a trigger fires, it can either `FIRE` or `FIRE_AND_PURGE`. While `FIRE` keeps the contents of the window, `FIRE_AND_PURGE` removes its content.
+By default, the pre-implemented triggers simply `FIRE` without purging the window state.
+
+<span class="label label-danger">Attention</span> Purging will simply remove the contents of the window and will leave any potential meta-information about the window and any trigger state intact.
+
+### Default Triggers of WindowAssigners
+
+The default `Trigger` of a `WindowAssigner` is appropriate for many use cases. For example, all the event-time window assigners have an `EventTimeTrigger` as
+default trigger. This trigger simply fires once the watermark passes the end of a window.
+
+<span class="label label-danger">Attention</span> The default trigger of the `GlobalWindow` is the `NeverTrigger`, which never fires. Consequently, you always have to define a custom trigger when using a `GlobalWindow`.
+
+<span class="label label-danger">Attention</span> By specifying a trigger using `trigger()` you
+are overriding the default trigger of the `WindowAssigner`. For example, if you specify a
+`CountTrigger` for `TumblingEventTimeWindows` you will no longer get window firings based on the
+progress of time but only by count. Right now, you have to write your own custom trigger if
+you want to react based on both time and count.
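+
+As a minimal sketch of such an override (using the `CountTrigger` described in the next
+section), the following window fires after every 100 elements instead of at the end of the
+event-time window:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
+    .trigger(CountTrigger.of(100))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}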
+
+### Built-in and Custom Triggers
+
+Flink comes with a few built-in triggers.
+
+* The (already mentioned) `EventTimeTrigger` fires based on the progress of event-time as measured by watermarks.
+* The `ProcessingTimeTrigger` fires based on processing time.
+* The `CountTrigger` fires once the number of elements in a window exceeds the given limit.
+* The `PurgingTrigger` takes as argument another trigger and transforms it into a purging one.
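+
+For example (a sketch reusing the placeholders from earlier snippets), a `PurgingTrigger`
+wrapping a `CountTrigger` fires every 100 elements and purges the window contents after
+each firing:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .trigger(PurgingTrigger.of(CountTrigger.of(100)))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}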
+
+If you need to implement a custom trigger, you should check out the abstract
+{% gh_link /flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java "Trigger" %} class.
+Please note that the API is still evolving and might change in future versions of Flink.
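+
+For orientation, here is a rough sketch of a custom trigger that fires for every element (a
+hypothetical example, not one of the built-in triggers; the method signatures follow the
+`Trigger` class linked above):
+
+{% highlight java %}
+public class FireOnEveryElementTrigger extends Trigger<Object, TimeWindow> {
+
+  @Override
+  public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) {
+    // evaluate the window function for every incoming element
+    return TriggerResult.FIRE;
+  }
+
+  @Override
+  public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
+    return TriggerResult.CONTINUE;
+  }
+
+  @Override
+  public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
+    return TriggerResult.CONTINUE;
+  }
+
+  @Override
+  public void clear(TimeWindow window, TriggerContext ctx) {
+    // this sketch keeps no trigger state, so there is nothing to clean up
+  }
+}
+{% endhighlight %}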
+
+## Evictors
+
+Flink's windowing model allows specifying an optional `Evictor` in addition to the `WindowAssigner` and the `Trigger`.
+This can be done using the `evictor(...)` method (shown in the beginning of this document). The evictor has the ability
+to remove elements from a window *after* the trigger fires and *before and/or after* the window function is applied.
+To do so, the `Evictor` interface has two methods:
+
+    /**
+     * Optionally evicts elements. Called before windowing function.
+     *
+     * @param elements The elements currently in the pane.
+     * @param size The current number of elements in the pane.
+     * @param window The {@link Window}
+     * @param evictorContext The context for the Evictor
+     */
+    void evictBefore(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
+
+    /**
+     * Optionally evicts elements. Called after windowing function.
+     *
+     * @param elements The elements currently in the pane.
+     * @param size The current number of elements in the pane.
+     * @param window The {@link Window}
+     * @param evictorContext The context for the Evictor
+     */
+    void evictAfter(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
+
+The `evictBefore()` contains the eviction logic to be applied before the window function, while the `evictAfter()`
+contains the one to be applied after the window function. Elements evicted before the application of the window
+function will not be processed by it.
+
+Flink comes with three pre-implemented evictors. These are:
+
+* `CountEvictor`: keeps up to a user-specified number of elements from the window and discards the remaining ones from
+the beginning of the window buffer.
+* `DeltaEvictor`: takes a `DeltaFunction` and a `threshold`, computes the delta between the last element in the
+window buffer and each of the remaining ones, and removes the ones with a delta greater than or equal to the threshold.
+* `TimeEvictor`: takes as argument an `interval` in milliseconds and for a given window, it finds the maximum
+timestamp `max_ts` among its elements and removes all the elements with timestamps smaller than `max_ts - interval`.
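+
+As a minimal sketch (reusing the placeholders from earlier snippets), an evictor is attached
+like the other optional parts of the windowing pipeline; here a `CountEvictor` keeps at most
+10 elements per window:
+
+{% highlight java %}
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .evictor(CountEvictor.of(10))
+    .<windowed transformation>(<window function>);
+{% endhighlight %}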
+
+<span class="label label-info">Default</span> By default, all the pre-implemented evictors apply their logic before the
+window function.
+
+<span class="label label-danger">Attention</span> Specifying an evictor prevents any pre-aggregation, as all the
+elements of a window have to be passed to the evictor before applying the computation.
+
+<span class="label label-danger">Attention</span> Flink provides no guarantees about the order of the elements within
+a window. This implies that although an evictor may remove elements from the beginning of the window, these are not
+necessarily the ones that arrive first or last.
+
+
+## Allowed Lateness
+
+When working with *event-time* windowing, it can happen that elements arrive late, *i.e.* the watermark that Flink uses to
+keep track of the progress of event-time is already past the end timestamp of a window to which an element belongs. See
+[event time]({{ site.baseurl }}/dev/event_time.html) and especially [late elements]({{ site.baseurl }}/dev/event_time.html#late-elements) for a more thorough
+discussion of how Flink deals with event time.
+
+By default, late elements are dropped when the watermark is past the end of the window. However,
+Flink allows you to specify a maximum *allowed lateness* for window operators. Allowed lateness
+specifies by how much time elements can be late before they are dropped, and its default value is 0.
+Elements that arrive after the watermark has passed the end of the window, but before it passes the end of
+the window plus the allowed lateness, are still added to the window. Depending on the trigger used,
+a late but not dropped element may cause the window to fire again. This is the case for the `EventTimeTrigger`.
+
+In order to make this work, Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state, as
+also described in the [Window Lifecycle](#window-lifecycle) section.
+
+<span class="label label-info">Default</span> By default, the allowed lateness is set to
+`0`. That is, elements that arrive behind the watermark will be dropped.
+
+You can specify an allowed lateness like this:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+DataStream<T> input = ...;
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .<windowed transformation>(<window function>);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val input: DataStream[T] = ...
+
+input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .<windowed transformation>(<window function>)
+{% endhighlight %}
+</div>
+</div>
+
+<span class="label label-info">Note</span> When using the `GlobalWindows` window assigner, no
+data is ever considered late because the end timestamp of the global window is `Long.MAX_VALUE`.
+
+### Getting late data as a side output
+
+Using Flink's [side output]({{ site.baseurl }}/dev/stream/side_output.html) feature you can get a stream of the data
+that was discarded as late.
+
+You first need to specify that you want to get late data using `sideOutputLateData(OutputTag)` on
+the windowed stream. Then, you can get the side-output stream on the result of the windowed
+operation:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+final OutputTag<T> lateOutputTag = new OutputTag<T>("late-data"){};
+
+DataStream<T> input = ...;
+
+DataStream<T> result = input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .sideOutputLateData(lateOutputTag)
+    .<windowed transformation>(<window function>);
+
+DataStream<T> lateStream = result.getSideOutput(lateOutputTag);
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val lateOutputTag = OutputTag[T]("late-data")
+
+val input: DataStream[T] = ...
+
+val result = input
+    .keyBy(<key selector>)
+    .window(<window assigner>)
+    .allowedLateness(<time>)
+    .sideOutputLateData(lateOutputTag)
+    .<windowed transformation>(<window function>)
+
+val lateStream = result.getSideOutput(lateOutputTag)
+{% endhighlight %}
+</div>
+</div>
+
+### Late elements considerations
+
+When specifying an allowed lateness greater than 0, the window along with its content is kept after the watermark passes
+the end of the window. In these cases, when a late but not dropped element arrives, it could trigger another firing for the
+window. These firings are called `late firings`, as they are triggered by late events, in contrast to the `main firing`,
+which is the first firing of the window. In the case of session windows, late firings can further lead to the merging of windows,
+as they may "bridge" the gap between two pre-existing, unmerged windows.
+
+<span class="label label-danger">Attention</span> You should be aware that the elements emitted by a late firing should be treated as updated results of a previous computation, i.e., your data stream will contain multiple results for the same computation. Depending on your application, you need to take these duplicate results into account or deduplicate them.
+
+## Useful state size considerations
+
+Windows can be defined over long periods of time (such as days, weeks, or months) and therefore accumulate very large state. There are a few rules to keep in mind when estimating the storage requirements of your windowing computation:
+
+1. Flink creates one copy of each element per window to which it belongs. Given this, tumbling windows keep one copy of each element (an element belongs to exactly one window unless it is dropped late). In contrast, sliding windows create several copies of each element, as explained in the [Window Assigners](#window-assigners) section. Hence, a sliding window of size 1 day and slide 1 second might not be a good idea: each element would be copied into 86,400 windows.
+
+2. `FoldFunction` and `ReduceFunction` can significantly reduce the storage requirements, as they eagerly aggregate elements and store only one value per window. In contrast, just using a `WindowFunction` requires accumulating all elements.
+
+3. Using an `Evictor` prevents any pre-aggregation, as all the elements of a window have to be passed through the evictor before applying the computation (see [Evictors](#evictors)).

http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/dev/windows.md
----------------------------------------------------------------------
diff --git a/docs/dev/windows.md b/docs/dev/windows.md
deleted file mode 100644
index f0320a1..0000000
--- a/docs/dev/windows.md
+++ /dev/null
@@ -1,1039 +0,0 @@
----
-title: "Windows"
-nav-parent_id: streaming
-nav-id: windows
-nav-pos: 10
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-Windows are at the heart of processing infinite streams. Windows split the stream into "buckets" of finite size,
-over which we can apply computations. This document focuses on how windowing is performed in Flink and how the
-programmer can benefit to the maximum from its offered functionality.
-
-The general structure of a windowed Flink program is presented below. The first snippet refers to *keyed* streams,
-while the second to *non-keyed* ones. As one can see, the only difference is the `keyBy(...)` call for the keyed streams
-and the `window(...)` which becomes `windowAll(...)` for non-keyed streams. These is also going to serve as a roadmap
-for the rest of the page.
-
-**Keyed Windows**
-
-    stream
-           .keyBy(...)          <-  keyed versus non-keyed windows
-           .window(...)         <-  required: "assigner"
-          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
-          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
-          [.allowedLateness()]  <-  optional, else zero
-           .reduce/fold/apply() <-  required: "function"
-
-**Non-Keyed Windows**
-
-    stream
-           .windowAll(...)      <-  required: "assigner"
-          [.trigger(...)]       <-  optional: "trigger" (else default trigger)
-          [.evictor(...)]       <-  optional: "evictor" (else no evictor)
-          [.allowedLateness()]  <-  optional, else zero
-           .reduce/fold/apply() <-  required: "function"
-
-In the above, the commands in square brackets ([...]) are optional. This reveals that Flink allows you to customize your
-windowing logic in many different ways so that it best fits your needs.
-
-* This will be replaced by the TOC
-{:toc}
-
-## Window Lifecycle
-
-In a nutshell, a window is **created** as soon as the first element that should belong to this window arrives, and the
-window is **completely removed** when the time (event or processing time) passes its end timestamp plus the user-specified
-`allowed lateness` (see [Allowed Lateness](#allowed-lateness)). Flink guarantees removal only for time-based
-windows and not for other types, *e.g.* global windows (see [Window Assigners](#window-assigners)). For example, with an
-event-time-based windowing strategy that creates non-overlapping (or tumbling) windows every 5 minutes and has an allowed
-lateness of 1 min, Flink will create a new window for the interval between `12:00` and `12:05` when the first element with
-a timestamp that falls into this interval arrives, and it will remove it when the watermark passes the `12:06`
-timestamp.
-
-In addition, each window will have a `Trigger` (see [Triggers](#triggers)) and a function (`WindowFunction`, `ReduceFunction` or
-`FoldFunction`) (see [Window Functions](#window-functions)) attached to it. The function will contain the computation to
-be applied to the contents of the window, while the `Trigger` specifies the conditions under which the window is
-considered ready for the function to be applied. A triggering policy might be something like "when the number of elements
-in the window is more than 4", or "when the watermark passes the end of the window". A trigger can also decide to
-purge a window's contents any time between its creation and removal. Purging in this case only refers to the elements
-in the window, and *not* the window metadata. This means that new data can still be added to that window.
-
-Apart from the above, you can specify an `Evictor` (see [Evictors](#evictors)) which will be able to remove
-elements from the window after the trigger fires and before and/or after the function is applied.
-
-In the following we go into more detail for each of the components above. We start with the required parts in the above
-snippet (see [Keyed vs Non-Keyed Windows](#keyed-vs-non-keyed-windows), [Window Assigner](#window-assigner), and
-[Window Function](#window-function)) before moving to the optional ones.
-
-## Keyed vs Non-Keyed Windows
-
-The first thing to specify is whether your stream should be keyed or not. This has to be done before defining the window.
-Using the `keyBy(...)` will split your infinite stream into logical keyed streams. If `keyBy(...)` is not called, your
-stream is not keyed.
-
-In the case of keyed streams, any attribute of your incoming events can be used as a key
-(more details [here]({{ site.baseurl }}/dev/api_concepts.html#specifying-keys)). Having a keyed stream will
-allow your windowed computation to be performed in parallel by multiple tasks, as each logical keyed stream can be processed
-independently from the rest. All elements referring to the same key will be sent to the same parallel task.
-
-In case of non-keyed streams, your original stream will not be split into multiple logical streams and all the windowing logic
-will be performed by a single task, *i.e.* with parallelism of 1.
-
-## Window Assigners
-
-After specifying whether your stream is keyed or not, the next step is to define a *window assigner*.
-The window assigner defines how elements are assigned to windows. This is done by specifying the `WindowAssigner`
-of your choice in the `window(...)` (for *keyed* streams) or the `windowAll()` (for *non-keyed* streams) call.
-
-A `WindowAssigner` is responsible for assigning each incoming element to one or more windows. Flink comes
-with pre-defined window assigners for the most common use cases, namely *tumbling windows*,
-*sliding windows*, *session windows* and *global windows*. You can also implement a custom window assigner by
-extending the `WindowAssigner` class. All built-in window assigners (except the global
-windows) assign elements to windows based on time, which can either be processing time or event
-time. Please take a look at our section on [event time]({{ site.baseurl }}/dev/event_time.html) to learn
-about the difference between processing time and event time and how timestamps and watermarks are generated.
-
-In the following, we show how Flink's pre-defined window assigners work and how they are used
-in a DataStream program. The following figures visualize the workings of each assigner. The purple circles
-represent elements of the stream, which are partitioned by some key (in this case *user 1*, *user 2* and *user 3*).
-The x-axis shows the progress of time.
-
-### Tumbling Windows
-
-A *tumbling windows* assigner assigns each element to a window of a specified *window size*.
-Tumbling windows have a fixed size and do not overlap. For example, if you specify a tumbling
-window with a size of 5 minutes, the current window will be evaluated and a new window will be
-started every five minutes as illustrated by the following figure.
-
-<img src="{{ site.baseurl }}/fig/tumbling-windows.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use tumbling windows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-// tumbling event-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// tumbling processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// daily tumbling event-time windows offset by -8 hours.
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-// tumbling event-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// tumbling processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// daily tumbling event-time windows offset by -8 hours.
-input
-    .keyBy(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.days(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
-`Time.minutes(x)`, and so on.
-
-As shown in the last example, tumbling window assigners also take an optional `offset`
-parameter that can be used to change the alignment of windows. For example, without offsets
-hourly tumbling windows are aligned with epoch, that is you will get windows such as
-`1:00:00.000 - 1:59:59.999`, `2:00:00.000 - 2:59:59.999` and so on. If you want to change
-that you can give an offset. With an offset of 15 minutes you would, for example, get
-`1:15:00.000 - 2:14:59.999`, `2:15:00.000 - 3:14:59.999` etc.
-An important use case for offsets is to adjust windows to timezones other than UTC-0.
-For example, in China you would have to specify an offset of `Time.hours(-8)`.
-
-### Sliding Windows
-
-The *sliding windows* assigner assigns elements to windows of fixed length. Similar to a tumbling
-windows assigner, the size of the windows is configured by the *window size* parameter.
-An additional *window slide* parameter controls how frequently a sliding window is started. Hence,
-sliding windows can be overlapping if the slide is smaller than the window size. In this case elements
-are assigned to multiple windows.
-
-For example, you could have windows of size 10 minutes that slides by 5 minutes. With this you get every
-5 minutes a window that contains the events that arrived during the last 10 minutes as depicted by the
-following figure.
-
-<img src="{{ site.baseurl }}/fig/sliding-windows.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use sliding windows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-// sliding event-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// sliding processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>);
-
-// sliding processing-time windows offset by -8 hours
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-// sliding event-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// sliding processing-time windows
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.seconds(10), Time.seconds(5)))
-    .<windowed transformation>(<window function>)
-
-// sliding processing-time windows offset by -8 hours
-input
-    .keyBy(<key selector>)
-    .window(SlidingProcessingTimeWindows.of(Time.hours(12), Time.hours(1), Time.hours(-8)))
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
-`Time.minutes(x)`, and so on.
-
-As shown in the last example, sliding window assigners also take an optional `offset` parameter
-that can be used to change the alignment of windows. For example, without offsets hourly windows
-sliding by 30 minutes are aligned with epoch, that is you will get windows such as
-`1:00:00.000 - 1:59:59.999`, `1:30:00.000 - 2:29:59.999` and so on. If you want to change that
-you can give an offset. With an offset of 15 minutes you would, for example, get
-`1:15:00.000 - 2:14:59.999`, `1:45:00.000 - 2:44:59.999` etc.
-An important use case for offsets is to adjust windows to timezones other than UTC-0.
-For example, in China you would have to specify an offset of `Time.hours(-8)`.
-
-### Session Windows
-
-The *session windows* assigner groups elements by sessions of activity. Session windows do not overlap and
-do not have a fixed start and end time, in contrast to *tumbling windows* and *sliding windows*. Instead a
-session window closes when it does not receive elements for a certain period of time, *i.e.*, when a gap of
-inactivity occurred. A session window assigner is configured with the *session gap* which
-defines how long is the required period of inactivity. When this period expires, the current session closes
-and subsequent elements are assigned to a new session window.
-
-<img src="{{ site.baseurl }}/fig/session-windows.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use session windows.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-// event-time session windows
-input
-    .keyBy(<key selector>)
-    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>);
-
-// processing-time session windows
-input
-    .keyBy(<key selector>)
-    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-// event-time session windows
-input
-    .keyBy(<key selector>)
-    .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>)
-
-// processing-time session windows
-input
-    .keyBy(<key selector>)
-    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(10)))
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-Time intervals can be specified by using one of `Time.milliseconds(x)`, `Time.seconds(x)`,
-`Time.minutes(x)`, and so on.
-
-<span class="label label-danger">Attention</span> Since session windows do not have a fixed start and end,
-they are  evaluated differently than tumbling and sliding windows. Internally, a session window operator
-creates a new window for each arriving record and merges windows together if their are closer to each other
-than the defined gap.
-In order to be mergeable, a session window operator requires a merging [Trigger](#triggers) and a merging
-[Window Function](#window-functions), such as `ReduceFunction` or `WindowFunction`
-(`FoldFunction` cannot merge.)
-
-### Global Windows
-
-A *global windows* assigner assigns all elements with the same key to the same single *global window*.
-This windowing scheme is only useful if you also specify a custom [trigger](#triggers). Otherwise,
-no computation will be performed, as the global window does not have a natural end at
-which we could process the aggregated elements.
-
-<img src="{{ site.baseurl }}/fig/non-windowed.svg" class="center" style="width: 100%;" />
-
-The following code snippets show how to use a global window.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(GlobalWindows.create())
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(GlobalWindows.create())
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-## Window Functions
-
-After defining the window assigner, we need to specify the computation that we want
-to perform on each of these windows. This is the responsibility of the *window function*, which is used to process the
-elements of each (possibly keyed) window once the system determines that a window is ready for processing
-(see [triggers](#triggers) for how Flink determines when a window is ready).
-
-The window function can be one of `ReduceFunction`, `FoldFunction` or `WindowFunction`. The first
-two can be executed more efficiently (see [State Size](#state size) section) because Flink can incrementally aggregate
-the elements for each window as they arrive. A `WindowFunction` gets an `Iterable` for all the elements contained in a
-window and additional meta information about the window to which the elements belong.
-
-A windowed transformation with a `WindowFunction` cannot be executed as efficiently as the other
-cases because Flink has to buffer *all* elements for a window internally before invoking the function.
-This can be mitigated by combining a `WindowFunction` with a `ReduceFunction` or `FoldFunction` to
-get both incremental aggregation of window elements and the additional window metadata that the
-`WindowFunction` receives. We will look at examples for each of these variants.
-
-### ReduceFunction
-
-A `ReduceFunction` specifies how two elements from the input are combined to produce
-an output element of the same type. Flink uses a `ReduceFunction` to incrementally aggregate
-the elements of a window.
-
-A `ReduceFunction` can be defined and used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .reduce(new ReduceFunction<Tuple2<String, Long>> {
-      public Tuple2<String, Long> reduce(Tuple2<String, Long> v1, Tuple2<String, Long> v2) {
-        return new Tuple2<>(v1.f0, v1.f1 + v2.f1);
-      }
-    });
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .reduce { (v1, v2) => (v1._1, v1._2 + v2._2) }
-{% endhighlight %}
-</div>
-</div>
-
-The above example sums up the second fields of the tuples for all elements in a window.
-
-### FoldFunction
-
-A `FoldFunction` specifies how an input element of the window is combined with an element of
-the output type. The `FoldFunction` is incrementally called for each element that is added
-to the window and the current output value. The first element is combined with a pre-defined initial value of the output type.
-
-A `FoldFunction` can be defined and used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .fold("", new FoldFunction<Tuple2<String, Long>, String>> {
-       public String fold(String acc, Tuple2<String, Long> value) {
-         return acc + value.f1;
-       }
-    });
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .fold("") { (acc, v) => acc + v._2 }
-{% endhighlight %}
-</div>
-</div>
-
-The above example appends all input `Long` values to an initially empty `String`.
-
-<span class="label label-danger">Attention</span> `fold()` cannot be used with session windows or other mergeable windows.
-
-### WindowFunction - The Generic Case
-
-A `WindowFunction` gets an `Iterable` containing all the elements of the window and provides
-the most flexibility of all window functions. This comes
-at the cost of performance and resource consumption, because elements cannot be incrementally
-aggregated but instead need to be buffered internally until the window is considered ready for processing.
-
-The signature of a `WindowFunction` looks as follows:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-public interface WindowFunction<IN, OUT, KEY, W extends Window> extends Function, Serializable {
-
-  /**
-   * Evaluates the window and outputs none or several elements.
-   *
-   * @param key The key for which this window is evaluated.
-   * @param window The window that is being evaluated.
-   * @param input The elements in the window being evaluated.
-   * @param out A collector for emitting elements.
-   *
-   * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-   */
-  void apply(KEY key, W window, Iterable<IN> input, Collector<OUT> out) throws Exception;
-}
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-trait WindowFunction[IN, OUT, KEY, W <: Window] extends Function with Serializable {
-
-  /**
-    * Evaluates the window and outputs none or several elements.
-    *
-    * @param key    The key for which this window is evaluated.
-    * @param window The window that is being evaluated.
-    * @param input  The elements in the window being evaluated.
-    * @param out    A collector for emitting elements.
-    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-    */
-  def apply(key: KEY, window: W, input: Iterable[IN], out: Collector[OUT])
-}
-{% endhighlight %}
-</div>
-</div>
-
-A `WindowFunction` can be defined and used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .apply(new MyWindowFunction());
-
-/* ... */
-
-public class MyWindowFunction implements WindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
-
-  public void apply(String key, TimeWindow window, Iterable<Tuple2<String, Long>> input, Collector<String> out) {
-    long count = 0;
-    for (Tuple2<String, Long> in: input) {
-      count++;
-    }
-    out.collect("Window: " + window + " count: " + count);
-  }
-}
-
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .apply(new MyWindowFunction())
-
-/* ... */
-
-class MyWindowFunction extends WindowFunction[(String, Long), String, String, TimeWindow] {
-
-  def apply(key: String, window: TimeWindow, input: Iterable[(String, Long)], out: Collector[String]): Unit = {
-    var count = 0L
-    for (in <- input) {
-      count = count + 1
-    }
-    out.collect(s"Window $window count: $count")
-  }
-}
-{% endhighlight %}
-</div>
-</div>
-
-The example shows a `WindowFunction` that counts the elements in a window. In addition, the window function adds information about the window to the output.
-
-<span class="label label-danger">Attention</span> Note that using `WindowFunction` for simple aggregates such as count is quite inefficient. The next section shows how a `ReduceFunction` can be combined with a `WindowFunction` to get both incremental aggregation and the added information of a `WindowFunction`.
-
-### ProcessWindowFunction
-
-In places where a `WindowFunction` can be used you can also use a `ProcessWindowFunction`. This
-is very similar to a `WindowFunction`, except that the interface allows querying more information
-about the context in which the window evaluation happens.
-
-This is the `ProcessWindowFunction` interface:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-public abstract class ProcessWindowFunction<IN, OUT, KEY, W extends Window> implements Function {
-
-    /**
-     * Evaluates the window and outputs none or several elements.
-     *
-     * @param key The key for which this window is evaluated.
-     * @param context The context in which the window is being evaluated.
-     * @param elements The elements in the window being evaluated.
-     * @param out A collector for emitting elements.
-     *
-     * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-     */
-    public abstract void process(
-            KEY key,
-            Context context,
-            Iterable<IN> elements,
-            Collector<OUT> out) throws Exception;
-
-    /**
-     * The context holding window metadata
-     */
-    public abstract class Context {
-        /**
-         * @return The window that is being evaluated.
-         */
-        public abstract W window();
-    }
-}
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-abstract class ProcessWindowFunction[IN, OUT, KEY, W <: Window] extends Function {
-
-  /**
-    * Evaluates the window and outputs none or several elements.
-    *
-    * @param key      The key for which this window is evaluated.
-    * @param context  The context in which the window is being evaluated.
-    * @param elements The elements in the window being evaluated.
-    * @param out      A collector for emitting elements.
-    * @throws Exception The function may throw exceptions to fail the program and trigger recovery.
-    */
-  @throws[Exception]
-  def process(
-      key: KEY,
-      context: Context,
-      elements: Iterable[IN],
-      out: Collector[OUT])
-
-  /**
-    * The context holding window metadata
-    */
-  abstract class Context {
-    /**
-      * @return The window that is being evaluated.
-      */
-    def window: W
-  }
-}
-{% endhighlight %}
-</div>
-</div>
-
-It can be used like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<Tuple2<String, Long>> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .process(new MyProcessWindowFunction());
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[(String, Long)] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .process(new MyProcessWindowFunction())
-{% endhighlight %}
-</div>
-</div>
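-
-A possible implementation of the `MyProcessWindowFunction` used above could look as
-follows (a minimal sketch, assuming the `ProcessWindowFunction` interface shown before;
-it counts the elements of a window like the earlier `WindowFunction` example, but
-obtains the window metadata through the context):
-
-{% highlight java %}
-// Hypothetical example implementation: counts the elements of a window and
-// emits the count together with the window obtained from the context.
-public class MyProcessWindowFunction
-    extends ProcessWindowFunction<Tuple2<String, Long>, String, String, TimeWindow> {
-
-  @Override
-  public void process(String key,
-                      Context context,
-                      Iterable<Tuple2<String, Long>> elements,
-                      Collector<String> out) {
-    long count = 0;
-    for (Tuple2<String, Long> element : elements) {
-      count++;
-    }
-    out.collect("Window: " + context.window() + " count: " + count);
-  }
-}
-{% endhighlight %}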
-
-### WindowFunction with Incremental Aggregation
-
-A `WindowFunction` can be combined with either a `ReduceFunction` or a `FoldFunction` to
-incrementally aggregate elements as they arrive in the window.
-When the window is closed, the `WindowFunction` will be provided with the aggregated result.
-This allows windows to be computed incrementally while retaining access to the
-additional window meta information of the `WindowFunction`.
-
-<span class="label label-info">Note</span> You can also `ProcessWindowFunction` instead of
-`WindowFunction` for incremental window aggregation.
-
-#### Incremental Window Aggregation with FoldFunction
-
-The following example shows how an incremental `FoldFunction` can be combined with
-a `WindowFunction` to extract the number of events in the window and also return
-the key and end time of the window.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<SensorReading> input = ...;
-
-input
-  .keyBy(<key selector>)
-  .window(<window assigner>)
-  .fold(new Tuple3<String, Long, Integer>("", 0L, 0), new MyFoldFunction(), new MyWindowFunction())
-
-// Function definitions
-
-private static class MyFoldFunction
-    implements FoldFunction<SensorReading, Tuple3<String, Long, Integer> > {
-
-  public Tuple3<String, Long, Integer> fold(Tuple3<String, Long, Integer> acc, SensorReading s) {
-      Integer cur = acc.getField(2);
-      acc.setField(2, cur + 1);
-      return acc;
-  }
-}
-
-private static class MyWindowFunction
-    implements WindowFunction<Tuple3<String, Long, Integer>, Tuple3<String, Long, Integer>, String, TimeWindow> {
-
-  public void apply(String key,
-                    TimeWindow window,
-                    Iterable<Tuple3<String, Long, Integer>> counts,
-                    Collector<Tuple3<String, Long, Integer>> out) {
-    Integer count = counts.iterator().next().getField(2);
-    out.collect(new Tuple3<String, Long, Integer>(key, window.getEnd(), count));
-  }
-}
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-
-val input: DataStream[SensorReading] = ...
-
-input
- .keyBy(<key selector>)
- .window(<window assigner>)
- .fold (
-    ("", 0L, 0),
-    (acc: (String, Long, Int), r: SensorReading) => { ("", 0L, acc._3 + 1) },
-    ( key: String,
-      window: TimeWindow,
-      counts: Iterable[(String, Long, Int)],
-      out: Collector[(String, Long, Int)] ) =>
-      {
-        val count = counts.iterator.next()
-        out.collect((key, window.getEnd, count._3))
-      }
-  )
-
-{% endhighlight %}
-</div>
-</div>
-
-#### Incremental Window Aggregation with ReduceFunction
-
-The following example shows how an incremental `ReduceFunction` can be combined with
-a `WindowFunction` to return the smallest event in a window along
-with the start time of the window.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<SensorReading> input = ...;
-
-input
-  .keyBy(<key selector>)
-  .window(<window assigner>)
-  .reduce(new MyReduceFunction(), new MyWindowFunction());
-
-// Function definitions
-
-private static class MyReduceFunction implements ReduceFunction<SensorReading> {
-
-  public SensorReading reduce(SensorReading r1, SensorReading r2) {
-      return r1.value() > r2.value() ? r2 : r1;
-  }
-}
-
-private static class MyWindowFunction
-    implements WindowFunction<SensorReading, Tuple2<Long, SensorReading>, String, TimeWindow> {
-
-  public void apply(String key,
-                    TimeWindow window,
-                    Iterable<SensorReading> minReadings,
-                    Collector<Tuple2<Long, SensorReading>> out) {
-      SensorReading min = minReadings.iterator().next();
-      out.collect(new Tuple2<Long, SensorReading>(window.getStart(), min));
-  }
-}
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-
-val input: DataStream[SensorReading] = ...
-
-input
-  .keyBy(<key selector>)
-  .window(<window assigner>)
-  .reduce(
-    (r1: SensorReading, r2: SensorReading) => { if (r1.value > r2.value) r2 else r1 },
-    ( key: String,
-      window: TimeWindow,
-      minReadings: Iterable[SensorReading],
-      out: Collector[(Long, SensorReading)] ) =>
-      {
-        val min = minReadings.iterator.next()
-        out.collect((window.getStart, min))
-      }
-  )
-
-{% endhighlight %}
-</div>
-</div>
-
-## Triggers
-
-A `Trigger` determines when a window (as formed by the *window assigner*) is ready to be
-processed by the *window function*. Each `WindowAssigner` comes with a default `Trigger`.
-If the default trigger does not fit your needs, you can specify a custom trigger using `trigger(...)`.
-
-The trigger interface has five methods that allow a `Trigger` to react to different events:
-
-* The `onElement()` method is called for each element that is added to a window.
-* The `onEventTime()` method is called when a registered event-time timer fires.
-* The `onProcessingTime()` method is called when a registered processing-time timer fires.
-* The `onMerge()` method is relevant for stateful triggers and merges the states of two triggers when their corresponding windows merge, *e.g.* when using session windows.
-* Finally the `clear()` method performs any action needed upon removal of the corresponding window.
-
-Two things to notice about the above methods are:
-
-1) The first three decide how to act on their invocation event by returning a `TriggerResult`. The action can be one of the following:
-
-* `CONTINUE`: do nothing,
-* `FIRE`: trigger the computation,
-* `PURGE`: clear the elements in the window, and
-* `FIRE_AND_PURGE`: trigger the computation and clear the elements in the window afterwards.
-
-2) Any of these methods can be used to register processing- or event-time timers for future actions.
-
-### Fire and Purge
-
-Once a trigger determines that a window is ready for processing, it fires, *i.e.*, it returns `FIRE` or `FIRE_AND_PURGE`. This is the signal for the window operator
-to emit the result of the current window. Given a window with a `WindowFunction`,
-all elements are passed to the `WindowFunction` (possibly after passing them to an evictor).
-Windows with a `ReduceFunction` or `FoldFunction` simply emit their eagerly aggregated result.
-
-When a trigger fires, it can either `FIRE` or `FIRE_AND_PURGE`. While `FIRE` keeps the contents of the window, `FIRE_AND_PURGE` removes its content.
-By default, the pre-implemented triggers simply `FIRE` without purging the window state.
-
-<span class="label label-danger">Attention</span> Purging will simply remove the contents of the window and will leave any potential meta-information about the window and any trigger state intact.
-
-### Default Triggers of WindowAssigners
-
-The default `Trigger` of a `WindowAssigner` is appropriate for many use cases. For example, all the event-time window assigners have an `EventTimeTrigger` as
-default trigger. This trigger simply fires once the watermark passes the end of a window.
-
-<span class="label label-danger">Attention</span> The default trigger of the `GlobalWindow` is the `NeverTrigger` which does never fire. Consequently, you always have to define a custom trigger when using a `GlobalWindow`.
-
-<span class="label label-danger">Attention</span> By specifying a trigger using `trigger()` you
-are overwriting the default trigger of a `WindowAssigner`. For example, if you specify a
-`CountTrigger` for `TumblingEventTimeWindows` you will no longer get window firings based on the
-progress of time but only by count. Right now, you have to write your own custom trigger if
-you want to react based on both time and count.
-
-### Built-in and Custom Triggers
-
-Flink comes with a few built-in triggers.
-
-* The (already mentioned) `EventTimeTrigger` fires based on the progress of event-time as measured by watermarks.
-* The `ProcessingTimeTrigger` fires based on processing time.
-* The `CountTrigger` fires once the number of elements in a window exceeds the given limit.
-* The `PurgingTrigger` takes as argument another trigger and transforms it into a purging one.
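-
-For example, wrapping a `CountTrigger` in a `PurgingTrigger` yields a trigger that fires
-after a given number of elements and then clears the window contents. The following is a
-minimal usage sketch; the key selector, window assigner, and window function are
-placeholders as in the previous examples:
-
-{% highlight java %}
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    // fire after 100 elements and purge the window contents afterwards
-    .trigger(PurgingTrigger.of(CountTrigger.of(100)))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}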
-
-If you need to implement a custom trigger, you should check out the abstract
-{% gh_link /flink-streaming-java/src/main/java/org/apache/flink/streaming/api/windowing/triggers/Trigger.java "Trigger" %} class.
-Please note that the API is still evolving and might change in future versions of Flink.
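-
-As an illustration, a custom trigger that simply fires on every incoming element could look
-like this (a minimal sketch, assuming the `Trigger` methods described above; the class lives
-in `org.apache.flink.streaming.api.windowing.triggers`, it registers no timers, keeps no
-state, and never purges):
-
-{% highlight java %}
-public class FireOnEveryElementTrigger extends Trigger<Object, TimeWindow> {
-
-  @Override
-  public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) {
-    return TriggerResult.FIRE; // emit the window contents on every element, keeping them intact
-  }
-
-  @Override
-  public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
-    return TriggerResult.CONTINUE; // no event-time timers are registered by this trigger
-  }
-
-  @Override
-  public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
-    return TriggerResult.CONTINUE; // no processing-time timers are registered by this trigger
-  }
-
-  @Override
-  public void clear(TimeWindow window, TriggerContext ctx) {
-    // nothing to clean up since this trigger keeps no state
-  }
-}
-{% endhighlight %}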
-
-## Evictors
-
-Flink's windowing model allows specifying an optional `Evictor` in addition to the `WindowAssigner` and the `Trigger`.
-This can be done using the `evictor(...)` method (shown in the beginning of this document). The evictor has the ability
-to remove elements from a window *after* the trigger fires and *before and/or after* the window function is applied.
-To do so, the `Evictor` interface has two methods:
-
-    /**
-     * Optionally evicts elements. Called before windowing function.
-     *
-     * @param elements The elements currently in the pane.
-     * @param size The current number of elements in the pane.
-     * @param window The {@link Window}
-     * @param evictorContext The context for the Evictor
-     */
-    void evictBefore(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
-
-    /**
-     * Optionally evicts elements. Called after windowing function.
-     *
-     * @param elements The elements currently in the pane.
-     * @param size The current number of elements in the pane.
-     * @param window The {@link Window}
-     * @param evictorContext The context for the Evictor
-     */
-    void evictAfter(Iterable<TimestampedValue<T>> elements, int size, W window, EvictorContext evictorContext);
-
-The `evictBefore()` method contains the eviction logic to be applied before the window function, while `evictAfter()`
-contains the logic to be applied after the window function. Elements evicted before the application of the window
-function will not be processed by it.
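-
-For illustration, a custom evictor that keeps only the last `maxCount` elements of the
-window buffer before the window function runs could look like this (a minimal sketch,
-assuming the `Evictor` interface shown above; the class name and its `maxCount` parameter
-are hypothetical, and `Iterator` is `java.util.Iterator`):
-
-{% highlight java %}
-public class KeepLastEvictor<W extends Window> implements Evictor<Object, W> {
-
-  private final int maxCount; // hypothetical parameter of this sketch
-
-  public KeepLastEvictor(int maxCount) {
-    this.maxCount = maxCount;
-  }
-
-  @Override
-  public void evictBefore(Iterable<TimestampedValue<Object>> elements, int size, W window, EvictorContext ctx) {
-    // evict from the beginning of the window buffer until at most maxCount elements remain
-    int toEvict = size - maxCount;
-    Iterator<TimestampedValue<Object>> iterator = elements.iterator();
-    while (toEvict > 0 && iterator.hasNext()) {
-      iterator.next();
-      iterator.remove(); // eviction happens by removing elements from the iterable
-      toEvict--;
-    }
-  }
-
-  @Override
-  public void evictAfter(Iterable<TimestampedValue<Object>> elements, int size, W window, EvictorContext ctx) {
-    // no eviction after the window function has been applied
-  }
-}
-{% endhighlight %}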
-
-Flink comes with three pre-implemented evictors. These are:
-
-* `CountEvictor`: keeps up to a user-specified number of elements from the window and discards the remaining ones from
-the beginning of the window buffer.
-* `DeltaEvictor`: takes a `DeltaFunction` and a `threshold`, computes the delta between the last element in the
-window buffer and each of the remaining ones, and removes those with a delta greater than or equal to the threshold.
-* `TimeEvictor`: takes as argument an `interval` in milliseconds and for a given window, it finds the maximum
-timestamp `max_ts` among its elements and removes all the elements with timestamps smaller than `max_ts - interval`.
-
-<span class="label label-info">Default</span> By default, all the pre-implemented evictors apply their logic before the
-window function.
-
-<span class="label label-danger">Attention</span> Specifying an evictor prevents any pre-aggregation, as all the
-elements of a window have to be passed to the evictor before applying the computation.
-
-<span class="label label-danger">Attention</span> Flink provides no guarantees about the order of the elements within
-a window. This implies that although an evictor may remove elements from the beginning of the window, these are not
-necessarily the ones that arrive first or last.
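-
-Plugging in one of the pre-implemented evictors follows the same pattern as triggers.
-The following is a minimal usage sketch with placeholders as in the previous examples;
-`CountEvictor.of(10)` keeps up to 10 elements per window:
-
-{% highlight java %}
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    // keep at most 10 elements of each window before the window function is applied
-    .evictor(CountEvictor.of(10))
-    .<windowed transformation>(<window function>);
-{% endhighlight %}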
-
-
-## Allowed Lateness
-
-When working with *event-time* windowing, it can happen that elements arrive late, *i.e.* the watermark that Flink uses to
-keep track of the progress of event-time is already past the end timestamp of a window to which an element belongs. See
-[event time](./event_time.html) and especially [late elements](./event_time.html#late-elements) for a more thorough
-discussion of how Flink deals with event time.
-
-By default, late elements are dropped when the watermark is past the end of the window. However,
-Flink allows you to specify a maximum *allowed lateness* for window operators. Allowed lateness
-specifies by how much time elements can be late before they are dropped, and its default value is 0.
-Elements that arrive after the watermark has passed the end of the window, but before it passes the end of
-the window plus the allowed lateness, are still added to the window. Depending on the trigger used,
-a late but not dropped element may cause the window to fire again. This is the case for the `EventTimeTrigger`.
-
-In order to make this work, Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state, as
-also described in the [Window Lifecycle](#window-lifecycle) section.
-
-<span class="label label-info">Default</span> By default, the allowed lateness is set to
-`0`. That is, elements that arrive behind the watermark will be dropped.
-
-You can specify an allowed lateness like this:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-DataStream<T> input = ...;
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .<windowed transformation>(<window function>);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val input: DataStream[T] = ...
-
-input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .<windowed transformation>(<window function>)
-{% endhighlight %}
-</div>
-</div>
-
-<span class="label label-info">Note</span> When using the `GlobalWindows` window assigner no
-data is ever considered late because the end timestamp of the global window is `Long.MAX_VALUE`.
-
-### Getting late data as a side output
-
-Using Flink's [side output]({{ site.baseurl }}/dev/stream/side_output.html) feature you can get a stream of the data
-that was discarded as late.
-
-You first need to specify that you want to get late data using `sideOutputLateData(OutputTag)` on
-the windowed stream. Then, you can get the side-output stream on the result of the windowed
-operation:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-final OutputTag<T> lateOutputTag = new OutputTag<T>("late-data"){};
-
-DataStream<T> input = ...;
-
-DataStream<T> result = input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .sideOutputLateData(lateOutputTag)
-    .<windowed transformation>(<window function>);
-
-DataStream<T> lateStream = result.getSideOutput(lateOutputTag);
-{% endhighlight %}
-</div>
-
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-val lateOutputTag = OutputTag[T]("late-data")
-
-val input: DataStream[T] = ...
-
-val result = input
-    .keyBy(<key selector>)
-    .window(<window assigner>)
-    .allowedLateness(<time>)
-    .sideOutputLateData(lateOutputTag)
-    .<windowed transformation>(<window function>)
-
-val lateStream = result.getSideOutput(lateOutputTag)
-{% endhighlight %}
-</div>
-</div>
-
-### Late elements considerations
-
-When specifying an allowed lateness greater than 0, the window along with its content is kept after the watermark passes
-the end of the window. In these cases, when a late but not dropped element arrives, it could trigger another firing for the
-window. These firings are called `late firings`, as they are triggered by late events, in contrast to the `main firing`,
-which is the first firing of the window. In the case of session windows, late firings can further lead to the merging of windows,
-as they may "bridge" the gap between two pre-existing, unmerged windows.
-
-<span class="label label-info">Attention</span> You should be aware that the elements emitted by a late firing should be treated as updated results of a previous computation, i.e., your data stream will contain multiple results for the same computation. Depending on your application, you need to take these duplicated results into account or deduplicate them.
-
-## Useful state size considerations
-
-Windows can be defined over long periods of time (such as days, weeks, or months) and therefore accumulate very large state. There are a couple of rules to keep in mind when estimating the storage requirements of your windowing computation:
-
-1. Flink creates one copy of each element per window to which it belongs. Given this, tumbling windows keep one copy of each element (an element belongs to exactly one window unless it is dropped as late). In contrast, sliding windows create several copies of each element, as explained in the [Window Assigners](#window-assigners) section. Hence, a sliding window of size 1 day and slide 1 second might not be a good idea.
-
-2. `FoldFunction` and `ReduceFunction` can significantly reduce the storage requirements, as they eagerly aggregate elements and store only one value per window. In contrast, just using a `WindowFunction` requires accumulating all elements.
-
-3. Using an `Evictor` prevents any pre-aggregation, as all the elements of a window have to be passed through the evictor before applying the computation (see [Evictors](#evictors)).


[7/7] flink git commit: [FLINK-7370][docs] rework the operator documentation structure

Posted by tw...@apache.org.
[FLINK-7370][docs] rework the operator documentation structure

- create category `Streaming/Operators`
- move `Streaming/Overview/DataStream Transformations` to `Streaming/Operators/Overview`
- move `ProcessFunction`, `Windows`, and `Async IO` to `Streaming/Operators`
- update previous links in the documentation
- create any necessary redirects for old URLs


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/cafa45e2
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/cafa45e2
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/cafa45e2

Branch: refs/heads/master
Commit: cafa45e2ff97dddf807859ae8e22e694f2630783
Parents: ff70cc3
Author: Nico Kruber <ni...@data-artisans.com>
Authored: Fri Aug 4 10:56:58 2017 +0200
Committer: twalthr <tw...@apache.org>
Committed: Wed Aug 9 13:56:43 2017 +0200

----------------------------------------------------------------------
 docs/concepts/programming-model.md  |    2 +-
 docs/dev/datastream_api.md          | 1141 +----------------------------
 docs/dev/stream/asyncio.md          |    2 +-
 docs/dev/stream/operators.md        | 1169 ++++++++++++++++++++++++++++++
 docs/dev/stream/process_function.md |    2 +-
 docs/dev/stream/windows.md          | 1039 ++++++++++++++++++++++++++
 docs/dev/windows.md                 | 1039 --------------------------
 docs/redirects/windows.md           |    2 +-
 docs/redirects/windows_2.md         |   24 +
 9 files changed, 2239 insertions(+), 2181 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/concepts/programming-model.md
----------------------------------------------------------------------
diff --git a/docs/concepts/programming-model.md b/docs/concepts/programming-model.md
index fd5ebee..926fdd7 100644
--- a/docs/concepts/programming-model.md
+++ b/docs/concepts/programming-model.md
@@ -82,7 +82,7 @@ Often there is a one-to-one correspondence between the transformations in the pr
 in the dataflow. Sometimes, however, one transformation may consist of multiple transformation operators.
 
 Sources and sinks are documented in the [streaming connectors](../dev/connectors/index.html) and [batch connectors](../dev/batch/connectors.html) docs.
-Transformations are documented in [DataStream transformations](../dev/datastream_api.html#datastream-transformations) and [DataSet transformations](../dev/batch/dataset_transformations.html).
+Transformations are documented in [DataStream operators]({{ site.baseurl }}/dev/stream/operators.html) and [DataSet transformations](../dev/batch/dataset_transformations.html).
 
 {% top %}
 

http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/dev/datastream_api.md
----------------------------------------------------------------------
diff --git a/docs/dev/datastream_api.md b/docs/dev/datastream_api.md
index 8b3899b..b7f02ef 100644
--- a/docs/dev/datastream_api.md
+++ b/docs/dev/datastream_api.md
@@ -38,7 +38,7 @@ to the basic concepts of the Flink API.
 In order to create your own Flink DataStream program, we encourage you to start with
 [anatomy of a Flink Program]({{ site.baseurl }}/dev/api_concepts.html#anatomy-of-a-flink-program)
 and gradually add your own
-[transformations](#datastream-transformations). The remaining sections act as references for additional
+[stream transformations]({{ site.baseurl }}/dev/stream/operators.html). The remaining sections act as references for additional
 operations and advanced features.
 
 
@@ -138,1143 +138,8 @@ word count program. If you want to see counts greater than 1, type the same word
 DataStream Transformations
 --------------------------
 
-Data transformations transform one or more DataStreams into a new DataStream. Programs can combine
-multiple transformations into sophisticated topologies.
-
-This section gives a description of all the available transformations.
-
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 25%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
-    {% highlight java %}
-DataStream<Integer> dataStream = //...
-dataStream.map(new MapFunction<Integer, Integer>() {
-    @Override
-    public Integer map(Integer value) throws Exception {
-        return 2 * value;
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-
-        <tr>
-          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:</p>
-    {% highlight java %}
-dataStream.flatMap(new FlatMapFunction<String, String>() {
-    @Override
-    public void flatMap(String value, Collector<String> out)
-        throws Exception {
-        for(String word: value.split(" ")){
-            out.collect(word);
-        }
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
-            A filter that filters out zero values:
-            </p>
-    {% highlight java %}
-dataStream.filter(new FilterFunction<Integer>() {
-    @Override
-    public boolean filter(Integer value) throws Exception {
-        return value != 0;
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
-          <td>
-            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
-            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
-            This transformation returns a KeyedStream.</p>
-    {% highlight java %}
-dataStream.keyBy("someKey") // Key by field "someKey"
-dataStream.keyBy(0) // Key by the first element of a Tuple
-    {% endhighlight %}
-            <p>
-            <span class="label label-danger">Attention</span> 
-            A type <strong>cannot be a key</strong> if:
-    	    <ol> 
-    	    <li> it is a POJO type but does not override the <em>hashCode()</em> method and 
-    	    relies on the <em>Object.hashCode()</em> implementation.</li>
-    	    <li> it is an array of any type.</li>
-    	    </ol>
-    	    </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
-            emits the new value.
-                    <br/>
-            	<br/>
-            A reduce function that creates a stream of partial sums:</p>
-            {% highlight java %}
-keyedStream.reduce(new ReduceFunction<Integer>() {
-    @Override
-    public Integer reduce(Integer value1, Integer value2)
-    throws Exception {
-        return value1 + value2;
-    }
-});
-            {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-          <p>A "rolling" fold on a keyed data stream with an initial value.
-          Combines the current element with the last folded value and
-          emits the new value.
-          <br/>
-          <br/>
-          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
-          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
-          {% highlight java %}
-DataStream<String> result =
-  keyedStream.fold("start", new FoldFunction<Integer, String>() {
-    @Override
-    public String fold(String current, Integer value) {
-        return current + "-" + value;
-    }
-  });
-          {% endhighlight %}
-          </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>Rolling aggregations on a keyed data stream. The difference between min
-	    and minBy is that min returns the minimum value, whereas minBy returns
-	    the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight java %}
-keyedStream.sum(0);
-keyedStream.sum("key");
-keyedStream.min(0);
-keyedStream.min("key");
-keyedStream.max(0);
-keyedStream.max("key");
-keyedStream.minBy(0);
-keyedStream.minBy("key");
-keyedStream.maxBy(0);
-keyedStream.maxBy("key");
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
-          <td>
-            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
-            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-            See <a href="windows.html">windows</a> for a complete description of windows.
-    {% highlight java %}
-dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
-    {% endhighlight %}
-        </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
-          <td>
-              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
-              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-              See <a href="windows.html">windows</a> for a complete description of windows.</p>
-              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
-               gathered in one task for the windowAll operator.</p>
-  {% highlight java %}
-dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
-  {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
-            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
-    {% highlight java %}
-windowedStream.apply (new WindowFunction<Tuple2<String,Integer>, Integer, Tuple, Window>() {
-    public void apply (Tuple tuple,
-            Window window,
-            Iterable<Tuple2<String, Integer>> values,
-            Collector<Integer> out) throws Exception {
-        int sum = 0;
-        for (Tuple2<String, Integer> t: values) {
-            sum += t.f1;
-        }
-        out.collect (new Integer(sum));
-    }
-});
-
-// applying an AllWindowFunction on non-keyed window stream
-allWindowedStream.apply (new AllWindowFunction<Tuple2<String,Integer>, Integer, Window>() {
-    public void apply (Window window,
-            Iterable<Tuple2<String, Integer>> values,
-            Collector<Integer> out) throws Exception {
-        int sum = 0;
-        for (Tuple2<String, Integer> t: values) {
-            sum += t.f1;
-        }
-        out.collect (new Integer(sum));
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
-    {% highlight java %}
-windowedStream.reduce (new ReduceFunction<Tuple2<String,Integer>>() {
-    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
-        return new Tuple2<String,Integer>(value1.f0, value1.f1 + value2.f1);
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional fold function to the window and returns the folded value.
-               The example function, when applied on the sequence (1,2,3,4,5),
-               folds the sequence into the string "start-1-2-3-4-5":</p>
-    {% highlight java %}
-windowedStream.fold("start", new FoldFunction<Integer, String>() {
-    public String fold(String current, Integer value) {
-        return current + "-" + value;
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Aggregates the contents of a window. The difference between min
-	    and minBy is that min returns the minimum value, whereas minBy returns
-	    the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight java %}
-windowedStream.sum(0);
-windowedStream.sum("key");
-windowedStream.min(0);
-windowedStream.min("key");
-windowedStream.max(0);
-windowedStream.max("key");
-windowedStream.minBy(0);
-windowedStream.minBy("key");
-windowedStream.maxBy(0);
-windowedStream.maxBy("key");
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
-          <td>
-            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
-            with itself you will get each element twice in the resulting stream.</p>
-    {% highlight java %}
-dataStream.union(otherStream1, otherStream2, ...);
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Join two data streams on a given key and a common window.</p>
-    {% highlight java %}
-dataStream.join(otherStream)
-    .where(<key selector>).equalTo(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply (new JoinFunction () {...});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Cogroups two data streams on a given key and a common window.</p>
-    {% highlight java %}
-dataStream.coGroup(otherStream)
-    .where(0).equalTo(1)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply (new CoGroupFunction () {...});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
-          <td>
-            <p>"Connects" two data streams retaining their types. Connect allowing for shared state between
-            the two streams.</p>
-    {% highlight java %}
-DataStream<Integer> someStream = //...
-DataStream<String> otherStream = //...
-
-ConnectedStreams<Integer, String> connectedStreams = someStream.connect(otherStream);
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
-          <td>
-            <p>Similar to map and flatMap on a connected data stream</p>
-    {% highlight java %}
-connectedStreams.map(new CoMapFunction<Integer, String, Boolean>() {
-    @Override
-    public Boolean map1(Integer value) {
-        return true;
-    }
-
-    @Override
-    public Boolean map2(String value) {
-        return false;
-    }
-});
-connectedStreams.flatMap(new CoFlatMapFunction<Integer, String, String>() {
-
-   @Override
-   public void flatMap1(Integer value, Collector<String> out) {
-       out.collect(value.toString());
-   }
-
-   @Override
-   public void flatMap2(String value, Collector<String> out) {
-       for (String word: value.split(" ")) {
-         out.collect(word);
-       }
-   }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
-          <td>
-            <p>
-                Split the stream into two or more streams according to some criterion.
-                {% highlight java %}
-SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() {
-    @Override
-    public Iterable<String> select(Integer value) {
-        List<String> output = new ArrayList<String>();
-        if (value % 2 == 0) {
-            output.add("even");
-        }
-        else {
-            output.add("odd");
-        }
-        return output;
-    }
-});
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Select one or more streams from a split stream.
-                {% highlight java %}
-SplitStream<Integer> split;
-DataStream<Integer> even = split.select("even");
-DataStream<Integer> odd = split.select("odd");
-DataStream<Integer> all = split.select("even","odd");
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Creates a "feedback" loop in the flow, by redirecting the output of one operator
-                to some previous operator. This is especially useful for defining algorithms that
-                continuously update a model. The following code starts with a stream and applies
-		the iteration body continuously. Elements that are greater than 0 are sent back
-		to the feedback channel, and the rest of the elements are forwarded downstream.
-		See <a href="#iterations">iterations</a> for a complete description.
-                {% highlight java %}
-IterativeStream<Long> iteration = initialStream.iterate();
-DataStream<Long> iterationBody = iteration.map (/*do something*/);
-DataStream<Long> feedback = iterationBody.filter(new FilterFunction<Long>(){
-    @Override
-    public boolean filter(Long value) throws Exception {
-        return value > 0;
-    }
-});
-iteration.closeWith(feedback);
-DataStream<Long> output = iterationBody.filter(new FilterFunction<Long>(){
-    @Override
-    public boolean filter(Long value) throws Exception {
-        return value <= 0;
-    }
-});
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Extracts timestamps from records in order to work with windows
-                that use event time semantics. See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
-                {% highlight java %}
-stream.assignTimestamps (new TimestampExtractor() {...});
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-  </tbody>
-</table>
-
-</div>
-
-<div data-lang="scala" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 25%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
-    {% highlight scala %}
-dataStream.map { x => x * 2 }
-    {% endhighlight %}
-          </td>
-        </tr>
-
-        <tr>
-          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:</p>
-    {% highlight scala %}
-dataStream.flatMap { str => str.split(" ") }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
-            A filter that filters out zero values:
-            </p>
-    {% highlight scala %}
-dataStream.filter { _ != 0 }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
-          <td>
-            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
-            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
-            This transformation returns a KeyedStream.</p>
-    {% highlight scala %}
-dataStream.keyBy("someKey") // Key by field "someKey"
-dataStream.keyBy(0) // Key by the first element of a Tuple
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
-            emits the new value.
-                    <br/>
-            	<br/>
-            A reduce function that creates a stream of partial sums:</p>
-            {% highlight scala %}
-keyedStream.reduce { _ + _ }
-            {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-          <p>A "rolling" fold on a keyed data stream with an initial value.
-          Combines the current element with the last folded value and
-          emits the new value.
-          <br/>
-          <br/>
-          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
-          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
-          {% highlight scala %}
-val result: DataStream[String] =
-    keyedStream.fold("start")((str, i) => { str + "-" + i })
-          {% endhighlight %}
-          </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>Rolling aggregations on a keyed data stream. The difference between min
-	    and minBy is that min returns the minimum value, whereas minBy returns
-	    the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight scala %}
-keyedStream.sum(0)
-keyedStream.sum("key")
-keyedStream.min(0)
-keyedStream.min("key")
-keyedStream.max(0)
-keyedStream.max("key")
-keyedStream.minBy(0)
-keyedStream.minBy("key")
-keyedStream.maxBy(0)
-keyedStream.maxBy("key")
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
-          <td>
-            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
-            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-            See <a href="windows.html">windows</a> for a description of windows.
-    {% highlight scala %}
-dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
-    {% endhighlight %}
-        </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
-          <td>
-              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
-              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-              See <a href="windows.html">windows</a> for a complete description of windows.</p>
-              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
-               gathered in one task for the windowAll operator.</p>
-  {% highlight scala %}
-dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
-  {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
-            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
-    {% highlight scala %}
-windowedStream.apply { WindowFunction }
-
-// applying an AllWindowFunction on non-keyed window stream
-allWindowedStream.apply { AllWindowFunction }
-
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
-    {% highlight scala %}
-windowedStream.reduce { _ + _ }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional fold function to the window and returns the folded value.
-               The example function, when applied on the sequence (1,2,3,4,5),
-               folds the sequence into the string "start-1-2-3-4-5":</p>
-          {% highlight scala %}
-val result: DataStream[String] =
-    windowedStream.fold("start", (str, i) => { str + "-" + i })
-          {% endhighlight %}
-          </td>
-	</tr>
-        <tr>
-          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Aggregates the contents of a window. The difference between min
-	    and minBy is that min returns the minimum value, whereas minBy returns
-	    the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight scala %}
-windowedStream.sum(0)
-windowedStream.sum("key")
-windowedStream.min(0)
-windowedStream.min("key")
-windowedStream.max(0)
-windowedStream.max("key")
-windowedStream.minBy(0)
-windowedStream.minBy("key")
-windowedStream.maxBy(0)
-windowedStream.maxBy("key")
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
-          <td>
-            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
-            with itself you will get each element twice in the resulting stream.</p>
-    {% highlight scala %}
-dataStream.union(otherStream1, otherStream2, ...)
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Join two data streams on a given key and a common window.</p>
-    {% highlight scala %}
-dataStream.join(otherStream)
-    .where(<key selector>).equalTo(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply { ... }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Cogroups two data streams on a given key and a common window.</p>
-    {% highlight scala %}
-dataStream.coGroup(otherStream)
-    .where(0).equalTo(1)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply {}
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
-          <td>
-            <p>"Connects" two data streams retaining their types, allowing for shared state between
-            the two streams.</p>
-    {% highlight scala %}
-val someStream: DataStream[Int] = ...
-val otherStream: DataStream[String] = ...
-
-val connectedStreams = someStream.connect(otherStream)
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
-          <td>
-            <p>Similar to map and flatMap on a connected data stream</p>
-    {% highlight scala %}
-connectedStreams.map(
-    (_ : Int) => true,
-    (_ : String) => false
-)
-connectedStreams.flatMap(
-    (num : Int) => List(num.toString),
-    (str : String) => str.split(" ").toList
-)
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
-          <td>
-            <p>
-                Split the stream into two or more streams according to some criterion.
-                {% highlight scala %}
-val split = someDataStream.split(
-  (num: Int) =>
-    (num % 2) match {
-      case 0 => List("even")
-      case 1 => List("odd")
-    }
-)
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Select one or more streams from a split stream.
-                {% highlight scala %}
-
-val even = split select "even"
-val odd = split select "odd"
-val all = split.select("even","odd")
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream  &rarr; DataStream</td>
-          <td>
-            <p>
-                Creates a "feedback" loop in the flow, by redirecting the output of one operator
-                to some previous operator. This is especially useful for defining algorithms that
-                continuously update a model. The following code starts with a stream and applies
-		the iteration body continuously. Elements that are greater than 0 are sent back
-		to the feedback channel, and the rest of the elements are forwarded downstream.
-		See <a href="#iterations">iterations</a> for a complete description.
-                {% highlight scala %}
-initialStream.iterate {
-  iteration => {
-    val iterationBody = iteration.map {/*do something*/}
-    (iterationBody.filter(_ > 0), iterationBody.filter(_ <= 0))
-  }
-}
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Extracts timestamps from records in order to work with windows
-                that use event time semantics.
-                See <a href="{{ site.baseurl }}/apis/streaming/event_time.html">Event Time</a>.
-                {% highlight scala %}
-stream.assignTimestamps { timestampExtractor }
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-  </tbody>
-</table>
-
-Extraction from tuples, case classes and collections via anonymous pattern matching, like the following:
-{% highlight scala %}
-val data: DataStream[(Int, String, Double)] = // [...]
-data.map {
-  case (id, name, temperature) => // [...]
-}
-{% endhighlight %}
-is not supported by the API out-of-the-box. To use this feature, you should use a <a href="scala_api_extensions.html">Scala API extension</a>.
-
-
-</div>
-</div>
-
-The following transformations are available on data streams of Tuples:
-
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td><strong>Project</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>Selects a subset of fields from the tuples
-{% highlight java %}
-DataStream<Tuple3<Integer, Double, String>> in = // [...]
-DataStream<Tuple2<String, Integer>> out = in.project(2,0);
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-</div>
-
-
-### Physical partitioning
-
-Flink also gives low-level control (if desired) on the exact stream partitioning after a transformation,
-via the following functions.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Uses a user-defined Partitioner to select the target task for each element.
-            {% highlight java %}
-dataStream.partitionCustom(partitioner, "someKey");
-dataStream.partitionCustom(partitioner, 0);
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
-     <td>
-       <p>
-            Partitions elements randomly according to a uniform distribution.
-            {% highlight java %}
-dataStream.shuffle();
-            {% endhighlight %}
-       </p>
-     </td>
-   </tr>
-   <tr>
-      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements round-robin, creating equal load per partition. Useful for performance
-            optimization in the presence of data skew.
-            {% highlight java %}
-dataStream.rebalance();
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-    <tr>
-      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements, round-robin, to a subset of downstream operations. This is
-            useful if you want to have pipelines where you, for example, fan out from
-            each parallel instance of a source to a subset of several mappers to distribute load
-            but don't want the full rebalance that rebalance() would incur. This would require only
-            local data transfers instead of transferring data over network, depending on
-            other configuration values such as the number of slots of TaskManagers.
-        </p>
-        <p>
-            The subset of downstream operations to which the upstream operation sends
-            elements depends on the degree of parallelism of both the upstream and downstream operation.
-            For example, if the upstream operation has parallelism 2 and the downstream operation
-            has parallelism 6, then one upstream operation would distribute elements to three
-            downstream operations while the other upstream operation would distribute to the other
-            three downstream operations. If, on the other hand, the downstream operation has parallelism
-            2 while the upstream operation has parallelism 6 then three upstream operations would
-            distribute to one downstream operation while the other three upstream operations would
-            distribute to the other downstream operation.
-        </p>
-        <p>
-            In cases where the different parallelisms are not multiples of each other one or several
-            downstream operations will have a differing number of inputs from upstream operations.
-        </p>
-        <p>
-            Please see this figure for a visualization of the connection pattern in the above
-            example:
-        </p>
-
-        <div style="text-align: center">
-            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Checkpoint barriers in data streams" />
-            </div>
-
-
-        <p>
-                    {% highlight java %}
-dataStream.rescale();
-            {% endhighlight %}
-
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Broadcasts elements to every partition.
-            {% highlight java %}
-dataStream.broadcast();
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-
-<div data-lang="scala" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Uses a user-defined Partitioner to select the target task for each element.
-            {% highlight scala %}
-dataStream.partitionCustom(partitioner, "someKey")
-dataStream.partitionCustom(partitioner, 0)
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
-     <td>
-       <p>
-            Partitions elements randomly according to a uniform distribution.
-            {% highlight scala %}
-dataStream.shuffle()
-            {% endhighlight %}
-       </p>
-     </td>
-   </tr>
-   <tr>
-      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements round-robin, creating equal load per partition. Useful for performance
-            optimization in the presence of data skew.
-            {% highlight scala %}
-dataStream.rebalance()
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-    <tr>
-      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements, round-robin, to a subset of downstream operations. This is
-            useful if you want to have pipelines where you, for example, fan out from
-            each parallel instance of a source to a subset of several mappers to distribute load
-            but don't want the full rebalance that rebalance() would incur. This would require only
-            local data transfers instead of transferring data over network, depending on
-            other configuration values such as the number of slots of TaskManagers.
-        </p>
-        <p>
-            The subset of downstream operations to which the upstream operation sends
-            elements depends on the degree of parallelism of both the upstream and downstream operation.
-            For example, if the upstream operation has parallelism 2 and the downstream operation
-            has parallelism 4, then one upstream operation would distribute elements to two
-            downstream operations while the other upstream operation would distribute to the other
-            two downstream operations. If, on the other hand, the downstream operation has parallelism
-            2 while the upstream operation has parallelism 4 then two upstream operations would
-            distribute to one downstream operation while the other two upstream operations would
-            distribute to the other downstream operations.
-        </p>
-        <p>
-            In cases where the different parallelisms are not multiples of each other one or several
-            downstream operations will have a differing number of inputs from upstream operations.
-
-        </p>
-        </p>
-            Please see this figure for a visualization of the connection pattern in the above
-            example:
-        </p>
-
-        <div style="text-align: center">
-            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Checkpoint barriers in data streams" />
-            </div>
-
-
-        <p>
-                    {% highlight java %}
-dataStream.rescale()
-            {% endhighlight %}
-
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Broadcasts elements to every partition.
-            {% highlight scala %}
-dataStream.broadcast()
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-</div>
-
-### Task chaining and resource groups
-
-Chaining two subsequent transformations means co-locating them within the same thread for better
-performance. Flink by default chains operators if this is possible (e.g., two subsequent map
-transformations). The API gives fine-grained control over chaining if desired:
-
-Use `StreamExecutionEnvironment.disableOperatorChaining()` if you want to disable chaining in
-the whole job. For more fine grained control, the following functions are available. Note that
-these functions can only be used right after a DataStream transformation as they refer to the
-previous transformation. For example, you can use `someStream.map(...).startNewChain()`, but
-you cannot use `someStream.startNewChain()`.
-
-A resource group is a slot in Flink, see
-[slots]({{site.baseurl}}/ops/config.html#configuring-taskmanager-processing-slots). You can
-manually isolate operators in separate slots if desired.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td>Start new chain</td>
-      <td>
-        <p>Begin a new chain, starting with this operator. The two
-	mappers will be chained, and filter will not be chained to
-	the first mapper.
-{% highlight java %}
-someStream.filter(...).map(...).startNewChain().map(...);
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td>Disable chaining</td>
-      <td>
-        <p>Do not chain the map operator
-{% highlight java %}
-someStream.map(...).disableChaining();
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-    <tr>
-      <td>Set slot sharing group</td>
-      <td>
-        <p>Set the slot sharing group of an operation. Flink will put operations with the same
-        slot sharing group into the same slot while keeping operations that don't have the
-        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
-        group is inherited from input operations if all input operations are in the same slot
-        sharing group.
-        The name of the default slot sharing group is "default", operations can explicitly
-        be put into this group by calling slotSharingGroup("default").
-{% highlight java %}
-someStream.filter(...).slotSharingGroup("name");
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-
-<div data-lang="scala" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td>Start new chain</td>
-      <td>
-        <p>Begin a new chain, starting with this operator. The two
-	mappers will be chained, and filter will not be chained to
-	the first mapper.
-{% highlight scala %}
-someStream.filter(...).map(...).startNewChain().map(...)
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td>Disable chaining</td>
-      <td>
-        <p>Do not chain the map operator
-{% highlight scala %}
-someStream.map(...).disableChaining()
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  <tr>
-      <td>Set slot sharing group</td>
-      <td>
-        <p>Set the slot sharing group of an operation. Flink will put operations with the same
-        slot sharing group into the same slot while keeping operations that don't have the
-        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
-        group is inherited from input operations if all input operations are in the same slot
-        sharing group.
-        The name of the default slot sharing group is "default", operations can explicitly
-        be put into this group by calling slotSharingGroup("default").
-{% highlight java %}
-someStream.filter(...).slotSharingGroup("name")
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-</div>
-
-
-{% top %}
+Moved. Please see [operators]({{ site.baseurl }}/dev/stream/operators.html) for an overview of the
+available stream transformations.
 
 Data Sources
 ------------

http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/dev/stream/asyncio.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/asyncio.md b/docs/dev/stream/asyncio.md
index c4414b4..ec9c8ba 100644
--- a/docs/dev/stream/asyncio.md
+++ b/docs/dev/stream/asyncio.md
@@ -1,7 +1,7 @@
 ---
 title: "Asynchronous I/O for External Data Access"
 nav-title: "Async I/O"
-nav-parent_id: streaming
+nav-parent_id: operators
 nav-pos: 60
 ---
 <!--

http://git-wip-us.apache.org/repos/asf/flink/blob/cafa45e2/docs/dev/stream/operators.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/operators.md b/docs/dev/stream/operators.md
new file mode 100644
index 0000000..70bd9ae
--- /dev/null
+++ b/docs/dev/stream/operators.md
@@ -0,0 +1,1169 @@
+---
+title: "Operators"
+nav-id: operators
+nav-show_overview: true
+nav-parent_id: streaming
+nav-pos: 9
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Operators transform one or more DataStreams into a new DataStream. Programs can combine
+multiple transformations into sophisticated topologies.
+
+This section describes all the available transformations, the effective physical
+partitioning after applying them, as well as insights into Flink's operator chaining.
+
+* toc
+{:toc}
+
+# DataStream Transformations
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 25%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
+    {% highlight java %}
+DataStream<Integer> dataStream = //...
+dataStream.map(new MapFunction<Integer, Integer>() {
+    @Override
+    public Integer map(Integer value) throws Exception {
+        return 2 * value;
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+
+        <tr>
+          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences into words:</p>
+    {% highlight java %}
+dataStream.flatMap(new FlatMapFunction<String, String>() {
+    @Override
+    public void flatMap(String value, Collector<String> out)
+        throws Exception {
+        for(String word: value.split(" ")){
+            out.collect(word);
+        }
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
+            A filter that filters out zero values:
+            </p>
+    {% highlight java %}
+dataStream.filter(new FilterFunction<Integer>() {
+    @Override
+    public boolean filter(Integer value) throws Exception {
+        return value != 0;
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
+          <td>
+            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
+            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
+            This transformation returns a KeyedStream.</p>
+    {% highlight java %}
+dataStream.keyBy("someKey") // Key by field "someKey"
+dataStream.keyBy(0) // Key by the first element of a Tuple
+    {% endhighlight %}
+            <p>
+            <span class="label label-danger">Attention</span>
+            A type <strong>cannot be a key</strong> if:
+    	    <ol>
+    	    <li> it is a POJO type but does not override the <em>hashCode()</em> method and
+    	    relies on the <em>Object.hashCode()</em> implementation.</li>
+    	    <li> it is an array of any type.</li>
+    	    </ol>
+    	    </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
+            emits the new value.
+                    <br/>
+            	<br/>
+            A reduce function that creates a stream of partial sums:</p>
+            {% highlight java %}
+keyedStream.reduce(new ReduceFunction<Integer>() {
+    @Override
+    public Integer reduce(Integer value1, Integer value2)
+    throws Exception {
+        return value1 + value2;
+    }
+});
+            {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+          <p>A "rolling" fold on a keyed data stream with an initial value.
+          Combines the current element with the last folded value and
+          emits the new value.
+          <br/>
+          <br/>
+          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
+          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
+          {% highlight java %}
+DataStream<String> result =
+  keyedStream.fold("start", new FoldFunction<Integer, String>() {
+    @Override
+    public String fold(String current, Integer value) {
+        return current + "-" + value;
+    }
+  });
+          {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>Rolling aggregations on a keyed data stream. The difference between min
+	    and minBy is that min returns the minimum value, whereas minBy returns
+	    the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight java %}
+keyedStream.sum(0);
+keyedStream.sum("key");
+keyedStream.min(0);
+keyedStream.min("key");
+keyedStream.max(0);
+keyedStream.max("key");
+keyedStream.minBy(0);
+keyedStream.minBy("key");
+keyedStream.maxBy(0);
+keyedStream.maxBy("key");
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
+          <td>
+            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
+            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+            See <a href="windows.html">windows</a> for a complete description of windows.
+    {% highlight java %}
+dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
+    {% endhighlight %}
+        </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
+          <td>
+              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
+              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+              See <a href="windows.html">windows</a> for a complete description of windows.</p>
+              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
+               gathered in one task for the windowAll operator.</p>
+  {% highlight java %}
+dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
+  {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
+            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
+    {% highlight java %}
+windowedStream.apply(new WindowFunction<Tuple2<String,Integer>, Integer, Tuple, Window>() {
+    public void apply(Tuple tuple,
+            Window window,
+            Iterable<Tuple2<String, Integer>> values,
+            Collector<Integer> out) throws Exception {
+        int sum = 0;
+        for (Tuple2<String, Integer> t : values) {
+            sum += t.f1;
+        }
+        out.collect(sum);
+    }
+});
+
+// applying an AllWindowFunction on non-keyed window stream
+allWindowedStream.apply(new AllWindowFunction<Tuple2<String,Integer>, Integer, Window>() {
+    public void apply(Window window,
+            Iterable<Tuple2<String, Integer>> values,
+            Collector<Integer> out) throws Exception {
+        int sum = 0;
+        for (Tuple2<String, Integer> t : values) {
+            sum += t.f1;
+        }
+        out.collect(sum);
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
+    {% highlight java %}
+windowedStream.reduce (new ReduceFunction<Tuple2<String,Integer>>() {
+    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
+        return new Tuple2<String,Integer>(value1.f0, value1.f1 + value2.f1);
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional fold function to the window and returns the folded value.
+               The example function, when applied on the sequence (1,2,3,4,5),
+               folds the sequence into the string "start-1-2-3-4-5":</p>
+    {% highlight java %}
+windowedStream.fold("start", new FoldFunction<Integer, String>() {
+    public String fold(String current, Integer value) {
+        return current + "-" + value;
+    }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Aggregates the contents of a window. The difference between min
+	    and minBy is that min returns the minimum value, whereas minBy returns
+	    the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight java %}
+windowedStream.sum(0);
+windowedStream.sum("key");
+windowedStream.min(0);
+windowedStream.min("key");
+windowedStream.max(0);
+windowedStream.max("key");
+windowedStream.minBy(0);
+windowedStream.minBy("key");
+windowedStream.maxBy(0);
+windowedStream.maxBy("key");
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
+          <td>
+            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
+            with itself you will get each element twice in the resulting stream.</p>
+    {% highlight java %}
+dataStream.union(otherStream1, otherStream2, ...);
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Join two data streams on a given key and a common window.</p>
+    {% highlight java %}
+dataStream.join(otherStream)
+    .where(<key selector>).equalTo(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply (new JoinFunction () {...});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Cogroups two data streams on a given key and a common window.</p>
+    {% highlight java %}
+dataStream.coGroup(otherStream)
+    .where(0).equalTo(1)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply (new CoGroupFunction () {...});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
+          <td>
+            <p>"Connects" two data streams retaining their types. Connect allowing for shared state between
+            the two streams.</p>
+    {% highlight java %}
+DataStream<Integer> someStream = //...
+DataStream<String> otherStream = //...
+
+ConnectedStreams<Integer, String> connectedStreams = someStream.connect(otherStream);
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
+          <td>
+            <p>Similar to map and flatMap on a connected data stream.</p>
+    {% highlight java %}
+connectedStreams.map(new CoMapFunction<Integer, String, Boolean>() {
+    @Override
+    public Boolean map1(Integer value) {
+        return true;
+    }
+
+    @Override
+    public Boolean map2(String value) {
+        return false;
+    }
+});
+connectedStreams.flatMap(new CoFlatMapFunction<Integer, String, String>() {
+
+   @Override
+   public void flatMap1(Integer value, Collector<String> out) {
+       out.collect(value.toString());
+   }
+
+   @Override
+   public void flatMap2(String value, Collector<String> out) {
+       for (String word: value.split(" ")) {
+         out.collect(word);
+       }
+   }
+});
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
+          <td>
+            <p>
+                Split the stream into two or more streams according to some criterion.
+                {% highlight java %}
+SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() {
+    @Override
+    public Iterable<String> select(Integer value) {
+        List<String> output = new ArrayList<String>();
+        if (value % 2 == 0) {
+            output.add("even");
+        }
+        else {
+            output.add("odd");
+        }
+        return output;
+    }
+});
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Select one or more streams from a split stream.
+                {% highlight java %}
+SplitStream<Integer> split;
+DataStream<Integer> even = split.select("even");
+DataStream<Integer> odd = split.select("odd");
+DataStream<Integer> all = split.select("even","odd");
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Creates a "feedback" loop in the flow, by redirecting the output of one operator
+                to some previous operator. This is especially useful for defining algorithms that
+                continuously update a model. The following code starts with a stream and applies
+		the iteration body continuously. Elements that are greater than 0 are sent back
+		to the feedback channel, and the rest of the elements are forwarded downstream.
+		See <a href="#iterations">iterations</a> for a complete description.
+                {% highlight java %}
+IterativeStream<Long> iteration = initialStream.iterate();
+DataStream<Long> iterationBody = iteration.map (/*do something*/);
+DataStream<Long> feedback = iterationBody.filter(new FilterFunction<Long>(){
+    @Override
+    public boolean filter(Long value) throws Exception {
+        return value > 0;
+    }
+});
+iteration.closeWith(feedback);
+DataStream<Long> output = iterationBody.filter(new FilterFunction<Long>(){
+    @Override
+    public boolean filter(Long value) throws Exception {
+        return value <= 0;
+    }
+});
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Extracts timestamps from records in order to work with windows
+                that use event time semantics. See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
+                A sketch of the recommended watermark-assigning variant follows this table.
+                {% highlight java %}
+stream.assignTimestamps(new TimestampExtractor() {...});
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+  </tbody>
+</table>
+
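+The `assignTimestamps(...)` call above shows Flink's older timestamp extraction hook. As a rough
+sketch of the recommended variant, which assigns timestamps and periodic watermarks in one step
+(the `MyEvent` type and its `getCreationTime()` accessor are hypothetical placeholders, not part
+of the API):
+
+{% highlight java %}
+// Minimal sketch: emit watermarks that trail the largest timestamp seen so far by 10 seconds.
+// MyEvent and getCreationTime() are illustrative assumptions.
+DataStream<MyEvent> withTimestampsAndWatermarks = stream.assignTimestampsAndWatermarks(
+    new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
+        @Override
+        public long extractTimestamp(MyEvent element) {
+            return element.getCreationTime();
+        }
+    });
+{% endhighlight %}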
+</div>
+
+<div data-lang="scala" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 25%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
+    {% highlight scala %}
+dataStream.map { x => x * 2 }
+    {% endhighlight %}
+          </td>
+        </tr>
+
+        <tr>
+          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences into words:</p>
+    {% highlight scala %}
+dataStream.flatMap { str => str.split(" ") }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
+            A filter that filters out zero values:
+            </p>
+    {% highlight scala %}
+dataStream.filter { _ != 0 }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
+          <td>
+            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
+            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
+            This transformation returns a KeyedStream.</p>
+    {% highlight scala %}
+dataStream.keyBy("someKey") // Key by field "someKey"
+dataStream.keyBy(0) // Key by the first element of a Tuple
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
+            emits the new value.
+                    <br/>
+            	<br/>
+            A reduce function that creates a stream of partial sums:</p>
+            {% highlight scala %}
+keyedStream.reduce { _ + _ }
+            {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+          <p>A "rolling" fold on a keyed data stream with an initial value.
+          Combines the current element with the last folded value and
+          emits the new value.
+          <br/>
+          <br/>
+          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
+          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
+          {% highlight scala %}
+val result: DataStream[String] =
+    keyedStream.fold("start")((str, i) => { str + "-" + i })
+          {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
+          <td>
+            <p>Rolling aggregations on a keyed data stream. The difference between min
+	    and minBy is that min returns the minimum value, whereas minBy returns
+	    the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight scala %}
+keyedStream.sum(0)
+keyedStream.sum("key")
+keyedStream.min(0)
+keyedStream.min("key")
+keyedStream.max(0)
+keyedStream.max("key")
+keyedStream.minBy(0)
+keyedStream.minBy("key")
+keyedStream.maxBy(0)
+keyedStream.maxBy("key")
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
+          <td>
+            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
+            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+            See <a href="windows.html">windows</a> for a description of windows.
+    {% highlight scala %}
+dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
+    {% endhighlight %}
+        </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
+          <td>
+              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
+              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
+              See <a href="windows.html">windows</a> for a complete description of windows.</p>
+              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
+               gathered in one task for the windowAll operator.</p>
+  {% highlight scala %}
+dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
+  {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
+            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
+    {% highlight scala %}
+windowedStream.apply { WindowFunction }
+
+// applying an AllWindowFunction on non-keyed window stream
+allWindowedStream.apply { AllWindowFunction }
+
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
+    {% highlight scala %}
+windowedStream.reduce { _ + _ }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Applies a functional fold function to the window and returns the folded value.
+               The example function, when applied on the sequence (1,2,3,4,5),
+               folds the sequence into the string "start-1-2-3-4-5":</p>
+          {% highlight scala %}
+val result: DataStream[String] =
+    windowedStream.fold("start", (str, i) => { str + "-" + i })
+          {% endhighlight %}
+          </td>
+	</tr>
+        <tr>
+          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
+          <td>
+            <p>Aggregates the contents of a window. The difference between min
+	    and minBy is that min returns the minimum value, whereas minBy returns
+	    the element that has the minimum value in this field (same for max and maxBy).</p>
+    {% highlight scala %}
+windowedStream.sum(0)
+windowedStream.sum("key")
+windowedStream.min(0)
+windowedStream.min("key")
+windowedStream.max(0)
+windowedStream.max("key")
+windowedStream.minBy(0)
+windowedStream.minBy("key")
+windowedStream.maxBy(0)
+windowedStream.maxBy("key")
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
+          <td>
+            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
+            with itself you will get each element twice in the resulting stream.</p>
+    {% highlight scala %}
+dataStream.union(otherStream1, otherStream2, ...)
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Join two data streams on a given key and a common window.</p>
+    {% highlight scala %}
+dataStream.join(otherStream)
+    .where(<key selector>).equalTo(<key selector>)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply { ... }
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
+          <td>
+            <p>Cogroups two data streams on a given key and a common window.</p>
+    {% highlight scala %}
+dataStream.coGroup(otherStream)
+    .where(0).equalTo(1)
+    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
+    .apply {}
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
+          <td>
+            <p>"Connects" two data streams retaining their types, allowing for shared state between
+            the two streams.</p>
+    {% highlight scala %}
+val someStream: DataStream[Int] = ...
+val otherStream: DataStream[String] = ...
+
+val connectedStreams = someStream.connect(otherStream)
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
+          <td>
+            <p>Similar to map and flatMap on a connected data stream.</p>
+    {% highlight scala %}
+connectedStreams.map(
+    (_ : Int) => true,
+    (_ : String) => false
+)
+connectedStreams.flatMap(
+    (num : Int) => List(num.toString),
+    (str : String) => str.split(" ").toList
+)
+    {% endhighlight %}
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
+          <td>
+            <p>
+                Split the stream into two or more streams according to some criterion.
+                {% highlight scala %}
+val split = someDataStream.split(
+  (num: Int) =>
+    (num % 2) match {
+      case 0 => List("even")
+      case 1 => List("odd")
+    }
+)
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Select one or more streams from a split stream.
+                {% highlight scala %}
+val even = split select "even"
+val odd = split select "odd"
+val all = split.select("even","odd")
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream  &rarr; DataStream</td>
+          <td>
+            <p>
+                Creates a "feedback" loop in the flow, by redirecting the output of one operator
+                to some previous operator. This is especially useful for defining algorithms that
+                continuously update a model. The following code starts with a stream and applies
+		the iteration body continuously. Elements that are greater than 0 are sent back
+		to the feedback channel, and the rest of the elements are forwarded downstream.
+		See <a href="#iterations">iterations</a> for a complete description.
+                {% highlight scala %}
+initialStream.iterate {
+  iteration => {
+    val iterationBody = iteration.map {/*do something*/}
+    (iterationBody.filter(_ > 0), iterationBody.filter(_ <= 0))
+  }
+}
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+        <tr>
+          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
+          <td>
+            <p>
+                Extracts timestamps from records in order to work with windows
+                that use event time semantics.
+                See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
+                {% highlight scala %}
+stream.assignTimestamps { timestampExtractor }
+                {% endhighlight %}
+            </p>
+          </td>
+        </tr>
+  </tbody>
+</table>
+
+Extraction from tuples, case classes and collections via anonymous pattern matching, like the following:
+{% highlight scala %}
+val data: DataStream[(Int, String, Double)] = // [...]
+data.map {
+  case (id, name, temperature) => // [...]
+}
+{% endhighlight %}
+is not supported by the API out-of-the-box. To use this feature, you should use a <a href="scala_api_extensions.html">Scala API extension</a>.
+
+
+</div>
+</div>
+
+The following transformations are available on data streams of Tuples:
+
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td><strong>Project</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>Selects a subset of fields from the tuples
+{% highlight java %}
+DataStream<Tuple3<Integer, Double, String>> in = // [...]
+DataStream<Tuple2<String, Integer>> out = in.project(2,0);
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+</div>
+
+
+# Physical partitioning
+
+Flink also gives low-level control (if desired) on the exact stream partitioning after a transformation,
+via the following functions.
+
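+The `partitionCustom(...)` calls shown in the tables below expect a user-defined `Partitioner`.
+As a minimal sketch (the `String` key type and the modulo-hash routing are illustrative
+assumptions, not requirements of the interface):
+
+{% highlight java %}
+// Minimal sketch of a user-defined Partitioner; any deterministic mapping
+// from key to [0, numPartitions) works.
+Partitioner<String> partitioner = new Partitioner<String>() {
+    @Override
+    public int partition(String key, int numPartitions) {
+        return Math.abs(key.hashCode() % numPartitions);
+    }
+};
+
+// partition a stream of Tuple2<String, Integer> by its first field
+DataStream<Tuple2<String, Integer>> partitioned =
+    dataStream.partitionCustom(partitioner, 0);
+{% endhighlight %}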
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Uses a user-defined Partitioner to select the target task for each element.
+            {% highlight java %}
+dataStream.partitionCustom(partitioner, "someKey");
+dataStream.partitionCustom(partitioner, 0);
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
+     <td>
+       <p>
+            Partitions elements randomly according to a uniform distribution.
+            {% highlight java %}
+dataStream.shuffle();
+            {% endhighlight %}
+       </p>
+     </td>
+   </tr>
+   <tr>
+      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements round-robin, creating equal load per partition. Useful for performance
+            optimization in the presence of data skew.
+            {% highlight java %}
+dataStream.rebalance();
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+    <tr>
+      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements, round-robin, to a subset of downstream operations. This is
+            useful if you want to have pipelines where you, for example, fan out from
+            each parallel instance of a source to a subset of several mappers to distribute load
+            but don't want the full rebalance that rebalance() would incur. This requires only
+            local data transfers instead of transferring data over the network, depending on
+            other configuration values such as the number of slots of the TaskManagers.
+        </p>
+        <p>
+            The subset of downstream operations to which the upstream operation sends
+            elements depends on the degree of parallelism of both the upstream and downstream operation.
+            For example, if the upstream operation has parallelism 2 and the downstream operation
+            has parallelism 6, then one upstream operation would distribute elements to three
+            downstream operations while the other upstream operation would distribute to the other
+            three downstream operations. If, on the other hand, the downstream operation has parallelism
+            2 while the upstream operation has parallelism 6 then three upstream operations would
+            distribute to one downstream operation while the other three upstream operations would
+            distribute to the other downstream operation.
+        </p>
+        <p>
+            In cases where the different parallelisms are not multiples of each other one or several
+            downstream operations will have a differing number of inputs from upstream operations.
+        </p>
+        <p>
+            Please see this figure for a visualization of the connection pattern in the above
+            example:
+        </p>
+
+        <div style="text-align: center">
+            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Rescaling connection pattern" />
+            </div>
+
+
+        <p>
+            {% highlight java %}
+dataStream.rescale();
+            {% endhighlight %}
+        </p>
+        <p>
+            A short end-to-end sketch of this pattern follows this table.
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Broadcasts elements to every partition.
+            {% highlight java %}
+dataStream.broadcast();
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
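+As an end-to-end sketch of the rescaling pattern described above, the following wires a
+parallelism-2 source to a parallelism-6 mapper, so that each source instance feeds three of the
+six mapper instances (`MySource` and `MyMapper` are hypothetical stand-ins):
+
+{% highlight java %}
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+
+// MySource and MyMapper are illustrative placeholders, not part of the API
+DataStream<String> source = env
+    .addSource(new MySource())
+    .setParallelism(2);
+
+source
+    .rescale()             // each source instance feeds 3 of the 6 mappers locally
+    .map(new MyMapper())
+    .setParallelism(6);
+{% endhighlight %}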
+</div>
+
+<div data-lang="scala" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Uses a user-defined Partitioner to select the target task for each element.
+            {% highlight scala %}
+dataStream.partitionCustom(partitioner, "someKey")
+dataStream.partitionCustom(partitioner, 0)
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
+     <td>
+       <p>
+            Partitions elements randomly according to a uniform distribution.
+            {% highlight scala %}
+dataStream.shuffle()
+            {% endhighlight %}
+       </p>
+     </td>
+   </tr>
+   <tr>
+      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements round-robin, creating equal load per partition. Useful for performance
+            optimization in the presence of data skew.
+            {% highlight scala %}
+dataStream.rebalance()
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+    <tr>
+      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Partitions elements, round-robin, to a subset of downstream operations. This is
+            useful if you want to have pipelines where you, for example, fan out from
+            each parallel instance of a source to a subset of several mappers to distribute load
+            but don't want the full rebalance that rebalance() would incur. This requires only
+            local data transfers instead of transferring data over the network, depending on
+            other configuration values such as the number of slots of the TaskManagers.
+        </p>
+        <p>
+            The subset of downstream operations to which the upstream operation sends
+            elements depends on the degree of parallelism of both the upstream and downstream operation.
+            For example, if the upstream operation has parallelism 2 and the downstream operation
+            has parallelism 6, then one upstream operation would distribute elements to three
+            downstream operations while the other upstream operation would distribute to the other
+            three downstream operations. If, on the other hand, the downstream operation has parallelism
+            2 while the upstream operation has parallelism 6 then three upstream operations would
+            distribute to one downstream operation while the other three upstream operations would
+            distribute to the other downstream operation.
+        </p>
+        <p>
+            In cases where the different parallelisms are not multiples of each other one or several
+            downstream operations will have a differing number of inputs from upstream operations.
+        </p>
+        <p>
+            Please see this figure for a visualization of the connection pattern in the above
+            example:
+        </p>
+
+        <div style="text-align: center">
+            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Rescaling connection pattern" />
+            </div>
+
+
+        <p>
+            {% highlight scala %}
+dataStream.rescale()
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
+      <td>
+        <p>
+            Broadcasts elements to every partition.
+            {% highlight scala %}
+dataStream.broadcast()
+            {% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+</div>
+
+# Task chaining and resource groups
+
+Chaining two subsequent transformations means co-locating them within the same thread for better
+performance. Flink by default chains operators if this is possible (e.g., two subsequent map
+transformations). The API gives fine-grained control over chaining if desired:
+
+Use `StreamExecutionEnvironment.disableOperatorChaining()` if you want to disable chaining in
+the whole job. For more fine-grained control, the following functions are available. Note that
+these functions can only be used right after a DataStream transformation as they refer to the
+previous transformation. For example, you can use `someStream.map(...).startNewChain()`, but
+you cannot use `someStream.startNewChain()`.
+
+A resource group is a slot in Flink, see
+[slots]({{site.baseurl}}/setup/config.html#configuring-taskmanager-processing-slots). You can
+manually isolate operators in separate slots if desired.
+
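+Putting these hooks together, a rough sketch of a pipeline that starts a new chain and isolates
+part of it in its own slot sharing group might look as follows (the operator bodies and the group
+name are illustrative assumptions):
+
+{% highlight java %}
+DataStream<String> result = someStream
+    .filter(s -> !s.isEmpty())
+    .map(String::trim).startNewChain()   // this map begins a new chain; the filter is not chained to it
+    .map(s -> s.toLowerCase())           // chained to the trimming map
+    .slotSharingGroup("isolated");       // place the operator in its own slot sharing group
+{% endhighlight %}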
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td>Start new chain</td>
+      <td>
+        <p>Begin a new chain, starting with this operator. The two
+	mappers will be chained, and filter will not be chained to
+	the first mapper.
+{% highlight java %}
+someStream.filter(...).map(...).startNewChain().map(...);
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td>Disable chaining</td>
+      <td>
+        <p>Do not chain the map operator
+{% highlight java %}
+someStream.map(...).disableChaining();
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+    <tr>
+      <td>Set slot sharing group</td>
+      <td>
+        <p>Set the slot sharing group of an operation. Flink will put operations with the same
+        slot sharing group into the same slot while keeping operations that don't have the
+        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
+        group is inherited from input operations if all input operations are in the same slot
+        sharing group.
+        The name of the default slot sharing group is "default"; operations can explicitly
+        be put into this group by calling slotSharingGroup("default").
+{% highlight java %}
+someStream.filter(...).slotSharingGroup("name");
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+
+<div data-lang="scala" markdown="1">
+
+<br />
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Transformation</th>
+      <th class="text-center">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+   <tr>
+      <td>Start new chain</td>
+      <td>
+        <p>Begin a new chain, starting with this operator. The two
+	mappers will be chained, and filter will not be chained to
+	the first mapper.
+{% highlight scala %}
+someStream.filter(...).map(...).startNewChain().map(...)
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+   <tr>
+      <td>Disable chaining</td>
+      <td>
+        <p>Do not chain the map operator
+{% highlight scala %}
+someStream.map(...).disableChaining()
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  <tr>
+      <td>Set slot sharing group</td>
+      <td>
+        <p>Set the slot sharing group of an operation. Flink will put operations with the same
+        slot sharing group into the same slot while keeping operations that don't have the
+        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
+        group is inherited from input operations if all input operations are in the same slot
+        sharing group.
+        The name of the default slot sharing group is "default"; operations can explicitly
+        be put into this group by calling slotSharingGroup("default").
+{% highlight scala %}
+someStream.filter(...).slotSharingGroup("name")
+{% endhighlight %}
+        </p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+</div>
+</div>
+
+
+{% top %}
+


[4/7] flink git commit: [FLINK-7370] [docs] Relocate files according to new structure

Posted by tw...@apache.org.
[FLINK-7370] [docs] Relocate files according to new structure

This closes #4477.


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/31b86f60
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/31b86f60
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/31b86f60

Branch: refs/heads/master
Commit: 31b86f605ae1d2e3523d10a88e697dc4f05aef30
Parents: cafa45e
Author: twalthr <tw...@apache.org>
Authored: Wed Aug 9 12:21:31 2017 +0200
Committer: twalthr <tw...@apache.org>
Committed: Wed Aug 9 13:56:43 2017 +0200

----------------------------------------------------------------------
 docs/concepts/programming-model.md            |    6 +-
 docs/dev/connectors/index.md                  |    2 +-
 docs/dev/datastream_api.md                    |   25 +-
 docs/dev/event_time.md                        |    2 +-
 docs/dev/event_timestamp_extractors.md        |    2 +-
 docs/dev/stream/asyncio.md                    |  253 -----
 docs/dev/stream/operators.md                  | 1169 --------------------
 docs/dev/stream/operators/asyncio.md          |  253 +++++
 docs/dev/stream/operators/index.md            | 1169 ++++++++++++++++++++
 docs/dev/stream/operators/process_function.md |  238 ++++
 docs/dev/stream/operators/windows.md          | 1039 +++++++++++++++++
 docs/dev/stream/process_function.md           |  238 ----
 docs/dev/stream/side_output.md                |    2 +-
 docs/dev/stream/state/checkpointing.md        |    4 +-
 docs/dev/stream/windows.md                    | 1039 -----------------
 docs/ops/state/checkpoints.md                 |    2 +-
 docs/redirects/windows.md                     |   24 -
 docs/redirects/windows_2.md                   |   24 -
 18 files changed, 2727 insertions(+), 2764 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/concepts/programming-model.md
----------------------------------------------------------------------
diff --git a/docs/concepts/programming-model.md b/docs/concepts/programming-model.md
index 926fdd7..cb127c4 100644
--- a/docs/concepts/programming-model.md
+++ b/docs/concepts/programming-model.md
@@ -34,7 +34,7 @@ Flink offers different levels of abstraction to develop streaming/batch applicat
 <img src="../fig/levels_of_abstraction.svg" alt="Programming levels of abstraction" class="offset" width="80%" />
 
   - The lowest level abstraction simply offers **stateful streaming**. It is embedded into the [DataStream API](../dev/datastream_api.html)
-    via the [Process Function](../dev/stream/process_function.html). It allows users freely process events from one or more streams,
+    via the [Process Function](../dev/stream/operators/process_function.html). It allows users to freely process events from one or more streams,
     and use consistent fault tolerant *state*. In addition, users can register event time and processing time callbacks,
     allowing programs to realize sophisticated computations.
 
@@ -82,7 +82,7 @@ Often there is a one-to-one correspondence between the transformations in the pr
 in the dataflow. Sometimes, however, one transformation may consist of multiple transformation operators.
 
 Sources and sinks are documented in the [streaming connectors](../dev/connectors/index.html) and [batch connectors](../dev/batch/connectors.html) docs.
-Transformations are documented in [DataStream operators]({{ site.baseurl }}/dev/stream/operators.html) and [DataSet transformations](../dev/batch/dataset_transformations.html).
+Transformations are documented in [DataStream operators]({{ site.baseurl }}/dev/stream/operators/index.html) and [DataSet transformations](../dev/batch/dataset_transformations.html).
 
 {% top %}
 
@@ -133,7 +133,7 @@ One typically distinguishes different types of windows, such as *tumbling window
 <img src="../fig/windows.svg" alt="Time- and Count Windows" class="offset" width="80%" />
 
 More window examples can be found in this [blog post](https://flink.apache.org/news/2015/12/04/Introducing-windows.html).
-More details are in the [window docs](../dev/windows.html).
+More details are in the [window docs](../dev/stream/operators/windows.html).
 
 {% top %}
 

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/connectors/index.md
----------------------------------------------------------------------
diff --git a/docs/dev/connectors/index.md b/docs/dev/connectors/index.md
index 00c0853..f3ae039 100644
--- a/docs/dev/connectors/index.md
+++ b/docs/dev/connectors/index.md
@@ -71,7 +71,7 @@ Additional streaming connectors for Flink are being released through [Apache Bah
 Using a connector isn't the only way to get data in and out of Flink.
 One common pattern is to query an external database or web service in a `Map` or `FlatMap`
 in order to enrich the primary datastream.
-Flink offers an API for [Asynchronous I/O]({{ site.baseurl }}/dev/stream/asyncio.html)
+Flink offers an API for [Asynchronous I/O]({{ site.baseurl }}/dev/stream/operators/asyncio.html)
 to make it easier to do this kind of enrichment efficiently and robustly.
 
 ### Queryable State

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/datastream_api.md
----------------------------------------------------------------------
diff --git a/docs/dev/datastream_api.md b/docs/dev/datastream_api.md
index b7f02ef..0fc6033 100644
--- a/docs/dev/datastream_api.md
+++ b/docs/dev/datastream_api.md
@@ -38,7 +38,7 @@ to the basic concepts of the Flink API.
 In order to create your own Flink DataStream program, we encourage you to start with
 [anatomy of a Flink Program]({{ site.baseurl }}/dev/api_concepts.html#anatomy-of-a-flink-program)
 and gradually add your own
-[stream transformations]({{ site.baseurl }}/dev/stream/operators.html). The remaining sections act as references for additional
+[stream transformations]({{ site.baseurl }}/dev/stream/operators/index.html). The remaining sections act as references for additional
 operations and advanced features.
 
 
@@ -135,12 +135,6 @@ word count program. If you want to see counts greater than 1, type the same word
 
 {% top %}
 
-DataStream Transformations
---------------------------
-
-Moved. Please see [operators]({{ site.baseurl }}/dev/stream/operators.html) for an overview of the
-available stream transformations.
-
 Data Sources
 ------------
 
@@ -264,6 +258,13 @@ Custom:
 
 {% top %}
 
+DataStream Transformations
+--------------------------
+
+Please see [operators]({{ site.baseurl }}/dev/stream/operators/index.html) for an overview of the available stream transformations.
+
+{% top %}
+
 Data Sinks
 ----------
 
@@ -624,3 +625,13 @@ val myOutput: Iterator[(String, Int)] = DataStreamUtils.collect(myResult.getJava
 </div>
 
 {% top %}
+
+Where to go next?
+-----------------
+
+* [Operators]({{ site.baseurl }}/dev/stream/operators/index.html): Specification of available streaming operators.
+* [Event Time]({{ site.baseurl }}/dev/event_time.html): Introduction to Flink's notion of time.
+* [State & Fault Tolerance]({{ site.baseurl }}/dev/stream/state/index.html): Explanation of how to develop stateful applications.
+* [Connectors]({{ site.baseurl }}/dev/connectors/index.html): Description of available input and output connectors.
+
+{% top %}

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/event_time.md
----------------------------------------------------------------------
diff --git a/docs/dev/event_time.md b/docs/dev/event_time.md
index 3e5120b..70a7812 100644
--- a/docs/dev/event_time.md
+++ b/docs/dev/event_time.md
@@ -205,7 +205,7 @@ causes too much delay in the evaluation of the event time windows.
 
 For this reason, streaming programs may explicitly expect some *late* elements. Late elements are elements that
 arrive after the system's event time clock (as signaled by the watermarks) has already passed the time of the late element's
-timestamp. See [Allowed Lateness]({{ site.baseurl }}/dev/windows.html#allowed-lateness) for more information on how to work
+timestamp. See [Allowed Lateness]({{ site.baseurl }}/dev/stream/operators/windows.html#allowed-lateness) for more information on how to work
 with late elements in event time windows.
 
 

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/event_timestamp_extractors.md
----------------------------------------------------------------------
diff --git a/docs/dev/event_timestamp_extractors.md b/docs/dev/event_timestamp_extractors.md
index 34a27ff..b270491 100644
--- a/docs/dev/event_timestamp_extractors.md
+++ b/docs/dev/event_timestamp_extractors.md
@@ -79,7 +79,7 @@ time for testing. For these cases, Flink provides the `BoundedOutOfOrdernessTime
 the `maxOutOfOrderness`, i.e. the maximum amount of time an element is allowed to be late before being ignored when computing the
 final result for the given window. Lateness corresponds to the result of `t - t_w`, where `t` is the (event-time) timestamp of an
 element, and `t_w` that of the previous watermark. If `lateness > 0` then the element is considered late and is, by default, ignored when computing
-the result of the job for its corresponding window. See the documentation about [allowed lateness]({{ site.baseurl }}/dev/windows.html#allowed-lateness)
+the result of the job for its corresponding window. See the documentation about [allowed lateness]({{ site.baseurl }}/dev/stream/operators/windows.html#allowed-lateness)
 for more information about working with late elements.
 
 <div class="codetabs" markdown="1">

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/asyncio.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/asyncio.md b/docs/dev/stream/asyncio.md
deleted file mode 100644
index ec9c8ba..0000000
--- a/docs/dev/stream/asyncio.md
+++ /dev/null
@@ -1,253 +0,0 @@
----
-title: "Asynchronous I/O for External Data Access"
-nav-title: "Async I/O"
-nav-parent_id: operators
-nav-pos: 60
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-* ToC
-{:toc}
-
-This page explains the use of Flink's API for asynchronous I/O with external data stores.
-For users not familiar with asynchronous or event-driven programming, an article about Futures and
-event-driven programming may be useful preparation.
-
-Note: Details about the design and implementation of the asynchronous I/O utility can be found in the proposal and design document
-[FLIP-12: Asynchronous I/O Design and Implementation](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673).
-
-
-## The need for Asynchronous I/O Operations
-
-When interacting with external systems (for example when enriching stream events with data stored in a database), one needs to take care
-that communication delay with the external system does not dominate the streaming application's total work.
-
-Naively accessing data in the external database, for example in a `MapFunction`, typically means **synchronous** interaction:
-A request is sent to the database and the `MapFunction` waits until the response has been received. In many cases, this waiting
-makes up the vast majority of the function's time.
-
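-A hedged sketch of that synchronous anti-pattern, assuming a hypothetical blocking
-`DatabaseClient` (the names mirror the asynchronous example further down this page):
-
-{% highlight java %}
-class SyncDatabaseRequest extends RichMapFunction<String, Tuple2<String, String>> {
-
-    private transient DatabaseClient client;
-
-    @Override
-    public void open(Configuration parameters) throws Exception {
-        client = new DatabaseClient(host, port, credentials);
-    }
-
-    @Override
-    public Tuple2<String, String> map(String str) throws Exception {
-        // BLOCKING: the function waits for each response before it can process the next record
-        String result = client.blockingQuery(str);
-        return new Tuple2<>(str, result);
-    }
-}
-{% endhighlight %}
-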
-Asynchronous interaction with the database means that a single parallel function instance can handle many requests concurrently and
-receive the responses concurrently. That way, the waiting time can be overlapped with sending other requests and
-receiving responses. At the very least, the waiting time is amortized over multiple requests. This leads in most cases to much higher
-streaming throughput.
-
-<img src="../../fig/async_io.svg" class="center" width="50%" />
-
-*Note:* Improving throughput by just scaling the `MapFunction` to a very high parallelism is in some cases possible as well, but usually
-comes at a very high resource cost: Having many more parallel MapFunction instances means more tasks, threads, Flink-internal network
-connections, network connections to the database, buffers, and general internal bookkeeping overhead.
-
-
-## Prerequisites
-
-As illustrated in the section above, implementing proper asynchronous I/O to a database (or key/value store) requires a client
-to that database that supports asynchronous requests. Many popular databases offer such a client.
-
-In the absence of such a client, one can try and turn a synchronous client into a limited concurrent client by creating
-multiple clients and handling the synchronous calls with a thread pool. However, this approach is usually less
-efficient than a proper asynchronous client.
-
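-A hedged sketch of that workaround, wrapping a hypothetical blocking client call in a
-thread pool via Java 8 futures (the client and its `blockingQuery` method are illustrative):
-
-{% highlight java %}
-ExecutorService pool = Executors.newFixedThreadPool(10);
-
-// turn the blocking call into a future, at the price of one pooled thread per in-flight request
-Future<String> resultFuture = CompletableFuture.supplyAsync(() -> client.blockingQuery(str), pool);
-{% endhighlight %}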
-
-## Async I/O API
-
-Flink's Async I/O API allows users to use asynchronous request clients with data streams. The API handles the integration with
-data streams, as well as handling order, event time, fault tolerance, etc.
-
-Assuming one has an asynchronous client for the target database, three parts are needed to implement a stream transformation
-with asynchronous I/O against the database:
-
-  - An implementation of `AsyncFunction` that dispatches the requests
-  - A *callback* that takes the result of the operation and hands it to the `AsyncCollector`
-  - Applying the async I/O operation on a DataStream as a transformation
-
-The following code example illustrates the basic pattern:
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-{% highlight java %}
-// This example implements the asynchronous request and callback with Futures that have the
-// interface of Java 8's futures (which is the same one followed by Flink's Future)
-
-/**
- * An implementation of the 'AsyncFunction' that sends requests and sets the callback.
- */
-class AsyncDatabaseRequest extends RichAsyncFunction<String, Tuple2<String, String>> {
-
-    /** The database specific client that can issue concurrent requests with callbacks */
-    private transient DatabaseClient client;
-
-    @Override
-    public void open(Configuration parameters) throws Exception {
-        client = new DatabaseClient(host, port, credentials);
-    }
-
-    @Override
-    public void close() throws Exception {
-        client.close();
-    }
-
-    @Override
-    public void asyncInvoke(final String str, final AsyncCollector<Tuple2<String, String>> asyncCollector) throws Exception {
-
-        // issue the asynchronous request, receive a future for result
-        Future<String> resultFuture = client.query(str);
-
-        // set the callback to be executed once the request by the client is complete
-        // the callback simply forwards the result to the collector
-        resultFuture.thenAccept( (String result) -> {
-
-            asyncCollector.collect(Collections.singleton(new Tuple2<>(str, result)));
-         
-        });
-    }
-}
-
-// create the original stream
-DataStream<String> stream = ...;
-
-// apply the async I/O transformation
-DataStream<Tuple2<String, String>> resultStream =
-    AsyncDataStream.unorderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100);
-
-{% endhighlight %}
-</div>
-<div data-lang="scala" markdown="1">
-{% highlight scala %}
-/**
- * An implementation of the 'AsyncFunction' that sends requests and sets the callback.
- */
-class AsyncDatabaseRequest extends AsyncFunction[String, (String, String)] {
-
-    /** The database specific client that can issue concurrent requests with callbacks */
-    lazy val client: DatabaseClient = new DatabaseClient(host, port, credentials)
-
-    /** The context used for the future callbacks */
-    implicit lazy val executor: ExecutionContext = ExecutionContext.fromExecutor(Executors.directExecutor())
-
-
-    override def asyncInvoke(str: String, asyncCollector: AsyncCollector[(String, String)]): Unit = {
-
-        // issue the asynchronous request, receive a future for the result
-        val resultFuture: Future[String] = client.query(str)
-
-        // set the callback to be executed once the request by the client is complete
-        // the callback simply forwards the result to the collector
-        resultFuture.onSuccess {
-            case result: String => asyncCollector.collect(Iterable((str, result)));
-        }
-    }
-}
-
-// create the original stream
-val stream: DataStream[String] = ...
-
-// apply the async I/O transformation
-val resultStream: DataStream[(String, String)] =
-    AsyncDataStream.unorderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100)
-
-{% endhighlight %}
-</div>
-</div>
-
-**Important note**: The `AsyncCollector` is completed with the first call of `AsyncCollector.collect`.
-All subsequent `collect` calls will be ignored.
-
-The following two parameters control the asynchronous operations (a parameterized call is sketched after this list):
-
-  - **Timeout**: The timeout defines how long an asynchronous request may take before it is considered failed. This parameter
-    guards against dead/failed requests.
-
-  - **Capacity**: This parameter defines how many asynchronous requests may be in progress at the same time.
-    Even though the async I/O approach leads typically to much better throughput, the operator can still be the bottleneck in
-    the streaming application. Limiting the number of concurrent requests ensures that the operator will not
-    accumulate an ever-growing backlog of pending requests, but that it will trigger backpressure once the capacity
-    is exhausted.
-
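-A hedged sketch of how the two parameters map onto the `unorderedWait(...)` call from the
-example above:
-
-{% highlight java %}
-DataStream<Tuple2<String, String>> resultStream =
-    AsyncDataStream.unorderedWait(
-        stream,
-        new AsyncDatabaseRequest(),
-        1000, TimeUnit.MILLISECONDS,  // timeout: requests taking longer than one second count as failed
-        100);                         // capacity: at most 100 asynchronous requests in flight at once
-{% endhighlight %}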
-
-### Order of Results
-
-The concurrent requests issued by the `AsyncFunction` frequently complete in some undefined order, based on which request finished first.
-To control in which order the resulting records are emitted, Flink offers two modes:
-
-  - **Unordered**: Result records are emitted as soon as the asynchronous request finishes.
-    The order of the records in the stream is different after the async I/O operator than before.
-    This mode has the lowest latency and lowest overhead, when used with *processing time* as the basic time characteristic.
-    Use `AsyncDataStream.unorderedWait(...)` for this mode.
-
-  - **Ordered**: In that case, the stream order is preserved. Result records are emitted in the same order as the asynchronous
-    requests are triggered (the order of the operators input records). To achieve that, the operator buffers a result record
-    until all its preceding records are emitted (or timed out).
-    This usually introduces some amount of extra latency and some overhead in checkpointing, because records or results are maintained
-    in the checkpointed state for a longer time, compared to the unordered mode.
-    Use `AsyncDataStream.orderedWait(...)` for this mode; its call is sketched after this list.
-
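-A hedged one-line sketch of the ordered variant, using the same parameters as the example
-above:
-
-{% highlight java %}
-DataStream<Tuple2<String, String>> resultStream =
-    AsyncDataStream.orderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100);
-{% endhighlight %}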
-
-### Event Time
-
-When the streaming application works with [event time](../event_time.html), watermarks will be handled correctly by the
-asynchronous I/O operator. That means concretely the following for the two order modes:
-
-  - **Unordered**: Watermarks do not overtake records and vice versa, meaning watermarks establish an *order boundary*.
-    Records are emitted unordered only between watermarks.
-    A record occurring after a certain watermark will be emitted only after that watermark was emitted.
-    The watermark in turn will be emitted only after all result records from inputs before that watermark were emitted.
-
-    That means that in the presence of watermarks, the *unordered* mode introduces some of the same latency and management
-    overhead as the *ordered* mode does. The amount of that overhead depends on the watermark frequency.
-
-  - **Ordered**: Order of watermarks and records is preserved, just like order between records is preserved. There is no
-    significant change in overhead, compared to working with *processing time*.
-
-Please recall that *Ingestion Time* is a special case of *event time* with automatically generated watermarks that
-are based on the source's processing time.
-
-
-### Fault Tolerance Guarantees
-
-The asynchronous I/O operator offers full exactly-once fault tolerance guarantees. It stores the records for in-flight
-asynchronous requests in checkpoints and restores/re-triggers the requests when recovering from a failure.
-
-
-### Implementation Tips
-
-For implementations with *Futures* that have an *Executor* (or *ExecutionContext* in Scala) for callbacks, we suggest using a `DirectExecutor`, because the
-callback typically does minimal work, and a `DirectExecutor` avoids an additional thread-to-thread handover overhead. The callback typically only hands
-the result to the `AsyncCollector`, which adds it to the output buffer. From there, the heavy logic that includes record emission and interaction
-with the checkpoint bookkeeping happens in a dedicated thread pool anyway.
-
-A `DirectExecutor` can be obtained via `org.apache.flink.runtime.concurrent.Executors.directExecutor()` or
-`com.google.common.util.concurrent.MoreExecutors.directExecutor()`.
-
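-A hedged sketch of wiring a direct executor into the callback of the example above
-(assuming, as the example does, a future that follows Java 8's `CompletableFuture` interface):
-
-{% highlight java %}
-Executor directExecutor = org.apache.flink.runtime.concurrent.Executors.directExecutor();
-
-// run the cheap callback on the completing thread instead of handing over to another thread
-resultFuture.thenAcceptAsync(
-    (String result) -> asyncCollector.collect(Collections.singleton(new Tuple2<>(str, result))),
-    directExecutor);
-{% endhighlight %}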
-
-### Caveat
-
-**The AsyncFunction is not called in a multi-threaded fashion**
-
-A common confusion that we want to explicitly point out here is that the `AsyncFunction` is not called in a multi-threaded fashion.
-There exists only one instance of the `AsyncFunction` and it is called sequentially for each record in the respective partition
-of the stream. Unless the `asyncInvoke(...)` method returns fast and relies on a callback (by the client), it will not result in
-proper asynchronous I/O.
-
-For example, the following patterns result in a blocking `asyncInvoke(...)` function and thus void the asynchronous behavior (the second pattern is sketched after this list):
-
-  - Using a database client whose lookup/query method call blocks until the result has been received back
-
-  - Blocking/waiting on the future-type objects returned by an asynchronous client inside the `asyncInvoke(...)` method
-
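-A hedged sketch of the second anti-pattern, to make the caveat concrete (do *not* do this):
-
-{% highlight java %}
-@Override
-public void asyncInvoke(String str, AsyncCollector<Tuple2<String, String>> asyncCollector) throws Exception {
-    Future<String> resultFuture = client.query(str);
-
-    // BLOCKING: waiting on the future serializes all requests and voids the asynchronous behavior
-    String result = resultFuture.get();
-
-    asyncCollector.collect(Collections.singleton(new Tuple2<>(str, result)));
-}
-{% endhighlight %}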

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/operators.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/operators.md b/docs/dev/stream/operators.md
deleted file mode 100644
index 70bd9ae..0000000
--- a/docs/dev/stream/operators.md
+++ /dev/null
@@ -1,1169 +0,0 @@
----
-title: "Operators"
-nav-id: operators
-nav-show_overview: true
-nav-parent_id: streaming
-nav-pos: 9
----
-<!--
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-Operators transform one or more DataStreams into a new DataStream. Programs can combine
-multiple transformations into sophisticated topologies.
-
-This section gives a description of all the available transformations, the effective physical
-partitioning after applying those as well as insights into Flink's operator chaining.
-
-* toc
-{:toc}
-
-# DataStream Transformations
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 25%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
-    {% highlight java %}
-DataStream<Integer> dataStream = //...
-dataStream.map(new MapFunction<Integer, Integer>() {
-    @Override
-    public Integer map(Integer value) throws Exception {
-        return 2 * value;
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-
-        <tr>
-          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:</p>
-    {% highlight java %}
-dataStream.flatMap(new FlatMapFunction<String, String>() {
-    @Override
-    public void flatMap(String value, Collector<String> out)
-        throws Exception {
-        for(String word: value.split(" ")){
-            out.collect(word);
-        }
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
-            A filter that filters out zero values:
-            </p>
-    {% highlight java %}
-dataStream.filter(new FilterFunction<Integer>() {
-    @Override
-    public boolean filter(Integer value) throws Exception {
-        return value != 0;
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
-          <td>
-            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
-            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
-            This transformation returns a KeyedStream.</p>
-    {% highlight java %}
-dataStream.keyBy("someKey") // Key by field "someKey"
-dataStream.keyBy(0) // Key by the first element of a Tuple
-    {% endhighlight %}
-            <p>
-            <span class="label label-danger">Attention</span>
-            A type <strong>cannot be a key</strong> if:
-    	    <ol>
-    	    <li> it is a POJO type but does not override the <em>hashCode()</em> method and
-    	    relies on the <em>Object.hashCode()</em> implementation.</li>
-    	    <li> it is an array of any type.</li>
-    	    </ol>
-    	    </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
-            emits the new value.
-                    <br/>
-            	<br/>
-            A reduce function that creates a stream of partial sums:</p>
-            {% highlight java %}
-keyedStream.reduce(new ReduceFunction<Integer>() {
-    @Override
-    public Integer reduce(Integer value1, Integer value2)
-    throws Exception {
-        return value1 + value2;
-    }
-});
-            {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-          <p>A "rolling" fold on a keyed data stream with an initial value.
-          Combines the current element with the last folded value and
-          emits the new value.
-          <br/>
-          <br/>
-          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
-          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
-          {% highlight java %}
-DataStream<String> result =
-  keyedStream.fold("start", new FoldFunction<Integer, String>() {
-    @Override
-    public String fold(String current, Integer value) {
-        return current + "-" + value;
-    }
-  });
-          {% endhighlight %}
-          </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>Rolling aggregations on a keyed data stream. The difference between min
-	    and minBy is that min returns the minimum value, whereas minBy returns
-	    the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight java %}
-keyedStream.sum(0);
-keyedStream.sum("key");
-keyedStream.min(0);
-keyedStream.min("key");
-keyedStream.max(0);
-keyedStream.max("key");
-keyedStream.minBy(0);
-keyedStream.minBy("key");
-keyedStream.maxBy(0);
-keyedStream.maxBy("key");
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
-          <td>
-            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
-            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-            See <a href="windows.html">windows</a> for a complete description of windows.
-    {% highlight java %}
-dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
-    {% endhighlight %}
-        </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
-          <td>
-              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
-              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-              See <a href="windows.html">windows</a> for a complete description of windows.</p>
-              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
-               gathered in one task for the windowAll operator.</p>
-  {% highlight java %}
-dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))); // Last 5 seconds of data
-  {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
-            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
-    {% highlight java %}
-windowedStream.apply (new WindowFunction<Tuple2<String,Integer>, Integer, Tuple, Window>() {
-    public void apply (Tuple tuple,
-            Window window,
-            Iterable<Tuple2<String, Integer>> values,
-            Collector<Integer> out) throws Exception {
-        int sum = 0;
-        for (Tuple2<String, Integer> t: values) {
-            sum += t.f1;
-        }
-        out.collect (new Integer(sum));
-    }
-});
-
-// applying an AllWindowFunction on non-keyed window stream
-allWindowedStream.apply (new AllWindowFunction<Tuple2<String,Integer>, Integer, Window>() {
-    public void apply (Window window,
-            Iterable<Tuple2<String, Integer>> values,
-            Collector<Integer> out) throws Exception {
-        int sum = 0;
-        for (Tuple2<String, Integer> t: values) {
-            sum += t.f1;
-        }
-        out.collect (new Integer(sum));
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
-    {% highlight java %}
-windowedStream.reduce (new ReduceFunction<Tuple2<String,Integer>>() {
-    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> value1, Tuple2<String, Integer> value2) throws Exception {
-        return new Tuple2<String,Integer>(value1.f0, value1.f1 + value2.f1);
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional fold function to the window and returns the folded value.
-               The example function, when applied on the sequence (1,2,3,4,5),
-               folds the sequence into the string "start-1-2-3-4-5":</p>
-    {% highlight java %}
-windowedStream.fold("start", new FoldFunction<Integer, String>() {
-    public String fold(String current, Integer value) {
-        return current + "-" + value;
-    }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Aggregates the contents of a window. The difference between min
-            and minBy is that min returns the minimum value, whereas minBy returns
-            the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight java %}
-windowedStream.sum(0);
-windowedStream.sum("key");
-windowedStream.min(0);
-windowedStream.min("key");
-windowedStream.max(0);
-windowedStream.max("key");
-windowedStream.minBy(0);
-windowedStream.minBy("key");
-windowedStream.maxBy(0);
-windowedStream.maxBy("key");
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
-          <td>
-            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
-            with itself you will get each element twice in the resulting stream.</p>
-    {% highlight java %}
-dataStream.union(otherStream1, otherStream2, ...);
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Join two data streams on a given key and a common window.</p>
-    {% highlight java %}
-dataStream.join(otherStream)
-    .where(<key selector>).equalTo(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply (new JoinFunction () {...});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Cogroups two data streams on a given key and a common window.</p>
-    {% highlight java %}
-dataStream.coGroup(otherStream)
-    .where(0).equalTo(1)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply (new CoGroupFunction () {...});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
-          <td>
-            <p>"Connects" two data streams retaining their types. Connect allowing for shared state between
-            the two streams.</p>
-    {% highlight java %}
-DataStream<Integer> someStream = //...
-DataStream<String> otherStream = //...
-
-ConnectedStreams<Integer, String> connectedStreams = someStream.connect(otherStream);
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
-          <td>
-            <p>Similar to map and flatMap on a connected data stream</p>
-    {% highlight java %}
-connectedStreams.map(new CoMapFunction<Integer, String, Boolean>() {
-    @Override
-    public Boolean map1(Integer value) {
-        return true;
-    }
-
-    @Override
-    public Boolean map2(String value) {
-        return false;
-    }
-});
-connectedStreams.flatMap(new CoFlatMapFunction<Integer, String, String>() {
-
-   @Override
-   public void flatMap1(Integer value, Collector<String> out) {
-       out.collect(value.toString());
-   }
-
-   @Override
-   public void flatMap2(String value, Collector<String> out) {
-       for (String word: value.split(" ")) {
-         out.collect(word);
-       }
-   }
-});
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
-          <td>
-            <p>
-                Split the stream into two or more streams according to some criterion.
-                {% highlight java %}
-SplitStream<Integer> split = someDataStream.split(new OutputSelector<Integer>() {
-    @Override
-    public Iterable<String> select(Integer value) {
-        List<String> output = new ArrayList<String>();
-        if (value % 2 == 0) {
-            output.add("even");
-        }
-        else {
-            output.add("odd");
-        }
-        return output;
-    }
-});
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Select one or more streams from a split stream.
-                {% highlight java %}
-SplitStream<Integer> split;
-DataStream<Integer> even = split.select("even");
-DataStream<Integer> odd = split.select("odd");
-DataStream<Integer> all = split.select("even","odd");
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Creates a "feedback" loop in the flow, by redirecting the output of one operator
-                to some previous operator. This is especially useful for defining algorithms that
-                continuously update a model. The following code starts with a stream and applies
-		the iteration body continuously. Elements that are greater than 0 are sent back
-		to the feedback channel, and the rest of the elements are forwarded downstream.
-		See <a href="#iterations">iterations</a> for a complete description.
-                {% highlight java %}
-IterativeStream<Long> iteration = initialStream.iterate();
-DataStream<Long> iterationBody = iteration.map (/*do something*/);
-DataStream<Long> feedback = iterationBody.filter(new FilterFunction<Long>(){
-    @Override
-    public boolean filter(Long value) throws Exception {
-        return value > 0;
-    }
-});
-iteration.closeWith(feedback);
-DataStream<Long> output = iterationBody.filter(new FilterFunction<Long>(){
-    @Override
-    public boolean filter(Long value) throws Exception {
-        return value <= 0;
-    }
-});
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Extracts timestamps from records in order to work with windows
-                that use event time semantics. See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
-                {% highlight java %}
-stream.assignTimestamps (new TimestampExtractor() {...});
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-  </tbody>
-</table>
-
-</div>
-
-<div data-lang="scala" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 25%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-    <tr>
-          <td><strong>Map</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces one element. A map function that doubles the values of the input stream:</p>
-    {% highlight scala %}
-dataStream.map { x => x * 2 }
-    {% endhighlight %}
-          </td>
-        </tr>
-
-        <tr>
-          <td><strong>FlatMap</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Takes one element and produces zero, one, or more elements. A flatmap function that splits sentences to words:</p>
-    {% highlight scala %}
-dataStream.flatMap { str => str.split(" ") }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Filter</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>Evaluates a boolean function for each element and retains those for which the function returns true.
-            A filter that filters out zero values:
-            </p>
-    {% highlight scala %}
-dataStream.filter { _ != 0 }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>KeyBy</strong><br>DataStream &rarr; KeyedStream</td>
-          <td>
-            <p>Logically partitions a stream into disjoint partitions, each partition containing elements of the same key.
-            Internally, this is implemented with hash partitioning. See <a href="{{ site.baseurl }}/dev/api_concepts.html#specifying-keys">keys</a> on how to specify keys.
-            This transformation returns a KeyedStream.</p>
-    {% highlight scala %}
-dataStream.keyBy("someKey") // Key by field "someKey"
-dataStream.keyBy(0) // Key by the first element of a Tuple
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Reduce</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>A "rolling" reduce on a keyed data stream. Combines the current element with the last reduced value and
-            emits the new value.
-                    <br/>
-            	<br/>
-            A reduce function that creates a stream of partial sums:</p>
-            {% highlight scala %}
-keyedStream.reduce { _ + _ }
-            {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Fold</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-          <p>A "rolling" fold on a keyed data stream with an initial value.
-          Combines the current element with the last folded value and
-          emits the new value.
-          <br/>
-          <br/>
-          <p>A fold function that, when applied on the sequence (1,2,3,4,5),
-          emits the sequence "start-1", "start-1-2", "start-1-2-3", ...</p>
-          {% highlight scala %}
-val result: DataStream[String] =
-    keyedStream.fold("start")((str, i) => { str + "-" + i })
-          {% endhighlight %}
-          </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Aggregations</strong><br>KeyedStream &rarr; DataStream</td>
-          <td>
-            <p>Rolling aggregations on a keyed data stream. The difference between min
-            and minBy is that min returns the minimum value, whereas minBy returns
-            the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight scala %}
-keyedStream.sum(0)
-keyedStream.sum("key")
-keyedStream.min(0)
-keyedStream.min("key")
-keyedStream.max(0)
-keyedStream.max("key")
-keyedStream.minBy(0)
-keyedStream.minBy("key")
-keyedStream.maxBy(0)
-keyedStream.maxBy("key")
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window</strong><br>KeyedStream &rarr; WindowedStream</td>
-          <td>
-            <p>Windows can be defined on already partitioned KeyedStreams. Windows group the data in each
-            key according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-            See <a href="windows.html">windows</a> for a description of windows.
-    {% highlight scala %}
-dataStream.keyBy(0).window(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
-    {% endhighlight %}
-        </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>WindowAll</strong><br>DataStream &rarr; AllWindowedStream</td>
-          <td>
-              <p>Windows can be defined on regular DataStreams. Windows group all the stream events
-              according to some characteristic (e.g., the data that arrived within the last 5 seconds).
-              See <a href="windows.html">windows</a> for a complete description of windows.</p>
-              <p><strong>WARNING:</strong> This is in many cases a <strong>non-parallel</strong> transformation. All records will be
-               gathered in one task for the windowAll operator.</p>
-  {% highlight scala %}
-dataStream.windowAll(TumblingEventTimeWindows.of(Time.seconds(5))) // Last 5 seconds of data
-  {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Apply</strong><br>WindowedStream &rarr; DataStream<br>AllWindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a general function to the window as a whole. Below is a function that manually sums the elements of a window.</p>
-            <p><strong>Note:</strong> If you are using a windowAll transformation, you need to use an AllWindowFunction instead.</p>
-    {% highlight scala %}
-windowedStream.apply { WindowFunction }
-
-// applying an AllWindowFunction on non-keyed window stream
-allWindowedStream.apply { AllWindowFunction }
-
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Reduce</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional reduce function to the window and returns the reduced value.</p>
-    {% highlight scala %}
-windowedStream.reduce { _ + _ }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Fold</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Applies a functional fold function to the window and returns the folded value.
-               The example function, when applied on the sequence (1,2,3,4,5),
-               folds the sequence into the string "start-1-2-3-4-5":</p>
-          {% highlight scala %}
-val result: DataStream[String] =
-    windowedStream.fold("start", (str, i) => { str + "-" + i })
-          {% endhighlight %}
-          </td>
-	</tr>
-        <tr>
-          <td><strong>Aggregations on windows</strong><br>WindowedStream &rarr; DataStream</td>
-          <td>
-            <p>Aggregates the contents of a window. The difference between min
-	    and minBy is that min returns the minimum value, whereas minBy returns
-	    the element that has the minimum value in this field (same for max and maxBy).</p>
-    {% highlight scala %}
-windowedStream.sum(0)
-windowedStream.sum("key")
-windowedStream.min(0)
-windowedStream.min("key")
-windowedStream.max(0)
-windowedStream.max("key")
-windowedStream.minBy(0)
-windowedStream.minBy("key")
-windowedStream.maxBy(0)
-windowedStream.maxBy("key")
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Union</strong><br>DataStream* &rarr; DataStream</td>
-          <td>
-            <p>Union of two or more data streams creating a new stream containing all the elements from all the streams. Note: If you union a data stream
-            with itself you will get each element twice in the resulting stream.</p>
-    {% highlight scala %}
-dataStream.union(otherStream1, otherStream2, ...)
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window Join</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Join two data streams on a given key and a common window.</p>
-    {% highlight scala %}
-dataStream.join(otherStream)
-    .where(<key selector>).equalTo(<key selector>)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply { ... }
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Window CoGroup</strong><br>DataStream,DataStream &rarr; DataStream</td>
-          <td>
-            <p>Cogroups two data streams on a given key and a common window.</p>
-    {% highlight scala %}
-dataStream.coGroup(otherStream)
-    .where(0).equalTo(1)
-    .window(TumblingEventTimeWindows.of(Time.seconds(3)))
-    .apply {}
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Connect</strong><br>DataStream,DataStream &rarr; ConnectedStreams</td>
-          <td>
-            <p>"Connects" two data streams retaining their types, allowing for shared state between
-            the two streams.</p>
-    {% highlight scala %}
-val someStream : DataStream[Int] = ...
-val otherStream : DataStream[String] = ...
-
-val connectedStreams = someStream.connect(otherStream)
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>CoMap, CoFlatMap</strong><br>ConnectedStreams &rarr; DataStream</td>
-          <td>
-            <p>Similar to map and flatMap on a connected data stream</p>
-    {% highlight scala %}
-connectedStreams.map(
-    (_ : Int) => true,
-    (_ : String) => false
-)
-connectedStreams.flatMap(
-    (num : Int) => List(num.toString),
-    (str : String) => str.split(" ").toList
-)
-    {% endhighlight %}
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Split</strong><br>DataStream &rarr; SplitStream</td>
-          <td>
-            <p>
-                Split the stream into two or more streams according to some criterion.
-                {% highlight scala %}
-val split = someDataStream.split(
-  (num: Int) =>
-    (num % 2) match {
-      case 0 => List("even")
-      case 1 => List("odd")
-    }
-)
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Select</strong><br>SplitStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Select one or more streams from a split stream.
-                {% highlight scala %}
-
-val even = split select "even"
-val odd = split select "odd"
-val all = split.select("even","odd")
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Iterate</strong><br>DataStream &rarr; IterativeStream  &rarr; DataStream</td>
-          <td>
-            <p>
-                Creates a "feedback" loop in the flow, by redirecting the output of one operator
-                to some previous operator. This is especially useful for defining algorithms that
-                continuously update a model. The following code starts with a stream and applies
-		the iteration body continuously. Elements that are greater than 0 are sent back
-		to the feedback channel, and the rest of the elements are forwarded downstream.
-		See <a href="#iterations">iterations</a> for a complete description.
-                {% highlight scala %}
-initialStream.iterate {
-  iteration => {
-    val iterationBody = iteration.map {/*do something*/}
-    (iterationBody.filter(_ > 0), iterationBody.filter(_ <= 0))
-  }
-}
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-        <tr>
-          <td><strong>Extract Timestamps</strong><br>DataStream &rarr; DataStream</td>
-          <td>
-            <p>
-                Extracts timestamps from records in order to work with windows
-                that use event time semantics.
-                See <a href="{{ site.baseurl }}/dev/event_time.html">Event Time</a>.
-                {% highlight scala %}
-stream.assignTimestamps { timestampExtractor }
-                {% endhighlight %}
-            </p>
-          </td>
-        </tr>
-  </tbody>
-</table>
-
-Extraction from tuples, case classes and collections via anonymous pattern matching, like the following:
-{% highlight scala %}
-val data: DataStream[(Int, String, Double)] = // [...]
-data.map {
-  case (id, name, temperature) => // [...]
-}
-{% endhighlight %}
-is not supported by the API out-of-the-box. To use this feature, you should use a <a href="scala_api_extensions.html">Scala API extension</a>.
-
-
-</div>
-</div>
-
-The following transformations are available on data streams of Tuples:
-
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td><strong>Project</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>Selects a subset of fields from the tuples
-{% highlight java %}
-DataStream<Tuple3<Integer, Double, String>> in = // [...]
-DataStream<Tuple2<String, Integer>> out = in.project(2,0);
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-</div>
-
-
-# Physical partitioning
-
-Flink also gives low-level control (if desired) over the exact stream partitioning after a transformation,
-via the following functions.
-
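-The custom partitioning entry below relies on a user-defined `Partitioner`. A minimal hedged
-sketch of one (the modulo routing is illustrative, not from the original text):
-
-{% highlight java %}
-Partitioner<Integer> partitioner = new Partitioner<Integer>() {
-    @Override
-    public int partition(Integer key, int numPartitions) {
-        // derive a non-negative task index from the key, even for negative keys
-        return Math.floorMod(key, numPartitions);
-    }
-};
-{% endhighlight %}
-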
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Uses a user-defined Partitioner to select the target task for each element.
-            {% highlight java %}
-dataStream.partitionCustom(partitioner, "someKey");
-dataStream.partitionCustom(partitioner, 0);
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
-     <td>
-       <p>
-            Partitions elements randomly according to a uniform distribution.
-            {% highlight java %}
-dataStream.shuffle();
-            {% endhighlight %}
-       </p>
-     </td>
-   </tr>
-   <tr>
-      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements round-robin, creating equal load per partition. Useful for performance
-            optimization in the presence of data skew.
-            {% highlight java %}
-dataStream.rebalance();
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-    <tr>
-      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements, round-robin, to a subset of downstream operations. This is
-            useful if you want to have pipelines where you, for example, fan out from
-            each parallel instance of a source to a subset of several mappers to distribute load
-            but don't want the full rebalance that rebalance() would incur. This would require only
-            local data transfers instead of transferring data over the network, depending on
-            other configuration values such as the number of slots of TaskManagers.
-        </p>
-        <p>
-            The subset of downstream operations to which the upstream operation sends
-            elements depends on the degree of parallelism of both the upstream and downstream operation.
-            For example, if the upstream operation has parallelism 2 and the downstream operation
-            has parallelism 6, then one upstream operation would distribute elements to three
-            downstream operations while the other upstream operation would distribute to the other
-            three downstream operations. If, on the other hand, the downstream operation has parallelism
-            2 while the upstream operation has parallelism 6, then three upstream operations would
-            distribute to one downstream operation while the other three upstream operations would
-            distribute to the other downstream operation.
-        </p>
-        <p>
-            In cases where the different parallelisms are not multiples of each other, one or several
-            downstream operations will have a differing number of inputs from upstream operations.
-        </p>
-        <p>
-            Please see this figure for a visualization of the connection pattern in the above
-            example:
-        </p>
-
-        <div style="text-align: center">
-            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Rescale connection pattern" />
-            </div>
-
-
-        <p>
-                    {% highlight java %}
-dataStream.rescale();
-            {% endhighlight %}
-
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Broadcasts elements to every partition.
-            {% highlight java %}
-dataStream.broadcast();
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-
-<div data-lang="scala" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td><strong>Custom partitioning</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Uses a user-defined Partitioner to select the target task for each element.
-            {% highlight scala %}
-dataStream.partitionCustom(partitioner, "someKey")
-dataStream.partitionCustom(partitioner, 0)
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-     <td><strong>Random partitioning</strong><br>DataStream &rarr; DataStream</td>
-     <td>
-       <p>
-            Partitions elements randomly according to a uniform distribution.
-            {% highlight scala %}
-dataStream.shuffle()
-            {% endhighlight %}
-       </p>
-     </td>
-   </tr>
-   <tr>
-      <td><strong>Rebalancing (Round-robin partitioning)</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements round-robin, creating equal load per partition. Useful for performance
-            optimization in the presence of data skew.
-            {% highlight scala %}
-dataStream.rebalance()
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-    <tr>
-      <td><strong>Rescaling</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Partitions elements, round-robin, to a subset of downstream operations. This is
-            useful if you want to have pipelines where you, for example, fan out from
-            each parallel instance of a source to a subset of several mappers to distribute load,
-            but don't want the full rebalance that rebalance() would incur. This requires only
-            local data transfers instead of transferring data over the network, depending on
-            other configuration values such as the number of slots of the TaskManagers.
-        </p>
-        <p>
-            The subset of downstream operations to which the upstream operation sends
-            elements depends on the degree of parallelism of both the upstream and downstream operation.
-            For example, if the upstream operation has parallelism 2 and the downstream operation
-            has parallelism 4, then one upstream operation would distribute elements to two
-            downstream operations while the other upstream operation would distribute to the other
-            two downstream operations. If, on the other hand, the downstream operation has parallelism
-            2 while the upstream operation has parallelism 4 then two upstream operations would
-            distribute to one downstream operation while the other two upstream operations would
-            distribute to the other downstream operation.
-        </p>
-        <p>
-            In cases where the different parallelisms are not multiples of each other one or several
-            downstream operations will have a differing number of inputs from upstream operations.
-        </p>
-        <p>
-            Please see this figure for a visualization of the connection pattern in the above
-            example:
-        </p>
-
-        <div style="text-align: center">
-            <img src="{{ site.baseurl }}/fig/rescale.svg" alt="Checkpoint barriers in data streams" />
-            </div>
-
-
-        <p>
-            {% highlight scala %}
-dataStream.rescale()
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td><strong>Broadcasting</strong><br>DataStream &rarr; DataStream</td>
-      <td>
-        <p>
-            Broadcasts elements to every partition.
-            {% highlight scala %}
-dataStream.broadcast()
-            {% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-</div>
-
-# Task chaining and resource groups
-
-Chaining two subsequent transformations means co-locating them within the same thread for better
-performance. Flink by default chains operators if this is possible (e.g., two subsequent map
-transformations). The API gives fine-grained control over chaining if desired:
-
-Use `StreamExecutionEnvironment.disableOperatorChaining()` if you want to disable chaining in
-the whole job. For more fine-grained control, the following functions are available. Note that
-these functions can only be used right after a DataStream transformation, as they refer to the
-previous transformation. For example, you can use `someStream.map(...).startNewChain()`, but
-you cannot use `someStream.startNewChain()`.
-
-A resource group is a slot in Flink, see
-[slots]({{site.baseurl}}/setup/config.html#configuring-taskmanager-processing-slots). You can
-manually isolate operators in separate slots if desired.
-
-<div class="codetabs" markdown="1">
-<div data-lang="java" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td>Start new chain</td>
-      <td>
-        <p>Begin a new chain, starting with this operator. The two
-	mappers will be chained, and filter will not be chained to
-	the first mapper.
-{% highlight java %}
-someStream.filter(...).map(...).startNewChain().map(...);
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td>Disable chaining</td>
-      <td>
-        <p>Do not chain the map operator.
-{% highlight java %}
-someStream.map(...).disableChaining();
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-    <tr>
-      <td>Set slot sharing group</td>
-      <td>
-        <p>Set the slot sharing group of an operation. Flink will put operations with the same
-        slot sharing group into the same slot while keeping operations that don't have the
-        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
-        group is inherited from input operations if all input operations are in the same slot
-        sharing group.
-        The name of the default slot sharing group is "default"; operations can explicitly
-        be put into this group by calling slotSharingGroup("default").
-{% highlight java %}
-someStream.filter(...).slotSharingGroup("name");
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-
-<div data-lang="scala" markdown="1">
-
-<br />
-
-<table class="table table-bordered">
-  <thead>
-    <tr>
-      <th class="text-left" style="width: 20%">Transformation</th>
-      <th class="text-center">Description</th>
-    </tr>
-  </thead>
-  <tbody>
-   <tr>
-      <td>Start new chain</td>
-      <td>
-        <p>Begin a new chain, starting with this operator. The two
-	mappers will be chained, and filter will not be chained to
-	the first mapper.
-{% highlight scala %}
-someStream.filter(...).map(...).startNewChain().map(...)
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-   <tr>
-      <td>Disable chaining</td>
-      <td>
-        <p>Do not chain the map operator.
-{% highlight scala %}
-someStream.map(...).disableChaining()
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  <tr>
-      <td>Set slot sharing group</td>
-      <td>
-        <p>Set the slot sharing group of an operation. Flink will put operations with the same
-        slot sharing group into the same slot while keeping operations that don't have the
-        slot sharing group in other slots. This can be used to isolate slots. The slot sharing
-        group is inherited from input operations if all input operations are in the same slot
-        sharing group.
-        The name of the default slot sharing group is "default"; operations can explicitly
-        be put into this group by calling slotSharingGroup("default").
-{% highlight scala %}
-someStream.filter(...).slotSharingGroup("name")
-{% endhighlight %}
-        </p>
-      </td>
-    </tr>
-  </tbody>
-</table>
-
-</div>
-</div>
-
-
-{% top %}
-

http://git-wip-us.apache.org/repos/asf/flink/blob/31b86f60/docs/dev/stream/operators/asyncio.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/operators/asyncio.md b/docs/dev/stream/operators/asyncio.md
new file mode 100644
index 0000000..1ea0792
--- /dev/null
+++ b/docs/dev/stream/operators/asyncio.md
@@ -0,0 +1,253 @@
+---
+title: "Asynchronous I/O for External Data Access"
+nav-title: "Async I/O"
+nav-parent_id: streaming_operators
+nav-pos: 60
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+* ToC
+{:toc}
+
+This page explains the use of Flink's API for asynchronous I/O with external data stores.
+For users not familiar with asynchronous or event-driven programming, an article about Futures and
+event-driven programming may be useful preparation.
+
+Note: Details about the design and implementation of the asynchronous I/O utility can be found in the proposal and design document
+[FLIP-12: Asynchronous I/O Design and Implementation](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65870673).
+
+
+## The need for Asynchronous I/O Operations
+
+When interacting with external systems (for example when enriching stream events with data stored in a database), one needs to take care
+that communication delay with the external system does not dominate the streaming application's total work.
+
+Naively accessing data in the external database, for example in a `MapFunction`, typically means **synchronous** interaction:
+A request is sent to the database and the `MapFunction` waits until the response has been received. In many cases, this waiting
+makes up the vast majority of the function's time.
+
+Asynchronous interaction with the database means that a single parallel function instance can handle many requests concurrently and
+receive the responses concurrently. That way, the waiting time can be overlapped with sending other requests and
+receiving responses. At the very least, the waiting time is amortized over multiple requests. In most cases this leads to much higher
+streaming throughput.
+
+<img src="{{ site.baseurl }}/fig/async_io.svg" class="center" width="50%" />
+
+*Note:* Improving throughput by just scaling the `MapFunction` to a very high parallelism is in some cases possible as well, but usually
+comes at a very high resource cost: Having many more parallel MapFunction instances means more tasks, threads, Flink-internal network
+connections, network connections to the database, buffers, and general internal bookkeeping overhead.
+
+
+## Prerequisites
+
+As illustrated in the section above, implementing proper asynchronous I/O to a database (or key/value store) requires a client
+to that database that supports asynchronous requests. Many popular databases offer such a client.
+
+In the absence of such a client, one can try to turn a synchronous client into a limited concurrent client by creating
+multiple clients and handling the synchronous calls with a thread pool, as sketched below. However, this approach is usually less
+efficient than a proper asynchronous client.
+
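+As an illustration of that fallback, below is a minimal sketch that wraps a blocking client in a small
+thread pool via Java's `CompletableFuture`. The `SyncDatabaseClient` is a hypothetical stand-in for
+such a synchronous client, not a class from any real library.
+
+{% highlight java %}
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+
+public class PseudoAsyncClient {
+
+    /** Hypothetical blocking client; stands in for a real synchronous database client. */
+    static class SyncDatabaseClient {
+        String query(String key) {
+            try {
+                Thread.sleep(50); // simulated network round trip
+            } catch (InterruptedException e) {
+                Thread.currentThread().interrupt();
+            }
+            return "value-for-" + key;
+        }
+    }
+
+    // the pool size bounds the number of concurrent blocking calls
+    private final ExecutorService pool = Executors.newFixedThreadPool(10);
+
+    private final SyncDatabaseClient syncClient = new SyncDatabaseClient();
+
+    /** Turns the blocking call into a future by executing it on the thread pool. */
+    public CompletableFuture<String> query(String key) {
+        return CompletableFuture.supplyAsync(() -> syncClient.query(key), pool);
+    }
+
+    public void shutdown() {
+        pool.shutdown();
+    }
+}
+{% endhighlight %}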
+
+## Async I/O API
+
+Flink's Async I/O API allows users to use asynchronous request clients with data streams. The API handles the integration with
+data streams, as well as handling order, event time, fault tolerance, etc.
+
+Assuming one has an asynchronous client for the target database, three parts are needed to implement a stream transformation
+with asynchronous I/O against the database:
+
+  - An implementation of `AsyncFunction` that dispatches the requests
+  - A *callback* that takes the result of the operation and hands it to the `AsyncCollector`
+  - Applying the async I/O operation on a DataStream as a transformation
+
+The following code example illustrates the basic pattern:
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+// This example implements the asynchronous request and callback with Futures that have the
+// interface of Java 8's futures (which is the same one followed by Flink's Future)
+
+/**
+ * An implementation of the 'AsyncFunction' that sends requests and sets the callback.
+ */
+class AsyncDatabaseRequest extends RichAsyncFunction<String, Tuple2<String, String>> {
+
+    /** The database specific client that can issue concurrent requests with callbacks */
+    private transient DatabaseClient client;
+
+    @Override
+    public void open(Configuration parameters) throws Exception {
+        client = new DatabaseClient(host, port, credentials);
+    }
+
+    @Override
+    public void close() throws Exception {
+        client.close();
+    }
+
+    @Override
+    public void asyncInvoke(final String str, final AsyncCollector<Tuple2<String, String>> asyncCollector) throws Exception {
+
+        // issue the asynchronous request, receive a future for result
+        Future<String> resultFuture = client.query(str);
+
+        // set the callback to be executed once the request by the client is complete
+        // the callback simply forwards the result to the collector
+        resultFuture.thenAccept( (String result) -> {
+            asyncCollector.collect(Collections.singleton(new Tuple2<>(str, result)));
+        });
+    }
+}
+
+// create the original stream
+DataStream<String> stream = ...;
+
+// apply the async I/O transformation
+DataStream<Tuple2<String, String>> resultStream =
+    AsyncDataStream.unorderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100);
+
+{% endhighlight %}
+</div>
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+/**
+ * An implementation of the 'AsyncFunction' that sends requests and sets the callback.
+ */
+class AsyncDatabaseRequest extends AsyncFunction[String, (String, String)] {
+
+    /** The database specific client that can issue concurrent requests with callbacks */
+    lazy val client: DatabaseClient = new DatabaseClient(host, port, credentials)
+
+    /** The context used for the future callbacks */
+    implicit lazy val executor: ExecutionContext = ExecutionContext.fromExecutor(Executors.directExecutor())
+
+
+    override def asyncInvoke(str: String, asyncCollector: AsyncCollector[(String, String)]): Unit = {
+
+        // issue the asynchronous request, receive a future for the result
+        val resultFuture: Future[String] = client.query(str)
+
+        // set the callback to be executed once the request by the client is complete
+        // the callback simply forwards the result to the collector
+        resultFuture.onSuccess {
+            case result: String => asyncCollector.collect(Iterable((str, result)))
+        }
+    }
+}
+
+// create the original stream
+val stream: DataStream[String] = ...
+
+// apply the async I/O transformation
+val resultStream: DataStream[(String, String)] =
+    AsyncDataStream.unorderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100)
+
+{% endhighlight %}
+</div>
+</div>
+
+**Important note**: The `AsyncCollector` is completed with the first call of `AsyncCollector.collect`.
+All subsequent `collect` calls will be ignored.
+
+The following two parameters control the asynchronous operations (see the annotated call after the list):
+
+  - **Timeout**: The timeout defines how long an asynchronous request may take before it is considered failed. This parameter
+    guards against dead/failed requests.
+
+  - **Capacity**: This parameter defines how many asynchronous requests may be in progress at the same time.
+    Even though the async I/O approach typically leads to much better throughput, the operator can still be the bottleneck in
+    the streaming application. Limiting the number of concurrent requests ensures that the operator will not
+    accumulate an ever-growing backlog of pending requests, but that it will trigger backpressure once the capacity
+    is exhausted.
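+
+For reference, in the example call above the last arguments are exactly these two parameters:
+
+{% highlight java %}
+AsyncDataStream.unorderedWait(
+    stream,
+    new AsyncDatabaseRequest(),
+    1000, TimeUnit.MILLISECONDS,  // timeout: requests not completed within 1 second are considered failed
+    100);                         // capacity: at most 100 requests in flight at the same time
+{% endhighlight %}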
+
+
+### Order of Results
+
+The concurrent requests issued by the `AsyncFunction` frequently complete in some undefined order, based on which request finished first.
+To control in which order the resulting records are emitted, Flink offers two modes:
+
+  - **Unordered**: Result records are emitted as soon as the asynchronous request finishes.
+    The order of the records in the stream is different after the async I/O operator than before.
+    This mode has the lowest latency and lowest overhead when used with *processing time* as the basic time characteristic.
+    Use `AsyncDataStream.unorderedWait(...)` for this mode.
+
+  - **Ordered**: In that case, the stream order is preserved. Result records are emitted in the same order as the asynchronous
+    requests are triggered (the order of the operator's input records). To achieve that, the operator buffers a result record
+    until all its preceding records are emitted (or timed out).
+    This usually introduces some amount of extra latency and some overhead in checkpointing, because records or results are maintained
+    in the checkpointed state for a longer time, compared to the unordered mode.
+    Use `AsyncDataStream.orderedWait(...)` for this mode, as shown in the snippet below.
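+
+Concretely, switching between the two modes is just a matter of the factory method, using the same
+(hypothetical) `AsyncDatabaseRequest` and parameters as in the example above:
+
+{% highlight java %}
+// results may be emitted in completion order
+DataStream<Tuple2<String, String>> unordered =
+    AsyncDataStream.unorderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100);
+
+// results are emitted in the order of the input records
+DataStream<Tuple2<String, String>> ordered =
+    AsyncDataStream.orderedWait(stream, new AsyncDatabaseRequest(), 1000, TimeUnit.MILLISECONDS, 100);
+{% endhighlight %}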
+
+
+### Event Time
+
+When the streaming application works with [event time]({{ site.baseurl }}/dev/event_time.html), watermarks will be handled correctly by the
+asynchronous I/O operator. Concretely, that means the following for the two order modes:
+
+  - **Unordered**: Watermarks do not overtake records and vice versa, meaning watermarks establish an *order boundary*.
+    Records are emitted unordered only between watermarks.
+    A record occurring after a certain watermark will be emitted only after that watermark was emitted.
+    The watermark in turn will be emitted only after all result records from inputs before that watermark were emitted.
+
+    That means that in the presence of watermarks, the *unordered* mode introduces some of the same latency and management
+    overhead as the *ordered* mode does. The amount of that overhead depends on the watermark frequency.
+
+  - **Ordered**: The order of watermarks and records is preserved, just like the order between records is preserved. There is no
+    significant change in overhead, compared to working with *processing time*.
+
+Please recall that *Ingestion Time* is a special case of *event time*, with automatically generated watermarks that
+are based on the sources' processing time.
+
+
+### Fault Tolerance Guarantees
+
+The asynchronous I/O operator offers full exactly-once fault tolerance guarantees. It stores the records for in-flight
+asynchronous requests in checkpoints and restores/re-triggers the requests when recovering from a failure.
+
+
+### Implementation Tips
+
+For implementations with *Futures* that have an *Executor* (or *ExecutionContext* in Scala) for callbacks, we suggest using a `DirectExecutor`, because the
+callback typically does minimal work, and a `DirectExecutor` avoids the overhead of an additional thread-to-thread handover. The callback typically only hands
+the result to the `AsyncCollector`, which adds it to the output buffer. From there, the heavy logic that includes record emission and interaction
+with the checkpoint bookkeeping happens in a dedicated thread pool anyway.
+
+A `DirectExecutor` can be obtained via `org.apache.flink.runtime.concurrent.Executors.directExecutor()` or
+`com.google.common.util.concurrent.MoreExecutors.directExecutor()`.
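+
+For example, continuing the Java callback from the example above (a sketch; `resultFuture` is assumed
+to be a `CompletableFuture`-style future):
+
+{% highlight java %}
+// run the lightweight callback directly in the thread that completes the future,
+// avoiding a handover to a separate callback thread pool
+Executor directExecutor = org.apache.flink.runtime.concurrent.Executors.directExecutor();
+
+resultFuture.thenAcceptAsync(
+    (String result) -> asyncCollector.collect(Collections.singleton(new Tuple2<>(str, result))),
+    directExecutor);
+{% endhighlight %}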
+
+
+### Caveat
+
+**The AsyncFunction is not called in a multi-threaded fashion**
+
+A common confusion that we want to explicitly point out here is that the `AsyncFunction` is not called in a multi-threaded fashion.
+There exists only one instance of the `AsyncFunction` and it is called sequentially for each record in the respective partition
+of the stream. Unless the `asyncInvoke(...)` method returns fast and relies on a callback (by the client), it will not result in
+proper asynchronous I/O.
+
+For example, the following patterns result in a blocking `asyncInvoke(...)` function and thus void the asynchronous behavior (see the sketch after the list):
+
+  - Using a database client whose lookup/query method call blocks until the result has been received back
+
+  - Blocking/waiting on the future-type objects returned by an asynchronous client inside the `asyncInvoke(...)` method
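+
+The sketch below contrasts the two, assuming a hypothetical `client` whose `query(...)` returns a
+`CompletableFuture<String>` (each variant would live in its own `AsyncFunction` implementation):
+
+{% highlight java %}
+// anti-pattern: blocking inside asyncInvoke() serializes the requests
+public void asyncInvoke(String str, AsyncCollector<Tuple2<String, String>> collector) throws Exception {
+    String result = client.query(str).get();  // blocks until the response arrives
+    collector.collect(Collections.singleton(new Tuple2<>(str, result)));
+}
+
+// correct: return immediately and emit the result from a callback
+public void asyncInvoke(String str, AsyncCollector<Tuple2<String, String>> collector) throws Exception {
+    client.query(str).thenAccept(result ->
+        collector.collect(Collections.singleton(new Tuple2<>(str, result))));
+}
+{% endhighlight %}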
+