You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@flink.apache.org by fh...@apache.org on 2017/06/20 10:30:38 UTC

flink git commit: [FLINK-6929] [table] Add documentation for Table API over windows.

Repository: flink
Updated Branches:
  refs/heads/master 8835da996 -> 7bb2c925a


[FLINK-6929] [table] Add documentation for Table API over windows.

This closes #4141.


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/7bb2c925
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/7bb2c925
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/7bb2c925

Branch: refs/heads/master
Commit: 7bb2c925ab4fa6de4cd2af890468dd4784aa643c
Parents: 8835da9
Author: sunjincheng121 <su...@gmail.com>
Authored: Mon Jun 19 11:51:19 2017 +0800
Committer: Fabian Hueske <fh...@apache.org>
Committed: Tue Jun 20 12:30:54 2017 +0200

----------------------------------------------------------------------
 docs/dev/table/tableApi.md | 158 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 157 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/7bb2c925/docs/dev/table/tableApi.md
----------------------------------------------------------------------
diff --git a/docs/dev/table/tableApi.md b/docs/dev/table/tableApi.md
index 693b9bd..0811232 100644
--- a/docs/dev/table/tableApi.md
+++ b/docs/dev/table/tableApi.md
@@ -1267,7 +1267,163 @@ A session window is defined by using the `Session` class as follows:
 
 ### Over Windows
 
-**TO BE DONE**
+Over window aggregates are known from standard SQL (`OVER` clause) and defined in the `SELECT` clause of a query. Unlike group windows, which are specified in the `GROUP BY` clause, over windows do not collapse rows. Instead over window aggregates compute an aggregate for each input row over a range of its neighboring rows. 
+
+Over windows are defined using the `window(w: OverWindow*)` clause and referenced via an alias in the `select()` method. The following example shows how to define an over window aggregation on a table.
+
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+Table table = input
+  .window([OverWindow w].as("w"))           // define over window with alias w
+  .select("a, b.sum over w, c.min over w"); // aggregate over the over window w
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+val table = input
+  .window([w: OverWindow] as 'w)              // define over window with alias w
+  .select('a, 'b.sum over 'w, 'c.min over 'w) // aggregate over the over window w
+{% endhighlight %}
+</div>
+</div>
+
+The `OverWindow` defines a range of rows over which aggregates are computed. `OverWindow` is not an interface that users can implement. Instead, the Table API provides the `Over` class to configure the properties of the over window. Over windows can be defined on event-time or processing-time and on ranges specified as time interval or row-count. The supported over window definitions are exposed as methods on `Over` (and other classes) and are listed below:
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+      <th class="text-left" style="width: 20%">Method</th>
+      <th class="text-left">Required</th>
+      <th class="text-left">Description</th>
+    </tr>
+  </thead>
+
+  <tbody>
+    <tr>
+      <td><code>partitionBy</code></td>
+      <td>Optional</td>
+      <td>
+        <p>Defines a partitioning of the input on one or more attributes. Each partition is individually sorted and aggregate functions are applied to each partition separately.</p>
+
+        <p><b>Note:</b> In streaming environments, over window aggregates can only be computed in parallel if the window includes a partition by clause. Without <code>partitionBy(...)</code> the stream is processed by a single, non-parallel task.</p>
+      </td>
+    </tr>
+    <tr>
+      <td><code>orderBy</code></td>
+      <td>Required</td>
+      <td>
+        <p>Defines the order of rows within each partition and thereby the order in which the aggregate functions are applied to rows.</p>
+
+        <p><b>Note:</b> For streaming queries this must be a <a href="streaming.html#time-attributes">declared event-time or processing-time time attribute</a>. Currently, only a single sort attribute is supported.</p>
+      </td>
+    </tr>
+    <tr>
+      <td><code>preceding</code></td>
+      <td>Required</td>
+      <td>
+        <p>Defines the interval of rows that are included in the window and precede the current row. The interval can either be specified as time or row-count interval.</p>
+
+        <p><a href="tableApi.html#bounded-over-windows">Bounded over windows</a> are specified with the size of the interval, e.g., <code>10.minutes</code> for a time interval or <code>10.rows</code> for a row-count interval.</p>
+
+        <p><a href="tableApi.html#unbounded-over-windows">Unbounded over windows</a> are specified using a constant, i.e., <code>UNBOUNDED_RANGE</code> for a time interval or <code>UNBOUNDED_ROW</code> for a row-count interval. Unbounded over windows start with the first row of a partition.</p>
+      </td>
+    </tr>
+    <tr>
+      <td><code>following</code></td>
+      <td>Optional</td>
+      <td>
+        <p>Defines the window interval of rows that are included in the window and follow the current row. The interval must be specified in the same unit as the preceding interval (time or row-count).</p>
+
+        <p>At the moment, over windows with rows following the current row are not supported. All over windows must stop at the current row and only two values are supported for <code>following</code>, <code>CURRENT_RANGE</code> for a time interval and <code>CURRENT_ROW</code> for a row-count interval.</p>
+
+        <p>If the `following` clause is omitted, the window will end at the current row.</p>
+      </td>
+    </tr>
+    <tr>
+      <td><code>as</code></td>
+      <td>Required</td>
+      <td>
+        <p>Assigns an alias to the over window. The alias is used to reference the over window in the following <code>select()</code> clause.</p>
+      </td>
+    </tr>
+  </tbody>
+</table>
+
+**Note:** Currently, all aggregation functions in the same `select()` call must be computed of the same over window.
+
+#### Unbounded Over Windows
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+// Unbounded Event-time over window (assuming an event-time attribute "rowtime")
+.window(Over.partitionBy("a").orderBy("rowtime").preceding("unbounded_range").as("w"));
+
+// Unbounded Processing-time over window (assuming a processing-time attribute "proctime")
+.window(Over.partitionBy("a").orderBy("proctime").preceding("unbounded_range").as("w"));
+
+// Unbounded Event-time Row-count over window (assuming an event-time attribute "rowtime")
+.window(Over.partitionBy("a").orderBy("rowtime").preceding("unbounded_row").as("w"));
+ 
+// Unbounded Processing-time Row-count over window (assuming a processing-time attribute "proctime")
+.window(Over.partitionBy("a").orderBy("proctime").preceding("unbounded_row").as("w"));
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+// Unbounded Event-time over window (assuming an event-time attribute "rowtime")
+.window(Over partitionBy 'a orderBy 'rowtime preceding UNBOUNDED_RANGE as 'w)
+
+// Unbounded Processing-time over window (assuming a processing-time attribute "proctime")
+.window(Over partitionBy 'a orderBy 'proctime preceding UNBOUNDED_RANGE as 'w)
+
+// Unbounded Event-time Row-count over window (assuming an event-time attribute "rowtime")
+.window(Over partitionBy 'a orderBy 'rowtime preceding UNBOUNDED_ROW as 'w)
+ 
+// Unbounded Processing-time Row-count over window (assuming a processing-time attribute "proctime")
+.window(Over partitionBy 'a orderBy 'proctime preceding UNBOUNDED_ROW as 'w)
+{% endhighlight %}
+</div>
+</div>
+
+#### Bounded Over Windows
+<div class="codetabs" markdown="1">
+<div data-lang="java" markdown="1">
+{% highlight java %}
+// Bounded Event-time over window (assuming an event-time attribute "rowtime")
+.window(Over.partitionBy("a").orderBy("rowtime").preceding("1.minutes").as("w"))
+
+// Bounded Processing-time over window (assuming a processing-time attribute "proctime")
+.window(Over.partitionBy("a").orderBy("proctime").preceding("1.minutes").as("w"))
+
+// Bounded Event-time Row-count over window (assuming an event-time attribute "rowtime")
+.window(Over.partitionBy("a").orderBy("rowtime").preceding("10.rows").as("w"))
+ 
+// Bounded Processing-time Row-count over window (assuming a processing-time attribute "proctime")
+.window(Over.partitionBy("a").orderBy("proctime").preceding("10.rows").as("w"))
+{% endhighlight %}
+</div>
+
+<div data-lang="scala" markdown="1">
+{% highlight scala %}
+// Bounded Event-time over window (assuming an event-time attribute "rowtime")
+.window(Over partitionBy 'a orderBy 'rowtime preceding 1.minutes as 'w)
+
+// Bounded Processing-time over window (assuming a processing-time attribute "proctime")
+.window(Over partitionBy 'a orderBy 'proctime preceding 1.minutes as 'w)
+
+// Bounded Event-time Row-count over window (assuming an event-time attribute "rowtime")
+.window(Over partitionBy 'a orderBy 'rowtime preceding 10.rows as 'w)
+  
+// Bounded Processing-time Row-count over window (assuming a processing-time attribute "proctime")
+.window(Over partitionBy 'a orderBy 'proctime preceding 10.rows as 'w)
+{% endhighlight %}
+</div>
+</div>
+
+{% top %}
 
 Data Types
 ----------