Posted to commits@flink.apache.org by al...@apache.org on 2014/08/13 18:04:11 UTC

svn commit: r1617745 - in /incubator/flink/site/docs/0.6-SNAPSHOT: java_api_guide.html java_api_transformations.html run_example_quickstart.html

Author: aljoscha
Date: Wed Aug 13 16:04:11 2014
New Revision: 1617745

URL: http://svn.apache.org/r1617745
Log:
Update Documentation

Modified:
    incubator/flink/site/docs/0.6-SNAPSHOT/java_api_guide.html
    incubator/flink/site/docs/0.6-SNAPSHOT/java_api_transformations.html
    incubator/flink/site/docs/0.6-SNAPSHOT/run_example_quickstart.html

Modified: incubator/flink/site/docs/0.6-SNAPSHOT/java_api_guide.html
URL: http://svn.apache.org/viewvc/incubator/flink/site/docs/0.6-SNAPSHOT/java_api_guide.html?rev=1617745&r1=1617744&r2=1617745&view=diff
==============================================================================
--- incubator/flink/site/docs/0.6-SNAPSHOT/java_api_guide.html (original)
+++ incubator/flink/site/docs/0.6-SNAPSHOT/java_api_guide.html Wed Aug 13 16:04:11 2014
@@ -178,6 +178,24 @@
 </li>
 <li>
 <a href="#functions">Functions</a>
+<ul>
+<li>
+<ul>
+<li>
+<a href="#implementing-an-interface">Implementing an interface</a>
+</li>
+<li>
+<a href="#anonymous-classes">Anonymous classes</a>
+</li>
+<li>
+<a href="#java-8-lambdas">Java 8 Lambdas</a>
+</li>
+<li>
+<a href="#rich-functions">Rich functions</a>
+</li>
+</ul>
+</li>
+</ul>
 </li>
 <li>
 <a href="#data-types">Data Types</a>
@@ -300,7 +318,7 @@
         <span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;&gt;</span> <span class="n">wordCounts</span> <span class="o">=</span> <span class="n">text</span>
             <span class="o">.</span><span class="na">flatMap</span><span class="o">(</span><span class="k">new</span> <span class="nf">LineSplitter</span><span class="o">())</span>
             <span class="o">.</span><span class="na">groupBy</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span>
-            <span class="o">.</span><span class="na">aggregate</span><span class="o">(</span><span class="n">Aggregations</span><span class="o">.</span><span class="na">SUM</span><span class="o">,</span> <span class="mi">1</span><span class="o">);</span>
+            <span class="o">.</span><span class="na">sum</span><span class="o">(</span><span class="mi">1</span><span class="o">);</span>
 
         <span class="n">wordCounts</span><span class="o">.</span><span class="na">print</span><span class="o">();</span>
 
@@ -394,7 +412,7 @@ an execution environment for executing y
 <p>For specifying data sources the execution environment has several methods
 to read from files using various methods: you can just read them line by line,
 as CSV files, or using completely custom data input formats. To just read
-a text file as a sequence of lines, you could use:</p>
+a text file as a sequence of lines, you can use:</p>
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">final</span> <span class="n">ExecutionEnvironment</span> <span class="n">env</span> <span class="o">=</span> <span class="n">ExecutionEnvironment</span><span class="o">.</span><span class="na">getExecutionEnvironment</span><span class="o">();</span>
 
 <span class="n">DataSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">text</span> <span class="o">=</span> <span class="n">env</span><span class="o">.</span><span class="na">readTextFile</span><span class="o">(</span><span class="s">&quot;file:///path/to/file&quot;</span><span class="o">);</span>
@@ -407,7 +425,7 @@ more information on data sources and inp
 <code>DataSet</code> which you can then write to a file, transform again, or
 combine with other <code>DataSet</code>s. You apply transformations by calling
 methods on <code>DataSet</code> with your own custom transformation function. For example,
-map looks like this:</p>
+a map transformation looks like this:</p>
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="o">...;</span>
 
 <span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">tokenized</span> <span class="o">=</span> <span class="n">text</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="k">new</span> <span class="n">MapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;()</span> <span class="o">{</span>
@@ -506,7 +524,7 @@ has full description of all transformati
     <tr>
       <td><strong>Filter</strong></td>
       <td>
-        <p>Evaluates a boolean function for each element and retains those for which the function returns *true*.</p>
+        <p>Evaluates a boolean function for each element and retains those for which the function returns true.</p>
 
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">data</span><span class="o">.</span><span class="na">filter</span><span class="o">(</span><span class="k">new</span> <span class="n">FilterFunction</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;()</span> <span class="o">{</span>
   <span class="kd">public</span> <span class="kt">boolean</span> <span class="nf">filter</span><span class="o">(</span><span class="n">Integer</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">value</span> <span class="o">&gt;</span> <span class="mi">1000</span><span class="o">;</span> <span class="o">}</span>
@@ -551,37 +569,13 @@ has full description of all transformati
       <td>
         <p>Aggregates a group of values into a single value. Aggregation functions can be thought of as built-in reduce functions. Aggregate may be applied on a full data set, or on a grouped data set.</p>
 
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
 <span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">output</span> <span class="o">=</span> <span class="n">input</span><span class="o">.</span><span class="na">aggregate</span><span class="o">(</span><span class="n">SUM</span><span class="o">,</span> <span class="mi">0</span><span class="o">).</span><span class="na">and</span><span class="o">(</span><span class="n">MIN</span><span class="o">,</span> <span class="mi">2</span><span class="o">);</span></code></pre></div>
 
-      </td>
-    </tr>
-
-    <tr>
-      <td><strong>ReduceGroup</strong></td>
-      <td>
-        <p>Combines a group of elements into one or more elements. ReduceGroup may be applied on a full data set, or on a grouped data set.</p>
-
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">data</span><span class="o">.</span><span class="na">reduceGroup</span><span class="o">(</span><span class="k">new</span> <span class="n">GroupReduceFunction</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="o">{</span>
-  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">reduceGroup</span><span class="o">(</span><span class="n">Iterable</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">values</span><span class="o">,</span> <span class="n">Collector</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">out</span><span class="o">)</span> <span class="o">{</span>
-    <span class="kt">int</span> <span class="n">prefixSum</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>
-    <span class="k">for</span> <span class="o">(</span><span class="n">Integer</span> <span class="n">i</span> <span class="o">:</span> <span class="n">values</span><span class="o">)</span> <span class="o">{</span>
-      <span class="n">prefixSum</span> <span class="o">+=</span> <span class="n">i</span><span class="o">;</span>
-      <span class="n">out</span><span class="o">.</span><span class="na">collect</span><span class="o">(</span><span class="n">prefixSum</span><span class="o">);</span>
-    <span class="o">}</span>
-  <span class="o">}</span>
-<span class="o">});</span></code></pre></div>
-
-      </td>
-    </tr>
-
-    <tr>
-      <td><strong>Aggregate</strong></td>
-      <td>
-        <p>Aggregates a group of values into a single value. Aggregation functions can be thought of as built-in reduce functions. Aggregate may be applied on a full data set, or on a grouped data set.</p>
-
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
-<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">output</span> <span class="o">=</span> <span class="n">input</span><span class="o">.</span><span class="na">aggregate</span><span class="o">(</span><span class="n">SUM</span><span class="o">,</span> <span class="mi">0</span><span class="o">).</span><span class="na">and</span><span class="o">(</span><span class="n">MIN</span><span class="o">,</span> <span class="mi">2</span><span class="o">);</span></code></pre></div>
+    <p>You can also use short-hand syntax for minimum, maximum, and sum aggregations.</p>
+    
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span> <span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">output</span> <span class="o">=</span> <span class="n">input</span><span class="o">.</span><span class="na">sum</span><span class="o">(</span><span class="mi">0</span><span class="o">).</span><span class="na">andMin</span><span class="o">(</span><span class="mi">2</span><span class="o">);</span></code></pre></div>
 
       </td>
     </tr>
@@ -589,7 +583,7 @@ has full description of all transformati
     </tr>
       <td><strong>Join</strong></td>
       <td>
-        <p>Joins two data sets by creating all pairs of elements that are equal on their keys. Optionally uses a JoinFunction to turn the pair of elements into a single element. See [keys](#keys) on how to define join keys.</p>
+        <p>Joins two data sets by creating all pairs of elements that are equal on their keys. Optionally uses a JoinFunction to turn the pair of elements into a single element, or a FlatJoinFunction to turn the pair of elements into arbitrarily many (including none) elements. See <a href="#keys">keys</a> on how to define join keys.</p>
 
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">result</span> <span class="o">=</span> <span class="n">input1</span><span class="o">.</span><span class="na">join</span><span class="o">(</span><span class="n">input2</span><span class="o">)</span>
                <span class="o">.</span><span class="na">where</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span>       <span class="c1">// key of the first input (tuple field 0)</span>
@@ -618,7 +612,7 @@ has full description of all transformati
     <tr>
       <td><strong>Cross</strong></td>
       <td>
-        <p>Builds the cartesian product (cross product) of two inputs, creating all pairs of elements. Optionally uses a CrossFunction to turn the pair of elements into a single element</p>
+        <p>Builds the Cartesian product (cross product) of two inputs, creating all pairs of elements. Optionally uses a CrossFunction to turn the pair of elements into a single element.</p>
 
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">data1</span> <span class="o">=</span> <span class="c1">// [...]</span>
 <span class="n">DataSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">data2</span> <span class="o">=</span> <span class="c1">// [...]</span>
@@ -671,12 +665,160 @@ has full description of all transformati
 
 <h2 id="defining-keys">Defining Keys</h2>
 
+<p>Some transformations (join, coGroup) require that a key is defined on
+their argument DataSets, and other transformations (Reduce, GroupReduce,
+Aggregate) allow the DataSet to be grouped on a key before they are
+applied.</p>
+
+<p>A DataSet is grouped as follows:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;...&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;...&gt;</span> <span class="n">reduced</span> <span class="o">=</span> <span class="n">input</span>
+    <span class="o">.</span><span class="na">groupBy</span><span class="o">(</span><span class="cm">/*define key here*/</span><span class="o">)</span>
+    <span class="o">.</span><span class="na">reduceGroup</span><span class="o">(</span><span class="cm">/*do something*/</span><span class="o">);</span></code></pre></div>
+
+<p>The data model of Flink is not based on key-value pairs. Therefore,
+you do not need to physically pack the data set types into keys and
+values. Keys are &quot;virtual&quot;: they are defined as functions over the
+actual data to guide the grouping operator.</p>
+
+<p>The simplest case is grouping a data set of Tuples on one or more
+fields of the Tuple:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span><span class="n">String</span><span class="o">,</span><span class="n">Long</span><span class="o">&gt;&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span><span class="n">String</span><span class="o">,</span><span class="n">Long</span><span class="o">&gt;&gt;</span> <span class="n">grouped</span> <span class="o">=</span> <span class="n">input</span>
+    <span class="o">.</span><span class="na">groupBy</span><span class="o">(</span><span class="mi">0</span><span class="o">)</span>
+    <span class="o">.</span><span class="na">reduceGroup</span><span class="o">(</span><span class="cm">/*do something*/</span><span class="o">);</span></code></pre></div>
+
+<p>The data set is grouped on the first field of the tuples (the one of
+Integer type). The GroupReduceFunction will thus receive groups with
+the same value of the first field.</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span><span class="n">String</span><span class="o">,</span><span class="n">Long</span><span class="o">&gt;&gt;</span> <span class="n">input</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple3</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">,</span><span class="n">String</span><span class="o">,</span><span class="n">Long</span><span class="o">&gt;&gt;</span> <span class="n">grouped</span> <span class="o">=</span> <span class="n">input</span>
+    <span class="o">.</span><span class="na">groupBy</span><span class="o">(</span><span class="mi">0</span><span class="o">,</span><span class="mi">1</span><span class="o">)</span>
+    <span class="o">.</span><span class="na">reduce</span><span class="o">(</span><span class="cm">/*do something*/</span><span class="o">);</span></code></pre></div>
+
+<p>The data set is grouped on the composite key consisting of the first and the
+second fields, therefore the reduce function will receive groups
+with the same value for both fields.</p>
+
+<p>In general, key definition is done via a &quot;key selector&quot; function, which
+takes as argument one dataset element and returns a key of an
+arbitrary data type by performing an arbitrary computation on this
+element. For example:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="c1">// some ordinary POJO</span>
+<span class="kd">public</span> <span class="kd">class</span> <span class="nc">WC</span> <span class="o">{</span><span class="kd">public</span> <span class="n">String</span> <span class="n">word</span><span class="o">;</span> <span class="kd">public</span> <span class="kt">int</span> <span class="n">count</span><span class="o">;}</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">WC</span><span class="o">&gt;</span> <span class="n">words</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">WC</span><span class="o">&gt;</span> <span class="n">wordCounts</span> <span class="o">=</span> <span class="n">words</span>
+                         <span class="o">.</span><span class="na">groupBy</span><span class="o">(</span>
+                           <span class="k">new</span> <span class="n">KeySelector</span><span class="o">&lt;</span><span class="n">WC</span><span class="o">,</span> <span class="n">String</span><span class="o">&gt;()</span> <span class="o">{</span>
+                             <span class="kd">public</span> <span class="n">String</span> <span class="nf">getKey</span><span class="o">(</span><span class="n">WC</span> <span class="n">wc</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">wc</span><span class="o">.</span><span class="na">word</span><span class="o">;</span> <span class="o">}</span>
+                           <span class="o">})</span>
+                         <span class="o">.</span><span class="na">reduce</span><span class="o">(</span><span class="cm">/*do something*/</span><span class="o">);</span></code></pre></div>
+
+<p>Remember that keys are used not only for grouping, but also for joining and matching data sets:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="c1">// some POJO</span>
+<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Rating</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">String</span> <span class="n">name</span><span class="o">;</span>
+  <span class="kd">public</span> <span class="n">String</span> <span class="n">category</span><span class="o">;</span>
+  <span class="kd">public</span> <span class="kt">int</span> <span class="n">points</span><span class="o">;</span>
+<span class="o">}</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Rating</span><span class="o">&gt;</span> <span class="n">ratings</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">weights</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span>
+            <span class="n">weightedRatings</span> <span class="o">=</span>
+            <span class="n">ratings</span><span class="o">.</span><span class="na">join</span><span class="o">(</span><span class="n">weights</span><span class="o">)</span>
+
+                   <span class="c1">// key of the first input</span>
+                   <span class="o">.</span><span class="na">where</span><span class="o">(</span><span class="k">new</span> <span class="n">KeySelector</span><span class="o">&lt;</span><span class="n">Rating</span><span class="o">,</span> <span class="n">String</span><span class="o">&gt;()</span> <span class="o">{</span>
+                            <span class="kd">public</span> <span class="n">String</span> <span class="nf">getKey</span><span class="o">(</span><span class="n">Rating</span> <span class="n">r</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">r</span><span class="o">.</span><span class="na">category</span><span class="o">;</span> <span class="o">}</span>
+                          <span class="o">})</span>
+
+                   <span class="c1">// key of the second input</span>
+                   <span class="o">.</span><span class="na">equalTo</span><span class="o">(</span><span class="k">new</span> <span class="n">KeySelector</span><span class="o">&lt;</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;,</span> <span class="n">String</span><span class="o">&gt;()</span> <span class="o">{</span>
+                              <span class="kd">public</span> <span class="n">String</span> <span class="nf">getKey</span><span class="o">(</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;</span> <span class="n">t</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">t</span><span class="o">.</span><span class="na">f0</span><span class="o">;</span> <span class="o">}</span>
+                            <span class="o">});</span></code></pre></div>
+
 <p><a href="#top">Back to top</a></p>
 
 <p><section id="functions"></p>
 
 <h2 id="functions">Functions</h2>
 
+<p>You can define a user-defined function and pass it to the DataSet
+transformations in several ways:</p>
+
+<h4 id="implementing-an-interface">Implementing an interface</h4>
+
+<p>The most basic way is to implement one of the provided interfaces:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">class</span> <span class="nc">MyMapFunction</span> <span class="kd">implements</span> <span class="n">MapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">map</span><span class="o">(</span><span class="n">String</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Integer</span><span class="o">.</span><span class="na">parseInt</span><span class="o">(</span><span class="n">value</span><span class="o">);</span> <span class="o">}</span>
+<span class="o">}</span>
+<span class="n">data</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="k">new</span> <span class="nf">MyMapFunction</span><span class="o">());</span></code></pre></div>
+
+<h4 id="anonymous-classes">Anonymous classes</h4>
+
+<p>You can pass a function as an anonymous class:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">data</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="k">new</span> <span class="n">MapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="o">()</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">map</span><span class="o">(</span><span class="n">String</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Integer</span><span class="o">.</span><span class="na">parseInt</span><span class="o">(</span><span class="n">value</span><span class="o">);</span> <span class="o">}</span>
+<span class="o">});</span></code></pre></div>
+
+<h4 id="java-8-lambdas">Java 8 Lambdas</h4>
+
+<p><strong><em>Warning: Lambdas are currently only supported for filter and reduce
+   transformations</em></strong></p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">data</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">data</span><span class="o">.</span><span class="na">filter</span><span class="o">(</span><span class="n">s</span> <span class="o">-&gt;</span> <span class="n">s</span><span class="o">.</span><span class="na">startsWith</span><span class="o">(</span><span class="s">&quot;http://&quot;</span><span class="o">));</span></code></pre></div>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Integer</span><span class="o">&gt;</span> <span class="n">data</span> <span class="o">=</span> <span class="c1">// [...]</span>
+<span class="n">data</span><span class="o">.</span><span class="na">reduce</span><span class="o">((</span><span class="n">i1</span><span class="o">,</span><span class="n">i2</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="n">i1</span> <span class="o">+</span> <span class="n">i2</span><span class="o">);</span></code></pre></div>
+
+<h4 id="rich-functions">Rich functions</h4>
+
+<p>All transformations that take as argument a user-defined function can
+instead take as argument a <em>rich</em> function. For example, instead of</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">class</span> <span class="nc">MyMapFunction</span> <span class="kd">implements</span> <span class="n">MapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">map</span><span class="o">(</span><span class="n">String</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Integer</span><span class="o">.</span><span class="na">parseInt</span><span class="o">(</span><span class="n">value</span><span class="o">);</span> <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
+<p>you can write</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">class</span> <span class="nc">MyMapFunction</span> <span class="kd">extends</span> <span class="n">RichMapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">map</span><span class="o">(</span><span class="n">String</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Integer</span><span class="o">.</span><span class="na">parseInt</span><span class="o">(</span><span class="n">value</span><span class="o">);</span> <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
+<p>and pass the function as usual to a <code>map</code> transformation:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">data</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="k">new</span> <span class="nf">MyMapFunction</span><span class="o">());</span></code></pre></div>
+
+<p>Rich functions can also be defined as an anonymous class:</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">data</span><span class="o">.</span><span class="na">map</span> <span class="o">(</span><span class="k">new</span> <span class="n">RichMapFunction</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;()</span> <span class="o">{</span>
+  <span class="kd">public</span> <span class="n">Integer</span> <span class="nf">map</span><span class="o">(</span><span class="n">String</span> <span class="n">value</span><span class="o">)</span> <span class="o">{</span> <span class="k">return</span> <span class="n">Integer</span><span class="o">.</span><span class="na">parseInt</span><span class="o">(</span><span class="n">value</span><span class="o">);</span> <span class="o">}</span>
+<span class="o">});</span></code></pre></div>
+
+<p>Rich functions provide, in addition to the user-defined function (map,
+reduce, etc.), four methods: <code>open</code>, <code>close</code>, <code>getRuntimeContext</code>, and
+<code>setRuntimeContext</code>. These are useful for creating and finalizing
+local state, accessing broadcast variables (see
+<a href="#broadcast_variables">Broadcast Variables</a>), and for accessing runtime
+information such as accumulators and counters (see
+<a href="#accumulators_counters">Accumulators and Counters</a>), and information
+on iterations (see <a href="#iterations">Iterations</a>).</p>
+
+<p>In particular, for the <code>reduceGroup</code> transformation, using a rich
+function is the only way to define an optional <code>combine</code> function. See
+the
+<a href="/java_api_transformations.html">transformations documentation</a>
+for a complete example.</p>
+
 <p><a href="#top">Back to top</a></p>
 
 <p><section id="types"></p>
@@ -737,7 +879,7 @@ has full description of all transformati
 </code></pre></div>
 <p>When working with operators that require a Key for grouping or matching records
 you need to implement a <code>KeySelector</code> for your custom type (see
-<a href="#transformations">Section Data Transformations</a>).</p>
+<a href="#keys">Section Defining Keys</a>).</p>
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">wordCounts</span><span class="o">.</span><span class="na">groupBy</span><span class="o">(</span><span class="k">new</span> <span class="n">KeySelector</span><span class="o">&lt;</span><span class="n">WordCount</span><span class="o">,</span> <span class="n">String</span><span class="o">&gt;()</span> <span class="o">{</span>
     <span class="kd">public</span> <span class="n">String</span> <span class="nf">getKey</span><span class="o">(</span><span class="n">WordCount</span> <span class="n">v</span><span class="o">)</span> <span class="o">{</span>
         <span class="k">return</span> <span class="n">v</span><span class="o">.</span><span class="na">word</span><span class="o">;</span>
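
Note on the rich-function hunk above: the paragraph on open and close describes a per-function lifecycle. The sketch below illustrates that call order only. RichMapStandIn, ParsingMapper, and run are hypothetical stand-ins written for this note; they are not Flink's RichMapFunction or runtime, and the real open method may take arguments that this simplified version omits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RichFunctionSketch {

    // Hypothetical stand-in for a rich map function; NOT Flink's RichMapFunction.
    abstract static class RichMapStandIn<IN, OUT> {
        void open() {}   // intended to run once, before the first record
        void close() {}  // intended to run once, after the last record
        abstract OUT map(IN value);
    }

    // Parses strings to integers and keeps local state between open() and close().
    static class ParsingMapper extends RichMapStandIn<String, Integer> {
        int parsed;      // local state created in open()
        boolean closed;  // set when close() has run

        @Override void open() { parsed = 0; closed = false; }
        @Override Integer map(String value) { parsed++; return Integer.parseInt(value); }
        @Override void close() { closed = true; }
    }

    // Hypothetical driver mimicking the call order: open, map per record, close.
    static <IN, OUT> List<OUT> run(RichMapStandIn<IN, OUT> fn, List<IN> input) {
        List<OUT> out = new ArrayList<>();
        fn.open();
        try {
            for (IN value : input) {
                out.add(fn.map(value));
            }
        } finally {
            fn.close(); // always finalize local state, even if map() throws
        }
        return out;
    }

    public static void main(String[] args) {
        ParsingMapper mapper = new ParsingMapper();
        List<Integer> result = run(mapper, Arrays.asList("1", "2", "3"));
        System.out.println(result + " parsed=" + mapper.parsed); // prints "[1, 2, 3] parsed=3"
    }
}
```

In the real API the framework, not user code, would be responsible for calling these methods; the driver here exists only to make the ordering visible.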

Modified: incubator/flink/site/docs/0.6-SNAPSHOT/java_api_transformations.html
URL: http://svn.apache.org/viewvc/incubator/flink/site/docs/0.6-SNAPSHOT/java_api_transformations.html?rev=1617745&r1=1617744&r2=1617745&view=diff
==============================================================================
--- incubator/flink/site/docs/0.6-SNAPSHOT/java_api_transformations.html (original)
+++ incubator/flink/site/docs/0.6-SNAPSHOT/java_api_transformations.html Wed Aug 13 16:04:11 2014
@@ -223,6 +223,9 @@
 <a href="#join-with-joinfunction">Join with JoinFunction</a>
 </li>
 <li>
+<a href="#join-with-flatjoinfunction">Join with FlatJoinFunction</a>
+</li>
+<li>
 <a href="#join-with-projection">Join with Projection</a>
 </li>
 <li>
@@ -478,7 +481,11 @@ Right now, this feature is only availabl
 
 <h4 id="combinable-groupreducefunctions">Combinable GroupReduceFunctions</h4>
 
-<p>In contrast to a <code>ReduceFunction</code>, a <code>GroupReduceFunction</code> is not necessarily combinable. In order to make a <code>GroupReduceFunction</code> combinable, you need to implement (override) the <code>combine()</code> method and annotate the <code>GroupReduceFunction</code> with the <code>@Combinable</code> annotation as shown here:</p>
+<p>In contrast to a <code>ReduceFunction</code>, a <code>GroupReduceFunction</code> is not
+necessarily combinable. To make a <code>GroupReduceFunction</code>
+combinable, you need to use the <code>RichGroupReduceFunction</code> variant,
+implement (override) the <code>combine()</code> method, and annotate the
+function class with the <code>@Combinable</code> annotation as shown here:</p>
 <div class="highlight"><pre><code class="language-java" data-lang="java"><span class="c1">// Combinable GroupReduceFunction that computes two sums.</span>
 <span class="c1">// Note that we use the RichGroupReduceFunction because it defines the combine method</span>
 <span class="nd">@Combinable</span>
@@ -653,6 +660,27 @@ A <code>JoinFunction</code> receives one
                    <span class="c1">// applying the JoinFunction on joining pairs</span>
                    <span class="o">.</span><span class="na">with</span><span class="o">(</span><span class="k">new</span> <span class="nf">PointWeighter</span><span class="o">());</span>
 </code></pre></div>
+<h4 id="join-with-flatjoinfunction">Join with FlatJoinFunction</h4>
+
+<p>Analogous to Map and FlatMap, a FlatJoinFunction behaves in the same
+way as a JoinFunction, but instead of returning one element, it can
+return (collect) zero, one, or more elements.</p>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">PointWeighter</span>
+         <span class="kd">implements</span> <span class="n">FlatJoinFunction</span><span class="o">&lt;</span><span class="n">Rating</span><span class="o">,</span> <span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;,</span> <span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="o">{</span>
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">join</span><span class="o">(</span><span class="n">Rating</span> <span class="n">rating</span><span class="o">,</span> <span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;</span> <span class="n">weight</span><span class="o">,</span>
+      <span class="n">Collector</span><span class="o">&lt;</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span> <span class="n">out</span><span class="o">)</span> <span class="o">{</span>
+    <span class="k">if</span> <span class="o">(</span><span class="n">weight</span><span class="o">.</span><span class="na">f1</span> <span class="o">&gt;</span> <span class="mf">0.1</span><span class="o">)</span> <span class="o">{</span>
+        <span class="n">out</span><span class="o">.</span><span class="na">collect</span><span class="o">(</span><span class="k">new</span> <span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;(</span><span class="n">rating</span><span class="o">.</span><span class="na">name</span><span class="o">,</span> <span class="n">rating</span><span class="o">.</span><span class="na">points</span> <span class="o">*</span> <span class="n">weight</span><span class="o">.</span><span class="na">f1</span><span class="o">));</span>
+    <span class="o">}</span>
+  <span class="o">}</span>
+<span class="o">}</span>
+
+<span class="n">DataSet</span><span class="o">&lt;</span><span class="n">Tuple2</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Double</span><span class="o">&gt;&gt;</span>
+            <span class="n">weightedRatings</span> <span class="o">=</span>
+            <span class="n">ratings</span><span class="o">.</span><span class="na">join</span><span class="o">(</span><span class="n">weights</span><span class="o">)</span> <span class="c1">// [...]</span></code></pre></div>
+
 <h4 id="join-with-projection">Join with Projection</h4>
 
 <p>A Join transformation can construct result tuples using a projection as shown here:</p>
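
Note on the FlatJoinFunction hunk above: the zero-or-more behaviour can be tried without a cluster using the sketch below. Collector and FlatJoinStandIn are local stand-ins that only mirror the shape of the join method shown in the diff; they are not the Flink interfaces. The apply helper and the use of plain Double values instead of Rating and Tuple2 are assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

public class FlatJoinSketch {

    // Local stand-ins mirroring the shapes used in the diff; NOT the Flink interfaces.
    interface Collector<T> { void collect(T record); }
    interface FlatJoinStandIn<L, R, O> { void join(L left, R right, Collector<O> out); }

    // Emits a weighted score only when the weight exceeds 0.1, otherwise emits nothing.
    static class PointWeighter implements FlatJoinStandIn<Double, Double, Double> {
        @Override
        public void join(Double points, Double weight, Collector<Double> out) {
            if (weight > 0.1) {
                out.collect(points * weight);
            }
        }
    }

    // Hypothetical helper: joins one pair and gathers whatever was collected.
    static List<Double> apply(double points, double weight) {
        final List<Double> results = new ArrayList<>();
        new PointWeighter().join(points, weight, new Collector<Double>() {
            @Override public void collect(Double record) { results.add(record); }
        });
        return results;
    }

    public static void main(String[] args) {
        System.out.println(apply(4.0, 0.5));  // prints "[2.0]": one element collected
        System.out.println(apply(4.0, 0.05)); // prints "[]": pair filtered out
    }
}
```

Returning nothing for a pair is what distinguishes this from a plain join function, which must produce exactly one result per joining pair.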

Modified: incubator/flink/site/docs/0.6-SNAPSHOT/run_example_quickstart.html
URL: http://svn.apache.org/viewvc/incubator/flink/site/docs/0.6-SNAPSHOT/run_example_quickstart.html?rev=1617745&r1=1617744&r2=1617745&view=diff
==============================================================================
--- incubator/flink/site/docs/0.6-SNAPSHOT/run_example_quickstart.html (original)
+++ incubator/flink/site/docs/0.6-SNAPSHOT/run_example_quickstart.html Wed Aug 13 16:04:11 2014
@@ -311,7 +311,7 @@ cd stratosphere
 <h1 id="analyze-the-result">Analyze the Result</h1>
 
 <p>Use the <a href="quickstart/plotPoints.py">Python Script</a> again to visualize the result</p>
-<div class="highlight"><pre><code class="language-bash" data-lang="bash">python2.7 plotPoints.py result result result-pdf
+<div class="highlight"><pre><code class="language-bash" data-lang="bash">python plotPoints.py result result result-pdf
 </code></pre></div>
 <p>The following three pictures show the results for the sample input above. Play around with the parameters (number of iterations, number of clusters) to see how they affect the result.</p>