Posted to commits@flink.apache.org by se...@apache.org on 2014/11/17 17:08:38 UTC

[1/3] incubator-flink git commit: [FLINK-1244] setCombinable() returns operator

Repository: incubator-flink
Updated Branches:
  refs/heads/master 7f8296ea3 -> 2d1532f33


[FLINK-1244] setCombinable() returns operator

This closes #205


Project: http://git-wip-us.apache.org/repos/asf/incubator-flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-flink/commit/a296f404
Tree: http://git-wip-us.apache.org/repos/asf/incubator-flink/tree/a296f404
Diff: http://git-wip-us.apache.org/repos/asf/incubator-flink/diff/a296f404

Branch: refs/heads/master
Commit: a296f404ee757f086596f12ea9762d7aaed457ff
Parents: 7f8296e
Author: zentol <s....@web.de>
Authored: Sun Nov 16 16:28:28 2014 +0100
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Nov 17 15:37:58 2014 +0100

----------------------------------------------------------------------
 .../org/apache/flink/api/java/operators/GroupReduceOperator.java | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/a296f404/flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReduceOperator.java
----------------------------------------------------------------------
diff --git a/flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReduceOperator.java b/flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReduceOperator.java
index a040b14..327d12f 100644
--- a/flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReduceOperator.java
+++ b/flink-java/src/main/java/org/apache/flink/api/java/operators/GroupReduceOperator.java
@@ -102,13 +102,15 @@ public class GroupReduceOperator<IN, OUT> extends SingleInputUdfOperator<IN, OUT
 		return combinable;
 	}
 	
-	public void setCombinable(boolean combinable) {
+	public GroupReduceOperator<IN, OUT> setCombinable(boolean combinable) {
 		// sanity check that the function is a subclass of the combine interface
 		if (combinable && !(function instanceof FlatCombineFunction)) {
 			throw new IllegalArgumentException("The function does not implement the combine interface.");
 		}
 		
 		this.combinable = combinable;
+		
+		return this;
 	}
 	
 	@Override
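
For context: setCombinable(boolean) previously returned void, so the flag could
not be set as part of a fluent call chain. A minimal sketch of the usage the new
return type enables -- the class, reducer, and data below are illustrative, not
part of the commit:

    import org.apache.flink.api.common.functions.FlatCombineFunction;
    import org.apache.flink.api.common.functions.GroupReduceFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.util.Collector;

    public class CombinableChainingSketch {

        // Implements both interfaces, so setCombinable(true) passes the
        // FlatCombineFunction sanity check shown in the diff above.
        public static class SumReducer implements
                GroupReduceFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>,
                FlatCombineFunction<Tuple2<String, Integer>> {

            @Override
            public void reduce(Iterable<Tuple2<String, Integer>> values,
                               Collector<Tuple2<String, Integer>> out) {
                String key = null;
                int sum = 0;
                for (Tuple2<String, Integer> v : values) {
                    key = v.f0;
                    sum += v.f1;
                }
                out.collect(new Tuple2<String, Integer>(key, sum));
            }

            @Override
            public void combine(Iterable<Tuple2<String, Integer>> values,
                                Collector<Tuple2<String, Integer>> out) {
                reduce(values, out); // pre-aggregation reuses the reduce logic
            }
        }

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<Tuple2<String, Integer>> input = env.fromElements(
                    new Tuple2<String, Integer>("a", 1),
                    new Tuple2<String, Integer>("a", 2));

            input.groupBy(0)
                 .reduceGroup(new SumReducer())
                 .setCombinable(true) // now returns the operator, so the chain continues
                 .print();

            env.execute("setCombinable chaining sketch");
        }
    }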


[3/3] incubator-flink git commit: Fix flaky test SumMinMaxITCase

Posted by se...@apache.org.
Fix flaky test SumMinMaxITCase


Project: http://git-wip-us.apache.org/repos/asf/incubator-flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-flink/commit/2d1532f3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-flink/tree/2d1532f3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-flink/diff/2d1532f3

Branch: refs/heads/master
Commit: 2d1532f33e05642477918121f25756f533ffe9c6
Parents: 9c1585e
Author: Stephan Ewen <se...@apache.org>
Authored: Mon Nov 17 16:24:04 2014 +0100
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Nov 17 16:24:04 2014 +0100

----------------------------------------------------------------------
 .../org/apache/flink/test/javaApiOperators/SumMinMaxITCase.java | 2 --
 .../org/apache/flink/api/scala/operators/SumMinMaxITCase.scala  | 5 +----
 2 files changed, 1 insertion(+), 6 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/2d1532f3/flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SumMinMaxITCase.java
----------------------------------------------------------------------
diff --git a/flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SumMinMaxITCase.java b/flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SumMinMaxITCase.java
index bf92893..61ba722 100644
--- a/flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SumMinMaxITCase.java
+++ b/flink-tests/src/test/java/org/apache/flink/test/javaApiOperators/SumMinMaxITCase.java
@@ -16,10 +16,8 @@
  * limitations under the License.
  */
 
-
 package org.apache.flink.test.javaApiOperators;
 
-
 import org.apache.flink.api.java.DataSet;
 import org.apache.flink.api.java.ExecutionEnvironment;
 import org.apache.flink.api.java.tuple.Tuple1;

http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/2d1532f3/flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SumMinMaxITCase.scala
----------------------------------------------------------------------
diff --git a/flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SumMinMaxITCase.scala b/flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SumMinMaxITCase.scala
index 488520e..a838907 100644
--- a/flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SumMinMaxITCase.scala
+++ b/flink-tests/src/test/scala/org/apache/flink/api/scala/operators/SumMinMaxITCase.scala
@@ -15,6 +15,7 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+
 package org.apache.flink.api.scala.operators
 
 import org.apache.flink.api.java.aggregation.Aggregations
@@ -31,8 +32,6 @@ import scala.collection.mutable
 
 import org.apache.flink.api.scala._
 
-
-
 /**
  * These tests are copied from [[AggregateITCase]] replacing calls to aggregate with calls to sum,
  * min, and max
@@ -45,8 +44,6 @@ object SumMinMaxProgs {
       case 1 =>
         // Full aggregate
         val env = ExecutionEnvironment.getExecutionEnvironment
-        env.setDegreeOfParallelism(10)
-        //        val ds = CollectionDataSets.get3TupleDataSet(env)
         val ds = CollectionDataSets.get3TupleDataSet(env)
 
         val aggregateDs = ds
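
A hedged reading of the fix: the removed env.setDegreeOfParallelism(10) pinned a
parallelism that the test harness does not necessarily provide, which can make
such a test flaky; the fix simply lets the test environment choose. The same
pattern in the Java API, as a sketch (not from this commit):

    import org.apache.flink.api.java.ExecutionEnvironment;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Deliberately no env.setDegreeOfParallelism(...) here: the surrounding
    // test harness sets the default parallelism, keeping the program
    // independent of any particular cluster configuration.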


[2/3] incubator-flink git commit: [FLINK-1172] [docs] Fix broken links in documentation

Posted by se...@apache.org.
[FLINK-1172] [docs] Fix broken links in documentation

This closes #206


Project: http://git-wip-us.apache.org/repos/asf/incubator-flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-flink/commit/9c1585ee
Tree: http://git-wip-us.apache.org/repos/asf/incubator-flink/tree/9c1585ee
Diff: http://git-wip-us.apache.org/repos/asf/incubator-flink/diff/9c1585ee

Branch: refs/heads/master
Commit: 9c1585eefd4bc621e1fe9fbd9cb3055404b0dff1
Parents: a296f40
Author: zentol <s....@web.de>
Authored: Sun Nov 16 16:57:02 2014 +0100
Committer: Stephan Ewen <se...@apache.org>
Committed: Mon Nov 17 15:48:34 2014 +0100

----------------------------------------------------------------------
 docs/examples.md          |  2 +-
 docs/faq.md               |  6 +++---
 docs/iterations.md        |  2 +-
 docs/programming_guide.md | 32 ++++++++++++++++----------------
 4 files changed, 21 insertions(+), 21 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/9c1585ee/docs/examples.md
----------------------------------------------------------------------
diff --git a/docs/examples.md b/docs/examples.md
index 7cae4bc..d6d5655 100644
--- a/docs/examples.md
+++ b/docs/examples.md
@@ -79,7 +79,7 @@ The {% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/fl
 
 The PageRank algorithm computes the "importance" of pages in a graph defined by links, which point from one pages to another page. It is an iterative graph algorithm, which means that it repeatedly applies the same computation. In each iteration, each page distributes its current rank over all its neighbors, and compute its new rank as a taxed sum of the ranks it received from its neighbors. The PageRank algorithm was popularized by the Google search engine which uses the importance of webpages to rank the results of search queries.
 
-In this simple example, PageRank is implemented with a [bulk iteration](java_api_guide.html#iterations) and a fixed number of iterations.
+In this simple example, PageRank is implemented with a [bulk iteration](iterations.html) and a fixed number of iterations.
 
 <div class="codetabs" markdown="1">
 <div data-lang="java" markdown="1">

http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/9c1585ee/docs/faq.md
----------------------------------------------------------------------
diff --git a/docs/faq.md b/docs/faq.md
index 36d7690..8d3b726 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -62,7 +62,7 @@ of the master and the worker where the exception occurred
 - When you start a program locally with the [LocalExecutor](local_execution.html),
 you can place breakpoints in your functions and debug them like normal
 Java/Scala programs.
-- The [Accumulators](java_api_guide.html#accumulators) are very helpful in
+- The [Accumulators](programming_guide.html#accumulators--counters) are very helpful in
 tracking the behavior of the parallel execution. They allow you to gather
 information inside the program's operations and show them after the program
 execution.
@@ -303,10 +303,10 @@ open source project in the next versions.
 
 ### Are Hadoop-like utilities, such as Counters and the DistributedCache supported?
 
-[Flink's Accumulators](java_api_guide.html#accumulators-&-counters) work very similar like
+[Flink's Accumulators](programming_guide.html#accumulators--counters) work very similar like
 [Hadoop's counters, but are more powerful.
 
 Flink has a {% gh_link /flink-core/src/main/java/org/apache/flink/api/common/cache/DistributedCache.java "Distributed Cache" %} that is deeply integrated with the APIs. Please refer to the {% gh_link /flink-java/src/main/java/org/apache/flink/api/java/ExecutionEnvironment.java#L561 "JavaDocs" %} for details on how to use it.
 
-In order to make data sets available on all tasks, we encourage you to use [Broadcast Variables](java_api_guide.html#broadcast_variables) instead. They are more efficient and easier to use than the distributed cache.
+In order to make data sets available on all tasks, we encourage you to use [Broadcast Variables](programming_guide.html#broadcast-variables) instead. They are more efficient and easier to use than the distributed cache.
 
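Since the corrected FAQ entry steers readers to broadcast variables, a short
sketch of the pattern it refers to may help; the name "broadcastSetName"
follows the programming guide, the data is made up:

    import java.util.Collection;

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.configuration.Configuration;

    // ... inside a program's main method:
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Integer> toBroadcast = env.fromElements(1, 2, 3);

    DataSet<String> result = env.fromElements("a", "b")
        .map(new RichMapFunction<String, String>() {
            private Collection<Integer> broadcastSet;

            @Override
            public void open(Configuration parameters) {
                // The name must match the one passed to withBroadcastSet() below.
                this.broadcastSet = getRuntimeContext()
                        .getBroadcastVariable("broadcastSetName");
            }

            @Override
            public String map(String value) {
                return value + "/" + broadcastSet.size();
            }
        })
        .withBroadcastSet(toBroadcast, "broadcastSetName");
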

http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/9c1585ee/docs/iterations.md
----------------------------------------------------------------------
diff --git a/docs/iterations.md b/docs/iterations.md
index 1897b29..ca22b8e 100644
--- a/docs/iterations.md
+++ b/docs/iterations.md
@@ -157,7 +157,7 @@ setFinalState(solution);
 
 <div class="panel panel-default">
 	<div class="panel-body">
-	See the <strong><a href="scala_api_guide.html">Scala</a> and <a href="java_api_guide.html#iterations">Java</a> programming guides</strong> for details and code examples.</div>
+	See the <strong><a href="programming_guide.html">programming guide</a></strong> for details and code examples.</div>
 </div>
 
 ### Example: Propagate Minimum in Graph

http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/9c1585ee/docs/programming_guide.md
----------------------------------------------------------------------
diff --git a/docs/programming_guide.md b/docs/programming_guide.md
index e191b0b..937292c 100644
--- a/docs/programming_guide.md
+++ b/docs/programming_guide.md
@@ -229,7 +229,7 @@ DataSet<String> text = env.readTextFile("file:///path/to/file");
 
 This will give you a DataSet on which you can then apply transformations. For
 more information on data sources and input formats, please refer to
-[Data Sources](#data_sources).
+[Data Sources](#data-sources).
 
 Once you have a DataSet you can apply transformations to create a new
 DataSet which you can then write to a file, transform again, or
@@ -269,7 +269,7 @@ a cluster, the result goes to the standard out stream of the cluster nodes and e
 up in the *.out* files of the workers).
 The first two do as the name suggests, the third one can be used to specify a
 custom data output format. Please refer
-to [Data Sinks](#data_sinks) for more information on writing to files and also
+to [Data Sinks](#data-sinks) for more information on writing to files and also
 about custom data output formats.
 
 Once you specified the complete program you need to call `execute` on
@@ -329,7 +329,7 @@ val text = env.readTextFile("file:///path/to/file")
 
 This will give you a DataSet on which you can then apply transformations. For
 more information on data sources and input formats, please refer to
-[Data Sources](#data_sources).
+[Data Sources](#data-sources).
 
 Once you have a DataSet you can apply transformations to create a new
 DataSet which you can then write to a file, transform again, or
@@ -370,7 +370,7 @@ a cluster, the result goes to the standard out stream of the cluster nodes and e
 up in the *.out* files of the workers).
 The first two do as the name suggests, the third one can be used to specify a
 custom data output format. Please refer
-to [Data Sinks](#data_sinks) for more information on writing to files and also
+to [Data Sinks](#data-sinks) for more information on writing to files and also
 about custom data output formats.
 
 Once you specified the complete program you need to call `execute` on
@@ -840,9 +840,9 @@ val result3 = in.groupBy(0).sortGroup(1, Order.ASCENDING).first(3)
 </div>
 </div>
 
-The [parallelism](#parallelism) of a transformation can be defined by `setParallelism(int)` while
+The [parallelism](#parallel-execution) of a transformation can be defined by `setParallelism(int)` while
 `name(String)` assigns a custom name to a transformation which is helpful for debugging. The same is
-possible for [Data Sources](#data_sources) and [Data Sinks](#data_sinks).
+possible for [Data Sources](#data-sources) and [Data Sinks](#data-sinks).
 
 [Back to Top](#top)
 
@@ -1297,10 +1297,10 @@ Rich functions provide, in addition to the user-defined function (map,
 reduce, etc), four methods: `open`, `close`, `getRuntimeContext`, and
 `setRuntimeContext`. These are useful for creating and finalizing
 local state, accessing broadcast variables (see
-[Broadcast Variables](#broadcast_variables), and for accessing runtime
+[Broadcast Variables](#broadcast-variables), and for accessing runtime
 information such as accumulators and counters (see
-[Accumulators and Counters](#accumulators_counters), and information
-on iterations (see [Iterations](#iterations)).
+[Accumulators and Counters](#accumulators--counters), and information
+on iterations (see [Iterations](iterations.html)).
 
 In particular for the `reduceGroup` transformation, using a rich
 function is the only way to define an optional `combine` function. See
@@ -2015,7 +2015,7 @@ env.execute("Iterative Pi Example");
 {% endhighlight %}
 
 You can also check out the
-{% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/example/java/clustering/KMeans.java "K-Means example" %},
+{% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java "K-Means example" %},
 which uses a BulkIteration to cluster a set of unlabeled points.
 
 #### Delta Iterations
@@ -2272,7 +2272,7 @@ data.map(new MapFunction<String, String>() {
 
 Make sure that the names (`broadcastSetName` in the previous example) match when registering and
 accessing broadcasted data sets. For a complete example program, have a look at
-{% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/example/java/clustering/KMeans.java#L96 "KMeans Algorithm" %}.
+{% gh_link /flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/clustering/KMeans.java#L96 "K-Means Algorithm" %}.
 </div>
 <div data-lang="scala" markdown="1">
 
@@ -2312,7 +2312,7 @@ of a function, or use the `withParameters(...)` method to pass in a configuratio
 Program Packaging & Distributed Execution
 -----------------------------------------
 
-As described in the [program skeleton](#skeleton) section, Flink programs can be executed on
+As described in the [program skeleton](#program-skeleton) section, Flink programs can be executed on
 clusters by using the `RemoteEnvironment`. Alternatively, programs can be packaged into JAR Files
 (Java Archives) for execution. Packaging the program is a prerequisite to executing them through the
 [command line interface](cli.html) or the [web interface](web_client.html).
@@ -2429,7 +2429,7 @@ name.
 A note on accumulators and iterations: Currently the result of accumulators is only available after
 the overall job ended. We plan to also make the result of the previous iteration available in the
 next iteration. You can use
-{% gh_link /flink-java/src/main/java/org/apache/flink/api/java/IterativeDataSet.java#L98 "Aggregators" %}
+{% gh_link /flink-java/src/main/java/org/apache/flink/api/java/operators/IterativeDataSet.java#L98 "Aggregators" %}
 to compute per-iteration statistics and base the termination of iterations on such statistics.
 
 __Custom accumulators:__
@@ -2463,7 +2463,7 @@ The degree of parallelism of a task can be specified in Flink on different level
 
 The parallelism of an individual operator, data source, or data sink can be defined by calling its
 `setParallelism()` method.  For example, the degree of parallelism of the `Sum` operator in the
-[WordCount](#example) example program can be set to `5` as follows :
+[WordCount](#example-program) example program can be set to `5` as follows :
 
 
 <div class="codetabs" markdown="1">
@@ -2506,7 +2506,7 @@ parallelism of an operator.
 
 The default parallelism of an execution environment can be specified by calling the
 `setDegreeOfParallelism()` method. To execute all operators, data sources, and data sinks of the
-[WordCount](#example) example program with a parallelism of `3`, set the default parallelism of the
+[WordCount](#example-program) example program with a parallelism of `3`, set the default parallelism of the
 execution environment as follows:
 
 <div class="codetabs" markdown="1">
@@ -2543,7 +2543,7 @@ env.execute("Word Count Example")
 
 A system-wide default parallelism for all execution environments can be defined by setting the
 `parallelization.degree.default` property in `./conf/flink-conf.yaml`. See the
-[Configuration]({{site.baseurl}}/config.html) documentation for details.
+[Configuration](config.html) documentation for details.
 
 [Back to top](#top)
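
The guide sections patched above also cover the two levels of parallelism
configuration; as a quick illustration against the API of that era (the program
itself is made up):

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class ParallelismSketch {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            env.setDegreeOfParallelism(3); // environment-wide default

            DataSet<String> text = env.fromElements("to be", "or not to be");

            text.map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .setParallelism(5) // per-operator override
                .print();

            env.execute("parallelism sketch");
        }
    }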