Posted to commits@flink.apache.org by se...@apache.org on 2014/08/20 14:41:30 UTC
git commit: [FLINK-1053] Add documentation for mapPartition() function
Repository: incubator-flink
Updated Branches:
refs/heads/master 6c48063ad -> 4717c0c7d
[FLINK-1053] Add documentation for mapPartition() function
[ci skip]
Project: http://git-wip-us.apache.org/repos/asf/incubator-flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-flink/commit/4717c0c7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-flink/tree/4717c0c7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-flink/diff/4717c0c7
Branch: refs/heads/master
Commit: 4717c0c7d6964553343003611dda810ede3a7e56
Parents: 6c48063
Author: Stephan Ewen <se...@apache.org>
Authored: Wed Aug 20 14:39:16 2014 +0200
Committer: Stephan Ewen <se...@apache.org>
Committed: Wed Aug 20 14:40:39 2014 +0200
----------------------------------------------------------------------
docs/java_api_guide.md | 24 ++++++++++++++++++++++--
docs/java_api_transformations.md | 24 ++++++++++++++++++++++++
2 files changed, 46 insertions(+), 2 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/4717c0c7/docs/java_api_guide.md
----------------------------------------------------------------------
diff --git a/docs/java_api_guide.md b/docs/java_api_guide.md
index 76dd332..4441d93 100644
--- a/docs/java_api_guide.md
+++ b/docs/java_api_guide.md
@@ -246,7 +246,7 @@ data.map(new MapFunction<String, Integer>() {
<tr>
<td><strong>FlatMap</strong></td>
<td>
- <p>Takes one element and produces zero, one, or more elements.</p>
+ <p>Takes one element and produces zero, one, or more elements. </p>
{% highlight java %}
data.flatMap(new FlatMapFunction<String, String>() {
public void flatMap(String value, Collector<String> out) {
@@ -260,6 +260,26 @@ data.flatMap(new FlatMapFunction<String, String>() {
</tr>
<tr>
+ <td><strong>MapPartition</strong></td>
+ <td>
+ <p>Transforms a parallel partition in a single function call. The function gets the partition as an `Iterable` stream and
+ can produce an arbitrary number of result values. The number of elements in each partition depends on the degree-of-parallelism
+ and previous operations.</p>
+{% highlight java %}
+data.mapPartition(new MapPartitionFunction<String, Long>() {
+ public void mapPartition(Iterable<String> values, Collector<Long> out) {
+ long c = 0;
+ for (String s : values) {
+ c++;
+ }
+ out.collect(c);
+ }
+});
+{% endhighlight %}
+ </td>
+ </tr>
+
+ <tr>
<td><strong>Filter</strong></td>
<td>
<p>Evaluates a boolean function for each element and retains those for which the function returns true.</p>
@@ -403,7 +423,7 @@ DataSet<Tuple2<String, Integer>> out = in.project(2,0).types(String.class, Integ
Defining Keys
-------------
-One transformation (join, coGroup) require that a key is defined on
+Some transformations (join, coGroup) require that a key is defined on
its argument DataSets, and other transformations (Reduce, GroupReduce,
Aggregate) allow that the DataSet is grouped on a key before they are
applied.
http://git-wip-us.apache.org/repos/asf/incubator-flink/blob/4717c0c7/docs/java_api_transformations.md
----------------------------------------------------------------------
diff --git a/docs/java_api_transformations.md b/docs/java_api_transformations.md
index e4980de..3e303c7 100644
--- a/docs/java_api_transformations.md
+++ b/docs/java_api_transformations.md
@@ -53,7 +53,31 @@ public class Tokenizer implements FlatMapFunction<String, String> {
// [...]
DataSet<String> textLines = // [...]
DataSet<String> words = textLines.flatMap(new Tokenizer());
+```
+
+### MapPartition
+
+The MapPartition function transforms a parallel partition in a single function call. The function gets the partition as an `Iterable` stream and
+can produce an arbitrary number of result values. The number of elements in each partition depends on the degree-of-parallelism
+and previous operations.
+
+The following code transforms a `DataSet` of text lines into a `DataSet` of counts per partition:
+
+```java
+public class PartitionCounter implements MapPartitionFunction<String, Long> {
+ public void mapPartition(Iterable<String> values, Collector<Long> out) {
+ long c = 0;
+ for (String s : values) {
+ c++;
+ }
+ out.collect(c);
+ }
+}
+
+// [...]
+DataSet<String> textLines = // [...]
+DataSet<Long> counts = textLines.mapPartition(new PartitionCounter());
```
### Filter
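
The MapPartition documentation added in this commit says that the function is called once per partition and that partition sizes depend on the degree of parallelism. The plain-Java sketch below illustrates that with the same counting logic, without depending on Flink. `Collector` and `MapPartitionFunction` are simplified local stand-ins for the Flink interfaces. `runPerPartition` and its round-robin split are hypothetical helpers for illustration only and are not how Flink distributes data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapPartitionSketch {

    // Simplified stand-in for Flink's Collector (assumption: only collect() is needed here).
    interface Collector<T> {
        void collect(T record);
    }

    // Simplified stand-in for Flink's MapPartitionFunction.
    interface MapPartitionFunction<T, O> {
        void mapPartition(Iterable<T> values, Collector<O> out);
    }

    // Same counting logic as the PartitionCounter shown in the documentation.
    static class PartitionCounter implements MapPartitionFunction<String, Long> {
        @Override
        public void mapPartition(Iterable<String> values, Collector<Long> out) {
            long c = 0;
            for (String s : values) {
                c++;
            }
            out.collect(c);
        }
    }

    // Hypothetical helper: splits the input round-robin into `parallelism`
    // partitions and invokes the function exactly once per partition.
    static <T, O> List<O> runPerPartition(List<T> input, int parallelism,
                                          MapPartitionFunction<T, O> fn) {
        List<List<T>> partitions = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int i = 0; i < input.size(); i++) {
            partitions.get(i % parallelism).add(input.get(i));
        }
        List<O> results = new ArrayList<>();
        for (List<T> partition : partitions) {
            fn.mapPartition(partition, results::add);
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        // One count per partition; the counts change with the parallelism.
        System.out.println(runPerPartition(lines, 2, new PartitionCounter())); // [3, 2]
        System.out.println(runPerPartition(lines, 3, new PartitionCounter())); // [2, 2, 1]
    }
}
```

The result holds one count per partition, not one per input element, and the same five lines give different counts when the parallelism changes.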