You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by "Max Michels (JIRA)" <ji...@apache.org> on 2015/03/02 10:39:05 UTC
[jira] [Created] (FLINK-1621) Create a generalized combine function
Max Michels created FLINK-1621:
----------------------------------
Summary: Create a generalized combine function
Key: FLINK-1621
URL: https://issues.apache.org/jira/browse/FLINK-1621
Project: Flink
Issue Type: Improvement
Components: Distributed Runtime
Affects Versions: 0.9
Reporter: Max Michels
Fix For: 0.9
Flink allows combiners which accept a type {{I}} and combine the values of this type into type {{O}}. In Google Dataflow, combiners are more generalized. They accept an Input {{I}}, produce an intermediate combine value of {{T}}, and finally an output {{O}}. Flink's combiners are like the {{SimpleCombineFn}} in Google Dataflow.
Right now, we translate the {{KeyedCombineFn}} into a {{SortPartition}} followed by a {{MapPartition}} to emulate the Combiner's behavior. Rudimentary performance tests showed that this behavior causes a significant increase in run time compared to the proper Combine implementation.
Let's implement a more generalized Combiner to create a better mapping from Google Dataflow to Flink.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)