Posted to issues@flink.apache.org by "Fabian Hueske (JIRA)" <ji...@apache.org> on 2015/02/08 00:40:35 UTC
[jira] [Updated] (FLINK-685) Add support for semi-joins
[ https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fabian Hueske updated FLINK-685:
--------------------------------
Priority: Minor (was: Major)
> Add support for semi-joins
> --------------------------
>
> Key: FLINK-685
> URL: https://issues.apache.org/jira/browse/FLINK-685
> Project: Flink
> Issue Type: New Feature
> Reporter: GitHub Import
> Priority: Minor
> Labels: github-import
> Fix For: pre-apache
>
>
> A semi-join is basically a join filter. One input is "filtering" and the other one is "filtered".
> A tuple of the "filtered" input is emitted exactly once if the "filtering" input has one or more tuples with matching join keys. That means that the output of a semi-join has the same type as the "filtered" input, and the "filtering" input is completely discarded.
> In order to support a semi-join, we need to add an additional physical execution strategy which ensures that a tuple of the "filtered" input is emitted only once, even if the "filtering" input has more than one tuple with matching keys. Furthermore, a couple of optimizations over standard joins are possible, such as storing only the keys, rather than the full tuples, of the "filtering" input in a hash table.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/685
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime
> Milestone: Release 0.6 (unplanned)
> Created at: Mon Apr 14 12:05:29 CEST 2014
> State: open
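
The semantics described in the issue can be illustrated with a minimal in-memory sketch in plain Java. This is not Flink's runtime or API, and it is not a proposed implementation of the execution strategy; the class name SemiJoinSketch, the method semiJoin, and its parameters are hypothetical names chosen for illustration. It assumes both inputs fit in memory and that keys implement equals/hashCode. The build side stores only the distinct keys of the "filtering" input, as the issue suggests, so duplicate matches cannot cause a "filtered" tuple to be emitted more than once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration of semi-join semantics; not part of Flink.
class SemiJoinSketch {

    // Returns every tuple of `filtered` whose key occurs in `filtering`.
    // The result has the element type of the "filtered" input.
    static <L, R, K> List<L> semiJoin(
            List<L> filtered,
            List<R> filtering,
            Function<L, K> filteredKey,
            Function<R, K> filteringKey) {

        // Build phase: keep only the distinct keys of the "filtering" input,
        // not the full tuples. The tuples themselves are discarded.
        Set<K> keys = new HashSet<>();
        for (R r : filtering) {
            keys.add(filteringKey.apply(r));
        }

        // Probe phase: each "filtered" tuple is tested once against the key
        // set, so it is emitted at most once regardless of how many
        // "filtering" tuples share its key.
        List<L> out = new ArrayList<>();
        for (L l : filtered) {
            if (keys.contains(filteredKey.apply(l))) {
                out.add(l);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "filtered" input: made-up records of the form name:key
        List<String> filtered = Arrays.asList("a:1", "b:2", "c:2", "d:5");
        // "filtering" input: keys only, with a duplicate key 2
        List<Integer> filtering = Arrays.asList(2, 2, 1, 7);

        List<String> result = semiJoin(
                filtered,
                filtering,
                s -> Integer.valueOf(s.substring(s.indexOf(':') + 1)),
                k -> k);

        System.out.println(result);
    }
}
```

A standard hash join probing the same build side would need extra deduplication to avoid emitting "b:2" and "c:2" twice; using a key set on the build side sidesteps that, at the cost of requiring the "filtering" side to be the build side.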
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)