Posted to issues@flink.apache.org by "Chiwan Park (JIRA)" <ji...@apache.org> on 2015/08/07 18:20:47 UTC

[jira] [Commented] (FLINK-2107) Implement Hash Outer Join algorithm

    [ https://issues.apache.org/jira/browse/FLINK-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662050#comment-14662050 ] 

Chiwan Park commented on FLINK-2107:
------------------------------------

Hi, I'm working on this issue and have a question about the implementation of the hash-based join.

In {{NonReusingBuildFirstHashMatchIterator.callWithNextKey}}, the join function is called at three separate points: for the first value, for the second value, and for the remaining values. Is there a reason for this? Why are the call sites split?

> Implement Hash Outer Join algorithm
> -----------------------------------
>
>                 Key: FLINK-2107
>                 URL: https://issues.apache.org/jira/browse/FLINK-2107
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Local Runtime
>            Reporter: Fabian Hueske
>            Assignee: Chiwan Park
>            Priority: Minor
>             Fix For: pre-apache
>
>
> Flink does not natively support outer joins at the moment.
> This issue proposes to implement a hash outer join algorithm that can cover left and right outer joins.
> The implementation can be based on the regular hash join iterators (for example `ReusingBuildFirstHashMatchIterator` and `NonReusingBuildFirstHashMatchIterator`; see also the `MatchDriver` class).
> The Reusing and NonReusing variants differ in whether object instances are reused or new objects are created. I would start with the NonReusing variant, which is safer from a user's point of view and should also be easier to implement.
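
For readers unfamiliar with the algorithm the description refers to, here is a minimal sketch of a hash left outer join in plain Java. It is only an illustration under stated assumptions: the class `HashOuterJoinSketch`, the record type `Rec`, and the method `leftOuterJoin` are hypothetical names, and the sketch uses `java.util.HashMap` rather than any of Flink's runtime classes or iterators. It creates a new output object per result, in the spirit of the NonReusing variant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not Flink's implementation.
class HashOuterJoinSketch {

    /** A simple key/value record (hypothetical stand-in for a typed record). */
    static final class Rec {
        final int key;
        final String value;

        Rec(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Left outer join: every probe-side (left) record appears in the output.
     * Left records without a match are paired with "null".
     */
    static List<String> leftOuterJoin(List<Rec> probeSide, List<Rec> buildSide) {
        // Build phase: hash the build (right) side by key.
        Map<Integer, List<Rec>> table = new HashMap<>();
        for (Rec r : buildSide) {
            table.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r);
        }

        // Probe phase: look up each left record in the hash table.
        List<String> out = new ArrayList<>();
        for (Rec left : probeSide) {
            List<Rec> matches = table.get(left.key);
            if (matches == null) {
                // No match: an inner join would drop this record,
                // an outer join emits it with a null partner.
                out.add(left.value + ",null");
            } else {
                for (Rec right : matches) {
                    out.add(left.value + "," + right.value);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> left = Arrays.asList(new Rec(1, "a"), new Rec(2, "b"));
        List<Rec> right = Arrays.asList(new Rec(1, "x"), new Rec(1, "y"));
        System.out.println(leftOuterJoin(left, right)); // prints [a,x, a,y, b,null]
    }
}
```

In this sketch the outer side is the probe side, so the only change from an inner hash join is the branch for a missing match. A right outer join follows by swapping which input is built and which is probed. Preserving the build side instead would additionally require tracking which hash table entries were matched during probing.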


