Posted to issues@tajo.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2014/07/22 16:56:38 UTC

[jira] [Commented] (TAJO-972) Broadcast join with left outer join returns duplicated rows.

    [ https://issues.apache.org/jira/browse/TAJO-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070334#comment-14070334 ] 

ASF GitHub Bot commented on TAJO-972:
-------------------------------------

GitHub user babokim opened a pull request:

    https://github.com/apache/tajo/pull/89

    TAJO-972: Broadcast join with left outer join returns duplicated rows.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/babokim/tajo TAJO-972

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/89.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #89
    
----
commit e391b87215d139be0309e1b120dda412b70d9e9c
Author: 김형준 <ba...@babokim-macbook-pro.local>
Date:   2014-07-22T11:23:55Z

    TAJO-972: Broadcast join with left outer join returns duplicated rows.

----


> Broadcast join with left outer join returns duplicated rows.
> ------------------------------------------------------------
>
>                 Key: TAJO-972
>                 URL: https://issues.apache.org/jira/browse/TAJO-972
>             Project: Tajo
>          Issue Type: Bug
>            Reporter: Hyoungjun Kim
>            Assignee: Hyoungjun Kim
>            Priority: Minor
>
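> If a LEFT OUTER JOIN uses a broadcast table and the broadcast table is on the left side, every task runs the join with all rows of the broadcast table against only its own part of the other table. As a result, a given broadcast row matches in some tasks and does not match in others, and each task where it does not match emits it again with null on the right side.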
> For example:
> {noformat}
> default>select * from small
> id
> -----------------
> 1
> 2
> 3
> default>select * from large
> 1
> 4    <-- Block1 in HDFS
> 5
> ...
> 2    <-- Block2 in HDFS
> 6
> default> select a.id, b.id from small a left outer join large b on a.id = b.id
> a.id    b.id
> ---------------------------
> 1  1
> 2  null
> 3  null
> 1  null
> 2  2
> 3  null
> {noformat}
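
The example above can be reproduced with a small simulation. This is only an illustrative sketch in Python, not Tajo code: the function `left_outer_join` and the variable names are hypothetical, and the rows elided by "..." in the report are simply left out.

```python
def left_outer_join(left_ids, right_ids):
    """Left outer join on equal ids; unmatched left rows get None on the right."""
    rows = []
    for a in left_ids:
        matches = [b for b in right_ids if b == a]
        if matches:
            rows.extend((a, b) for b in matches)
        else:
            rows.append((a, None))
    return rows


# Broadcast (left-side) table and the two HDFS blocks of the large table
# from the report; the rows hidden behind "..." are omitted here.
small = [1, 2, 3]
large_blocks = [[1, 4, 5], [2, 6]]

# Reported behavior: every task joins the whole broadcast table against
# only its own block, so unmatched left rows are emitted once per task.
per_task = [row
            for block in large_blocks
            for row in left_outer_join(small, block)]

# Expected behavior: one left outer join against the whole large table.
expected = left_outer_join(small, [b for block in large_blocks for b in block])

print(per_task)
print(expected)
```

In this sketch the per-task output has six rows, matching the result shown in the report, while the join over the whole table has three.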



--
This message was sent by Atlassian JIRA
(v6.2#6252)