Posted to issues@tajo.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/08/20 10:50:45 UTC

[jira] [Commented] (TAJO-1766) Improve the performance of cross join

    [ https://issues.apache.org/jira/browse/TAJO-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704524#comment-14704524 ] 

ASF GitHub Bot commented on TAJO-1766:
--------------------------------------

GitHub user jihoonson opened a pull request:

    https://github.com/apache/tajo/pull/706

    TAJO-1766: Improve the performance of cross join

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jihoonson/tajo-2 TAJO-1766

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/706.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #706
    
----
commit c585fd9b9f126cb3cd0492a8c8cfe07d82f617db
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-16T14:28:15Z

    Need to consider cross and non-cross joins together

commit 18fc1356c0d5296c49b614402f90fc84c4ac0790
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-17T01:36:02Z

    Merge branch 'TAJO-1766' of https://github.com/jihoonson/tajo-2 into TAJO-1766

commit b2c8435a886c8008598c367e5e757b601148cfac
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-17T06:18:01Z

    TAJO-1766

commit 1f974ceb14a62310ade9a7f631ee3ad531f080de
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-17T09:10:12Z

    TAJO-1766

commit 90306b6495314c28f4f7bdc1cb54b1ea799c689d
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-17T10:21:15Z

    TAJO-1766

commit 989d0ebd20e4250cd236029a796614f1e64ad891
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-18T01:51:36Z

    TAJO-1766

commit 25a7bccf9be42ebea99e8fae4c202b5332d9310e
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-19T01:05:42Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1766

commit 96dd4329d1639615f4ffa65b9cf2562d0853e4e0
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-19T10:13:47Z

    add post logical plan verification

commit 5014b05277d9b79515fcff2d7253ad2eec2d650f
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-20T01:35:47Z

    improve error message

commit ccc6eba2f445f2b813322b4fb806b829bd0f3753
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-20T08:47:19Z

    Remove estimation code

commit 55dc2c6f3cfdd86ae10a1babcacc01b5e8bd6d03
Author: Jihoon Son <ji...@apache.org>
Date:   2015-08-20T08:48:33Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1766
    
    Conflicts:
    	tajo-core/src/main/java/org/apache/tajo/master/GlobalEngine.java

----


> Improve the performance of cross join
> -------------------------------------
>
>                 Key: TAJO-1766
>                 URL: https://issues.apache.org/jira/browse/TAJO-1766
>             Project: Tajo
>          Issue Type: Improvement
>          Components: distributed query plan
>            Reporter: Jihoon Son
>            Assignee: Jihoon Son
>             Fix For: 0.11.0
>
>
> A cross join is a very expensive operation. Furthermore, in the current implementation it is performed by a single worker. (Please see the implementation of HashPartitioner: if partitionKeyIds is empty, getPartition() always returns the same value.)
> One possible alternative is to execute the cross join as a broadcast join. That is, the outer table (the smaller one) is always broadcast, and the join is performed by each machine that stores a part of the inner table.
> To do so, a new session variable is required to set the broadcast threshold for cross joins.
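
The description above can be illustrated with a small, self-contained sketch. It is not Tajo's code: the class CrossJoinSketch, its methods, and the byte-based threshold check are hypothetical stand-ins chosen for illustration. The first method mimics a hash partitioner that maps every row to one partition when there are no key columns, and the other two show the broadcast alternative, where each worker joins its own inner-table fragment against a full copy of the small outer table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; none of these names come from the Tajo code base.
public class CrossJoinSketch {

    // Simplified hash partitioner. A cross join has no join keys, so
    // partitionKeyIds is empty, the hash stays 0, and every row lands in
    // partition 0. That is why a single worker ends up doing all the work.
    public static int getPartition(Object[] row, int[] partitionKeyIds, int numPartitions) {
        int hash = 0;
        for (int id : partitionKeyIds) {
            hash = 31 * hash + Objects.hashCode(row[id]);
        }
        return Math.floorMod(hash, numPartitions);
    }

    // Assumed policy: broadcast the outer table only if it fits under a
    // configurable threshold (the role of the proposed session variable).
    public static boolean canBroadcast(long outerTableBytes, long thresholdBytes) {
        return outerTableBytes <= thresholdBytes;
    }

    // Work done by one worker: join its local inner fragment against the
    // complete, broadcast copy of the outer table.
    public static List<Object[]> crossJoinFragment(List<Object[]> broadcastOuter,
                                                   List<Object[]> innerFragment) {
        List<Object[]> result = new ArrayList<>();
        for (Object[] inner : innerFragment) {
            for (Object[] outer : broadcastOuter) {
                Object[] joined = Arrays.copyOf(outer, outer.length + inner.length);
                System.arraycopy(inner, 0, joined, outer.length, inner.length);
                result.add(joined);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object[]> outer = new ArrayList<>();
        outer.add(new Object[] {"a"});
        outer.add(new Object[] {"b"});

        // Two workers, each holding a different fragment of the inner table.
        List<Object[]> fragment1 = new ArrayList<>();
        fragment1.add(new Object[] {1});
        fragment1.add(new Object[] {2});
        List<Object[]> fragment2 = new ArrayList<>();
        fragment2.add(new Object[] {3});

        int total = crossJoinFragment(outer, fragment1).size()
                  + crossJoinFragment(outer, fragment2).size();
        System.out.println("joined rows: " + total);
    }
}
```

Because every inner row appears in exactly one fragment and each fragment sees the whole outer table, the union of the per-worker results is the full cross product, and the work is spread across the workers that hold the inner table.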


