Posted to issues@tajo.apache.org by "Jihoon Son (JIRA)" <ji...@apache.org> on 2015/05/30 09:42:17 UTC

[jira] [Created] (TAJO-1632) Enable broadcast join planning for outer joins

Jihoon Son created TAJO-1632:
--------------------------------

             Summary: Enable broadcast join planning for outer joins
                 Key: TAJO-1632
                 URL: https://issues.apache.org/jira/browse/TAJO-1632
             Project: Tajo
          Issue Type: Improvement
          Components: distributed query plan
            Reporter: Jihoon Son
             Fix For: 0.11.0


TAJO-1553 was recently resolved to improve broadcast join planning, but it has a limitation for outer joins. That is, _for outer joins, preserved-row relations are not broadcastable to avoid input data duplication._ This rule can limit broadcast join opportunities. Consider the following query as an example.

{noformat}
select * from a left outer join b left outer join c
(a, b, and c are each small enough to be broadcast.)
{noformat}

Please note that two consecutive left outer joins are associative. That is, their execution order can be changed without changing the result. Thus, the candidate query plans are as follows (LOJ is short for left outer join).

1)
{noformat}
      LOJ
     /   \
  LOJ     c
 /   \
a     b
{noformat}

2)
{noformat}
  LOJ
 /   \
a     LOJ
     /   \
    b     c
{noformat}
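
The associativity claim above can be checked on toy data. The following is only a minimal sketch: the example query omits its join conditions, so the keys a_id = b_id and b_id = c_id, the sample rows, and the helper left_outer_join are all assumptions made for illustration. The b-to-c condition is written so that it rejects NULLs, which is the case in which reordering is safe.

```python
def left_outer_join(left, right, on):
    """Nested-loop left outer join over lists of dict rows (illustrative helper)."""
    right_cols = sorted({col for row in right for col in row})
    out = []
    for l in left:
        matches = [r for r in right if on(l, r)]
        if matches:
            out.extend({**l, **r} for r in matches)
        else:
            # No match: keep the preserved row and pad the right side with NULLs.
            out.append({**l, **{c: None for c in right_cols}})
    return out

# Hypothetical sample relations; the column names are assumptions.
a = [{"a_id": 1}, {"a_id": 2}, {"a_id": 3}]
b = [{"b_id": 1}, {"b_id": 2}]
c = [{"c_id": 1}]

# Assumed join conditions; both reject NULL keys.
def a_b(l, r):
    return l["a_id"] is not None and l["a_id"] == r["b_id"]

def b_c(l, r):
    return l["b_id"] is not None and l["b_id"] == r["c_id"]

# Plan 1): (a LOJ b) LOJ c
plan1 = left_outer_join(left_outer_join(a, b, a_b), c, b_c)
# Plan 2): a LOJ (b LOJ c)
plan2 = left_outer_join(a, left_outer_join(b, c, b_c), a_b)

def canonical(rows):
    """Order-independent form of a result set, for comparison."""
    return sorted((row["a_id"], row["b_id"], row["c_id"]) for row in rows)

print(canonical(plan1) == canonical(plan2))
```

Both plans keep every row of a and pad the missing b and c columns with NULLs, so the two results coincide on this data.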

In query plan 1), only *a* is a preserved-row relation. Thus, if query plan 1) is selected, our current broadcast join planner turns the entire query plan into a single execution block with *b* and *c* as broadcast relations.

In contrast, if query plan 2) is selected, it is executed as two execution blocks, each of which performs a left outer join, because *c* is the only relation that is not preserved-row and is thus the only broadcastable one.

Because this limitation depends on the form of the selected query plan, it can degrade the performance of outer join processing.
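
The difference between the two plans can be reproduced with a small model of the rule quoted above. This is a sketch under stated assumptions, not Tajo's planner code: the classes Rel and LOJ and the functions preserved, leaves, and broadcastable are hypothetical names introduced here, and the model only covers left outer joins over relations that are all small enough to broadcast.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Rel:
    """A base relation (leaf of the join tree)."""
    name: str

@dataclass(frozen=True)
class LOJ:
    """A left outer join; the left input is the preserved-row side."""
    left: "Node"
    right: "Node"

Node = Union[Rel, LOJ]

def preserved(node):
    """Relations whose rows are preserved in the output of this subtree."""
    if isinstance(node, Rel):
        return {node.name}
    return preserved(node.left)

def leaves(node):
    """All base relations under this subtree."""
    if isinstance(node, Rel):
        return {node.name}
    return leaves(node.left) | leaves(node.right)

def broadcastable(plan):
    """Relations that are not preserved-row for any join in the plan."""
    blocked = set()
    stack = [plan]
    while stack:
        node = stack.pop()
        if isinstance(node, LOJ):
            blocked |= preserved(node.left)
            stack += [node.left, node.right]
    return leaves(plan) - blocked

a, b, c = Rel("a"), Rel("b"), Rel("c")
plan1 = LOJ(LOJ(a, b), c)   # plan 1): (a LOJ b) LOJ c
plan2 = LOJ(a, LOJ(b, c))   # plan 2): a LOJ (b LOJ c)

print(sorted(broadcastable(plan1)))  # ['b', 'c']
print(sorted(broadcastable(plan2)))  # ['c']
```

Under this model, plan 1) leaves two relations broadcastable while plan 2) leaves only one, even though both plans produce the same result.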


