Posted to dev@tajo.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2013/10/11 14:29:44 UTC

[jira] [Commented] (TAJO-184) Refactor GlobalPlanner and global plan data structure

    [ https://issues.apache.org/jira/browse/TAJO-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792565#comment-13792565 ] 

Hudson commented on TAJO-184:
-----------------------------

ABORTED: Integrated in Tajo-trunk-postcommit #507 (See [https://builds.apache.org/job/Tajo-trunk-postcommit/507/])
TAJO-242: Enable omitted broadcast join feature after TAJO-184. (hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=d01e47d3feaa7581024d2ac01760e52f1df9fab1)
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/GlobalPlanner.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java
* tajo-core/tajo-core-backend/benchmark/tpch/q9.tql
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
* CHANGES.txt


> Refactor GlobalPlanner and global plan data structure
> -----------------------------------------------------
>
>                 Key: TAJO-184
>                 URL: https://issues.apache.org/jira/browse/TAJO-184
>             Project: Tajo
>          Issue Type: Improvement
>          Components: master, physical operator, planner/optimizer
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>             Fix For: 0.2-incubating
>
>         Attachments: TAJO-184_2.patch, TAJO-184.patch
>
>
> First of all, I'm sorry for submitting such a big patch. This patch broadly modifies and refactors the global planning, logical planning, and physical planning parts. It was hard to separate this issue into smaller issues.
> In particular, this patch rewrites GlobalPlanner and MasterPlan (the global plan data structure) as follows:
>  * Removed GlobalPlanOptimizer
>  * Added a DirectedGraph interface, a SimpleDirectedGraph concrete class, and a visitor class that traverses a graph in post-order.
>  * Improved MasterPlan by using the new graph API
>   ** Query block graphs and the execution block graph are represented by SimpleDirectedGraph.
>   ** Now, we can traverse these graphs easily by using the graph APIs.
>   ** Added DataChannel class to represent a data flow between execution blocks.
>  * MasterPlan.toString() prints a text graph to represent relationships among execution blocks and a distributed plan.
>  * Added a more sophisticated explain feature for distributed plans and logical plans. It is very useful for plan debugging.
>  * Now, the limit operator is pushed down to the child execution block.
>   ** So, the intermediate data volume of a sort query with limit is reduced significantly.
>  * TableSubQuery (inline view) is supported. It follows the SQL standard, so you can run a query such as the following:
> {code}
> SELECT *
> FROM
> (
>     SELECT
>         l_orderkey,
>         l_partkey,
>         url
>     FROM
>         (
>           SELECT
>             l_orderkey,
>             l_partkey,
>             CASE
>               WHEN
>                 l_partkey IS NOT NULL THEN ''
>               WHEN l_orderkey = 1 THEN '1'
>             ELSE
>               '2'
>             END AS url
>           FROM
>             lineitem
>         ) res1
>         JOIN
>         (
>           SELECT
>             *
>           FROM
>             part
>         ) res2
>         ON l_partkey = p_partkey
> ) result
> {code}
> In addition, I've refactored as follows:
>  * Column has a qualifier name.
>  * Improved Schema to deal with qualified column names
>  * When a TableDesc instance is retrieved, it is forced to have qualified columns.
>  * Fixed the TAJO-162 bug.
>  * Many minor improvements and refactorings.
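
The description above mentions a DirectedGraph interface, a SimpleDirectedGraph class, and a visitor that walks a graph in post-order. The following is only a minimal sketch of that idea, not the code from the patch: every name here (GraphSketch, SimpleGraphSketch, GraphVisitorSketch, PostOrderDemo, and the vertex labels) is hypothetical, and the real Tajo API may differ in its method names and signatures. It assumes the graph is tree-shaped, with edges pointing from a child execution block to its parent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interfaces for illustration; not Tajo's actual DirectedGraph API.
interface GraphSketch<V> {
    void addEdge(V child, V parent);   // edge points from a child to its parent
    List<V> getChildren(V parent);
}

interface GraphVisitorSketch<V> {
    void visit(V vertex);
}

class SimpleGraphSketch<V> implements GraphSketch<V> {
    // parent -> children; insertion-ordered so the traversal is deterministic
    private final Map<V, List<V>> children = new LinkedHashMap<>();

    @Override
    public void addEdge(V child, V parent) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    @Override
    public List<V> getChildren(V parent) {
        return children.getOrDefault(parent, Collections.emptyList());
    }

    // Post-order: visit every child subtree first, then the vertex itself.
    // Assumes a tree; a vertex with several parents would be visited repeatedly.
    public void accept(V root, GraphVisitorSketch<V> visitor) {
        for (V child : getChildren(root)) {
            accept(child, visitor);
        }
        visitor.visit(root);
    }
}

class PostOrderDemo {
    // Builds a tiny plan-like tree (labels are made up) and returns the visit order.
    static List<String> order() {
        SimpleGraphSketch<String> g = new SimpleGraphSketch<>();
        g.addEdge("scan_lineitem", "join");
        g.addEdge("scan_part", "join");
        g.addEdge("join", "root");
        List<String> visited = new ArrayList<>();
        g.accept("root", visited::add);
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(order());
    }
}
```

In this sketch the leaves are visited before the blocks that consume their output, so running it prints [scan_lineitem, scan_part, join, root]. That ordering is what makes a post-order walk a natural fit for processing child execution blocks before their parents.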



--
This message was sent by Atlassian JIRA
(v6.1#6144)