Posted to dev@tajo.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2013/10/11 14:29:44 UTC
[jira] [Commented] (TAJO-184) Refactor GlobalPlanner and global plan data structure
[ https://issues.apache.org/jira/browse/TAJO-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792565#comment-13792565 ]
Hudson commented on TAJO-184:
-----------------------------
ABORTED: Integrated in Tajo-trunk-postcommit #507 (See [https://builds.apache.org/job/Tajo-trunk-postcommit/507/])
TAJO-242: Enable omitted broadcast join feature after TAJO-184. (hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=d01e47d3feaa7581024d2ac01760e52f1df9fab1)
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/GlobalPlanner.java
* tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/master/TestExecutionBlockCursor.java
* tajo-core/tajo-core-backend/benchmark/tpch/q9.tql
* tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
* CHANGES.txt
> Refactor GlobalPlanner and global plan data structure
> -----------------------------------------------------
>
> Key: TAJO-184
> URL: https://issues.apache.org/jira/browse/TAJO-184
> Project: Tajo
> Issue Type: Improvement
> Components: master, physical operator, planner/optimizer
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.2-incubating
>
> Attachments: TAJO-184_2.patch, TAJO-184.patch
>
>
> First of all, I'm sorry for submitting such a big patch. It broadly modifies and refactors the global planning, logical planning, and physical planning parts, and it was hard to split this issue into smaller ones.
> In particular, this patch rewrites GlobalPlanner and MasterPlan (the global plan data structure) as follows:
> * Removed GlobalPlanOptimizer
> * Added the DirectedGraph interface, the SimpleDirectedGraph concrete class, and a visitor class that traverses a graph in post-order.
> * Improved MasterPlan by using the new graph API
> ** Query block graphs and the execution block graph are represented by SimpleDirectedGraph.
> ** Now, we can traverse these graphs easily by using the graph APIs.
> ** Added the DataChannel class to represent a data flow between execution blocks.
> * MasterPlan.toString() prints a text graph that shows the relationships among execution blocks and the distributed plan.
> * Added a more sophisticated explain feature for distributed plans and logical plans. It is very useful for plan debugging.
> * Now, the limit operator is pushed down to the child execution block.
> ** So, the intermediate data volume of a sort query with a limit is reduced significantly.
> * TableSubQuery (inline view) is supported. It follows the SQL standard, so you can run a query such as:
> {code}
> SELECT *
> FROM
> (
>   SELECT
>     l_orderkey,
>     l_partkey,
>     url
>   FROM
>   (
>     SELECT
>       l_orderkey,
>       l_partkey,
>       CASE
>         WHEN l_partkey IS NOT NULL THEN ''
>         WHEN l_orderkey = 1 THEN '1'
>         ELSE '2'
>       END AS url
>     FROM
>       lineitem
>   ) res1
>   JOIN
>   (
>     SELECT
>       *
>     FROM
>       part
>   ) res2
>   ON l_partkey = p_partkey
> ) result
> {code}
> In addition, I've refactored as follows:
> * Column now has a qualifier name.
> * Improved Schema to deal with qualified column names.
> * When a TableDesc instance is retrieved, its columns are forced to have qualified names.
> * Fixed the TAJO-162 bug.
> * Lots of trivial improvements and refactorings.
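
For readers unfamiliar with the graph changes described above, the following is a minimal sketch of a directed graph with a post-order traversal, where child execution blocks are visited before their parent. It is only an illustration: the class and method names (PostOrderSketch, SimpleGraph, addEdge, getChildren, postOrder) and the execution block names eb1..eb4 are assumptions made for this example and are not taken from the TAJO-184 patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: these names are NOT the classes added by the patch.
public class PostOrderSketch {

  /** Minimal directed graph: each vertex maps to its ordered list of children. */
  public static class SimpleGraph<V> {
    private final Map<V, List<V>> children = new LinkedHashMap<>();

    /** Adds an edge from a parent vertex to a child vertex. */
    public void addEdge(V parent, V child) {
      children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
      children.computeIfAbsent(child, k -> new ArrayList<>());
    }

    public List<V> getChildren(V v) {
      return children.getOrDefault(v, new ArrayList<>());
    }
  }

  /** Returns the vertices reachable from root, children before parents. */
  public static <V> List<V> postOrder(SimpleGraph<V> graph, V root) {
    Set<V> visited = new LinkedHashSet<>();
    List<V> order = new ArrayList<>();
    visit(graph, root, visited, order);
    return order;
  }

  private static <V> void visit(SimpleGraph<V> g, V v, Set<V> visited, List<V> out) {
    if (!visited.add(v)) {
      return; // already visited through another parent
    }
    for (V child : g.getChildren(v)) {
      visit(g, child, visited, out);
    }
    out.add(v); // emit the vertex only after all of its children
  }

  public static void main(String[] args) {
    // Hypothetical plan: eb4 consumes eb3, which consumes eb1 and eb2.
    SimpleGraph<String> plan = new SimpleGraph<>();
    plan.addEdge("eb4", "eb3");
    plan.addEdge("eb3", "eb1");
    plan.addEdge("eb3", "eb2");
    System.out.println(postOrder(plan, "eb4")); // [eb1, eb2, eb3, eb4]
  }
}
```

A post-order walk fits a plan of execution blocks because a block's inputs are listed before the block that consumes them, which matches the order in which they would have to produce data.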