Posted to dev@tajo.apache.org by "Hyunsik Choi (JIRA)" <ji...@apache.org> on 2013/09/16 13:35:51 UTC
[jira] [Resolved] (TAJO-184) Refactor GlobalPlanner and global plan data structure
[ https://issues.apache.org/jira/browse/TAJO-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyunsik Choi resolved TAJO-184.
-------------------------------
Resolution: Fixed
I've committed this patch. Thank you for the review!
> Refactor GlobalPlanner and global plan data structure
> -----------------------------------------------------
>
> Key: TAJO-184
> URL: https://issues.apache.org/jira/browse/TAJO-184
> Project: Tajo
> Issue Type: Improvement
> Components: master, physical operator, planner/optimizer
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.2-incubating
>
> Attachments: TAJO-184_2.patch, TAJO-184.patch
>
>
> Above all, I'm sorry for submitting such a big patch. It broadly modifies and refactors the global planning, logical planning, and physical planning parts, and it was hard to split this issue into smaller ones.
> In particular, this patch rewrites GlobalPlanner and MasterPlan (the global plan data structure) as follows:
> * Removed GlobalPlanOptimizer
> * Added a DirectedGraph interface, a SimpleDirectedGraph concrete class, and a visitor class that traverses a graph in post-order.
> * Improved MasterPlan by using the new graph API
> ** Query block graphs and an execution block graph are represented by SimpleDirectedGraph.
> ** Now, we can traverse these graphs easily by using the graph APIs.
> ** Added a DataChannel class to represent a data flow between execution blocks.
> * MasterPlan.toString() prints a text graph that represents the relationships among execution blocks and a distributed plan.
> * Added a more sophisticated explain feature for distributed plans and logical plans. It is very useful for plan debugging.
> * Now, the limit operator is pushed down to the child execution block.
> ** So, the intermediate data volume of a sort query with a limit is reduced significantly.
> * TableSubQuery (inline view) is supported. It follows the SQL standard, so you can run a query like the following:
> {code}
> SELECT *
> FROM
> (
>   SELECT
>     l_orderkey,
>     l_partkey,
>     url
>   FROM
>   (
>     SELECT
>       l_orderkey,
>       l_partkey,
>       CASE
>         WHEN l_partkey IS NOT NULL THEN ''
>         WHEN l_orderkey = 1 THEN '1'
>         ELSE '2'
>       END AS url
>     FROM
>       lineitem
>   ) res1
>   JOIN
>   (
>     SELECT
>       *
>     FROM
>       part
>   ) res2
>   ON l_partkey = p_partkey
> ) result
> {code}
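For readers unfamiliar with a graph-based plan representation like the one described above, here is a minimal Java sketch of a directed graph with a post-order traversal. All names in it (PostOrderSketch, SketchGraph, connect, postOrder) are hypothetical and are not taken from the patch; the actual DirectedGraph and SimpleDirectedGraph APIs may well differ. The sketch also assumes the graph is tree-shaped, so each block has exactly one parent.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: these are not Tajo's DirectedGraph or
// SimpleDirectedGraph classes.
public class PostOrderSketch {

  /** Minimal directed graph; each edge links a child block to its parent. */
  public static class SketchGraph<V> {
    // parent -> children, in insertion order
    private final Map<V, List<V>> childrenOf = new LinkedHashMap<>();

    public void connect(V child, V parent) {
      childrenOf.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
      childrenOf.computeIfAbsent(child, k -> new ArrayList<>());
    }

    public List<V> children(V v) {
      return childrenOf.getOrDefault(v, new ArrayList<>());
    }

    /** Returns the vertices reachable from root, children before parents. */
    public List<V> postOrder(V root) {
      List<V> out = new ArrayList<>();
      visit(root, out);
      return out;
    }

    // Assumes a tree: a vertex shared by two parents would be visited twice.
    private void visit(V v, List<V> out) {
      for (V child : children(v)) {
        visit(child, out);
      }
      out.add(v);
    }
  }

  /** Builds a three-level plan: two scans feeding a join feeding the root. */
  public static List<String> demo() {
    SketchGraph<String> g = new SketchGraph<>();
    g.connect("scan-lineitem", "join");
    g.connect("scan-part", "join");
    g.connect("join", "root");
    return g.postOrder("root");
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

A post-order walk is a natural fit for this kind of plan because a parent block consumes the output of its children, so the children have to be handled first.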
> In addition, I've made the following refactorings:
> * Column now has a qualifier name.
> * Improved Schema to deal with qualified column names.
> * When a TableDesc instance is retrieved, it is forced to have qualified columns.
> * Fixed the TAJO-162 bug.
> * Lots of trivial improvements and refactorings.
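To illustrate what handling qualified column names in a schema can look like, here is a minimal Java sketch. The names (QualifiedSchemaSketch, Col, add, resolve) are hypothetical and the lookup rules are an assumption made for illustration; they are not the Column, Schema, or TableDesc code from the patch. In this sketch a reference of the form "qualifier.name" must match exactly, while a bare name resolves only if it is unambiguous.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not Tajo's Column or Schema classes.
public class QualifiedSchemaSketch {

  /** A column identified by a qualifier (table name or alias) and a simple name. */
  static final class Col {
    final String qualifier;
    final String name;

    Col(String qualifier, String name) {
      this.qualifier = qualifier;
      this.name = name;
    }

    String qualifiedName() {
      return qualifier + "." + name;
    }
  }

  private final List<Col> columns = new ArrayList<>();

  public void add(String qualifier, String name) {
    columns.add(new Col(qualifier, name));
  }

  /**
   * Resolves a column reference to its qualified name.
   * A reference containing a dot is matched against qualified names;
   * a bare name is matched against simple names and must be unique.
   */
  public String resolve(String ref) {
    boolean qualified = ref.indexOf('.') >= 0;
    String match = null;
    for (Col c : columns) {
      boolean hit = qualified ? c.qualifiedName().equals(ref) : c.name.equals(ref);
      if (hit) {
        if (match != null) {
          throw new IllegalArgumentException("ambiguous column: " + ref);
        }
        match = c.qualifiedName();
      }
    }
    if (match == null) {
      throw new IllegalArgumentException("unknown column: " + ref);
    }
    return match;
  }

  public static void main(String[] args) {
    QualifiedSchemaSketch schema = new QualifiedSchemaSketch();
    schema.add("res1", "l_partkey");
    schema.add("res2", "p_partkey");
    System.out.println(schema.resolve("p_partkey"));
    System.out.println(schema.resolve("res1.l_partkey"));
  }
}
```

Keeping the qualifier on every column is what lets a join of two subqueries, such as res1 and res2 in the query above, refer to columns from either side without name clashes.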