Posted to dev@hive.apache.org by "Ashish Thusoo (JIRA)" <ji...@apache.org> on 2008/12/18 03:36:46 UTC
[jira] Updated: (HIVE-186) Refactor code to use a single graph, nodeprocessor, dispatcher and rule abstraction

     [ https://issues.apache.org/jira/browse/HIVE-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-186:
-------------------------------

    Attachment: patch-186.txt

This patch cleans up and refactors all of the graph-walking and rule frameworks. The unified framework lives in the package

org.apache.hadoop.hive.ql.lib

Node is the interface that a graph's nodes must implement in order to use the graph walkers and rule dispatchers available within this framework. There are currently two implementations of this interface:

1. ASTNode - in ql.parse, a wrapper around the CommonTree classes of the ANTLR runtime.
2. Operator - in ql.exec, which implements the operator tree nodes.
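
To illustrate why a single Node abstraction is useful, here is a minimal sketch. It is not the code in the patch: the method names getChildren and getName are assumptions, and ToyAstNode, ToyOperator and NodeSketch are hypothetical stand-ins for the real classes. The point is only that one traversal routine can serve both kinds of tree.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the framework's Node interface (method names assumed).
interface Node {
    List<Node> getChildren();
    String getName();
}

// Toy AST node; the real ASTNode wraps the ANTLR CommonTree instead.
class ToyAstNode implements Node {
    private final String token;
    private final List<Node> children = new ArrayList<>();

    ToyAstNode(String token) { this.token = token; }

    ToyAstNode add(Node child) { children.add(child); return this; }

    @Override public List<Node> getChildren() { return children; }
    @Override public String getName() { return token; }
}

// Toy operator-tree node; the real Operator carries execution logic as well.
class ToyOperator implements Node {
    private final String opName;
    private final List<Node> children = new ArrayList<>();

    ToyOperator(String opName) { this.opName = opName; }

    ToyOperator add(Node child) { children.add(child); return this; }

    @Override public List<Node> getChildren() { return children; }
    @Override public String getName() { return opName; }
}

class NodeSketch {
    // Written once against Node, so it works for ASTs and operator trees alike.
    static int countNodes(Node root) {
        int count = 1;
        for (Node child : root.getChildren()) {
            count += countNodes(child);
        }
        return count;
    }
}
```

Calling NodeSketch.countNodes on either a ToyAstNode tree or a ToyOperator tree goes through the same code path, which is the reuse this refactoring is after.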

I have also removed the DefaultDispatcher implementation of the Dispatcher. The same functionality can be expressed using DefaultRuleDispatcher. Accordingly, I have cleaned up the GenMR* processors and the ColumnPruner to reflect these changes. ColumnPruner is also split into ColumnPrunerProcFactory, which creates the processors for the various rules needed therein, and ColumnPrunerProcCtx, an implementation of NodeProcessorCtx that carries the context information between rules.

I have gotten rid of all the classes related to the ASTs (ASTEvent, ASTDispatcher, ASTProcessor, ASTEventProcessor, etc.).

Nodes are processed by implementations of NodeProcessor. I have removed the reflection-based invocation that we were doing in the earlier DefaultDispatcher and DefaultRuleDispatcher. Now only a single process function is called, and the user has to implement a different processor for each rule (see ColumnPrunerProcFactory).
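
The processor-per-rule arrangement can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the actual API: all names here (Processor, ProcCtx, ToyPruneCtx, ToyProcFactory, ToyRuleDispatcher) are hypothetical, and a rule is reduced to an exact match on the node name, which is much cruder than a real rule abstraction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Minimal node type for this sketch.
interface Node {
    String getName();
}

class NamedNode implements Node {
    private final String name;
    NamedNode(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

// Marker for state shared between processors (assumed shape).
interface ProcCtx {}

// One method per processor; no reflection is needed to find it.
interface Processor {
    Object process(Node node, ProcCtx ctx);
}

// Context object carried from one rule's processor to the next.
class ToyPruneCtx implements ProcCtx {
    final List<String> log = new ArrayList<>();
}

// Factory that hands out one processor per rule.
class ToyProcFactory {
    static Processor selectProc() {
        return (node, ctx) -> {
            ((ToyPruneCtx) ctx).log.add("select:" + node.getName());
            return null;
        };
    }

    static Processor defaultProc() {
        return (node, ctx) -> {
            ((ToyPruneCtx) ctx).log.add("default:" + node.getName());
            return null;
        };
    }
}

// Picks the processor whose rule matches the node, else the default one.
class ToyRuleDispatcher {
    private final Map<String, Processor> rules;
    private final Processor defaultProc;
    private final ProcCtx ctx;

    ToyRuleDispatcher(Map<String, Processor> rules, Processor defaultProc, ProcCtx ctx) {
        this.rules = rules;
        this.defaultProc = defaultProc;
        this.ctx = ctx;
    }

    Object dispatch(Node node) {
        Processor p = rules.getOrDefault(node.getName(), defaultProc);
        return p.process(node, ctx);
    }
}
```

With only a default processor and no rules registered, such a dispatcher treats every node the same way, which is one way to see how a rule dispatcher can cover what a plain dispatcher did.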

The walker interface has been renamed to GraphWalker and the default implementation is now called DefaultGraphWalker. I have also eliminated the TopoWalker. DefaultGraphWalker is no longer an abstract class, so clients can use it right out of the box. The ColumnPrunerWalker and the GenMapRedWalker are still subclasses of the DefaultGraphWalker.
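
A rough sketch of the walker arrangement, again with hypothetical names (Walker, ToyDefaultWalker, ToyPreOrderWalker) and an assumed traversal order rather than the order the real walkers use: a concrete default walker that is usable as is, plus a subclass that overrides only the traversal step.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

interface Node {
    List<Node> getChildren();
    String getName();
}

class SimpleNode implements Node {
    private final String name;
    private final List<Node> children = new ArrayList<>();

    SimpleNode(String name) { this.name = name; }

    SimpleNode add(Node child) { children.add(child); return this; }

    @Override public List<Node> getChildren() { return children; }
    @Override public String getName() { return name; }
}

interface Dispatcher {
    void dispatch(Node node);
}

interface Walker {
    void startWalking(List<Node> startNodes);
}

// Concrete default walker: clients can instantiate it directly.
class ToyDefaultWalker implements Walker {
    protected final Dispatcher dispatcher;
    protected final Set<Node> seen = new HashSet<>();

    ToyDefaultWalker(Dispatcher dispatcher) { this.dispatcher = dispatcher; }

    @Override public void startWalking(List<Node> startNodes) {
        for (Node node : startNodes) {
            walk(node);
        }
    }

    // Assumed order: children first, then the node itself.
    // Each node is dispatched at most once, even if shared in the graph.
    protected void walk(Node node) {
        if (!seen.add(node)) {
            return;
        }
        for (Node child : node.getChildren()) {
            walk(child);
        }
        dispatcher.dispatch(node);
    }
}

// Specialised walker: same entry point, different traversal order.
class ToyPreOrderWalker extends ToyDefaultWalker {
    ToyPreOrderWalker(Dispatcher dispatcher) { super(dispatcher); }

    @Override protected void walk(Node node) {
        if (!seen.add(node)) {
            return;
        }
        dispatcher.dispatch(node);
        for (Node child : node.getChildren()) {
            walk(child);
        }
    }
}
```

Keeping the default walker concrete means a client that needs nothing special can pass it a dispatcher and start walking, while specialised walkers only override the walk step.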


> Refactor code to use a single graph, nodeprocessor, dispatcher and rule abstraction
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-186
>                 URL: https://issues.apache.org/jira/browse/HIVE-186
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: patch-186.txt
>
>
> Currently, the query processor has two different tree and rule abstractions - one for ASTs and one for Operator Graphs. We should clean this up so that we have a single abstraction that can be reused at different stages in the query compiler.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.