Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2008/01/29 18:42:34 UTC

[jira] Commented: (PIG-73) Re-Design of Logical Plan

    [ https://issues.apache.org/jira/browse/PIG-73?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12563590#action_12563590 ] 

Alan Gates commented on PIG-73:
-------------------------------

General comments:

Not sure I understand what LogicalOperator::opTable holds.  Are these
operators that a given operator will be feeding data to?

I'm not sure LogicalOperator should be holding the list of its inputs.  We may
want to rearrange the tree as part of optimization, and we don't want to have
to mess around inside the operators to do it.  Operators may need to hold
references to operators that can't get moved out of them (like filter holding
a reference to its condition), but not to just whatever operator happens to be
next or previous in the tree.  That should be held outside in a separate
structure.
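
As a rough illustration of keeping the edges outside the operators (this is
not taken from the attached diffs; OpSketch, PlanSketch, connect, disconnect
and insertBetween are all hypothetical names, and the operator is reduced to
a bare label):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical operator: it knows nothing about its neighbours in the tree.
final class OpSketch {
    final String name;
    OpSketch(String name) { this.name = name; }
    @Override public String toString() { return name; }
}

// Hypothetical container that owns every edge between operators.
final class PlanSketch {
    private final Map<OpSketch, List<OpSketch>> successors = new HashMap<>();
    private final Map<OpSketch, List<OpSketch>> predecessors = new HashMap<>();

    void connect(OpSketch from, OpSketch to) {
        successors.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        predecessors.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    void disconnect(OpSketch from, OpSketch to) {
        List<OpSketch> out = successors.get(from);
        if (out != null) out.remove(to);
        List<OpSketch> in = predecessors.get(to);
        if (in != null) in.remove(from);
    }

    List<OpSketch> getSuccessors(OpSketch op) {
        return successors.getOrDefault(op, Collections.emptyList());
    }

    List<OpSketch> getPredecessors(OpSketch op) {
        return predecessors.getOrDefault(op, Collections.emptyList());
    }

    // Rearranges the tree without touching the operators themselves.
    void insertBetween(OpSketch from, OpSketch inserted, OpSketch to) {
        disconnect(from, to);
        connect(from, inserted);
        connect(inserted, to);
    }
}
```

With this shape an optimizer only calls methods on the plan object; the
operators are never modified when they are moved.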

Why does ArithmeticOperator have its own type enum?  It should use the
values from DataType instead.

Why does boolean get its own constant class (LOConstantBooleanOperator)?  Shouldn't there just be one
Constant class, and it may be a constant of any type?
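
A minimal sketch of what a single constant class could look like.  TypeTag
here is only a stand-in for the shared type enumeration (it is not Pig's
DataType), and ConstantSketch is a hypothetical name:

```java
// Stand-in for the shared type enumeration; NOT Pig's actual DataType.
enum TypeTag { BOOLEAN, INTEGER, DOUBLE, CHARARRAY }

// Hypothetical single constant operator that can hold a value of any type,
// so no per-type subclass (such as a boolean-only constant) is needed.
final class ConstantSketch {
    private final TypeTag type;
    private final Object value;

    ConstantSketch(TypeTag type, Object value) {
        this.type = type;
        this.value = value;
    }

    TypeTag getType() { return type; }
    Object getValue() { return value; }
}
```

A boolean constant is then just new ConstantSketch(TypeTag.BOOLEAN, true),
and an integer constant uses the same class with a different tag.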

LogicalOperator should have a type field (I think we agreed on this already).

ArithmeticOperator should not extend Scalar, as conceptually you can do
arithmetic on complex types (scalar * tuple for example).

Type checking should be done via a visitor rather than via validateOperator
calls (I think we already agreed on this).
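
To make the visitor idea concrete, here is a small sketch.  All names
(SketchType, Expr, ConstExpr, AddExpr, ExprVisitor, TypeCheckVisitor) are
hypothetical, and the rule that integer plus double yields double is an
assumption made only for the example:

```java
// Stand-in type tags; not Pig's DataType.
enum SketchType { INTEGER, DOUBLE, BOOLEAN, UNKNOWN }

interface ExprVisitor {
    void visit(ConstExpr c);
    void visit(AddExpr a);
}

abstract class Expr {
    SketchType type = SketchType.UNKNOWN;
    abstract void accept(ExprVisitor v);
}

final class ConstExpr extends Expr {
    ConstExpr(SketchType t) { this.type = t; }
    @Override void accept(ExprVisitor v) { v.visit(this); }
}

final class AddExpr extends Expr {
    final Expr left;
    final Expr right;
    AddExpr(Expr left, Expr right) { this.left = left; this.right = right; }
    @Override void accept(ExprVisitor v) { v.visit(this); }
}

// The type-checking logic lives here, not in the operators.
final class TypeCheckVisitor implements ExprVisitor {
    @Override public void visit(ConstExpr c) {
        // Constants already carry their type; nothing to check.
    }

    @Override public void visit(AddExpr a) {
        // Check the children first so their types are known.
        a.left.accept(this);
        a.right.accept(this);
        if (a.left.type == SketchType.BOOLEAN || a.right.type == SketchType.BOOLEAN) {
            throw new IllegalStateException("cannot add boolean operands");
        }
        // Assumed widening rule: any double operand makes the result double.
        boolean anyDouble = a.left.type == SketchType.DOUBLE
                || a.right.type == SketchType.DOUBLE;
        a.type = anyDouble ? SketchType.DOUBLE : SketchType.INTEGER;
    }
}
```

New checks can then be added as new visitors without changing any operator
class.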

What is the value in separating out ArithmeticBinary from BooleanBinary
operators?  Both take 2 arguments, which is what's important about them being
binary.  The type they return can be handled separately.

Why is Filter's condition an OperatorKey instead of a LogicalOperator?  Is
this key used to look up in the opTable?  That seems an unnecessary level of
indirection.

What is a SinkOperator?

Given the desire to eventually allow scalars in the outer levels of the
language, it's not clear to me that operators such as LOFilter should extend
BagOperator.  These won't necessarily always return bags.  Plus I'm not sure
what the additional value of having them extend this is.  It seems to me we'd
want to change it so that all operators return a Schema instead.

I would suggest several changes:
1) Create a class to contain a tree of operators.  This way the operators
themselves don't have to worry about tree navigation or changes when we want
to move them around for optimization.  This logic can be held in the container
operator.

2) Flatten out the tree quite a bit.  Each of the existing relational
operators (LOStore, LOLoad, LOFilter, etc.) can directly extend
LogicalOperator.  Add an ExpressionOperator that also extends LogicalOperator.
There can then be classes BinaryLogicalOperator, UnaryLogicalOperator,
FunctionOperator, ConstantOperator, BinCondOperator, etc.  that extend
ExpressionOperator.  Some of these (e.g. BinaryLogicalOperator) will have
their own extenders.

3) Every operator should have a getSchema() method that returns a Schema.
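
Suggestions 2) and 3) could be sketched roughly as follows.  This is only an
illustration under assumptions: every class name carries a Sketch suffix to
mark it as hypothetical, the schema is reduced to a list of field names, and
the filter is handed the schema of whatever feeds it because it does not
hold a reference to its input.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical schema: just an ordered list of field names.
final class SchemaSketch {
    final List<String> fields;
    SchemaSketch(String... fields) {
        this.fields = Collections.unmodifiableList(Arrays.asList(fields));
    }
}

// Common base class: every operator can report a schema.
abstract class LogicalOperatorSketch {
    abstract SchemaSketch getSchema();
}

// Relational operators extend the base class directly.
final class LoadSketch extends LogicalOperatorSketch {
    private final SchemaSketch schema;
    LoadSketch(SchemaSketch schema) { this.schema = schema; }
    @Override SchemaSketch getSchema() { return schema; }
}

// Expression operators share one intermediate parent.
abstract class ExpressionOperatorSketch extends LogicalOperatorSketch { }

final class ConstantOperatorSketch extends ExpressionOperatorSketch {
    private final Object value;
    ConstantOperatorSketch(Object value) { this.value = value; }
    Object getValue() { return value; }
    // Assumed convention: a constant exposes a single field.
    @Override SchemaSketch getSchema() { return new SchemaSketch("constant"); }
}

// The filter keeps a direct reference to its condition, but not to its input.
final class FilterSketch extends LogicalOperatorSketch {
    private final ExpressionOperatorSketch condition;
    private final SchemaSketch schema;
    FilterSketch(ExpressionOperatorSketch condition, SchemaSketch inputSchema) {
        this.condition = condition;
        this.schema = inputSchema; // filtering does not change the schema
    }
    ExpressionOperatorSketch getCondition() { return condition; }
    @Override SchemaSketch getSchema() { return schema; }
}
```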

> Re-Design of Logical Plan
> -------------------------
>
>                 Key: PIG-73
>                 URL: https://issues.apache.org/jira/browse/PIG-73
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: diff.2008.01.23, diff.2008.01.24
>
>
> I am opening this new bug to track a specific work item within the broader context of improving Pig support for optimizations.
> See related items:
>    - PIG-50 in the jira
>    - the design spec at: http://wiki.apache.org/pig/PigExecutionModel
> In particular we want to remove from the logical plan those aspects that directly relate to the *execution* stage of a plan, hence improving decoupling. Currently EvalSpecs and Conds (and relative pipes) are tightly coupled with logical operators.
