Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2008/06/10 19:32:44 UTC

[jira] Created: (PIG-262) Pig needs an optimizer

Pig needs an optimizer
----------------------

                 Key: PIG-262
                 URL: https://issues.apache.org/jira/browse/PIG-262
             Project: Pig
          Issue Type: New Feature
          Components: impl
            Reporter: Alan Gates
            Assignee: Alan Gates


We need to add an optimizer to pig.  This will enable us to do some traditional optimizations, such as filter and projection pushing, join order and execution choices, etc.  It will also enable optimizations specific to map/reduce (such as using the combiner).

The optimizer will need to operate at various levels, including the logical, physical, and possibly map/reduce plan levels.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609578#action_12609578 ] 

Pi Song commented on PIG-262:
-----------------------------

I have a feeling that, in a perfect world, the plan optimizer (or rewriter, to be more general) and the validator should be combined into a single framework.

I look at the "logical plan compilation process" as analogous to data going through operators.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610525#action_12610525 ] 

Pi Song commented on PIG-262:
-----------------------------

findAll() sounds good, but we have to be more careful when doing transformations. Graph rewriting is naturally non-deterministic; that is, the result can differ depending on which part of the graph you start from. Also, a transformed part of the graph can match the initial rule again in later matching iterations.

BTW, let's say "good to have"



[jira] Updated: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pi Song updated PIG-262:
------------------------

    Comment: was deleted



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604517#action_12604517 ] 

Shravan Matthur Narayanamurthy commented on PIG-262:
----------------------------------------------------

Neat!!

I just had a couple of minor comments

1) In the java doc you say that PlanOptimizer need not be subclassed but you do so for the LogicalOptimizer. Wasn't just a plan optimizer with LogicalTransformer sufficient?
2) The rule in the LogicalOptimizer is wrong.  Line 43 in LogicalOptimizer should be nodes.add(LOLoad.class.getName());
3) Like Pi said, we are entirely trusting the optimizer to insert the right types. I guess we need to run two passes of the type checker: once before the optimizer and once after.
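
To make point 2 concrete: the line quoted there suggests that a rule is an ordered list of operator class names which the matcher compares against the nodes of a plan. The following is a minimal sketch under that assumption, not the code in optimizer.patch. SketchLoad, SketchFilter, RuleSketch, loadRule and matchesAt are hypothetical names, and the plan is simplified to a linear chain of operators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for plan operators; these are NOT Pig's real classes.
class SketchLoad {}
class SketchFilter {}

final class RuleSketch {
    // A rule is modeled as an ordered list of fully qualified class names.
    static List<String> loadRule() {
        List<String> nodes = new ArrayList<>();
        nodes.add(SketchLoad.class.getName());
        return nodes;
    }

    // True if the operators starting at index `start` have exactly the
    // classes the rule names, in order.
    static boolean matchesAt(List<Object> chain, int start, List<String> rule) {
        if (start < 0 || start + rule.size() > chain.size()) {
            return false;
        }
        for (int i = 0; i < rule.size(); i++) {
            String actual = chain.get(start + i).getClass().getName();
            if (!actual.equals(rule.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With a rule built from the wrong class name, matchesAt would simply never fire on a load node, which is the kind of mistake point 2 describes.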



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605239#action_12605239 ] 

Pi Song commented on PIG-262:
-----------------------------

Just to say, having the transformer that inserts casts run before validation and the real optimizer run after validation seems alright to me.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609877#action_12609877 ] 

Pi Song commented on PIG-262:
-----------------------------

Anyway, at this stage, having to interleave two frameworks in the code should be fine. I want to see the new engine working as soon as possible.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604893#action_12604893 ] 

Alan Gates commented on PIG-262:
--------------------------------

If we feel that strongly that we need to insert the casts before type checking, I think I should just put it in the validator framework.  There's no reason to split the optimizer over it, when the validator can handle it.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612595#action_12612595 ] 

Daniel Dai commented on PIG-262:
--------------------------------

For Pi's concern:
1. Yes, it's better to refactor since there's a lot of repeated code.
2. It can be done, but we need to change all the LOXXXX classes to add a new member visit(RuleMatcherVisitor). I can change it if that is better.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604553#action_12604553 ] 

Alan Gates commented on PIG-262:
--------------------------------

In response to Shravan's comment:  "In the java doc you say that PlanOptimizer need not be subclassed but you do so for the LogicalOptimizer. Wasn't just a plan optimizer with LogicalTransformer sufficient?"  I decided it was a little cleaner to put all the rules in one place, so that I could instantiate them for testing and the real code in the same way.  Otherwise, every time I defined the logical optimizer, I'd have to define all the rules.  I should update the javadoc comments.

In response to the type checker concerns:  I went back and forth on this, and chose to put the cast insertion in the optimizer because it actually transforms the tree, while the validators mostly check the tree.  I could put this in the validators in front of the type checker (that's where I originally had it).  Since the cast I'm inserting just mimics whatever the load in front of it does, hopefully it won't get it wrong (i.e., we don't need to typecheck again).  One way or another the optimizer will be rearranging the tree, which will mean inserting some casts, etc.

Also, I realized I neglected to include tests that actually use this new functionality.  I'll be adding those before I submit this patch.





[jira] Updated: (PIG-262) Pig needs an optimizer

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-262:
---------------------------

    Issue Type: Sub-task  (was: New Feature)
        Parent: PIG-157



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609619#action_12609619 ] 

Alan Gates commented on PIG-262:
--------------------------------

I think I agree, though I don't want to start mixing up those frameworks right at the moment.  I want to focus on getting what we have working first.  They are slightly different: one focuses on checking the tree while the other focuses on rearranging it.  But both end up adding nodes to the plan, and thus they will both need tools to patch up graph connections and schemas after these changes.  It's not clear whether we should maintain the distinction and have a framework that can run both, or whether we should merge the two into one class.  I think the former, but I'm not sure.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12610321#action_12610321 ] 

Daniel Dai commented on PIG-262:
--------------------------------

Currently the optimizer only finds the first match per plan per rule, but sometimes we need to find all matches in a plan. I think we need to add a new function, RuleMatcher.findAll(). It would find all nodes in a plan that match a rule, based on the current snapshot.
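
The difference between the two behaviours can be sketched roughly as follows. This is only an illustration, not the proposed RuleMatcher code: the plan is simplified to a linear list of operator names, a rule is a sequence of names that must appear consecutively, and FindAllSketch, findFirst and findAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class FindAllSketch {
    // Index of the first position where the rule matches, or -1 if none.
    static int findFirst(List<String> plan, List<String> rule) {
        for (int i = 0; i + rule.size() <= plan.size(); i++) {
            if (plan.subList(i, i + rule.size()).equals(rule)) {
                return i;
            }
        }
        return -1;
    }

    // Every position where the rule matches in the plan as it currently
    // stands; nothing is transformed while the matches are collected.
    static List<Integer> findAll(List<String> plan, List<String> rule) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i + rule.size() <= plan.size(); i++) {
            if (plan.subList(i, i + rule.size()).equals(rule)) {
                matches.add(i);
            }
        }
        return matches;
    }
}
```

For a plan Load, Filter, Foreach, Filter, Store and a one-element rule Filter, findFirst stops at the first filter, while findAll reports both.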



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604910#action_12604910 ] 

Pi Song commented on PIG-262:
-----------------------------

You'll have it on Monday!!



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604915#action_12604915 ] 

Pi Song commented on PIG-262:
-----------------------------

Wait a minute, I think interleaving the transformer and validator is logically correct.
Let's think about the way we process our plans as data that goes through a number of stages. In some stages you might want to transform, but in others you might just want to validate.  We just don't have it modeled as a graph right now, but that doesn't matter.
Our plan processing is not complex (from what I can see it is very linear). We just have to interleave the transformer and validator in the right order manually in the code!



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604685#action_12604685 ] 

Pi Song commented on PIG-262:
-----------------------------

I think the solution might not look elegant, but we have to split the optimizer logic into two parts. It will look like:

- Call the optimizer to insert the load schema cast
- Call the validator
- Call the real optimizer

because it just happens that we also use the optimizer for inserting the schema cast.
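
The ordering above can be sketched as a linear pipeline of stages. This is a toy model under stated assumptions, not Pig code: the plan is a list of operator names, the validation rule (every Load must be followed by a Cast) is invented for the example, the "real optimizer" is a no-op placeholder, and PlanPipelineSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class PlanPipelineSketch {
    // Runs each stage in order, feeding the output of one into the next.
    static List<String> run(List<String> plan, List<UnaryOperator<List<String>>> stages) {
        List<String> current = plan;
        for (UnaryOperator<List<String>> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    // Transformer stage: returns a new plan with a Cast after every Load.
    static List<String> insertLoadCasts(List<String> plan) {
        List<String> out = new ArrayList<>();
        for (String op : plan) {
            out.add(op);
            if (op.equals("Load")) {
                out.add("Cast");
            }
        }
        return out;
    }

    // Validator stage: checks the plan and returns it unchanged.
    // The rule checked here is a toy rule made up for this sketch.
    static List<String> validate(List<String> plan) {
        for (int i = 0; i < plan.size(); i++) {
            boolean isLoad = plan.get(i).equals("Load");
            boolean castFollows = i + 1 < plan.size() && plan.get(i + 1).equals("Cast");
            if (isLoad && !castFollows) {
                throw new IllegalStateException("Load at " + i + " is not followed by Cast");
            }
        }
        return plan;
    }

    // Placeholder for the real optimizer; it does nothing in this sketch.
    static List<String> optimize(List<String> plan) {
        return plan;
    }

    // The three calls from the list above, interleaved by hand.
    static List<String> compile(List<String> plan) {
        List<UnaryOperator<List<String>>> stages = new ArrayList<>();
        stages.add(PlanPipelineSketch::insertLoadCasts);
        stages.add(PlanPipelineSketch::validate);
        stages.add(PlanPipelineSketch::optimize);
        return run(plan, stages);
    }
}
```

Swapping the first two stages makes validate fail on any plan that contains a Load, which is why the cast insertion has to come first in this arrangement.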



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12612486#action_12612486 ] 

Pi Song commented on PIG-262:
-----------------------------

Comments:-
1. LogicalTransformer.removeFromChain() seems to duplicate the code in insertBetween. It also doesn't handle internal links within LOCOGroup.
2. We've got DependencyOrderWalker and DFSWalker implementation classes available. Isn't it better to reuse them?



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609272#action_12609272 ] 

Alan Gates commented on PIG-262:
--------------------------------

Checked in a modified version of optimizer.patch.  I'm still open to moving the type cast insertion functionality to the validator, as we discussed.  But I've checked it in with the functionality in the optimizer, mainly to preserve the optimizer framework and give an example of how to use it.  Once we have additional optimizer rules, it will be easier to relocate this.



[jira] Commented: (PIG-262) Pig needs an optimizer

Posted by "Pi Song (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604489#action_12604489 ] 

Pi Song commented on PIG-262:
-----------------------------

Very clean!

Seems to me this load type cast optimizer may have to go before type checking.



[jira] Updated: (PIG-262) Pig needs an optimizer

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-262:
---------------------------

    Attachment: optimizer.patch

A beginning of the logical optimizer.  This includes just the logical optimizer, with only one rule: insert type casts when the user has requested them or the data is self-describing.



[jira] Updated: (PIG-262) Pig needs an optimizer

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-262:
---------------------------

    Attachment: optimizer2.patch

It is true we should be careful when doing transformations, but I still believe we need it because many optimizations depend on it. Let's define it more clearly to alleviate the non-determinism problem:
1. All matches are taken from the snapshot before any transformation
2. We use a well-defined tree walker algorithm: depth-first, dependency-order, etc.
3. For every node visited, check all possible matches starting from that node
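
As a rough illustration of these three points (a sketch only, not the code in the attached patch), the example here walks a small graph depth-first, records every match before changing anything, and only then transforms the recorded nodes. The transformation deliberately creates new nodes that would match the rule again; because they are not in the snapshot, they are not rewritten a second time. SnapshotMatchSketch, Node, splitNode and the other names are hypothetical, and the rule is simplified to matching a single node kind.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SnapshotMatchSketch {
    // Minimal plan node; identity-based equality, so each node is distinct.
    static final class Node {
        final String kind;
        final List<Node> successors = new ArrayList<>();
        Node(String kind) { this.kind = kind; }
    }

    // Point 2: a fixed depth-first walk, so the visiting order is repeatable.
    static void depthFirst(Node node, Set<Node> visited) {
        if (!visited.add(node)) {
            return; // already seen via another path
        }
        for (Node next : node.successors) {
            depthFirst(next, visited);
        }
    }

    // Points 1 and 3: test every visited node against the rule and record
    // all matches before any transformation runs.
    static List<Node> snapshotMatches(List<Node> roots, String ruleKind) {
        Set<Node> visited = new LinkedHashSet<>();
        for (Node root : roots) {
            depthFirst(root, visited);
        }
        List<Node> matches = new ArrayList<>();
        for (Node n : visited) {
            if (n.kind.equals(ruleKind)) {
                matches.add(n);
            }
        }
        return matches;
    }

    // Toy transformation: put a new node of the SAME kind directly after the
    // matched node, so the rewritten graph contains fresh matches.
    static void splitNode(Node matched) {
        Node copy = new Node(matched.kind);
        copy.successors.addAll(matched.successors);
        matched.successors.clear();
        matched.successors.add(copy);
    }

    // Transforms only the nodes in the snapshot; returns how many were rewritten.
    static int applyToSnapshot(List<Node> roots, String ruleKind) {
        List<Node> matches = snapshotMatches(roots, ruleKind);
        for (Node m : matches) {
            splitNode(m);
        }
        return matches.size();
    }
}
```

A loop that instead re-ran the matcher after each transformation would never terminate with this rule, since every split produces another matching node.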

I provide an implementation, optimizer2.patch, for reference. This is an incremental patch based on optimizer.patch.
