You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Daniel Dai <da...@gmail.com> on 2014/02/09 09:20:00 UTC

Review Request 17878: Make scalar work

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/
-----------------------------------------------------------

Review request for pig, Cheolsoo Park and Rohini Palaniswamy.


Bugs: PIG-3757
    https://issues.apache.org/jira/browse/PIG-3757


Repository: pig


Description
-------

See PIG-3757


Diffs
-----

  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1565513 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1565513 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1565513 
  branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
  branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1565513 

Diff: https://reviews.apache.org/r/17878/diff/


Testing
-------


Thanks,

Daniel Dai


Re: Review Request 17878: Make scalar work

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On Feb. 17, 2014, 8:18 a.m., Rohini Palaniswamy wrote:
> > Discussed with Daniel offline. Since this patch makes a lot of change and fixing all cases would take some time, better to check this in its current state and continue fixing it in a separate jira.  I already checked in PIG-3766 and this patch has to be rebased. It will be time consuming to rebase again and again and work on this patch as other patches go in as this patch makes a lot of changes to TezCompiler.

Rest of the fixes/review comments went in with PIG-3842.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34634
-----------------------------------------------------------


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>


Re: Review Request 17878: Make scalar work

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34634
-----------------------------------------------------------

Ship it!


Discussed with Daniel offline. Since this patch makes a lot of change and fixing all cases would take some time, better to check this in its current state and continue fixing it in a separate jira.  I already checked in PIG-3766 and this patch has to be rebased. It will be time consuming to rebase again and again and work on this patch as other patches go in as this patch makes a lot of changes to TezCompiler.

- Rohini Palaniswamy


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>


Re: Review Request 17878: Make scalar work

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java, line 172
> > <https://reviews.apache.org/r/17878/diff/2/?file=486612#file486612line172>
> >
> >     Might be easier if this optimizer is the first one to be called before Combiner and secondary key optimizer. Just a thought. If there are no issues with having it later, then fine.
> 
> Daniel Dai wrote:
>     This is follow MR version. In MR, multiquery and combiner optimization are mutually exclusive (multiquery MR cannot have combiner), and we find combiner is more desired than multiquery, so we run combiner optimization first. Fortunately that is not the case for Tez. But I still want to keep the sequence of rules triggered to minimize Tez only issue.

Earlier I had fixed the combiner and secondary key optimizer of Tez to work with splits. If you don't move it to beginning then you will have to fix MultiQueryOptimizer to copy over combine plans and secondary sort orders in the TezEdgeDescriptor. 


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, lines 772-773
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line772>
> >
> >     It would be easier to keep this here for curTezOp.getSplitParent() != null (no multiquery)
> 
> Daniel Dai wrote:
>     Not sure if I understand, but seems there is nothing special for filter for splitter, can you be more specific?

There is always a extra empty POFilter after POSplit. In MR it is removed in NoopFilterRemover.java. It is easy/efficient to do it here as it saves on traversing the whole plan again and then finding it and removing it. 


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, line 2110
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2110>
> >
> >     You will need these, else there will be hanging operators in the plan
> 
> Daniel Dai wrote:
>     I also changed compile(PhysicalOperator op), which now handles split differently, splitter will only be added once

I am not very sure. Can you check once if removing this does not introduce PIG-3620 again?


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, line 2112
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2112>
> >
> >     You will need this as well, else input will be pointing to wrong physical operataors.
> 
> Daniel Dai wrote:
>     Checked seems to be Ok. Guess that is also due the change of how I generate the first split. Splitter's physical plan is composed with PhysicalPlan.connect, which should deal with inputs correctly.

I hit this problem with multiquery off. With multiquery on this was never a problem.


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34061
-----------------------------------------------------------


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>


Re: Review Request 17878: Make scalar work

Posted by Daniel Dai <da...@gmail.com>.

> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, lines 772-773
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line772>
> >
> >     It would be easier to keep this here for curTezOp.getSplitParent() != null (no multiquery)
> 
> Daniel Dai wrote:
>     Not sure if I understand, but seems there is nothing special for filter for splitter, can you be more specific?
> 
> Rohini Palaniswamy wrote:
>     There is always a extra empty POFilter after POSplit. In MR it is removed in NoopFilterRemover.java. It is easy/efficient to do it here as it saves on traversing the whole plan again and then finding it and removing it.

I try to do it in MultiQueryOptimizerTez, there is a TODO tag for that. Actually we need to check the filter condition, cuz not filter in splittee is redundant, right?


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, line 2110
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2110>
> >
> >     You will need these, else there will be hanging operators in the plan
> 
> Daniel Dai wrote:
>     I also changed compile(PhysicalOperator op), which now handles split differently, splitter will only be added once
> 
> Rohini Palaniswamy wrote:
>     I am not very sure. Can you check once if removing this does not introduce PIG-3620 again?

During my debugging, I manually examined bunch of plans, I am pretty sure it should not be the case.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, line 2112
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2112>
> >
> >     You will need this as well, else input will be pointing to wrong physical operataors.
> 
> Daniel Dai wrote:
>     Checked seems to be Ok. Guess that is also due the change of how I generate the first split. Splitter's physical plan is composed with PhysicalPlan.connect, which should deal with inputs correctly.
> 
> Rohini Palaniswamy wrote:
>     I hit this problem with multiquery off. With multiquery on this was never a problem.

I can run e2e tests with mutiquery off to see.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java, line 172
> > <https://reviews.apache.org/r/17878/diff/2/?file=486612#file486612line172>
> >
> >     Might be easier if this optimizer is the first one to be called before Combiner and secondary key optimizer. Just a thought. If there are no issues with having it later, then fine.
> 
> Daniel Dai wrote:
>     This is follow MR version. In MR, multiquery and combiner optimization are mutually exclusive (multiquery MR cannot have combiner), and we find combiner is more desired than multiquery, so we run combiner optimization first. Fortunately that is not the case for Tez. But I still want to keep the sequence of rules triggered to minimize Tez only issue.
> 
> Rohini Palaniswamy wrote:
>     Earlier I had fixed the combiner and secondary key optimizer of Tez to work with splits. If you don't move it to beginning then you will have to fix MultiQueryOptimizer to copy over combine plans and secondary sort orders in the TezEdgeDescriptor.

The good thing is combiner/secondary key property is on edge. When POSplit consume a vertex A, I keep the original edge from A to its successor. So combiner/secondary key property is still there.


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34061
-----------------------------------------------------------


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>


Re: Review Request 17878: Make scalar work

Posted by Daniel Dai <da...@gmail.com>.

> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java, line 52
> > <https://reviews.apache.org/r/17878/diff/2/?file=486607#file486607line52>
> >
> >     Can we create a jira to do this later? Currently nested splits work.

Will do. Actually it is very easy to just not to check that in MultiqueryOptimizer, but need to think through if there are any complications.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java, lines 48-49
> > <https://reviews.apache.org/r/17878/diff/1/?file=481154#file481154line48>
> >
> >     1) Need to handle case where scalar is null here. 
> >     
> >     2) Need to handle case where scalar has more than one record and throw error. It would be better to do that while writing scalar itself instead of here as we have better control now. This would avoid cases where wrong usage of scalars can bring down namenode.

Good catch, I will fix it.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, line 2110
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2110>
> >
> >     You will need these, else there will be hanging operators in the plan

I also changed compile(PhysicalOperator op), which now handles split differently, splitter will only be added once


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java, line 172
> > <https://reviews.apache.org/r/17878/diff/2/?file=486612#file486612line172>
> >
> >     Might be easier if this optimizer is the first one to be called before Combiner and secondary key optimizer. Just a thought. If there are no issues with having it later, then fine.

This is follow MR version. In MR, multiquery and combiner optimization are mutually exclusive (multiquery MR cannot have combiner), and we find combiner is more desired than multiquery, so we run combiner optimization first. Fortunately that is not the case for Tez. But I still want to keep the sequence of rules triggered to minimize Tez only issue.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, line 2112
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line2112>
> >
> >     You will need this as well, else input will be pointing to wrong physical operataors.

Checked seems to be Ok. Guess that is also due the change of how I generate the first split. Splitter's physical plan is composed with PhysicalPlan.connect, which should deal with inputs correctly.


> On Feb. 17, 2014, 5:30 a.m., Rohini Palaniswamy wrote:
> > branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, lines 772-773
> > <https://reviews.apache.org/r/17878/diff/2/?file=486610#file486610line772>
> >
> >     It would be easier to keep this here for curTezOp.getSplitParent() != null (no multiquery)

Not sure if I understand, but seems there is nothing special for filter for splitter, can you be more specific?


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34061
-----------------------------------------------------------


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>


Re: Review Request 17878: Make scalar work

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/#review34061
-----------------------------------------------------------


TestTezCompiler golden files will need updating.


branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java
<https://reviews.apache.org/r/17878/#comment64048>

    1) Need to handle case where scalar is null here. 
    
    2) Need to handle case where scalar has more than one record and throw error. It would be better to do that while writing scalar itself instead of here as we have better control now. This would avoid cases where wrong usage of scalars can bring down namenode.



branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java
<https://reviews.apache.org/r/17878/#comment64785>

    Can we create a jira to do this later? Currently nested splits work. 



branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
<https://reviews.apache.org/r/17878/#comment64779>

    It would be easier to keep this here for curTezOp.getSplitParent() != null (no multiquery)



branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
<https://reviews.apache.org/r/17878/#comment64776>

    You will need these, else there will be hanging operators in the plan



branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
<https://reviews.apache.org/r/17878/#comment64778>

    You will need this as well, else input will be pointing to wrong physical operataors. 



branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
<https://reviews.apache.org/r/17878/#comment64780>

    Might be easier if this optimizer is the first one to be called before Combiner and secondary key optimizer. Just a thought. If there are no issues with having it later, then fine.


- Rohini Palaniswamy


On Feb. 17, 2014, 4:25 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17878/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 4:25 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3757
>     https://issues.apache.org/jira/browse/PIG-3757
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> See PIG-3757
> 
> 
> Diffs
> -----
> 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
>   branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
>   branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
>   branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 
> 
> Diff: https://reviews.apache.org/r/17878/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>


Re: Review Request 17878: Make scalar work

Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/
-----------------------------------------------------------

(Updated Feb. 17, 2014, 4:25 a.m.)


Review request for pig, Cheolsoo Park and Rohini Palaniswamy.


Bugs: PIG-3757
    https://issues.apache.org/jira/browse/PIG-3757


Repository: pig


Description
-------

See PIG-3757


Diffs (updated)
-----

  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
  branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
  branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 

Diff: https://reviews.apache.org/r/17878/diff/


Testing
-------


Thanks,

Daniel Dai


Re: Review Request 17878: Make scalar work

Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17878/
-----------------------------------------------------------

(Updated Feb. 15, 2014, 2:14 a.m.)


Review request for pig, Cheolsoo Park and Rohini Palaniswamy.


Bugs: PIG-3757
    https://issues.apache.org/jira/browse/PIG-3757


Repository: pig


Description
-------

See PIG-3757


Diffs (updated)
-----

  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java PRE-CREATION 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/SecondaryKeyOptimizerTez.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1567297 
  branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1567297 
  branches/tez/src/org/apache/pig/impl/builtin/ReadScalarsTez.java PRE-CREATION 
  branches/tez/src/org/apache/pig/newplan/logical/visitor/ScalarVisitor.java 1567297 

Diff: https://reviews.apache.org/r/17878/diff/


Testing
-------


Thanks,

Daniel Dai