Posted to dev@pig.apache.org by Pallavi Rao <pa...@inmobi.com> on 2016/02/01 13:33:22 UTC
Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43044/
-----------------------------------------------------------
Review request for pig, Xianda Ke, liyun zhang, Mohit Sabharwal, and Xuefu Zhang.
Bugs: PIG-4766
https://issues.apache.org/jira/browse/PIG-4766
Repository: pig-git
Description
-------
PIG-4709 introduced Combiner optimization for Group By. However, that patch did not handle cases where constant or conditional expressions were used, nor did it handle limit.
This patch addresses those gaps.
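For readers unfamiliar with the term, here is a minimal sketch of what combiner optimization means for an algebraic aggregate. It is not code from the patch: the class CombinerSketch and its methods are hypothetical, and the conditional (value > 10) is only an illustrative stand-in for a conditional expression. It mirrors the initial/intermediate/final split that Pig's algebraic functions follow, and shows that combining partial results per partition gives the same answer as aggregating every record at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the Pig code base.
public class CombinerSketch {
    // Initial phase (map side): turn one raw value into a partial count.
    // The conditional expression is evaluated per record, before combining.
    public static long initial(Integer value) {
        return (value != null && value > 10) ? 1L : 0L;
    }

    // Intermediate phase (combiner): merge the partial counts of one partition.
    public static long intermediate(List<Long> partials) {
        long sum = 0L;
        for (long p : partials) {
            sum += p;
        }
        return sum;
    }

    // Final phase (reduce side): merge the per-partition results.
    public static long finalStep(List<Long> combined) {
        return intermediate(combined);
    }

    // Aggregates each partition locally, then merges the small partial results.
    public static long countWithCombiner(List<List<Integer>> partitions) {
        List<Long> combined = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Long> partials = new ArrayList<>();
            for (Integer v : partition) {
                partials.add(initial(v));
            }
            combined.add(intermediate(partials));
        }
        return finalStep(combined);
    }

    // Baseline: every record is inspected in one place, with no partial aggregation.
    public static long countWithoutCombiner(List<List<Integer>> partitions) {
        long count = 0L;
        for (List<Integer> partition : partitions) {
            for (Integer v : partition) {
                if (v != null && v > 10) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(5, 20, 30),
                Arrays.asList(11, null, 2));
        // Both approaches agree; only the amount of data moved between phases differs.
        System.out.println(countWithCombiner(partitions));
        System.out.println(countWithoutCombiner(partitions));
    }
}
```

In this sketch only one number per partition reaches the final phase, which is the saving a combiner is meant to provide.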
Diffs
-----
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 5fb49e2
src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/CombinerOptimizer.java a05d009
src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 5c0919f
src/org/apache/pig/data/SelfSpillBag.java 4e08b99
test/org/apache/pig/test/TestCombiner.java b2e81ac
Diff: https://reviews.apache.org/r/43044/diff/
Testing
-------
With this patch, all tests in TestCombiner pass.
Thanks,
Pallavi Rao
Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations
Posted by Pallavi Rao <pa...@inmobi.com>.
(Updated Feb. 5, 2016, 4:25 a.m.)
Review request for pig, Xianda Ke, liyun zhang, Mohit Sabharwal, and Xuefu Zhang.
Changes
-------
Rebased patch
Bugs: PIG-4766
https://issues.apache.org/jira/browse/PIG-4766
Repository: pig-git
Description
-------
PIG-4709 introduced Combiner optimization for Group By. However, that patch did not handle cases where constant or conditional expressions were used, nor did it handle limit.
This patch addresses those gaps.
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 5fb49e2
src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ReduceByConverter.java d4b521a
src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/CombinerOptimizer.java a05d009
src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 5c0919f
test/org/apache/pig/newplan/logical/relational/TestLocationInPhysicalPlan.java 0e45434
test/org/apache/pig/test/TestCombiner.java b2e81ac
Diff: https://reviews.apache.org/r/43044/diff/
Testing
-------
With this patch, all tests in TestCombiner pass.
Thanks,
Pallavi Rao
Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations
Posted by Pallavi Rao <pa...@inmobi.com>.
> On Feb. 4, 2016, 8:56 a.m., kelly zhang wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ReduceByConverter.java, line 181
> > <https://reviews.apache.org/r/43044/diff/2/?file=1230607#file1230607line181>
> >
> > Can we treat all tuples with a null key as the same?
> >
> > I explained the details on the JIRA page.
Answered your question on the JIRA :-)
> On Feb. 4, 2016, 8:56 a.m., kelly zhang wrote:
> > src/org/apache/pig/data/SelfSpillBag.java, line 55
> > <https://reviews.apache.org/r/43044/diff/2/?file=1230610#file1230610line55>
> >
> > This modification has already been checked in under PIG-4611.
Oh! I missed that change. I will revert it. Thanks for pointing it out.
- Pallavi
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43044/#review117779
-----------------------------------------------------------
Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations
Posted by kelly zhang <li...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43044/#review117779
-----------------------------------------------------------
src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ReduceByConverter.java (line 181)
<https://reviews.apache.org/r/43044/#comment179047>
Can we treat all tuples with a null key as the same?
I explained the details on the JIRA page.
src/org/apache/pig/data/SelfSpillBag.java (line 55)
<https://reviews.apache.org/r/43044/#comment179046>
This modification has already been checked in under PIG-4611.
- kelly zhang
Re: Review Request 43044: PIG-4766 Ensure GroupBy is optimized for all algebraic Operations
Posted by Pallavi Rao <pa...@inmobi.com>.
(Updated Feb. 3, 2016, 6:23 a.m.)
Review request for pig, Xianda Ke, liyun zhang, Mohit Sabharwal, and Xuefu Zhang.
Changes
-------
Fixed some unit test failures in tests other than TestCombiner.
Bugs: PIG-4766
https://issues.apache.org/jira/browse/PIG-4766
Repository: pig-git
Description
-------
PIG-4709 introduced Combiner optimization for Group By. However, that patch did not handle cases where constant or conditional expressions were used, nor did it handle limit.
This patch addresses those gaps.
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 5fb49e2
src/org/apache/pig/backend/hadoop/executionengine/spark/converter/ReduceByConverter.java d4b521a
src/org/apache/pig/backend/hadoop/executionengine/spark/optimizer/CombinerOptimizer.java a05d009
src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 5c0919f
src/org/apache/pig/data/SelfSpillBag.java 4e08b99
test/org/apache/pig/newplan/logical/relational/TestLocationInPhysicalPlan.java 0e45434
test/org/apache/pig/test/TestCombiner.java b2e81ac
Diff: https://reviews.apache.org/r/43044/diff/
Testing
-------
With this patch, all tests in TestCombiner pass.
Thanks,
Pallavi Rao