You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Mark Wagner <wa...@gmail.com> on 2014/03/04 22:40:28 UTC
Re: Review Request 15881: PIG-3591: Refactor POPackage
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/
-----------------------------------------------------------
(Updated March 4, 2014, 9:40 p.m.)
Review request for pig and Cheolsoo Park.
Changes
-------
I have fixed several multiquery issues, streaming for the last input in joins and illustrator works again.
Bugs: PIG-3591
https://issues.apache.org/jira/browse/PIG-3591
Repository: pig-git
Description
-------
Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
This is not finished. Known problem are illustrate and streaming the last input are not implemented.
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
src/org/apache/pig/data/ReadOnceBag.java e2b3887
src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
src/org/apache/pig/pen/IllustratorAttacher.java db9c707
src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
test/org/apache/pig/test/TestJobSubmission.java cccade4
test/org/apache/pig/test/TestPackage.java 6b197d1
test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
Diff: https://reviews.apache.org/r/15881/diff/
Testing
-------
ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
Thanks,
Mark Wagner
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Mark Wagner <wa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/
-----------------------------------------------------------
(Updated April 5, 2014, 2:25 a.m.)
Review request for pig and Cheolsoo Park.
Bugs: PIG-3591
https://issues.apache.org/jira/browse/PIG-3591
Repository: pig-git
Description
-------
Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
This is not finished. Known problem are illustrate and streaming the last input are not implemented.
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java b7ce7e2
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java d3ebeb3
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 54740a0
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
src/org/apache/pig/data/ReadOnceBag.java e2b3887
src/org/apache/pig/data/SelfSpillBag.java 2fcf36a
src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
src/org/apache/pig/pen/IllustratorAttacher.java db9c707
src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
test/org/apache/pig/test/TestExampleGenerator.java 42f757f
test/org/apache/pig/test/TestJobSubmission.java 9741dc9
test/org/apache/pig/test/TestPackage.java 6b197d1
test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
Diff: https://reviews.apache.org/r/15881/diff/
Testing
-------
ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
Thanks,
Mark Wagner
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review39417
-----------------------------------------------------------
Ship it!
I believe you have already run unit tests. If so, let's get this in.
- Cheolsoo Park
On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated April 3, 2014, 2:57 a.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/data/SelfSpillBag.java 2fcf36a
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestExampleGenerator.java 42f757f
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review39461
-----------------------------------------------------------
Ship it!
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java
<https://reviews.apache.org/r/15881/#comment71849>
This can be removed
- Rohini Palaniswamy
On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated April 3, 2014, 2:57 a.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/data/SelfSpillBag.java 2fcf36a
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestExampleGenerator.java 42f757f
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Mark Wagner <wa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/
-----------------------------------------------------------
(Updated April 3, 2014, 2:57 a.m.)
Review request for pig and Cheolsoo Park.
Changes
-------
Addressed most review comments. I added a test for order by illustration which is failing right now.
Bugs: PIG-3591
https://issues.apache.org/jira/browse/PIG-3591
Repository: pig-git
Description
-------
Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
This is not finished. Known problem are illustrate and streaming the last input are not implemented.
Diffs (updated)
-----
src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
src/org/apache/pig/data/ReadOnceBag.java e2b3887
src/org/apache/pig/data/SelfSpillBag.java 2fcf36a
src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
src/org/apache/pig/pen/IllustratorAttacher.java db9c707
src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
test/org/apache/pig/test/TestExampleGenerator.java 42f757f
test/org/apache/pig/test/TestJobSubmission.java cccade4
test/org/apache/pig/test/TestPackage.java 6b197d1
test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
Diff: https://reviews.apache.org/r/15881/diff/
Testing
-------
ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
Thanks,
Mark Wagner
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36236
-----------------------------------------------------------
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
<https://reviews.apache.org/r/15881/#comment67182>
I've been running unit tests overnight. I see only one failure!
This line causes an error in TestExampleGenerator#testLimit. Here is the stack trace-
Testcase: testLimit took 0.042 sec
Caused an ERROR
Exception
java.io.IOException: Exception
at org.apache.pig.PigServer.getExamples(PigServer.java:1281)
at org.apache.pig.test.TestExampleGenerator.testLimit(TestExampleGenerator.java:348)
Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.pen.util.ExampleTuple
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.illustratorMarkup2(Packager.java:168)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.getValueTuple(Packager.java:225)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage$PeekedBag$1.next(POPackage.java:422)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage$PeekedBag$1.next(POPackage.java:405)
at org.apache.pig.data.DefaultAbstractBag.addAll(DefaultAbstractBag.java:151)
at org.apache.pig.data.DefaultAbstractBag.addAll(DefaultAbstractBag.java:137)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.attachInput(Packager.java:104)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNextTuple(POPackage.java:275)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:420)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:411)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:255)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:236)
at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:258)
at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:239)
at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:104)
at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:99)
at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:167)
at org.apache.pig.PigServer.getExamples(PigServer.java:1275)
- Cheolsoo Park
On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated March 4, 2014, 9:40 p.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Mark Wagner <wa...@gmail.com>.
> On March 6, 2014, 8:04 a.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/pen/IllustratorAttacher.java, lines 418-421
> > <https://reviews.apache.org/r/15881/diff/3/?file=509987#file509987line418>
> >
> > Don't we need the equivalent of this code in visitPackage?
The Packager is now responsible for this. It is present in JoinPackager.
- Mark
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36273
-----------------------------------------------------------
On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated April 3, 2014, 2:57 a.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/data/SelfSpillBag.java 2fcf36a
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestExampleGenerator.java 42f757f
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36273
-----------------------------------------------------------
I am still going over the main classes POPackage and Packager implementations. Just going over thoroughly to ensure no piece of code is missed out in the refactor. Will update any comments on that tomorrow.
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java
<https://reviews.apache.org/r/15881/#comment67229>
I believe this check should not be removed
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java
<https://reviews.apache.org/r/15881/#comment67257>
Don't we still need this? i.e Print out the list of packagers MultiQueryPackager has.
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java
<https://reviews.apache.org/r/15881/#comment67259>
Don't we still need this?
src/org/apache/pig/pen/IllustratorAttacher.java
<https://reviews.apache.org/r/15881/#comment67304>
Don't we need the equivalent of this code in visitPackage?
- Rohini Palaniswamy
On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated March 4, 2014, 9:40 p.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Mark Wagner <wa...@gmail.com>.
> On March 5, 2014, 3:01 a.m., Cheolsoo Park wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java, line 42
> > <https://reviews.apache.org/r/15881/diff/3/?file=509975#file509975line42>
> >
> >
> >
> > Do we need this flag? It's set to true at L97, but it's never used anywhere.
Yes, as brought up by Rohini. I've added its use in the right places.
> On March 5, 2014, 3:01 a.m., Cheolsoo Park wrote:
> > src/org/apache/pig/pen/IllustratorAttacher.java, line 258
> > <https://reviews.apache.org/r/15881/diff/3/?file=509987#file509987line258>
> >
> > Don't we need to change this to the following instead of commenting it out?
> >
> > !(preds.get(0).getPkgr() instanceof LitePackager) &&
I don't think it's ever possible to have a POPackageLite and isDistinct
- Mark
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36193
-----------------------------------------------------------
On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated March 4, 2014, 9:40 p.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36193
-----------------------------------------------------------
Looks good to me. Thank you very much for the great work!
I have few minor comments below.
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java
<https://reviews.apache.org/r/15881/#comment67110>
Replace "(POPackage) po_package).getPkgr()" with pkgr since you already have the local variable?
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java
<https://reviews.apache.org/r/15881/#comment67111>
Same as above.
src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java
<https://reviews.apache.org/r/15881/#comment67120>
Replace pk.getPkgr() with fromPkgr?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java
<https://reviews.apache.org/r/15881/#comment67122>
Can you delete all the trailing white spaces when committing your patch?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java
<https://reviews.apache.org/r/15881/#comment67129>
Can you update the comment here? POCombinerPackage should be POPackage now, right?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment67133>
Do we need this flag? It's set to true at L97, but it's never used anywhere.
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment67142>
Please adjust the width of the line.
src/org/apache/pig/pen/IllustratorAttacher.java
<https://reviews.apache.org/r/15881/#comment67143>
Don't we need to change this to the following instead of commenting it out?
!(preds.get(0).getPkgr() instanceof LitePackager) &&
test/org/apache/pig/test/TestJobSubmission.java
<https://reviews.apache.org/r/15881/#comment67144>
Can you delete this dead code? This only makes it harder to merge branches.
- Cheolsoo Park
On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated March 4, 2014, 9:40 p.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Mark Wagner <wa...@gmail.com>.
> On March 20, 2014, 12:37 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java, line 212
> > <https://reviews.apache.org/r/15881/diff/3/?file=509975#file509975line212>
> >
> > I did not get this part about readOnce and copying to a new bag. Also why default bag? If the bag is big won't you hit OOM with this copy?
readOnce indicates that the execution layer can provide a tuple stream for this bag. In MR this is only the last input, but in Tez we can do it for any input.
> On March 20, 2014, 12:37 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java, lines 261-265
> > <https://reviews.apache.org/r/15881/diff/3/?file=509981#file509981line261>
> >
> > Shouldn't we have this condition (if(pkgr.isDistinct())) in POPackage instead of Packager.getNext()?
I wanted to keep that type of logic within the Packagers and limit the POPackage to interfacing with the execution engine.
- Mark
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review37876
-----------------------------------------------------------
On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated April 3, 2014, 2:57 a.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/data/SelfSpillBag.java 2fcf36a
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestExampleGenerator.java 42f757f
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>
Re: Review Request 15881: PIG-3591: Refactor POPackage
Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review37876
-----------------------------------------------------------
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment69655>
useDefaultBag not used here. firstTime needs to be in attachInput and useDefaultBag used there
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment69656>
InternalCachedBag if not useDefaultBag
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment69657>
I did not get this part about readOnce and copying to a new bag. Also why default bag? If the bag is big won't you hit OOM with this copy?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java
<https://reviews.apache.org/r/15881/#comment69658>
clone.keyInfo = new HashMap<Integer, Pair<Boolean, Map<Integer, Integer>>>(keyInfo);
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69653>
POJoinPackage also prints forEach.getFlatStr(). Probably have a pkgr.name() and override that for each Packager
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69644>
Shouldn't we have this condition (if(pkgr.isDistinct())) in POPackage instead of Packager.getNext()?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69649>
reporter.progress will never be called
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69648>
private static
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69650>
Wouldn't the index be always same for all the elements in the bag? Why get everytime?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
<https://reviews.apache.org/r/15881/#comment69647>
Same question here on readOnce. Why do we copy the bag on readonce?
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
<https://reviews.apache.org/r/15881/#comment69651>
Should be a deep clone as was in POPackage
- Rohini Palaniswamy
On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
>
> (Updated March 4, 2014, 9:40 p.m.)
>
>
> Review request for pig and Cheolsoo Park.
>
>
> Bugs: PIG-3591
> https://issues.apache.org/jira/browse/PIG-3591
>
>
> Repository: pig-git
>
>
> Description
> -------
>
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
>
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
>
>
> Diffs
> -----
>
> src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630
> src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION
> src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521
> src/org/apache/pig/data/ReadOnceBag.java e2b3887
> src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695
> src/org/apache/pig/pen/IllustratorAttacher.java db9c707
> src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7
> test/org/apache/pig/test/TestJobSubmission.java cccade4
> test/org/apache/pig/test/TestPackage.java 6b197d1
> test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a
> test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728
> test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f
> test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d
> test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158
> test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b
> test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509
> test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281
> test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e
> test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57
> test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32
> test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178
> test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0
> test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f
> test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e
> test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d
> test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec
> test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9
> test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047
>
> Diff: https://reviews.apache.org/r/15881/diff/
>
>
> Testing
> -------
>
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
>
>
> Thanks,
>
> Mark Wagner
>
>