You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Mark Wagner <wa...@gmail.com> on 2014/03/04 22:40:28 UTC

Re: Review Request 15881: PIG-3591: Refactor POPackage

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/
-----------------------------------------------------------

(Updated March 4, 2014, 9:40 p.m.)


Review request for pig and Cheolsoo Park.


Changes
-------

I have fixed several multiquery issues, streaming for the last input in joins and illustrator works again.


Bugs: PIG-3591
    https://issues.apache.org/jira/browse/PIG-3591


Repository: pig-git


Description
-------

Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.

This is not finished. Known problem are illustrate and streaming the last input are not implemented.


Diffs (updated)
-----

  src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
  src/org/apache/pig/data/ReadOnceBag.java e2b3887 
  src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
  src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
  src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
  test/org/apache/pig/test/TestJobSubmission.java cccade4 
  test/org/apache/pig/test/TestPackage.java 6b197d1 
  test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
  test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
  test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
  test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
  test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
  test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
  test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
  test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
  test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
  test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
  test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
  test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
  test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
  test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
  test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
  test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
  test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
  test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
  test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 

Diff: https://reviews.apache.org/r/15881/diff/


Testing
-------

ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.


Thanks,

Mark Wagner


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Mark Wagner <wa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/
-----------------------------------------------------------

(Updated April 5, 2014, 2:25 a.m.)


Review request for pig and Cheolsoo Park.


Bugs: PIG-3591
    https://issues.apache.org/jira/browse/PIG-3591


Repository: pig-git


Description
-------

Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.

This is not finished. Known problem are illustrate and streaming the last input are not implemented.


Diffs (updated)
-----

  src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java b7ce7e2 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java d3ebeb3 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 54740a0 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
  src/org/apache/pig/data/ReadOnceBag.java e2b3887 
  src/org/apache/pig/data/SelfSpillBag.java 2fcf36a 
  src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
  src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
  src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
  test/org/apache/pig/test/TestExampleGenerator.java 42f757f 
  test/org/apache/pig/test/TestJobSubmission.java 9741dc9 
  test/org/apache/pig/test/TestPackage.java 6b197d1 
  test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
  test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
  test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
  test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
  test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
  test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
  test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
  test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
  test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
  test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
  test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
  test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
  test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
  test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
  test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
  test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
  test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
  test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
  test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 

Diff: https://reviews.apache.org/r/15881/diff/


Testing
-------

ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.


Thanks,

Mark Wagner


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review39417
-----------------------------------------------------------

Ship it!


I believe you have already run unit tests. If so, let's get this in.

- Cheolsoo Park


On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated April 3, 2014, 2:57 a.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/data/SelfSpillBag.java 2fcf36a 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestExampleGenerator.java 42f757f 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review39461
-----------------------------------------------------------

Ship it!



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java
<https://reviews.apache.org/r/15881/#comment71849>

    This can be removed


- Rohini Palaniswamy


On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated April 3, 2014, 2:57 a.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/data/SelfSpillBag.java 2fcf36a 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestExampleGenerator.java 42f757f 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Mark Wagner <wa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/
-----------------------------------------------------------

(Updated April 3, 2014, 2:57 a.m.)


Review request for pig and Cheolsoo Park.


Changes
-------

Addressed most review comments. I added a test for order by illustration which is failing right now.


Bugs: PIG-3591
    https://issues.apache.org/jira/browse/PIG-3591


Repository: pig-git


Description
-------

Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.

This is not finished. Known problem are illustrate and streaming the last input are not implemented.


Diffs (updated)
-----

  src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
  src/org/apache/pig/data/ReadOnceBag.java e2b3887 
  src/org/apache/pig/data/SelfSpillBag.java 2fcf36a 
  src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
  src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
  src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
  test/org/apache/pig/test/TestExampleGenerator.java 42f757f 
  test/org/apache/pig/test/TestJobSubmission.java cccade4 
  test/org/apache/pig/test/TestPackage.java 6b197d1 
  test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
  test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
  test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
  test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
  test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
  test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
  test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
  test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
  test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
  test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
  test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
  test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
  test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
  test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
  test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
  test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
  test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
  test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
  test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 

Diff: https://reviews.apache.org/r/15881/diff/


Testing
-------

ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.


Thanks,

Mark Wagner


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36236
-----------------------------------------------------------



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
<https://reviews.apache.org/r/15881/#comment67182>

    I've been running unit tests overnight. I see only one failure!
    
    This line causes an error in TestExampleGenerator#testLimit. Here is the stack trace-
    
    Testcase: testLimit took 0.042 sec
            Caused an ERROR
    Exception 
    java.io.IOException: Exception 
            at org.apache.pig.PigServer.getExamples(PigServer.java:1281)
            at org.apache.pig.test.TestExampleGenerator.testLimit(TestExampleGenerator.java:348)
    Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.pen.util.ExampleTuple
            at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.illustratorMarkup2(Packager.java:168)
            at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.getValueTuple(Packager.java:225)
            at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage$PeekedBag$1.next(POPackage.java:422)
            at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage$PeekedBag$1.next(POPackage.java:405)
            at org.apache.pig.data.DefaultAbstractBag.addAll(DefaultAbstractBag.java:151)
            at org.apache.pig.data.DefaultAbstractBag.addAll(DefaultAbstractBag.java:137)
            at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.attachInput(Packager.java:104)
            at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNextTuple(POPackage.java:275)
            at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:420)
            at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:411)
            at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:255)
            at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
            at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:236)
            at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:258)
            at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:239)
            at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:104)
            at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:99)
            at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:167)
            at org.apache.pig.PigServer.getExamples(PigServer.java:1275)


- Cheolsoo Park


On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated March 4, 2014, 9:40 p.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Mark Wagner <wa...@gmail.com>.

> On March 6, 2014, 8:04 a.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/pen/IllustratorAttacher.java, lines 418-421
> > <https://reviews.apache.org/r/15881/diff/3/?file=509987#file509987line418>
> >
> >     Don't we need the equivalent of this code in visitPackage?

The Packager is now responsible for this. It is present in JoinPackager.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36273
-----------------------------------------------------------


On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated April 3, 2014, 2:57 a.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/data/SelfSpillBag.java 2fcf36a 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestExampleGenerator.java 42f757f 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36273
-----------------------------------------------------------


I am still going over the main classes POPackage and Packager implementations. Just going over thoroughly to ensure no piece of code is missed out in the refactor. Will update any comments on that tomorrow. 


src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java
<https://reviews.apache.org/r/15881/#comment67229>

    I believe this check should not be removed



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java
<https://reviews.apache.org/r/15881/#comment67257>

    Don't we still need this? i.e Print out the list of packagers MultiQueryPackager has.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java
<https://reviews.apache.org/r/15881/#comment67259>

    Don't we still need this?



src/org/apache/pig/pen/IllustratorAttacher.java
<https://reviews.apache.org/r/15881/#comment67304>

    Don't we need the equivalent of this code in visitPackage?


- Rohini Palaniswamy


On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated March 4, 2014, 9:40 p.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Mark Wagner <wa...@gmail.com>.

> On March 5, 2014, 3:01 a.m., Cheolsoo Park wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java, line 42
> > <https://reviews.apache.org/r/15881/diff/3/?file=509975#file509975line42>
> >
> >     
> >     
> >     Do we need this flag? It's set to true at L97, but it's never used anywhere.

Yes, as brought up by Rohini. I've added its use in the right places.


> On March 5, 2014, 3:01 a.m., Cheolsoo Park wrote:
> > src/org/apache/pig/pen/IllustratorAttacher.java, line 258
> > <https://reviews.apache.org/r/15881/diff/3/?file=509987#file509987line258>
> >
> >     Don't we need to change this to the following instead of commenting it out?
> >     
> >     !(preds.get(0).getPkgr() instanceof LitePackager) &&

I don't think it's ever possible to have a POPackageLite and isDistinct 


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36193
-----------------------------------------------------------


On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated March 4, 2014, 9:40 p.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review36193
-----------------------------------------------------------


Looks good to me. Thank you very much for the great work!

I have few minor comments below.


src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java
<https://reviews.apache.org/r/15881/#comment67110>

    Replace "(POPackage) po_package).getPkgr()" with pkgr since you already have the local variable?



src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java
<https://reviews.apache.org/r/15881/#comment67111>

    Same as above.



src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java
<https://reviews.apache.org/r/15881/#comment67120>

    Replace pk.getPkgr() with fromPkgr?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java
<https://reviews.apache.org/r/15881/#comment67122>

    Can you delete all the trailing white spaces when committing your patch?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java
<https://reviews.apache.org/r/15881/#comment67129>

    Can you update the comment here? POCombinerPackage should be POPackage now, right?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment67133>

    
    
    Do we need this flag? It's set to true at L97, but it's never used anywhere.



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment67142>

    Please adjust the width of the line.



src/org/apache/pig/pen/IllustratorAttacher.java
<https://reviews.apache.org/r/15881/#comment67143>

    Don't we need to change this to the following instead of commenting it out?
    
    !(preds.get(0).getPkgr() instanceof LitePackager) &&



test/org/apache/pig/test/TestJobSubmission.java
<https://reviews.apache.org/r/15881/#comment67144>

    Can you delete this dead code? This only makes it harder to merge branches.


- Cheolsoo Park


On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated March 4, 2014, 9:40 p.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Mark Wagner <wa...@gmail.com>.

> On March 20, 2014, 12:37 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java, line 212
> > <https://reviews.apache.org/r/15881/diff/3/?file=509975#file509975line212>
> >
> >     I did not get this part about readOnce and copying to a new bag. Also why default bag? If the bag is big won't you hit OOM with this copy?

readOnce indicates that the execution layer can provide a tuple stream for this bag. In MR this is only the last input, but in Tez we can do it for any input.


> On March 20, 2014, 12:37 p.m., Rohini Palaniswamy wrote:
> > src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java, lines 261-265
> > <https://reviews.apache.org/r/15881/diff/3/?file=509981#file509981line261>
> >
> >     Shouldn't we have this condition (if(pkgr.isDistinct())) in POPackage instead of Packager.getNext()?

I wanted to keep that type of logic within the Packagers and limit the POPackage to interfacing with the execution engine.


- Mark


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review37876
-----------------------------------------------------------


On April 3, 2014, 2:57 a.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated April 3, 2014, 2:57 a.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java a92e643 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/data/SelfSpillBag.java 2fcf36a 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestExampleGenerator.java 42f757f 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>


Re: Review Request 15881: PIG-3591: Refactor POPackage

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15881/#review37876
-----------------------------------------------------------



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment69655>

    useDefaultBag not used here. firstTime needs to be in attachInput and useDefaultBag used there



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment69656>

    InternalCachedBag if not useDefaultBag



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java
<https://reviews.apache.org/r/15881/#comment69657>

    I did not get this part about readOnce and copying to a new bag. Also why default bag? If the bag is big won't you hit OOM with this copy?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java
<https://reviews.apache.org/r/15881/#comment69658>

    clone.keyInfo = new HashMap<Integer, Pair<Boolean, Map<Integer, Integer>>>(keyInfo);



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69653>

    POJoinPackage also prints forEach.getFlatStr(). Probably have a pkgr.name() and override that for each Packager



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69644>

    Shouldn't we have this condition (if(pkgr.isDistinct())) in POPackage instead of Packager.getNext()? 



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69649>

    reporter.progress will never be called



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69648>

    private static



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java
<https://reviews.apache.org/r/15881/#comment69650>

    Wouldn't the index be always same for all the elements in the bag? Why get everytime?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
<https://reviews.apache.org/r/15881/#comment69647>

    Same question here on readOnce. Why do we copy the bag on readonce?



src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
<https://reviews.apache.org/r/15881/#comment69651>

    Should be a deep clone as was in POPackage


- Rohini Palaniswamy


On March 4, 2014, 9:40 p.m., Mark Wagner wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15881/
> -----------------------------------------------------------
> 
> (Updated March 4, 2014, 9:40 p.m.)
> 
> 
> Review request for pig and Cheolsoo Park.
> 
> 
> Bugs: PIG-3591
>     https://issues.apache.org/jira/browse/PIG-3591
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Separate "packaging" logic from "shuffle handling" logic. This moves the packaging logic to a new class "Packager", which is extended by CombinePackager, LitePackager, MultiQueryPackager, and JoinPackager.
> 
> This is not finished. Known problem are illustrate and streaming the last input are not implemented.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/fetch/FetchOptimizer.java d801f6f 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/AccumulatorOptimizer.java 3638b5c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java 18a382b 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5e28eb6 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 5dddab7 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 93de6d5 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MapReduceLauncher.java eb7c428 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java 64f0ee1 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java 933363d 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigCombiner.java 773a22c 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java eea5ce3 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/SecondaryKeyOptimizer.java 1578630 
>   src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/plans/POPackageAnnotator.java 47137d5 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java abb16ff 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java ff82801 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/XMLPhysicalPlanPrinter.java 892c26f 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/CombinerPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/JoinPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/LitePackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/MultiQueryPackager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 9105a0e 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 82f11ac 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMultiQueryPackage.java d604174 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 86314d9 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackageLite.java c200715 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java PRE-CREATION 
>   src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java b860521 
>   src/org/apache/pig/data/ReadOnceBag.java e2b3887 
>   src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 7112695 
>   src/org/apache/pig/pen/IllustratorAttacher.java db9c707 
>   src/org/apache/pig/pen/PhysicalPlanResetter.java f50fba7 
>   test/org/apache/pig/test/TestJobSubmission.java cccade4 
>   test/org/apache/pig/test/TestPackage.java 6b197d1 
>   test/org/apache/pig/test/data/GoldenFiles/Cogroup.gld 35ed75a 
>   test/org/apache/pig/test/data/GoldenFiles/MRC1.gld 0a34728 
>   test/org/apache/pig/test/data/GoldenFiles/MRC10.gld b9fad0f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC11.gld 861608d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC12.gld 15f1158 
>   test/org/apache/pig/test/data/GoldenFiles/MRC13.gld 0d4de2b 
>   test/org/apache/pig/test/data/GoldenFiles/MRC14.gld eafa509 
>   test/org/apache/pig/test/data/GoldenFiles/MRC15.gld e8bd281 
>   test/org/apache/pig/test/data/GoldenFiles/MRC16.gld 6151d1e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC17.gld cec5d57 
>   test/org/apache/pig/test/data/GoldenFiles/MRC18.gld 01cdc32 
>   test/org/apache/pig/test/data/GoldenFiles/MRC19.gld 8688178 
>   test/org/apache/pig/test/data/GoldenFiles/MRC2.gld 97f0ed0 
>   test/org/apache/pig/test/data/GoldenFiles/MRC3.gld 77a155f 
>   test/org/apache/pig/test/data/GoldenFiles/MRC4.gld 7c5078e 
>   test/org/apache/pig/test/data/GoldenFiles/MRC6.gld 45cba2d 
>   test/org/apache/pig/test/data/GoldenFiles/MRC7.gld 09fcaec 
>   test/org/apache/pig/test/data/GoldenFiles/MRC8.gld 2d301a9 
>   test/org/apache/pig/test/data/GoldenFiles/MRC9.gld c5d7047 
> 
> Diff: https://reviews.apache.org/r/15881/diff/
> 
> 
> Testing
> -------
> 
> ant test-commit passes, with the exception of TestExampleGenerator which is caused by the illustrate work not being finished. I also ran some manual scripts, which worked.
> 
> 
> Thanks,
> 
> Mark Wagner
> 
>