Posted to dev@pig.apache.org by Mark Wagner <wa...@gmail.com> on 2013/10/16 21:12:07 UTC
Review Request 14679: Initial implementation of PigProcessor
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14679/
-----------------------------------------------------------
Review request for pig, Cheolsoo Park, Daniel Dai, and Rohini Palaniswamy.
Bugs: PIG-3521
https://issues.apache.org/jira/browse/PIG-3521
Repository: pig-git
Description
-------
This patch adds the PigProcessor and related changes. The current patch supports MR* jobs.
* Updates the Tez dependency to match Tez's trunk.
* Adds PigProcessor, which roughly follows the existing Mappers and Reducers in Pig.
* The handling of input has been factored out of the PigProcessor into a new interface: InputHandler. Two implementations of InputHandler have been added: FileInputHandler and ShuffledInputHandler.
* Makes changes to TezDagBuilder to serialize and ship the necessary information from the frontend. These changes are mostly inspired by/stolen from the JobControlCompiler.
* Adds a TezPOPackageAnnotator, which is analogous to the POPackageAnnotator, but for Tez.
* Fixes a problem with edge creation in the TezDagBuilder.
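To illustrate the InputHandler split described above, here is a minimal, self-contained sketch. It is not the code in the patch: the method names (next, current), the record type (String), and the class names FileLikeInputHandler and ShuffleLikeInputHandler are all hypothetical stand-ins, and no Tez or Pig classes are used. It only shows the general idea that the processor can pull records through one interface regardless of whether they come from a file or from a shuffle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InputHandlerSketch {

    /** Hypothetical interface: the processor pulls records without knowing their source. */
    interface InputHandler {
        boolean next();    // advance to the next record; false when exhausted
        String current();  // record at the current position
    }

    /** Stand-in for a file-backed handler: yields records one by one, in order. */
    static class FileLikeInputHandler implements InputHandler {
        private final Iterator<String> it;
        private String cur;

        FileLikeInputHandler(List<String> records) {
            this.it = records.iterator();
        }

        public boolean next() {
            if (!it.hasNext()) {
                return false;
            }
            cur = it.next();
            return true;
        }

        public String current() {
            return cur;
        }
    }

    /** Stand-in for a shuffle-backed handler: yields one "key=[values]" record per key, in key order. */
    static class ShuffleLikeInputHandler implements InputHandler {
        private final Iterator<Map.Entry<String, List<String>>> it;
        private String cur;

        ShuffleLikeInputHandler(Map<String, List<String>> grouped) {
            // Copy into a TreeMap so keys are visited in sorted order.
            this.it = new TreeMap<String, List<String>>(grouped).entrySet().iterator();
        }

        public boolean next() {
            if (!it.hasNext()) {
                return false;
            }
            Map.Entry<String, List<String>> e = it.next();
            cur = e.getKey() + "=" + e.getValue();
            return true;
        }

        public String current() {
            return cur;
        }
    }

    /** What a processor loop might look like: identical for both sources. */
    static List<String> drain(InputHandler handler) {
        List<String> out = new ArrayList<String>();
        while (handler.next()) {
            out.add(handler.current());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new FileLikeInputHandler(Arrays.asList("a", "b"))));

        Map<String, List<String>> grouped = new TreeMap<String, List<String>>();
        grouped.put("k2", Arrays.asList("x"));
        grouped.put("k1", Arrays.asList("y", "z"));
        System.out.println(drain(new ShuffleLikeInputHandler(grouped)));
    }
}
```

The point of the sketch is only the shape of the design: drain() never branches on the input source, which is the kind of simplification that factoring input handling out of the processor allows.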
Diffs
-----
ivy.xml c603def
src/org/apache/pig/backend/hadoop/executionengine/tez/FileInputHandler.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/tez/InputHandler.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 6724f2b
src/org/apache/pig/backend/hadoop/executionengine/tez/ShuffledInputHandler.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 48c0955
src/org/apache/pig/backend/hadoop/executionengine/tez/TezJobControlCompiler.java 05b0c54
src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 4cc9ab4
src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java PRE-CREATION
Diff: https://reviews.apache.org/r/14679/diff/
Testing
-------
Only integration testing has been done. Jobs with 1, 2, and 3 stages have been executed successfully. I'll be adding unit tests.
Thanks,
Mark Wagner
Re: Review Request 14679: Initial implementation of PigProcessor
Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14679/#review27104
-----------------------------------------------------------
Ship it!
Simple load/store works for me (with some minor fixes, although the frontend throws an exception after the job finishes). I'm still trying complex queries, but we can commit this patch first and fix the remaining issues on top of it.
- Daniel Dai