Posted to dev@pig.apache.org by Cheolsoo Park <pi...@gmail.com> on 2013/10/05 02:21:25 UTC
Review Request 14504: PIG-3500 Initial implementation of TezCompiler
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14504/
-----------------------------------------------------------
Review request for pig, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.
Bugs: PIG-3500
https://issues.apache.org/jira/browse/PIG-3500
Repository: pig-git
Description
-------
Initial implementation of TezCompiler, which converts a physical plan into a Tez plan. This version works only for basic operators, including LOAD, STORE, FILTER, FOREACH, GROUP, and JOIN.
Here is an example:
a = load '/tmp/input' as (name, age, gpa);
b = filter a by age>=30;
c = group b by age;
d = foreach c generate group as age, COUNT(b);
e = load '/tmp/fact' as (age, comments);
f = join d by age, e by age;
store f into '/tmp/output';
>> pig -x tez -e 'explain -script test.pig'
#--------------------------------------------------
# TEZ plan:
#--------------------------------------------------
Tez vertex scope-29
c: Local Rearrange[tuple]{bytearray}(false) - scope-8
|   |
|   Project[bytearray][1] - scope-9
|
|---b: Filter[bag] - scope-1
    |   |
    |   Greater Than or Equal[boolean] - scope-5
    |   |
    |   |---Cast[int] - scope-3
    |   |   |
    |   |   |---Project[bytearray][1] - scope-2
    |   |
    |   |---Constant(30) - scope-4
    |
    |---a: Load(/tmp/input:org.apache.pig.builtin.PigStorage) - scope-0
Tez vertex scope-30
f: Local Rearrange[tuple]{bytearray}(false) - scope-21
|   |
|   Project[bytearray][0] - scope-22
|
|---d: New For Each(false,false)[bag] - scope-15
    |   |
    |   Project[bytearray][0] - scope-10
    |   |
    |   POUserFunc(org.apache.pig.builtin.COUNT)[long] - scope-13
    |   |
    |   |---Project[bag][1] - scope-12
    |
    |---c: Package[tuple]{bytearray} - scope-7
Tez vertex scope-31
f: Local Rearrange[tuple]{bytearray}(false) - scope-23
|   |
|   Project[bytearray][0] - scope-24
|
|---e: Load(/tmp/fact:org.apache.pig.builtin.PigStorage) - scope-16
Tez vertex scope-32
f: Store(/tmp/output:org.apache.pig.builtin.PigStorage) - scope-28
|
|---f: New For Each(true,true)[tuple] - scope-27
    |   |
    |   Project[bag][1] - scope-25
    |   |
    |   Project[bag][2] - scope-26
    |
    |---f: Package[tuple]{bytearray} - scope-20
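In the plan above, each vertex ends at a Local Rearrange and the next one begins at the matching Package. The sketch below illustrates that cutting rule for one linear branch only. It is a simplified illustration and not the TezCompiler code from this patch: the class name VertexSplitSketch, the method splitIntoVertices, and the use of plain strings in place of physical operators are all assumptions made for the example, and the join's second input (vertex scope-31) is not modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the TezCompiler from the patch. */
public class VertexSplitSketch {

    /**
     * Splits a linear chain of operator names into vertices, closing the
     * current vertex right after every "LocalRearrange" (assumed to be the
     * only operator that marks a vertex boundary).
     */
    public static List<List<String>> splitIntoVertices(List<String> chain) {
        List<List<String>> vertices = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String op : chain) {
            current.add(op);
            if (op.equals("LocalRearrange")) {
                vertices.add(current);
                current = new ArrayList<>();
            }
        }
        // Whatever follows the last boundary forms the final vertex.
        if (!current.isEmpty()) {
            vertices.add(current);
        }
        return vertices;
    }

    public static void main(String[] args) {
        // Main branch of the example script: a -> b -> c -> d -> f -> store.
        List<String> chain = Arrays.asList(
                "Load", "Filter", "LocalRearrange",
                "Package", "ForEach", "LocalRearrange",
                "Package", "ForEach", "Store");
        for (List<String> vertex : splitIntoVertices(chain)) {
            System.out.println(vertex);
        }
    }
}
```

Run on the main branch of the example script, this groups the operators the same way as vertices scope-29, scope-30, and scope-32.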
Diffs
-----
ivy.xml 7163c89
ivy/libraries.properties ea08384
src/org/apache/pig/backend/hadoop/executionengine/tez/MapOper.java 623ec95
src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/tez/ReduceOper.java ef5fe84
src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java a4c9c59
src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompilerException.java PRE-CREATION
src/org/apache/pig/backend/hadoop/executionengine/tez/TezExecType.java 1d90f95
src/org/apache/pig/backend/hadoop/executionengine/tez/TezExecutionEngine.java 5e9caf6
src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java e182f0d
src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java ca06151
src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java 5d68a85
src/org/apache/pig/backend/hadoop/executionengine/tez/TezScriptState.java PRE-CREATION
src/org/apache/pig/impl/PigContext.java 1b6ac61
test/org/apache/pig/test/TestMRCompiler.java 8c85280
test/org/apache/pig/test/Util.java a2bc1cf
test/org/apache/pig/test/data/GoldenFiles/TEZC1.gld PRE-CREATION
test/org/apache/pig/test/data/GoldenFiles/TEZC2.gld PRE-CREATION
test/org/apache/pig/test/data/GoldenFiles/TEZC3.gld PRE-CREATION
test/org/apache/pig/tez/TestTezCompiler.java PRE-CREATION
Diff: https://reviews.apache.org/r/14504/diff/
Testing
-------
Added unit test cases to TestTezCompiler. Note that this patch requires the latest version (trunk) of Apache Tez.
Thanks,
Cheolsoo Park
Re: Review Request 14504: PIG-3500 Initial implementation of TezCompiler
Posted by Mark Wagner <wa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14504/#review26760
-----------------------------------------------------------
src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
<https://reviews.apache.org/r/14504/#comment52086>
Whitespace
src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
<https://reviews.apache.org/r/14504/#comment52087>
s/MapReduceLauncher/TezLauncher
src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
<https://reviews.apache.org/r/14504/#comment52088>
A Tez vertex can have multiple inputs and outputs, each with its own combiner, so we'll need to associate combine plans with their inputs/outputs. That's probably beyond the scope of this patch, though.
src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
<https://reviews.apache.org/r/14504/#comment52089>
Same comment as for the combinePlan.
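One possible shape for the per-output association raised in the combinePlan comment is a map keyed by the downstream vertex. This is only a sketch under stated assumptions: the class CombinePlanSketch and its methods are hypothetical, a String stands in for Pig's plan type, and nothing here reflects how TezOperator is or will be written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not the TezOperator from the patch. */
public class CombinePlanSketch {

    // Keyed by the name of the downstream vertex that an output feeds.
    // A String stands in for the real plan type in this illustration.
    private final Map<String, String> combinePlanByOutput = new LinkedHashMap<>();

    /** Registers the combine plan to apply on the edge to outputVertex. */
    public void setCombinePlan(String outputVertex, String plan) {
        combinePlanByOutput.put(outputVertex, plan);
    }

    /** Returns the combine plan for that edge, or null if it has none. */
    public String getCombinePlan(String outputVertex) {
        return combinePlanByOutput.get(outputVertex);
    }

    public static void main(String[] args) {
        CombinePlanSketch vertex = new CombinePlanSketch();
        vertex.setCombinePlan("scope-30", "partial COUNT");
        System.out.println(vertex.getCombinePlan("scope-30"));
        System.out.println(vertex.getCombinePlan("scope-32"));
    }
}
```

Keeping a single combinePlan field would work only while every vertex has exactly one output, which is the limitation the comment points at.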
- Mark Wagner