Posted to dev@pig.apache.org by Cheolsoo Park <pi...@gmail.com> on 2013/10/05 02:21:25 UTC

Review Request 14504: PIG-3500 Initial implementation of TezCompiler

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14504/
-----------------------------------------------------------

Review request for pig, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.


Bugs: PIG-3500
    https://issues.apache.org/jira/browse/PIG-3500


Repository: pig-git


Description
-------

Initial implementation of TezCompiler, which converts a physical plan into a Tez plan. This version works only for basic operators, including LOAD, STORE, FILTER, FOREACH, GROUP, and JOIN.

Here is an example:

a = load '/tmp/input' as (name, age, gpa);
b = filter a by age>=30;
c = group b by age;
d = foreach c generate group as age, COUNT(b);
e = load '/tmp/fact' as (age, comments);
f = join d by age, e by age;
store f into '/tmp/output';

>> pig -x tez -e 'explain -script test.pig'

#--------------------------------------------------
# TEZ plan:
#--------------------------------------------------
Tez vertex scope-29
c: Local Rearrange[tuple]{bytearray}(false) - scope-8
|   |
|   Project[bytearray][1] - scope-9
|
|---b: Filter[bag] - scope-1
    |   |
    |   Greater Than or Equal[boolean] - scope-5
    |   |
    |   |---Cast[int] - scope-3
    |   |   |
    |   |   |---Project[bytearray][1] - scope-2
    |   |
    |   |---Constant(30) - scope-4
    |
    |---a: Load(/tmp/input:org.apache.pig.builtin.PigStorage) - scope-0
Tez vertex scope-30
f: Local Rearrange[tuple]{bytearray}(false) - scope-21
|   |
|   Project[bytearray][0] - scope-22
|
|---d: New For Each(false,false)[bag] - scope-15
    |   |
    |   Project[bytearray][0] - scope-10
    |   |
    |   POUserFunc(org.apache.pig.builtin.COUNT)[long] - scope-13
    |   |
    |   |---Project[bag][1] - scope-12
    |
    |---c: Package[tuple]{bytearray} - scope-7
Tez vertex scope-31
f: Local Rearrange[tuple]{bytearray}(false) - scope-23
|   |
|   Project[bytearray][0] - scope-24
|
|---e: Load(/tmp/fact:org.apache.pig.builtin.PigStorage) - scope-16
Tez vertex scope-32
f: Store(/tmp/output:org.apache.pig.builtin.PigStorage) - scope-28
|
|---f: New For Each(true,true)[tuple] - scope-27
    |   |
    |   Project[bag][1] - scope-25
    |   |
    |   Project[bag][2] - scope-26
    |
    |---f: Package[tuple]{bytearray} - scope-20
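
The explain output above lists the four vertices but does not print the edges between them. As a rough sketch, the edges can be inferred by matching each Local Rearrange alias to the Package with the same alias in another vertex. The class below is a hypothetical stand-in written for illustration only; TezPlanSketch and its methods are not part of Pig or Tez, and the edge list is an inference from the output, not something the patch prints.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the plan above; these are not Pig or Tez classes.
class TezPlanSketch {
    // Vertex name -> names of the vertices that consume its output.
    private final Map<String, List<String>> successors = new LinkedHashMap<>();

    void addVertex(String name) {
        successors.putIfAbsent(name, new ArrayList<>());
    }

    // Record an edge, registering both end points if they are new.
    void connect(String from, String to) {
        addVertex(from);
        addVertex(to);
        successors.get(from).add(to);
    }

    // All vertices that have an edge into the given vertex.
    List<String> predecessors(String name) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : successors.entrySet()) {
            if (e.getValue().contains(name)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Edges inferred (assumption) from matching aliases in the explain output.
    static TezPlanSketch examplePlan() {
        TezPlanSketch plan = new TezPlanSketch();
        plan.connect("scope-29", "scope-30"); // c: Local Rearrange -> c: Package
        plan.connect("scope-30", "scope-32"); // f: Local Rearrange (d side) -> f: Package
        plan.connect("scope-31", "scope-32"); // f: Local Rearrange (e side) -> f: Package
        return plan;
    }

    public static void main(String[] args) {
        // Prints [scope-30, scope-31]: the join vertex has two inputs.
        System.out.println(examplePlan().predecessors("scope-32"));
    }
}
```

Under that reading, scope-32 is the only vertex with two inputs, which corresponds to the join of d and e.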


Diffs
-----

  ivy.xml 7163c89 
  ivy/libraries.properties ea08384 
  src/org/apache/pig/backend/hadoop/executionengine/tez/MapOper.java 623ec95 
  src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/tez/ReduceOper.java ef5fe84 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java a4c9c59 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompilerException.java PRE-CREATION 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezExecType.java 1d90f95 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezExecutionEngine.java 5e9caf6 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java e182f0d 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java ca06151 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java 5d68a85 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezScriptState.java PRE-CREATION 
  src/org/apache/pig/impl/PigContext.java 1b6ac61 
  test/org/apache/pig/test/TestMRCompiler.java 8c85280 
  test/org/apache/pig/test/Util.java a2bc1cf 
  test/org/apache/pig/test/data/GoldenFiles/TEZC1.gld PRE-CREATION 
  test/org/apache/pig/test/data/GoldenFiles/TEZC2.gld PRE-CREATION 
  test/org/apache/pig/test/data/GoldenFiles/TEZC3.gld PRE-CREATION 
  test/org/apache/pig/tez/TestTezCompiler.java PRE-CREATION 

Diff: https://reviews.apache.org/r/14504/diff/


Testing
-------

Added unit test cases to TestTezCompiler. Note that this patch requires the latest version (trunk) of Apache Tez.
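
For readers unfamiliar with golden-file tests such as the TEZC*.gld files listed under Diffs, the general idea can be sketched as follows. This is only an illustration of the technique: GoldenFileSketch, normalize, and matchesGolden are hypothetical names, and this message does not show how TestTezCompiler actually performs the comparison.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not taken from TestTezCompiler.
class GoldenFileSketch {
    // Use "\n" line endings and drop trailing whitespace on every line,
    // so that cosmetic differences do not fail the comparison.
    static String normalize(String text) {
        StringBuilder sb = new StringBuilder();
        for (String line : text.split("\\r?\\n")) {
            sb.append(line.replaceAll("\\s+$", "")).append('\n');
        }
        return sb.toString();
    }

    // Compare a freshly printed plan with the stored golden file.
    static boolean matchesGolden(String actualPlan, Path goldenFile) throws IOException {
        String expected = new String(Files.readAllBytes(goldenFile), StandardCharsets.UTF_8);
        return normalize(expected).equals(normalize(actualPlan));
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a real .gld file.
        Path golden = Files.createTempFile("TEZC-sketch", ".gld");
        Files.write(golden, "Tez vertex scope-29\r\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(matchesGolden("Tez vertex scope-29  \n", golden)); // prints true
        Files.delete(golden);
    }
}
```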


Thanks,

Cheolsoo Park


Re: Review Request 14504: PIG-3500 Initial implementation of TezCompiler

Posted by Mark Wagner <wa...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14504/#review26760
-----------------------------------------------------------



src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
<https://reviews.apache.org/r/14504/#comment52086>

    Whitespace



src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
<https://reviews.apache.org/r/14504/#comment52087>

    s/MapReduceLauncher/TezLauncher



src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
<https://reviews.apache.org/r/14504/#comment52088>

    A Tez vertex can have multiple inputs and outputs, each with their own combiner. We'll need to associate combine plans with their inputs/outputs. That's probably beyond the scope of this patch though.



src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
<https://reviews.apache.org/r/14504/#comment52089>

    Same comment as for the combinePlan.
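
    One possible shape for the association suggested in the two TezOperator comments above is a map from each neighbouring vertex to the plan attached to that edge. The sketch below is purely illustrative: PerEdgePlans is a hypothetical class, not code from the patch, and String values stand in for real plan objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical container; it is not the TezOperator class from the patch.
// P stands in for whatever plan type is attached to an edge.
class PerEdgePlans<P> {
    // Keyed by the name of the vertex on the other end of the edge.
    private final Map<String, P> plansByNeighbor = new LinkedHashMap<>();

    // Attach a plan to the edge shared with the given neighbour,
    // replacing any plan previously attached to that edge.
    void put(String neighbor, P plan) {
        plansByNeighbor.put(neighbor, plan);
    }

    // Return the plan for that edge, or null when the edge has none.
    P get(String neighbor) {
        return plansByNeighbor.get(neighbor);
    }

    // Number of edges that currently carry a plan.
    int size() {
        return plansByNeighbor.size();
    }

    public static void main(String[] args) {
        PerEdgePlans<String> combinePlans = new PerEdgePlans<>();
        combinePlans.put("scope-30", "combiner for the edge to scope-30");
        combinePlans.put("scope-32", "combiner for the edge to scope-32");
        System.out.println(combinePlans.get("scope-30"));
        System.out.println(combinePlans.get("scope-31")); // null: no plan on this edge
    }
}
```

    A single combinePlan field could then be replaced by one such map per direction (inputs and outputs), though whether that fits the rest of the patch is an open question.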


- Mark Wagner

