Posted to dev@pig.apache.org by Alex Bain <am...@gmail.com> on 2014/01/16 01:44:09 UTC

Review Request 16926: PIG-3557 Implement LIMIT optimizations in Tez

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16926/
-----------------------------------------------------------

Review request for pig, Cheolsoo Park, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.


Bugs: PIG-3557
    https://issues.apache.org/jira/browse/PIG-3557


Repository: pig-git


Description
-------

Implement LIMIT optimizations in Tez - https://issues.apache.org/jira/browse/PIG-3557

1. If the previous Tez vertex has a requestedParallelism of 1 and does not start with a POLoad, we don't need to add a second LIMIT vertex (since the LIMIT we put at the end of the previous vertex is good enough).

2. If we are not in the "limited order by" case, we can use an unsorted shuffle edge instead of the regular shuffle-sort edge.
   This code is added but commented out for now, since it depends on TEZ-661.

3. I manually verified that the LimitOptimizer can push LIMIT to the InputHandler in certain cases (no code changes).
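
For reviewers who want the decision logic of items 1 and 2 at a glance, here is a minimal standalone sketch. It is not the code in TezCompiler.java: the class LimitPlanSketch, the Vertex type, the EdgeKind enum, and both method names are hypothetical stand-ins, and operators are represented as plain strings. The edge choice in item 2 is shown as if it were enabled, although in the patch it stays commented out until TEZ-661 is available.

```java
import java.util.Arrays;
import java.util.List;

public class LimitPlanSketch {

    // Hypothetical stand-in for a Tez vertex in the compiled plan (not Pig's own class).
    public static final class Vertex {
        final int requestedParallelism;
        final List<String> operators; // physical operator names, in plan order

        public Vertex(int requestedParallelism, List<String> operators) {
            this.requestedParallelism = requestedParallelism;
            this.operators = operators;
        }

        boolean startsWithLoad() {
            return !operators.isEmpty() && "POLoad".equals(operators.get(0));
        }
    }

    // Hypothetical edge kinds; the names are assumptions, not Tez API names.
    public enum EdgeKind { SORTED_SHUFFLE, UNSORTED_SHUFFLE }

    // Item 1: the LIMIT at the end of the previous vertex is enough when that
    // vertex has a requestedParallelism of 1 and does not start with a POLoad.
    public static boolean needsSecondLimitVertex(Vertex previous) {
        boolean previousLimitIsEnough =
                previous.requestedParallelism == 1 && !previous.startsWithLoad();
        return !previousLimitIsEnough;
    }

    // Item 2: only the "limited order by" case keeps the regular shuffle-sort edge.
    public static EdgeKind edgeIntoLimitVertex(boolean limitedOrderBy) {
        return limitedOrderBy ? EdgeKind.SORTED_SHUFFLE : EdgeKind.UNSORTED_SHUFFLE;
    }

    public static void main(String[] args) {
        Vertex singleTask = new Vertex(1, Arrays.asList("POPackage", "POLimit"));
        Vertex loadFirst = new Vertex(1, Arrays.asList("POLoad", "POLimit"));
        Vertex manyTasks = new Vertex(10, Arrays.asList("POPackage", "POLimit"));

        System.out.println("single task:      " + needsSecondLimitVertex(singleTask));
        System.out.println("starts with load: " + needsSecondLimitVertex(loadFirst));
        System.out.println("ten tasks:        " + needsSecondLimitVertex(manyTasks));
        System.out.println("plain limit edge: " + edgeIntoLimitVertex(false));
    }
}
```

Only the first vertex in main skips the second LIMIT vertex; the other two still need it.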


Diffs
-----

  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 5c6a242 
  test/org/apache/pig/test/data/GoldenFiles/TEZC7.gld 9cf5baf 

Diff: https://reviews.apache.org/r/16926/diff/


Testing
-------

TestTezCompiler unit test updated
ant test-tez passes
e2e tests - same results as in clean tez branch


Thanks,

Alex Bain


Re: Review Request 16926: PIG-3557 Implement LIMIT optimizations in Tez

Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16926/#review31980
-----------------------------------------------------------

Ship it!


I will commit it after running tests.

- Cheolsoo Park

