You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Cheolsoo Park <pi...@gmail.com> on 2013/12/11 18:41:48 UTC

Review Request 16165: PIG-3618 Replace broadcast edges with scatter/gather edges in union

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16165/
-----------------------------------------------------------

Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.


Bugs: PIG-3618
    https://issues.apache.org/jira/browse/PIG-3618


Repository: pig-git


Description
-------

Replace broadcast edges with scatter/gather edges for union. I also changed the name of POPackage from POBroadcastTezLoad to POUnionTezLoad.


Diffs
-----

  src/org/apache/pig/backend/hadoop/executionengine/tez/POBroadcastTezLoad.java d7b9d5a 
  src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 2a96da8 
  src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java e69de29 
  src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 8fb18d3 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java ea4d167 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 2e55ea9 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java 79c97f7 
  test/e2e/pig/tests/tez.conf b7dd862 

Diff: https://reviews.apache.org/r/16165/diff/


Testing
-------

Ran tez unit tests and e2e tests.

I set the parallel of union in the e2e test case to 2.


Thanks,

Cheolsoo Park


Re: Review Request 16165: PIG-3618 Replace broadcast edges with scatter/gather edges in union

Posted by Cheolsoo Park <pi...@gmail.com>.

> On Dec. 11, 2013, 9:45 p.m., Daniel Dai wrote:
> > Let's put a comment subject to change after TEZ-661. waitingQueue seems unnecessary since we can remember where we are in POUnionTezLoad, and this could increase the chance of OOM. But we don't need to address it now, since it is just a temporary fix.

Thank you Daniel for the review. I added a TODO comment regarding TEZ-661 and removed the queue in a new patch. Now I use the the int count to keep track of duplicate tuples.

I'll commit this new patch.


- Cheolsoo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16165/#review30229
-----------------------------------------------------------


On Dec. 11, 2013, 10:47 p.m., Cheolsoo Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16165/
> -----------------------------------------------------------
> 
> (Updated Dec. 11, 2013, 10:47 p.m.)
> 
> 
> Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3618
>     https://issues.apache.org/jira/browse/PIG-3618
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Replace broadcast edges with scatter/gather edges for union. I also changed the name of POPackage from POBroadcastTezLoad to POUnionTezLoad.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POBroadcastTezLoad.java d7b9d5a 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 2a96da8 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java e69de29 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 8fb18d3 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java ea4d167 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 2e55ea9 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java 79c97f7 
>   test/e2e/pig/tests/tez.conf b7dd862 
> 
> Diff: https://reviews.apache.org/r/16165/diff/
> 
> 
> Testing
> -------
> 
> Ran tez unit tests and e2e tests.
> 
> I set the parallel of union in the e2e test case to 2.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>


Re: Review Request 16165: PIG-3618 Replace broadcast edges with scatter/gather edges in union

Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16165/#review30229
-----------------------------------------------------------

Ship it!


Let's put a comment subject to change after TEZ-661. waitingQueue seems unnecessary since we can remember where we are in POUnionTezLoad, and this could increase the chance of OOM. But we don't need to address it now, since it is just a temporary fix.

- Daniel Dai


On Dec. 11, 2013, 5:41 p.m., Cheolsoo Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16165/
> -----------------------------------------------------------
> 
> (Updated Dec. 11, 2013, 5:41 p.m.)
> 
> 
> Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3618
>     https://issues.apache.org/jira/browse/PIG-3618
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Replace broadcast edges with scatter/gather edges for union. I also changed the name of POPackage from POBroadcastTezLoad to POUnionTezLoad.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POBroadcastTezLoad.java d7b9d5a 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 2a96da8 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java e69de29 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 8fb18d3 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java ea4d167 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 2e55ea9 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java 79c97f7 
>   test/e2e/pig/tests/tez.conf b7dd862 
> 
> Diff: https://reviews.apache.org/r/16165/diff/
> 
> 
> Testing
> -------
> 
> Ran tez unit tests and e2e tests.
> 
> I set the parallel of union in the e2e test case to 2.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>


Re: Review Request 16165: PIG-3618 Replace broadcast edges with scatter/gather edges in union

Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16165/#review30234
-----------------------------------------------------------

Ship it!


Ship It!

- Daniel Dai


On Dec. 11, 2013, 10:47 p.m., Cheolsoo Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16165/
> -----------------------------------------------------------
> 
> (Updated Dec. 11, 2013, 10:47 p.m.)
> 
> 
> Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.
> 
> 
> Bugs: PIG-3618
>     https://issues.apache.org/jira/browse/PIG-3618
> 
> 
> Repository: pig-git
> 
> 
> Description
> -------
> 
> Replace broadcast edges with scatter/gather edges for union. I also changed the name of POPackage from POBroadcastTezLoad to POUnionTezLoad.
> 
> 
> Diffs
> -----
> 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POBroadcastTezLoad.java d7b9d5a 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 2a96da8 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java e69de29 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 8fb18d3 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java ea4d167 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 2e55ea9 
>   src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java 79c97f7 
>   test/e2e/pig/tests/tez.conf b7dd862 
> 
> Diff: https://reviews.apache.org/r/16165/diff/
> 
> 
> Testing
> -------
> 
> Ran tez unit tests and e2e tests.
> 
> I set the parallel of union in the e2e test case to 2.
> 
> 
> Thanks,
> 
> Cheolsoo Park
> 
>


Re: Review Request 16165: PIG-3618 Replace broadcast edges with scatter/gather edges in union

Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16165/
-----------------------------------------------------------

(Updated Dec. 11, 2013, 10:47 p.m.)


Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini Palaniswamy.


Changes
-------

Got rid of queue in POUnionTezLoad for efficiency. Added a TODO comment for TEZ-661.


Bugs: PIG-3618
    https://issues.apache.org/jira/browse/PIG-3618


Repository: pig-git


Description
-------

Replace broadcast edges with scatter/gather edges for union. I also changed the name of POPackage from POBroadcastTezLoad to POUnionTezLoad.


Diffs (updated)
-----

  src/org/apache/pig/backend/hadoop/executionengine/tez/POBroadcastTezLoad.java d7b9d5a 
  src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 2a96da8 
  src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java e69de29 
  src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 8fb18d3 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java ea4d167 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 2e55ea9 
  src/org/apache/pig/backend/hadoop/executionengine/tez/TezPOPackageAnnotator.java 79c97f7 
  test/e2e/pig/tests/tez.conf b7dd862 

Diff: https://reviews.apache.org/r/16165/diff/


Testing
-------

Ran tez unit tests and e2e tests.

I set the parallel of union in the e2e test case to 2.


Thanks,

Cheolsoo Park