Posted to dev@pig.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2014/09/14 07:20:38 UTC
Review Request 25617: PIG-4104: Accumulator UDF throws OOM in Tez
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25617/
-----------------------------------------------------------
Review request for pig, Cheolsoo Park and Daniel Dai.
Bugs: PIG-4104
https://issues.apache.org/jira/browse/PIG-4104
Repository: pig
Description
-------
Use a separate TezAccumulativeTupleBuffer that iterates through the inputs and returns tuples in batches instead of making a full copy.
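To illustrate the idea of handing out tuples in batches rather than copying every value up front, here is a minimal sketch in plain Java. It is not the patch itself: BatchedBuffer, BatchedBufferDemo, sumInBatches and the batchSize parameter are hypothetical names chosen for this example, and integers stand in for Pig tuples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a batching buffer; not Pig's TezAccumulativeTupleBuffer. */
final class BatchedBuffer<T> {
    private final Iterator<T> input; // values for the current key, read lazily
    private final int batchSize;     // upper bound on values held in memory at once
    private final List<T> batch;     // one list, reused for every batch

    BatchedBuffer(Iterator<T> input, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.input = input;
        this.batchSize = batchSize;
        this.batch = new ArrayList<>(batchSize);
    }

    boolean hasNextBatch() {
        return input.hasNext();
    }

    /** Clears the previous batch and refills it with up to batchSize values. */
    List<T> nextBatch() {
        batch.clear();
        while (batch.size() < batchSize && input.hasNext()) {
            batch.add(input.next());
        }
        return batch;
    }
}

final class BatchedBufferDemo {
    /** Consumes the input batch by batch, the way an accumulating function would. */
    static long sumInBatches(Iterator<Integer> values, int batchSize) {
        BatchedBuffer<Integer> buffer = new BatchedBuffer<>(values, batchSize);
        long sum = 0;
        while (buffer.hasNextBatch()) {
            for (int v : buffer.nextBatch()) {
                sum += v;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInBatches(Arrays.asList(1, 2, 3, 4, 5).iterator(), 2));
    }
}
```

In this sketch at most batchSize values are held at any time, so memory use does not grow with the number of values for a key.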
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java 1624398
Diff: https://reviews.apache.org/r/25617/diff/
Testing
-------
Ran the TestAccumulator unit test and the Accumulator and SecondarySort e2e test groups; all passed. Will run the full suite before committing.
Thanks,
Rohini Palaniswamy
Re: Review Request 25617: PIG-4104: Accumulator UDF throws OOM in Tez
Posted by Rohini Palaniswamy <ro...@gmail.com>.
(Updated Sept. 14, 2014, 5:56 a.m.)
Review request for pig, Cheolsoo Park and Daniel Dai.
Changes
-------
Reuse the same TezAccumulativeTupleBuffer for all input keys.
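As a rough sketch of what reusing one buffer across keys can look like, the example below resets a single instance for each new key instead of allocating a fresh one. ReusableBuffer, its reset, key and nextBatch methods, and the String values are all hypothetical and only illustrate the pattern; they are not the classes or signatures in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical reusable buffer; the names here are not Pig's. */
final class ReusableBuffer {
    private final List<String> bag = new ArrayList<>();           // allocated once, cleared per key
    private String currentKey;                                    // replaced on every reset
    private Iterator<String> values = Collections.emptyIterator();

    /** Points the same instance at a new key instead of allocating a new buffer. */
    void reset(String key, Iterator<String> newValues) {
        bag.clear();
        currentKey = key;
        values = newValues;
    }

    String key() {
        return currentKey;
    }

    /** Refills the shared bag with up to max values for the current key. */
    List<String> nextBatch(int max) {
        bag.clear();
        while (bag.size() < max && values.hasNext()) {
            bag.add(values.next());
        }
        return bag;
    }
}

final class ReusableBufferDemo {
    public static void main(String[] args) {
        ReusableBuffer buffer = new ReusableBuffer();
        buffer.reset("a", Arrays.asList("a1", "a2", "a3").iterator());
        System.out.println(buffer.key() + " " + buffer.nextBatch(2));
        buffer.reset("b", Arrays.asList("b1").iterator());
        System.out.println(buffer.key() + " " + buffer.nextBatch(2));
    }
}
```

One consequence of this pattern, at least in the sketch: the list returned by nextBatch is overwritten on the next call or reset, so a consumer that needs the values later has to copy them first.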
Bugs: PIG-4104
https://issues.apache.org/jira/browse/PIG-4104
Repository: pig
Description
-------
Use a separate TezAccumulativeTupleBuffer that iterates through the inputs and returns tuples in batches instead of making a full copy.
Diffs (updated)
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java 1624398
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/AccumulatorOptimizerUtil.java 1624398
Diff: https://reviews.apache.org/r/25617/diff/
Testing
-------
Ran the TestAccumulator unit test and the Accumulator and SecondarySort e2e test groups; all passed. Will run the full suite before committing.
Thanks,
Rohini Palaniswamy
Re: Review Request 25617: PIG-4104: Accumulator UDF throws OOM in Tez
Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25617/#review53276
-----------------------------------------------------------
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
<https://reviews.apache.org/r/25617/#comment92868>
Daniel,
Would there be a problem if a single instance of TezAccumulativeTupleBuffer were reused for every record, given that the ArrayList bags can be cleared and the min key reset? My only concern is streaming. I am still not familiar with the internals of streaming, and I believe there were cases where copies of the data had to be made for it.
- Rohini Palaniswamy