You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2016/02/11 21:25:16 UTC
Review Request 43498: PIG-4759 Fix TestMultiQueryLocal.testStoreOrder
failure from the initial change
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/
-----------------------------------------------------------
Review request for pig and Daniel Dai.
Bugs: PIG-4759
https://issues.apache.org/jira/browse/PIG-4759
Repository: pig
Description
-------
After the initial fix which changed to a separate DAG for LOAD after STORE in TezCompiler, TestMultiQueryLocal.testStoreOrder broke.
Problem was split code was iterating over the operators picking the ones to be split and splitting them. The dependencies were messed up and the resulting plan created DAGs that were not correctly connected and there were two empty DAGs which caused an exception. The new logic is as follows
1) Traverse the plan from root to leaves and find operators to segment from the root.
2) Disconnect all its successors before moving into new plans. Previously this was not done and due to dependencies the operToSegment was also moved to new plan causing empty DAGs.
3) Move each of the successors into their new plans and recursively split the successor. When successors are moved into new plans, they can also move other successors along with them due to dependencies and with the recursive splits they can end up in any container. So we scan the plan to see which container they ended up and then connect with that to maintain the dependency correctly.
4) Repeat steps 1 to 3 for remaining operators to segment in the initial plan.
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld 1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld 1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1729745
Diff: https://reviews.apache.org/r/43498/diff/
Testing
-------
Added additional tests in TestTezCompiler and also printed more information in the explain output to see the DAG dependency graph.
Was trying to get the ordering same for JDK7 and JDK8, but it was kind of wasting a lot of time and I did not get to the bottom of it. So ended up doing two different golden files. Will track that down sometime later to see if it can be changed to make explain output same for JDK7 and JDK8.
Thanks,
Rohini Palaniswamy
Re: Review Request 43498: PIG-4759 Fix
TestMultiQueryLocal.testStoreOrder failure from the initial patch
Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/#review118981
-----------------------------------------------------------
Ship it!
Ship It!
- Daniel Dai
On Feb. 11, 2016, 8:25 p.m., Rohini Palaniswamy wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43498/
> -----------------------------------------------------------
>
> (Updated Feb. 11, 2016, 8:25 p.m.)
>
>
> Review request for pig and Daniel Dai.
>
>
> Bugs: PIG-4759
> https://issues.apache.org/jira/browse/PIG-4759
>
>
> Repository: pig
>
>
> Description
> -------
>
> After the initial fix which changed to a separate DAG for LOAD after STORE in TezCompiler, TestMultiQueryLocal.testStoreOrder broke.
>
> Problem was split code was iterating over the operators picking the ones to be split and splitting them. The dependencies were messed up and the resulting plan created DAGs that were not correctly connected and there were two empty DAGs which caused an exception. The new logic is as follows
> 1) Traverse the plan from root to leaves and find operators to segment from the root.
> 2) Disconnect all its successors before moving into new plans. Previously this was not done and due to dependencies the operToSegment was also moved to new plan causing empty DAGs.
> 3) Move each of the successors into their new plans and recursively split the successor. When successors are moved into new plans, they can also move other successors along with them due to dependencies and with the recursive splits they can end up in any container. So we scan the plan to see which container they ended up and then connect with that to maintain the dependency correctly.
> 4) Repeat steps 1 to 3 for remaining operators to segment in the initial plan.
>
>
> Diffs
> -----
>
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1729745
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java 1729745
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java 1729745
> http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java 1729745
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld 1729745
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld PRE-CREATION
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld PRE-CREATION
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld 1729745
> http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1729745
>
> Diff: https://reviews.apache.org/r/43498/diff/
>
>
> Testing
> -------
>
> Added additional tests in TestTezCompiler and also printed more information in the explain output to see the DAG dependency graph.
>
> Was trying to get the ordering same for JDK7 and JDK8, but it was kind of wasting a lot of time and I did not get to the bottom of it. So ended up doing two different golden files. Will track that down sometime later to see if it can be changed to make explain output same for JDK7 and JDK8.
>
>
> Thanks,
>
> Rohini Palaniswamy
>
>
Re: Review Request 43498: PIG-4759 Fix
TestMultiQueryLocal.testStoreOrder failure from the initial patch
Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/
-----------------------------------------------------------
(Updated Feb. 11, 2016, 8:25 p.m.)
Review request for pig and Daniel Dai.
Summary (updated)
-----------------
PIG-4759 Fix TestMultiQueryLocal.testStoreOrder failure from the initial patch
Bugs: PIG-4759
https://issues.apache.org/jira/browse/PIG-4759
Repository: pig
Description
-------
After the initial fix which changed to a separate DAG for LOAD after STORE in TezCompiler, TestMultiQueryLocal.testStoreOrder broke.
Problem was split code was iterating over the operators picking the ones to be split and splitting them. The dependencies were messed up and the resulting plan created DAGs that were not correctly connected and there were two empty DAGs which caused an exception. The new logic is as follows
1) Traverse the plan from root to leaves and find operators to segment from the root.
2) Disconnect all its successors before moving into new plans. Previously this was not done and due to dependencies the operToSegment was also moved to new plan causing empty DAGs.
3) Move each of the successors into their new plans and recursively split the successor. When successors are moved into new plans, they can also move other successors along with them due to dependencies and with the recursive splits they can end up in any container. So we scan the plan to see which container they ended up and then connect with that to maintain the dependency correctly.
4) Repeat steps 1 to 3 for remaining operators to segment in the initial plan.
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java 1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld 1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld 1729745
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1729745
Diff: https://reviews.apache.org/r/43498/diff/
Testing
-------
Added additional tests in TestTezCompiler and also printed more information in the explain output to see the DAG dependency graph.
Was trying to get the ordering same for JDK7 and JDK8, but it was kind of wasting a lot of time and I did not get to the bottom of it. So ended up doing two different golden files. Will track that down sometime later to see if it can be changed to make explain output same for JDK7 and JDK8.
Thanks,
Rohini Palaniswamy