You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2016/02/11 21:25:16 UTC

Review Request 43498: PIG-4759 Fix TestMultiQueryLocal.testStoreOrder failure from the initial change

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/
-----------------------------------------------------------

Review request for pig and Daniel Dai.


Bugs: PIG-4759
    https://issues.apache.org/jira/browse/PIG-4759


Repository: pig


Description
-------

After the initial fix which changed to a separate DAG for LOAD after STORE in TezCompiler, TestMultiQueryLocal.testStoreOrder broke.

Problem was split code was iterating over the operators picking the ones to be split and splitting them. The dependencies were messed up and the resulting plan created DAGs that were not correctly connected and there were two empty DAGs which caused an exception. The new logic is as follows
     1) Traverse the plan from root to leaves and find operators to segment from the root.
     2) Disconnect all its successors before moving into new plans. Previously this was not done and due to dependencies the operToSegment was also moved to new plan causing empty DAGs.
     3) Move each of the successors into their new plans and recursively split the successor. When successors are moved into new plans, they can also move other successors along with them due to dependencies and with the recursive splits they can end up in any container. So we scan the plan to see which container they ended up and then connect with that to maintain the dependency correctly.
     4) Repeat steps 1 to 3 for remaining operators to segment in the initial plan.


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1729745 

Diff: https://reviews.apache.org/r/43498/diff/


Testing
-------

Added additional tests in TestTezCompiler and also printed more information in the explain output to see the DAG dependency graph.

Was trying to get the ordering same for JDK7 and JDK8, but it was kind of wasting a lot of time and I did not get to the bottom of it. So ended up doing two different golden files. Will track that down sometime later to see if it can be changed to make explain output same for JDK7 and JDK8.


Thanks,

Rohini Palaniswamy


Re: Review Request 43498: PIG-4759 Fix TestMultiQueryLocal.testStoreOrder failure from the initial patch

Posted by Daniel Dai <da...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/#review118981
-----------------------------------------------------------


Ship it!




Ship It!

- Daniel Dai


On Feb. 11, 2016, 8:25 p.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43498/
> -----------------------------------------------------------
> 
> (Updated Feb. 11, 2016, 8:25 p.m.)
> 
> 
> Review request for pig and Daniel Dai.
> 
> 
> Bugs: PIG-4759
>     https://issues.apache.org/jira/browse/PIG-4759
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> After the initial fix which changed to a separate DAG for LOAD after STORE in TezCompiler, TestMultiQueryLocal.testStoreOrder broke.
> 
> Problem was split code was iterating over the operators picking the ones to be split and splitting them. The dependencies were messed up and the resulting plan created DAGs that were not correctly connected and there were two empty DAGs which caused an exception. The new logic is as follows
>      1) Traverse the plan from root to leaves and find operators to segment from the root.
>      2) Disconnect all its successors before moving into new plans. Previously this was not done and due to dependencies the operToSegment was also moved to new plan causing empty DAGs.
>      3) Move each of the successors into their new plans and recursively split the successor. When successors are moved into new plans, they can also move other successors along with them due to dependencies and with the recursive splits they can end up in any container. So we scan the plan to see which container they ended up and then connect with that to maintain the dependency correctly.
>      4) Repeat steps 1 to 3 for remaining operators to segment in the initial plan.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1729745 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java 1729745 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java 1729745 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java 1729745 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld 1729745 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld PRE-CREATION 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld PRE-CREATION 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld 1729745 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1729745 
> 
> Diff: https://reviews.apache.org/r/43498/diff/
> 
> 
> Testing
> -------
> 
> Added additional tests in TestTezCompiler and also printed more information in the explain output to see the DAG dependency graph.
> 
> Was trying to get the ordering same for JDK7 and JDK8, but it was kind of wasting a lot of time and I did not get to the bottom of it. So ended up doing two different golden files. Will track that down sometime later to see if it can be changed to make explain output same for JDK7 and JDK8.
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>


Re: Review Request 43498: PIG-4759 Fix TestMultiQueryLocal.testStoreOrder failure from the initial patch

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43498/
-----------------------------------------------------------

(Updated Feb. 11, 2016, 8:25 p.m.)


Review request for pig and Daniel Dai.


Summary (updated)
-----------------

PIG-4759 Fix TestMultiQueryLocal.testStoreOrder failure from the initial patch


Bugs: PIG-4759
    https://issues.apache.org/jira/browse/PIG-4759


Repository: pig


Description
-------

After the initial fix which changed to a separate DAG for LOAD after STORE in TezCompiler, TestMultiQueryLocal.testStoreOrder broke.

Problem was split code was iterating over the operators picking the ones to be split and splitting them. The dependencies were messed up and the resulting plan created DAGs that were not correctly connected and there were two empty DAGs which caused an exception. The new logic is as follows
     1) Traverse the plan from root to leaves and find operators to segment from the root.
     2) Disconnect all its successors before moving into new plans. Previously this was not done and due to dependencies the operToSegment was also moved to new plan causing empty DAGs.
     3) Move each of the successors into their new plans and recursively split the successor. When successors are moved into new plans, they can also move other successors along with them due to dependencies and with the recursive splits they can end up in any container. So we scan the plan to see which container they ended up and then connect with that to maintain the dependency correctly.
     4) Repeat steps 1 to 3 for remaining operators to segment in the initial plan.


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainerPrinter.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPrinter.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezDAGStats.java 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-1.gld 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2-JDK7.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-LoadStore-2.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Native-1.gld 1729745 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1729745 

Diff: https://reviews.apache.org/r/43498/diff/


Testing
-------

Added additional tests in TestTezCompiler and also printed more information in the explain output to see the DAG dependency graph.

Was trying to get the ordering same for JDK7 and JDK8, but it was kind of wasting a lot of time and I did not get to the bottom of it. So ended up doing two different golden files. Will track that down sometime later to see if it can be changed to make explain output same for JDK7 and JDK8.


Thanks,

Rohini Palaniswamy