You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2017/08/01 22:41:45 UTC

Re: Review Request 61265: PIG-5256: Bytecode generation for POFilter and POForeach

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61265/
-----------------------------------------------------------

(Updated Aug. 1, 2017, 10:41 p.m.)


Review request for pig, Daniel Dai and Koji Noguchi.


Changes
-------

Did some cleanup and fixed some e2e failures. There a couple more to fix


Bugs: PIG-5256
    https://issues.apache.org/jira/browse/PIG-5256


Repository: pig


Description (updated)
-------

For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. 
Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code.


TODO items to be addressed in a separate jira:
1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira
2) Support for Accumulator
3) Support for CROSS
4) Run the optimizer on combiner plans


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1803384 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_35.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION 


Diff: https://reviews.apache.org/r/61265/diff/2/

Changes: https://reviews.apache.org/r/61265/diff/1-2/


Testing (updated)
-------

Running full suite of unit and e2e tests with pig.opt.bytecode=true. Will update soon


Thanks,

Rohini Palaniswamy


Re: Review Request 61265: PIG-5256: Bytecode generation for POFilter and POForeach

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61265/
-----------------------------------------------------------

(Updated Aug. 15, 2017, 9:16 p.m.)


Review request for pig, Daniel Dai and Koji Noguchi.


Changes
-------

Extended generation of common code to include BinaryExpressionOperators, UnaryExpressionOperators, POCast and POBinCond as well apart from POUserFunc, PODistinct and POSort. There were many cases of our user scripts with PIG-3000 issue where the UDF was part of a POBinCond.


Bugs: PIG-5256
    https://issues.apache.org/jira/browse/PIG-5256


Repository: pig


Description
-------

For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. 
Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code.

License:
The newly added fernflower.jar used for decompilation is from IntelliJ (java decompiler used by IntelliJ IDEA)  https://github.com/JetBrains/intellij-community/blob/master/plugins/java-decompiler/engine/src/org/jetbrains/java/decompiler/main/Fernflower.java  and the license of that is Apache 2.0 ithttps://github.com/JetBrains/intellij-community/blob/master/LICENSE.txt

TODO items to be addressed in a separate jira:
1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira
2) Support for Accumulator
3) Support for CROSS
4) Run the optimizer on combiner plans
5) Fix for test failures - TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3
 java.lang.LinkageError: loader (instance of  org/apache/pig/impl/PigContext$ContextClassLoader): attempted  duplicate class definition for name: "org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach_scope_6"
 Since NodeIdGenerator is ThreadLocal, running same script in parallel with different parameters using embedded Python causes conflict. Requires ThreadLocal classloaders in PigContext. Will address in a separate jira - PIG-5291.

Other jiras fixed as part of this:
1) PIG-4515: org.apache.pig.builtin.Distinct throws ClassCastException
2) When pig.opt.bytecode=true, UDFs defined as an alias inside nested foreach are executed only once. This was actually the primary goal of doing bytecode generation
PIG-3000: Optimize nested foreach
PIG-1633: Using an alias withing Nested Foreach causes indeterminate behaviour


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/EvalFunc.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/BagInputOperator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/GetNextTupleDataBag.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SimpleAbstractDataBag.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/AddExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/AndExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/BinCondExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/CastExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ConstantExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/DereferenceExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/DivideExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/EqualExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/GreaterThanEqualExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/GreaterThanExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/IsNullExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/LessThanEqualExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/LessThanExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/LogicalExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/MapLookupExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ModExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/MultiplyExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/NegativeExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/NotEqualExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/NotExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/OrExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ProjectExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/RegexExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ScalarExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/SubtractExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipelineLocal.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC17.gld 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/POForEach_scope_8.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_53.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POFilter_scope_45.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_18.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_31.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_43.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_49.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Distinct-2.gld 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1805089 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java 1805089 


Diff: https://reviews.apache.org/r/61265/diff/8/

Changes: https://reviews.apache.org/r/61265/diff/7-8/


Testing
-------

Ran full suite of unit and e2e tests with both pig.opt.bytecode=true. All tests except TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 mentioned in description pass.


Thanks,

Rohini Palaniswamy


Re: Review Request 61265: PIG-5256: Bytecode generation for POFilter and POForeach

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61265/
-----------------------------------------------------------

(Updated Aug. 11, 2017, 7:56 p.m.)


Review request for pig, Daniel Dai and Koji Noguchi.


Changes
-------

Changes done
  1) Fixed the unnecessary bag construction for PORelationToExprProject in case of isStar() which got missed when I removed PORelationToExprProject from isStar() handling in CodeGenerator to handle EOP cases.
  2) Extended cases where Read once bags to UDFs as well which made a big improvement to performance and totally brings down spilling required in a task. By default EvalFunc.isInputBagIterationOnce is true. i.e, it is assumed that all UDFs iterator over input bag only once. Did not make default false as there would be only very few UDFs that might iterate on bag more than once. Any one turning on pig.opt.bytecode can fix that UDF.


Bugs: PIG-5256
    https://issues.apache.org/jira/browse/PIG-5256


Repository: pig


Description (updated)
-------

For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. 
Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code.

License:
The newly added fernflower.jar used for decompilation is from IntelliJ (java decompiler used by IntelliJ IDEA)  https://github.com/JetBrains/intellij-community/blob/master/plugins/java-decompiler/engine/src/org/jetbrains/java/decompiler/main/Fernflower.java  and the license of that is Apache 2.0 ithttps://github.com/JetBrains/intellij-community/blob/master/LICENSE.txt

TODO items to be addressed in a separate jira:
1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira
2) Support for Accumulator
3) Support for CROSS
4) Run the optimizer on combiner plans
5) Fix for test failures - TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3
 java.lang.LinkageError: loader (instance of  org/apache/pig/impl/PigContext$ContextClassLoader): attempted  duplicate class definition for name: "org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach_scope_6"
 Since NodeIdGenerator is ThreadLocal, running same script in parallel with different parameters using embedded Python causes conflict. Requires ThreadLocal classloaders in PigContext. Will address in a separate jira - PIG-5291.

Other jiras fixed as part of this:
1) PIG-4515: org.apache.pig.builtin.Distinct throws ClassCastException
2) When pig.opt.bytecode=true, UDFs defined as an alias inside nested foreach are executed only once. This was actually the primary goal of doing bytecode generation
PIG-3000: Optimize nested foreach
PIG-1633: Using an alias withing Nested Foreach causes indeterminate behaviour


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/EvalFunc.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/BagInputOperator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/GetNextTupleDataBag.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SimpleAbstractDataBag.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipelineLocal.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC17.gld 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/POForEach_scope_8.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_35.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POFilter_scope_45.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_18.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_31.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_43.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_49.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Distinct-2.gld 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java 1804148 


Diff: https://reviews.apache.org/r/61265/diff/7/

Changes: https://reviews.apache.org/r/61265/diff/6-7/


Testing
-------

Ran full suite of unit and e2e tests with both pig.opt.bytecode=true. All tests except TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 mentioned in description pass.


Thanks,

Rohini Palaniswamy


Re: Review Request 61265: PIG-5256: Bytecode generation for POFilter and POForeach

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61265/
-----------------------------------------------------------

(Updated Aug. 9, 2017, 3:59 a.m.)


Review request for pig, Daniel Dai and Koji Noguchi.


Changes
-------

During performance testing on large data, found that were lot of spills. Fixed that by avoiding materialization of the full databag in nested foreach similar to how it happens in regular processing.  

Also in addition to udfs, any repeated POSort and PODistinct is now processed only once. Also took care of clearing those values for garbage collection as soon as the last foreach plan referring to them is done instead of at the end when input was detached.


Bugs: PIG-5256
    https://issues.apache.org/jira/browse/PIG-5256


Repository: pig


Description
-------

For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. 
Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code.

License:
The newly added fernflower.jar used for decompilation is from IntelliJ (java decompiler used by IntelliJ IDEA)  https://github.com/JetBrains/intellij-community/blob/master/plugins/java-decompiler/engine/src/org/jetbrains/java/decompiler/main/Fernflower.java  and the license of that is Apache 2.0 ithttps://github.com/JetBrains/intellij-community/blob/master/LICENSE.txt

TODO items to be addressed in a separate jira:
1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira
2) Support for Accumulator
3) Support for CROSS
4) Run the optimizer on combiner plans
5) Fix for test failures - TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3
 java.lang.LinkageError: loader (instance of  org/apache/pig/impl/PigContext$ContextClassLoader): attempted  duplicate class definition for name: "org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach_scope_6"
 Since NodeIdGenerator is ThreadLocal, running same script in parallel with different parameters using embedded Python causes conflict. Requires ThreadLocal classloaders in PigContext. Will address in a separate jira.

Other jiras fixed as part of this:
1) PIG-4515: org.apache.pig.builtin.Distinct throws ClassCastException
2) When pig.opt.bytecode=true, UDFs defined as an alias inside nested foreach are executed only once. This was actually the primary goal of doing bytecode generation
PIG-3000: Optimize nested foreach
PIG-1633: Using an alias withing Nested Foreach causes indeterminate behaviour


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/BagInputOperator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/GetNextTupleDataBag.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/SimpleAbstractDataBag.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipelineLocal.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC17.gld 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/POForEach_scope_8.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_35.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POFilter_scope_45.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_18.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_31.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_43.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_49.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Distinct-2.gld 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java 1804148 


Diff: https://reviews.apache.org/r/61265/diff/6/

Changes: https://reviews.apache.org/r/61265/diff/5-6/


Testing
-------

Ran full suite of unit and e2e tests with both pig.opt.bytecode=true. All tests except TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 mentioned in description pass.


Thanks,

Rohini Palaniswamy


Re: Review Request 61265: PIG-5256: Bytecode generation for POFilter and POForeach

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61265/
-----------------------------------------------------------

(Updated Aug. 7, 2017, 6:05 p.m.)


Review request for pig, Daniel Dai and Koji Noguchi.


Changes
-------

Fix test failure for Orc_Pushdown_4. Had not noticed this failure earlier due to https://issues.apache.org/jira/browse/PIG-5286


Bugs: PIG-5256
    https://issues.apache.org/jira/browse/PIG-5256


Repository: pig


Description
-------

For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. 
Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code.

License:
The newly added fernflower.jar used for decompilation is from IntelliJ (java decompiler used by IntelliJ IDEA)  https://github.com/JetBrains/intellij-community/blob/master/plugins/java-decompiler/engine/src/org/jetbrains/java/decompiler/main/Fernflower.java  and the license of that is Apache 2.0 ithttps://github.com/JetBrains/intellij-community/blob/master/LICENSE.txt

TODO items to be addressed in a separate jira:
1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira
2) Support for Accumulator
3) Support for CROSS
4) Run the optimizer on combiner plans
5) Fix for test failures - TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3
 java.lang.LinkageError: loader (instance of  org/apache/pig/impl/PigContext$ContextClassLoader): attempted  duplicate class definition for name: "org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach_scope_6"
 Since NodeIdGenerator is ThreadLocal, running same script in parallel with different parameters using embedded Python causes conflict. Requires ThreadLocal classloaders in PigContext. Will address in a separate jira.

Other jiras fixed as part of this:
1) PIG-4515: org.apache.pig.builtin.Distinct throws ClassCastException
2) When pig.opt.bytecode=true, UDFs defined as an alias inside nested foreach are executed only once. This was actually the primary goal of doing bytecode generation
PIG-3000: Optimize nested foreach
PIG-1633: Using an alias withing Nested Foreach causes indeterminate behaviour


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipelineLocal.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC17.gld 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/POForEach_scope_8.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_35.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_18.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_31.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_43.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/POForEach_scope_45.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval2/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Distinct-2.gld 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1804148 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java 1804148 


Diff: https://reviews.apache.org/r/61265/diff/5/

Changes: https://reviews.apache.org/r/61265/diff/4-5/


Testing
-------

Ran full suite of unit and e2e tests with both pig.opt.bytecode=true. All tests except TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 mentioned in description pass.


Thanks,

Rohini Palaniswamy


Re: Review Request 61265: PIG-5256: Bytecode generation for POFilter and POForeach

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61265/
-----------------------------------------------------------

(Updated Aug. 3, 2017, 10:46 p.m.)


Review request for pig, Daniel Dai and Koji Noguchi.


Changes
-------

Final patch with some more cleanup and all tests but TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 passing. Will create a separate jira for those two


Bugs: PIG-5256
    https://issues.apache.org/jira/browse/PIG-5256


Repository: pig


Description (updated)
-------

For each POForEach or POFilter operator in the plan, a corresponding class is created by inlining code for the input plans. 
Bytecode is generated directly through asm APIs and the generated class files are shipped to the tasks as a jar. On the frontend, the generated classes are dynamically added to the PigContext classloader. The explain command now decompiles the class files in the explain output directory. Refer to the golden files in the test for examples of generated code.

License:
The newly added fernflower.jar used for decompilation is from IntelliJ (java decompiler used by IntelliJ IDEA)  https://github.com/JetBrains/intellij-community/blob/master/plugins/java-decompiler/engine/src/org/jetbrains/java/decompiler/main/Fernflower.java  and the license of that is Apache 2.0 ithttps://github.com/JetBrains/intellij-community/blob/master/LICENSE.txt

TODO items to be addressed in a separate jira:
1) PIG-5279 - Support for MR and Spark. Currently only done for Tez. Will also add documentation in this jira
2) Support for Accumulator
3) Support for CROSS
4) Run the optimizer on combiner plans
5) Fix for test failures - TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3
 java.lang.LinkageError: loader (instance of  org/apache/pig/impl/PigContext$ContextClassLoader): attempted  duplicate class definition for name: "org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach_scope_6"
 Since NodeIdGenerator is ThreadLocal, running same script in parallel with different parameters using embedded Python causes conflict. Requires ThreadLocal classloaders in PigContext. Will address in a separate jira.

Other jiras fixed as part of this:
1) PIG-4515: org.apache.pig.builtin.Distinct throws ClassCastException
2) When pig.opt.bytecode=true, UDFs defined as an alias inside nested foreach are executed only once. This was actually the primary goal of doing bytecode generation
PIG-3000: Optimize nested foreach
PIG-1633: Using an alias withing Nested Foreach causes indeterminate behaviour


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/build.xml 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/ivy.xml 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/ivy/libraries.properties 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/shade/pom.xml PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigConfiguration.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/PigServer.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRUtil.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/AsmUtil.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/CodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/FilterCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/optimizer/ForEachCodeGenerator.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/PhysicalOperator.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Add.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Divide.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/EqualToExpr.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GTOrEqualToExpr.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/GreaterThanExpr.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LTOrEqualToExpr.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/LessThanExpr.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Mod.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Multiply.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/NotEqualToExpr.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POCast.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POProject.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORegexp.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/PORelationToExprProject.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/Subtract.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POFilter.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POForEach.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLimit.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POOptimizedForEach.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSortedDistinct.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezPlanContainer.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/RuntimeCodeGenOptimizer.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/InvokerGenerator.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/DataType.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigContext.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/plan/OperatorPlan.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/ClassDecompiler.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/JarManager.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/grunt/GruntParser.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestBuiltin.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestByteCodeGen.java PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestEvalPipelineLocal.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestProjectStarRangeInUdf.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/Util.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/MRC17.gld 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POFilter_scope_12.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/POForEach_scope_11.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilter/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POFilter_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testFilterSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/POForEach_scope_8.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachMapLookUp/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_32.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/POForEach_scope_9.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachSplit/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/POForEach_scope_35.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testForeachUDFEval/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_20.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POFilter_scope_48.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_14.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_26.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_37.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_42.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_54.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_61.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_67.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_7.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_70.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/POForEach_scope_76.java.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/bytecodegen/tez/testNestedForeach/plan.gld PRE-CREATION 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Distinct-2.gld 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1804046 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java 1804046 


Diff: https://reviews.apache.org/r/61265/diff/3/

Changes: https://reviews.apache.org/r/61265/diff/2-3/


Testing (updated)
-------

Ran full suite of unit and e2e tests with both pig.opt.bytecode=true. All tests except TestScriptLanguage.runParallelTest2, Jython_CompileBindRun_3 mentioned in description pass.


Thanks,

Rohini Palaniswamy