Posted to dev@pig.apache.org by Daniel Dai <da...@gmail.com> on 2011/02/01 20:08:23 UTC

Review Request: Indeterministic behavior in local mode due to static variable PigMapReduce.sJobConf

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/376/
-----------------------------------------------------------

Review request for pig and Richard Ding.


Summary
-------

The script below gives me different output when run in local mode. It looks as though, in local mode, I have to store a relation obtained through streaming before I can use it afterwards.

For example, consider the script below:

DEFINE MySTREAMUDF `test.sh`;
A = LOAD 'myinput' USING PigStorage() AS (myId:chararray, data2, data3, data4);
B = STREAM A THROUGH MySTREAMUDF AS (wId:chararray, num:int);
-- STORE B INTO 'output.B';
C = JOIN B BY wId LEFT OUTER, A BY myId;
D = FOREACH C GENERATE B::wId, B::num, data4;
D = STREAM D THROUGH MySTREAMUDF AS (f1:chararray, f2:int);
-- STORE D INTO 'output.D';
E = FOREACH B GENERATE wId, num;
F = DISTINCT E;
G = GROUP F ALL;
H = FOREACH G GENERATE COUNT_STAR(F) AS TotalCount;
I = CROSS D, H;
STORE I INTO 'output.I';
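
The subject line attributes the behavior to the static variable PigMapReduce.sJobConf. As a rough illustration of that kind of hazard, and not code taken from the patch, the Java sketch below assumes that two jobs run as threads inside a single JVM (as can happen in local mode) and each records its own configuration before reading it back. All names here (SharedConfDemo, staticConf, threadLocalConf, run) are hypothetical, and a plain String stands in for a job configuration object. With the static field both jobs read back the same value, so one of them sees the other job's setting; with a ThreadLocal each job sees its own. ThreadLocal is shown only as one possible way to isolate the state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;

/** Hypothetical stand-in for two jobs sharing one JVM; not Pig code. */
class SharedConfDemo {
    // One slot shared by every job in the JVM, like a static configuration field.
    static volatile String staticConf;
    // One independent slot per thread.
    static final ThreadLocal<String> threadLocalConf = new ThreadLocal<>();

    /** Runs two "jobs" in parallel and returns the configuration each one read back. */
    static Map<String, String> run(boolean useThreadLocal) throws Exception {
        Map<String, String> seen = new ConcurrentHashMap<>();
        // Ensures both jobs have written their configuration before either reads it.
        CyclicBarrier bothConfigured = new CyclicBarrier(2);
        Thread[] jobs = new Thread[2];
        for (int i = 0; i < 2; i++) {
            String name = "job" + i;
            jobs[i] = new Thread(() -> {
                try {
                    if (useThreadLocal) {
                        threadLocalConf.set(name);
                    } else {
                        staticConf = name;
                    }
                    bothConfigured.await();
                    seen.put(name, useThreadLocal ? threadLocalConf.get() : staticConf);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            jobs[i].start();
        }
        for (Thread job : jobs) {
            job.join();
        }
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Both entries hold the same value; which job "wins" varies from run to run.
        System.out.println("static field: " + run(false));
        // Each job reads back its own name.
        System.out.println("thread-local: " + run(true));
    }
}
```

Which job's value ends up in the static field depends on thread scheduling, which is one way a shared static can make results vary between runs.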


This addresses bug PIG-1831.
    https://issues.apache.org/jira/browse/PIG-1831


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapBase.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapReduce.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/SkewedPartitioner.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCollectedGroup.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeCogroup.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartitionRearrange.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/MapRedUtil.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/InternalCachedBag.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/InternalDistinctBag.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/InternalSortedBag.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/builtin/DefaultIndexableLoader.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestFRJoin.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestFinish.java 1065894 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/utils/FILTERFROMFILE.java 1065894 

Diff: https://reviews.apache.org/r/376/diff


Testing
-------


Thanks,

Daniel


Re: Review Request: Indeterministic behavior in local mode due to static variable PigMapReduce.sJobConf

Posted by Daniel Dai <da...@gmail.com>.

(Updated 2011-02-17 17:57:02.235808)


Review request for pig and Richard Ding.


Diffs (updated)
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapBase.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigMapReduce.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/SkewedPartitioner.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCollectedGroup.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POCombinerPackage.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PODistinct.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POJoinPackage.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeCogroup.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POMergeJoin.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartitionRearrange.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POSort.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/MapRedUtil.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/streaming/HadoopExecutableManager.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/builtin/Distinct.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/InternalCachedBag.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/InternalDistinctBag.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/data/InternalSortedBag.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/builtin/DefaultIndexableLoader.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestFRJoin.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestFinish.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPruneColumn.java 1071829 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/utils/FILTERFROMFILE.java 1071829 

Diff: https://reviews.apache.org/r/376/diff


Testing (updated)
-------

test-patch:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 12 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 517 release audit warnings (more than the trunk's current 516 warnings).

The release audit warning can be ignored: no new files were added.

Unit test:
    all pass

End-to-end test:
    all pass


Thanks,

Daniel