Posted to dev@pig.apache.org by Daniel Dai <da...@gmail.com> on 2011/03/18 22:27:28 UTC

Review Request: Bug in new logical plan : No output generated even though there are valid records

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/517/
-----------------------------------------------------------

Review request for pig and thejas.


Summary
-------

The script below produces no output even though there are valid records in relation B, which is used in the left outer join.

A0 = load 'input' using Maploader() as ( map1, map2, map3 );
A = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) );
B0 = filter A by (map1#'type' == 'c');
B = filter B0 by ( map2#'info'#'s' matches 'aaaa|bbb|cccc');
C = filter A by (map1#'type' == 'p');
D = join B by map2#'params'#'query' LEFT OUTER , C by map2#'params'#'query';
store D into 'output';

This is a bug in the new logical plan. From the plan I can see that map1#'type' and map2#'info'#'s' are not marked as RequiredKeys,
whereas all the fields referenced in the first filter statement are marked as required.

For the script to work, I have to either turn off the column prune optimizer with -t ColumnMapKeyPrune or rearrange the script as follows:
B0 = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) and (map1#'type' == 'c') );
C = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) and (map1#'type' == 'p') );
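
Putting the rearranged filters together, the complete workaround script might look like the sketch below. It assumes that the load, the second filter on B0, the join, and the store statements stay exactly as in the original script; only the two combined filters are taken from the lines above, and the intermediate relation A is no longer needed.

```pig
-- Sketch of the rearranged script (assumption: all statements other than
-- the two combined filters are unchanged from the original script).
A0 = load 'input' using Maploader() as ( map1, map2, map3 );

-- Each branch filters A0 directly, so every map key it needs is referenced
-- in a filter that sits immediately on top of the load.
B0 = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) and (map1#'type' == 'c') );
B = filter B0 by ( map2#'info'#'s' matches 'aaaa|bbb|cccc');
C = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) and (map1#'type' == 'p') );

D = join B by map2#'params'#'query' LEFT OUTER, C by map2#'params'#'query';
store D into 'output';
```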


This addresses bug PIG-1892.
    https://issues.apache.org/jira/browse/PIG-1892


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/rules/MapKeysPruneHelper.java 1082312 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPruneColumn.java 1082312 

Diff: https://reviews.apache.org/r/517/diff


Testing
-------

Test-patch:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 541 release audit warnings (more than the trunk's current 539 warnings).

Unit test:
    all pass


Thanks,

Daniel


Re: Review Request: Bug in new logical plan : No output generated even though there are valid records

Posted by Daniel Dai <da...@gmail.com>.

(Updated 2011-03-18 14:34:17.932065)


Review request for pig and thejas.


Summary
-------

The script below produces no output even though there are valid records in relation B, which is used in the left outer join.

A0 = load 'input' using Maploader() as ( map1, map2, map3 );
A = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) );
B0 = filter A by (map1#'type' == 'c');
B = filter B0 by ( map2#'info'#'s' matches 'aaaa|bbb|cccc');
C = filter A by (map1#'type' == 'p');
D = join B by map2#'params'#'query' LEFT OUTER , C by map2#'params'#'query';
store D into 'output';

This is a bug in the new logical plan. From the plan I can see that map1#'type' and map2#'info'#'s' are not marked as RequiredKeys,
whereas all the fields referenced in the first filter statement are marked as required.

For the script to work, I have to either turn off the column prune optimizer with -t ColumnMapKeyPrune or rearrange the script as follows:
B0 = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) and (map1#'type' == 'c') );
C = filter A0 by ( (map2#'params'#'prop' == 464) and (map2#'params'#'query' is not null) and (map1#'type' == 'p') );


This addresses bug PIG-1892.
    https://issues.apache.org/jira/browse/PIG-1892


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/rules/MapKeysPruneHelper.java 1082312 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPruneColumn.java 1082312 

Diff: https://reviews.apache.org/r/517/diff


Testing (updated)
-------

Test-patch:
     [exec] -1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     -1 release audit.  The applied patch generated 541 release audit warnings (more than the trunk's current 539 warnings).
No new files were added, so the "release audit" warnings can be ignored.

Unit test:
    all pass


Thanks,

Daniel