Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/03/25 02:35:05 UTC

[jira] [Created] (PIG-1935) New logical plan: Should not push up filter in front of Bincond

New logical plan: Should not push up filter in front of Bincond
---------------------------------------------------------------

                 Key: PIG-1935
                 URL: https://issues.apache.org/jira/browse/PIG-1935
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.8.0


The following script produces the wrong result:
{code}
data = LOAD 'data.txt' using PigStorage() as (referrer:chararray, canonical_url:chararray, ip:chararray);
best_url = FOREACH data GENERATE ((canonical_url != '' and canonical_url is not null) ? canonical_url : referrer) AS url, ip;
filtered = FILTER best_url BY url == 'badsite.com';
dump filtered;
{code}
data.txt:
badsite.com             127.0.0.1
goodsite.com/1?foo=true goodsite.com    127.0.0.1

Expected:
(badsite.com,127.0.0.1)

We get nothing.
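
For reference, the intended semantics of the script can be modelled outside Pig. The following is only a minimal Python sketch, not Pig code. It assumes the fields in data.txt are tab-separated and that the first row has an empty canonical_url (the whitespace in the archived message does not show this directly). The names RAW, load, best_url and filter_bad are made up for the illustration.

```python
# Assumption: data.txt is tab-separated and the first row has an empty
# canonical_url field between referrer and ip.
RAW = "badsite.com\t\t127.0.0.1\ngoodsite.com/1?foo=true\tgoodsite.com\t127.0.0.1\n"


def load(raw):
    """Split each line into a (referrer, canonical_url, ip) tuple."""
    return [tuple(line.split("\t")) for line in raw.splitlines()]


def best_url(rows):
    """Model of: FOREACH data GENERATE (cond ? canonical_url : referrer) AS url, ip."""
    out = []
    for referrer, canonical_url, ip in rows:
        use_canonical = canonical_url != "" and canonical_url is not None
        out.append((canonical_url if use_canonical else referrer, ip))
    return out


def filter_bad(rows):
    """Model of: FILTER best_url BY url == 'badsite.com'."""
    return [row for row in rows if row[0] == "badsite.com"]


# Evaluate in the order the script is written: foreach first, then filter.
filtered = filter_bad(best_url(load(RAW)))
print(filtered)  # [('badsite.com', '127.0.0.1')]
```

Evaluated in script order, the first row falls back to its referrer and passes the filter, which matches the expected output above.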

Thanks Corbin Hoenes for reporting.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (PIG-1935) New logical plan: Should not push up filter in front of Bincond

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-1935:
----------------------------

    Attachment: PIG-1935-1.patch


[jira] [Resolved] (PIG-1935) New logical plan: Should not push up filter in front of Bincond

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1935.
-----------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Patch committed to both trunk and the 0.8 branch.


[jira] [Commented] (PIG-1935) New logical plan: Should not push up filter in front of Bincond

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015526#comment-13015526 ] 

jiraposter@reviews.apache.org commented on PIG-1935:
----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/544/
-----------------------------------------------------------

(Updated 2011-04-04 18:10:55.974552)


Review request for pig and thejas.


Summary
-------

The following script produces the wrong result:

data = LOAD 'data.txt' using PigStorage() as (referrer:chararray, canonical_url:chararray, ip:chararray);
best_url = FOREACH data GENERATE ((canonical_url != '' and canonical_url is not null) ? canonical_url : referrer) AS url, ip;
filtered = FILTER best_url BY url == 'badsite.com';
dump filtered;

data.txt:
badsite.com 127.0.0.1
goodsite.com/1?foo=true goodsite.com 127.0.0.1

Expected:
(badsite.com,127.0.0.1)

We get nothing.


This addresses bug PIG-1935.
    https://issues.apache.org/jira/browse/PIG-1935


Diffs
-----

  http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/newplan/logical/expression/BinCondExpression.java 1085215 
  http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestNewPlanFilterAboveForeach.java 1085215 

Diff: https://reviews.apache.org/r/544/diff


Testing
-------

test-patch:
     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

Unit test:
    all pass

End-to-end test:
    all pass


Thanks,

Daniel




[jira] [Commented] (PIG-1935) New logical plan: Should not push up filter in front of Bincond

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015553#comment-13015553 ] 

jiraposter@reviews.apache.org commented on PIG-1935:
----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/544/#review382
-----------------------------------------------------------

Ship it!


+1

- thejas



[jira] [Commented] (PIG-1935) New logical plan: Should not push up filter in front of Bincond

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011064#comment-13011064 ] 

Daniel Dai commented on PIG-1935:
---------------------------------

It fails because we erroneously push the filter above the foreach. In the BinCond case, we should disable that by setting a new uid for the BinCond output.
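
A rough way to picture the uid idea is the toy Python model below. It is not Pig's optimizer code and is heavily simplified. It assumes the optimizer moves a filter above a foreach only when every uid the filter references also exists in the foreach's input, and that before the fix the BinCond output carried the uid of one of its operands. Which operand that was (canonical_url here) is purely an assumption for the illustration, as are all function and variable names.

```python
import itertools

# Hypothetical uid generator; Pig's real uid bookkeeping is more involved.
_uids = itertools.count(1)


def fresh_uid():
    return next(_uids)


# Input schema of the foreach: column name -> uid.
INPUT = {"referrer": fresh_uid(), "canonical_url": fresh_uid(), "ip": fresh_uid()}


def foreach_output_uids(bincond_gets_new_uid):
    """Uids of (url, ip) produced by the foreach.

    Assumption: without the fix, url inherits the uid of canonical_url.
    """
    url_uid = fresh_uid() if bincond_gets_new_uid else INPUT["canonical_url"]
    return {"url": url_uid, "ip": INPUT["ip"]}


def can_push_filter_above(filter_uids, input_uids):
    """The filter may move above the foreach only if all its uids exist in the input."""
    return set(filter_uids) <= set(input_uids)


before_fix = foreach_output_uids(bincond_gets_new_uid=False)
after_fix = foreach_output_uids(bincond_gets_new_uid=True)

pushed_before_fix = can_push_filter_above([before_fix["url"]], INPUT.values())
pushed_after_fix = can_push_filter_above([after_fix["url"]], INPUT.values())

# Effect of the erroneous push: the filter now tests canonical_url directly.
ROWS = [
    ("badsite.com", "", "127.0.0.1"),
    ("goodsite.com/1?foo=true", "goodsite.com", "127.0.0.1"),
]
wrongly_filtered = [row for row in ROWS if row[1] == "badsite.com"]

print(pushed_before_fix, pushed_after_fix, wrongly_filtered)  # True False []
```

In this model a fresh uid on the BinCond output means the filter's uid is not found in the foreach's input, so the filter stays below the foreach and sees the computed url.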
