Posted to dev@pig.apache.org by "Pradeep Kamath (JIRA)" <ji...@apache.org> on 2008/08/08 03:08:44 UTC

[jira] Created: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Filter does not allow udf as the filter operator and only allows ComparisonOperators
------------------------------------------------------------------------------------

                 Key: PIG-369
                 URL: https://issues.apache.org/jira/browse/PIG-369
             Project: Pig
          Issue Type: Bug
    Affects Versions: types_branch
            Reporter: Pradeep Kamath
             Fix For: types_branch


The following pig script does not work:
{code}
register util.jar;
define MyFilterSet util.FilterUdf('filter.txt');
A = load 'simpletest' using PigStorage() as ( x, y );
B = filter A by MyFilterSet(x);
dump B;
{code}

The following error is seen:
{noformat}

java -cp pig.jar:$localc org.apache.pig.Main filter.pig 
2008-08-07 17:59:37,663 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: localhost:9000
2008-08-07 17:59:37,748 [main] WARN  org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
2008-08-07 17:59:38,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
2008-08-07 17:59:38,166 [main] WARN  org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
java.io.IOException: Unable to open iterator for alias: B [org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc]
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.setPlan(POFilter.java:179)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:592)
        at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:102)
        at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:31)
        at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
        at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:245)
        at org.apache.pig.PigServer.compilePp(PigServer.java:590)
        at org.apache.pig.PigServer.execute(PigServer.java:516)
        at org.apache.pig.PigServer.openIterator(PigServer.java:307)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:258)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
        at org.apache.pig.Main.main(Main.java:302)
Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc
        ... 15 more

{noformat}

I looked further, and the issue seems to be in POFilter, which treats the leaf of the filter plan only as a ComparisonOperator and so does not allow a UDF for filtering:
{code}
public void setPlan(PhysicalPlan plan) {
    this.plan = plan;
    comOp = (ComparisonOperator) (plan.getLeaves()).get(0);
    compOperandType = comOp.getOperandType();
}
{code}
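
The failure mode can be reproduced outside Pig with a minimal sketch. The classes below are hypothetical, simplified stand-ins (they only borrow the names from the stack trace and are not Pig's real hierarchy); the point is just that an unconditional downcast of the plan leaf fails as soon as the leaf is a UDF operator rather than a comparison.

```java
public class FilterCastSketch {
    // Hypothetical, simplified stand-ins; not Pig's real class hierarchy.
    static class PhysicalOperator {}
    static class ComparisonOperator extends PhysicalOperator {}
    static class POUserFunc extends PhysicalOperator {}

    // Mirrors the unconditional downcast in setPlan and reports what happens.
    static String castLeaf(PhysicalOperator leaf) {
        try {
            ComparisonOperator comOp = (ComparisonOperator) leaf;
            return "cast ok: " + comOp.getClass().getSimpleName();
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // A comparison leaf survives the cast.
        System.out.println(castLeaf(new ComparisonOperator()));
        // A UDF leaf does not, which matches the "Caused by" line above.
        System.out.println(castLeaf(new POUserFunc()));
    }
}
```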

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich reassigned PIG-369:
----------------------------------

    Assignee: Shravan Matthur Narayanamurthy


[jira] Updated: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-369:
-----------------------------------------------

    Comment: was deleted


[jira] Updated: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-369:
-----------------------------------------------

    Status: Patch Available  (was: Open)

Changes to POFilter: Changed the type of the leaf operator from ComparisonOperator to PhysicalOperator.
Changes to TypeCheckingVisitor: It was failing in checkInnerPlan because LOUserFunc was not one of the supported roots. Added it, since it can occur in the inner plan of a filter, and used the visit(LOUserFunc) method to do the necessary type checking and schema propagation.
Changes to Translator: Minor. The addition was causing an NPE. Fixed it by adding a null check.
Added a new unit test for FilterUDF.
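
As a rough illustration of the POFilter change (not the contents of 369.patch), the sketch below holds the leaf under a general operator type so that a comparison and a UDF are handled the same way. Everything here is an assumption made for illustration: the `PhysicalOperator` interface, its `eval` method, and the `GreaterThan`, `UserFunc` and `Filter` classes are hypothetical stand-ins, not Pig's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterLeafSketch {
    // Hypothetical stand-in: any operator that can yield a Boolean for a row.
    interface PhysicalOperator {
        Boolean eval(List<Object> row);
    }

    // A comparison leaf: keeps rows whose first field exceeds a threshold.
    static class GreaterThan implements PhysicalOperator {
        private final int threshold;
        GreaterThan(int threshold) { this.threshold = threshold; }
        public Boolean eval(List<Object> row) {
            return ((Integer) row.get(0)) > threshold;
        }
    }

    // A UDF-style leaf: keeps rows whose first field is in an allowed set.
    static class UserFunc implements PhysicalOperator {
        private final List<Object> allowed;
        UserFunc(List<Object> allowed) { this.allowed = allowed; }
        public Boolean eval(List<Object> row) {
            return allowed.contains(row.get(0));
        }
    }

    static class Filter {
        // Declared with the general type, so no downcast is needed.
        private PhysicalOperator leaf;
        void setLeaf(PhysicalOperator leaf) { this.leaf = leaf; }

        List<List<Object>> apply(List<List<Object>> rows) {
            List<List<Object>> kept = new ArrayList<>();
            for (List<Object> row : rows) {
                if (Boolean.TRUE.equals(leaf.eval(row))) {
                    kept.add(row);
                }
            }
            return kept;
        }
    }

    static List<List<Object>> sampleRows() {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(Arrays.<Object>asList(1, "a"));
        rows.add(Arrays.<Object>asList(5, "b"));
        rows.add(Arrays.<Object>asList(9, "c"));
        return rows;
    }

    public static void main(String[] args) {
        Filter filter = new Filter();
        filter.setLeaf(new GreaterThan(4));
        System.out.println(filter.apply(sampleRows()));
        filter.setLeaf(new UserFunc(Arrays.<Object>asList(1, 9)));
        System.out.println(filter.apply(sampleRows()));
    }
}
```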


[jira] Updated: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-369:
-----------------------------------------------

    Status: Open  (was: Patch Available)


[jira] Resolved: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich resolved PIG-369.
--------------------------------

    Resolution: Fixed

Patch committed; thanks, Shravan!


[jira] Updated: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-369:
-----------------------------------------------

    Status: Open  (was: Patch Available)

I am including this patch in the patch for PIG-375.


[jira] Updated: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-369:
-----------------------------------------------

    Attachment: 369.patch

> Filter does not allow udf as the filter operator and only allows ComparisonOperators
> ------------------------------------------------------------------------------------
>
>                 Key: PIG-369
>                 URL: https://issues.apache.org/jira/browse/PIG-369
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Shravan Matthur Narayanamurthy
>             Fix For: types_branch
>
>         Attachments: 369.patch
>
>
> The following pig script does not work:
> {code}
> register util.jar;
> define MyFilterSet util.FilterUdf('filter.txt');
> A = load 'simpletest' using PigStorage() as ( x, y );
> B = filter A by MyFilterSet(x);
> dump B;
> {code}
> The following error is seen:
> {noformat}
> java -cp pig.jar:$localc org.apache.pig.Main filter.pig 
> 2008-08-07 17:59:37,663 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: localhost:9000
> 2008-08-07 17:59:37,748 [main] WARN  org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
> 2008-08-07 17:59:38,035 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
> 2008-08-07 17:59:38,166 [main] WARN  org.apache.hadoop.fs.FileSystem - "localhost:9000" is a deprecated filesystem name. Use "hdfs://localhost:9000/" instead.
> java.io.IOException: Unable to open iterator for alias: B [org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc]
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.setPlan(POFilter.java:179)
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:592)
>         at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:102)
>         at org.apache.pig.impl.logicalLayer.LOFilter.visit(LOFilter.java:31)
>         at org.apache.pig.impl.plan.DependencyOrderWalker.walk(DependencyOrderWalker.java:68)
>         at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:245)
>         at org.apache.pig.PigServer.compilePp(PigServer.java:590)
>         at org.apache.pig.PigServer.execute(PigServer.java:516)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:307)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:258)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:82)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:302)
> Caused by: java.lang.ClassCastException: org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc
>         ... 15 more
> {noformat}
> I looked further and the issue seems to be in POFilter which only thinks of the filter operator as a ComparisonOperator and doesn't allow a UDF for filtering:
> {code}
> public void setPlan(PhysicalPlan plan) {
>         this.plan = plan;
>         comOp = (ComparisonOperator) (plan.getLeaves()).get(0);
>         compOperandType = comOp.getOperandType();
>     }
> {code}
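
To illustrate the failure mode in the quoted setPlan, here is a minimal, self-contained Java sketch. The classes ExpressionOperator, ComparisonOperator and UserFunc below are hypothetical stand-ins, not Pig's real operators, and checkedLeaf is only one possible way to guard the cast; it is not the fix applied to this issue.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterLeafSketch {
    // Hypothetical stand-ins for physical expression operators (not Pig's classes).
    interface ExpressionOperator {
        String name();
    }

    static class ComparisonOperator implements ExpressionOperator {
        public String name() { return "comparison"; }
    }

    static class UserFunc implements ExpressionOperator {
        public String name() { return "udf"; }
    }

    // Mirrors the unconditional cast in the quoted setPlan:
    // throws ClassCastException when the leaf is not a ComparisonOperator.
    static String uncheckedLeaf(List<ExpressionOperator> leaves) {
        ComparisonOperator comOp = (ComparisonOperator) leaves.get(0);
        return comOp.name();
    }

    // Inspects the leaf's type before casting, so a UDF leaf is accepted.
    static String checkedLeaf(List<ExpressionOperator> leaves) {
        ExpressionOperator leaf = leaves.get(0);
        if (leaf instanceof ComparisonOperator) {
            return ((ComparisonOperator) leaf).name();
        }
        return "boolean expression: " + leaf.name();
    }

    public static void main(String[] args) {
        List<ExpressionOperator> leaves = new ArrayList<>();
        leaves.add(new UserFunc());
        try {
            uncheckedLeaf(leaves);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed");
        }
        System.out.println(checkedLeaf(leaves));
    }
}
```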

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-369) Filter does not allow udf as the filter operator and only allows ComparisonOperators

Posted by "Shravan Matthur Narayanamurthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shravan Matthur Narayanamurthy updated PIG-369:
-----------------------------------------------

    Status: Patch Available  (was: Open)

Implemented the visit(LOCross) method in LogToPhyTranslationVisitor. This mimics what we were doing in Pig-1.0. To summarize, the following script with Cross will be converted as shown below:

{noformat}
A1 = load 'f1';
A2 = load 'f2';
.
.
.
An = load 'fn';
B = cross A1,A2,...,An;
{noformat}

{noformat}
A1 = load 'f1';
.
.
.
An = load 'fn';
B1 = foreach A1 generate flatten(GFCross('n','0')), flatten(*);
B2 = foreach A2 generate flatten(GFCross('n','1')), flatten(*);
.
.
.
Bn = foreach An generate flatten(GFCross('n','n-1')), flatten(*);
C = splgroup B1 by ($0,$1,..,$n-1) inner, B2 by ($0,$1,..,$n-1) inner, ..., Bn by ($0,$1,..,$n-1) inner;
D = foreach C generate flatten($1), flatten($2), ..., flatten($n);
{noformat}

GFCross outputs a bag of n-tuples; the foreach flattens the bag and attaches each of those tuples to the original tuple, thus replicating each original tuple.
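
The flatten-and-attach step can be sketched in plain Java as follows. This is only an illustration: the class and method names are hypothetical, tuples are modelled as java.util.List, and the marker values passed in main are made up rather than real GFCross output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlattenAttachSketch {
    // For every marker tuple in the bag, emit one row made of the marker
    // fields followed by the fields of the original tuple.
    static List<List<Object>> attach(List<List<Object>> markerBag, List<Object> tuple) {
        List<List<Object>> rows = new ArrayList<>();
        for (List<Object> marker : markerBag) {
            List<Object> row = new ArrayList<>(marker);
            row.addAll(tuple);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up marker bag with two 2-tuples (n = 2).
        List<List<Object>> bag = new ArrayList<>();
        bag.add(Arrays.<Object>asList(2, 0));
        bag.add(Arrays.<Object>asList(2, 1));
        List<Object> original = Arrays.<Object>asList("R", 4);
        // The original tuple is replicated once per marker tuple.
        System.out.println(attach(bag, original));
    }
}
```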

The only difference from a normal Pig script is the splgroup, where the local-rearrange is slightly modified. When it is processing a cross, it strips from each value tuple the first n values, which the foreach attached, and passes the remaining original tuple as the value while retaining the first n values as the key.

For example, the foreach might produce (2,1,R,4), where (R,4) is the actual tuple and (2,1) is one of the tuples in the GFCross output. The local-rearrange splits such a tuple into key and value by making (2,1) the key and (R,4) the value.
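
The key/value split for this example can be sketched in Java as below. The class CrossKeySplit and its methods are hypothetical names used only for illustration; this is not the local-rearrange code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CrossKeySplit {
    // The first n fields (attached by the foreach) become the key.
    static List<Object> key(List<Object> tuple, int n) {
        return new ArrayList<>(tuple.subList(0, n));
    }

    // The remaining fields are the original tuple and become the value.
    static List<Object> value(List<Object> tuple, int n) {
        return new ArrayList<>(tuple.subList(n, tuple.size()));
    }

    public static void main(String[] args) {
        List<Object> produced = Arrays.<Object>asList(2, 1, "R", 4);
        int n = 2; // number of inputs being crossed in this example
        // prints "[2, 1] -> [R, 4]"
        System.out.println(key(produced, n) + " -> " + value(produced, n));
    }
}
```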

So the patch has two changes: one to the translator and the other to the local-rearrange.

