Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2008/12/17 03:22:44 UTC

[jira] Created: (PIG-568) Reducer plan generation fails when UDF contains integers as parameters

Reducer plan generation fails when UDF contains integers as parameters
----------------------------------------------------------------------

                 Key: PIG-568
                 URL: https://issues.apache.org/jira/browse/PIG-568
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: types_branch
            Reporter: Viraj Bhat
            Priority: Critical
             Fix For: types_branch


Consider the following Pig script, which calls a UDF named MYUDF. MYUDF is a dummy UDF that takes a bag E and a set of integer offsets.

{code}
register myudf.jar;

A = load 'visits.txt' using PigStorage() as ( name:chararray, url:chararray, timestamp:chararray);

B = filter A by (
        (name is not null) AND
        (timestamp is not null)
        );

C = group B by (
        url
        );

D = foreach C {
        E = order B by timestamp;
        generate E;
        }

G = foreach D generate
        param.MYUDF(E, -1, 0, 1);
--this works
--param.MYUDF(E,'-1','0','1'); 

explain G;
dump G;
{code}

If you execute the above script, it fails during the reduce phase, at the point where POUserFunc(MYUDF)[bag] is invoked. The MYUDF code is in fact never called; the parameters passed to MYUDF appear to cause the exception in the reduce plan. If you replace the integer offsets -1, 0, 1 with the strings '-1', '0', '1', the UDF is called and the script works fine.
=============================================================================================================================
java.io.IOException: Received Error while processing the reduce plan.
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:307)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:247)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:224)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:136)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:318)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
=============================================================================================================================
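
As a rough illustration of the string workaround described above, a UDF that receives the offsets as '-1', '0', '1' would have to turn them back into integers itself. The sketch below is plain Java with no Pig dependencies; the class OffsetArgs and its parse method are hypothetical names used only for illustration and are not taken from the attached MYUDF.java.

```java
// Hypothetical helper, not part of Pig or of the attached MYUDF.java.
// It shows how offsets passed as quoted strings could be converted back to ints.
final class OffsetArgs {
    private OffsetArgs() {}

    // Converts raw string offsets such as "-1", "0", "1" into an int array.
    static int[] parse(String... rawOffsets) {
        int[] offsets = new int[rawOffsets.length];
        for (int i = 0; i < rawOffsets.length; i++) {
            // trim() tolerates stray whitespace; parseInt accepts a leading minus sign.
            offsets[i] = Integer.parseInt(rawOffsets[i].trim());
        }
        return offsets;
    }
}
```

For example, OffsetArgs.parse("-1", "0", "1") yields the three offsets used in the script. This only sidesteps the failure; it does not address the underlying bug.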

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (PIG-568) Reducer plan generation fails when UDF contains integers as parameters

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-568:
-------------------------------

    Assignee: Pradeep Kamath

Although the script uses a UDF and the error seems to arise from the UDF call, the root cause is a bug in PONegative, which is used to represent the -1 argument: -1 is modelled as PONegative applied to the expression Constant(1). This is a duplicate of issue https://issues.apache.org/jira/browse/PIG-522
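
To make the modelling above concrete, here is a toy expression tree in plain Java in which a negative literal is represented as a negation node wrapping a positive constant. The types Expr, Constant and Negative are hypothetical stand-ins invented for this sketch; they are not Pig's actual physical operator classes, and the sketch does not reproduce the defect tracked in PIG-522.

```java
// Toy model only: these types are hypothetical and are NOT Pig's PONegative
// or constant-expression operators. They only illustrate the tree shape.
interface Expr {
    int eval();
}

// Leaf node holding a literal value, e.g. the 1 in "-1".
final class Constant implements Expr {
    private final int value;

    Constant(int value) {
        this.value = value;
    }

    @Override
    public int eval() {
        return value;
    }
}

// Unary node that negates whatever its operand evaluates to.
final class Negative implements Expr {
    private final Expr operand;

    Negative(Expr operand) {
        this.operand = operand;
    }

    @Override
    public int eval() {
        return -operand.eval();
    }
}
```

In this model the argument -1 is built as new Negative(new Constant(1)), whereas 0 and 1 are bare Constant nodes. That matches the observation that only the negative offset goes through the negation operator.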

> Reducer plan generation fails when UDF contains integers as parameters
> ----------------------------------------------------------------------
>
>                 Key: PIG-568
>                 URL: https://issues.apache.org/jira/browse/PIG-568
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>            Priority: Critical
>             Fix For: types_branch
>
>         Attachments: MYUDF.java, myudfint.pig, visits.txt

[jira] Updated: (PIG-568) Reducer plan generation fails when UDF contains integers as parameters

Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat updated PIG-568:
---------------------------

    Attachment: visits.txt

Test data

> Reducer plan generation fails when UDF contains integers as parameters
> ----------------------------------------------------------------------
>
>                 Key: PIG-568
>                 URL: https://issues.apache.org/jira/browse/PIG-568
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Viraj Bhat
>            Priority: Critical
>             Fix For: types_branch
>
>         Attachments: myudfint.pig, visits.txt

[jira] Resolved: (PIG-568) Reducer plan generation fails when UDF contains integers as parameters

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath resolved PIG-568.
--------------------------------

    Resolution: Duplicate

Marking as a duplicate of https://issues.apache.org/jira/browse/PIG-522, as explained in the previous comment.

> Reducer plan generation fails when UDF contains integers as parameters
> ----------------------------------------------------------------------
>
>                 Key: PIG-568
>                 URL: https://issues.apache.org/jira/browse/PIG-568
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Viraj Bhat
>            Assignee: Pradeep Kamath
>            Priority: Critical
>             Fix For: types_branch
>
>         Attachments: MYUDF.java, myudfint.pig, visits.txt

[jira] Updated: (PIG-568) Reducer plan generation fails when UDF contains integers as parameters

Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat updated PIG-568:
---------------------------

    Attachment: MYUDF.java

Dummy UDF MYUDF.java used in the Pig script

> Reducer plan generation fails when UDF contains integers as parameters
> ----------------------------------------------------------------------
>
>                 Key: PIG-568
>                 URL: https://issues.apache.org/jira/browse/PIG-568
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Viraj Bhat
>            Priority: Critical
>             Fix For: types_branch
>
>         Attachments: MYUDF.java, myudfint.pig, visits.txt

[jira] Updated: (PIG-568) Reducer plan generation fails when UDF contains integers as parameters

Posted by "Viraj Bhat (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/PIG-568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Bhat updated PIG-568:
---------------------------

    Attachment: myudfint.pig

Pig Script causing the exception

> Reducer plan generation fails when UDF contains integers as parameters
> ----------------------------------------------------------------------
>
>                 Key: PIG-568
>                 URL: https://issues.apache.org/jira/browse/PIG-568
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: types_branch
>            Reporter: Viraj Bhat
>            Priority: Critical
>             Fix For: types_branch
>
>         Attachments: myudfint.pig, visits.txt