Posted to dev@pig.apache.org by "Vivek Padmanabhan (JIRA)" <ji...@apache.org> on 2011/03/11 12:10:59 UTC

[jira] Created: (PIG-1895) Class cast exception while projecting udf result

Class cast exception while projecting udf result
------------------------------------------------

                 Key: PIG-1895
                 URL: https://issues.apache.org/jira/browse/PIG-1895
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0, 0.7.0, 0.9.0
            Reporter: Vivek Padmanabhan


A ClassCastException is thrown when I try to project a field from the result of my UDF. The UDF declares an output schema with three fields of types DataType.BAG, DataType.LONG and DataType.INTEGER.

This is my script:
{code}
Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
AllData = group Data all parallel 1;
SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
SampledData1 = foreach SampledData generate rs.sampled;
{code}

Even though the output schema defines "sampled" as a data bag, at run time the projection receives the entire tuple returned by the UDF instead of only the data bag inside it.
{code}
Exception received:
java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
{code}
This issue occurs with 0.9, 0.8 and 0.7.
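
The failure mode described above can be illustrated outside Pig with a minimal plain-Java sketch. Everything in it is an assumption made for illustration: SketchTuple, SketchBag and the two project methods are hypothetical stand-ins, not Pig classes and not the actual POProject logic. It only shows why casting the whole UDF result to a bag fails, while dereferencing the "sampled" field first succeeds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for Pig's Tuple and DataBag; these are NOT Pig classes.
class ProjectionSketch {

    // A tuple is an ordered list of fields of arbitrary type.
    static final class SketchTuple {
        final List<Object> fields;
        SketchTuple(Object... fields) {
            this.fields = new ArrayList<Object>(Arrays.asList(fields));
        }
        Object get(int i) {
            return fields.get(i);
        }
    }

    // A bag is a collection of tuples.
    static final class SketchBag {
        final List<SketchTuple> tuples;
        SketchBag(SketchTuple... tuples) {
            this.tuples = new ArrayList<SketchTuple>(Arrays.asList(tuples));
        }
        int size() {
            return tuples.size();
        }
    }

    // Expected behaviour: dereference field 0 ("sampled") first, then treat it as a bag.
    static SketchBag projectSampled(SketchTuple udfResult) {
        return (SketchBag) udfResult.get(0);
    }

    // Behaviour matching the report: the whole UDF result is handed on as if it were the bag.
    static SketchBag projectWithoutDereference(Object udfResult) {
        return (SketchBag) udfResult; // throws ClassCastException when given a tuple
    }

    public static void main(String[] args) {
        // Same shape as the UDF output in the report: (bag, long, int).
        SketchBag sampled = new SketchBag(new SketchTuple(1), new SketchTuple(3));
        SketchTuple rs = new SketchTuple(sampled, Long.valueOf(3), Integer.valueOf(2));

        System.out.println("projected bag size: " + projectSampled(rs).size());
        try {
            projectWithoutDereference(rs);
        } catch (ClassCastException e) {
            System.out.println("whole tuple cast to bag: " + e.getClass().getSimpleName());
        }
    }
}
```

In this sketch the second call fails for the same reason as the first frame of the stack trace above: a tuple object arrives where a bag is expected.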



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (PIG-1895) Class cast exception while projecting udf result

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1895.
-----------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.8.0)
                   0.8.1

Fixed by PIG-1866.


[jira] [Commented] (PIG-1895) Class cast exception while projecting udf result

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009248#comment-13009248 ] 

Daniel Dai commented on PIG-1895:
---------------------------------

This issue is of the same nature as PIG-1866. It sounds odd, but Pig seems to have trouble projecting a bag out of a tuple.


[jira] Commented: (PIG-1895) Class cast exception while projecting udf result

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006761#comment-13006761 ] 

Daniel Dai commented on PIG-1895:
---------------------------------

Sorry, the last comment was not meant for this Jira; please ignore it.


[jira] Updated: (PIG-1895) Class cast exception while projecting udf result

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-1895:
--------------------------------

    Fix Version/s: 0.8.0
         Assignee: Daniel Dai


[jira] Commented: (PIG-1895) Class cast exception while projecting udf result

Posted by "Vivek Padmanabhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005604#comment-13005604 ] 

Vivek Padmanabhan commented on PIG-1895:
----------------------------------------

UDF source code:
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.DataType;
import org.apache.pig.data.DefaultBagFactory;
import org.apache.pig.data.DefaultTupleFactory;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;

public class TestEvalFunc extends EvalFunc<Tuple>{
	
	// Copies the tuples of the input bag into a new bag and returns (bag, long, int).
	public Tuple exec(Tuple input) throws IOException {
		ArrayList<Tuple> tupleList = new ArrayList<Tuple>(2);
		DataBag values = (DataBag)(input.get(0));
		for (Iterator<Tuple> vit = values.iterator(); vit.hasNext();) {
			tupleList.add(vit.next());
		}
		DataBag sampleBag = DefaultBagFactory.getInstance().newDefaultBag(tupleList);
		Tuple output = DefaultTupleFactory.getInstance().newTuple(3);
		output.set(0, sampleBag);
		output.set(1, Long.valueOf(3));
		output.set(2, Integer.valueOf(2));
		return output;
	}
	// Declares the three output fields: sampled (bag), k (long), i (int).
	public Schema outputSchema(Schema input) {
		Schema udfSchema = new Schema();
		udfSchema.add(new Schema.FieldSchema("sampled",DataType.BAG));
		udfSchema.add(new Schema.FieldSchema("k",DataType.LONG));
		udfSchema.add(new Schema.FieldSchema("i",DataType.INTEGER));
		return udfSchema;
	}
}
{code}

Test case to verify:
{code}
import static org.apache.pig.ExecType.LOCAL;

import java.util.ArrayList;
import java.util.Iterator;

import junit.framework.TestCase;

import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

public class MyPigUnitTests extends TestCase {
  private static String patternString = "(\\d+)!+(\\w+)~+(\\w+)";
  public static ArrayList<String[]> data = new ArrayList<String[]>();
  static {
    data.add(new String[] { "1"});
    data.add(new String[] { "3"});
  }

  private static String [] script = new String []{
     "Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );",
     "AllData = group Data all parallel 1;",
     "SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;",
     "SampledData1 = foreach SampledData generate rs.sampled;",
  };

  public void test() throws Exception {
    String filename = TestHelper.createTempFile(data, "");
    PigServer pig = new PigServer(LOCAL);
    filename = filename.replace("\\", "\\\\");
    patternString = patternString.replace("\\", "\\\\");

    // Register each statement of the script.
    for (String query : script) {
      pig.registerQuery(query);
    }

    // Count the non-empty tuples produced for SampledData1.
    Iterator<?> it = pig.openIterator("SampledData1");
    int tupleCount = 0;
    while (it.hasNext()) {
      Tuple tuple = (Tuple) it.next();
      if (tuple == null) {
        break;
      }
      if (tuple.size() > 0) {
        tupleCount++;
      }
    }
    assertEquals(1, tupleCount);
  }
}
{code}


[jira] Commented: (PIG-1895) Class cast exception while projecting udf result

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006757#comment-13006757 ] 

Daniel Dai commented on PIG-1895:
---------------------------------

TestEvalFunc forgets to set the twolevelaccess flag for the bag. Pig 0.9 does not require the twolevelaccess flag. However, we saw Pig 0.9 fail with a different stack:

ERROR 1000: Invalid field reference. Referenced field [guid] does not exist in schema: null.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias C3
        at org.apache.pig.PigServer.explain(PigServer.java:993)
        at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:368)
        at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:300)
        at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:263)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:665)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:325)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:176)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
        at org.apache.pig.Main.run(Main.java:537)
        at org.apache.pig.Main.main(Main.java:108)
Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 1000: Invalid field reference. Referenced field [guid] does not exist in schema: null.
        at org.apache.pig.newplan.logical.visitor.ColumnAliasConversionVisitor$1.visit(ColumnAliasConversionVisitor.java:114)
        at org.apache.pig.newplan.logical.expression.DereferenceExpression.accept(DereferenceExpression.java:83)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
        at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:114)
        at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:104)
        at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
        at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
        at org.apache.pig.PigServer$Graph.compile(PigServer.java:1538)
        at org.apache.pig.PigServer$Graph.compile(PigServer.java:1533)
        at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1295)
        at org.apache.pig.PigServer.buildStorePlan(PigServer.java:1195)
        at org.apache.pig.PigServer.explain(PigServer.java:956)

