You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Vivek Padmanabhan (JIRA)" <ji...@apache.org> on 2011/03/11 12:10:59 UTC
[jira] Created: (PIG-1895) Class cast exception while projecting
udf result
Class cast exception while projecting udf result
------------------------------------------------
Key: PIG-1895
URL: https://issues.apache.org/jira/browse/PIG-1895
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.8.0, 0.7.0, 0.9.0
Reporter: Vivek Padmanabhan
Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
The below is my script
{code}
Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
AllData = group Data all parallel 1;
SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
SampledData1 = foreach SampledData generate rs.sampled;
{code}
Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
{code}
Exception recieved :
java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
{code}
This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (PIG-1895) Class cast exception while projecting
udf result
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-1895.
-----------------------------
Resolution: Fixed
Fix Version/s: (was: 0.8.0)
0.8.1
Fixed by PIG-1866.
> Class cast exception while projecting udf result
> ------------------------------------------------
>
> Key: PIG-1895
> URL: https://issues.apache.org/jira/browse/PIG-1895
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Daniel Dai
> Fix For: 0.8.1
>
>
> Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
> The below is my script
> {code}
> Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
> AllData = group Data all parallel 1;
> SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
> SampledData1 = foreach SampledData generate rs.sampled;
> {code}
> Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
> {code}
> Exception recieved :
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-1895) Class cast exception while projecting
udf result
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009248#comment-13009248 ]
Daniel Dai commented on PIG-1895:
---------------------------------
This issue is the same nature as PIG-1866. Sounds weird but seems Pig have trouble projecting a bag out of a tuple.
> Class cast exception while projecting udf result
> ------------------------------------------------
>
> Key: PIG-1895
> URL: https://issues.apache.org/jira/browse/PIG-1895
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Daniel Dai
> Fix For: 0.8.0
>
>
> Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
> The below is my script
> {code}
> Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
> AllData = group Data all parallel 1;
> SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
> SampledData1 = foreach SampledData generate rs.sampled;
> {code}
> Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
> {code}
> Exception recieved :
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (PIG-1895) Class cast exception while projecting
udf result
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006761#comment-13006761 ]
Daniel Dai commented on PIG-1895:
---------------------------------
Sorry, the last comment is not mean for this Jira, ignore it.
> Class cast exception while projecting udf result
> ------------------------------------------------
>
> Key: PIG-1895
> URL: https://issues.apache.org/jira/browse/PIG-1895
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Daniel Dai
> Fix For: 0.8.0
>
>
> Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
> The below is my script
> {code}
> Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
> AllData = group Data all parallel 1;
> SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
> SampledData1 = foreach SampledData generate rs.sampled;
> {code}
> Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
> {code}
> Exception recieved :
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (PIG-1895) Class cast exception while projecting
udf result
Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1895:
--------------------------------
Fix Version/s: 0.8.0
Assignee: Daniel Dai
> Class cast exception while projecting udf result
> ------------------------------------------------
>
> Key: PIG-1895
> URL: https://issues.apache.org/jira/browse/PIG-1895
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Daniel Dai
> Fix For: 0.8.0
>
>
> Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
> The below is my script
> {code}
> Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
> AllData = group Data all parallel 1;
> SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
> SampledData1 = foreach SampledData generate rs.sampled;
> {code}
> Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
> {code}
> Exception recieved :
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (PIG-1895) Class cast exception while projecting
udf result
Posted by "Vivek Padmanabhan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005604#comment-13005604 ]
Vivek Padmanabhan commented on PIG-1895:
----------------------------------------
UDF Source code :
{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.DataType;
import org.apache.pig.data.DefaultBagFactory;
import org.apache.pig.data.DefaultTupleFactory;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.schema.Schema;
public class TestEvalFunc extends EvalFunc<Tuple>{
public Tuple exec(Tuple input) throws IOException {
ArrayList<Tuple> tupleList = new ArrayList<Tuple>(2);
DataBag values = (DataBag)(input.get(0));
for (Iterator<Tuple> vit = values.iterator(); vit.hasNext();) {
tupleList.add(vit.next());
}
DataBag sampleBag = DefaultBagFactory.getInstance().newDefaultBag(tupleList);
Tuple output = DefaultTupleFactory.getInstance().newTuple(3);
output.set(0, sampleBag);
output.set(1, new Long(3));
output.set(2, new Integer(2));
return output;
}
public Schema outputSchema(Schema input) {
Schema udfSchema = new Schema();
udfSchema.add(new Schema.FieldSchema("sampled",DataType.BAG));
udfSchema.add(new Schema.FieldSchema("k",DataType.LONG));
udfSchema.add(new Schema.FieldSchema("i",DataType.INTEGER));
return udfSchema;
}
}
{code}
Test Case to Verify;
{code}
import static org.apache.pig.ExecType.LOCAL;
import java.util.ArrayList;
import java.util.Iterator;
import junit.framework.TestCase;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;
public class MyPigUnitTests extends TestCase {
private static String patternString = "(\\d+)!+(\\w+)~+(\\w+)";
public static ArrayList<String[]> data = new ArrayList<String[]>();
static {
data.add(new String[] { "1"});
data.add(new String[] { "3"});
}
private static String [] script = new String []{
"Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );",
"AllData = group Data all parallel 1;",
"SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;",
"SampledData1 = foreach SampledData generate rs.sampled;",
};
public void test () throws Exception
{
String filename = TestHelper.createTempFile(data, "");
PigServer pig = new PigServer(LOCAL);
filename = filename.replace("\\", "\\\\");
patternString = patternString.replace("\\", "\\\\");
for (String query : script) {
pig.registerQuery(query);
}
Iterator<?> it = pig.openIterator("SampledData1");
int tupleCount = 0;
while (it.hasNext()) {
Tuple tuple = (Tuple) it.next();
if (tuple == null)
break;
else {
if (tuple.size() > 0) {
tupleCount++;
}
}
}
assertEquals(1, tupleCount);
}
}
{code}
> Class cast exception while projecting udf result
> ------------------------------------------------
>
> Key: PIG-1895
> URL: https://issues.apache.org/jira/browse/PIG-1895
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
>
> Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
> The below is my script
> {code}
> Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
> AllData = group Data all parallel 1;
> SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
> SampledData1 = foreach SampledData generate rs.sampled;
> {code}
> Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
> {code}
> Exception recieved :
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (PIG-1895) Class cast exception while projecting
udf result
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13006757#comment-13006757 ]
Daniel Dai commented on PIG-1895:
---------------------------------
TestEvalFunc forget to set twolevelaccess flag for the bag. Pig 0.9 does not require twolevelaccess flag. However, we saw Pig 0.9 fail with a different stack:
ERROR 1000: Invalid field reference. Referenced field [guid] does not exist in schema: null.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1067: Unable to explain alias C3
at org.apache.pig.PigServer.explain(PigServer.java:993)
at org.apache.pig.tools.grunt.GruntParser.explainCurrentBatch(GruntParser.java:368)
at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:300)
at org.apache.pig.tools.grunt.GruntParser.processExplain(GruntParser.java:263)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.Explain(PigScriptParser.java:665)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:325)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:176)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:152)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
at org.apache.pig.Main.run(Main.java:537)
at org.apache.pig.Main.main(Main.java:108)
Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 1000: Invalid field reference. Referenced field [guid] does not exist in schema: null.
at org.apache.pig.newplan.logical.visitor.ColumnAliasConversionVisitor$1.visit(ColumnAliasConversionVisitor.java:114)
at org.apache.pig.newplan.logical.expression.DereferenceExpression.accept(DereferenceExpression.java:83)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:114)
at org.apache.pig.newplan.logical.relational.LOGenerate.accept(LOGenerate.java:240)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:104)
at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:73)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1538)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1533)
at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1295)
at org.apache.pig.PigServer.buildStorePlan(PigServer.java:1195)
at org.apache.pig.PigServer.explain(PigServer.java:956)
> Class cast exception while projecting udf result
> ------------------------------------------------
>
> Key: PIG-1895
> URL: https://issues.apache.org/jira/browse/PIG-1895
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.7.0, 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Daniel Dai
> Fix For: 0.8.0
>
>
> Class cast exception is thrown when I try to project the result from my udf. The udf has a defined schema DataType.BAG,DataType.LONG and DataType.INTEGER
> The below is my script
> {code}
> Data = load 'file:/home/pvivek/Desktop/input' using PigStorage() as ( i: int );
> AllData = group Data all parallel 1;
> SampledData = foreach AllData generate org.vivek.TestEvalFunc(Data, 5) as rs;
> SampledData1 = foreach SampledData generate rs.sampled;
> {code}
> Even though the output schema defines "sampled" as a data bag, while processing, instead of sending only the data bag generated from the UDF , the entire tuple was sent to the projection as result.
> {code}
> Exception recieved :
> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.pig.data.DataBag
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:484)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:480)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:197)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:339)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:291)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:434)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:402)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:382)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:1)
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
> {code}
> This issue is happening with 0.9/0.8 and 0.7
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira