Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2011/04/11 21:12:06 UTC
[jira] [Updated] (PIG-1979) New logical plan failing with ERROR 2229: Couldn't find matching uid -1
[ https://issues.apache.org/jira/browse/PIG-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1979:
--------------------------------
Fix Version/s: 0.9.0
Assignee: Daniel Dai
> New logical plan failing with ERROR 2229: Couldn't find matching uid -1
> ------------------------------------------------------------------------
>
> Key: PIG-1979
> URL: https://issues.apache.org/jira/browse/PIG-1979
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.8.0, 0.9.0
> Reporter: Vivek Padmanabhan
> Assignee: Daniel Dai
> Fix For: 0.9.0
>
>
> Below is my script:
> {code}
> register myudf.jar;
> c01 = LOAD 'input' USING org.test.MyTableLoader('');
> c02 = FILTER c01 BY result == 'OK' AND formatted IS NOT NULL AND formatted != '' ;
> c03 = FOREACH c02 GENERATE url, formatted, FLATTEN(usage);
> c04 = FOREACH c03 GENERATE usage::domain AS domain, url, formatted;
> doc_001 = FOREACH c04 GENERATE domain,url, FLATTEN(MyExtractor(formatted)) AS category;
> doc_004_1 = GROUP doc_001 BY (domain,url);
> doc_005 = FOREACH doc_004_1 GENERATE group.domain as domain, group.url as url, doc_001.category as category;
> STORE doc_005 INTO 'out_final' USING PigStorage();
> review1 = FOREACH c04 GENERATE domain,url, MyExtractor(formatted) AS rev;
> review2 = FILTER review1 BY SIZE(rev)>0;
> joinresult = JOIN review2 by (domain,url), doc_005 by (domain,url);
> finalresult = FOREACH joinresult GENERATE doc_005::category;
> STORE finalresult INTO 'out_final' using PigStorage();
> {code}
> The script fails while building the plan, when the logical optimization rule AddForEach is applied.
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 106 Input: 0 Column: 5)
> The problem occurs when I include doc_005::category in the projection for relation finalresult. This field originates from the UDF org.vivek.udfs.MyExtractor (source given below).
> {code}
> import java.io.IOException;
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.*;
> import org.apache.pig.impl.logicalLayer.FrontendException;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
> import org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema;
> public class MyExtractor extends EvalFunc<DataBag>
> {
>     @Override
>     public Schema outputSchema(Schema arg0) {
>         try {
>             return Schema.generateNestedSchema(DataType.BAG, DataType.CHARARRAY);
>         } catch (FrontendException e) {
>             System.err.println("Error while generating schema. "+e);
>             return new Schema(new FieldSchema(null, DataType.BAG));
>         }
>     }
>     @Override
>     public DataBag exec(Tuple inputTuple)
>     throws IOException
>     {
>         try {
>             Tuple tp2 = TupleFactory.getInstance().newTuple(1);
>             tp2.set(0, (inputTuple.get(0).toString()+inputTuple.hashCode()));
>             DataBag retBag = BagFactory.getInstance().newDefaultBag();
>             retBag.add(tp2);
>             return retBag;
>         }
>         catch (Exception e) {
>             throw new IOException(" Caught exception", e);
>         }
>     }
> }
> {code}
> The script runs fine if I disable the AddForEach rule with -t AddForEach.
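For reference, the workaround described in the report might be invoked roughly as sketched below. The file name myscript.pig is an assumption (the report does not name the script file), and the snippet only builds and prints the command so it can be checked before it is run.

```shell
# Hypothetical file name; substitute the file that holds the script above.
SCRIPT="myscript.pig"

# -t <rule> turns off the named optimizer rule for this run
# (the flag and rule name are taken from the report).
PIG_CMD="pig -t AddForEach $SCRIPT"

# Print the command for inspection; run it with: eval "$PIG_CMD"
echo "$PIG_CMD"
```

This disables only the AddForEach rule, so the other logical optimizations should still apply; it sidesteps the failure and does not fix the underlying uid problem.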
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira