Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2011/04/15 02:26:05 UTC
[jira] [Resolved] (PIG-1979) New logical plan failing with ERROR 2229: Couldn't find matching uid -1

     [ https://issues.apache.org/jira/browse/PIG-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-1979.
-----------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Patch committed to both trunk and 0.8 branch.

> New logical plan failing with ERROR 2229: Couldn't find matching uid -1 
> ------------------------------------------------------------------------
>
>                 Key: PIG-1979
>                 URL: https://issues.apache.org/jira/browse/PIG-1979
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.8.0, 0.9.0
>            Reporter: Vivek Padmanabhan
>            Assignee: Daniel Dai
>             Fix For: 0.8.0
>
>         Attachments: PIG-1979-1-trunk.patch, PIG-1979-1.patch
>
>
> My script is below:
> {code}
> register myudf.jar;
> c01 = LOAD 'input'  USING org.test.MyTableLoader('');
> c02 = FILTER c01  BY result == 'OK'  AND formatted IS NOT NULL  AND formatted != '' ;
> c03 = FOREACH c02 GENERATE url, formatted, FLATTEN(usage);
> c04 = FOREACH c03 GENERATE usage::domain AS domain, url, formatted;
> doc_001 = FOREACH c04 GENERATE domain,url, FLATTEN(MyExtractor(formatted)) AS category;
> doc_004_1 = GROUP doc_001 BY (domain,url);
> doc_005 = FOREACH doc_004_1 GENERATE group.domain as domain, group.url as url, doc_001.category as category;
> STORE doc_005 INTO 'out_final' USING PigStorage();
> review1 = FOREACH c04 GENERATE domain,url, MyExtractor(formatted) AS rev;
> review2 = FILTER review1 BY SIZE(rev)>0;
> joinresult = JOIN review2 by (domain,url), doc_005 by (domain,url);
> finalresult = FOREACH joinresult GENERATE  doc_005::category;
> STORE finalresult INTO 'out_final' using PigStorage();
> {code}
> The script fails while building the plan, when the logical optimization rule AddForEach is applied.
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2229: Couldn't find matching uid -1 for project (Name: Project Type: bytearray Uid: 106 Input: 0 Column: 5)
> The problem occurs when I include doc_005::category in the projection for relation finalresult. This field originates from the UDF org.vivek.udfs.MyExtractor (source given below).
> {code}
> import java.io.IOException;
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.*;
> import org.apache.pig.impl.logicalLayer.FrontendException;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
> import org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema;
> public class MyExtractor extends EvalFunc<DataBag>
> {
>   @Override
>   public Schema outputSchema(Schema arg0) {
>     try {
>       return Schema.generateNestedSchema(DataType.BAG, DataType.CHARARRAY);
>     } catch (FrontendException e) {
>       System.err.println("Error while generating schema. "+e);
>       return new Schema(new FieldSchema(null, DataType.BAG));
>     }
>   }
>   @Override
>   public DataBag exec(Tuple inputTuple)
>     throws IOException
>   {
>     try {
>       Tuple tp2 = TupleFactory.getInstance().newTuple(1);
>       tp2.set(0, (inputTuple.get(0).toString()+inputTuple.hashCode()));
>       DataBag retBag = BagFactory.getInstance().newDefaultBag();
>       retBag.add(tp2);
>       return retBag;
>     }
>     catch (Exception e) {
>       throw new IOException(" Caught exception", e);
>     }
>   }
> }
> {code}
> The script runs fine if I disable the AddForEach rule with -t AddForEach.
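
The workaround in the last line of the report can be sketched as a command line. This is only an illustration: the file name pig-1979.pig is hypothetical (the report does not name the script file), and it assumes pig is on the PATH. It sidesteps the failing rule for one run and does not fix the plan itself.

```shell
#!/bin/sh
# Hypothetical file name: save the script quoted above under this name.
SCRIPT="pig-1979.pig"
# Optimizer rule to turn off, as named in the report.
RULE="AddForEach"

if command -v pig >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
  # -t <rule> turns off the named optimizer rule for this run.
  pig -t "$RULE" "$SCRIPT"
else
  # Pig or the script file is not available here; just show the command.
  echo "pig -t $RULE $SCRIPT"
fi
```

With the rule turned off the reporter saw the script go through, so this is useful mainly for confirming that AddForEach is the trigger on builds without the committed patch.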

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira