Posted to user@pig.apache.org by Tom White <to...@gmail.com> on 2009/01/05 22:52:40 UTC

Number of columns in a relation

Hi,

I'm trying to filter on the number of columns in a relation as
suggested in the FAQ, but I get the following error. This is in the
types branch. Has the syntax changed or does this look like a bug?

A = LOAD 'foo' USING PigStorage('\t');
B = FILTER A BY ARITY(*) < 5;
DUMP B;
2009-01-05 21:46:56,355 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc
- Caught error from UDF
org.apache.pig.builtin.ARITY[org.apache.pig.data.DataByteArray cannot
be cast to org.apache.pig.data.Tuple
[org.apache.pig.data.DataByteArray cannot be cast to
org.apache.pig.data.Tuple]]
2009-01-05 21:46:56,356 [main] INFO
org.apache.pig.backend.local.executionengine.LocalPigLauncher - Failed
jobs!!
2009-01-05 21:46:56,356 [main] INFO
org.apache.pig.backend.local.executionengine.LocalPigLauncher - 1 out
of 1 failed!
2009-01-05 21:46:56,357 [main] ERROR
org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Unable
to open iterator for alias: B [Job terminated with anomalous status
FAILED]
	at org.apache.pig.PigServer.openIterator(PigServer.java:389)
	at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
	at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
	at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:94)
	at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
	at org.apache.pig.Main.main(Main.java:282)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
	... 6 more

2009-01-05 21:46:56,357 [main] ERROR
org.apache.pig.tools.grunt.GruntParser - Unable to open iterator for
alias: B [Job terminated with anomalous status FAILED]
2009-01-05 21:46:56,357 [main] ERROR
org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Unable
to open iterator for alias: B [Job terminated with anomalous status
FAILED]

Thanks,
Tom
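
For reference, the intent of the FILTER above (keep only rows with fewer than five tab-separated columns) can be sketched outside Pig. This is a minimal Python sketch under stated assumptions, not Pig's implementation: the function name `rows_with_fewer_columns` and the sample lines are hypothetical, and edge cases such as empty lines may be handled differently by PigStorage.

```python
def rows_with_fewer_columns(lines, max_columns=5, delimiter="\t"):
    """Yield each row, as a list of fields, that has fewer than max_columns fields."""
    for line in lines:
        # Drop the trailing newline, then split on the delimiter,
        # roughly as a tab-delimited loader would.
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) < max_columns:
            yield fields


# Hypothetical sample input: rows with 3, 5 and 4 columns.
sample = ["a\tb\tc\n", "1\t2\t3\t4\t5\n", "x\ty\tz\tw\n"]
kept = list(rows_with_fewer_columns(sample))
```

Here `kept` holds the three-column and four-column rows; the five-column row is dropped because the comparison is strictly less than five.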

RE: Number of columns in a relation

Posted by Olga Natkovich <ol...@yahoo-inc.com>.
Yes, SIZE is a better function to use. Unfortunately, it will have the
same problem. This is a bug; we are looking into it now and will post an
update once we know what the solution is.

Olga 

> -----Original Message-----
> From: Kevin Weil [mailto:kevinweil@gmail.com] 
> Sent: Monday, January 05, 2009 2:07 PM
> To: pig-user@hadoop.apache.org
> Subject: Re: Number of columns in a relation
> 
> You should be able to use SIZE instead of ARITY, which has 
> been deprecated.
> There is a lot of useful information at
> http://wiki.apache.org/pig/TrunkToTypesChanges.
> 
> Write back if this doesn't help.
> 
> Thanks,
> Kevin
> 

Re: Number of columns in a relation

Posted by Kevin Weil <ke...@gmail.com>.
You should be able to use SIZE instead of ARITY, which has been deprecated.
There is a lot of useful information at
http://wiki.apache.org/pig/TrunkToTypesChanges.

Write back if this doesn't help.

Thanks,
Kevin

RE: Number of columns in a relation

Posted by Olga Natkovich <ol...@yahoo-inc.com>.
https://issues.apache.org/jira/browse/PIG-597 
