Posted to user@pig.apache.org by Jonathan Holloway <jo...@gmail.com> on 2011/06/23 18:41:16 UTC

Invalid Alias Issue

Hi all,

I'm getting the exception shown at the end of this message from the following Pig script:

eLine = FOREACH logLine
    GENERATE
        FLATTEN(
            REGEX_EXTRACT_ALL(
                $0,
                '.*Output.Count\\s*\\-\\s*([A-Za-z\\.]+)\\s*(\\d+)'
                )
        ) AS (ename:CHARARRAY, ecount:DOUBLE);

nameGroup = GROUP eLine BY eventName;

lines = FOREACH nameGroup GENERATE group as name,
    MAX(com.example.BagToTupleUDF((tuple)eLine.ecount)) as maxCount;

My UDF converts the values from a bag {(12),(4),(7),(190)} to a tuple
of doubles (12,4,7,190).

Can anybody help explain how I can use the Pig built-in functions MAX, MIN,
and AVG over this kind of data extracted from a regex?

Many thanks,
Jon.

---

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Invalid alias: MAX in {group: chararray,eLine: {ename: chararray,ecount: double}}
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1617)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1561)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:533)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:868)
    at org.apache.pig.pigunit.pig.GruntParser.processPig(GruntParser.java:61)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.pigunit.pig.PigServer.registerScript(PigServer.java:53)
    at org.apache.pig.pigunit.PigTest.registerScript(PigTest.java:160)
    at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:244)
    at message_archiver.reporting.pig.functions.OEPigTest.singleRawTextFile(OEPigTest.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
    at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: MAX in {group: chararray,line: {name: chararray,count: double}}
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:7415)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:7226)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:5297)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:5187)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:5133)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:5042)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4968)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4934)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4861)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4760)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:4704)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:4030)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:3433)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1464)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1013)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:800)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1611)
    ... 34 more

Re: Invalid Alias Issue

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
You can cast strings to longs and doubles; that should have helped.
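
As a hedged illustration of such a cast, here is a minimal sketch that declares the second capture group as a chararray and converts it in a follow-up FOREACH. The relation name raw and the field name ecountStr are assumptions introduced for this example; the regex and the other names come from the original post.

```
-- Extract both capture groups as strings first.
raw = FOREACH logLine
    GENERATE
        FLATTEN(
            REGEX_EXTRACT_ALL(
                $0,
                '.*Output.Count\\s*\\-\\s*([A-Za-z\\.]+)\\s*(\\d+)'
                )
        ) AS (ename:chararray, ecountStr:chararray);

-- Cast the numeric string explicitly so later aggregates see doubles.
eLine = FOREACH raw GENERATE ename, (double) ecountStr AS ecount;
```

Whether this removes the need for a type-aware extraction UDF depends on the Pig version and the log formats involved, so treat it as a starting point rather than a confirmed fix.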

On Fri, Jun 24, 2011 at 4:10 PM, Jonathan Holloway <
jonathan.holloway@gmail.com> wrote:

Re: Invalid Alias Issue

Posted by Jonathan Holloway <jo...@gmail.com>.
I ended up fixing this issue. I did change it to a bag afterwards, but the main problem was that REGEX_EXTRACT_ALL was returning everything as a string (via group), which meant that MAX, AVG, etc. were not matched as functions applicable to a bag of double tuples.

I ended up writing a new UDF for the extract-all step that returns types based on whether \d or \w was used in the regexp. Flattening the result to specific types didn't work.

That solved the issue. I would appreciate feedback on the UDF and the approach; I will post it on pastebin early next week. If there's a better way, please let me know.

This whole solution came about because I wanted to avoid creating a new UDF for each log line type I need to parse.

Many thanks,
Jon

On 24 Jun 2011, at 23:45, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> <mime-attachment.txt>

Re: Invalid Alias Issue

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Why are you casting eLine.ecount to a tuple? It's a bag (all ecounts with
this eventName).

D
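
For illustration, a minimal sketch of aggregating over the bag directly, with no tuple cast and no custom UDF. It assumes eLine already carries ecount as a double, and it groups on ename, the field name declared in the AS clause of the original post (the post itself groups by eventName, which that schema does not define). The output names minCount and avgCount are made up for this example.

```
-- Group the extracted rows by event name.
nameGroup = GROUP eLine BY ename;

-- eLine.ecount is a bag of single-field tuples, which is the input
-- shape that the built-in aggregates MAX, MIN and AVG expect.
lines = FOREACH nameGroup GENERATE
    group AS name,
    MAX(eLine.ecount) AS maxCount,
    MIN(eLine.ecount) AS minCount,
    AVG(eLine.ecount) AS avgCount;
```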