Posted to user@pig.apache.org by Jonathan Holloway <jo...@gmail.com> on 2011/06/23 18:41:16 UTC
Invalid Alias Issue
Hi all,
I'm getting the exception (at the end) from the following using Pig:
eLine = FOREACH logLine
    GENERATE
        FLATTEN(
            REGEX_EXTRACT_ALL(
                $0,
                '.*Output.Count\\s*\\-\\s*([A-Za-z\\.]+)\\s*(\\d+)'
            )
        ) AS (ename:CHARARRAY, ecount:DOUBLE);

nameGroup = GROUP eLine BY eventName;

lines = FOREACH nameGroup GENERATE group as name,
    MAX(com.example.BagToTupleUDF((tuple)eLine.ecount)) as maxCount;
My UDF is converting the values from a bag {(12),(4),(7),(190)} to a tuple
of doubles (12,4,7,190).
Can anybody help explain how I can use the Pig built-in functions MAX, MIN,
AVG over this kind of data extracted from a regex?
Many thanks,
Jon.
---
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Invalid alias: MAX in {group: chararray,eLine: {ename: chararray,ecount: double}}
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1617)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1561)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:533)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:868)
    at org.apache.pig.pigunit.pig.GruntParser.processPig(GruntParser.java:61)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.pigunit.pig.PigServer.registerScript(PigServer.java:53)
    at org.apache.pig.pigunit.PigTest.registerScript(PigTest.java:160)
    at org.apache.pig.pigunit.PigTest.assertOutput(PigTest.java:244)
    at message_archiver.reporting.pig.functions.OEPigTest.singleRawTextFile(OEPigTest.java:78)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
    at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:49)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: MAX in {group: chararray,line: {name: chararray,count: double}}
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:7415)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:7226)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:5297)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:5187)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:5133)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:5042)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4968)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4934)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4861)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4760)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:4704)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:4030)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:3433)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1464)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1013)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:800)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1611)
    ... 34 more
Re: Invalid Alias Issue
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
You can cast strings to longs and doubles; that should have helped.
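
A minimal sketch of the casting approach, not a tested fix. It assumes the captured groups can be named as chararrays in the AS clause and then cast in a second projection; the input path `logs.txt` and the field name `ecountStr` are hypothetical, and the exact behaviour may differ between Pig versions.

```pig
-- Hypothetical input: one raw log line per record ('logs.txt' is an assumed path).
logLine = LOAD 'logs.txt' USING TextLoader() AS (line:chararray);

-- REGEX_EXTRACT_ALL returns a tuple of strings, so declare both fields as chararray.
eRaw = FOREACH logLine GENERATE
    FLATTEN(
        REGEX_EXTRACT_ALL(
            line,
            '.*Output.Count\\s*\\-\\s*([A-Za-z\\.]+)\\s*(\\d+)'
        )
    ) AS (ename:chararray, ecountStr:chararray);

-- Cast the count from chararray to double explicitly in a separate step.
eLine = FOREACH eRaw GENERATE ename, (double)ecountStr AS ecount;
```

Keeping the cast in its own FOREACH means the field that later reaches MAX, MIN or AVG is a real double rather than a string labelled as one.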
On Fri, Jun 24, 2011 at 4:10 PM, Jonathan Holloway <
jonathan.holloway@gmail.com> wrote:
> I ended up fixing this issue. I did change it to a bag afterwards, but the
> main problem was that REGEX_EXTRACT_ALL was returning everything as a string
> (via group), which meant that MAX, AVG etc. could not be resolved to a
> matching function for a bag of double tuples.
>
> I ended up writing a new extract-all UDF that returns types based on
> whether \d or \w was used in the regexp. Flattening that to specific types
> didn't work.
>
> That solved the issue. I would appreciate feedback on the UDF and the
> approach; I will post it on pastebin early next week. If there's a better way
> then please let me know.
>
> This whole solution was because I wanted to get around the issue of
> creating a new udf for each log line type I needed to parse.
>
> Many thanks,
> Jon
>
> On 24 Jun 2011, at 23:45, Dmitriy Ryaboy <dv...@gmail.com> wrote:
>
> > <mime-attachment.txt>
>
Re: Invalid Alias Issue
Posted by Jonathan Holloway <jo...@gmail.com>.
I ended up fixing this issue. I did change it to a bag afterwards, but the main problem was that REGEX_EXTRACT_ALL was returning everything as a string (via group), which meant that MAX, AVG etc. could not be resolved to a matching function for a bag of double tuples.
I ended up writing a new extract-all UDF that returns types based on whether \d or \w was used in the regexp. Flattening that to specific types didn't work.
That solved the issue. I would appreciate feedback on the UDF and the approach; I will post it on pastebin early next week. If there's a better way then please let me know.
This whole solution was because I wanted to get around the issue of creating a new udf for each log line type I needed to parse.
Many thanks,
Jon
On 24 Jun 2011, at 23:45, Dmitriy Ryaboy <dv...@gmail.com> wrote:
> <mime-attachment.txt>
Re: Invalid Alias Issue
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Why are you casting eLine.ecount as a tuple? It's a bag (all ecounts with
this eventName).
D
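
To illustrate the point above, here is a sketch that passes the bag straight to the built-in aggregates, with no tuple cast and no conversion UDF. It assumes `eLine` already has the schema (ename:chararray, ecount:double); the path `events.tsv` is hypothetical. The original script grouped by `eventName`, which is not a field in the declared schema, so this sketch assumes `ename` was intended.

```pig
-- Hypothetical input that already carries the typed schema from the original post.
eLine = LOAD 'events.tsv' AS (ename:chararray, ecount:double);

-- Group by the field that actually exists in the schema.
nameGroup = GROUP eLine BY ename;

-- eLine.ecount is a bag of single-field tuples, which MAX, MIN and AVG accept as is.
lines = FOREACH nameGroup GENERATE
    group AS name,
    MAX(eLine.ecount) AS maxCount,
    MIN(eLine.ecount) AS minCount,
    AVG(eLine.ecount) AS avgCount;
```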
On Thu, Jun 23, 2011 at 9:41 AM, Jonathan Holloway <
jonathan.holloway@gmail.com> wrote:
> Hi all,
>
> I'm getting the exception (at the end) from the following using Pig:
>
> eLine = FOREACH logLine
> GENERATE
> FLATTEN(
> REGEX_EXTRACT_ALL(
> $0,
> '.*Output.Count\\s*\\-\\s*([A-Za-z\\.]+)\\s*(\\d+)'
> )
> ) AS (ename:CHARARRAY, ecount:DOUBLE);
>
> nameGroup = GROUP eLine BY eventName;
>
> lines = FOREACH nameGroup GENERATE group as name,
> MAX(com.example.BagToTupleUDF((tuple)eLine.ecount)) as maxCount;
>
> My UDF is converting the values from a bag {(12),(4),(7),(190)} to a tuple
> of doubles (12,4,7,190).
>
> Can anybody help explain how I can use the Pig built-in functions MAX, MIN,
> AVG over this kind of data extracted from a regex?
>
> Many thanks,
> Jon.