Posted to user@pig.apache.org by Daniel Dai <da...@hortonworks.com> on 2012/01/02 09:55:59 UTC

Re: Using AvroStorage()

The Jira is PIG-1749. It should be in trunk as well. Open a ticket if you
cannot make a particular version work.

Daniel
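
For anyone reading the archive, here is a minimal sketch of the single-line
form that the quoted messages below report as parsing. The paths come from
the script in the thread and the schema from the original question. The lone
piggybank.jar REGISTER line is an assumption about what the script needs,
and this has not been verified against any particular Pig release.

```pig
-- Sketch only: paths are taken from the script quoted in this thread,
-- and the schema is the example from the original question.
-- Assumption: piggybank.jar is in the working directory; the other
-- REGISTER lines from the quoted script are omitted here.
REGISTER piggybank.jar;

A = LOAD '/user/hshankar/temp' USING PigStorage();

-- The schema JSON stays on one line, which is the form reported to parse.
STORE A INTO '/user/hshankar/out1' USING org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{"name":"xyz","type":"record","fields":[{"name":"abc","type":"string"}]}');
```

Each statement ends with its own semicolon and sits on its own line, so a
parse error is easier to attribute to a single statement.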

On Tue, Dec 13, 2011 at 11:47 AM, IGZ Nick <ig...@gmail.com> wrote:

> DUMP works as expected
> If I write the exact same thing in one line, it works.. I remember seeing a
> JIRA for this some time back, but am not able to find it now.
>
> On Wed, Dec 14, 2011 at 12:23 AM, Stan Rosenberg <
> srosenberg@proclivitysystems.com> wrote:
>
> > There is something syntactically wrong with your script.
> > MismatchedTokenException seems to indicate that the semicolon
> > character was expected (ttype==93).
> > What happens if you replace the entire "STORE A ..." line by say "DUMP A"?
> >
> > On Tue, Dec 13, 2011 at 1:17 PM, IGZ Nick <ig...@gmail.com> wrote:
> > > Hi Stan,
> > >
> > > Here is my pig script:
> > > REGISTER avro-1.4.0.jar
> > > REGISTER joda-time-1.6.jar
> > > REGISTER json-simple-1.1.jar
> > > REGISTER jackson-core-asl-1.5.5.jar
> > > REGISTER jackson-mapper-asl-1.5.5.jar
> > > REGISTER pig-0.9.1-SNAPSHOT.jar
> > > REGISTER dwh-udf-0.1.jar
> > > REGISTER piggybank.jar
> > > REGISTER linkedin-pig-0.8.jar
> > > REGISTER google-collect-1.0-rc2.jar;
> > >
> > > A = LOAD '/user/hshankar/temp' USING PigStorage();RMF
> > > '/user/hshankar/out1';STORE A INTO '/user/hshankar/out1' USING
> > > org.apache.pig.piggybank.storage.avro.AvroStorage('{"type": "record",
> > > "name": "test", "fields": [{"name":"my_region", "type": "string"}]}');
> > >
> > > On executing it, I get this error:
> > > 2011-12-13 18:16:35,133 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > > ERROR 1200: Pig script failed to parse: MismatchedTokenException(93!=3)
> > > > Details at logfile: /export/home/hshankar/pig_scripts/pig_1323800194535.log
> > >
> > > Log file contains:
> > > Pig Stack Trace
> > > ---------------
> > > ERROR 1200: Pig script failed to parse: MismatchedTokenException(93!=3)
> > >
> > > > org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error
> > > > during parsing. Pig script failed to parse: MismatchedTokenException(93!=3)
> > > >        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1652)
> > > >        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1597)
> > > >        at org.apache.pig.PigServer.registerQuery(PigServer.java:583)
> > > >        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
> > > >        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
> > > >        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
> > > >        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
> > > >        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
> > > >        at org.apache.pig.Main.run(Main.java:553)
> > > >        at org.apache.pig.Main.main(Main.java:108)
> > > > Caused by: Failed to parse: Pig script failed to parse: MismatchedTokenException(93!=3)
> > > >        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:178)
> > > >        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1644)
> > > >        ... 9 more
> > > > Caused by: MismatchedTokenException(93!=3)
> > > >        at org.apache.pig.parser.AstValidator.recoverFromMismatchedToken(AstValidator.java:209)
> > > >        at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
> > > >        at org.apache.pig.parser.AstValidator.func_clause(AstValidator.java:3497)
> > > >        at org.apache.pig.parser.AstValidator.store_clause(AstValidator.java:4626)
> > > >        at org.apache.pig.parser.AstValidator.op_clause(AstValidator.java:970)
> > > >        at org.apache.pig.parser.AstValidator.general_statement(AstValidator.java:574)
> > > >        at org.apache.pig.parser.AstValidator.statement(AstValidator.java:396)
> > > >        at org.apache.pig.parser.AstValidator.query(AstValidator.java:306)
> > > >        at org.apache.pig.parser.QueryParserDriver.validateAst(QueryParserDriver.java:236)
> > > >        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:168)
> > > >        ... 10 more
> > > >
> > > > ================================================================================
> > >
> > >
> > > On Tue, Dec 13, 2011 at 9:05 PM, Stan Rosenberg <
> > > srosenberg@proclivitysystems.com> wrote:
> > >
> > >> The following test script works for me:
> > >> =============================================
> > >>
> > > >> A = load '$LOGS' using org.apache.pig.piggybank.storage.avro.AvroStorage();
> > >> describe A;
> > >>
> > >> B = foreach A generate region as my_region, google_ip;
> > >>
> > >> dump B;
> > >>
> > >> store B into './output' using
> > >> org.apache.pig.piggybank.storage.avro.AvroStorage(
> > >> '{"debug": 5,
> > >>  "schema": {"type": "record", "name": "test", "fields": [{"name":
> > >> "my_region", "type": ["null", "string"]}, {"name": "ip", "type":
> > >> ["null", "string"]}]}
> > >> }');
> > >> =============================================================
> > >> Note you don't need to pass the first parameter, i.e., 'schema'; you
> > >> can just pass a string formatted in json.
> > >> If you're still getting MismatchException, please compile a small
> > >> repro and send it to the list.
> > >>
> > >> stan
> > >>
> > > >> On Tue, Dec 13, 2011 at 5:49 AM, IGZ Nick <ig...@gmail.com> wrote:
> > >> > Hi all,
> > >> >
> > > >> > I want to keep the pig script and storage schema separate. Is it possible
> > > >> > to do this in a clean way? The only way that has worked so far is to do
> > > >> > like:
> > > >> > AvroStorage('schema',
> > > >> > '{"name":"xyz","type":"record","fields":[{"name":"abc","type":"string"}]}');
> > > >> >
> > > >> > That too, all the schema in one line. If I split it onto multiple lines, I
> > > >> > get a MismatchException (93-3) or something like that. Is there no way to
> > > >> > do AvroStorage('file', <hdfs path of schema file>) or something of that
> > > >> > sort, or at least be able to specify the schema in multiple lines?
> > >> >
> > >> > Thanks,
> > >>
> >
>