Posted to dev@pig.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2011/03/15 19:17:29 UTC

[jira] Commented: (PIG-1581) Parser fails to recognize semicolons in quoted strings

    [ https://issues.apache.org/jira/browse/PIG-1581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007063#comment-13007063 ] 

Xuefu Zhang commented on PIG-1581:
----------------------------------

This issue has a cause similar to that of PIG-731.

> Parser fails to recognize semicolons in quoted strings
> ------------------------------------------------------
>
>                 Key: PIG-1581
>                 URL: https://issues.apache.org/jira/browse/PIG-1581
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.7.0
>         Environment: CentOS 5.5
>            Reporter: Christopher Hackman
>            Assignee: Xuefu Zhang
>            Priority: Minor
>             Fix For: 0.9.0
>
>
> In some contexts, the parser mishandles a semicolon inside a quoted string and treats it as the end of the statement.
> Given an input file:
> /test1.txt (in HDFS)
> 1;a
> 2;b
> 3;c
> 4;d
> 5;e
> And the following Pig script:
> REGISTER /tmp/piggybank.jar ;
> DEFINE REGEXEXTRACTALL org.apache.pig.piggybank.evaluation.string.RegexExtractAll();
> lines = LOAD '/test1.txt' AS (line:chararray);
> delimited = FOREACH lines GENERATE FLATTEN (
>         REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$')
> ) AS (
>         digit:int,
>         word:chararray
> );
> DUMP delimited;
> I receive the following error:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Lexical error at line 5, column 40.  Encountered: <EOF> after : "\'^(\\\\d+);"
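
The error text shows the input ending right after the semicolon inside the quoted regex, which suggests the statement was cut at that character. The following is a minimal Python sketch of that failure mode, not Pig's actual parser code: it compares a naive split on every semicolon with a scan that ignores semicolons inside single-quoted strings. The function name `split_statements` and the one-line sample script are hypothetical, and the sketch assumes single-quoted strings with backslash escapes only (it does not handle Pig comments).

```python
def split_statements(script):
    """Split a script on semicolons that lie outside single-quoted strings."""
    statements = []
    current = []
    in_quote = False
    i = 0
    while i < len(script):
        ch = script[i]
        if in_quote and ch == "\\" and i + 1 < len(script):
            # Keep a backslash escape (such as \\ or \') together with its target,
            # so an escaped quote does not close the string.
            current.append(script[i:i + 2])
            i += 2
            continue
        if ch == "'":
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            # A semicolon outside quotes terminates the current statement.
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


# Hypothetical one-line variant of the script from the report.
line = r"x = FOREACH lines GENERATE REGEXEXTRACTALL(line, '^(\\d+);(\\w+)$'); DUMP x;"

naive = line.split(";")          # cuts the quoted regex in two
aware = split_statements(line)   # keeps the quoted regex intact

for stmt in aware:
    print(stmt)
```

With the naive split, the first fragment ends inside an unterminated quoted string, which matches the shape of the lexical error reported above. The quote-aware scan yields the FOREACH statement and the DUMP statement as two complete units.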
