Posted to dev@pig.apache.org by "Tushar Pradhan (JIRA)" <ji...@apache.org> on 2012/06/12 17:11:42 UTC

[jira] [Created] (PIG-2752) Infinite (?) parser loop with complex FOREACH expression

Tushar Pradhan created PIG-2752:
-----------------------------------

             Summary: Infinite (?) parser loop with complex FOREACH expression
                 Key: PIG-2752
                 URL: https://issues.apache.org/jira/browse/PIG-2752
             Project: Pig
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.10.0
         Environment: Linux
            Reporter: Tushar Pradhan


The following Pig script appears to hang in the parser under Pig 0.10.0. The same script works fine under Pig 0.8.1.

----
X = LOAD 'X' USING PigStorage(',') AS (
term: chararray,dcount: long,dcount_0: long,dcount_1: long,dcount_2: long,dcount_4: long,dcount_5: long,dcount_6: long,dcount_7: long,dcount_8: long,dcount_9: long,dcount_10: long,dcount_11: long,dcount_12: long,dcount_13: long,dcount_U: long,dcount_L: long,dcount_C: long,dcount_M: long,dcount_P: long,dcount_T: long,dcount_S: long,dcount_R: long,dcount_Z: long,dcount_K: long);

Y =
    FOREACH X
    GENERATE
        term,
        (
            (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_1 > 1 OR dcount_1 == 1 AND dcount == 1) ? 1 : (
            (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_2 > 1 OR dcount_2 == 1 AND dcount == 1) ? 2 : (
            (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_7 > 1 OR dcount_7 == 1 AND dcount == 1) ? 7 : (
            (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_9 > 1 OR dcount_9 == 1 AND dcount == 1) ? 9 : (
            (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_11 > 1 OR dcount_11 == 1 AND dcount == 1) ? 11 : (
            dcount_5 > 1 OR dcount_5 == 1 AND dcount == 1 ? 5 : (
            dcount_6 > 1 OR dcount_6 == 1 AND dcount == 1 ? 6 : (
            dcount_8 > 1 OR dcount_8 == 1 AND dcount == 1 ? 8 : (
            dcount_10 > 1 OR dcount_10 == 1 AND dcount == 1 ? 10 : (
            dcount_12 > 1 OR dcount_12 == 1 AND dcount == 1 ? 12 : (
            (dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) AND (dcount_13 > 0 OR dcount_13 == 1 AND dcount == 1) ? 13 : (
            dcount_4 > 0 ? 4 : 0)))))))))))
        ) AS besttype;

STORE Y INTO 'Y';
----

2012-06-12 08:04:46,435 [main] INFO  org.apache.pig.Main - Apache Pig version 0.10.0-SNAPSHOT (rexported) compiled May 08 2012, 08:26:29
2012-06-12 08:04:46,435 [main] INFO  org.apache.pig.Main - Logging error messages to: /tmp/pig_1339513486431.log
2012-06-12 08:04:46,950 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///

The hang occurs in both local and Hadoop modes.

If I simplify the 'besttype' expression in the FOREACH a bit, the script works fine.
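
The exact simplification is not recorded here. As a purely illustrative sketch (an assumption, not tested against 0.10.0 and not confirmed to avoid the hang), one way to reduce the complexity of the expression would be to compute the repeated sub-conditions as 0/1 flags in a first FOREACH and nest only simple comparisons in a second one. The relation name X_FLAGS and the field names ucm and ok_N are hypothetical, and the result is intended to match the original only for non-null inputs:

----
-- Stage 1: evaluate each sub-condition once and keep it as a 0/1 flag.
-- X_FLAGS, ucm and ok_N are illustrative names, not part of the original script.
X_FLAGS =
    FOREACH X
    GENERATE
        term,
        dcount_4,
        ((dcount_U > 0 OR dcount_C > 0 OR dcount_M > 0) ? 1 : 0) AS ucm,
        ((dcount_1 > 1 OR dcount_1 == 1 AND dcount == 1) ? 1 : 0) AS ok_1,
        ((dcount_2 > 1 OR dcount_2 == 1 AND dcount == 1) ? 1 : 0) AS ok_2,
        ((dcount_7 > 1 OR dcount_7 == 1 AND dcount == 1) ? 1 : 0) AS ok_7,
        ((dcount_9 > 1 OR dcount_9 == 1 AND dcount == 1) ? 1 : 0) AS ok_9,
        ((dcount_11 > 1 OR dcount_11 == 1 AND dcount == 1) ? 1 : 0) AS ok_11,
        ((dcount_5 > 1 OR dcount_5 == 1 AND dcount == 1) ? 1 : 0) AS ok_5,
        ((dcount_6 > 1 OR dcount_6 == 1 AND dcount == 1) ? 1 : 0) AS ok_6,
        ((dcount_8 > 1 OR dcount_8 == 1 AND dcount == 1) ? 1 : 0) AS ok_8,
        ((dcount_10 > 1 OR dcount_10 == 1 AND dcount == 1) ? 1 : 0) AS ok_10,
        ((dcount_12 > 1 OR dcount_12 == 1 AND dcount == 1) ? 1 : 0) AS ok_12,
        -- Note: the dcount_13 test uses "> 0", as in the original script.
        ((dcount_13 > 0 OR dcount_13 == 1 AND dcount == 1) ? 1 : 0) AS ok_13;

-- Stage 2: same priority order as the original nested expression,
-- but each branch now tests only precomputed flags.
Y =
    FOREACH X_FLAGS
    GENERATE
        term,
        (
            ucm == 1 AND ok_1 == 1 ? 1 : (
            ucm == 1 AND ok_2 == 1 ? 2 : (
            ucm == 1 AND ok_7 == 1 ? 7 : (
            ucm == 1 AND ok_9 == 1 ? 9 : (
            ucm == 1 AND ok_11 == 1 ? 11 : (
            ok_5 == 1 ? 5 : (
            ok_6 == 1 ? 6 : (
            ok_8 == 1 ? 8 : (
            ok_10 == 1 ? 10 : (
            ok_12 == 1 ? 12 : (
            ucm == 1 AND ok_13 == 1 ? 13 : (
            dcount_4 > 0 ? 4 : 0)))))))))))
        ) AS besttype;
----

In this sketch the two statements would stand in for the single Y statement of the script above; the LOAD and STORE statements would stay unchanged.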

The input directory 'X' is not needed to reproduce the problem, since processing gets stuck in the parser. If needed, it can contain a sample 'part-r-00000' file with the line:

#1,49,1,0,0,0,0,0,0,0,0,0,0,0,48,0,0,0,0,49,1,2,0,0,43





--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira