Posted to dev@pig.apache.org by "Woody Wang (Created) (JIRA)" <ji...@apache.org> on 2012/03/15 11:15:37 UTC

[jira] [Created] (PIG-2592) Using FILTER after FOREACH in Pig-Latin failed

Using FILTER after FOREACH in Pig-Latin failed
----------------------------------------------

                 Key: PIG-2592
                 URL: https://issues.apache.org/jira/browse/PIG-2592
             Project: Pig
          Issue Type: Bug
          Components: impl, parser
    Affects Versions: 0.9.2
         Environment: MacOS X 10.6.7

java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11M3527)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)


            Reporter: Woody Wang


Suppose we have a data file (test.txt) whose content is:

1,2,3
2,3,4
3,4,5
4,5,6
I want to select the records whose 1st field is '3'. The Pig script is:

t = LOAD 'test.txt' USING PigStorage(',');
t1 = FOREACH t GENERATE $0 AS i0:chararray, $1 AS i1:chararray, $2 AS i2:chararray;
f1 = FILTER t1 BY i0 == '3';
DUMP f1;
The job completes without errors, but the output is empty. EXPLAIN f1 shows:

#--------------------------------------------------
# Map Reduce Plan                                  
#--------------------------------------------------
MapReduce node scope-27
Map Plan
f1: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-26
|
|---f1: Filter[bag] - scope-22
    |   |
    |   Equal To[boolean] - scope-25
    |   |
    |   |---Project[chararray][0] - scope-23
    |   |
    |   |---Constant(3) - scope-24
    |
    |---t1: New For Each(false,false,false)[bag] - scope-21
        |   |
        |   Project[bytearray][0] - scope-15
        |   |
        |   Project[bytearray][1] - scope-17
        |   |
        |   Project[bytearray][2] - scope-19
        |
        |---t: Load(file:///Users/woody/test.txt:PigStorage(',')) - scope-14--------
Global sort: false
----------------
However, if I replace the first 2 lines with:

t1 = LOAD 'test.txt' USING PigStorage(',') AS (i0:chararray, i1:chararray, i2:chararray);
(i.e. assign the schema in the LOAD statement)

The job runs and the result is correct. In this case, EXPLAIN f1 shows:

#--------------------------------------------------
# Map Reduce Plan                                  
#--------------------------------------------------
MapReduce node scope-33
Map Plan
f1: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-32
|
|---f1: Filter[bag] - scope-28
    |   |
    |   Equal To[boolean] - scope-31
    |   |
    |   |---Project[chararray][0] - scope-29
    |   |
    |   |---Constant(3) - scope-30
    |
    |---t1: New For Each(false,false,false)[bag] - scope-27
        |   |
        |   Cast[chararray] - scope-19
        |   |
        |   |---Project[bytearray][0] - scope-18
        |   |
        |   Cast[chararray] - scope-22
        |   |
        |   |---Project[bytearray][1] - scope-21
        |   |
        |   Cast[chararray] - scope-25
        |   |
        |   |---Project[bytearray][2] - scope-24
        |
        |---t1: Load(file:///Users/woody/test.txt:PigStorage(',')) - scope-17--------
Global sort: false



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2592) Using FILTER after FOREACH in Pig-Latin failed

Posted by "Woody Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Woody Wang updated PIG-2592:
----------------------------

    Description: 
Suppose we have a data file (test.txt) whose content is:

1,2,3
2,3,4
3,4,5
4,5,6
I want to select the records whose 1st field is '3'. The Pig script is:

t = LOAD 'test.txt' USING PigStorage(',');
t1 = FOREACH t GENERATE $0 AS i0:chararray, $1 AS i1:chararray, $2 AS i2:chararray;
f1 = FILTER t1 BY i0 == '3';
DUMP f1;
The job completes without errors, but the output is empty. EXPLAIN f1 shows:
{noformat}
#--------------------------------------------------
# Map Reduce Plan                                  
#--------------------------------------------------
MapReduce node scope-27
Map Plan
f1: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-26
|
|---f1: Filter[bag] - scope-22
    |   |
    |   Equal To[boolean] - scope-25
    |   |
    |   |---Project[chararray][0] - scope-23
    |   |
    |   |---Constant(3) - scope-24
    |
    |---t1: New For Each(false,false,false)[bag] - scope-21
        |   |
        |   Project[bytearray][0] - scope-15
        |   |
        |   Project[bytearray][1] - scope-17
        |   |
        |   Project[bytearray][2] - scope-19
        |
        |---t: Load(file:///Users/woody/test.txt:PigStorage(',')) - scope-14--------
Global sort: false
----------------
{noformat}

However, if I replace the first 2 lines with:

t1 = LOAD 'test.txt' USING PigStorage(',') AS (i0:chararray, i1:chararray, i2:chararray);
(i.e. assign the schema in the LOAD statement)

The job runs and the result is correct. In this case, EXPLAIN f1 shows:

{noformat}
#--------------------------------------------------
# Map Reduce Plan                                  
#--------------------------------------------------
MapReduce node scope-33
Map Plan
f1: Store(fakefile:org.apache.pig.builtin.PigStorage) - scope-32
|
|---f1: Filter[bag] - scope-28
    |   |
    |   Equal To[boolean] - scope-31
    |   |
    |   |---Project[chararray][0] - scope-29
    |   |
    |   |---Constant(3) - scope-30
    |
    |---t1: New For Each(false,false,false)[bag] - scope-27
        |   |
        |   Cast[chararray] - scope-19
        |   |
        |   |---Project[bytearray][0] - scope-18
        |   |
        |   Cast[chararray] - scope-22
        |   |
        |   |---Project[bytearray][1] - scope-21
        |   |
        |   Cast[chararray] - scope-25
        |   |
        |   |---Project[bytearray][2] - scope-24
        |
        |---t1: Load(file:///Users/woody/test.txt:PigStorage(',')) - scope-17--------
Global sort: false
{noformat}




        

[jira] [Resolved] (PIG-2592) Using FILTER after FOREACH in Pig-Latin failed

Posted by "Daniel Dai (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2592.
-----------------------------

    Resolution: Duplicate

This is a known issue; see PIG-2315. Closing this issue; please use PIG-2315 to track it.
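
A possible workaround, sketched under assumptions rather than taken from this thread: the first EXPLAIN output in the report shows bare Project[bytearray] operators under the For Each, while the working plan wraps each one in Cast[chararray]. Assuming an explicit cast in GENERATE produces the same Cast operators, the original script could be written as follows. The aliases and file name are those of the report; the behaviour on 0.9.2 has not been verified here.

```pig
-- Sketch only: same data and aliases as in the report.
-- Assumption: an explicit (chararray) cast adds the Cast operator that
-- the "AS name:type" clause in GENERATE did not add in the failing plan.
t  = LOAD 'test.txt' USING PigStorage(',');
t1 = FOREACH t GENERATE (chararray)$0 AS i0, (chararray)$1 AS i1, (chararray)$2 AS i2;
f1 = FILTER t1 BY i0 == '3';
DUMP f1;
```

Running EXPLAIN f1 on this variant and checking for Cast[chararray] above each Project[bytearray] would show whether the cast was applied.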
                


        

[jira] [Updated] (PIG-2592) Using FILTER after FOREACH in Pig-Latin failed

Posted by "Woody Wang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Woody Wang updated PIG-2592:
----------------------------

    Environment: 
MacOS X 10.7.2

java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11M3527)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)



  was:
MacOS X 10.6.7

java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11M3527)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)



    
