Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2009/10/14 18:29:31 UTC

[jira] Commented: (PIG-1022) optimizer pushes filter before the foreach that generates column used by filter

    [ https://issues.apache.org/jira/browse/PIG-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765612#action_12765612 ] 

Thejas M Nair commented on PIG-1022:
------------------------------------

${code}
grunt> explain filt;
#-----------------------------------------------
# Logical Plan:
#-----------------------------------------------

Store 1-1162 Schema: {name: chararray,gid: chararray} Type: Unknown
|
|---ForEach 1-1148 Schema: {name: chararray,gid: chararray} Type: bag
    |   |
    |   Project 1-1144 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
    |   Input: Project 1-1145 Projections: [0] Overloaded: false|
    |   |---Project 1-1145 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: chararray,gid: chararray}) Type: tuple
    |       Input: CoGroup 1-1138
    |   |
    |   Project 1-1146 Projections: [1] Overloaded: false FieldSchema: gid: chararray Type: chararray
    |   Input: Project 1-1147 Projections: [0] Overloaded: false|
    |   |---Project 1-1147 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: chararray,gid: chararray}) Type: tuple
    |       Input: CoGroup 1-1138
    |
    |---CoGroup 1-1138 Schema: {group: (name: chararray,gid: chararray),f: {name: chararray,gender: chararray,age: chararray,score: chararray,gid: chararray}} Type: bag
        |   |
        |   Project 1-1136 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
        |   Input: ForEach 1-1135
        |   |
        |   Project 1-1137 Projections: [4] Overloaded: false FieldSchema: gid: chararray Type: chararray
        |   Input: ForEach 1-1135
        |
        |---ForEach 1-1135 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray,gid: chararray} Type: bag
            |   |
            |   Project 1-1130 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
            |   Input: Filter 1-1152
            |   |
            |   Project 1-1131 Projections: [1] Overloaded: false FieldSchema: gender: chararray Type: chararray
            |   Input: Filter 1-1152
            |   |
            |   Project 1-1132 Projections: [2] Overloaded: false FieldSchema: age: chararray Type: chararray
            |   Input: Filter 1-1152
            |   |
            |   Project 1-1133 Projections: [3] Overloaded: false FieldSchema: score: chararray Type: chararray
            |   Input: Filter 1-1152
            |   |
            |   Const 1-1134( 200 ) FieldSchema: chararray Type: chararray
            |
            |---Filter 1-1152 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray} Type: bag
                |   |
                |   Equal 1-1151 FieldSchema: boolean Type: boolean
                |   |
                |   |---Project 1-1149 Projections: [0] Overloaded: false FieldSchema: name: chararray Type: chararray
                |   |   Input: ForEach 1-1161
                |   |
                |   |---Const 1-1150( 200 ) FieldSchema: chararray Type: chararray
                |
                |---ForEach 1-1161 Schema: {name: chararray,gender: chararray,age: chararray,score: chararray} Type: bag
                    |   |
                    |   Cast 1-1154 FieldSchema: name: chararray Type: chararray
                    |   |
                    |   |---Project 1-1153 Projections: [0] Overloaded: false FieldSchema: name: bytearray Type: bytearray
                    |       Input: Load 1-1123
                    |   |
                    |   Cast 1-1156 FieldSchema: gender: chararray Type: chararray
                    |   |
                    |   |---Project 1-1155 Projections: [1] Overloaded: false FieldSchema: gender: bytearray Type: bytearray
                    |       Input: Load 1-1123
                    |   |
                    |   Cast 1-1158 FieldSchema: age: chararray Type: chararray
                    |   |
                    |   |---Project 1-1157 Projections: [2] Overloaded: false FieldSchema: age: bytearray Type: bytearray
                    |       Input: Load 1-1123
                    |   |
                    |   Cast 1-1160 FieldSchema: score: chararray Type: chararray
                    |   |
                    |   |---Project 1-1159 Projections: [3] Overloaded: false FieldSchema: score: bytearray Type: bytearray
                    |       Input: Load 1-1123
                    |
                    |---Load 1-1123 Schema: {name: bytearray,gender: bytearray,age: bytearray,score: bytearray} Type: bag

${code}

> optimizer pushes filter before the foreach that generates column used by filter
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1022
>                 URL: https://issues.apache.org/jira/browse/PIG-1022
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>            Reporter: Thejas M Nair
>
> grunt> l = load 'students.txt' using PigStorage() as (name:chararray, gender:chararray, age:chararray, score:chararray);
> grunt> f = foreach l generate name, gender, age,score, '200'  as gid:chararray;
> grunt> g = group f by (name, gid);
> grunt> f2 = foreach g generate group.name as name: chararray, group.gid as gid: chararray;
> grunt> filt = filter f2 by gid == '200';
> grunt> explain filt;
> In the generated plan, filt is pushed up to just after the load and before the first foreach, even though the filter is on gid, which is generated in the first foreach.
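
For illustration only, here is a minimal Python sketch of the kind of check that would block this push. It is not Pig's optimizer code; the names OutputColumn and remap_filter_columns are hypothetical. The assumption is that a filter may move above a foreach only when every column it reads is a plain projection of an input column, in which case the filter is rewritten against the mapped input columns. In the explain output, Filter 1-1152 ends up comparing projection [0] (name) with the constant 200, which is the symptom this check is meant to rule out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class OutputColumn:
    """One output column of a foreach (hypothetical model, not a Pig class)."""
    name: str
    # Index of the input column projected unchanged, or None when the
    # column is generated inside the foreach (constant, UDF result, ...).
    source_index: Optional[int]


def remap_filter_columns(foreach_outputs: List[OutputColumn],
                         filter_columns: List[int]) -> Optional[List[int]]:
    """Return the input column indexes the filter would read if it were
    moved above the foreach, or None when the move is unsafe."""
    remapped = []
    for idx in filter_columns:
        source = foreach_outputs[idx].source_index
        if source is None:
            # The filter reads a column that does not exist before the foreach.
            return None
        remapped.append(source)
    return remapped


# Assumed encoding of:
#   f = foreach l generate name, gender, age, score, '200' as gid:chararray;
f_outputs = [
    OutputColumn("name", 0),
    OutputColumn("gender", 1),
    OutputColumn("age", 2),
    OutputColumn("score", 3),
    OutputColumn("gid", None),  # generated constant, no input column
]

print(remap_filter_columns(f_outputs, [4]))  # filter on gid  -> None (do not push)
print(remap_filter_columns(f_outputs, [0]))  # filter on name -> [0] (safe to push)
```

Under this model the filter on gid has to stay below the first foreach, while a filter on name could legitimately be moved up to the load.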

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.