You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Krisztian Kasa (Jira)" <ji...@apache.org> on 2020/07/27 16:33:00 UTC

[jira] [Assigned] (HIVE-23939) SharedWorkOptimizer: taking the union of columns in mergeable TableScans

     [ https://issues.apache.org/jira/browse/HIVE-23939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Krisztian Kasa reassigned HIVE-23939:
-------------------------------------


> SharedWorkOptimizer: taking the union of columns in mergeable TableScans
> ------------------------------------------------------------------------
>
>                 Key: HIVE-23939
>                 URL: https://issues.apache.org/jira/browse/HIVE-23939
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>
> {code}
> POSTHOOK: query: explain
> select case when (select count(*) 
>                   from store_sales 
>                   where ss_quantity between 1 and 20) > 409437
>             then (select avg(ss_ext_list_price) 
>                   from store_sales 
>                   where ss_quantity between 1 and 20) 
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 1 and 20) end bucket1 ,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 21 and 40) > 4595804
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 21 and 40) 
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 21 and 40) end bucket2,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 41 and 60) > 7887297
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 41 and 60)
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 41 and 60) end bucket3,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 61 and 80) > 10872978
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 61 and 80)
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 61 and 80) end bucket4,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 81 and 100) > 43571537
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 81 and 100)
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 81 and 100) end bucket5
> from reason
> where r_reason_sk = 1
> POSTHOOK: type: QUERY
> POSTHOOK: Input: default@reason
> POSTHOOK: Input: default@store_sales
> POSTHOOK: Output: hdfs://### HDFS PATH ###
> Plan optimized by CBO.
> Vertex dependency in root stage
> Reducer 10 <- Reducer 34 (CUSTOM_SIMPLE_EDGE), Reducer 9 (CUSTOM_SIMPLE_EDGE)
> Reducer 11 <- Reducer 10 (CUSTOM_SIMPLE_EDGE), Reducer 18 (CUSTOM_SIMPLE_EDGE)
> Reducer 12 <- Reducer 11 (CUSTOM_SIMPLE_EDGE), Reducer 24 (CUSTOM_SIMPLE_EDGE)
> Reducer 13 <- Reducer 12 (CUSTOM_SIMPLE_EDGE), Reducer 30 (CUSTOM_SIMPLE_EDGE)
> Reducer 14 <- Reducer 13 (CUSTOM_SIMPLE_EDGE), Reducer 19 (CUSTOM_SIMPLE_EDGE)
> Reducer 15 <- Reducer 14 (CUSTOM_SIMPLE_EDGE), Reducer 25 (CUSTOM_SIMPLE_EDGE)
> Reducer 16 <- Reducer 15 (CUSTOM_SIMPLE_EDGE), Reducer 31 (CUSTOM_SIMPLE_EDGE)
> Reducer 18 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 19 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Reducer 20 (CUSTOM_SIMPLE_EDGE)
> Reducer 20 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 21 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 22 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 24 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 25 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 26 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 27 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 28 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE), Reducer 26 (CUSTOM_SIMPLE_EDGE)
> Reducer 30 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 31 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 32 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 33 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 34 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE), Reducer 32 (CUSTOM_SIMPLE_EDGE)
> Reducer 5 <- Reducer 21 (CUSTOM_SIMPLE_EDGE), Reducer 4 (CUSTOM_SIMPLE_EDGE)
> Reducer 6 <- Reducer 27 (CUSTOM_SIMPLE_EDGE), Reducer 5 (CUSTOM_SIMPLE_EDGE)
> Reducer 7 <- Reducer 33 (CUSTOM_SIMPLE_EDGE), Reducer 6 (CUSTOM_SIMPLE_EDGE)
> Reducer 8 <- Reducer 22 (CUSTOM_SIMPLE_EDGE), Reducer 7 (CUSTOM_SIMPLE_EDGE)
> Reducer 9 <- Reducer 28 (CUSTOM_SIMPLE_EDGE), Reducer 8 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Reducer 16
>       File Output Operator [FS_154]
>         Select Operator [SEL_153] (rows=2 width=560)
>           Output:["_col0","_col1","_col2","_col3","_col4"]
>           Merge Join Operator [MERGEJOIN_185] (rows=2 width=1140)
>             Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13","_col14","_col15"]
>           <-Reducer 15 [CUSTOM_SIMPLE_EDGE]
>             PARTITION_ONLY_SHUFFLE [RS_150]
>               Merge Join Operator [MERGEJOIN_184] (rows=2 width=1028)
>                 Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13","_col14"]
>               <-Reducer 14 [CUSTOM_SIMPLE_EDGE]
>                 PARTITION_ONLY_SHUFFLE [RS_147]
>                   Merge Join Operator [MERGEJOIN_183] (rows=2 width=916)
>                     Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13"]
>                   <-Reducer 13 [CUSTOM_SIMPLE_EDGE]
>                     PARTITION_ONLY_SHUFFLE [RS_144]
>                       Merge Join Operator [MERGEJOIN_182] (rows=2 width=912)
>                         Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12"]
>                       <-Reducer 12 [CUSTOM_SIMPLE_EDGE]
>                         PARTITION_ONLY_SHUFFLE [RS_141]
>                           Merge Join Operator [MERGEJOIN_181] (rows=2 width=800)
>                             Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11"]
>                           <-Reducer 11 [CUSTOM_SIMPLE_EDGE]
>                             PARTITION_ONLY_SHUFFLE [RS_138]
>                               Merge Join Operator [MERGEJOIN_180] (rows=2 width=688)
>                                 Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10"]
>                               <-Reducer 10 [CUSTOM_SIMPLE_EDGE]
>                                 PARTITION_ONLY_SHUFFLE [RS_135]
>                                   Merge Join Operator [MERGEJOIN_179] (rows=2 width=684)
>                                     Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9"]
>                                   <-Reducer 34 [CUSTOM_SIMPLE_EDGE] vectorized
>                                     PARTITION_ONLY_SHUFFLE [RS_275]
>                                       Select Operator [SEL_274] (rows=1 width=112)
>                                         Output:["_col0"]
>                                         Group By Operator [GBY_273] (rows=1 width=120)
>                                           Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                         <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                                           PARTITION_ONLY_SHUFFLE [RS_254]
>                                             Group By Operator [GBY_249] (rows=1 width=120)
>                                               Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                               Select Operator [SEL_244] (rows=182855757 width=110)
>                                                 Output:["ss_net_paid_inc_tax"]
>                                                 Filter Operator [FIL_239] (rows=182855757 width=110)
>                                                   predicate:ss_quantity BETWEEN 41 AND 60
>                                                   TableScan [TS_80] (rows=575995635 width=110)
>                                                     default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_net_paid_inc_tax"]
>                                   <-Reducer 9 [CUSTOM_SIMPLE_EDGE]
>                                     PARTITION_ONLY_SHUFFLE [RS_132]
>                                       Merge Join Operator [MERGEJOIN_178] (rows=2 width=572)
>                                         Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>                                       <-Reducer 28 [CUSTOM_SIMPLE_EDGE] vectorized
>                                         PARTITION_ONLY_SHUFFLE [RS_272]
>                                           Select Operator [SEL_271] (rows=1 width=112)
>                                             Output:["_col0"]
>                                             Group By Operator [GBY_270] (rows=1 width=120)
>                                               Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                             <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                               PARTITION_ONLY_SHUFFLE [RS_231]
>                                                 Group By Operator [GBY_226] (rows=1 width=120)
>                                                   Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                                   Select Operator [SEL_221] (rows=182855757 width=110)
>                                                     Output:["ss_ext_list_price"]
>                                                     Filter Operator [FIL_216] (rows=182855757 width=110)
>                                                       predicate:ss_quantity BETWEEN 41 AND 60
>                                                       TableScan [TS_73] (rows=575995635 width=110)
>                                                         default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_ext_list_price"]
>                                       <-Reducer 8 [CUSTOM_SIMPLE_EDGE]
>                                         PARTITION_ONLY_SHUFFLE [RS_129]
>                                           Merge Join Operator [MERGEJOIN_177] (rows=2 width=460)
>                                             Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7"]
>                                           <-Reducer 22 [CUSTOM_SIMPLE_EDGE] vectorized
>                                             PARTITION_ONLY_SHUFFLE [RS_269]
>                                               Select Operator [SEL_268] (rows=1 width=4)
>                                                 Output:["_col0"]
>                                                 Group By Operator [GBY_267] (rows=1 width=8)
>                                                   Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                                 <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                   PARTITION_ONLY_SHUFFLE [RS_208]
>                                                     Group By Operator [GBY_203] (rows=1 width=8)
>                                                       Output:["_col0"],aggregations:["count()"]
>                                                       Select Operator [SEL_198] (rows=182855757 width=3)
>                                                         Filter Operator [FIL_193] (rows=182855757 width=3)
>                                                           predicate:ss_quantity BETWEEN 41 AND 60
>                                                           TableScan [TS_66] (rows=575995635 width=3)
>                                                             default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity"]
>                                           <-Reducer 7 [CUSTOM_SIMPLE_EDGE]
>                                             PARTITION_ONLY_SHUFFLE [RS_126]
>                                               Merge Join Operator [MERGEJOIN_176] (rows=2 width=456)
>                                                 Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6"]
>                                               <-Reducer 33 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                 PARTITION_ONLY_SHUFFLE [RS_266]
>                                                   Select Operator [SEL_265] (rows=1 width=112)
>                                                     Output:["_col0"]
>                                                     Group By Operator [GBY_264] (rows=1 width=120)
>                                                       Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                     <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                       PARTITION_ONLY_SHUFFLE [RS_253]
>                                                         Group By Operator [GBY_248] (rows=1 width=120)
>                                                           Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                                           Select Operator [SEL_243] (rows=182855757 width=110)
>                                                             Output:["ss_net_paid_inc_tax"]
>                                                             Filter Operator [FIL_238] (rows=182855757 width=110)
>                                                               predicate:ss_quantity BETWEEN 21 AND 40
>                                                                Please refer to the previous TableScan [TS_80]
>                                               <-Reducer 6 [CUSTOM_SIMPLE_EDGE]
>                                                 PARTITION_ONLY_SHUFFLE [RS_123]
>                                                   Merge Join Operator [MERGEJOIN_175] (rows=2 width=344)
>                                                     Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5"]
>                                                   <-Reducer 27 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                     PARTITION_ONLY_SHUFFLE [RS_263]
>                                                       Select Operator [SEL_262] (rows=1 width=112)
>                                                         Output:["_col0"]
>                                                         Group By Operator [GBY_261] (rows=1 width=120)
>                                                           Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                         <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                           PARTITION_ONLY_SHUFFLE [RS_230]
>                                                             Group By Operator [GBY_225] (rows=1 width=120)
>                                                               Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                                               Select Operator [SEL_220] (rows=182855757 width=110)
>                                                                 Output:["ss_ext_list_price"]
>                                                                 Filter Operator [FIL_215] (rows=182855757 width=110)
>                                                                   predicate:ss_quantity BETWEEN 21 AND 40
>                                                                    Please refer to the previous TableScan [TS_73]
>                                                   <-Reducer 5 [CUSTOM_SIMPLE_EDGE]
>                                                     PARTITION_ONLY_SHUFFLE [RS_120]
>                                                       Merge Join Operator [MERGEJOIN_174] (rows=2 width=232)
>                                                         Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4"]
>                                                       <-Reducer 21 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                         PARTITION_ONLY_SHUFFLE [RS_260]
>                                                           Select Operator [SEL_259] (rows=1 width=4)
>                                                             Output:["_col0"]
>                                                             Group By Operator [GBY_258] (rows=1 width=8)
>                                                               Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                                             <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                               PARTITION_ONLY_SHUFFLE [RS_207]
>                                                                 Group By Operator [GBY_202] (rows=1 width=8)
>                                                                   Output:["_col0"],aggregations:["count()"]
>                                                                   Select Operator [SEL_197] (rows=182855757 width=3)
>                                                                     Filter Operator [FIL_192] (rows=182855757 width=3)
>                                                                       predicate:ss_quantity BETWEEN 21 AND 40
>                                                                        Please refer to the previous TableScan [TS_66]
>                                                       <-Reducer 4 [CUSTOM_SIMPLE_EDGE]
>                                                         PARTITION_ONLY_SHUFFLE [RS_117]
>                                                           Merge Join Operator [MERGEJOIN_173] (rows=2 width=228)
>                                                             Conds:(Left Outer),Output:["_col1","_col2","_col3"]
>                                                           <-Reducer 3 [CUSTOM_SIMPLE_EDGE]
>                                                             PARTITION_ONLY_SHUFFLE [RS_114]
>                                                               Merge Join Operator [MERGEJOIN_172] (rows=2 width=116)
>                                                                 Conds:(Left Outer),Output:["_col1","_col2"]
>                                                               <-Reducer 2 [CUSTOM_SIMPLE_EDGE]
>                                                                 PARTITION_ONLY_SHUFFLE [RS_111]
>                                                                   Merge Join Operator [MERGEJOIN_171] (rows=2 width=4)
>                                                                     Conds:(Left Outer),Output:["_col1"]
>                                                                   <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                     PARTITION_ONLY_SHUFFLE [RS_188]
>                                                                       Select Operator [SEL_187] (rows=2 width=4)
>                                                                         Filter Operator [FIL_186] (rows=2 width=4)
>                                                                           predicate:(r_reason_sk = 1)
>                                                                           TableScan [TS_0] (rows=72 width=4)
>                                                                             default@reason,reason,Tbl:COMPLETE,Col:COMPLETE,Output:["r_reason_sk"]
>                                                                   <-Reducer 20 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                     PARTITION_ONLY_SHUFFLE [RS_211]
>                                                                       Select Operator [SEL_210] (rows=1 width=4)
>                                                                         Output:["_col0"]
>                                                                         Group By Operator [GBY_209] (rows=1 width=8)
>                                                                           Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                                                         <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                           PARTITION_ONLY_SHUFFLE [RS_206]
>                                                                             Group By Operator [GBY_201] (rows=1 width=8)
>                                                                               Output:["_col0"],aggregations:["count()"]
>                                                                               Select Operator [SEL_196] (rows=182855757 width=3)
>                                                                                 Filter Operator [FIL_191] (rows=182855757 width=3)
>                                                                                   predicate:ss_quantity BETWEEN 1 AND 20
>                                                                                    Please refer to the previous TableScan [TS_66]
>                                                               <-Reducer 26 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                 PARTITION_ONLY_SHUFFLE [RS_234]
>                                                                   Select Operator [SEL_233] (rows=1 width=112)
>                                                                     Output:["_col0"]
>                                                                     Group By Operator [GBY_232] (rows=1 width=120)
>                                                                       Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                                     <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                       PARTITION_ONLY_SHUFFLE [RS_229]
>                                                                         Group By Operator [GBY_224] (rows=1 width=120)
>                                                                           Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                                                           Select Operator [SEL_219] (rows=182855757 width=110)
>                                                                             Output:["ss_ext_list_price"]
>                                                                             Filter Operator [FIL_214] (rows=182855757 width=110)
>                                                                               predicate:ss_quantity BETWEEN 1 AND 20
>                                                                                Please refer to the previous TableScan [TS_73]
>                                                           <-Reducer 32 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                             PARTITION_ONLY_SHUFFLE [RS_257]
>                                                               Select Operator [SEL_256] (rows=1 width=112)
>                                                                 Output:["_col0"]
>                                                                 Group By Operator [GBY_255] (rows=1 width=120)
>                                                                   Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                                 <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                   PARTITION_ONLY_SHUFFLE [RS_252]
>                                                                     Group By Operator [GBY_247] (rows=1 width=120)
>                                                                       Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                                                       Select Operator [SEL_242] (rows=182855757 width=110)
>                                                                         Output:["ss_net_paid_inc_tax"]
>                                                                         Filter Operator [FIL_237] (rows=182855757 width=110)
>                                                                           predicate:ss_quantity BETWEEN 1 AND 20
>                                                                            Please refer to the previous TableScan [TS_80]
>                               <-Reducer 18 [CUSTOM_SIMPLE_EDGE] vectorized
>                                 PARTITION_ONLY_SHUFFLE [RS_278]
>                                   Select Operator [SEL_277] (rows=1 width=4)
>                                     Output:["_col0"]
>                                     Group By Operator [GBY_276] (rows=1 width=8)
>                                       Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                     <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                       PARTITION_ONLY_SHUFFLE [RS_204]
>                                         Group By Operator [GBY_199] (rows=1 width=8)
>                                           Output:["_col0"],aggregations:["count()"]
>                                           Select Operator [SEL_194] (rows=182855757 width=3)
>                                             Filter Operator [FIL_189] (rows=182855757 width=3)
>                                               predicate:ss_quantity BETWEEN 61 AND 80
>                                                Please refer to the previous TableScan [TS_66]
>                           <-Reducer 24 [CUSTOM_SIMPLE_EDGE] vectorized
>                             PARTITION_ONLY_SHUFFLE [RS_281]
>                               Select Operator [SEL_280] (rows=1 width=112)
>                                 Output:["_col0"]
>                                 Group By Operator [GBY_279] (rows=1 width=120)
>                                   Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                 <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                   PARTITION_ONLY_SHUFFLE [RS_227]
>                                     Group By Operator [GBY_222] (rows=1 width=120)
>                                       Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                       Select Operator [SEL_217] (rows=182855757 width=110)
>                                         Output:["ss_ext_list_price"]
>                                         Filter Operator [FIL_212] (rows=182855757 width=110)
>                                           predicate:ss_quantity BETWEEN 61 AND 80
>                                            Please refer to the previous TableScan [TS_73]
>                       <-Reducer 30 [CUSTOM_SIMPLE_EDGE] vectorized
>                         PARTITION_ONLY_SHUFFLE [RS_284]
>                           Select Operator [SEL_283] (rows=1 width=112)
>                             Output:["_col0"]
>                             Group By Operator [GBY_282] (rows=1 width=120)
>                               Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                             <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                               PARTITION_ONLY_SHUFFLE [RS_250]
>                                 Group By Operator [GBY_245] (rows=1 width=120)
>                                   Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                   Select Operator [SEL_240] (rows=182855757 width=110)
>                                     Output:["ss_net_paid_inc_tax"]
>                                     Filter Operator [FIL_235] (rows=182855757 width=110)
>                                       predicate:ss_quantity BETWEEN 61 AND 80
>                                        Please refer to the previous TableScan [TS_80]
>                   <-Reducer 19 [CUSTOM_SIMPLE_EDGE] vectorized
>                     PARTITION_ONLY_SHUFFLE [RS_287]
>                       Select Operator [SEL_286] (rows=1 width=4)
>                         Output:["_col0"]
>                         Group By Operator [GBY_285] (rows=1 width=8)
>                           Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                         <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                           PARTITION_ONLY_SHUFFLE [RS_205]
>                             Group By Operator [GBY_200] (rows=1 width=8)
>                               Output:["_col0"],aggregations:["count()"]
>                               Select Operator [SEL_195] (rows=182855757 width=3)
>                                 Filter Operator [FIL_190] (rows=182855757 width=3)
>                                   predicate:ss_quantity BETWEEN 81 AND 100
>                                    Please refer to the previous TableScan [TS_66]
>               <-Reducer 25 [CUSTOM_SIMPLE_EDGE] vectorized
>                 PARTITION_ONLY_SHUFFLE [RS_290]
>                   Select Operator [SEL_289] (rows=1 width=112)
>                     Output:["_col0"]
>                     Group By Operator [GBY_288] (rows=1 width=120)
>                       Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                     <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                       PARTITION_ONLY_SHUFFLE [RS_228]
>                         Group By Operator [GBY_223] (rows=1 width=120)
>                           Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                           Select Operator [SEL_218] (rows=182855757 width=110)
>                             Output:["ss_ext_list_price"]
>                             Filter Operator [FIL_213] (rows=182855757 width=110)
>                               predicate:ss_quantity BETWEEN 81 AND 100
>                                Please refer to the previous TableScan [TS_73]
>           <-Reducer 31 [CUSTOM_SIMPLE_EDGE] vectorized
>             PARTITION_ONLY_SHUFFLE [RS_293]
>               Select Operator [SEL_292] (rows=1 width=112)
>                 Output:["_col0"]
>                 Group By Operator [GBY_291] (rows=1 width=120)
>                   Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                 <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                   PARTITION_ONLY_SHUFFLE [RS_251]
>                     Group By Operator [GBY_246] (rows=1 width=120)
>                       Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                       Select Operator [SEL_241] (rows=182855757 width=110)
>                         Output:["ss_net_paid_inc_tax"]
>                         Filter Operator [FIL_236] (rows=182855757 width=110)
>                           predicate:ss_quantity BETWEEN 81 AND 100
>                            Please refer to the previous TableScan [TS_80]
> {code}
> {code}
> TableScan [TS_80] (rows=575995635 width=110)
> default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_net_paid_inc_tax"]
> {code}
> {code}
> TableScan [TS_73] (rows=575995635 width=110)default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_ext_list_price"]
> {code}
> {code}
> TableScan [TS_66] (rows=575995635 width=3)
> default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity"]
> {code}
> Table *store_sales* read by three TableScans. The difference between then is the projected columns.
> The goal of this patch is to merge those TableScan operators and project the columns from all three original TS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)